Monday, 11 November 2013

Memory hierarchy







[zorang@centos6 x86_64-linux-gnu]$  numactl --membind=0 --cpunodebind=0 ./lat_mem_rd 2000 128
"stride=128
0.00049 1.205
0.00098 1.198
0.00195 1.195
0.00293 1.209
0.00391 1.211
0.00586 1.201
0.00781 1.199
0.01172 1.201
0.01562 1.194
0.02344 1.200
0.03125 1.217
0.04688 3.523
0.06250 3.646
0.09375 3.616
0.12500 3.611
0.18750 3.658
0.25000 4.928
0.37500 5.837
0.50000 5.791
0.75000 5.843
1.00000 5.883
1.50000 5.959
2.00000 5.983
3.00000 6.174
4.00000 9.150
6.00000 15.852
8.00000 19.982
12.00000 21.567
16.00000 21.585
24.00000 21.735
32.00000 21.610
48.00000 22.535
64.00000 22.093
96.00000 22.033
128.00000 22.608
192.00000 21.498
256.00000 21.594
384.00000 21.492
512.00000 21.473
768.00000 22.752
1024.00000 22.462


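The first column is the array size in MB and the second is the load latency in nanoseconds. It is easy to see where each cache level stops being effective: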


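1. Beyond the 32KB L1 cache, latency rises from about 1.2 ns to about 3.6 ns.
2. Once the array reaches the 256KB L2 cache size, latency rises again, to about 5.8 ns.
3. As the array approaches and exceeds the 6MB L3 cache, latency climbs from about 6 ns to main memory latency, about 21.5 ns from 12MB onward.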
* Please note: numactl --membind=0 --cpunodebind=0 binds both the process and its memory allocations to NUMA node 0, so all memory accesses are local. This is important on NUMA-based servers.
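
The same steps can be picked out of the numbers automatically. Below is a minimal Python sketch, not part of lmbench: the helper names `parse_lat_mem_rd` and `find_jumps` are made up for illustration, and the 1.3x threshold is an arbitrary assumption that happens to suit this data. It uses an excerpt of the output above.

```python
# Excerpt of the lat_mem_rd output above: array size (MB), latency (ns).
SAMPLE = """\
"stride=128
0.01562 1.194
0.03125 1.217
0.04688 3.523
0.12500 3.611
0.18750 3.658
0.25000 4.928
0.37500 5.837
1.00000 5.883
3.00000 6.174
4.00000 9.150
6.00000 15.852
8.00000 19.982
12.00000 21.567
64.00000 22.093
"""


def parse_lat_mem_rd(text):
    """Return (size_mb, latency_ns) pairs, skipping non-data lines
    such as the '"stride=128' header."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            samples.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue
    return samples


def find_jumps(samples, threshold=1.3):
    """Return (prev_size, size, prev_lat, lat) for each consecutive pair
    where latency grows by at least `threshold` times.
    The threshold is an assumed heuristic, not an lmbench feature."""
    jumps = []
    for (prev_size, prev_lat), (size, lat) in zip(samples, samples[1:]):
        if lat >= prev_lat * threshold:
            jumps.append((prev_size, size, prev_lat, lat))
    return jumps


for prev_size, size, prev_lat, lat in find_jumps(parse_lat_mem_rd(SAMPLE)):
    print(f"{prev_size:g} MB -> {size:g} MB: {prev_lat:.3f} ns -> {lat:.3f} ns")
```

On this excerpt the sketch flags the steps after 0.03125 MB (L1), after 0.1875 MB (L2), and after 3 MB and 4 MB (the gradual climb out of L3 towards main memory). The L3 transition is spread over several sizes, so a simple ratio test reports it more than once.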
