@vsrinivas
Created November 23, 2022 23:58
always [madvise] never
mode: always
[always] madvise never
Initializing array made of 33554432 64-bit words (256.00 MiB).
Source pages allocated with transparent hugepages: 100.0% (65536 pages, 100.0% flagged)
Legend:
BandW: Implied bandwidth (assuming 64-byte cache line) in MB/s
% Eff: Effectiveness of this lane count compared to the prior, as a % of ideal
Speedup: Speedup factor for this many lanes versus one lane
Applying Sattolo's algorithm... chain total: 33554432
Time to sum up the array (linear scan) 0.029 s (x 8 = 0.229 s), bandwidth = 8962.3 MB/s
Size: 33554432 (262144.00 KiB, 256.00 MiB)
---------------------------------------------------------------------
- # of lanes --- time (s) ---- BandW -- ns/hit -- % Eff -- Speedup --
---------------------------------------------------------------------
           1     2.630047        779      78.4       0%        1.0
           2     1.330493       1539      39.7      99%        2.0
           3     0.906941       2258      27.0      96%        2.9
           4     0.692072       2959      20.6      95%        3.8
           5     0.564003       3631      16.8      93%        4.7
           6     0.483775       4233      14.4      85%        5.4
           7     0.432110       4740      12.9      75%        6.1
           8     0.377198       5430      11.2     102%        7.0
           9     0.345835       5922      10.3      75%        7.6
          10     0.328327       6238       9.8      51%        8.0
          11     0.308601       6636       9.2      66%        8.5
          12     0.297754       6878       8.9      42%        8.8
          13     0.291655       7022       8.7      27%        9.0
          14     0.282194       7257       8.4      45%        9.3
          15     0.288255       7105       8.6     -32%        9.1
          16     0.287719       7118       8.6       3%        9.1
          17     0.286870       7139       8.5       5%        9.2
          18     0.284358       7202       8.5      16%        9.2
          19     0.289436       7076       8.6     -34%        9.1
          20     0.295760       6925       8.8     -44%        8.9
          21     0.289971       7063       8.6      41%        9.1
          22     0.291660       7022       8.7     -13%        9.0
          23     0.284008       7211       8.5      60%        9.3
          24     0.289926       7064       8.6     -50%        9.1
          25     0.292180       7009       8.7     -19%        9.0
          26     0.288586       7097       8.6      32%        9.1
          27     0.288376       7102       8.6       2%        9.1
          28     0.297003       6896       8.9     -84%        8.9
          29     0.292822       6994       8.7      41%        9.0
          30     0.293671       6974       8.8      -9%        9.0
          31     0.289456       7075       8.6      44%        9.1
          32     0.287310       7128       8.6      24%        9.2
          33     0.295048       6941       8.8     -89%        8.9
          34     0.295525       6930       8.8      -6%        8.9
          35     0.296414       6909       8.8     -11%        8.9
          36     0.287624       7120       8.6     107%        9.1
          37     0.295471       6931       8.8    -101%        8.9
          38     0.298230       6867       8.9     -35%        8.8
          39     0.294439       6956       8.8      50%        8.9
          40     0.294616       6951       8.8      -2%        8.9
Maybe you have about 14 parallel paths?
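
The derived columns in the table above can be reproduced from the time column alone. The sketch below is reconstructed from the printed numbers, not taken from the benchmark's source. It assumes one 64-byte cache line per hit, 33554432 hits per run, and that "MB" means 2^20 bytes, which is what makes the figures line up. The function names are illustrative only.

```python
LINE_BYTES = 64        # cache-line size stated in the legend
HITS = 33554432        # array size in 64-bit words, taken from the output


def ns_per_hit(seconds, hits=HITS):
    """Average time per access in nanoseconds."""
    return seconds * 1e9 / hits


def bandwidth_mib_s(seconds, hits=HITS):
    """Implied bandwidth, counting one full cache line per access."""
    return hits * LINE_BYTES / seconds / 2**20


def speedup(t_one_lane, t_n_lanes):
    """Speedup of an n-lane run relative to the single-lane run."""
    return t_one_lane / t_n_lanes


def efficiency_pct(t_prev, t_now, lanes):
    """Share of the ideal gain realised when going from lanes-1 to lanes.

    With perfect scaling the time would drop from t_prev to
    t_prev * (lanes - 1) / lanes, i.e. by t_prev / lanes.
    """
    ideal_gain = t_prev / lanes
    return 100.0 * (t_prev - t_now) / ideal_gain


if __name__ == "__main__":
    t1, t2 = 2.630047, 1.330493   # lanes 1 and 2 from the table above
    print(round(bandwidth_mib_s(t1)), round(ns_per_hit(t1), 1))
    print(round(efficiency_pct(t1, t2, 2)), round(speedup(t1, t2), 1))
```

A negative % Eff, as at 15 lanes, simply means that adding the lane made the run slower than the previous lane count.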
mode: never
always madvise [never]
Initializing array made of 33554432 64-bit words (256.00 MiB).
Source pages allocated with transparent hugepages: 0.0% (65536 pages, 100.0% flagged)
Legend:
BandW: Implied bandwidth (assuming 64-byte cache line) in MB/s
% Eff: Effectiveness of this lane count compared to the prior, as a % of ideal
Speedup: Speedup factor for this many lanes versus one lane
Applying Sattolo's algorithm... chain total: 33554432
Time to sum up the array (linear scan) 0.029 s (x 8 = 0.232 s), bandwidth = 8837.3 MB/s
Size: 33554432 (262144.00 KiB, 256.00 MiB)
---------------------------------------------------------------------
- # of lanes --- time (s) ---- BandW -- ns/hit -- % Eff -- Speedup --
---------------------------------------------------------------------
           1     2.980930        687      88.8       0%        1.0
           2     1.505097       1361      44.9      99%        2.0
           3     1.023700       2001      30.5      96%        2.9
           4     0.783129       2615      23.3      94%        3.8
           5     0.649609       3153      19.4      85%        4.6
           6     0.556561       3680      16.6      86%        5.4
           7     0.490677       4174      14.6      83%        6.1
           8     0.442962       4623      13.2      78%        6.7
           9     0.401537       5100      12.0      84%        7.4
          10     0.371809       5508      11.1      74%        8.0
          11     0.349054       5867      10.4      67%        8.5
          12     0.334601       6121      10.0      50%        8.9
          13     0.323253       6336       9.6      44%        9.2
          14     0.319347       6413       9.5      17%        9.3
          15     0.307541       6659       9.2      55%        9.7
          16     0.305902       6695       9.1       9%        9.7
          17     0.311355       6578       9.3     -30%        9.6
          18     0.314503       6512       9.4     -18%        9.5
          19     0.314078       6521       9.4       3%        9.5
          20     0.312483       6554       9.3      10%        9.5
          21     0.315485       6492       9.4     -20%        9.4
          22     0.308035       6649       9.2      52%        9.7
          23     0.310508       6596       9.3     -18%        9.6
          24     0.315087       6500       9.4     -35%        9.5
          25     0.314735       6507       9.4       3%        9.5
          26     0.317841       6443       9.5     -26%        9.4
          27     0.318662       6427       9.5      -7%        9.4
          28     0.316672       6467       9.4      17%        9.4
          29     0.317025       6460       9.4      -3%        9.4
          30     0.315062       6500       9.4      19%        9.5
          31     0.316171       6478       9.4     -11%        9.4
          32     0.319314       6414       9.5     -32%        9.3
          33     0.320902       6382       9.6     -16%        9.3
          34     0.321286       6374       9.6      -4%        9.3
          35     0.323953       6322       9.7     -29%        9.2
          36     0.326638       6270       9.7     -30%        9.1
          37     0.324422       6313       9.7      25%        9.2
          38     0.321375       6373       9.6      36%        9.3
          39     0.320386       6392       9.5      12%        9.3
          40     0.323924       6322       9.7     -44%        9.2
Maybe you have about 15 parallel paths?
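
Both runs report applying Sattolo's algorithm, which shuffles an array into a permutation consisting of exactly one cycle, so following the indices visits every element before returning to the start. The sketch below shows the idea in Python. It assumes the benchmark uses such a permutation as a pointer-chasing chain and that each "lane" is an independent walker on that chain. The function names are hypothetical, and a Python loop will not reproduce the timings above.

```python
import random


def sattolo_cycle(n, rng):
    """Return a permutation of range(n) that forms a single cycle of length n."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)          # 0 <= j < i: never swap an element with itself
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def cycle_length(perm, start=0):
    """Follow perm from start and count hops until we are back at start."""
    hops, pos = 0, start
    while True:
        pos = perm[pos]
        hops += 1
        if pos == start:
            return hops


def chase(perm, starts, hops):
    """Advance several independent lanes through the same chain in lockstep.

    Each lane's next index depends only on its own current index, so a CPU
    can keep one cache miss per lane in flight at the same time.
    """
    lanes = list(starts)
    for _ in range(hops):
        lanes = [perm[p] for p in lanes]
    return lanes


if __name__ == "__main__":
    perm = sattolo_cycle(1 << 10, random.Random(0))
    print(cycle_length(perm))         # equals len(perm) for a single cycle
    print(chase(perm, [0, perm[0]], 4))
```

The "chain total" line in the output matches the array size, which is consistent with a full-length cycle check of this kind.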
Done.
Restoring hugepages to madvise
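
The mode lines in this output (for example `always [madvise] never`) match the format of the Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, where the bracketed word is the active setting. Assuming that is the file the driver script reads and writes, a minimal Python sketch for extracting the current mode might look like this. The helper names are hypothetical.

```python
from pathlib import Path

# Assumed location of the system-wide THP setting on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed token from a line like 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


def current_thp_mode(path=THP_ENABLED):
    """Read the sysfs file and return the active mode as a plain string."""
    return parse_thp_mode(path.read_text())


if __name__ == "__main__":
    print(parse_thp_mode("always [madvise] never"))
```

Changing the mode requires writing one of the three words to the same file with root privileges, which is presumably what happens between the two runs and again when the original setting is put back.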