@rygorous
Last active May 25, 2020 05:15
---- On Ryzen 3950X
SimpleProf          : seconds calls   count :     clk/call clk/count
search_one          :  0.3081     1 8388480 : 1078358365.0    128.55
search_one_pf       :  0.3415     1 8388480 : 1195232920.0    142.49 <-- speculative prefetching (next options for L and R)
search_multi2       :  0.3062     1 8388480 : 1071663705.0    127.75
search_multi4       :  0.2454     1 8388480 :  859008465.0    102.40
search_multi8       :  0.2113     1 8388480 :  739533515.0     88.16
search_multi16      :  0.1996     1 8388480 :  698443550.0     83.26
search_multi32      :  0.1785     1 8388480 :  624785840.0     74.48
search_multi64      :  0.1669     1 8388480 :  584228540.0     69.65
search_multi4_pf2   :  0.2475     1 8388480 :  866103490.0    103.25 <-- non-speculative; just prefetching the element that will get accessed next iteration
search_multi8_pf2   :  0.2137     1 8388480 :  748055350.0     89.18
search_multi16_pf2  :  0.1799     1 8388480 :  629659030.0     75.06
search_multi32_pf2  :  0.1568     1 8388480 :  548770005.0     65.42
search_multi64_pf2  :  0.1392     1 8388480 :  487367720.0     58.10
search_multi128_pf2 :  0.1440     1 8388480 :  503960800.0     60.08
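
The gist does not include the code being timed. As a rough illustration of what "speculative prefetching (next options for L and R)" could mean, here is a minimal sketch that assumes the benchmark is a binary search (lower bound) over a sorted array of 32-bit keys. The names `prefetch` and `search_one_pf_sketch` are hypothetical, and this is not necessarily how the measured `search_one_pf` is written. Before comparing at the midpoint, it prefetches the midpoints of both halves, so whichever way the comparison goes, the next probe has already been requested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: GCC/Clang prefetch builtin, a no-op on other compilers.
static inline void prefetch(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p);
#else
    (void)p;
#endif
}

// Assumed setting: lower bound in a sorted array.
// Returns the index of the first element >= key, or n if there is none.
std::size_t search_one_pf_sketch(const std::uint32_t* data, std::size_t n,
                                 std::uint32_t key) {
    assert(data != nullptr || n == 0);
    std::size_t lo = 0, hi = n;  // the answer lies in [lo, hi]
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        // Speculative: request the next probe of BOTH possible outcomes.
        // Going left keeps [lo, mid); going right keeps [mid + 1, hi).
        prefetch(data + lo + (mid - lo) / 2);
        prefetch(data + mid + 1 + (hi - mid - 1) / 2);
        if (data[mid] < key) lo = mid + 1; else hi = mid;
    }
    return lo;
}
```

Half of these prefetches fetch a line that is never used, which is one plausible reason this variant is slower than plain `search_one` on the 3950X (142.49 vs. 128.55 clk/count).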
---- On Core i9-7920X (SKX)
SimpleProf          : seconds calls   count :     clk/call clk/count
search_one          :  0.4269     1 8388480 : 1239652192.0    147.78
search_one_pf       :  0.3989     1 8388480 : 1158358878.0    138.09 <-- speculative prefetch
search_multi2       :  0.4534     1 8388480 : 1316580511.0    156.95
search_multi4       :  0.3409     1 8388480 :  989846715.0    118.00
search_multi8       :  0.2749     1 8388480 :  798245619.0     95.16
search_multi16      :  0.2721     1 8388480 :  790088042.0     94.19
search_multi32      :  0.2517     1 8388480 :  730975821.0     87.14
search_multi64      :  0.2453     1 8388480 :  712385765.0     84.92
search_multi4_pf2   :  0.3635     1 8388480 : 1055531428.0    125.83
search_multi8_pf2   :  0.2790     1 8388480 :  810197085.0     96.58
search_multi16_pf2  :  0.2303     1 8388480 :  668700214.0     79.72
search_multi32_pf2  :  0.2111     1 8388480 :  613169104.0     73.10
search_multi64_pf2  :  0.2057     1 8388480 :  597329179.0     71.21
search_multi128_pf2 :  0.2143     1 8388480 :  622364258.0     74.19
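
The `search_multiN` and `search_multiN_pf2` rows suggest running N independent searches in lockstep, with the `_pf2` variants prefetching exactly the element each search will touch on its next step. The gist does not show that code, so the following is only a sketch under assumptions: a sorted array of 32-bit keys, a branchless lower-bound formulation (chosen here so that every search takes the same number of steps), and hypothetical names `prefetch` and `search_multi_pf2_sketch`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: GCC/Clang prefetch builtin, a no-op on other compilers.
static inline void prefetch(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p);
#else
    (void)p;
#endif
}

// Runs N lower-bound searches over the same sorted array in lockstep.
// out[i] receives the index of the first element >= keys[i], or n if none.
template <std::size_t N>
void search_multi_pf2_sketch(const std::uint32_t* data, std::size_t n,
                             const std::uint32_t* keys, std::size_t* out) {
    static_assert(N > 0, "need at least one search per batch");
    assert(n > 0);
    std::size_t base[N] = {};  // each search's answer lies in [base, base + len]
    std::size_t len = n;       // shared by all searches, so they stay in step
    while (len > 1) {
        const std::size_t half = len / 2;
        const std::size_t next_len = len - half;
        const std::size_t next_half = next_len / 2;
        // Offset from base of the element probed on the NEXT step
        // (the final check reads data[base] itself).
        const std::size_t next_off = next_half ? next_half - 1 : 0;
        for (std::size_t i = 0; i < N; ++i) {
            if (data[base[i] + half - 1] < keys[i]) base[i] += half;
            // Non-speculative: base[i] is already final for this step, so
            // this address is exactly what search i reads next time around.
            prefetch(data + base[i] + next_off);
        }
        len = next_len;
    }
    for (std::size_t i = 0; i < N; ++i)
        out[i] = base[i] + (data[base[i]] < keys[i] ? 1 : 0);
}
```

Between issuing a prefetch for search i and consuming it, the loop does the work of the other N - 1 searches, so larger N gives each cache miss more time to resolve. That is consistent with the measurements above, where the `_pf2` variants keep improving up to N = 64 and then regress slightly at N = 128 on both machines.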