Memory operations test
Code (and Windows binary) here: https://github.com/rygorous/atomic_ops_test
I'd appreciate it if some people could run this and post their results here in a
comment, with a short description of what CPU they're using.
UPDATE: I have recent Intel CPUs (released within the last 3 years) pretty well covered
by now, so if that's what you have in your machine, don't bother running the test. But
I'd love to get some more data points for older Intel CPUs and AMD parts!
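For readers unfamiliar with the row labels in the results, here is a minimal sketch of the kind of operation each one most likely refers to. It is not taken from the test's source: it assumes the labels name the x86 instructions of the same name, it uses portable std::atomic calls instead of whatever the repository uses, and the names Counters, op_add, op_lockadd, op_xadd, op_swap, op_cmpxchg and sanity_check are hypothetical. The add_mfence and lockadd_unalign cases are left out because they have no portable equivalent with a predictable instruction sequence.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration only; not the benchmark's actual code.
struct Counters {
    std::uint32_t plain = 0;               // target of the non-atomic "add"
    std::atomic<std::uint32_t> shared{0};  // target of the atomic variants
};

// "add": ordinary, non-atomic read-modify-write.
inline void op_add(Counters& c) { c.plain += 1; }

// "lockadd": atomic add whose result is discarded.
// On x86, compilers typically emit a LOCK ADD for this (assumption).
inline void op_lockadd(Counters& c) { c.shared.fetch_add(1); }

// "xadd": atomic add that returns the previous value (typically LOCK XADD).
inline std::uint32_t op_xadd(Counters& c) { return c.shared.fetch_add(1); }

// "swap": atomic exchange (typically XCHG, which is implicitly locked).
inline std::uint32_t op_swap(Counters& c, std::uint32_t v) {
    return c.shared.exchange(v);
}

// "cmpxchg": compare-and-swap; succeeds only if the current value
// equals `expected` (typically LOCK CMPXCHG).
inline bool op_cmpxchg(Counters& c, std::uint32_t expected,
                       std::uint32_t desired) {
    return c.shared.compare_exchange_strong(expected, desired);
}

// Single-threaded check that each operation has the intended semantics.
inline void sanity_check() {
    Counters c;
    op_add(c);
    assert(c.plain == 1);
    op_lockadd(c);                   // shared: 0 -> 1
    assert(op_xadd(c) == 1);         // returns old value, shared: 1 -> 2
    assert(op_swap(c, 10) == 2);     // returns old value, shared: 2 -> 10
    assert(op_cmpxchg(c, 10, 20));   // matches, shared: 10 -> 20
    assert(!op_cmpxchg(c, 10, 30));  // stale expected value, no change
    assert(c.shared.load() == 20);
}
```

The "interference type" headings describe what other threads are doing to the same cache line while one of these operations runs in a loop, which is why the same operation can cost very different numbers of cycles from block to block.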
Results so far:
----
AMD Phenom II 925 (AMD 10h) @ 2.8GHz | |
interference type: none | |
add: 1.88 cycles/op | |
add_mfence: 36.80 cycles/op | |
lockadd: 18.01 cycles/op | |
xadd: 17.01 cycles/op | |
swap: 16.01 cycles/op | |
cmpxchg: 18.02 cycles/op | |
lockadd_unalign: 137.48 cycles/op | |
interference type: hyperthread_read_line | |
add: 13.87 cycles/op | |
add_mfence: 122.74 cycles/op | |
lockadd: 131.68 cycles/op | |
xadd: 17.01 cycles/op | |
swap: 16.01 cycles/op | |
cmpxchg: 18.02 cycles/op | |
lockadd_unalign: 345.44 cycles/op | |
interference type: hyperthread_write_line | |
add: 14.12 cycles/op | |
add_mfence: 132.87 cycles/op | |
lockadd: 114.37 cycles/op | |
xadd: 131.32 cycles/op | |
swap: 16.06 cycles/op | |
cmpxchg: 18.26 cycles/op | |
lockadd_unalign: 344.94 cycles/op | |
interference type: other_core_read_line | |
add: 1.88 cycles/op | |
add_mfence: 36.85 cycles/op | |
lockadd: 18.01 cycles/op | |
xadd: 17.01 cycles/op | |
swap: 16.01 cycles/op | |
cmpxchg: 18.01 cycles/op | |
lockadd_unalign: 136.15 cycles/op | |
interference type: other_core_write_line | |
add: 1.88 cycles/op | |
add_mfence: 37.07 cycles/op | |
lockadd: 18.01 cycles/op | |
xadd: 17.01 cycles/op | |
swap: 16.01 cycles/op | |
cmpxchg: 18.05 cycles/op | |
lockadd_unalign: 135.07 cycles/op | |
interference type: three_cores_read_line | |
add: 13.84 cycles/op | |
add_mfence: 119.52 cycles/op | |
lockadd: 130.07 cycles/op | |
xadd: 161.28 cycles/op | |
swap: 135.72 cycles/op | |
cmpxchg: 143.83 cycles/op | |
lockadd_unalign: 342.39 cycles/op | |
interference type: three_cores_write_line | |
add: 13.58 cycles/op | |
add_mfence: 130.97 cycles/op | |
lockadd: 18.05 cycles/op | |
xadd: 17.01 cycles/op | |
swap: 16.01 cycles/op | |
cmpxchg: 18.26 cycles/op | |
lockadd_unalign: 135.08 cycles/op | |
---- | |
AMD FX 8350 (Piledriver) @ 4.0GHz | |
interference type: none | |
add: 2.00 cycles/op | |
add_mfence: 94.95 cycles/op | |
lockadd: 43.18 cycles/op | |
xadd: 42.00 cycles/op | |
swap: 42.18 cycles/op | |
cmpxchg: 45.94 cycles/op | |
lockadd_unalign: 216.35 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.00 cycles/op | |
add_mfence: 98.13 cycles/op | |
lockadd: 95.56 cycles/op | |
xadd: 93.84 cycles/op | |
swap: 93.82 cycles/op | |
cmpxchg: 94.22 cycles/op | |
lockadd_unalign: 274.09 cycles/op | |
interference type: hyperthread_write_line | |
add: 25.08 cycles/op | |
add_mfence: 177.02 cycles/op | |
lockadd: 142.13 cycles/op | |
xadd: 149.08 cycles/op | |
swap: 150.16 cycles/op | |
cmpxchg: 3697.73 cycles/op | |
lockadd_unalign: 259.23 cycles/op | |
interference type: other_core_read_line | |
add: 6.42 cycles/op | |
add_mfence: 393.58 cycles/op | |
lockadd: 216.35 cycles/op | |
xadd: 391.48 cycles/op | |
swap: 42.05 cycles/op | |
cmpxchg: 45.93 cycles/op | |
lockadd_unalign: 213.56 cycles/op | |
interference type: other_core_write_line | |
add: 2.00 cycles/op | |
add_mfence: 473.65 cycles/op | |
lockadd: 387.04 cycles/op | |
xadd: 378.94 cycles/op | |
swap: 396.20 cycles/op | |
cmpxchg: 942.90 cycles/op | |
lockadd_unalign: 655.72 cycles/op | |
interference type: three_cores_read_line | |
add: 13.53 cycles/op | |
add_mfence: 835.01 cycles/op | |
lockadd: 443.42 cycles/op | |
xadd: 580.89 cycles/op | |
swap: 834.23 cycles/op | |
cmpxchg: 1048.45 cycles/op | |
lockadd_unalign: 968.52 cycles/op | |
interference type: three_cores_write_line | |
add: 82.73 cycles/op | |
add_mfence: 905.74 cycles/op | |
lockadd: 825.68 cycles/op | |
xadd: 844.77 cycles/op | |
swap: 827.17 cycles/op | |
cmpxchg: 2491.23 cycles/op | |
lockadd_unalign: 1140.24 cycles/op | |
---- | |
Intel Atom D510 (1.66GHz, 2 core, 4 thread) on Linux (port here: https://github.com/maxburke/atomic_ops_test): | |
interference type: none | |
add: 2.88 cycles/op | |
add_mfence: 5.75 cycles/op | |
lockadd: 2.88 cycles/op | |
xadd: 7.83 cycles/op | |
swap: 7.67 cycles/op | |
cmpxchg: 21.50 cycles/op | |
lockadd_unalign: 181.44 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.88 cycles/op | |
add_mfence: 5.75 cycles/op | |
lockadd: 2.88 cycles/op | |
xadd: 7.83 cycles/op | |
swap: 7.67 cycles/op | |
cmpxchg: 21.50 cycles/op | |
lockadd_unalign: 181.41 cycles/op | |
interference type: hyperthread_write_line | |
add: 5.00 cycles/op | |
add_mfence: 4.63 cycles/op | |
lockadd: 5.00 cycles/op | |
xadd: 7.75 cycles/op | |
swap: 7.25 cycles/op | |
cmpxchg: 24.02 cycles/op | |
lockadd_unalign: 180.20 cycles/op | |
interference type: other_core_read_line | |
add: 2.88 cycles/op | |
add_mfence: 5.75 cycles/op | |
lockadd: 2.88 cycles/op | |
xadd: 7.83 cycles/op | |
swap: 7.67 cycles/op | |
cmpxchg: 21.50 cycles/op | |
lockadd_unalign: 181.40 cycles/op | |
interference type: other_core_write_line | |
add: 36.12 cycles/op | |
add_mfence: 70.96 cycles/op | |
lockadd: 35.50 cycles/op | |
xadd: 96.41 cycles/op | |
swap: 68.43 cycles/op | |
cmpxchg: 348.55 cycles/op | |
lockadd_unalign: 209.81 cycles/op | |
interference type: three_cores_read_line | |
add: 2.88 cycles/op | |
add_mfence: 5.75 cycles/op | |
lockadd: 2.88 cycles/op | |
xadd: 7.83 cycles/op | |
swap: 7.67 cycles/op | |
cmpxchg: 21.50 cycles/op | |
lockadd_unalign: 185.16 cycles/op | |
interference type: three_cores_write_line | |
add: 36.19 cycles/op | |
add_mfence: 71.41 cycles/op | |
lockadd: 36.28 cycles/op | |
xadd: 96.52 cycles/op | |
swap: 68.35 cycles/op | |
cmpxchg: 344.29 cycles/op | |
lockadd_unalign: 209.89 cycles/op | |
---- | |
Intel Core i7-920 (Bloomfield [NHM derived]) | |
interference type: none | |
add: 2.74 cycles/op | |
add_mfence: 43.10 cycles/op | |
lockadd: 20.63 cycles/op | |
xadd: 20.70 cycles/op | |
swap: 20.72 cycles/op | |
cmpxchg: 17.74 cycles/op | |
lockadd_unalign: 1218.97 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.05 cycles/op | |
add_mfence: 43.85 cycles/op | |
lockadd: 21.61 cycles/op | |
xadd: 21.43 cycles/op | |
swap: 20.04 cycles/op | |
cmpxchg: 20.89 cycles/op | |
lockadd_unalign: 1216.35 cycles/op | |
interference type: hyperthread_write_line | |
add: 4.58 cycles/op | |
add_mfence: 36.71 cycles/op | |
lockadd: 22.11 cycles/op | |
xadd: 21.99 cycles/op | |
swap: 22.36 cycles/op | |
cmpxchg: 44.44 cycles/op | |
lockadd_unalign: 1236.09 cycles/op | |
interference type: other_core_read_line | |
add: 3.65 cycles/op | |
add_mfence: 156.83 cycles/op | |
lockadd: 85.56 cycles/op | |
xadd: 83.32 cycles/op | |
swap: 83.72 cycles/op | |
cmpxchg: 141.47 cycles/op | |
lockadd_unalign: 1216.30 cycles/op | |
interference type: other_core_write_line | |
add: 4.66 cycles/op | |
add_mfence: 117.53 cycles/op | |
lockadd: 63.93 cycles/op | |
xadd: 61.09 cycles/op | |
swap: 61.17 cycles/op | |
cmpxchg: 145.92 cycles/op | |
lockadd_unalign: 1224.23 cycles/op | |
interference type: three_cores_read_line | |
add: 3.71 cycles/op | |
add_mfence: 227.06 cycles/op | |
lockadd: 138.89 cycles/op | |
xadd: 130.12 cycles/op | |
swap: 133.46 cycles/op | |
cmpxchg: 214.04 cycles/op | |
lockadd_unalign: 1234.76 cycles/op | |
interference type: three_cores_write_line | |
add: 5.51 cycles/op | |
add_mfence: 211.07 cycles/op | |
lockadd: 121.84 cycles/op | |
xadd: 120.34 cycles/op | |
swap: 119.18 cycles/op | |
cmpxchg: 227.27 cycles/op | |
lockadd_unalign: 1265.16 cycles/op | |
---- | |
Intel Core i3-2310M (Sandy Bridge) @ 2.1GHz | |
interference type: none | |
add: 2.08 cycles/op | |
add_mfence: 51.03 cycles/op | |
lockadd: 22.48 cycles/op | |
xadd: 22.58 cycles/op | |
swap: 22.84 cycles/op | |
cmpxchg: 23.22 cycles/op | |
lockadd_unalign: 571.91 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.12 cycles/op | |
add_mfence: 51.49 cycles/op | |
lockadd: 24.57 cycles/op | |
xadd: 36.03 cycles/op | |
swap: 29.83 cycles/op | |
cmpxchg: 30.42 cycles/op | |
lockadd_unalign: 597.80 cycles/op | |
interference type: hyperthread_write_line | |
add: 6.83 cycles/op | |
add_mfence: 55.87 cycles/op | |
lockadd: 31.54 cycles/op | |
xadd: 34.54 cycles/op | |
swap: 35.70 cycles/op | |
cmpxchg: 114.43 cycles/op | |
lockadd_unalign: 585.72 cycles/op | |
interference type: other_core_read_line | |
add: 2.09 cycles/op | |
add_mfence: 56.78 cycles/op | |
lockadd: 110.71 cycles/op | |
xadd: 107.70 cycles/op | |
swap: 26.68 cycles/op | |
cmpxchg: 115.62 cycles/op | |
lockadd_unalign: 570.43 cycles/op | |
interference type: other_core_write_line | |
add: 4.52 cycles/op | |
add_mfence: 51.98 cycles/op | |
lockadd: 24.85 cycles/op | |
xadd: 22.84 cycles/op | |
swap: 22.94 cycles/op | |
cmpxchg: 23.27 cycles/op | |
lockadd_unalign: 597.92 cycles/op | |
interference type: three_cores_read_line | |
add: 2.65 cycles/op | |
add_mfence: 107.83 cycles/op | |
lockadd: 99.94 cycles/op | |
xadd: 108.54 cycles/op | |
swap: 98.83 cycles/op | |
cmpxchg: 114.97 cycles/op | |
lockadd_unalign: 589.32 cycles/op | |
interference type: three_cores_write_line | |
add: 4.52 cycles/op | |
add_mfence: 178.42 cycles/op | |
lockadd: 52.71 cycles/op | |
xadd: 151.44 cycles/op | |
swap: 133.44 cycles/op | |
cmpxchg: 23.26 cycles/op | |
lockadd_unalign: 577.85 cycles/op | |
---- | |
Intel Core i5-2400 (Sandy Bridge) @ 3.10GHz | |
interference type: none | |
add: 1.69 cycles/op | |
add_mfence: 44.08 cycles/op | |
lockadd: 25.19 cycles/op | |
xadd: 25.19 cycles/op | |
swap: 24.46 cycles/op | |
cmpxchg: 25.19 cycles/op | |
lockadd_unalign: 615.60 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.23 cycles/op | |
add_mfence: 66.85 cycles/op | |
lockadd: 25.19 cycles/op | |
xadd: 25.19 cycles/op | |
swap: 24.50 cycles/op | |
cmpxchg: 87.92 cycles/op | |
lockadd_unalign: 613.58 cycles/op | |
interference type: hyperthread_write_line | |
add: 1.69 cycles/op | |
add_mfence: 44.73 cycles/op | |
lockadd: 25.19 cycles/op | |
xadd: 25.19 cycles/op | |
swap: 24.46 cycles/op | |
cmpxchg: 25.19 cycles/op | |
lockadd_unalign: 614.62 cycles/op | |
interference type: other_core_read_line | |
add: 2.22 cycles/op | |
add_mfence: 44.56 cycles/op | |
lockadd: 25.19 cycles/op | |
xadd: 25.22 cycles/op | |
swap: 24.46 cycles/op | |
cmpxchg: 25.23 cycles/op | |
lockadd_unalign: 614.30 cycles/op | |
interference type: other_core_write_line | |
add: 1.69 cycles/op | |
add_mfence: 44.63 cycles/op | |
lockadd: 159.29 cycles/op | |
xadd: 25.19 cycles/op | |
swap: 24.46 cycles/op | |
cmpxchg: 25.19 cycles/op | |
lockadd_unalign: 614.38 cycles/op | |
interference type: three_cores_read_line | |
add: 2.23 cycles/op | |
add_mfence: 113.23 cycles/op | |
lockadd: 108.38 cycles/op | |
xadd: 108.40 cycles/op | |
swap: 108.28 cycles/op | |
cmpxchg: 25.19 cycles/op | |
lockadd_unalign: 612.58 cycles/op | |
interference type: three_cores_write_line | |
add: 3.56 cycles/op | |
add_mfence: 44.53 cycles/op | |
lockadd: 25.90 cycles/op | |
xadd: 160.24 cycles/op | |
swap: 142.26 cycles/op | |
cmpxchg: 25.23 cycles/op | |
lockadd_unalign: 616.72 cycles/op | |
---- | |
Intel Core i7-2677M (Sandy Bridge) @ 1.80 GHz | |
interference type: none | |
add: 1.51 cycles/op | |
add_mfence: 35.09 cycles/op | |
lockadd: 15.48 cycles/op | |
xadd: 15.32 cycles/op | |
swap: 15.78 cycles/op | |
cmpxchg: 15.99 cycles/op | |
lockadd_unalign: 410.44 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.49 cycles/op | |
add_mfence: 34.68 cycles/op | |
lockadd: 15.48 cycles/op | |
xadd: 21.78 cycles/op | |
swap: 24.98 cycles/op | |
cmpxchg: 15.56 cycles/op | |
lockadd_unalign: 450.96 cycles/op | |
interference type: hyperthread_write_line | |
add: 1.51 cycles/op | |
add_mfence: 35.05 cycles/op | |
lockadd: 15.50 cycles/op | |
xadd: 15.53 cycles/op | |
swap: 15.81 cycles/op | |
cmpxchg: 15.94 cycles/op | |
lockadd_unalign: 408.41 cycles/op | |
interference type: other_core_read_line | |
add: 1.51 cycles/op | |
add_mfence: 35.07 cycles/op | |
lockadd: 15.48 cycles/op | |
xadd: 15.33 cycles/op | |
swap: 15.78 cycles/op | |
cmpxchg: 15.98 cycles/op | |
lockadd_unalign: 408.03 cycles/op | |
interference type: other_core_write_line | |
add: 3.35 cycles/op | |
add_mfence: 118.20 cycles/op | |
lockadd: 100.99 cycles/op | |
xadd: 103.57 cycles/op | |
swap: 106.19 cycles/op | |
cmpxchg: 251.31 cycles/op | |
lockadd_unalign: 405.65 cycles/op | |
interference type: three_cores_read_line | |
add: 1.85 cycles/op | |
add_mfence: 70.64 cycles/op | |
lockadd: 73.13 cycles/op | |
xadd: 73.48 cycles/op | |
swap: 68.97 cycles/op | |
cmpxchg: 71.28 cycles/op | |
lockadd_unalign: 439.66 cycles/op | |
interference type: three_cores_write_line | |
add: 3.36 cycles/op | |
add_mfence: 121.96 cycles/op | |
lockadd: 90.12 cycles/op | |
xadd: 91.66 cycles/op | |
swap: 90.41 cycles/op | |
cmpxchg: 230.79 cycles/op | |
lockadd_unalign: 405.44 cycles/op | |
---- | |
Intel Core i7-2600K (Sandy Bridge) @ 3.4GHz | |
interference type: none | |
add: 2.11 cycles/op | |
add_mfence: 50.19 cycles/op | |
lockadd: 22.32 cycles/op | |
xadd: 22.22 cycles/op | |
swap: 22.53 cycles/op | |
cmpxchg: 22.86 cycles/op | |
lockadd_unalign: 648.10 cycles/op | |
interference type: hyperthread_read_line | |
add: 2.12 cycles/op | |
add_mfence: 50.24 cycles/op | |
lockadd: 32.74 cycles/op | |
xadd: 39.23 cycles/op | |
swap: 29.51 cycles/op | |
cmpxchg: 29.36 cycles/op | |
lockadd_unalign: 682.59 cycles/op | |
interference type: hyperthread_write_line | |
add: 6.97 cycles/op | |
add_mfence: 54.63 cycles/op | |
lockadd: 53.94 cycles/op | |
xadd: 36.98 cycles/op | |
swap: 35.85 cycles/op | |
cmpxchg: 131.69 cycles/op | |
lockadd_unalign: 652.76 cycles/op | |
interference type: other_core_read_line | |
add: 2.62 cycles/op | |
add_mfence: 103.76 cycles/op | |
lockadd: 108.31 cycles/op | |
xadd: 108.12 cycles/op | |
swap: 101.97 cycles/op | |
cmpxchg: 113.30 cycles/op | |
lockadd_unalign: 648.32 cycles/op | |
interference type: other_core_write_line | |
add: 4.50 cycles/op | |
add_mfence: 171.69 cycles/op | |
lockadd: 139.92 cycles/op | |
xadd: 140.46 cycles/op | |
swap: 146.66 cycles/op | |
cmpxchg: 360.81 cycles/op | |
lockadd_unalign: 647.92 cycles/op | |
interference type: three_cores_read_line | |
add: 2.72 cycles/op | |
add_mfence: 123.66 cycles/op | |
lockadd: 134.83 cycles/op | |
xadd: 134.35 cycles/op | |
swap: 132.47 cycles/op | |
cmpxchg: 136.49 cycles/op | |
lockadd_unalign: 646.96 cycles/op | |
interference type: three_cores_write_line | |
add: 11.21 cycles/op | |
add_mfence: 412.56 cycles/op | |
lockadd: 331.98 cycles/op | |
xadd: 337.59 cycles/op | |
swap: 383.45 cycles/op | |
cmpxchg: 5916.89 cycles/op | |
lockadd_unalign: 733.50 cycles/op | |
---- | |
Intel Core i5-3427U (Ivy Bridge) @ 1.8GHz
interference type: none | |
add: 1.82 cycles/op | |
add_mfence: 45.40 cycles/op | |
lockadd: 19.11 cycles/op | |
xadd: 19.09 cycles/op | |
swap: 19.30 cycles/op | |
cmpxchg: 18.99 cycles/op | |
lockadd_unalign: 628.82 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.85 cycles/op | |
add_mfence: 45.42 cycles/op | |
lockadd: 19.13 cycles/op | |
xadd: 19.07 cycles/op | |
swap: 19.41 cycles/op | |
cmpxchg: 19.01 cycles/op | |
lockadd_unalign: 621.71 cycles/op | |
interference type: hyperthread_write_line | |
add: 1.82 cycles/op | |
add_mfence: 45.63 cycles/op | |
lockadd: 32.25 cycles/op | |
xadd: 19.08 cycles/op | |
swap: 19.41 cycles/op | |
cmpxchg: 19.01 cycles/op | |
lockadd_unalign: 613.85 cycles/op | |
interference type: other_core_read_line | |
add: 2.28 cycles/op | |
add_mfence: 87.28 cycles/op | |
lockadd: 21.03 cycles/op | |
xadd: 19.07 cycles/op | |
swap: 19.34 cycles/op | |
cmpxchg: 19.02 cycles/op | |
lockadd_unalign: 621.71 cycles/op | |
interference type: other_core_write_line | |
add: 1.82 cycles/op | |
add_mfence: 45.41 cycles/op | |
lockadd: 19.10 cycles/op | |
xadd: 19.05 cycles/op | |
swap: 19.30 cycles/op | |
cmpxchg: 18.99 cycles/op | |
lockadd_unalign: 628.41 cycles/op | |
interference type: three_cores_read_line | |
add: 2.28 cycles/op | |
add_mfence: 97.28 cycles/op | |
lockadd: 96.07 cycles/op | |
xadd: 96.23 cycles/op | |
swap: 94.17 cycles/op | |
cmpxchg: 96.80 cycles/op | |
lockadd_unalign: 619.27 cycles/op | |
interference type: three_cores_write_line | |
add: 4.29 cycles/op | |
add_mfence: 154.34 cycles/op | |
lockadd: 19.27 cycles/op | |
xadd: 19.15 cycles/op | |
swap: 19.36 cycles/op | |
cmpxchg: 19.01 cycles/op | |
lockadd_unalign: 645.76 cycles/op | |
---- | |
Intel Core i7-3770K (Ivy Bridge) @ 3.5GHz | |
interference type: none | |
add: 1.76 cycles/op | |
add_mfence: 44.08 cycles/op | |
lockadd: 18.46 cycles/op | |
xadd: 18.40 cycles/op | |
swap: 18.45 cycles/op | |
cmpxchg: 18.32 cycles/op | |
lockadd_unalign: 661.34 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.78 cycles/op | |
add_mfence: 43.83 cycles/op | |
lockadd: 113.62 cycles/op | |
xadd: 121.86 cycles/op | |
swap: 47.23 cycles/op | |
cmpxchg: 48.57 cycles/op | |
lockadd_unalign: 612.54 cycles/op | |
interference type: hyperthread_write_line | |
add: 6.45 cycles/op | |
add_mfence: 48.10 cycles/op | |
lockadd: 32.62 cycles/op | |
xadd: 31.40 cycles/op | |
swap: 34.96 cycles/op | |
cmpxchg: 121.29 cycles/op | |
lockadd_unalign: 647.97 cycles/op | |
interference type: other_core_read_line | |
add: 2.19 cycles/op | |
add_mfence: 88.60 cycles/op | |
lockadd: 76.83 cycles/op | |
xadd: 77.17 cycles/op | |
swap: 74.09 cycles/op | |
cmpxchg: 79.80 cycles/op | |
lockadd_unalign: 659.39 cycles/op | |
interference type: other_core_write_line | |
add: 3.85 cycles/op | |
add_mfence: 146.39 cycles/op | |
lockadd: 128.53 cycles/op | |
xadd: 126.04 cycles/op | |
swap: 117.97 cycles/op | |
cmpxchg: 294.79 cycles/op | |
lockadd_unalign: 691.28 cycles/op | |
interference type: three_cores_read_line | |
add: 2.27 cycles/op | |
add_mfence: 106.34 cycles/op | |
lockadd: 109.30 cycles/op | |
xadd: 109.28 cycles/op | |
swap: 112.08 cycles/op | |
cmpxchg: 109.78 cycles/op | |
lockadd_unalign: 674.16 cycles/op | |
interference type: three_cores_write_line | |
add: 13.57 cycles/op | |
add_mfence: 329.85 cycles/op | |
lockadd: 248.26 cycles/op | |
xadd: 270.56 cycles/op | |
swap: 267.62 cycles/op | |
cmpxchg: 3827.60 cycles/op | |
lockadd_unalign: 780.82 cycles/op | |
---- | |
Intel Core i5-4460 (Haswell) 3.20GHz | |
interference type: none | |
add: 1.41 cycles/op | |
add_mfence: 43.45 cycles/op | |
lockadd: 22.59 cycles/op | |
xadd: 21.18 cycles/op | |
swap: 21.88 cycles/op | |
cmpxchg: 21.18 cycles/op | |
lockadd_unalign: 586.52 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.41 cycles/op | |
add_mfence: 43.82 cycles/op | |
lockadd: 22.63 cycles/op | |
xadd: 21.56 cycles/op | |
swap: 105.71 cycles/op | |
cmpxchg: 21.56 cycles/op | |
lockadd_unalign: 583.86 cycles/op | |
interference type: hyperthread_write_line | |
add: 2.62 cycles/op | |
add_mfence: 43.70 cycles/op | |
lockadd: 88.30 cycles/op | |
xadd: 21.18 cycles/op | |
swap: 21.88 cycles/op | |
cmpxchg: 21.18 cycles/op | |
lockadd_unalign: 585.65 cycles/op | |
interference type: other_core_read_line | |
add: 1.42 cycles/op | |
add_mfence: 44.93 cycles/op | |
lockadd: 94.54 cycles/op | |
xadd: 106.89 cycles/op | |
swap: 22.39 cycles/op | |
cmpxchg: 22.20 cycles/op | |
lockadd_unalign: 582.73 cycles/op | |
interference type: other_core_write_line | |
add: 2.40 cycles/op | |
add_mfence: 47.28 cycles/op | |
lockadd: 117.44 cycles/op | |
xadd: 21.23 cycles/op | |
swap: 22.07 cycles/op | |
cmpxchg: 21.18 cycles/op | |
lockadd_unalign: 585.65 cycles/op | |
interference type: three_cores_read_line | |
add: 1.90 cycles/op | |
add_mfence: 119.15 cycles/op | |
lockadd: 104.99 cycles/op | |
xadd: 105.29 cycles/op | |
swap: 115.61 cycles/op | |
cmpxchg: 107.39 cycles/op | |
lockadd_unalign: 582.31 cycles/op | |
interference type: three_cores_write_line | |
add: 2.59 cycles/op | |
add_mfence: 47.06 cycles/op | |
lockadd: 22.64 cycles/op | |
xadd: 118.95 cycles/op | |
swap: 118.90 cycles/op | |
cmpxchg: 21.57 cycles/op | |
lockadd_unalign: 584.93 cycles/op | |
---- | |
Intel Core i5-4670K (Haswell), stock clocks [i.e. 3.4GHz], turbos to 4GHz on all 4 cores during run: | |
interference type: none | |
add: 1.28 cycles/op | |
add_mfence: 39.19 cycles/op | |
lockadd: 20.43 cycles/op | |
xadd: 19.13 cycles/op | |
swap: 19.76 cycles/op | |
cmpxchg: 19.13 cycles/op | |
lockadd_unalign: 574.70 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.64 cycles/op | |
add_mfence: 102.07 cycles/op | |
lockadd: 69.59 cycles/op | |
xadd: 19.17 cycles/op | |
swap: 19.80 cycles/op | |
cmpxchg: 19.27 cycles/op | |
lockadd_unalign: 574.58 cycles/op | |
interference type: hyperthread_write_line | |
add: 1.28 cycles/op | |
add_mfence: 39.19 cycles/op | |
lockadd: 20.43 cycles/op | |
xadd: 19.13 cycles/op | |
swap: 19.76 cycles/op | |
cmpxchg: 19.13 cycles/op | |
lockadd_unalign: 575.42 cycles/op | |
interference type: other_core_read_line | |
add: 1.28 cycles/op | |
add_mfence: 39.29 cycles/op | |
lockadd: 20.44 cycles/op | |
xadd: 19.16 cycles/op | |
swap: 19.80 cycles/op | |
cmpxchg: 19.14 cycles/op | |
lockadd_unalign: 575.71 cycles/op | |
interference type: other_core_write_line | |
add: 1.28 cycles/op | |
add_mfence: 39.19 cycles/op | |
lockadd: 20.40 cycles/op | |
xadd: 19.16 cycles/op | |
swap: 19.76 cycles/op | |
cmpxchg: 19.13 cycles/op | |
lockadd_unalign: 578.42 cycles/op | |
interference type: three_cores_read_line | |
add: 1.28 cycles/op | |
add_mfence: 39.20 cycles/op | |
lockadd: 20.44 cycles/op | |
xadd: 19.16 cycles/op | |
swap: 19.80 cycles/op | |
cmpxchg: 19.16 cycles/op | |
lockadd_unalign: 573.75 cycles/op | |
interference type: three_cores_write_line | |
add: 2.66 cycles/op | |
add_mfence: 130.16 cycles/op | |
lockadd: 97.48 cycles/op | |
xadd: 118.87 cycles/op | |
swap: 66.55 cycles/op | |
cmpxchg: 270.15 cycles/op | |
lockadd_unalign: 575.15 cycles/op | |
---- | |
Intel Core i7-4770 (Haswell) CPU @ 3.40GHz | |
interference type: none | |
add: 1.77 cycles/op | |
add_mfence: 46.09 cycles/op | |
lockadd: 19.41 cycles/op | |
xadd: 20.31 cycles/op | |
swap: 22.08 cycles/op | |
cmpxchg: 19.67 cycles/op | |
lockadd_unalign: 704.17 cycles/op | |
interference type: hyperthread_read_line | |
add: 1.78 cycles/op | |
add_mfence: 46.32 cycles/op | |
lockadd: 18.06 cycles/op | |
xadd: 18.07 cycles/op | |
swap: 21.56 cycles/op | |
cmpxchg: 17.87 cycles/op | |
lockadd_unalign: 655.86 cycles/op | |
interference type: hyperthread_write_line | |
add: 10.06 cycles/op | |
add_mfence: 50.14 cycles/op | |
lockadd: 47.36 cycles/op | |
xadd: 39.90 cycles/op | |
swap: 30.60 cycles/op | |
cmpxchg: 111.52 cycles/op | |
lockadd_unalign: 817.70 cycles/op | |
interference type: other_core_read_line | |
add: 2.07 cycles/op | |
add_mfence: 103.08 cycles/op | |
lockadd: 113.44 cycles/op | |
xadd: 114.60 cycles/op | |
swap: 107.72 cycles/op | |
cmpxchg: 106.50 cycles/op | |
lockadd_unalign: 715.33 cycles/op | |
interference type: other_core_write_line | |
add: 3.12 cycles/op | |
add_mfence: 174.72 cycles/op | |
lockadd: 119.09 cycles/op | |
xadd: 119.09 cycles/op | |
swap: 116.53 cycles/op | |
cmpxchg: 376.28 cycles/op | |
lockadd_unalign: 742.58 cycles/op | |
interference type: three_cores_read_line | |
add: 2.08 cycles/op | |
add_mfence: 121.34 cycles/op | |
lockadd: 163.34 cycles/op | |
xadd: 162.96 cycles/op | |
swap: 161.23 cycles/op | |
cmpxchg: 151.60 cycles/op | |
lockadd_unalign: 741.54 cycles/op | |
interference type: three_cores_write_line | |
add: 7.59 cycles/op | |
add_mfence: 314.44 cycles/op | |
lockadd: 238.19 cycles/op | |
xadd: 238.33 cycles/op | |
swap: 237.93 cycles/op | |
cmpxchg: 2635.58 cycles/op | |
lockadd_unalign: 833.31 cycles/op | |
----
AMD A10-4600M (2.30GHz) with 8GB RAM
interference type: none
add: 1.71 cycles/op
dependent_adds: 0.85 cycles/op
add_mfence: 81.16 cycles/op
lockadd: 36.80 cycles/op
xadd: 35.96 cycles/op
swap: 35.95 cycles/op
cmpxchg: 39.15 cycles/op
lockadd_unalign: 173.03 cycles/op
interference type: hyperthread_read_line
add: 1.73 cycles/op
dependent_adds: 0.85 cycles/op
add_mfence: 84.60 cycles/op
lockadd: 80.40 cycles/op
xadd: 61.32 cycles/op
swap: 80.49 cycles/op
cmpxchg: 81.13 cycles/op
lockadd_unalign: 212.81 cycles/op
interference type: hyperthread_write_line
add: 20.24 cycles/op
dependent_adds: 2.70 cycles/op
add_mfence: 150.56 cycles/op
lockadd: 120.42 cycles/op
xadd: 126.89 cycles/op
swap: 48.89 cycles/op
cmpxchg: 46.72 cycles/op
lockadd_unalign: 254.07 cycles/op
interference type: other_core_read_line
add: 5.24 cycles/op
dependent_adds: 0.85 cycles/op
add_mfence: 302.17 cycles/op
lockadd: 48.86 cycles/op
xadd: 48.45 cycles/op
swap: 45.28 cycles/op
cmpxchg: 52.77 cycles/op
lockadd_unalign: 385.89 cycles/op
interference type: other_core_write_line
add: 2.46 cycles/op
dependent_adds: 5.90 cycles/op
add_mfence: 321.18 cycles/op
lockadd: 37.93 cycles/op
xadd: 48.45 cycles/op
swap: 47.70 cycles/op
cmpxchg: 43.81 cycles/op
lockadd_unalign: 272.36 cycles/op
interference type: three_cores_read_line
add: 5.26 cycles/op
dependent_adds: 0.85 cycles/op
add_mfence: 323.13 cycles/op
lockadd: 270.49 cycles/op
xadd: 50.26 cycles/op
swap: 305.33 cycles/op
cmpxchg: 52.17 cycles/op
lockadd_unalign: 443.51 cycles/op
interference type: three_cores_write_line
add: 32.33 cycles/op
dependent_adds: 0.85 cycles/op
add_mfence: 251.92 cycles/op
lockadd: 133.34 cycles/op
xadd: 50.48 cycles/op
swap: 400.78 cycles/op
cmpxchg: 52.03 cycles/op
lockadd_unalign: 597.53 cycles/op
----
AMD Opteron 6272 (16 module @ 2.1GHz)
interference type: none
add: 2.19 cycles/op
add_mfence: 95.01 cycles/op
lockadd: 56.15 cycles/op
xadd: 53.15 cycles/op
swap: 53.14 cycles/op
cmpxchg: 54.68 cycles/op
lockadd_unalign: 247.77 cycles/op
interference type: hyperthread_read_line
add: 2.50 cycles/op
add_mfence: 100.42 cycles/op
lockadd: 138.72 cycles/op
xadd: 97.43 cycles/op
swap: 108.15 cycles/op
cmpxchg: 143.43 cycles/op
lockadd_unalign: 293.73 cycles/op
interference type: hyperthread_write_line
add: 36.76 cycles/op
add_mfence: 171.70 cycles/op
lockadd: 226.86 cycles/op
xadd: 222.19 cycles/op
swap: 226.44 cycles/op
cmpxchg: 4643.72 cycles/op
lockadd_unalign: 348.22 cycles/op
interference type: other_core_read_line
add: 86.59 cycles/op
add_mfence: 548.38 cycles/op
lockadd: 453.45 cycles/op
xadd: 435.06 cycles/op
swap: 453.95 cycles/op
cmpxchg: 502.23 cycles/op
lockadd_unalign: 758.78 cycles/op
interference type: other_core_write_line
add: 75.59 cycles/op
add_mfence: 421.52 cycles/op
lockadd: 445.01 cycles/op
xadd: 416.42 cycles/op
swap: 405.71 cycles/op
cmpxchg: 54.79 cycles/op
lockadd_unalign: 687.30 cycles/op
interference type: three_cores_read_line
add: 186.46 cycles/op
add_mfence: 921.73 cycles/op
lockadd: 767.26 cycles/op
xadd: 752.58 cycles/op
swap: 769.43 cycles/op
cmpxchg: 885.06 cycles/op
lockadd_unalign: 1240.31 cycles/op
interference type: three_cores_write_line
add: 214.71 cycles/op
add_mfence: 1095.98 cycles/op
lockadd: 875.85 cycles/op
xadd: 865.22 cycles/op
swap: 887.61 cycles/op
cmpxchg: 8397.39 cycles/op
lockadd_unalign: 1399.55 cycles/op
----
Intel i5 760
Results: http://pastie.org/9429871