Memory Compare Benchmarks with SIMD
// * Summary *
BenchmarkDotNet=v0.9.4.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz, ProcessorCount=4
Frequency=3218751 ticks, Resolution=310.6795 ns, Timer=TSC
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
JitModules=clrjit-v4.6.1078.0
Type=CompareAlgo Mode=Throughput
Method | Median | StdDev | Scaled | Size |
--------------------------- |------------ |----------- |------- |----- |
CompareBaseBandwidth4Simd | 9.0179 ns | 0.0309 ns | 1.17 | 8 |
CompareBaseTailSimd | 10.3019 ns | 0.0384 ns | 1.34 | 8 |
CompareBranchlessInnerSimd | 9.6574 ns | 0.0461 ns | 1.26 | 8 |
CompareBranchlessSimd | 10.6273 ns | 0.0712 ns | 1.38 | 8 |
CompareCurrent | 7.6871 ns | 0.0299 ns | 1.00 | 8 |
CompareLongTailSimd | 9.4816 ns | 0.0393 ns | 1.23 | 8 |
CompareNaiveSimd | 8.3890 ns | 0.0287 ns | 1.09 | 8 |
CompareBaseBandwidth4Simd | 9.8630 ns | 0.0402 ns | 1.16 | 16 |
CompareBaseTailSimd | 16.7932 ns | 0.1320 ns | 1.98 | 16 |
CompareBranchlessInnerSimd | 18.0723 ns | 0.3028 ns | 2.13 | 16 |
CompareBranchlessSimd | 18.2318 ns | 0.2505 ns | 2.15 | 16 |
CompareCurrent | 8.4955 ns | 0.0392 ns | 1.00 | 16 |
CompareLongTailSimd | 9.6173 ns | 0.0271 ns | 1.13 | 16 |
CompareNaiveSimd | 16.6113 ns | 0.1139 ns | 1.96 | 16 |
CompareBaseBandwidth4Simd | 9.8959 ns | 0.0431 ns | 1.16 | 17 |
CompareBaseTailSimd | 17.1973 ns | 0.0691 ns | 2.02 | 17 |
CompareBranchlessInnerSimd | 18.4301 ns | 0.3083 ns | 2.16 | 17 |
CompareBranchlessSimd | 18.0579 ns | 0.2576 ns | 2.12 | 17 |
CompareCurrent | 8.5160 ns | 0.0641 ns | 1.00 | 17 |
CompareLongTailSimd | 9.7069 ns | 0.0383 ns | 1.14 | 17 |
CompareNaiveSimd | 16.4413 ns | 0.0779 ns | 1.93 | 17 |
CompareBaseBandwidth4Simd | 10.8656 ns | 0.0749 ns | 0.99 | 32 |
CompareBaseTailSimd | 14.7042 ns | 0.0471 ns | 1.34 | 32 |
CompareBranchlessInnerSimd | 18.9323 ns | 0.3837 ns | 1.73 | 32 |
CompareBranchlessSimd | 19.2875 ns | 0.3977 ns | 1.76 | 32 |
CompareCurrent | 10.9340 ns | 0.0402 ns | 1.00 | 32 |
CompareLongTailSimd | 10.9599 ns | 0.0777 ns | 1.00 | 32 |
CompareNaiveSimd | 14.4782 ns | 0.0769 ns | 1.32 | 32 |
CompareBaseBandwidth4Simd | 10.9628 ns | 0.0772 ns | 1.01 | 33 |
CompareBaseTailSimd | 14.7349 ns | 0.0523 ns | 1.35 | 33 |
CompareBranchlessInnerSimd | 18.6459 ns | 0.2427 ns | 1.71 | 33 |
CompareBranchlessSimd | 19.2408 ns | 0.4033 ns | 1.77 | 33 |
CompareCurrent | 10.8893 ns | 0.0402 ns | 1.00 | 33 |
CompareLongTailSimd | 10.9834 ns | 0.0736 ns | 1.01 | 33 |
CompareNaiveSimd | 14.3373 ns | 0.0532 ns | 1.32 | 33 |
CompareBaseBandwidth4Simd | 12.7401 ns | 0.1349 ns | 0.82 | 64 |
CompareBaseTailSimd | 17.0544 ns | 0.0814 ns | 1.09 | 64 |
CompareBranchlessInnerSimd | 22.1344 ns | 0.1691 ns | 1.42 | 64 |
CompareBranchlessSimd | 21.6692 ns | 0.3149 ns | 1.39 | 64 |
CompareCurrent | 15.5970 ns | 0.0753 ns | 1.00 | 64 |
CompareLongTailSimd | 13.7588 ns | 0.0898 ns | 0.88 | 64 |
CompareNaiveSimd | 16.9833 ns | 0.1060 ns | 1.09 | 64 |
CompareBaseBandwidth4Simd | 18.4544 ns | 0.2709 ns | 0.68 | 128 |
CompareBaseTailSimd | 23.8798 ns | 0.2336 ns | 0.88 | 128 |
CompareBranchlessInnerSimd | 30.2087 ns | 0.2617 ns | 1.12 | 128 |
CompareBranchlessSimd | 28.0667 ns | 0.5955 ns | 1.04 | 128 |
CompareCurrent | 27.0001 ns | 0.1498 ns | 1.00 | 128 |
CompareLongTailSimd | 21.2962 ns | 0.2447 ns | 0.79 | 128 |
CompareNaiveSimd | 23.5978 ns | 0.1681 ns | 0.87 | 128 |
CompareBaseBandwidth4Simd | 30.2047 ns | 0.4535 ns | 0.58 | 256 |
CompareBaseTailSimd | 41.9658 ns | 0.3931 ns | 0.80 | 256 |
CompareBranchlessInnerSimd | 57.6410 ns | 0.5105 ns | 1.10 | 256 |
CompareBranchlessSimd | 40.8807 ns | 0.6340 ns | 0.78 | 256 |
CompareCurrent | 52.1947 ns | 2.1092 ns | 1.00 | 256 |
CompareLongTailSimd | 50.1272 ns | 3.5173 ns | 0.96 | 256 |
CompareNaiveSimd | 47.5215 ns | 2.5930 ns | 0.91 | 256 |
CompareBaseBandwidth4Simd | 57.9281 ns | 0.3686 ns | 0.73 | 512 |
CompareBaseTailSimd | 65.6032 ns | 0.5260 ns | 0.82 | 512 |
CompareBranchlessInnerSimd | 80.1848 ns | 3.4445 ns | 1.01 | 512 |
CompareBranchlessSimd | 62.6676 ns | 0.7027 ns | 0.79 | 512 |
CompareCurrent | 79.6960 ns | 0.3297 ns | 1.00 | 512 |
CompareLongTailSimd | 63.8368 ns | 2.9785 ns | 0.80 | 512 |
CompareNaiveSimd | 68.2715 ns | 1.8770 ns | 0.86 | 512 |
CompareBaseBandwidth4Simd | 102.4053 ns | 0.5398 ns | 0.70 | 1024 |
CompareBaseTailSimd | 114.2047 ns | 0.9333 ns | 0.78 | 1024 |
CompareBranchlessInnerSimd | 137.2662 ns | 5.9032 ns | 0.94 | 1024 |
CompareBranchlessSimd | 112.4247 ns | 0.6679 ns | 0.77 | 1024 |
CompareCurrent | 145.7933 ns | 0.5189 ns | 1.00 | 1024 |
CompareLongTailSimd | 116.5475 ns | 2.1035 ns | 0.80 | 1024 |
CompareNaiveSimd | 115.6135 ns | 0.6733 ns | 0.79 | 1024 |
CompareBaseBandwidth4Simd | 204.6726 ns | 1.0604 ns | 0.67 | 2048 |
CompareBaseTailSimd | 231.2622 ns | 1.4042 ns | 0.76 | 2048 |
CompareBranchlessInnerSimd | 244.0345 ns | 5.2983 ns | 0.80 | 2048 |
CompareBranchlessSimd | 232.6503 ns | 0.8428 ns | 0.77 | 2048 |
CompareCurrent | 303.9156 ns | 0.7018 ns | 1.00 | 2048 |
CompareLongTailSimd | 230.5062 ns | 1.5599 ns | 0.76 | 2048 |
CompareNaiveSimd | 231.6967 ns | 0.8536 ns | 0.76 | 2048 |
CompareBaseBandwidth4Simd | 327.3108 ns | 1.8075 ns | 0.65 | 4096 |
CompareBaseTailSimd | 374.1856 ns | 1.9902 ns | 0.74 | 4096 |
CompareBranchlessInnerSimd | 415.2377 ns | 17.2633 ns | 0.83 | 4096 |
CompareBranchlessSimd | 374.4198 ns | 1.3948 ns | 0.74 | 4096 |
CompareCurrent | 503.2272 ns | 1.8815 ns | 1.00 | 4096 |
CompareLongTailSimd | 379.7526 ns | 2.3612 ns | 0.75 | 4096 |
CompareNaiveSimd | 375.3323 ns | 1.4686 ns | 0.75 | 4096 |
// ***** BenchmarkRunner: End *****