Skip to content

Instantly share code, notes, and snippets.

@killercup
Created November 30, 2015 10:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save killercup/0d9765c47ebdfad5b8d7 to your computer and use it in GitHub Desktop.
Save killercup/0d9765c47ebdfad5b8d7 to your computer and use it in GitHub Desktop.
hash-rs benchmark
bytes horner sip farm xx fnv
1 33 33 18 24 500
2 66 66 37 44 666
4 129 117 66 105 800
8 258 228 115 190 727
16 470 390 186 326 551
32 680 603 157 551 426
64 1103 853 231 1000 349
128 1580 1057 329 1620 317
256 2169 1190 423 2392 309
512 2572 1270 489 3220 288
1024 2844 1301 554 3806 289
2048 2981 1332 582 4257 289
bytes horner sip farm xx fnv
1 30 30 53 41 2
2 30 30 53 45 3
4 31 34 60 38 5
8 31 35 69 42 11
16 34 41 86 49 29
32 47 53 203 58 75
64 58 75 276 64 183
128 81 121 389 79 403
256 118 215 604 107 827
512 199 403 1047 159 1775
1024 360 787 1848 269 3543
2048 687 1537 3518 481 7079
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz
machdep.cpu.family: 6
machdep.cpu.model: 23
machdep.cpu.extmodel: 1
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 10
machdep.cpu.feature_bits: 290732854951541759
machdep.cpu.extfeature_bits: 4832888832
machdep.cpu.signature: 67194
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 XSAVE
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF
machdep.cpu.logical_per_package: 2
machdep.cpu.cores_per_package: 2
machdep.cpu.microcode_version: 2571
machdep.cpu.processor_flag: 7
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.linesize_max: 64
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.sub_Cstates: 51520032
machdep.cpu.thermal.sensor: 1
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.invariant_APIC_timer: 0
machdep.cpu.thermal.thresholds: 2
machdep.cpu.thermal.ACNT_MCNT: 1
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.hardware_feedback: 1
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.xsave.extended_state: 3 576 576 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.arch_perf.version: 2
machdep.cpu.arch_perf.number: 2
machdep.cpu.arch_perf.width: 40
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.events: 0
machdep.cpu.arch_perf.fixed_number: 3
machdep.cpu.arch_perf.fixed_width: 40
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.L2_associativity: 12
machdep.cpu.cache.size: 3072
machdep.cpu.tlb.inst.small: 128
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.data.small: 16
machdep.cpu.tlb.data.small_level1: 256
machdep.cpu.tlb.data.large: 16
machdep.cpu.tlb.data.large_level1: 32
machdep.cpu.address_bits.physical: 36
machdep.cpu.address_bits.virtual: 48
machdep.cpu.core_count: 2
machdep.cpu.thread_count: 2
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.tsc_ccc.denominator: 0
<!DOCTYPE HTML>
<html>
<head>
<script src="//cdnjs.cloudflare.com/ajax/libs/dygraph/1.1.1/dygraph-combined.js"></script>
<style>
.dygraph-legend {
background: none !important;
}
p {
max-width: 60em;
}
</style>
</head>
<body id="inject">
<p>
These are benchmarks of how the different Rust hashing algorithm implementations
compare. In particular, they are using the Hasher interface, which may not be
optimal for all algorithms. Implementation quality may also vary.
</p>
<p>
Numbers are generated from a MacBook Pro (x64 OSX) with a 2.8 GHz Intel Core
i7 (Haswell) and 16GB of RAM. This is unfortunately quite important, as
even different cpu revisions by the same manufacturer (e.g. Haswell vs Sandy Bridge)
can produce significantly different results for these functions. As always,
bench your workload on your hardware for best results.
</p>
<p>
Variance bars excluded because this dang charting library was absolutely
freaking out with them. Note also that both axes are on a log-scale, so small
vertical differences may actually represent a large difference.
</p>
<p>
<a href="https://github.com/Gankro/hash-rs">Source Can be Found Here</a>
</p>
<p>
The hash functions on display are:
</p>
<p>
<strong>SipHash 2-4 (sip)</strong>: This is the state-of-the-art for hashing that is
completely resistant against even theoretical hash flooding DDOS attacks when
paired with a random seed. While most applications probably do not require
this, it's important for some usecases (for instance, many webservers can be
completely locked up by POST requests with colliding keys). Since this is a
relatively obscure attack, but quite costly when applicable, we believe that
it's important to default to such a hash function. As such, this is the default hasher
used by Rust's HashMap, but it can be overloaded. Sip's performance is overall
mediocre but consistent. As such, it's not an awful default.
</p>
<p>
Performance could be gained by moving to SipHash 2-3, which has slightly inferior
cryptographic properties, but is sufficient for preventing flooding.
</p>
<p>
<strong>Fowler-Noll-Vo (fnv)</strong>: A hash function with excellent distribution, but no particular
cryptographic properties. FNV is the best at small inputs, but does terribly on large
inputs. Since small inputs are much more common, it's the
Rust community's most popular override for SipHash where security doesn't matter.
The Rust compiler itself uses this function.
</p>
<p>
<strong>xxHash (xx)</strong>: Like fnv, a hash function with good distribution, but no crypto.
It's our best performer for large inputs, and performs strongly on small ones.
</p>
<p>
<strong>FarmHash (farm)</strong>: Another decent distribution, but no crypto. This algorithm is
intended to perform excellently on all sizes of input, but is massively pessimized
by Rust's streaming hashing architecture, as it's designed to process the input
in one shot, forcing the entire input to be buffered in a growable array. In order
for it to compete, Rust needs to adopt a solution that allows the top-level type
in a hashing to identify whether it can be hashed in a one-shot manner, for a
specialized result. However this can lead to confusing or inconsistent results,
particularly if hashing derivation is done incorrectly.
</p>
<p>
<strong>Horner (horner)</strong>: A universal hash function without the full cryptographic
strength of SipHash. It defends against all hash flooding attacks that have
been demonstrated in practice, but is theoretically vulnerable to a timing
attack if the attacker is allowed to continuously query against maps with the
same seed. It's unclear how practical this attack is under ideal conditions for
the attacker, but for most situations this hasher is likely sufficient.
Performance is roughly on-par with xxHash.
</p>
<p>
<strong>btree</strong> is not a hash function at all, but rather feeding the same
inputs into Rust's BTreeMap, which is based on comparisons instead of hashing. This
serves as a baseline, as well as a demonstration of how ordered data structures
can completely stomp unordered data structures under particular workloads, even
when not used for their ordering properties. <strong>btreenew</strong> is an
experimental rewrite of std's btree that is kicking its butt.
</p>
<script type="text/javascript">
function makeBench(name, title, xLabel) {
var benchDiv = document.createElement("div");
var timeDiv = makeDiv();
var tputDiv = makeDiv();
var heading = document.createElement("h1");
heading.textContent = title;
heading.style = "text-align: center;"
benchDiv.appendChild(heading);
benchDiv.appendChild(timeDiv);
benchDiv.appendChild(tputDiv);
document.getElementById("inject").appendChild(benchDiv);
makeGraph("Time (lower is better)",
xLabel,
"time (ns)",
name + "-time.csv",
timeDiv);
makeGraph("Throughput (higher is better)",
xLabel,
"throughput (MB/s)",
name + "-throughput.csv",
tputDiv);
}
function makeDiv() {
var div = document.createElement("div");
div.style = "display: inline-block; width:650px; height:600px;";
return div;
}
function makeGraph(title, xLabel, yLabel, csv, div) {
var graph = new Dygraph(
div,
csv,
{
logscale: true,
title: title,
xlabel: xLabel,
ylabel: yLabel,
strokeWidth: 2.0,
labelsDivWidth: 550,
legend: "always",
axes: {
x: { logscale: true, drawGrid: false },
y: { logscale: true, drawGrid: false },
},
series: {
sip: { color: "#cc0000" },
xx: { color: "#00cc00" },
fnv: { color: "#0000cc" },
farm: { color: "#cc00cc" },
murmur: { color: "#cccc00" },
city: { color: "#00cccc" },
btree: { color: "#000000" },
btreenew: { color: "#aaaaaa" },
horner: { color: "#884444" }
},
}
);
}
makeBench("bytes",
"Hashing an array of bytes",
"bytes hashed");
makeBench("mapcountdense",
"Counting number of occurrences of 1000 byte-strings (mostly duplicates)",
"bytes per string");
makeBench("mapcountsparse",
"Counting number of occurrences of 1000 byte-strings (mostly unique)",
"bytes per string");
</script>
</body>
</html>
bytes horner sip fnv btree farm btreenew xx
1 12 10 18 6 7 12 9
2 24 20 21 13 14 22 18
4 46 41 40 24 30 41 37
8 87 81 66 46 54 75 73
16 143 141 97 88 63 139 123
32 237 229 131 165 102 256 190
64 350 346 170 295 143 429 321
128 509 432 210 483 214 651 495
256 645 522 224 684 271 843 660
512 765 586 231 892 311 1033 792
1024 830 622 236 1038 347 1142 888
2048 870 638 237 1142 360 1193 948
bytes horner sip fnv btree farm btreenew xx
1 81192 92846 54473 152709 135611 77634 102479
2 81766 95924 92826 152815 134300 88553 106925
4 85223 96319 98993 162156 131309 96267 105380
8 91747 98551 119646 171744 147518 105624 109026
16 111351 112757 164668 180455 253230 115097 129439
32 134915 139416 242637 193075 311363 124625 167744
64 182515 184442 374761 216434 445425 148869 198950
128 250943 296146 606498 264545 595698 196451 258523
256 396601 490181 1138813 373793 940762 303500 387412
512 668390 873162 2204625 573532 1642835 495155 645892
1024 1232017 1642074 4317642 985668 2946019 895311 1150827
2048 2349093 3196236 8564622 1789690 5677068 1714705 2156567
bytes sip btreenew horner xx btree fnv farm
1 9 8 10 8 5 14 6
2 9 9 8 8 6 10 7
4 18 18 18 17 13 19 15
8 36 36 37 35 26 36 31
16 76 72 76 72 51 69 45
32 143 143 139 130 101 111 79
64 254 280 264 254 199 162 131
128 421 551 497 484 394 209 213
256 631 1084 855 859 783 246 303
512 864 2140 1323 1453 1549 268 380
1024 1059 4186 1862 2192 3039 281 455
2048 1179 7931 2297 2902 5838 280 481
bytes sip btreenew horner xx btree fnv farm
1 105825 115104 95163 117944 187610 67096 143903
2 212237 215113 227831 230713 303786 196365 253028
4 217854 216216 218441 226054 304685 209466 251443
8 220713 217515 216180 226857 306647 218186 253689
16 208325 220005 210505 221063 309961 230690 354871
32 222327 223744 229746 245215 315944 287830 400488
64 251644 227974 241627 251659 321075 394786 485787
128 303666 231872 257043 264062 324602 610071 598153
256 405614 236070 299058 297766 326664 1038513 843094
512 592230 239221 386750 352149 330353 1905839 1343383
1024 965682 244589 549505 467052 336911 3630006 2244850
2048 1733847 258136 890816 705677 350714 7293134 4237448
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment