Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
JS simple hash function
var hash = function(s) {
/* Simple hash function. */
var a = 1, c = 0, h, o;
if (s) {
a = 0;
/*jshint plusplus:false bitwise:false*/
for (h = s.length - 1; h >= 0; h--) {
o = s.charCodeAt(h);
a = (a<<6&268435455) + o + (o<<14);
c = a & 266338304;
a = c!==0?a^c>>21:a;
}
}
return String(a);
};
@florianfrey1

This comment has been minimized.

Copy link

@florianfrey1 florianfrey1 commented May 16, 2018

Great hash function! Works very good.
What's the name of this function?

@bryc

This comment has been minimized.

Copy link

@bryc bryc commented Sep 4, 2018

Seems to be from ga.js (Google Analytics), and was used to hash domain name strings. It's not that good honestly. Distribution and collision resistance is very poor. Only useful for its original purpose: hashing domain name strings.

For example, hash for "a" is 00184061 and "b" is 00188062. "revenge" is 097599b9 and "revenue" is 0575be19. They have many common bits, which implies bad mixing. And collisions have higher-than-chance probability: "1zbN" and "1t6g" = 00e8b236, "RivW" and "R1G7" = 10063cd2.

Did a quick sanity test:

Test: Single rolling bit over 4096 bytes (FAILED) Collisions: 10224 of 10240 (100%)
Test: Incrementing bytes to 255          (FAILED) Collisions: 8829 of 18360 (48%)
Test: Four unique random ASCII chars     (FAILED) Collisions: 484 of 102401 (0%)

Fails most tests, but 484 in 102,401 is not bad. I can see why they used it for domain strings. But an even simpler function is faster AND better:

var funhash = function(s) {
    for(var i = 0, h = 0xdeadbeef; i < s.length; i++)
        h = Math.imul(h ^ s.charCodeAt(i), 2654435761);
    return (h ^ h >>> 16) >>> 0;
};
@Undefitied

This comment has been minimized.

Copy link

@Undefitied Undefitied commented Sep 19, 2019

@bryc, can you, please, share your test functions?

@bryc

This comment has been minimized.

Copy link

@bryc bryc commented Sep 20, 2019

@Undefitied Here you go. It's set to 4096 iterations per test, just for faster loading. And it also tests the bit distribution under "Bit averages range", where 50% is the expected value of true randomness. Just keep in mind these are just basic tests with small sample sizes. SMHasher should be better but you need to be a rocket scientist to use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment