Skip to content

Instantly share code, notes, and snippets.

@eplawless
Last active March 13, 2024 16:06
Show Gist options
  • Star 5 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save eplawless/52813b1d8ad9af510d85 to your computer and use it in GitHub Desktop.
Save eplawless/52813b1d8ad9af510d85 to your computer and use it in GitHub Desktop.
function hash(str) {
var len = str.length;
var hash = 5381;
for (var idx = 0; idx < len; ++idx) {
hash = 33 * hash + str.charCodeAt(idx);
}
return hash;
}
@polarathene
Copy link

polarathene commented Jul 8, 2020

Just want to point out this is not reliable in JS due to numbers being 53-bit floats not 32-bit ints where overflow can be relied on instead of losing precision.

Your hash = 33 * hash + str.charCodeAt(idx) will result in hash("hello") return value of 2osu1y1l(via result.toString(36), base36 is the highest radix supported via this method, charset is 0-9a-v)

If you use another variant that does bitshift arithmetic for multiplication, hash = (hash << 5) + hash) + str.charCodeAt(idx), you'll get: 4bj995. To get the same value from your method, instead of return hash, use return hash >>> 0, this will convert to a 32-bit positive number(uint instead of int). This doesn't help with long enough inputs however as the number in the loop is not kept at/near 32-bit, if the iterations is long enough, you can end up with Infinity..

You'll still want to use return hash >>> 0 with the bitshift arithmetic version too, as once the hash value in the loop exceeds 32-bits, you'll have similar problems. The reason is that the bitshift will keep it within the 32-bit range as bitwise operators in JS work on 32-bits, however, addition follows right after it and that can push the number over 32-bit range..

There is an improved variant of djb2 that uses XOR(^) which as another bitwise operator will keep the JS value within the 32-bit range, it does not have the equivalent effect of addition however, so results will not match prior approaches, for "hello", this time you'll get a base36 value of "2y0dev".

TL;DR: Here's what you want to use in JS:

function djb2_xor(str) {
 let len = str.length
 let h = 5381

 for (let i = 0; i < len; i++) {
   h = h * 33 ^ str.charCodeAt(i)
  }
  return h >>> 0
}

Using h * 33 here is ok now, since the final operator ^ is truncating each iteration to 32-bits. The following should help validate that:

let h = Math.pow(2, 32)-1

(h * 33) // 141733920735 == 0x20FFFFFFDF
((h << 5)+h) // 4294967263 == 0x00FFFFFFDF

// Either result above will be cast to 32-bit here:
h >>> 0 // 4294967263 == 0xFFFFFFDF

// If you used the XOR, you'd also get normalized results (122 == "Z")
(h * 33) ^ 122 // -91
((h << 5)+h) ^ 122 // -91

// Note how zero-fill right shift '>>> 0' converts to positive number
// Both values are the same binary value '11111111111111111111111110100101' or '0xFFFFFFA5'
// But as JS only has one number type to work with, the representation matters.
h // -91, toString(32) == "-2r"
h >>> 0 // 4294967205, toString(32) == "3vvvvt5"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment