@nealey
Last active May 29, 2022 12:33
djb / djb2 hash in awk
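# Return the numeric code of the single character c, or 256 if none matches.
# The extra parameters after the gap are awk's idiom for local variables.
function ord(c,    i) {
    for (i = 0; i < 256; i += 1) {
        if (sprintf("%c", i) == c) {
            return i
        }
    }
    return 256
}

# djb2: start at 5381, then h = h * 33 + c per character, truncated to 32 bits.
function hash(str,    h, pos, c) {
    h = 5381
    for (pos = 1; pos <= length(str); pos += 1) {
        c = ord(substr(str, pos, 1))
        h = (((h * 33) + c) % 4294967296)
    }
    return h
}

# Print the hash of every input line.
{
    print hash($0)
}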

nealey commented Aug 31, 2019

Does the world need this? Probably not.

$ awk -f hash.awk <<EOD
> h
> he
> hel
> hell
> hello
> EOD
177677
5863442
193493694
2090324714
261238937

PS: This is a very slow implementation.

@mogando668

It's only slow because ord() formats the byte values one at a time with sprintf until it finds a match, and it does that for every character. Build the lookup once, as an array or a string lookup, and fold in several characters before each mod-and-store operation, and that should help a bit.
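
A minimal sketch of the array-lookup idea, not a tuned implementation. The names `djb2_fast` and `ORD` are made up for illustration, and `LC_ALL=C` is an assumption so that awk treats every character as a single byte. The table is built once in `BEGIN`, so each character costs one array lookup instead of up to 256 `sprintf` calls. The batching of several characters per modulo is left out here.

```shell
# Hypothetical wrapper: reads lines on stdin, prints one djb2 hash per line.
djb2_fast() {
    LC_ALL=C awk '
    BEGIN {
        # Build the character-to-code table once, up front.
        for (i = 1; i < 256; i++) {
            ORD[sprintf("%c", i)] = i
        }
    }
    function hash(str,    h, pos, n) {
        h = 5381
        n = length(str)
        for (pos = 1; pos <= n; pos++) {
            # A character missing from the table counts as 0.
            h = (h * 33 + ORD[substr(str, pos, 1)]) % 4294967296
        }
        return h
    }
    { print hash($0) }
    '
}
```

For the inputs in the session above, `printf 'hello\n' | djb2_fast` should print the same 261238937 as the original script.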
