DJB2 is a very simple, yet powerful, hash function. I am currently implementing DEFLATE + Zlib in Rust. DEFLATE needs a hash function to keep track of duplicate strings. The fastest way we can do this without SIMD is impelementing it in Assembly.
DJB2 hash function, as defined in http://www.cse.yorku.ca/~oz/hash.html, is:
unsigned long
hash(unsigned char *str)
{
unsigned long hash = 5381;
int c;
while (c = *str++)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
return hash;
}
DJB2 relies on a magic number, 5381. It simply multiplies prime number 33 with this magic number and adds every single byte of the message to it.
The message size for my use is ony 3 bytes. However you can expand upon it based on your needs. There are two files in this Gist:
1- Aarch64/ARM64 DJB2 for 3 Bytes (djb2.aarch64.asm) 2- x86-64/AMD64 DJB2 for 3 bytes (djb2.amd64.asm)
In both these implementations, we will hash the message "djb" encoded in ASCII.
If we check with od
, we will see that:
$ echo -e -n 'djb' | od -t x1
0000000 64 6a 62
...the ASCII bytes for "djb" in decimal are 100, 106 and 98 respectively and a quick trip to the serpant will tell us what the final value should be:
$ python3 -c "h=5381;[exec(f'h=h*33+{c}', globals()) for c in b'djb'];print(h)"
193489493
These are very simple implementations, nothing to write home about. USE THEM AT YOUR OWN RISK. I am not responsible for future-proofness, performance, or functionality of this code. I am using it not for a piece of software I am paid for, rather, a project I am making for educational and recreational reasons. You've been warned.
isn't it
hash * 32 + c
instead of33
?