Blog 2020/8/20
<- previous | index | next ->
Let's learn how to implement a hash table in C! (See also part 1 and part 2).
In order to handle hash collisions, we can do a byte-for-byte comparison of the keys. To do this, each slot needs to store a reference to the key as well as its length.
struct _HTSlot {
+ void* keyp;
+ size_t keyLength;
size_t keyHash;
void* valuep;
struct _HTSlot* nextp;
};
typedef struct _HTSlot HTSlot;
HashTableGet
and HashTableSet
are both updated to perform a full byte-for-byte comparison of the keys.
-void* HashTableGet(HashTable* htp, size_t keyHash) {
+void* HashTableGet(HashTable* htp, void* keyp, size_t keyLength,
+ size_t keyHash
+) {
size_t slotIndex = keyHash % htp->capacity;
HTSlot* cursorp = htp->slotspp[slotIndex];
while (cursorp != NULL) {
- if (cursorp->keyHash == keyHash) {
+ if (cursorp->keyHash == keyHash
+ && cursorp->keyLength == keyLength
+ && memcmp(cursorp->keyp, keyp, keyLength) == 0
+ ) {
return cursorp->valuep;
} else {
cursorp = cursorp->nextp;
continue;
}
}
return NULL;
}
Additionally, HashTableSet
is updated to store the key pointer and key length.
-void HashTableSet(HashTable* htp, size_t keyHash, void* valuep) {
+void HashTableSet(HashTable* htp, void* keyp, size_t keyLength,
+ size_t keyHash, void* valuep
+) {
+ size_t slotIndex = keyHash % htp->capacity;
+ HTSlot* cursorp = htp->slotspp[slotIndex];
+ while (cursorp != NULL) {
+ if (cursorp->keyHash == keyHash
+ && cursorp->keyLength == keyLength
+ && memcmp(cursorp->keyp, keyp, keyLength) == 0
+ ) {
cursorp->valuep = valuep;
if (cursorp->valuep == NULL && valuep != NULL) {
htp->population++;
} else if (cursorp->valuep != NULL && valuep == NULL) {
htp->population--;
}
return;
} else {
cursorp = cursorp->nextp;
continue;
}
}
// no existing match? append a new slot to the list.
if (valuep != NULL) {
HTSlot* newSlotp = _HTSlotMake();
+ newSlotp->keyp = keyp;
+ newSlotp->keyLength = keyLength;
newSlotp->keyHash = keyHash;
newSlotp->valuep = valuep;
if (htp->slotspp[slotIndex] == NULL) {
htp->slotspp[slotIndex] = newSlotp;
} else {
HTSlot* lastp = htp->slotspp[slotIndex];
while (lastp->nextp != NULL) {
lastp = lastp->nextp;
continue;
}
lastp->nextp = newSlotp;
}
htp->population++;
}
}
Our hash table now handles hash collisions!
./tests
testMake... OK
testFree... OK
testSet... OK
testGet... OK
testSlotCollision... OK
testHashCollision... OK
However, this is a bit of a performance hit, as every operation performs a full memcmp
of the keys.
In a future part, we will look at adding a slightly unsafe set of functions which address this performance hit.
In part 4, we will look at automatically resizing the array of slot pointers.