Skip to content

Instantly share code, notes, and snippets.

@nficano
Created March 26, 2020 22:15
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save nficano/aa599dc62c039f67ca8ac106f25a1629 to your computer and use it in GitHub Desktop.
Save nficano/aa599dc62c039f67ca8ac106f25a1629 to your computer and use it in GitHub Desktop.
hash_functions

Hash Functions

A hash table is a data structure that implements an dictionary abstract data type, and is used to map keys to values.

array (key)
  ┌───┐
0 │ ⊙─┼─► value
  ├───┤
1 │ ⊙─┼─► value
  ├───┤
2 │   │  A Hash table is implemented by using a large array.
  ├───┤  This is usually called the table's "backing array
3 │   │  because it backs the hash table.
  ├───┤

To place an item A 4 │ │ into the backing array, ├───┤ calculate a hash value. 5 │ │ ├───┤ 6 │ │ ├╶╶╶│ ┌──────┐ │ │ "cab" ──► │ hash │ │ │ │ func │ │ │ └───┼──┘ │ │ Multiple items might have the same hash value. │ │ │ This is called a collision. There are 2 ways │ │ │ to handle collisions: chaining and linear probing. │ ├╶╶╶│ │ │ │ linked-list │ ├───┤ ┌───────┬───┐ ┌─────────┬───┐ └──► 98 │ ⊙─┼──► │ "cab" │ ⊙─┼──┼─► "cat" │ ⊙─┼────► the hash value is ├───┤ └───────┴───┘ └─────────┴───┘ then mod(%) with 99 │ │ the array size. └───┘

Additional Notes:

  • Load Factor is the Number of Entries in the Hash Table divided by the size of the backing array. The higher the load factor, the greater the probability of collisions.
  • A Load factor of 0.75 is generally considered a good point to double the size of the backing array.
  • When you resize the backing array, you have to remap all the entries in the table.

Requirements:

  1. Deterministic
  2. Evenly distributed
  3. Should use all values of the input

Hash Codes:

To generate hash codes, we use polynomials. We set x to a prime number, (e.g, 31, 53 or 101).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment