Skip to content

Instantly share code, notes, and snippets.

@ddjerqq
Last active June 22, 2023 14:52
Show Gist options
  • Save ddjerqq/599ae9f8590f6bd5c87b8160d66f56b7 to your computer and use it in GitHub Desktop.
Save ddjerqq/599ae9f8590f6bd5c87b8160d66f56b7 to your computer and use it in GitHub Desktop.
The hash ring, used for distributing items evenly across shards / threads / processes

Hash Ring

the hash ring, use this to distribute items evenly accross shards / threads / processes

     *  0  *
   *         *
 *             *
270            90
  *            *
   X         *
   | * 180 *
   |----^

lets say that we have 4 shards. see ASCII ring above ^ as you can see the ring, we have four points, exactly at 90 degree offsets (total_degrees == 360 / num_shards == 4).

there are four shards in this example.

now lets sat that we want to find out which shard the item X belongs to. there are three steps to this.

    1. modulo the X by the total number of degrees (360). lets say the X is a user id - 725773984808960050. if we mod that number by 360 we will get 250
    1. find the largest shard degree, which is lower than the remainder. in this case it will be 180.
    1. get the index of this offset on the circle, by dividing it by the circle's regular offset (90 by default). we will get: 2 (3rd shard)
import typing as t
class HashRing:
"""
the hash ring, use this to distribute items evenly across shards / threads / processes
* 0 *
* *
* *
270 90
* *
X *
| * 180 *
|----^
lets say that we have 4 shards. see ASCII ring above ^
as you can see the ring, we have four points, exactly at 90 degree offsets
(total_degrees == 360 / num_shards == 4).
there are four shards in this example.
now lets sat that we want to find out which shard the item X belongs to.
there are three steps to this.
- 1. modulo the X by the total number of degrees (360).
lets say the X is a user id - 725773984808960050.
if we mod that number by 360 we will get 250
- 2. find the largest shard degree, which is lower than the remainder.
in this case it will be 180.
- 3. get the index of this offset on the circle, by dividing it by the
circle's regular offset (90 by default).
we will get: 2 (3rd shard)
"""
__DEGREES = 360
"""
the number of degrees in the circle. must be more than `shard_count`
"""
def __init__(self, shard_count: int, *, _ring_degrees: t.Optional[int] = None) -> None:
"""
create a HashCircle, with the given total shards.
optionally supply a parameter for maximum ring degrees.
360 is usually a very divisible number, but if you want something
larger or more customized, overide the parameters.
Args:
:shard_count int: the number of shards that are going to be on the ring.
:_ring_degrees int: override the degrees available on the circle. useful for when you have
more than 360 (default) shards.
"""
self._shard_count = shard_count
if _ring_degrees is not None:
self.__DEGREES = _ring_degrees
@property
def _offset(self) -> int:
return self.degrees // self._shard_count
@property
def _shards(self) -> list[int]:
yield from map(lambda i: self._offset * i, range(self._shard_count))
@property
def degrees(self) -> int:
"""
get the number of degrees on the current ring.
unless overriden, always returns 360, otherwise returns the overriden amount.
"""
return self.__DEGREES
def get_shard_for_item(self, item: int) -> int:
"""
given an item (integer), determine which shard it belongs to.
see the docstring on the class for more info about how this works.
Args:
:item int: the item, must be a positive integer. the size does not matter,
as we take the modulo of it by the amount of degrees in the current ring.
"""
# get the smallest number which is above the item from the list;
item %= self.degrees
min_shard = next(self._shards)
for shard in self._shards:
if shard <= item:
min_shard = shard
return min_shard // self._offset
def is_item_for_shard(self, item: int, shard: int) -> bool:
"""
simplifies determination of item ownership among shards.
all this method does is check whether or not `get_shard_for_item(item) == shard`
Args:
:item int: the integer item which you want to determine ownership of the shard to.
:shard int: the shard to test against.
"""
return self.get_shard_for_item(item) == shard
import unittest
class HashRingTest(unittest.TestCase):
def test_default_value_equal_distribution(self):
n = 10
circle = HashRing(n)
shards = [0 for _ in range(n)]
for i in range(circle.degrees * n):
shard = circle.get_shard_for_item(i)
shards[shard] += 1
assert all(shard_count == circle.degrees for shard_count in shards)
def test_overriden_circle_degrees_works(self):
n = 400
circle = HashRing(n, _ring_degrees=n)
shards = [circle.get_shard_for_item(i) for i in range(n)]
assert all(i == shard for i, shard in enumerate(shards)), shards
if __name__ == "__main__":
unittest.main()
@ddjerqq
Copy link
Author

ddjerqq commented Jun 22, 2023

see the unit tests for usage hints!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment