Skip to content

Instantly share code, notes, and snippets.

@gogsbread
Last active December 16, 2015 02:50
Show Gist options
  • Save gogsbread/5365827 to your computer and use it in GitHub Desktop.
Save gogsbread/5365827 to your computer and use it in GitHub Desktop.
Simple algorithm that returns the same shard irrespective of the number of new boxes added.
'''
Question:
Assume you want to shard big SQL user table. How will you obtain the same shard even after increasing the sql boxes.
Answer:
Same shard number can be obtained if the current system state is preserved before adding new boxes.
NOTE: Code below requires python 2.6 or 2.7
'''
MAX_USER_ID = 0# for demonstration. This value is normally obtained from DB or some global state.
total_boxes = 5 #initial number of boxes
system_states = [] # maintains system state as tuple (range,boxes).
def add_boxes(n):
global total_boxes
system_states.append((MAX_USER_ID,total_boxes)) # preserving system state before adding boxes
total_boxes += n
def get_shard(userId):# userid is assumed to be a sequentially increasing integer value.
for state in system_states:
userRange,boxes = state
if userId <= userRange: # check previous system states and use that to compute shard
return userId % boxes
return userId % total_boxes # this is a new userid whose value has not been seen yet. Use the current set of boxes to compute shard
if __name__ == '__main__':
print get_shard(999)
print get_shard(1000)
print get_shard(1001)
MAX_USER_ID = 1001 #we have added 1001 users so far.
add_boxes(3)#adding 3 more boxes
print 'After adding 3 more boxes'
print get_shard(999)
print get_shard(1000)
print get_shard(1001)
print get_shard(2000)
print get_shard(2007)
MAX_USER_ID = 2007 #we have added 2007 users so far.
add_boxes(5)
print 'After adding 5 more boxes'
print get_shard(999)
print get_shard(1000)
print get_shard(1001)
print get_shard(2000)
print get_shard(2007)
print get_shard(4867)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment