eugene-eeo/starflake.md

## starflake.md

      
    Raw
  

              starflake.md
            
          
    *-flake

most distributed uid generation schemes are inspired by twitter's snowflake.
basic principle is that we generate ids that can be split into 3 chunks, TSM where:

T = time difference
S = server 'id'
M = server counter, in case we get the same T

ideas

uniqueness of S

use some global registry (probably the most practical solution),
and race with another server to register.
alternative:

join cluster
N = cluster size (can probably just query neighbour)
get consensus that you can register as server N + 1.

stable sorting

problem: ideally we want a global order on the ids.
this is ok, TSM is good enough.
stronger:
if I get id x1 before another id x2, I want x1 < x2.
this may not always hold, but most of the time it's fine.
most applications dont need this anyways.
sol 1 (cheat):

if we relax constraint to be only true for a client,
i.e. if client 1 gets x1 and makes request to get x2,
then x2 > x1.
then it is simpler to solve,
client need just to attach maximal id it has seen on requests for new ids.
server just need to guarantee that id it returns is larger than seen id.
this may help with clock drift too.
generate(x)
    while True:
        x' = generate_id()
        (T, S, M) = x
        (T', S', M') = x'
        if x' > x:
            return x'
        if T' < T:
            correct drift
        wait until T' can be > T

difficulty comes from the fact that we cannot just return T | S | M+1
because it may violate the core guarantee (universally unique).
pick large enough bits for T => small enough dt then waiting
for e.g. 10ms is probably not a big deal.
no cheating:

generate()
    x = generate_id()
    forward x to all servers with id < own id
    return x

upon receipt of x by server i, i uses the algorithm to generate
a strictly larger id as above.
this needs i to remember the largest id it has seen.
note that solution is probably not very practical;
in the time taken for s to broadcast x,
it is likely that T is no longer valid.
best effort => broadcast using UDP.