Skip to content

Instantly share code, notes, and snippets.

@eugene-eeo
Last active April 25, 2019 23:04
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save eugene-eeo/b480241899951940b3dc893ce4c23d24 to your computer and use it in GitHub Desktop.
Save eugene-eeo/b480241899951940b3dc893ce4c23d24 to your computer and use it in GitHub Desktop.
*-flake

*-flake

most distributed uid generation schemes are inspired by twitter's snowflake. basic principle is that we generate ids that can be split into 3 chunks, TSM where:

  • T = time difference
  • S = server 'id'
  • M = server counter, in case we get the same T

ideas

uniqueness of S

use some global registry (probably the most practical solution), and race with another server to register.

alternative:

  1. join cluster
  2. N = cluster size (can probably just query neighbour)
  3. get consensus that you can register as server N + 1.

stable sorting

problem: ideally we want a global order on the ids. this is ok, TSM is good enough.

stronger: if I get id x1 before another id x2, I want x1 < x2. this may not always hold, but most of the time it's fine. most applications dont need this anyways.

sol 1 (cheat):

if we relax constraint to be only true for a client, i.e. if client 1 gets x1 and makes request to get x2, then x2 > x1. then it is simpler to solve, client need just to attach maximal id it has seen on requests for new ids. server just need to guarantee that id it returns is larger than seen id. this may help with clock drift too.

generate(x)
    while True:
        x' = generate_id()
        (T, S, M) = x
        (T', S', M') = x'
        if x' > x:
            return x'
        if T' < T:
            correct drift
        wait until T' can be > T

difficulty comes from the fact that we cannot just return T | S | M+1 because it may violate the core guarantee (universally unique). pick large enough bits for T => small enough dt then waiting for e.g. 10ms is probably not a big deal.

no cheating:

generate()
    x = generate_id()
    forward x to all servers with id < own id
    return x

upon receipt of x by server i, i uses the algorithm to generate a strictly larger id as above. this needs i to remember the largest id it has seen.

note that solution is probably not very practical; in the time taken for s to broadcast x, it is likely that T is no longer valid. best effort => broadcast using UDP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment