@antirez
Last active December 30, 2015 23:38
There are five nodes, using master-slave replication. When we start, A is the master.
The nodes are capable of synchronous replication: when a client writes,
it gets as reply the number of replicas that received the write. A client
can consider a write accepted only when "3" or more is returned; otherwise
the result is undetermined (false negatives are possible).
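
The acceptance rule above can be sketched in a few lines. This is only an
illustration: the names WriteOutcome, QUORUM and classify_write are made up
for this sketch, and it assumes the reply is a plain integer count that is
compared against the "3" from the text.

```python
from enum import Enum


class WriteOutcome(Enum):
    ACCEPTED = "accepted"          # the client may rely on the write
    UNDETERMINED = "undetermined"  # the write may or may not survive


# Threshold taken from the text: "3" or more means accepted.
QUORUM = 3


def classify_write(ack_count: int, quorum: int = QUORUM) -> WriteOutcome:
    """Classify a write from the count returned by the master (hypothetical helper)."""
    if ack_count >= quorum:
        return WriteOutcome.ACCEPTED
    # Fewer acknowledgements is not a definite failure: the write may still
    # have reached some replicas, which is the "false negative" case.
    return WriteOutcome.UNDETERMINED
```

Note that there is deliberately no "rejected" outcome: a low count tells the
client nothing definite about the fate of the write.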
Every node has a serial number, called the replication offset. It is always
incremented as the replication stream is processed by a replica. Replicas
are capable of telling an external entity, called "the controller", the
replication offset they have processed so far.
At some point the controller declares that the current master is down
and decides that a failover is needed, so the master role is switched to another
node, in this way:
1) The controller completely partitions away the current master.
2) The controller selects, out of a majority of replicas that are still
available, the one with the highest replication offset.
3) The controller tells all the reachable slaves what the new master is:
the slaves start to get new data from the new master.
4) The controller finally reconfigures all the clients to write to the new master.
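
The four steps above could be sketched as follows. This is a toy model, not
an implementation: Node, failover and their fields are hypothetical names,
the "majority" is assumed to be computed over the whole cluster (3 out of 5),
and ties on the offset are broken by picking the first candidate found.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    offset: int                   # replication offset processed so far
    reachable: bool = True        # as seen by the controller
    master: Optional[str] = None  # None means "I am the master"


def failover(nodes: Dict[str, Node], old_master: str) -> Optional[str]:
    """Return the name of the new master, or None if no majority is reachable."""
    # Step 1: partition away the current master.
    nodes[old_master].reachable = False

    # Step 2: among the reachable replicas, require a majority and pick
    # the one with the highest replication offset.
    candidates = [n for n in nodes.values()
                  if n.name != old_master and n.reachable]
    majority = len(nodes) // 2 + 1  # assumption: majority of the whole cluster
    if len(candidates) < majority:
        return None
    new_master = max(candidates, key=lambda n: n.offset)

    # Step 3: tell every reachable slave who the new master is.
    for n in candidates:
        n.master = None if n is new_master else new_master.name

    # Step 4: the caller reconfigures the clients with the returned name.
    return new_master.name
```

For example, with A as the failed master, E unreachable, and B, C, D at
offsets 90, 100 and 95, the sketch promotes C and points B and D at it.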
So everything starts again. We assume that a re-appearing master, or other
slaves that become available again after partitions heal, are capable of
understanding what the new master is. However, neither the old master nor the slaves
can accept writes. Slaves are read-only, while the re-appearing master will
not be able to return the majority on writes, so the client will not be able
to consider the writes accepted.
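
To illustrate why a re-appearing master cannot get a write accepted, here is
a small sketch. It assumes that only replicas currently following a given
master acknowledge its writes; acks_for, REQUIRED_ACKS and the "following"
map are hypothetical names used only for this example.

```python
from typing import Dict

# Threshold from the text: a write counts as accepted at 3 or more.
REQUIRED_ACKS = 3


def acks_for(master: str, following: Dict[str, str]) -> int:
    """Count the replicas that would acknowledge a write sent to `master`.

    `following` maps each replica name to the master it replicates from.
    """
    return sum(1 for m in following.values() if m == master)


# State after a failover from A to C: every slave now follows C.
following = {"B": "C", "D": "C", "E": "C"}

old_master_acks = acks_for("A", following)  # nobody follows A any more
new_master_acks = acks_for("C", following)

# A stale client still talking to A can never see its write accepted.
old_master_can_accept = old_master_acks >= REQUIRED_ACKS
new_master_can_accept = new_master_acks >= REQUIRED_ACKS
```

In this state the old master collects no acknowledgements at all, so any
write sent to it stays in the "not accepted" category for the client.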
In this model, is it possible to reach linearizability? I believe yes,
because we removed all the hard parts, for which strong protocols like
Raft use epochs.