Skip to content

Instantly share code, notes, and snippets.

@otrack
Last active September 17, 2020 18:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save otrack/3655317f9d605e39b2d07635d7e90990 to your computer and use it in GitHub Desktop.
Save otrack/3655317f9d605e39b2d07635d7e90990 to your computer and use it in GitHub Desktop.

Fixing Egalitarian Paxos

Replication is a fundamental building block in modern computing infrastructures. This technique consists in de-duplicating data over several distributed machines, or replicas. It provides high availability, as data survives the failure of one or more replicas, and locality, clients having the possibility to access a nearby replica.

Unfortunately, the existing algorithms to coordinate access to replicated data are infamously difficult. Both failures and asynchrony make the interleaving of the processes unpredictable, complex, and often exponential in their numbers. In a recent work, it was established that Egalitarian Paxos (EPaxos) [1], a recent protocol for data replication which was published at the prestigious Symposium on Operating Systems (SOSP'13), is flawed [2]. The problem comes from the use of a single variable to track the decisions taken in the algorithm. As a result, processes may behave incorrectly, when concurrent accesses over replicated data occur in asynchrony periods, and break the guarantees offered by the protocol.

The goal of this research project is to come-up with a provably-correct fix for the algorithm.

Work plan

For starters, the student(s) will study the EPaxos algorithm in depth. This study will rely on the pseudo-code provided by the authors [1], an implementation in Go [3], as well as recent works [4] that deconstruct EPaxos into simpler building blocks. Then, a fix will be proposed to avoid the incorrect behaviors. The soundness of this fix will be established with the help of machine-guided tools (e.g., theorem proving or symbolic model checking).

The project is open to students from either IP Paris Masters (PDS, HPDA), or VAP ASR at Telecom SudParis.

Contact

Prof. Pierre Sutra

References

[1] I. Moraru, D. G. Andersen, and M. Kaminsky. There Is More Consensus in Egalitarian Parliaments. In Symposium on Operating Systems Principles (SOSP), 2013.
[2] P. Sutra. On the correctness of Egalitarian Paxos. Information. Processing Letters, 2020.
[3] https://github.com/efficient/epaxos
[4] T. França Rezende, P. Sutra, Leaderless State-Machine Replication: Specification, Properties, Limits, In International Symposium on Distributed Computing (DISC), 2020.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment