The standard way of implementing fault-tolerant distributed services is state-machine replication (SMR). In SMR, a service is defined by a deterministic state machine, and each process maintains its own local copy of the machine. Classical SMR protocols such as Paxos and Raft rely on a leader replica to order state-machine commands. The leader orchestrates a growing sequence of agreements, or consensus, each defining the next command to apply on the state machine. Such a scheme has however clear limitations, especially in a geo-distributed setting. First, it increases latency for clients that are far away from the leader. Second, as the leader becomes a bottleneck or its network gets slower, system performance decreases. Last, this approach harms availability because when the leader fails the whole system cannot serve new requests until an election takes place.
To sidestep the above limitations, a new class of leaderless state-machine replication (LSMR) protocols has recently emerged. These protocols allow any replica to make progress as long as it is able to contact enough of its peers. Compared with leader-based protocols, they typically exhibit higher performance and better availability. Unfortunately to date LSMR has been only applied successfully to the crash-failure model. Very few works exist when failures are arbitrary (aka., byzantine), as typically found in blockchain environments.
The goal of this project is to apply LSMR to the byzantine world, using it to implement an efficient distributed ledger.
The project is open to students from either IP Paris Masters (PDS, HPDA), or VAP ASR at Telecom SudParis. One or more students may work jointly on it.
Student(s) will first dig into the fundamentals of leaderless state-machine replication, when processes are subject both to crash  and byzantine  failures. Based upon these readings, a first prototype will be developped. This prototype is built following a micro-service architecture that includes (1) a membership service, (2) a dependency discovery service, and (3) a (single-shot byzantine) consensus service (e.g., Raft  or PBFT ). In a second step, the students will implement a distributed ledger atop their prototype. Then they will compare the performance of their ledger to an existing blockchain solution (e.g., Algorand ) using real-world traces.
 Leaderless State-Machine Replication: Specification, Properties, Limits, Tuanir F. Rezende and Pierre Sutra, Proceedings of the 34th International Symposium on Distributed Computing (DISC'20).
 Leaderless State-Machine Replication: From Fail-stop to Byzantine Failures, Tuanir F. Rezende, PhD thesis, Chapter 4 (2021).
 Raft, Diego Ongaro and John Ousterhout, The Raft consensus algorithm.
 sawtooth-pbft, The Hyperledger Foundation, Swatooth-PBFT.
 algorand, Algorand, Inc, Algorand.