Modify the environment variables ZK_HOME and BYTEMAN_HOME in the setup.sh script. Don't forget to compile ZooKeeper in ZK_HOME.
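As a rough sketch, the two variables in setup.sh might be set along the following lines. The paths are placeholders rather than values from this repository, and check_home is a hypothetical helper (not part of setup.sh) that only reports whether a configured directory exists.

```shell
# Placeholder paths (assumptions): point these at your own checkouts.
ZK_HOME="${ZK_HOME:-$HOME/zookeeper}"
BYTEMAN_HOME="${BYTEMAN_HOME:-$HOME/byteman}"

# Hypothetical helper: report whether a configured directory exists.
# $1 = variable name, $2 = directory path
check_home() {
    if [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 missing: $2" >&2
        return 1
    fi
}

# Warn early about a wrong path, but do not abort the script.
check_home ZK_HOME "$ZK_HOME" || true
check_home BYTEMAN_HOME "$BYTEMAN_HOME" || true
```

How to compile ZooKeeper depends on the checkout in ZK_HOME: recent source trees build with Maven, while older 3.4.x trees use Ant, so follow the build instructions shipped with that version.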
$ ./setup.sh
The setup.sh script creates the environment and the other scripts in this directory.
$ ./start_zookeeper_cluster.sh
Start a 3-node ZooKeeper cluster, with all data stored in this directory. The fault injection is driven by the Byteman spec serverSocketAccept-exception.btm. In the log, we can see that server 3 keeps trying to join the quorum and always fails. Server 1 and server 2 keep receiving these requests but can't accept server 3 because of the injection in the leader (server 2).
$ ./stop_injection.sh 2
Stop the injection in the leader (assuming it's server 2); otherwise, every time the leader recovers from the fault, the same fault is injected again. With the fix applied and the injection cancelled, the problematic follower is able to join the quorum and works well. This can be double-checked by feeding some workload to the problematic server with the client, ./client.sh 3, where 3 is the server id of the problematic follower.
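Since the step above assumes the leader is server 2, it may help to confirm that first. The sketch below uses ZooKeeper's `srvr` four-letter command, whose output includes a `Mode:` line. The helper names zk_mode and find_leader are hypothetical, the client ports are assumptions (use the ports from the generated configs), and `nc` must be installed.

```shell
# Print the "Mode:" value (leader, follower, or standalone)
# from `srvr` output read on stdin.
zk_mode() {
    awk -F': ' '$1 == "Mode" { print $2 }'
}

# Print the 1-based position of the first port whose server reports
# itself as leader. Arguments are client ports in server-id order.
find_leader() {
    id=1
    for port in "$@"; do
        # -w 2: give up after 2 seconds if the server does not answer.
        mode=$(echo srvr | nc -w 2 localhost "$port" 2>/dev/null | zk_mode)
        if [ "$mode" = "leader" ]; then
            echo "$id"
            return 0
        fi
        id=$((id + 1))
    done
    return 1
}
```

With assumed client ports 2181, 2182, and 2183 for servers 1 to 3, `find_leader 2181 2182 2183` would print the leader's id, which can then be passed to ./stop_injection.sh. On ZooKeeper versions that restrict four-letter commands, `srvr` has to be allowed by the server's whitelist setting.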
$ ./stop_zookeeper_cluster.sh
Stop the ZooKeeper cluster for this experiment.