- Questions:
- How does Consul behave in the different parts of a cluster during a network partition? Will it recover by itself when the partition is gone?
- Would Consul elect a new Raft leader if the current leader can't communicate with some nodes while another node still can?
- How does Consul perform in a lossy network?
- Requirements: Vagrant, VirtualBox
- Put the attached Vagrantfile into the current directory and run:
$ vagrant up
- Wait until Vagrant completes the provisioning
- Open the UI and ensure that the cluster has been bootstrapped:
$ python -m webbrowser -t "http://127.0.0.1:8501"
The Vagrantfile brings up five Consul server nodes at 192.168.99.11-15, with their HTTP APIs forwarded to 127.0.0.1:8501-8505 (one port per node). Make some queries:
$ curl http://127.0.0.1:8501/v1/status/peers
["192.168.99.11:8300","192.168.99.13:8300","192.168.99.14:8300","192.168.99.15:8300","192.168.99.12:8300"]
$ curl http://127.0.0.1:8501/v1/status/leader
"192.168.99.11:8300"
Create a new K/V:
$ curl \
--request PUT \
--data hi \
http://127.0.0.1:8501/v1/kv/testkey
true
Ensure it has spread across the cluster:
$ curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
$ curl http://127.0.0.1:8503/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
Now partition node5 from the rest of the cluster by dropping all traffic on its private interface:
$ vagrant ssh node5 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -I OUTPUT -o eth1 -j DROP
We now have two partitions: node5 alone, and the remaining four nodes, which still hold a quorum (3 of 5). Check the larger one:
$ time curl -v http://127.0.0.1:8503/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8503 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8503
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Consul-Index: 50
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Wed, 12 Jun 2019 16:40:09 GMT
< Content-Length: 92
<
* Connection #0 to host 127.0.0.1 left intact
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.028s
user 0m0.006s
sys 0m0.008s
Now check node5, which has been partitioned out:
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Date: Wed, 12 Jun 2019 16:40:49 GMT
< Content-Length: 17
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
No cluster leader
real 0m7.218s
user 0m0.006s
sys 0m0.006s
So, node5 responds with a 500 after a 7-second delay. But there's a workaround: a stale read, which any server can answer from its local state without contacting the leader:
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey'?'stale
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey?stale HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Consul-Index: 50
< X-Consul-Knownleader: false
< X-Consul-Lastcontact: 165214
< Date: Wed, 12 Jun 2019 16:41:41 GMT
< Content-Length: 92
<
* Connection #0 to host 127.0.0.1 left intact
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.026s
user 0m0.007s
sys 0m0.009s
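Note the X-Consul-Knownleader: false and X-Consul-Lastcontact headers above: the latter is the time in milliseconds since this server last heard from a leader. Besides ?stale, the HTTP API also supports a ?consistent mode that is strictly leader-verified; on the partitioned node5 it would fail just like the default read:
$ curl http://127.0.0.1:8505/v1/kv/testkey'?'consistent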
In the UI, node5 is simply missing. It's not even marked as failed!
Okay, let's restore connectivity:
$ vagrant ssh node5 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -D OUTPUT -o eth1 -j DROP
Ensure that the cluster has recovered:
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Consul-Index: 50
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Wed, 12 Jun 2019 16:44:29 GMT
< Content-Length: 92
<
* Connection #0 to host 127.0.0.1 left intact
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.019s
user 0m0.006s
sys 0m0.005s
Now split all five nodes from each other so that there is no quorum anywhere:
$ vagrant ssh node1 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node1 -- sudo iptables -I OUTPUT -o eth1 -j DROP
$ vagrant ssh node2 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node2 -- sudo iptables -I OUTPUT -o eth1 -j DROP
$ vagrant ssh node3 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node3 -- sudo iptables -I OUTPUT -o eth1 -j DROP
$ vagrant ssh node4 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node4 -- sudo iptables -I OUTPUT -o eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -I INPUT -i eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -I OUTPUT -o eth1 -j DROP
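The same ten commands can be scripted; a minimal sketch, assuming the node1..node5 names from the Vagrantfile (swap -I for -D to undo):
$ for n in 1 2 3 4 5; do vagrant ssh node$n -- sudo iptables -I INPUT -i eth1 -j DROP; vagrant ssh node$n -- sudo iptables -I OUTPUT -o eth1 -j DROP; done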
Poke some nodes:
$ time curl http://127.0.0.1:8505/v1/kv/testkey
No cluster leader
real 0m7.058s
user 0m0.007s
sys 0m0.009s
$ time curl http://127.0.0.1:8502/v1/kv/testkey
No cluster leader
real 0m7.148s
user 0m0.006s
sys 0m0.006s
Restore connectivity:
$ vagrant ssh node1 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node1 -- sudo iptables -D OUTPUT -o eth1 -j DROP
$ vagrant ssh node2 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node2 -- sudo iptables -D OUTPUT -o eth1 -j DROP
$ vagrant ssh node3 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node3 -- sudo iptables -D OUTPUT -o eth1 -j DROP
$ vagrant ssh node4 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node4 -- sudo iptables -D OUTPUT -o eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -D INPUT -i eth1 -j DROP
$ vagrant ssh node5 -- sudo iptables -D OUTPUT -o eth1 -j DROP
Ensure that the cluster has recovered:
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.018s
user 0m0.006s
sys 0m0.005s
$ time curl http://127.0.0.1:8502/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.018s
user 0m0.006s
sys 0m0.005s
Get the current leader:
$ curl http://127.0.0.1:8501/v1/status/leader
"192.168.99.11:8300"
Create a partial network partition between node1 (the current leader, 192.168.99.11) and node2:
$ vagrant ssh node2 -- sudo iptables -I INPUT -i eth1 -s 192.168.99.11 -j DROP
$ vagrant ssh node2 -- sudo iptables -I OUTPUT -o eth1 -d 192.168.99.11 -j DROP
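To confirm that the partition is only partial, a quick check (a hypothetical test, using the IPs from the cluster layout above) is that node2 can no longer reach node1 but still reaches node3:
$ vagrant ssh node2 -- ping -c 2 192.168.99.11
$ vagrant ssh node2 -- ping -c 2 192.168.99.13
The first should time out, the second should succeed.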
Poke node2:
$ time curl http://127.0.0.1:8502/v1/kv/testkey
No cluster leader
real 0m7.126s
user 0m0.006s
sys 0m0.006s
Well... It didn't recover. Even after 10 minutes the leader is still the same:
$ curl http://127.0.0.1:8501/v1/status/leader
"192.168.99.11:8300"
And it's all green in the UI. But it doesn't work:
$ time curl http://127.0.0.1:8502/v1/kv/testkey
No cluster leader
real 0m7.111s
user 0m0.006s
sys 0m0.005s
Apparently, Consul doesn't tolerate partial network partitions as well as I thought: node1 keeps the leadership, while node2 sits there without a leader.
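To see what each side believes about the Raft configuration, the consul CLI could be queried inside the VMs (assuming the binary is on the PATH there):
$ vagrant ssh node2 -- consul operator raft list-peers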
Remove the partition:
$ vagrant ssh node2 -- sudo iptables -D INPUT -i eth1 -s 192.168.99.11 -j DROP
$ vagrant ssh node2 -- sudo iptables -D OUTPUT -o eth1 -d 192.168.99.11 -j DROP
$ time curl http://127.0.0.1:8502/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m0.017s
user 0m0.006s
sys 0m0.005s
Simulate a VERY lossy network for node5:
$ vagrant ssh node5 -- sudo tc qdisc add dev eth1 root netem delay 5000ms loss 70%
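This netem qdisc delays every outgoing packet on eth1 by 5 seconds and drops 70% of them; the rule can be inspected with:
$ vagrant ssh node5 -- sudo tc qdisc show dev eth1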
Poke node5:
$ time curl http://127.0.0.1:8505/v1/kv/testkey
rpc error making call: EOF
real 0m28.610s
user 0m0.006s
sys 0m0.008s
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Date: Wed, 12 Jun 2019 17:14:37 GMT
< Content-Length: 17
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
No cluster leader
real 0m7.218s
user 0m0.006s
sys 0m0.006s
Okay, that was too much. Make it less severe:
$ vagrant ssh node5 -- sudo tc qdisc del dev eth1 root netem
$ vagrant ssh node5 -- sudo tc qdisc add dev eth1 root netem delay 2000ms loss 30%
Poke node5:
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m6.567s
user 0m0.006s
sys 0m0.005s
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m7.246s
user 0m0.006s
sys 0m0.006s
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m2.020s
user 0m0.006s
sys 0m0.005s
Okay, that's not bad at all!
Make it worse again:
$ vagrant ssh node5 -- sudo tc qdisc del dev eth1 root netem
$ vagrant ssh node5 -- sudo tc qdisc add dev eth1 root netem delay 2000ms loss 50%
Poke node5:
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m7.228s
user 0m0.006s
sys 0m0.006s
$ time curl http://127.0.0.1:8505/v1/kv/testkey
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m6.447s
user 0m0.006s
sys 0m0.005s
$ time curl http://127.0.0.1:8505/v1/kv/testkey
rpc error making call: EOF
real 0m11.044s
user 0m0.006s
sys 0m0.006s
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Date: Wed, 12 Jun 2019 17:21:26 GMT
< Content-Length: 26
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
rpc error making call: EOF
real 0m20.887s
user 0m0.007s
sys 0m0.008s
$ time curl -v http://127.0.0.1:8505/v1/kv/testkey
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8505 (#0)
> GET /v1/kv/testkey HTTP/1.1
> Host: 127.0.0.1:8505
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Accept-Encoding
< X-Consul-Index: 50
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Wed, 12 Jun 2019 17:21:52 GMT
< Content-Length: 92
<
* Connection #0 to host 127.0.0.1 left intact
[{"LockIndex":0,"Key":"testkey","Flags":0,"Value":"aGk=","CreateIndex":50,"ModifyIndex":50}]
real 0m21.870s
user 0m0.006s
sys 0m0.006s
20 seconds! Just... wow.
Okay, that's enough for today. Remove the tc rule:
$ vagrant ssh node5 -- sudo tc qdisc del dev eth1 root netem
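And once you're done with the lab entirely, it can be torn down with:
$ vagrant destroy -f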
Answers to the questions:
- How does Consul behave in the different parts of a cluster during a network partition? Will it recover by itself when the partition is gone?
  - The partition without a leader responds with HTTP 500 after a 7-second timeout, unless stale reads are requested with ?stale
  - It does recover by itself once the partition is gone
- Would Consul elect a new Raft leader if the current leader can't communicate with some nodes while another node still can?
  - Unfortunately, it doesn't
- How does Consul perform in a lossy network?
  - Not bad at all! Even with 50% packet loss it manages to serve some queries.