Robustness test of etcd cluster
1. Test procedure
A- Take a cluster of 3 etcd nodes as shown below
core@core-02 ~ $ etcdctl cluster-health
member 348dd9a63bc9c9d3 is healthy: got healthy result from http://172.17.8.102:2379
member 7d26e3d2ee11a98e is healthy: got healthy result from http://172.17.8.103:2379
member 95d2e7af71fc961d is healthy: got healthy result from http://172.17.8.101:2379
cluster is healthy
core@core-02 ~ $ etcdctl member list
348dd9a63bc9c9d3: name=219d42232433483c8ad19163ba1c6020 peerURLs=http://172.17.8.102:2380 clientURLs=http://172.17.8.102:2379 isLeader=true
7d26e3d2ee11a98e: name=edbfd19500b0496485e286801bdfa04b peerURLs=http://172.17.8.103:2380 clientURLs=http://172.17.8.103:2379 isLeader=false
95d2e7af71fc961d: name=5b74677278ea4b6ca5dcc43262d2b0e5 peerURLs=http://172.17.8.101:2380 clientURLs=http://172.17.8.101:2379 isLeader=false
core@core-02 ~ $
core@core-02 ~ $
core@core-02 ~ $
B- Put a key/value pair through a non-leader member of the etcd cluster as shown below
core@core-02 ~ $ curl -X PUT http://172.17.8.101:2379/v2/keys/message -d value="Hello"
{"action":"set","node":{"key":"/message","value":"Hello","modifiedIndex":1879285,"createdIndex":1879285}}
core@core-02 ~ $
C- Shut down 172.17.8.101 and check the cluster health.
core@core-02 ~ $ ping 172.17.8.101
PING 172.17.8.101 (172.17.8.101) 56(84) bytes of data.
^C
--- 172.17.8.101 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2032ms
core@core-02 ~ $ etcdctl member list
348dd9a63bc9c9d3: name=219d42232433483c8ad19163ba1c6020 peerURLs=http://172.17.8.102:2380 clientURLs=http://172.17.8.102:2379 isLeader=true
7d26e3d2ee11a98e: name=edbfd19500b0496485e286801bdfa04b peerURLs=http://172.17.8.103:2380 clientURLs=http://172.17.8.103:2379 isLeader=false
95d2e7af71fc961d: name=5b74677278ea4b6ca5dcc43262d2b0e5 peerURLs=http://172.17.8.101:2380 clientURLs=http://172.17.8.101:2379 isLeader=false
core@core-02 ~ $ etcdctl cluster-health
member 348dd9a63bc9c9d3 is healthy: got healthy result from http://172.17.8.102:2379
member 7d26e3d2ee11a98e is healthy: got healthy result from http://172.17.8.103:2379
failed to check the health of member 95d2e7af71fc961d on http://172.17.8.101:2379: Get http://172.17.8.101:2379/health: dial tcp 172.17.8.101:2379: i/o timeout
member 95d2e7af71fc961d is unreachable: [http://172.17.8.101:2379] are all unreachable
cluster is healthy
core@core-02 ~ $
2. Pass criteria:
With one member down in a cluster of 3 nodes, we should still be able to read the key/value pair message/Hello, and the health check should report the cluster as healthy.
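The two criteria can be checked mechanically. The following is a minimal sketch, not a definitive test harness: the function names are illustrative, the default endpoint is assumed to be a surviving member's client URL, and the parsing relies on the response shapes shown in this document (v2 keys API and /health).

```shell
#!/bin/sh
# Pass-criteria check, split into pure helpers (usable without a cluster)
# and one function that queries etcd. All function names are illustrative.

# Extract node.value from an etcd v2 keys API response body.
extract_value() {
  printf '%s\n' "$1" | sed -n 's/.*"value":"\([^"]*\)".*/\1/p'
}

# Succeed if a /health response body reports health as true.
is_healthy() {
  printf '%s\n' "$1" | grep -Eq '"health": *"?true"?'
}

# Query one endpoint and apply both criteria.
# The default endpoint is an assumption: any member that is still up.
check_pass_criteria() {
  endpoint=${1:-http://172.17.8.102:2379}
  body=$(curl -fsS "$endpoint/v2/keys/message") || return 1
  health=$(curl -fsS "$endpoint/health") || return 1
  [ "$(extract_value "$body")" = "Hello" ] && is_healthy "$health"
}

# Usage (needs a reachable cluster):
#   check_pass_criteria http://172.17.8.102:2379 && echo PASS || echo FAIL
```

Keeping the parsing in separate helpers means they can be exercised against saved response bodies before the member is actually shut down.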
Fault Tolerance Table:
An odd number of members is recommended. Adding one member to an odd-sized cluster raises the majority requirement without raising the failure tolerance: a 4-member cluster needs 3 members for a majority and tolerates 1 failure, the same as a 3-member cluster. The table below compares even and odd cluster sizes:
CLUSTER SIZE   MAJORITY   FAILURE TOLERANCE
1              1          0
2              2          0
3              2          1
4              3          1
5              3          2
6              4          2
7              4          3
8              5          3
9              5          4
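The table follows from two integer formulas: majority = floor(n/2) + 1, and failure tolerance = n - majority. A small shell sketch that reproduces the rows (the function names `majority` and `tolerance` are illustrative and not etcd commands):

```shell
#!/bin/sh
# Quorum arithmetic for a cluster of n members.
# Function names are illustrative; they are not part of etcd.

majority() {
  # Smallest member count that is more than half of n (integer division).
  echo $(( $1 / 2 + 1 ))
}

tolerance() {
  # Members that can fail while a majority is still available.
  echo $(( $1 - ($1 / 2 + 1) ))
}

printf '%-14s %-10s %s\n' "CLUSTER SIZE" "MAJORITY" "FAILURE TOLERANCE"
for n in 1 2 3 4 5 6 7 8 9; do
  printf '%-14s %-10s %s\n' "$n" "$(majority "$n")" "$(tolerance "$n")"
done
```

Going from 3 to 4 members moves the majority from 2 to 3 while the tolerance stays at 1, which is why the even sizes add no fault tolerance.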
3. Result
core@core-02 ~ $ curl -X GET http://172.17.8.102:2379/v2/keys/message
{"action":"get","node":{"key":"/message","value":"Hello","modifiedIndex":1879285,"createdIndex":1879285}}
core@core-02 ~ $
core@core-02 ~ $ curl -L http://127.0.0.1:2379/health
{"health": "true"}
core@core-02 ~ $
4. Recovery steps if the test fails
If the test fails, bring the stopped VM (172.17.8.101) back up from the hypervisor and restart the etcd service on it.
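After the restart, it helps to wait until the recovered member answers its own /health endpoint, because `etcdctl cluster-health` already reports "cluster is healthy" while that member is still down. A hedged sketch: the `retry` helper is illustrative, the systemd unit name `etcd2` is an assumption (check which unit your host actually runs), and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# The helper name and its arguments are illustrative.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# On the recovered VM (unit name is an assumption; adjust to your host):
#   sudo systemctl restart etcd2
# Then wait up to about a minute for the member's own health endpoint:
#   retry 12 5 curl -fsS --max-time 2 http://172.17.8.101:2379/health
```

Polling the member's client URL directly confirms that this specific node is serving again, rather than only that the remaining two still form a majority.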