
Created February 10, 2017 16:02
Getting a broken Consul cluster up

Consul: 0.7.2

You may have crashed your cluster so that all Consul servers were offline at the same time. The default 72-hour reap window hasn't passed, so the dead Consul servers were never reaped. Restarting it all doesn't work. You've read this issue five times over and nothing works. To make it harder, you're running a StatefulSet on Kubernetes, so after updating the container arguments (kubectl replace -f consul/consul.yml) you need to kubectl delete pods/consul-1 to make them bite. On top of this, if you kubectl exec -it consul-1 and then kill -9 5, Kubernetes goes into a crash loop with exponential backoff, eating into your time.

Sounds like a Friday pleasure, right?

The tools you have at your disposal are:

  • -bootstrap
  • kubectl replace -f consul/consul.yml
  • -bootstrap-expect=3
  • consul force-leave <ip>
  • consul operator raft -remove-peer -address=<ip>:8300
  • kubectl delete pods/consul-<number>
  • kubectl logs consul-<number> -f
  • consul members
  • echo '["<ip1>:8300", "<ip2>:8300", "<ip3>:8300"]' >/consul/data/raft/peers.json
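
Typing the peers.json line by hand under time pressure is error-prone, so here is a minimal sketch of a helper that builds it from a list of IPs. The function name peers_json and the example addresses are made up for illustration; port 8300 and the array-of-strings format are the ones used in this gist for Consul 0.7.2.

```shell
#!/bin/sh
# peers_json (hypothetical helper): print a JSON array of "ip:8300" strings,
# one entry per argument, in the same shape as the echo command above.
peers_json() {
  out=""
  for ip in "$@"; do
    # Separate entries with ", " after the first one.
    if [ -n "$out" ]; then
      out="$out, "
    fi
    out="$out\"$ip:8300\""
  done
  printf '[%s]\n' "$out"
}

# Placeholder addresses; substitute the current pod IPs.
peers_json 10.0.0.1 10.0.0.2 10.0.0.3
```

Inside a pod you would redirect the output to /consul/data/raft/peers.json instead of printing it.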

It's a planning game with a time aspect. You need to get one server to take leadership.

  1. Add -bootstrap to the args of the container
  2. kubectl replace -f consul/consul.yml to update the scheduling-to-come
  3. kubectl delete pods/consul-0 pods/consul-1 pods/consul-2 to make them all restart
  4. They'll complain they can't connect to the old IPs
  5. Run echo '["<ip1>:8300", "<ip2>:8300", "<ip3>:8300"]' >/consul/data/raft/peers.json on one of the nodes
  6. Use up your first restart on that node with kill -9 5, where 5 is the PID of the consul child process (under the docker process); check with ps, since yours may differ
  7. It'll come up and start leader election
  8. Do the same (5-7) for consul-1 and consul-2
  9. They should now all complain that they're all running in bootstrap mode
  10. Some of them will try to contact old nodes. Use consul operator raft -remove-peer -address=<ip>:8300 on those nodes to make them reconsider (force-leave doesn't work since it's a graceful leave but the old machine is gone)
  11. Now the only complaint left is about the bootstrap flag. One machine is leader. Don't touch that machine.
  12. Edit your consul.yml file, removing -bootstrap, and run kubectl replace -f consul/consul.yml again so restarted pods pick up the new args
  13. Delete the two non-leader pods: kubectl delete pods/consul-<number>
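  14. Wait until they start again. Use consul operator raft -remove-peer -address=<ip>:8300 to remove the old IPs
  15. Verify with consul members on the leader. Note that this only lists serf-members, not raft-members; consul operator raft -list-peers shows the raft side.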
  16. You should now have a three-node cluster back up without removing any folders.
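
Since the order matters and each restart costs backoff time, it can help to print the commands for steps 2 to 8 up front and paste them one at a time. The following is a rough sketch that only prints and runs nothing. It assumes the pod names consul-0 to consul-2 and PID 5 from the steps above; the function name recovery_plan and the choice to pipe peers.json in through kubectl exec -i are illustrative, not the author's exact method.

```shell
#!/bin/sh
# Placeholder peer list; replace <ip1>..<ip3> with the new pod IPs.
PEERS='["<ip1>:8300", "<ip2>:8300", "<ip3>:8300"]'

# recovery_plan (hypothetical helper): print the commands, do not execute them.
recovery_plan() {
  # Steps 2-3: push the -bootstrap args, then restart every pod.
  echo "kubectl replace -f consul/consul.yml"
  echo "kubectl delete pods/consul-0 pods/consul-1 pods/consul-2"
  for pod in consul-0 consul-1 consul-2; do
    # Step 5: write peers.json inside the pod via stdin to avoid nested quoting.
    printf "echo '%s' | kubectl exec -i %s -- sh -c 'cat >/consul/data/raft/peers.json'\n" "$PEERS" "$pod"
    # Step 6: kill consul so the container restarts and reads peers.json.
    echo "kubectl exec $pod -- kill -9 5"
  done
}

recovery_plan
```

Steps 10 onward depend on which old IPs each node still complains about, so they are left to be run by hand.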

It's a planning game.
