@heug
Last active January 14, 2020 06:53
Hautelook debug
# Hautelook experienced an outage where their services crashed overnight (with no particularly elevated build volume).
# They attempted a restart and received errors such as:
### - Timeout waiting for event Migrator Finished - VM Service
### - Timeout waiting for event Migrator Finished - Permissions Service
### - Timeout waiting for event Postgresql 9.4 ready-5432
#
# Weirdly enough, the container logs for each of those looked to be just fine. Replicated logs would show the timeout errors, however.
# Doing the Replicated dance did not fix the issue, so I had them nuke all existing images and containers and reinstall Replicated.
# Good to reassure them that no data will be destroyed through all this.
### Steps taken
# Stop the application, then all Replicated services (service replicated stop, service replicated-ui stop, service replicated-operator stop)
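# A minimal sketch of the three service commands abbreviated above, assuming the host
# uses the `service` wrapper as in that shorthand and that you are running as root.
# The `run` helper, the DRY_RUN switch and the `replicated_services` function are
# illustrative names, not part of Replicated; by default the commands are only printed.
```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Apply one action (stop, start, ...) to the three Replicated services.
replicated_services() {
  action="$1"
  for svc in replicated replicated-ui replicated-operator; do
    run service "$svc" "$action"
  done
}

replicated_services stop
```
# Calling `replicated_services start` gives the matching start commands for after the reinstall.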
# Confirm all containers have stopped with docker ps
# Stop any containers still running, then remove all containers
$ docker stop $(docker ps -a -q)
$ docker rm $(docker ps -a -q)
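# `docker stop` and `docker rm` exit with an error when given no arguments, which is
# what happens if `docker ps -a -q` returns nothing. A hedged sketch of a guarded
# version follows; `run`, DRY_RUN and `remove_containers` are illustrative names, and
# the function takes the container IDs as arguments so it can be tried without Docker.
```shell
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Stop and remove the given containers; do nothing when the list is empty.
remove_containers() {
  if [ "$#" -eq 0 ]; then
    echo "no containers to remove"
    return 0
  fi
  run docker stop "$@"
  run docker rm "$@"
}

# Real usage (unquoted on purpose so each ID becomes its own argument):
#   DRY_RUN=0; remove_containers $(docker ps -a -q)
```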
# Remove all images
$ docker rmi $(docker images -a -q)
# At this point you have no more Replicated images, so time to reinstall that
$ curl -sSk -o /tmp/get_replicated.sh "https://get.replicated.com/docker?replicated_tag=2.10.3&replicated_ui_tag=2.10.3&replicated_operator_tag=2.10.3"
# Then run the install script; you'll need to substitute the host's private IP for $PRIVATE_IP
$ bash /tmp/get_replicated.sh local-address="$PRIVATE_IP" no-proxy no-docker
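# Before running the installer it can help to sanity-check the address you put in
# PRIVATE_IP. The sketch below is a loose pattern check against the RFC 1918 ranges
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); it does not validate the octets, and
# some environments use private addresses outside these ranges. `is_private_ipv4` is
# an illustrative helper, not something the install script provides.
```shell
# Return 0 if the argument looks like an RFC 1918 private IPv4 address, 1 otherwise.
is_private_ipv4() {
  case "$1" in
    10.*.*.*|192.168.*.*) return 0 ;;
    172.1[6-9].*.*|172.2[0-9].*.*|172.3[0-1].*.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage with a placeholder address:
#   PRIVATE_IP="10.0.0.5"
#   is_private_ipv4 "$PRIVATE_IP" || echo "warning: $PRIVATE_IP does not look private"
```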
# Once that's done you'll need to start all the Replicated services if they're not up
# Go through starting CircleCI again and cross your fingers!