Reto Lehmann ReToCode

  • Red Hat
  • Bern, Switzerland
@ReToCode
ReToCode / kube_graceful_shutdown.txt
Last active March 9, 2018 11:31
Better pod termination handling - Zero Downtime
# Problem
We operate more than 3500 containers on a large OpenShift cluster. Many applications have the same problem with the current termination process. To achieve zero downtime during rolling updates, pod restarts, and node evacuation for maintenance, the following has to happen:
- A pod has to be killed due to one of the events mentioned above
- Kubernetes/OpenShift sends a SIGTERM signal
- Application has to catch the SIGTERM signal
- Application has to set its readiness probe to false to stop getting new traffic
- Application has to wait until no new connections are sent via the service
- In OpenShift the application also has to wait until HAProxy has reloaded, so that no more traffic arrives from there. HAProxy also has its own health check, which is not in sync with the Kubernetes readiness state. To signal HAProxy to stop sending traffic, the HTTP listener port has to be closed.
- The application has to finish its active requests (within the terminationGracePeriodSeconds
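
The sequence above can be sketched in Python roughly as follows. This is a minimal illustration, not the implementation any of our applications use: the class name GracefulShutdown, the drain_seconds and grace_seconds parameters, and the close_listener callback are all hypothetical, and the order of "wait for traffic to stop" versus "close the listener" is an assumption that may need tuning per router setup.

```python
import signal
import threading
import time


class GracefulShutdown:
    """Tracks readiness and in-flight requests for a SIGTERM-driven shutdown."""

    def __init__(self, drain_seconds=10.0, grace_seconds=30.0):
        # drain_seconds: assumed wait until the service/router stops routing here.
        # grace_seconds: should stay below the pod's terminationGracePeriodSeconds.
        self.drain_seconds = drain_seconds
        self.grace_seconds = grace_seconds
        self.ready = True           # value the readiness endpoint would report
        self.listener_open = True   # stands in for the HTTP listener socket
        self._active = 0
        self._cond = threading.Condition()

    def request_started(self):
        with self._cond:
            self._active += 1

    def request_finished(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def shutdown(self, close_listener=lambda: None, sleep=time.sleep):
        """Run the termination steps; return True if all requests finished in time."""
        self.ready = False            # 1. fail the readiness probe
        sleep(self.drain_seconds)     # 2. wait until no new traffic is routed here
        close_listener()              # 3. close the port so the router's check fails
        self.listener_open = False
        with self._cond:              # 4. let in-flight requests finish
            return self._cond.wait_for(
                lambda: self._active == 0, timeout=self.grace_seconds
            )

    def install(self, close_listener=lambda: None):
        """Register the sequence as the SIGTERM handler (call from the main thread)."""
        def handler(signum, frame):
            # Run in a separate thread so the signal handler returns immediately.
            threading.Thread(target=self.shutdown, args=(close_listener,)).start()
        signal.signal(signal.SIGTERM, handler)
```

A request handler would call request_started() and request_finished() around each request, the readiness endpoint would return the value of ready, and install() would be called once at startup with a callback that closes the real listener.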
ReToCode / pruning.md
Last active November 23, 2017 07:20
OpenShift image pruning

Generally:

  • The pruner must never delete an image that is in use by any OpenShift object (including the beta/alpha types!)
  • The pruner must never delete layers that are used (including via cache) by such an image
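
As a rough illustration of these two rules, the set logic could look like the sketch below. It is not how the OpenShift pruner works: the function name `prunable` and its three inputs (all image names, the image names referenced by any object, and a mapping from image to layer IDs) are hypothetical, and it does not model layers held only in a node's local build cache.

```python
def prunable(all_images, referenced_images, image_layers):
    """Return (images, layers) that could be deleted without breaking the two rules."""
    all_images = set(all_images)
    keep_images = all_images & set(referenced_images)
    drop_images = all_images - keep_images

    # Every layer of a kept image must survive, even if a dropped image shares it.
    keep_layers = set()
    for image in keep_images:
        keep_layers.update(image_layers.get(image, ()))

    drop_layers = set()
    for image in drop_images:
        drop_layers.update(image_layers.get(image, ()))

    return drop_images, drop_layers - keep_layers
```

For example, if `app:v1` and `app:v2` share a base layer and only `app:v2` is referenced, the sketch returns `app:v1` and its non-shared layer, while the shared base layer is kept.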

I don't know which issues you already know about, or which are already fixed in 3.5+. Here are a few things to consider if the basic solution stays the way it is today:

Things to consider:

Builds happen on a node based on Docker's caching logic and on the local Docker pool, not based on the upstream registry

  • Cache reuse therefore follows Docker's logic and works against the local Docker pool, so the pruner must know about Docker's caching logic. This could be tricky if Docker changes that behaviour in a future version and the pruner is not updated to match
ReToCode / Remove all node_module folders recursively
Last active November 23, 2017 11:42 — forked from cosemansp/gist:6ebd879d6f53886574c8
Remove all node_module folders recursively
find . -name "node_modules" -type d -prune -exec rm -rf '{}' +