@ReToCode
Last active March 9, 2018 11:31
Better pod termination handling - Zero Downtime
# Problem
We operate more than 3500 containers on a large OpenShift cluster. A lot of applications have the same problem with the current termination process. To achieve zero downtime during rolling updates, pod restarts, and node evacuations for maintenance, the following sequence has to happen:
- A pod has to be terminated because of one of the events mentioned above
- Kubernetes/OpenShift sends a SIGTERM signal
- The application has to catch the SIGTERM signal
- The application has to set its readiness probe to false so that it stops receiving new traffic
- The application has to wait until the service no longer sends it new connections
- In OpenShift, the application also has to wait until the HA-Proxy has been reloaded so that it no longer receives traffic from there. In addition, the HA-Proxy has its own health check, which is not in sync with the readiness state in Kubernetes. To signal the HA-Proxy to stop sending traffic, the HTTP listener port has to be closed.
- The application has to finish its active requests (within the terminationGracePeriodSeconds) and then terminate itself
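
The application-side steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the SBB library's implementation. The class name `GracefulShutdown`, its methods, and the `drain_delay` parameter are assumptions made up for this example. `drain_delay` stands in for the time the service and the HA-Proxy need before they stop routing to the pod.

```python
import signal
import threading
import time


class GracefulShutdown:
    """Hypothetical helper: tracks readiness and in-flight requests."""

    def __init__(self, drain_delay=0.0):
        # drain_delay is an assumed wait for service/router updates to propagate.
        self._cond = threading.Condition()
        self._ready = True
        self._in_flight = 0
        self.drain_delay = drain_delay

    def is_ready(self):
        # The readiness endpoint would report failure once this is False.
        with self._cond:
            return self._ready

    def begin_request(self):
        with self._cond:
            self._in_flight += 1

    def end_request(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def start_shutdown(self, timeout):
        """Fail readiness, wait for traffic to move away, then drain.

        Returns True if all in-flight requests finished within `timeout`.
        """
        with self._cond:
            self._ready = False
        time.sleep(self.drain_delay)
        deadline = time.monotonic() + timeout
        with self._cond:
            while self._in_flight > 0:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                self._cond.wait(remaining)
            return True

    def install_sigterm_handler(self, timeout, on_drained):
        """Catch SIGTERM; must be called from the main thread."""
        def worker():
            self.start_shutdown(timeout)
            on_drained()  # e.g. close the listener and exit the process

        def handler(signum, frame):
            # Drain in a separate thread so the handler returns immediately.
            threading.Thread(target=worker).start()

        signal.signal(signal.SIGTERM, handler)
```

In a real service, each request handler would call `begin_request()` and `end_request()`, and `timeout` plus `drain_delay` would need to stay below the pod's terminationGracePeriodSeconds. Every language and web server needs its own version of this logic, which is the duplication described here.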
At SBB, we implemented this behaviour for Spring Boot 1 & 2 with this extension library:
https://github.com/SchweizerischeBundesbahnen/springboot-graceful-shutdown
But this solution only works for Java apps. All other languages and web servers have to implement the same thing again and again. We talked to a lot of people and companies that use OpenShift/Kubernetes, and all of them struggle with this issue. Thus, I would like to propose a solution where the container platform handles termination a bit differently.
# Solution proposal
Introduce a new pod lifecycle state, something like "TerminationPreparation". In this state:
- Stop the readiness check, set the container to "NotReady", and stop sending SDN traffic to that container
- In OpenShift, remove the container from the HA-Proxy config and wait until all HA-Proxies are reloaded
- Introduce a new setting where an application can define how long it needs to finish existing requests. This could be configured in a similar way to the readiness/liveness checks (e.g. terminationPreparationGracePeriodSeconds).
- If this time is up, or the application signals that it is done processing requests, send the SIGTERM
- Applications can still handle that signal if they have to do things like cleanup, but it is no longer mandatory for "zero downtime"
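
To make the proposed ordering concrete, here is a small simulation of it. It does not use any existing Kubernetes or OpenShift API. The `Pod` class, the `terminate` function, and the `drain_per_second` parameter are hypothetical and exist only for this sketch. The point it illustrates is that SIGTERM is sent only after the pod has been removed from routing and has had time to drain.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    RUNNING = "Running"
    TERMINATION_PREPARATION = "TerminationPreparation"  # proposed new state
    TERMINATING = "Terminating"


@dataclass
class Pod:
    name: str
    # Proposed setting; it does not exist in Kubernetes today.
    termination_preparation_grace_period_seconds: int = 30
    phase: Phase = Phase.RUNNING
    ready: bool = True
    in_flight: int = 0
    events: list = field(default_factory=list)


def terminate(pod, drain_per_second=1):
    """Simulate the proposed order of steps; returns the seconds waited."""
    pod.phase = Phase.TERMINATION_PREPARATION
    pod.ready = False
    pod.events.append("set NotReady, stopped SDN traffic")
    pod.events.append("removed from HA-Proxy config, proxies reloaded")

    # Wait until requests are done or the proposed grace period is up.
    waited = 0
    limit = pod.termination_preparation_grace_period_seconds
    while pod.in_flight > 0 and waited < limit:
        pod.in_flight = max(0, pod.in_flight - drain_per_second)
        waited += 1
    pod.events.append(f"waited {waited}s for in-flight requests")

    # Only now does the application receive SIGTERM.
    pod.phase = Phase.TERMINATING
    pod.events.append("SIGTERM sent")
    return waited
```

For example, a pod with three in-flight requests that each finish within a second would receive SIGTERM after three simulated seconds, while a pod that never drains would receive it once the grace period is up.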
This would massively improve the availability of our applications during any form of container termination. Developers would no longer need to take care of that manually.