Created
May 15, 2024 19:20
No-downtime NGINX Kubernetes deployments
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-no-downtime
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-no-downtime
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx-no-downtime
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: nginx-no-downtime
          image: nginx:latest
          # By default, Kubernetes sends a SIGTERM signal to a container when it's safe to terminate it,
          # e.g. when another instance of this deployment becomes available during a rollout.
          # NGINX interprets SIGTERM as a "fast shutdown": it does not shut down gracefully
          # and does not wait for existing connections to finish.
          # This leads to dropped connections and "5xx" response codes.
          # @see https://ubuntu.com/blog/avoiding-dropped-connections-in-nginx-containers-with-stopsignal-sigquit
          #
          # One might assume that the new instance of the NGINX deployment needs startupProbes or
          # readinessProbes to determine whether it can receive traffic, but that assumption is incorrect.
          # The problem is that the existing instance of the NGINX deployment terminates prematurely,
          # without waiting for all connections to finish.
          #
          # We want NGINX to terminate gracefully and wait until all existing connections finish.
          # To achieve this, we hook into the "preStop" Kubernetes lifecycle event, which lets us
          # execute a command before Kubernetes sends the SIGTERM signal.
          # In this hook, we shut down NGINX gracefully ourselves by sending it SIGQUIT. Shutdown may
          # then take up to the value of terminationGracePeriodSeconds (default = 30 seconds) before
          # Kubernetes continues with the container termination. In our testing, this has led to no
          # dropped connections when triggering a rollout under load for a deployment with 1 replica.
          # @see https://nginx.org/en/docs/control.html
          # @see https://medium.com/inside-personio/graceful-shutdown-of-fpm-and-nginx-in-kubernetes-f362369dff22
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - kill -s QUIT $(cat /var/run/nginx.pid)
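          #
          # A possible variant, sketched here as an assumption and not part of the tested setup above:
          # `kill` returns immediately, and Kubernetes sends the container's stop signal as soon as
          # the preStop command returns. If that signal turns out to cut long-running requests short,
          # the hook could instead block until the NGINX master process has exited. The wait is still
          # bounded by terminationGracePeriodSeconds. The PID variable and the polling loop below are
          # illustrative only and have not been load tested:
          #
          # lifecycle:
          #   preStop:
          #     exec:
          #       command:
          #         - /bin/sh
          #         - -c
          #         - PID=$(cat /var/run/nginx.pid); kill -s QUIT "$PID"; while kill -0 "$PID" 2>/dev/null; do sleep 1; done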