Userspace Early OOM Killer for Kubernetes Nodes
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: oom-killer
  namespace: kube-system
  labels:
    k8s-app: oom-killer
spec:
  selector:
    matchLabels:
      k8s-app: oom-killer
  template:
    metadata:
      labels:
        k8s-app: oom-killer
    spec:
      containers:
        - name: oom-killer
          image: ubuntu:xenial
          args:
            - 'sh'
            - '-c'
            - |
              # Every 60 seconds, read the "available" column of `free -m`;
              # if it falls below MIN_MEMORY (MiB), ask the kernel to run its
              # OOM killer via the sysrq trigger on the host's /proc.
              while true; do
                if [ "`free -m | head -n2 | tail -n1 | awk '{ print $7 }'`" -lt "${MIN_MEMORY:-100}" ]; then
                  echo f > /host_proc/sysrq-trigger
                  echo "Kernel OOM killer invoked."
                fi
                sleep 60
              done
          env:
            - name: 'MIN_MEMORY'
              value: '350'
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host_proc
              name: proc
      volumes:
        - name: proc
          hostPath:
            path: /proc

In production environments on a budget, it's hard to provision enough capacity to absorb the largest possible usage spikes. When a bad combination of memory spikes hits, performance quickly degrades and the node eventually becomes unresponsive for an unacceptable amount of time.

Under low-memory conditions, a node kills processes based on cgroup memory limits and OOM scores, and/or reclaims page cache via kswapd. But when that doesn't free enough memory to operate normally, the node becomes unresponsive, usually with kswapd thrashing.
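
For reference, this mechanism can be observed directly on a node (assuming root access): the kernel logs each OOM kill, and every process exposes the score the killer uses to pick victims. A quick look, roughly:

```sh
# Past kernel OOM-killer invocations show up in the kernel log.
dmesg | grep -i 'out of memory'
# Each process exposes its current badness score and the adjustment
# (-1000..1000) set by the user or the container runtime.
cat /proc/self/oom_score
cat /proc/self/oom_score_adj
```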

To mitigate this problem, you could set up swap on each node, but that is not an option if your system has to maintain low latency. If your production environment is set up for high availability, it's better to kill and restart applications than to run a service with very high latency.

To head off severe memory starvation, you can deploy this userspace OOM trigger script, which invokes the kernel OOM killer before available memory drops too low to keep the node healthy.
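
The container's whole job is the loop in the manifest above; the same check can be tried as a standalone sketch directly on a node, assuming root access and a `free` that prints an "available" column (procps 3.3.10 or later):

```sh
# Standalone version of the check the container runs every 60 seconds;
# the default threshold matches the manifest above.
MIN_MEMORY="${MIN_MEMORY:-100}"                      # threshold in MiB
available="$(free -m | awk '/^Mem:/ { print $7 }')"  # "available" column
if [ "$available" -lt "$MIN_MEMORY" ]; then
  # 'f' asks the kernel to run its OOM killer.
  echo f > /proc/sysrq-trigger
  echo "Kernel OOM killer invoked."
fi
```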

```sh
kubectl apply -f oom-killer.yml
```
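
Once applied, you can check that the DaemonSet is scheduled on every node and watch its logs for the trigger message:

```sh
kubectl -n kube-system get daemonset oom-killer
kubectl -n kube-system get pods -l k8s-app=oom-killer -o wide
# The script prints this line each time it fires:
kubectl -n kube-system logs -l k8s-app=oom-killer | grep 'Kernel OOM killer invoked'
```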