@oetiker
Last active May 16, 2024 15:40
Protecting a Proxmox VM from being killed by the OOM Killer

Protecting a Proxmox VM from the OOM Killer

VMs tend to occupy a lot of memory, which makes them prime targets for the OOM killer, but they are normally also the official denizens of a server. So if memory gets tight, we would rather have the OOM killer kill the new kid on the block (some process that has suddenly started using more RAM) instead of the regular VM crowd.

Obviously, bad things will still happen if the OOM killer cannot free enough memory to make your machine happy again, but with this you at least have some control over who gets killed.

The Hookscript Method

Proxmox lets you configure a hookscript that is run at various points in the lifetime of a guest. There is an example script in /usr/share/pve-docs/examples/guest-example-hookscript.pl for inspiration.

So I created a script that adjusts oom_score_adj for the VM process:

mkdir -p /var/lib/vz/snippets
cat <<'EOF' >/var/lib/vz/snippets/oom_config.sh
#!/bin/sh

vmid=$1
phase=$2

# The PID file is only meaningful once the VM is running, so read it in post-start.
if [ "$phase" = 'post-start' ]; then
	vmpid=$(cat "/run/qemu-server/$vmid.pid")
	echo "Protecting the vm from the oom-killer"
	echo -1000 > "/proc/$vmpid/oom_score_adj"
fi
EOF
chmod 755 /var/lib/vz/snippets/oom_config.sh

Activate the script on a VM of your choice (run qm list in the shell to see the VM IDs):

qm set 104 --hookscript local:snippets/oom_config.sh
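
To check that the hookscript took effect, you can read the adjustment back from /proc once the VM has been started. The helper below is only a sketch: the function name vm_oom_adj and the PID_DIR and PROC_DIR override variables are made up for illustration (they exist so the function can be tried against a scratch directory), while the default paths are the ones used by the script above.

```shell
# Print the oom_score_adj value of the QEMU process behind a VM ID.
# PID_DIR and PROC_DIR are hypothetical overrides for experimenting;
# the defaults match the paths used in the hookscript.
vm_oom_adj() {
	vmid=$1
	pid_dir=${PID_DIR:-/run/qemu-server}
	proc_dir=${PROC_DIR:-/proc}

	# Fail early if the VM is not running (no PID file).
	pid=$(cat "$pid_dir/$vmid.pid") || return 1
	cat "$proc_dir/$pid/oom_score_adj"
}
```

Running vm_oom_adj 104 after starting the VM should print -1000 if the hookscript ran. A value of 0 would suggest the hook did not fire for that VM.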

If you are in a cluster, you may want to move the script to a filesystem that is available on all nodes, or make sure you copy it to all of them, so that the VMs get properly configured as they move around.
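
If you go the copy route, a small loop over the node names is enough. This is a rough sketch under a few assumptions: the node names passed in are placeholders, root SSH access between the nodes works, and /var/lib/vz/snippets already exists on each target. The function name copy_hookscript and the DRY_RUN variable are invented for this example.

```shell
script=/var/lib/vz/snippets/oom_config.sh

# Copy the hookscript to the same path on every node given as an argument.
# With DRY_RUN set to a non-empty value, the scp commands are only printed.
copy_hookscript() {
	for node in "$@"; do
		${DRY_RUN:+echo} scp -p "$script" "root@$node:$script"
	done
}
```

For example, DRY_RUN=1 copy_hookscript pve2 pve3 prints the two scp commands without running them. Drop DRY_RUN to actually copy.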

The Systemd Method

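Here the idea is to have a systemd path unit wait for the VM to be started and then adjust the OOM score so that the OOM killer will never touch it. A path unit activates the service of the same name by default, so 104-oom.path triggers 104-oom.service.

This example shows how to protect VM 104: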

cat <<'EOF' >/etc/systemd/system/104-oom.path
[Path]
PathModified=/run/qemu-server/104.pid

[Install]
WantedBy=multi-user.target
EOF

cat <<'EOF' >/etc/systemd/system/104-oom.service
[Unit]
Description=Protect 104 from the OOM Killer

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo -1000 >/proc/$(cat /run/qemu-server/104.pid)/oom_score_adj'
RemainAfterExit=no

[Install]
WantedBy=multi-user.target
EOF

Activate everything (--now also starts the path unit right away instead of waiting for the next boot):

systemctl daemon-reload
systemctl enable --now 104-oom.path

Pro tip: to edit the actual systemd unit file instead of creating an override, try

systemctl edit --full --force 104-oom.path
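
If several VMs need protecting, the two unit files for VM 104 above can be generated per VM ID instead of being typed out each time. The following is a minimal sketch, not a tested tool: the function name write_oom_units and its optional second argument (a target directory, useful for inspecting the output before installing it) are assumptions made for this example.

```shell
# Write a .path/.service pair for one VM ID, mirroring the units shown above.
# The target directory defaults to /etc/systemd/system.
write_oom_units() {
	vmid=$1
	unit_dir=${2:-/etc/systemd/system}

	cat >"$unit_dir/$vmid-oom.path" <<EOF
[Path]
PathModified=/run/qemu-server/$vmid.pid

[Install]
WantedBy=multi-user.target
EOF

	# \$ keeps the command substitution literal so it runs when the service starts.
	cat >"$unit_dir/$vmid-oom.service" <<EOF
[Unit]
Description=Protect $vmid from the OOM Killer

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo -1000 >/proc/\$(cat /run/qemu-server/$vmid.pid)/oom_score_adj'
EOF
}
```

After write_oom_units 105, run systemctl daemon-reload and enable 105-oom.path just as for VM 104.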