VMs tend to occupy a lot of memory, but they are normally also the official denizens of a server. So if memory gets tight, we would rather have the OOM killer kill the new kid on the block (some process which has suddenly started using more RAM) instead of the regular VM crowd.
Obviously bad things will happen if the OOM killer is not able to free enough memory to make your machine happy again, but by adjusting the OOM score of the VM processes you at least have some control over who gets killed.
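To get a feeling for who the kernel would currently pick first, you can rank processes by the oom_score the kernel exposes under /proc. This is a minimal sketch, not part of the setup itself; the function name oom_ranking is made up for this example, and it assumes a Linux /proc filesystem.

```shell
#!/bin/sh
# oom_ranking is a hypothetical helper name, not a Proxmox or kernel tool.
# Prints "score pid command" for the processes with the highest oom_score;
# a higher score means the OOM killer is more likely to pick that process.
oom_ranking() {
    for d in /proc/[0-9]*; do
        # processes may exit while we iterate, so skip unreadable entries
        score=$(cat "$d/oom_score" 2>/dev/null) || continue
        name=$(cat "$d/comm" 2>/dev/null) || continue
        echo "$score ${d#/proc/} $name"
    done | sort -rn | head -n "${1:-10}"
}
```

Calling oom_ranking 5 lists the five most likely victims. A process whose oom_score_adj is set to -1000 should show up with a score of 0.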
Proxmox lets you configure a hook script which is run at various points in the lifetime of a guest. There is an example script in
/usr/share/pve-docs/examples/guest-example-hookscript.pl
for inspiration.
So I created a script to adjust the oom_score_adj of the VM process:
mkdir -p /var/lib/vz/snippets
cat <<'EOF' >/var/lib/vz/snippets/oom_config.sh
#!/bin/sh
vmid=$1
phase=$2
if [ "$phase" = 'post-start' ]; then
    # the pid file only exists once the vm is running, so read it here
    vmpid=$(cat "/run/qemu-server/$vmid.pid")
    echo "Protecting the vm from the oom-killer"
    echo -1000 > "/proc/$vmpid/oom_score_adj"
fi
EOF
chmod 755 /var/lib/vz/snippets/oom_config.sh
Activate the script on a VM of your choice (run qm list in the shell to see the VM IDs):
qm set 104 --hookscript local:snippets/oom_config.sh
If you are in a cluster, you may want to move the script to a filesystem which is available on all nodes, or make sure you copy it to all of them, so that the VMs get properly configured as they move around.
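After the next start of the VM it is worth checking that the hook script actually did its job. The following is a small sketch for that; the function name vm_oom_adj is made up for this example, and the optional second argument (the pid directory) exists only so the function can be tried out without a running VM.

```shell
#!/bin/sh
# vm_oom_adj is a hypothetical helper name, not a Proxmox command.
# Prints the oom_score_adj of the QEMU process belonging to a VM id.
vm_oom_adj() {
    vmid=$1
    # same pid file location the hook script reads from
    piddir=${2:-/run/qemu-server}
    pidfile="$piddir/$vmid.pid"
    if [ ! -r "$pidfile" ]; then
        echo "no pid file for vm $vmid (is it running?)" >&2
        return 1
    fi
    vmpid=$(cat "$pidfile")
    cat "/proc/$vmpid/oom_score_adj"
}
```

For a protected VM, vm_oom_adj 104 should print -1000. If it prints 0, the hook script did not run or failed.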
An alternative that does not need a hook script is to have a systemd path unit wait for the VM to be started, and then modify the OOM score adjustment such that the OOM killer will never touch it.
This example shows how to protect VM 104:
cat <<'EOF' >/etc/systemd/system/104-oom.path
[Path]
PathModified=/run/qemu-server/104.pid
[Install]
WantedBy=multi-user.target
EOF
cat <<'EOF' >/etc/systemd/system/104-oom.service
[Unit]
Description=Protect 104 from the OOM Killer
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo -1000 >/proc/$(cat /run/qemu-server/104.pid)/oom_score_adj'
RemainAfterExit=no
[Install]
WantedBy=multi-user.target
EOF
Activate everything:
systemctl daemon-reload
systemctl enable --now 104-oom.path
Pro tip: to edit the actual systemd unit file, and not use an override, try
systemctl edit --full --force 104-oom.path
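If you want the systemd variant for more than one VM, typing the two unit files by hand gets tedious. Here is a sketch that writes the same pair of files for an arbitrary VM id; the function name write_oom_units and the optional target directory argument are made up for this example, while the unit contents are the ones shown above with 104 replaced by the id you pass in.

```shell
#!/bin/sh
# write_oom_units is a hypothetical helper name, not a systemd or Proxmox tool.
# Writes <vmid>-oom.path and <vmid>-oom.service into the unit directory.
write_oom_units() {
    vmid=$1
    unitdir=${2:-/etc/systemd/system}
    cat >"$unitdir/$vmid-oom.path" <<EOF
[Path]
PathModified=/run/qemu-server/$vmid.pid
[Install]
WantedBy=multi-user.target
EOF
    # \$( keeps the command substitution literal, so it runs when the
    # service starts and not while this file is being generated
    cat >"$unitdir/$vmid-oom.service" <<EOF
[Unit]
Description=Protect $vmid from the OOM Killer
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo -1000 >/proc/\$(cat /run/qemu-server/$vmid.pid)/oom_score_adj'
RemainAfterExit=no
[Install]
WantedBy=multi-user.target
EOF
}
```

After write_oom_units 105 you still need to run systemctl daemon-reload and enable 105-oom.path, just like for 104.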