@egg82
Last active May 4, 2024 14:03

NVidia Proxmox + LXC

Proxmox

Find the proper driver at the NVidia website.

Note: Make sure to select "Linux 64-bit" as your OS

Hit the "Search" button.

Hit the "Download" button.

Right-click the download button and "Copy link address".

SSH into your Proxmox instance.

Create the file /etc/modprobe.d/nvidia-installer-disable-nouveau.conf with the following contents:

# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0

Reboot the machine:

reboot now
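
Once the machine is back up, you can confirm that nouveau is no longer loaded; if the blacklist took effect, this prints nothing:

lsmod | grep nouveau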

Run the following:

apt install build-essential pve-headers-$(uname -r)
wget <link you copied>
chmod +x ./NVIDIA-Linux-x86_64-<VERSION>.run
./NVIDIA-Linux-x86_64-<VERSION>.run
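
When the installer finishes, a quick sanity check is to run nvidia-smi on the host; it should list your GPU(s) and the driver version you just installed (exact output depends on your hardware and driver):

nvidia-smi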

Edit /etc/modules-load.d/modules.conf and add the following to the end of the file:

nvidia
nvidia_uvm

Run the following:

update-initramfs -u

Create the file /etc/udev/rules.d/70-nvidia.rules and add the following:

# /etc/udev/rules.d/70-nvidia.rules
# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
# Create the CUDA node when nvidia_uvm CUDA module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"

Reboot the machine.
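
After this reboot, you can confirm that the modules loaded automatically and that the udev rules created world-writable device nodes (which /dev/nvidia* nodes exist depends on your driver version):

lsmod | grep nvidia
ls -l /dev/nvidia*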

For each container

SSH into the Proxmox host.

Run the following:

modprobe nvidia-uvm
ls /dev/nvidia* -l

Note the major device numbers in the output; you'll need them in the next step.
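
The numbers you need are the character-device major numbers, i.e. the first of the two comma-separated numbers in the ls output. As a purely illustrative example (your majors and minors will differ depending on driver version):

crw-rw-rw- 1 root root 195,   0 May  4 12:00 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 May  4 12:00 /dev/nvidiactl
crw-rw-rw- 1 root root 237,   0 May  4 12:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 237,   1 May  4 12:00 /dev/nvidia-uvm-tools

In this example, 195 and 237 are the values that go into the lxc.cgroup.devices.allow lines below.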

Edit /etc/pve/lxc/<container ID>.conf and add the following:

lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.cgroup.devices.allow: c <number from previous step>:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file
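
After saving the config, restart the container so the new device and mount entries take effect, for example from the Proxmox host (assuming your container ID is 100; substitute your own):

pct stop 100
pct start 100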

Container/LXC

SSH into your container.

Run the following:

dpkg --add-architecture i386
apt update
apt install libc6:i386

wget <link you copied for the Proxmox step>
chmod +x ./NVIDIA-Linux-x86_64-<VERSION>.run
./NVIDIA-Linux-x86_64-<VERSION>.run --no-kernel-module

Reboot the container.
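
Once the container is back up, it's worth checking that it can see the GPU through the bind mounts; nvidia-smi inside the container should report the same driver version as the host:

nvidia-smi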

CUDA

SSH back into your container.

Run the following:

apt install nvidia-cuda-toolkit nvidia-cuda-dev
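
To confirm the toolkit installed, check the compiler version (the reported CUDA release depends on the distribution's packaged toolkit):

nvcc --version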

Note: Plex DOES NOT USE THE GPU until you install CUDA

During the install process, Plex will detect that you have a GPU and enable the hardware transcoding checkbox, but it will NOT use the GPU until CUDA is installed.
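
A simple way to confirm Plex is actually using the GPU is to start a hardware transcode and watch the GPU from the host or the container; a Plex Transcoder process should typically appear along with non-zero encoder/decoder utilization:

watch -n 1 nvidia-smi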

Python/cuDNN

SSH into your container.

Run the following:

apt install python3 python3-dev python3-pip python3-pycuda
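
A quick way to check that PyCUDA can see the GPU from inside the container (a minimal sketch, assuming at least one CUDA-capable device is exposed):

python3 -c "import pycuda.driver as drv; drv.init(); print(drv.Device(0).name())"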

Check your CUDA version (nvidia-smi reports the driver's supported CUDA version in the top-right of its header):

nvidia-smi

Download the correct cuDNN library from the NVidia website (requires creating an account, but it's free).

Upload it to your container.

Run the following:

tar -xvf cudnn-<tab>
mkdir -p /usr/local/cuda/lib64/
mkdir -p /usr/local/cuda/include/
cp cuda/lib64/* /usr/local/cuda/lib64/
cp cuda/include/* /usr/local/cuda/include/
export CUDA_ROOT=/usr/local/cuda
export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$LD_LIBRARY_PATH
export CPATH=$CUDA_ROOT/include:$CPATH
export LIBRARY_PATH=$CUDA_ROOT/lib64:$LIBRARY_PATH
echo "export CUDA_ROOT=/usr/local/cuda" >> .bashrc
echo "export LD_LIBRARY_PATH=\$CUDA_ROOT/lib64:\$LD_LIBRARY_PATH" >> .bashrc
echo "export CPATH=\$CUDA_ROOT/include:\$CPATH" >> .bashrc
echo "export LIBRARY_PATH=\$CUDA_ROOT/lib64:\$LIBRARY_PATH" >> .bashrc

Done!

@fisherwei

PVE 7.x and later use cgroup2.

TLDR: lxc.cgroup.devices.allow MUST be changed to lxc.cgroup2.devices.allow

https://forum.proxmox.com/threads/pve-7-0-lxc-intel-quick-sync-passtrough-not-working-anymore.92025/

@BlummNikkiS

Thank you!

@bobo-jamson

I think there is a minor error. Based on the pattern of mapping each /dev/nvidia-* device to the same dev/nvidia-* path, in

lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file

the source /dev/nvidia-uvm-tools should likely be /dev/nvidia-caps/nvidia-cap1 and /dev/nvidia-caps/nvidia-cap2, respectively.

Thanks for this really helpful guide by the way.

@doughnet

It should be mentioned that every time a kernel update is completed and a reboot is done, the numbers from the command "ls /dev/nvidia* -l" can change, leaving the container with the passthrough unable to see the NVIDIA device.

@doughnet

This is also a good link to have for a patcher and a direct link for the Linux drivers. In addition, the script in the repo allows bypassing NVIDIA's restrictions:

https://github.com/keylase/nvidia-patch

@GarckaMan

I upgraded to Proxmox 8, which comes with the 6.2 kernel.
With that, I can't install the CUDA drivers anymore. Any solution to that?

@ryc111

ryc111 commented Aug 19, 2023

For PVE 8 and a Debian LXC:

I tried this on both the host and the LXC, replacing the apt install nvidia-cuda-toolkit nvidia-cuda-dev step, to install CUDA:
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Debian&target_version=11&target_type=deb_network

Run the following on both the host and the LXC to install CUDA and the related driver:

wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
add-apt-repository contrib
apt-get update
apt-get -y install cuda

And it works now!

@Jugrnot

Jugrnot commented Sep 19, 2023

I hate to necro this, but I'm having some issues. I followed the original writeup to a T and had no errors at any point in the process. I edited the LXC conf, added what was necessary, and rebooted. From inside my Plex container, for example, nvidia-smi shows the cards are present and the Plex installer recognizes that an NVIDIA card exists, yet in the Plex transcode options I can't select anything other than "Auto" for hardware. Using "Auto", nothing ever touches the GPUs. Both the host and the container have the exact same versions of the driver and CUDA installed. Did I install the wrong version or something?


@swahpy

swahpy commented Nov 2, 2023

This is also a good link to have for a patcher and a direct link for the Linux drivers. In addition, the script in the repo allows bypassing NVIDIA's restrictions

Hi, did you just simply replace apt install nvidia-cuda-toolkit nvidia-cuda-dev with the following commands?

wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
add-apt-repository contrib
apt-get update
apt-get -y install cuda

But when I ran the above commands, it said the driver version was not compatible and kept asking me to run the NVIDIA uninstaller to remove the driver. I could not even run nvidia-smi successfully.
So could you help by sharing all the steps you did? Thank you very much.

@chyld

chyld commented Feb 9, 2024

Here are my changes to /etc/pve/lxc/100.conf

lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 234:* rwm
lxc.cgroup2.devices.allow: c 237:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap1 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-caps/nvidia-cap2 none bind,optional,create=file

It works flawlessly!
