
Nvidia Quadro P400 Passthrough on Proxmox - Reddit Post

Source: https://www.reddit.com/r/jellyfin/comments/cig9kh/nvidia_quadro_p400_passthrough_on_proxmox/

Nvidia Quadro P400 Passthrough on Proxmox

OK, I finally got it working. Here is a write-up of my steps, but note that they are for Proxmox with an Ubuntu 18.10 container only. If you want to use a different setup, you will have to adapt it yourself.

On the host (Proxmox):

  • Install the Proxmox Linux Headers. The version should match your kernel
    apt install pve-headers-$(uname -r)
  • Download the Nvidia driver, make it executable and run it:
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/430.34/NVIDIA-Linux-x86_64-430.34.run && chmod +x NVIDIA-Linux-x86_64-430.34.run && ./NVIDIA-Linux-x86_64-430.34.run
    When the installer asks, register the kernel module with DKMS, but do not let it update the X configuration file.
  • Load the Nvidia kernel modules at boot time. To do this, edit the file /etc/modules-load.d/modules.conf and add the lines nvidia and nvidia-uvm. The file should then look something like this:

    # /etc/modules: kernel modules to load at boot time.
    #
    # This file contains the names of kernel modules that should be loaded
    # at boot time, one per line. Lines beginning with "#" are ignored.
    nvidia
    nvidia-uvm
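  • Optionally, load the modules by hand before rebooting to confirm that the driver built and loads cleanly:
    modprobe nvidia
    modprobe nvidia-uvm
    lsmod | grep nvidia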
  • Create a script (I used vim /root/nvidia-dev-node-setup) and fill it with the following bash code:

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    NVDEVS=`lspci | grep -i NVIDIA`
    N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
    NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i
    done
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
     # Find out the major device number used by the nvidia-uvm driver
     D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
     mknod -m 666 /dev/nvidia-uvm c $D 0
else
    exit 1
fi

# Make sure any remaining NVIDIA device files exist (nvidia-modprobe creates them as root)
/usr/bin/nvidia-modprobe -u -c 0

# Enable persistence mode so the driver stays initialized while no clients are connected
/usr/bin/nvidia-persistenced --persistence-mode
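  • Make the script executable; the cron job in the next step runs it by path, so it needs the execute bit. Running it once by hand is also a quick way to check that the device nodes get created:
    chmod +x /root/nvidia-dev-node-setup
    /root/nvidia-dev-node-setup
    ls -lah /dev/nvidia*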
  • Edit your crontab with crontab -e and add the following line at the end: @reboot /root/nvidia-dev-node-setup
  • Reboot your Proxmox host. After the reboot, the command ls -lah /dev/nvidia* should show these devices:

root@pve:~# ls -lah /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Jul 28 14:34 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jul 28 14:34 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Jul 28 14:34 /dev/nvidia-modeset
crw-rw-rw- 1 root root 237,   0 Jul 28 14:34 /dev/nvidia-uvm
crw-rw-rw- 1 root root 237,   1 Jul 28 14:34 /dev/nvidia-uvm-tools
  • When executing ls -lah /dev/dri/* you should see something like this:

root@pve:~# ls -lah /dev/dri/*
crw-rw---- 1 root video 226,   0 Jul 28 14:34 /dev/dri/card0
crw-rw---- 1 root video 226,   1 Jul 28 14:34 /dev/dri/card1
crw-rw---- 1 root video 226, 128 Jul 28 14:34 /dev/dri/renderD128
  • Note the major device numbers in the fifth column of both listings (e.g. 195, 237 and 226); you will need them for the container configuration later.
  • Last but not least, the Quadro should be recognized by nvidia-smi:

root@pve:~# nvidia-smi
Sun Jul 28 14:51:13 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.34       Driver Version: 430.34       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P400         On   | 00000000:07:00.0 Off |                  N/A |
| 34%   35C    P8    N/A /  N/A |      1MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
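Before moving on to the container: the same major numbers can also be read straight from /proc/devices (this is where the script above looks up the nvidia-uvm number), which is a quick way to double-check them:

# drm (226) covers /dev/dri/*; the nvidia entries cover the /dev/nvidia* nodes
grep -E 'nvidia|drm' /proc/devices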

In the container

  • Set up a privileged Ubuntu 18.10 LXC container on Proxmox (I don't know whether an unprivileged container works too)
  • Set up a sudo user: https://linuxize.com/post/how-to-create-a-sudo-user-on-ubuntu/
  • Log in as the new user and install Jellyfin (taken from the official documentation):
    sudo apt install -y apt-transport-https software-properties-common && sudo add-apt-repository universe && wget -O - https://repo.jellyfin.org/ubuntu/jellyfin_team.gpg.key | sudo apt-key add - && echo "deb [arch=$( dpkg --print-architecture )] https://repo.jellyfin.org/ubuntu $( lsb_release -c -s ) main" | sudo tee /etc/apt/sources.list.d/jellyfin.list && sudo apt update && sudo apt install -y jellyfin && sudo systemctl enable jellyfin && sudo reboot
  • After rebooting, download and install the same Nvidia driver as on the host, but without the kernel modules (the container shares the host's kernel, which already has them):
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/430.34/NVIDIA-Linux-x86_64-430.34.run && chmod +x NVIDIA-Linux-x86_64-430.34.run && sudo ./NVIDIA-Linux-x86_64-430.34.run --no-kernel-module

Again on the host

  • When this is done, shut down the container and edit its conf file. For me it was the container with the id 115:
    vim /etc/pve/nodes/pve/lxc/115.conf
  • Add the following lines to the conf file, replacing the numbers in the first three lines with the major numbers you noted in the ls output above. Use one lxc.cgroup.devices.allow line per major number and adjust the number of lines accordingly:

lxc.cgroup.devices.allow: c 226:* rwm
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 237:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
  • Reboot the container

Back in the container

  • Run nvidia-smi. The graphics card should be recognized as shown above.
  • Run ls -lah /dev/nvidia* and ls -lah /dev/dri/*. You should see the same device nodes as on the host.
  • Test ffmpeg with the transcoding command below, which I picked up from the Jellyfin logs (I don't know whether it is versatile enough to work with any test file on any machine). You should not see any error messages. A simpler smoke test that needs no input file is sketched after this list.
    /usr/lib/jellyfin-ffmpeg/ffmpeg -c:v h264_cuvid -resize 426x238 -i file:"/path/to/input.mkv" -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -force_key_frames "expr:gte(t,n_forced*5)" -copyts -avoid_negative_ts disabled -start_at_zero -pix_fmt yuv420p -preset default -b:v 64000 -maxrate 64000 -bufsize 128000 -profile:v high -vsync -1 -map_metadata -1 -map_chapters -1 -threads 0 -codec:a:0 libmp3lame -ac 2 -ab 128000 -af "volume=2" -y /path/to/output.mkv
  • Go to the Jellyfin web interface, Hamburger Menu/Admin/Dashboard/Transcoding. Choose Nvidia NVENC, check all the boxes that appear (the format and hardware acceleration options), and save the settings.
  • Play something in Jellyfin. nvidia-smi should now show an ffmpeg process (on the host and/or in the container).
  • Congratulations, you have passed your graphics card through to Jellyfin in an LXC container on Proxmox.
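If you don't have a suitable test file at hand, a minimal NVENC smoke test can be run instead. This is just a sketch, assuming jellyfin-ffmpeg includes the lavfi test source; it encodes five seconds of generated video and discards the output:

/usr/lib/jellyfin-ffmpeg/ffmpeg -f lavfi -i testsrc=duration=5:size=1280x720:rate=30 -c:v h264_nvenc -f null -

Note that this only exercises the NVENC encoder, not the cuvid decoder, so the full transcoding command above is still the better end-to-end test.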

Talking points

  • I really struggled with Reddit's markdown when writing this, so if someone wants to structure/format this guide in a better way: be my guest.
  • I ended up passing through everything graphics/Nvidia related to the container. Most likely not all of it is necessary, but when I did some tests removing device nodes, everything stopped working. So I leave it as is, but everyone is invited to optimize it.
  • The device cgroup numbers might change when the host reboots. If this becomes a problem, a script may be needed to check or update the LXC conf file before starting the container (a rough sketch follows below). Or is there a way to pin these values?
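On that last point, here is a rough, untested sketch of such a check. It assumes the container ID 115 from above and only verifies that the nvidia-uvm major number in the kernel still matches an allow rule in the conf file; adapt it if other numbers drift as well:

#!/bin/bash
# Sketch: warn before starting container 115 if the nvidia-uvm major number
# no longer matches any cgroup allow rule in the LXC config.
CONF=/etc/pve/nodes/pve/lxc/115.conf
UVM_MAJOR=$(grep nvidia-uvm /proc/devices | awk '{print $1}' | head -n1)

if ! grep -q "c ${UVM_MAJOR}:\* rwm" "$CONF"; then
    echo "nvidia-uvm now has major ${UVM_MAJOR}; update the allow lines in ${CONF}" >&2
    exit 1
fi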

[Initial post]

Hi guys,

I bought a Quadro P400 for my home server to do some transcoding of 4K videos. I spent the last weekend figuring out how to get transcoding working in my Jellyfin LXC container on Proxmox. I ended up passing through /dev/dri/* (which only seems to be for VAAPI) as well as /dev/nvidia0, /dev/nvidiactl and /dev/nvidia-uvm. Transcoding still didn't work, though.

Does someone know how to setup a Quadro/Nvidia passthrough on Proxmox with Jellyfin?

Thanks!

Posted by FriedrichNietzsche84 (↑ 16 / ↓ 0)


vadimstasiev commented Feb 7, 2023

source: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

Docker setup with the Nvidia container toolkit:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
            sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
            sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

apt update -y

apt install docker-ce nvidia-docker2

systemctl restart docker

Test that it works:

docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi


Error:

nvidia-container-cli: mount error: failed to add device rules: unable to find any existing device filters attached to the cgroup: bpf_prog_query(BPF_CGROUP_DEVICE) failed: operation not permitted: unknown

Fix

source: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#step-3-rootless-containers-setup

Allow the Nvidia container runtime to run in a rootless container:
sudo sed -i 's/^#no-cgroups = false/no-cgroups = true/;' /etc/nvidia-container-runtime/config.toml
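After editing config.toml, restart Docker so the change takes effect, then re-run the test container from above; the mount error should be gone:

systemctl restart docker
docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi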


vadimstasiev commented Feb 9, 2023

Final working container config; note that I am passing through an Intel iGPU along with the Quadro P400:

arch: amd64
cores: 3
cpulimit: 1
features: nesting=1
hostname: docker-tdar
memory: 8000
net0: name=eth0,bridge=vmbr0,gw=10.10.10.1,hwaddr=56:33:0A:E0:71:16,ip=10.10.10.86/24,type=veth
onboot: 0
ostype: debian
rootfs: GREEN250G:2303/vm-2303-disk-0.raw,size=60G
swap: 0
unprivileged: 1
lxc.cgroup.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file,mode=0666
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/renderD128"
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/card0"
lxc.idmap: u 0 100000 44
lxc.idmap: g 0 100000 44
lxc.idmap: u 44 44 1
lxc.idmap: g 44 44 1
lxc.idmap: u 45 100045 60
lxc.idmap: g 45 100045 60
lxc.idmap: u 105 103 1
lxc.idmap: g 105 103 1
lxc.idmap: u 106 100106 894
lxc.idmap: g 106 100106 894
lxc.idmap: u 1000 1000 1
lxc.idmap: g 1000 1000 1
lxc.idmap: u 1001 101001 64535
lxc.idmap: g 1001 101001 64535
lxc.cgroup.devices.allow: c 226:* rwm
lxc.cgroup.devices.allow: c 195:* rwm
lxc.cgroup.devices.allow: c 237:* rwm
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/nvidia*"
lxc.hook.pre-start: sh -c "chown 1000:1000 /dev/dri/card1"
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri/card0 dev/dri/card0 none bind,optional,create=file
lxc.mount.entry: /dev/dri/card1 dev/dri/card1 none bind,optional,create=file
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
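A quick way to check the result from the Proxmox host, assuming the container ID is 2303 (going by the rootfs line above):

pct stop 2303 && pct start 2303   # restart the container so the config is applied
pct exec 2303 -- nvidia-smi       # the P400 should show up inside the container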
