Harvester GPU Provisioning

  1. Install Harvester, then SSH into the server.

  2. Edit /boot/grub/grub.cfg as follows:

 set default=0
 set timeout=10
 
 set gfxmode=auto
 set gfxpayload=keep
 insmod all_video
 insmod gfxterm
 
 menuentry "Start Harvester" {
   search.fs_label HARVESTER_STATE root
   set sqfile=/k3os/system/kernel/current/kernel.squashfs
   loopback loop0 /$sqfile
   set root=($root)
-  linux (loop0)/vmlinuz printk.devkmsg=on console=tty1
+  linux (loop0)/vmlinuz printk.devkmsg=on intel_iommu=on modprobe.blacklist=nouveau pci=noaer
   initrd /k3os/system/kernel/current/initrd
 }
  • intel_iommu=on: Enables Intel IOMMU support. For AMD CPUs, use amd_iommu=on instead.
  • modprobe.blacklist=nouveau: Disables the nouveau driver; we will bind the vfio-pci driver to the GPU later.
  • pci=noaer: Prevents some issues related to USB device passthrough.

Reboot.
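After rebooting, you can confirm the parameters took effect by checking the kernel command line (standard Linux, not Harvester-specific):

$ cat /proc/cmdline
... printk.devkmsg=on intel_iommu=on modprobe.blacklist=nouveau pci=noaer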

  3. Find the PCI device IDs for your GPU and any other devices that may be in the same IOMMU group.
$ kubectl run -it --privileged --image ubuntu <pod name>
=> $ apt update && apt install pciutils
=> $ lspci -nnk -d 10de: # colon at the end is required

Note that "10de" is nvidia's vendor ID. One or more devices may be shown. For example:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] [10de:1fb9] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)

If multiple devices are shown, they are likely functions of the same physical card and will all need to be configured for PCIe passthrough.
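To see exactly which devices share an IOMMU group, you can inspect sysfs from the same privileged pod (the 01:00.0 address is from the example above; substitute your own):

=> $ ls /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices
0000:01:00.0  0000:01:00.1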

  4. Get the IDs of the devices. In this case, 10de:1fb9,10de:10fa. Edit /boot/grub/grub.cfg as follows:
 set default=0
 set timeout=10
 
 set gfxmode=auto
 set gfxpayload=keep
 insmod all_video
 insmod gfxterm
 
 menuentry "Start Harvester" {
   search.fs_label HARVESTER_STATE root
   set sqfile=/k3os/system/kernel/current/kernel.squashfs
   loopback loop0 /$sqfile
   set root=($root)
-  linux (loop0)/vmlinuz printk.devkmsg=on intel_iommu=on modprobe.blacklist=nouveau pci=noaer
+  linux (loop0)/vmlinuz printk.devkmsg=on intel_iommu=on modprobe.blacklist=nouveau vfio-pci.ids=10de:1fb9,10de:10fa pci=noaer
   initrd /k3os/system/kernel/current/initrd
 }

This configuration tells the kernel to bind the vfio-pci driver to these devices instead of their default drivers.

Reboot.

  5. Verify the devices are using the correct driver:
$ kubectl run -it --privileged --image ubuntu <pod name>
=> $ apt update && apt install pciutils
=> $ lspci -nnk -d 10de:

If configured correctly, you should see Kernel driver in use: vfio-pci

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GLM [Quadro T1000 Mobile] [10de:1fb9] (rev a1)
	Kernel driver in use: vfio-pci
01:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
	Kernel driver in use: vfio-pci
  6. Install the NVIDIA KubeVirt GPU device plugin:

$ kubectl create -f https://raw.githubusercontent.com/NVIDIA/kubevirt-gpu-device-plugin/master/manifests/nvidia-kubevirt-gpu-device-plugin.yaml
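Before checking the logs, you can verify the daemonset pods are running (the name prefix matches the log command in the next step):

$ kubectl -n kube-system get pods | grep nvidia-kubevirt-gpu-dp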

  7. Check the log output: $ kubectl -n kube-system logs nvidia-kubevirt-gpu-dp-daemonset-xxxxx

You should see output similar to the following:

2021/07/19 15:52:28 Not a device, continuing
2021/07/19 15:52:28 Nvidia device  0000:01:00.0
2021/07/19 15:52:28 Iommu Group 1
2021/07/19 15:52:28 Device Id 1fb9
2021/07/19 15:52:28 Nvidia device  0000:01:00.1
2021/07/19 15:52:28 Iommu Group 1
2021/07/19 15:52:28 Error accessing file path "/sys/bus/mdev/devices": lstat /sys/bus/mdev/devices: no such file or directory
2021/07/19 15:52:28 Iommu Map map[1:[{0000:01:00.0} {0000:01:00.1}]]
2021/07/19 15:52:28 Device Map map[1fb9:[1]]
2021/07/19 15:52:28 vGPU Map  map[]
2021/07/19 15:52:28 GPU vGPU Map  map[]
2021/07/19 15:52:28 DP Name TU117GLM_Quadro_T1000_Mobile
2021/07/19 15:52:28 Devicename TU117GLM_Quadro_T1000_Mobile
2021/07/19 15:52:28 TU117GLM_Quadro_T1000_Mobile Device plugin server ready

Copy the device plugin name; in this example, it is TU117GLM_Quadro_T1000_Mobile.
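The plugin registers each device as an extended resource named nvidia.com/<device plugin name>, so you can also confirm the node now advertises it (standard kubectl; <node name> is a placeholder):

$ kubectl describe node <node name> | grep nvidia.com
  nvidia.com/TU117GLM_Quadro_T1000_Mobile:  1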

  8. At the time of writing, Harvester does not have a UI for provisioning GPUs, so we will need to edit the YAML for a virtual machine. Create a VM instance and stop it. Then, edit its YAML as follows:
...
    spec:
      domain:
        cpu:
          cores: 4
          sockets: 1
          threads: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: disk-0
          - bootOrder: 1
            disk:
              bus: virtio
            name: disk-1
+         gpus:
+         - deviceName: nvidia.com/TU117GLM_Quadro_T1000_Mobile
+           name: gpu1
...

(Replace the part after nvidia.com/ with your device name)
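One way to edit the YAML from the CLI (assuming kubectl access; Harvester runs VMs through KubeVirt, so the VirtualMachine resource is available) is:

$ kubectl -n <namespace> edit virtualmachine <vm name>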

  9. Start the VM. If configured correctly, you should see the following output from the KubeVirt GPU plugin pod:
2021/07/19 15:53:08 In allocate
2021/07/19 15:53:08 Allocated devices [0000:01:00.0 0000:01:00.1]

If the VM fails to start, check the KubeVirt logs. If you see errors such as "Please ensure all devices within the iommu_group are bound to their vfio bus driver.", there are other devices in the same IOMMU group as your GPU that also need to be bound to the vfio-pci driver. Add their IDs to the kernel cmdline and reboot.
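To enumerate every device by IOMMU group (standard sysfs layout) and spot any extra group members, you can run:

$ find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/1/devices/0000:01:00.0
/sys/kernel/iommu_groups/1/devices/0000:01:00.1
...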

czadikem commented Nov 17, 2021

What has changed now that they are using grub2?

kralicky (author) commented Nov 17, 2021

@czadikem The kernel parameters should not be any different, but if /etc/default/grub is no longer read-only, you should edit them there and run grub2-mkconfig instead.
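For example, a sketch of that flow (assuming a stock grub2 layout; verify the output path on your install):

# vi /etc/default/grub                    # append the parameters to GRUB_CMDLINE_LINUX_DEFAULT
# grub2-mkconfig -o /boot/grub2/grub.cfg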

czadikem commented Nov 18, 2021

Any idea why Harvester deletes my grub edits whenever I SSH in as root? I am also unable to run grub2-mkconfig; I get "/usr/sbin/grub2-probe: error: failed to get canonical path of `overlay'".

kralicky (author) commented Nov 18, 2021


mirceanton commented Jun 24, 2022

I followed the steps presented in the documentation here:

  1. Mounted the state dir in rw mode:
mount -o remount,rw /dev/sda3 /run/initramfs/cos-state
  2. Edited the /run/initramfs/cos-state/grub2/grub.cfg file to contain:
# ...
set gfxmode=auto
set gfxpayload=keep
insmod all_video
insmod gfxterm
insmod loopback
insmod squash4

menuentry "${display_name}" --id cos {
  search --no-floppy --label --set=root COS_STATE
  set img=/cOS/active.img
  set label=COS_ACTIVE
  loopback loop0 /$img
  set root=($root)
  source (loop0)/etc/cos/bootargs.cfg
  linux (loop0)$kernel $kernelcmd ${extra_cmdline} ${extra_active_cmdline} intel_iommu=on modprobe.blacklist=nouveau vfio-pci.ids=10de:0fb9,10de:1c81,10de:1ad9,10de:1ad8,10de:10f8,10de:1e84 pci=noaer
  initrd (loop0)$initramfs
}
# ...
  3. Reboot

  4. SSH in and sudo su to get root access

# lspci -k -d 10de:
65:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3ffc
65:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3ffc
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
65:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3ffc
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci
65:00.3 Serial bus controller: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)
        Subsystem: Gigabyte Technology Co., Ltd Device 3ffc
b4:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85d7
b4:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device 85d7
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel

Which is the same output I had even before configuring anything.

I did try to install the nvidia gpu plugin, just in case, and the logs say:

2022/06/24 11:45:02 Not a device, continuing
2022/06/24 11:45:02 Nvidia device  0000:65:00.0
ERROR: logging before flag.Parse: E0624 11:45:02.561376       1 device_plugin.go:257] Could not read link driver for device 0000:65:00.0: readlink /sys/bus/pci/devices/0000:65:00.0/driver: no such file or directory
2022/06/24 11:45:02 Could not get driver for device  0000:65:00.0
2022/06/24 11:45:02 Nvidia device  0000:65:00.1
2022/06/24 11:45:02 Nvidia device  0000:65:00.2
2022/06/24 11:45:02 Nvidia device  0000:65:00.3
ERROR: logging before flag.Parse: E0624 11:45:02.561505       1 device_plugin.go:257] Could not read link driver for device 0000:65:00.3: readlink /sys/bus/pci/devices/0000:65:00.3/driver: no such file or directory
2022/06/24 11:45:02 Could not get driver for device  0000:65:00.3
2022/06/24 11:45:02 Nvidia device  0000:b4:00.0
ERROR: logging before flag.Parse: E0624 11:45:02.561879       1 device_plugin.go:257] Could not read link driver for device 0000:b4:00.0: readlink /sys/bus/pci/devices/0000:b4:00.0/driver: no such file or directory
2022/06/24 11:45:02 Could not get driver for device  0000:b4:00.0
2022/06/24 11:45:02 Nvidia device  0000:b4:00.1
2022/06/24 11:45:02 Error accessing file path "/sys/bus/mdev/devices": lstat /sys/bus/mdev/devices: no such file or directory
2022/06/24 11:45:02 Iommu Map map[]
2022/06/24 11:45:02 Device Map map[]
2022/06/24 11:45:02 vGPU Map  map[]
2022/06/24 11:45:02 GPU vGPU Map  map[]

Which is honestly what I would expect... driver issues, since I can't seem to get the vfio-pci driver to load.

Any ideas what I am doing wrong, or if anything has dramatically changed since writing this guide?

kralicky (author) commented Jun 24, 2022

@mirceanton This guide is pretty old (written before harvester 1.0), and I'm not a maintainer of Harvester so I can't guarantee that this still works. Some things I would try:

  • Check /proc/cmdline to see if your kernel parameters were actually applied
  • Check to see if vfio-pci.ko exists in /usr/lib/modules
  • Try running modprobe vfio-pci while the gpu is not bound to the nvidia driver, and see if it binds to vfio_pci.
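Concretely, those checks might look like this (paths can vary by Harvester/kernel version):

$ cat /proc/cmdline
$ find /usr/lib/modules -name 'vfio-pci*'
$ modprobe vfio-pci
$ lspci -nnk -d 10de:    # look for "Kernel driver in use: vfio-pci"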


mirceanton commented Jun 24, 2022

Managed to find a solution. The process has changed substantially up to step 5. Will test and post an update.
