GPU Passthrough on saitama

Preamble

saitama has an AMD socket AM4 B350 motherboard with no integrated graphics device. It has two NVIDIA cards plugged into PCIe slots: the weaker card is in the first slot, making it the primary card, and the better card is in the fourth slot. The better card therefore sits on a slower slot, but this arrangement proved necessary: when the primary card was passed through, the host could be configured not to claim it, but the VM's BIOS and guest OS would not detect it. Ubuntu worked as a guest OS; Windows did not.

Prepare the host

  1. If necessary, enable IOMMU in the BIOS.

  2. Edit /etc/default/grub. Append "iommu=pt iommu=1" to GRUB_CMDLINE_LINUX_DEFAULT [1]:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash iommu=pt iommu=1"
    
  3. Update grub:

    $ sudo update-grub
    
  4. Reboot
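
     A quick check after rebooting (not one of the original steps) that the kernel parameters took effect and that the IOMMU came up. On this AMD board the relevant dmesg lines mention AMD-Vi:

    $ cat /proc/cmdline
    $ sudo dmesg | grep -i -e iommu -e amd-vi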

  5. Identify the PCIe bus that the GPU we're passing through is on:

    $ lspci -nnk
    ...
    1b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107 [GeForce GTX 750] [10de:1381] (rev a2)
     Subsystem: Gigabyte Technology Co., Ltd GM107 [GeForce GTX 750] [1458:362e]
     Kernel driver in use: nvidia
     Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
    1b:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
     Subsystem: Gigabyte Technology Co., Ltd GM107 High Definition Audio Controller [GeForce 940MX] [1458:362e]
     Kernel driver in use: snd_hda_intel
     Kernel modules: snd_hda_intel
    ...
    
  6. Check which IOMMU group the graphics card for Windows is in:

    $ find /sys/kernel/iommu_groups/ -type l
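
     To see each IOMMU group together with the devices in it, a small helper loop (not part of the original write-up) can make the output of the command above easier to read:

    for g in /sys/kernel/iommu_groups/*; do
        echo "IOMMU group ${g##*/}:"
        for d in "$g"/devices/*; do
            lspci -nns "${d##*/}"
        done
    done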
    
  7. Check that the IOMMU group contains only devices that you want to pass through to Windows. In the case of saitama, the group includes not just the graphics card but also a USB controller, a PCIe bridge, a SATA controller and an Ethernet controller. At first this looks like a poor choice, but the alternatives turned out to be worse. The primary graphics card can be passed through by disabling the EFI framebuffer, but neither the BIOS of the guest VM nor the guest OS could detect the card: Ubuntu worked as a guest OS, Windows did not, and the OVMF/TianoCore BIOS could not show a graphical splash screen. Installing a prepared kernel patched to allow the ACS override also failed, and maintaining a self-compiled kernel seemed unattractive. The benefit of passing through the USB, SATA and Ethernet controllers is that Windows gets a complete set of near-native-speed devices.

    $ ls -lhA /sys/bus/pci/devices/0000\:1b\:00.0/iommu_group/devices/
    total 0
    ... 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.0
    ... 0000:03:00.1 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.1
    ... 0000:03:00.2 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2
    ... 0000:16:00.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:00.0
    ... 0000:16:01.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:01.0
    ... 0000:16:04.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:04.0
    ... 0000:18:00.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:01.0/0000:18:00.0
    ... 0000:1b:00.0 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:04.0/0000:1b:00.0
    ... 0000:1b:00.1 -> ../../../../devices/pci0000:00/0000:00:01.3/0000:03:00.2/0000:16:04.0/0000:1b:00.1
    
  8. Blacklist the GPU we're passing through to the VM so that the graphics driver can't grab it. We use the pci-stub module to claim the card before nvidia or nouveau can. Add "pci-stub" to /etc/initramfs-tools/modules:

    $ echo "pci-stub" | sudo tee -a /etc/initramfs-tools/modules
    
  9. Pass pci-stub the IDs of the graphics controller and the audio device (as found with lspci -nnk above). Set pci-stub as a dependency for drm, otherwise the graphics driver will be loaded before the pci-stub driver. (Check dmesg to see when this happens.)

    $ sudo vim /lib/modprobe.d/pci-stub.conf
    options pci-stub ids=10de:1381,10de:0fbc
    softdep drm pre: pci-stub
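
     An aside, not from the original notes: if pci-stub turns out to be built into the kernel (CONFIG_PCI_STUB=y) rather than built as a module, the modprobe options file has no effect; in that case the same IDs can be passed on the kernel command line instead, e.g. by appending pci-stub.ids=10de:1381,10de:0fbc to GRUB_CMDLINE_LINUX_DEFAULT and re-running update-grub. To check which applies:

    $ grep CONFIG_PCI_STUB /boot/config-$(uname -r)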
    
  10. Update the existing initramfs image:

    $ sudo update-initramfs -u
    
  11. Reboot

  12. Confirm that pci-stub claimed the devices:

    $ lspci -nnk
    ...
    1b:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107 [GeForce GTX 750] [10de:1381] (rev a2)
            Subsystem: Gigabyte Technology Co., Ltd GM107 [GeForce GTX 750] [1458:362e]
            Kernel driver in use: pci-stub
            Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
    1b:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
            Subsystem: Gigabyte Technology Co., Ltd GM107 High Definition Audio Controller [GeForce 940MX] [1458:362e]
            Kernel driver in use: pci-stub
            Kernel modules: snd_hda_intel
    
[1] https://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

Create scripts to bind devices

  1. Create a script to bind passthrough devices to vfio-pci:

    $ vim bind-vfio-pci
    #!/bin/sh
    
    modprobe vfio-pci
    
    # Bind vfio-pci to the USB controller
    echo '0000:03:00.0' | tee /sys/bus/pci/drivers/xhci_hcd/unbind
    echo '0000:03:00.0' | tee /sys/bus/pci/drivers/vfio-pci/bind
    
    # Bind vfio-pci to the SATA controller
    echo '0000:03:00.1' | tee /sys/bus/pci/drivers/ahci/unbind
    echo '0000:03:00.1' | tee /sys/bus/pci/drivers/vfio-pci/bind
    
    # vfio-pci does not support bridges. Just unbind it from the host.
    echo '0000:03:00.2' | tee /sys/bus/pci/drivers/pcieport/unbind
    echo '0000:16:00.0' | tee /sys/bus/pci/drivers/pcieport/unbind
    echo '0000:16:01.0' | tee /sys/bus/pci/drivers/pcieport/unbind
    echo '0000:16:04.0' | tee /sys/bus/pci/drivers/pcieport/unbind
    
    $ chmod +x bind-vfio-pci
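
     An alternative sketch (not what the original setup uses): since kernel 3.16 each PCI device has a driver_override attribute in sysfs, which saves having to name the target driver when rebinding. With vfio-pci already loaded, the USB controller, for example, could be moved over like this:

    echo vfio-pci | tee /sys/bus/pci/devices/0000:03:00.0/driver_override
    echo '0000:03:00.0' | tee /sys/bus/pci/drivers/xhci_hcd/unbind
    echo '0000:03:00.0' | tee /sys/bus/pci/drivers_probe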
    
  2. Create another script to return devices to their normal drivers:

    $ vim unbind-vfio-pci
    #!/bin/sh
    
    echo '0000:03:00.0' > /sys/bus/pci/drivers/vfio-pci/unbind
    echo '0000:03:00.0' > /sys/bus/pci/drivers/xhci_hcd/bind
    
    echo '0000:03:00.1' > /sys/bus/pci/drivers/vfio-pci/unbind
    echo '0000:03:00.1' > /sys/bus/pci/drivers/ahci/bind
    
    echo '0000:03:00.2' > /sys/bus/pci/drivers/pcieport/bind
    echo '0000:16:00.0' > /sys/bus/pci/drivers/pcieport/bind
    echo '0000:16:01.0' > /sys/bus/pci/drivers/pcieport/bind
    echo '0000:16:04.0' > /sys/bus/pci/drivers/pcieport/bind
    
    $ chmod +x unbind-vfio-pci
    

Create script for Windows VM

  1. Install QEMU, KVM, and the OVMF UEFI BIOS:

    $ sudo apt-get install qemu-kvm ovmf
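
     Before going further it is worth confirming that hardware virtualisation is usable on the host. kvm-ok comes from the cpu-checker package (an extra check, not part of the original steps):

    $ sudo apt-get install cpu-checker
    $ sudo kvm-ok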
    
  2. Copy the OVMF variables image to support UEFI variables:

    $ cp /usr/share/OVMF/OVMF_VARS.fd ovmf_vars.fd
    
  3. Create a script for your VM.

    • Here is the script. We will unpack it next.

         $ vim windows
         #!/bin/sh
      
         USB_DEVICE=03:00.0
         SATA_DEVICE=03:00.1
         ETH_DEVICE=18:00.0
         GPU_VIDEO=1b:00.0
         GPU_AUDIO=1b:00.1
      
         ./bind-vfio-pci
      
         qemu-system-x86_64 \
             -enable-kvm \
             -monitor stdio \
             -name win10 \
             \
             -machine type=q35,accel=kvm,kernel_irqchip=on \
             -cpu EPYC,kvm=off,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_vendor_id=NoFortyThree \
             -m 4G \
             -net none \
             -usb \
             -vga none \
             \
             -drive if=pflash,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.fd \
             -drive if=pflash,format=raw,file=ovmf_vars.fd \
             \
             -device vfio-pci,host=$USB_DEVICE \
             -device vfio-pci,host=$SATA_DEVICE,rombar=0 \
             -device vfio-pci,host=$ETH_DEVICE \
             -device vfio-pci,host=$GPU_VIDEO,multifunction=on,x-vga=on \
             -device vfio-pci,host=$GPU_AUDIO
      
         ./unbind-vfio-pci
      
      $ chmod +x windows
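
       Both bind-vfio-pci and unbind-vfio-pci write to sysfs, so the whole script needs to be run as root, for example (assuming it sits in the current directory):

       $ sudo ./windows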
      
    • For convenience we assign the PCI addresses of the devices to be passed through to the variables USB_DEVICE, SATA_DEVICE, ETH_DEVICE, GPU_VIDEO and GPU_AUDIO.

    • We chose -machine type=q35 to use a PCIe bus.

    • We use -cpu EPYC because -cpu host causes Windows to keep rebooting, and -cpu EPYC best models the features of the host's Ryzen 5 CPU.

    • We told the VM to hide the fact that we are using KVM with kvm=off, and to report a fake vendor ID with hv_vendor_id=NoFortyThree. This works around NVIDIA's Windows driver refusing to start when it detects a hypervisor (the well-known Code 43 error).

    • We enabled Hyper-V enlightenments with hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time.

    • We are using drive interface if=pflash for the BIOS to support UEFI variables.

Set up the Windows VM

See How to configure GPU passthrough on trillian for instructions to download an image, write it to a file or device, and change the image to boot using UEFI.

Alternative: Copy the Windows VM from another host

I set up the Windows VM once, on trillian, and copy it to saitama using netcat:

  1. Follow the steps for setting up the Windows VM on trillian until the NVIDIA driver and Steam are installed.

  2. On saitama, wipe the MBR, GPT and filesystem signatures off the drive that is passed through to the VM. (On saitama /dev/sdx is /dev/sda.)

    $ sudo wipefs -a --backup /dev/sdx
    
  3. Use netcat to listen on a port (say, 4444) and, as root, write the incoming data directly to /dev/sdx:

    $ sudo su -
    # nc -l 4444 > /dev/sdx
    
  4. On trillian, cat the VM drive to 4444 on saitama:

    (trillian) $ sudo cat /dev/sdxX | nc -N 172.16.2.XXX 4444
    

    ... or ...

    (trillian) $ sudo losetup --show -o XXXXXX -f /dev/sdxX
    /dev/loopN
    (trillian) $ sudo cat /dev/loopN | nc -N 172.16.2.XXX 4444
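
     The transfer gives no feedback while it runs. If the pv package is installed on trillian, inserting pv into the pipeline shows bytes transferred and throughput (an optional extra, not part of the original procedure):

    (trillian) $ sudo cat /dev/sdxX | pv | nc -N 172.16.2.XXX 4444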
    

Reinstalling Windows

  1. Mount Windows drive C.
  2. Back up the IEUser directory. Use tar + gzip because it will preserve the right metadata for NTFS.
  3. Follow the steps for trillian to set up the Windows VM, until the NVIDIA driver and Steam are installed. The documentation refers to the target device as /dev/sdxX. On saitama the target device is /dev/sda.
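
A possible way to do the backup in step 2 above, assuming Windows drive C is mounted at /mnt/win_c (the mount point and archive path here are assumptions, not from the original notes):

    $ sudo tar -czpf ~/ieuser-backup.tar.gz -C /mnt/win_c/Users IEUser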

Optimising Windows

  1. Use GParted to add a data drive:

    $ sudo gparted /dev/sdx
    
  2. Assign more processors to Windows. nproc will tell you how many processors are available, e.g.:

    $ nproc
    8
    $ vim windows
    ...
        -smp 4 \
    ...
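
     On an SMT-capable Ryzen it can also help to spell out the topology, so that the guest sees cores and threads rather than one socket per vCPU; a possible variant (adjust sockets/cores/threads to your CPU):

        -smp 4,sockets=1,cores=2,threads=2 \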
    
  3. Disable Hibernate: Run "cmd" as administrator, and

    > powercfg -h off
    
  4. Disable Suspend: Start > Settings > System > Power & sleep > Sleep: "Never"

  5. Disable Cortana:

    1. Run "gpedit.msc"
    2. Go to Computer Configuration > Administrative Templates > Windows Components > Search
    3. Find "Allow Cortana", and set it to "Disabled". Click "OK".
    4. Reboot, or log out and log in.
  6. If you are going to assign more than 4 GB of RAM to Windows, hugepage support will improve performance. Hugepage support is enabled in Ubuntu by default. The following is based on the Ubuntu Community Help Wiki and the ArchWiki.

    1. Confirm that hugepage size is 2048 KB:

      $ cat /proc/meminfo | grep Hugepagesize
      Hugepagesize:       2048 kB
      
    2. If we want to assign 6 GB to Windows, that will be 6 × 1024 × 1024 ÷ 2048 = 6 × 1024 ÷ 2 = 3072 hugepages. Round up to 3100. If we want to assign 12 GB to Windows, that will be 12 × 1024 ÷ 2 = 6144 hugepages. Round up to 6150. Add the following to the script to reserve 3100 (for example) hugepages:

      $ vim windows
      ...
      sysctl vm.nr_hugepages=3100
      ...
          -m 6G \
          -mem-path /dev/hugepages \
      ...
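
       Setting vm.nr_hugepages in the VM script means the reservation only lasts until the next reboot. To reserve the pages permanently instead, a sysctl drop-in would do it (an alternative, not part of the original setup; the file name is arbitrary):

      $ echo 'vm.nr_hugepages=3100' | sudo tee /etc/sysctl.d/80-hugepages.conf
      $ sudo sysctl -p /etc/sysctl.d/80-hugepages.conf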
      
    3. You can check, while the VM is running, how many pages are used:

      $ cat /proc/meminfo | grep HugePages
      

How to take back the NVIDIA card

If you want to use the NVIDIA card in Linux again, follow these steps to take the graphics card back:

  1. Comment out "pci-stub" in /etc/initramfs-tools/modules

  2. Move /lib/modprobe.d/pci-stub.conf to ~/doc/

    $ sudo mv /lib/modprobe.d/pci-stub.conf ~/doc/
    
  3. Update the initramfs image:

    $ sudo update-initramfs -u
    
  4. Reboot.
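
     After rebooting, lspci should once again show the host graphics driver (nvidia or nouveau) in use for the card:

    $ lspci -nnk -s 1b:00.0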

  5. If necessary, check the PRIME graphics device and monitor configuration:

    $ xsudo nvidia-settings
    $ rm ~/.config/monitors.xml
    

    Reboot, log in, and configure monitors.
