
@rdnetto
Last active April 30, 2018 23:37
Multiseat PCIe Passthrough Writeup
IOMMU group 17
11:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XT [Radeon RX Vega 64] [1002:687f] (rev c1)
IOMMU group 7
00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 15
0f:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1470] (rev c1)
IOMMU group 5
[RESET] 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU group 13
03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b9] (rev 02)
03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b5] (rev 02)
03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b0] (rev 02)
04:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
04:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
04:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
04:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b4] (rev 02)
[RESET] 06:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 02)
07:00.0 PCI bridge [0604]: ASMedia Technology Inc. Device [1b21:1184]
08:01.0 PCI bridge [0604]: ASMedia Technology Inc. Device [1b21:1184]
08:03.0 PCI bridge [0604]: ASMedia Technology Inc. Device [1b21:1184]
08:05.0 PCI bridge [0604]: ASMedia Technology Inc. Device [1b21:1184]
08:07.0 PCI bridge [0604]: ASMedia Technology Inc. Device [1b21:1184]
[RESET] 09:00.0 Network controller [0280]: Intel Corporation Device [8086:24fb] (rev 10)
[RESET] 0b:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU group 3
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 11
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric Device 18h Function 6 [1022:1466]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU group 1
[RESET] 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU group 18
11:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:aaf8]
IOMMU group 8
00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
[RESET] 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
12:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:145a]
12:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Device [1022:1456]
[RESET] 12:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] USB3 Host Controller [1022:145c]
IOMMU group 16
10:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1471]
IOMMU group 6
[RESET] 00:03.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU group 14
0e:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/580] [1002:67df] (rev ef)
0e:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
IOMMU group 4
00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 12
[RESET] 01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]
IOMMU group 2
[RESET] 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:1453]
IOMMU group 10
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 0
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU group 9
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
[RESET] 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
13:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:1455]
13:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
13:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Device [1022:1457]
# /etc/modprobe.d/vfio.conf
# Vega
# options vfio-pci ids=1002:687f,1002:aaf8
# Needed as we can't remap interrupts otherwise
options vfio_iommu_type1 allow_unsafe_interrupts=1
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
<name>win10</name>
<uuid>8010c34a-de70-4970-8b60-f0f83418ddc2</uuid>
<title>Windows 10 (GPU Passthrough)</title>
<memory unit='KiB'>8388608</memory>
<currentMemory unit='KiB'>8388608</currentMemory>
<vcpu placement='static'>8</vcpu>
<os>
<type arch='x86_64' machine='pc-i440fx-2.11'>hvm</type>
<loader readonly='yes' type='pflash'>/usr/share/edk2-ovmf/OVMF_CODE.fd</loader>
<nvram>/var/lib/libvirt/qemu/nvram/win10_VARS.fd</nvram>
<bootmenu enable='yes'/>
</os>
<features>
<acpi/>
<apic/>
<hyperv>
<relaxed state='on'/>
<vapic state='on'/>
<spinlocks state='on' retries='8191'/>
</hyperv>
<vmport state='off'/>
</features>
<cpu mode='host-model' check='partial'>
<model fallback='allow'/>
<topology sockets='1' cores='4' threads='2'/>
</cpu>
<clock offset='localtime'>
<timer name='rtc' tickpolicy='catchup'/>
<timer name='pit' tickpolicy='delay'/>
<timer name='hpet' present='no'/>
<timer name='hypervclock' present='yes'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<pm>
<suspend-to-mem enabled='no'/>
<suspend-to-disk enabled='no'/>
</pm>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/Win10.iso'/>
<target dev='sdb' bus='sata'/>
<readonly/>
<boot order='2'/>
<address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/path/to/virtio-win-0.1.149.iso'/>
<target dev='sdc' bus='sata'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='2'/>
</disk>
<disk type='block' device='disk'>
<driver name='qemu' type='raw'/>
<source dev='/dev/sde'/>
<target dev='vda' bus='virtio'/>
<boot order='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
</disk>
<controller type='usb' index='0' model='ich9-ehci1'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci1'>
<master startport='0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci2'>
<master startport='2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
</controller>
<controller type='usb' index='0' model='ich9-uhci3'>
<master startport='4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
</controller>
<controller type='sata' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</controller>
<controller type='scsi' index='0' model='virtio-scsi'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</controller>
<controller type='pci' index='0' model='pci-root'/>
<controller type='virtio-serial' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</controller>
<interface type='network'>
<mac address='52:54:00:f3:99:07'/>
<source network='br0'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<serial type='pty'>
<target type='isa-serial' port='0'>
<model name='isa-serial'/>
</target>
</serial>
<console type='pty'>
<target type='serial' port='0'/>
</console>
<input type='tablet' bus='usb'>
<address type='usb' bus='0' port='1'/>
</input>
<input type='mouse' bus='ps2'/>
<input type='keyboard' bus='ps2'/>
<sound model='ich6'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</sound>
<video>
<model type='cirrus' vram='16384' heads='1' primary='yes'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</video>
<hostdev mode='subsystem' type='usb' managed='yes'>
<source>
<vendor id='0x24f0'/>
<product id='0x0137'/>
</source>
<address type='usb' bus='0' port='4'/>
</hostdev>
<hostdev mode='subsystem' type='usb' managed='yes'>
<source>
<vendor id='0x045e'/>
<product id='0x077d'/>
</source>
<address type='usb' bus='0' port='5'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
<source>
<address domain='0x0000' bus='0x11' slot='0x00' function='0x0'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='usb' managed='yes'>
<source>
<vendor id='0x046d'/>
<product id='0xc408'/>
</source>
<address type='usb' bus='0' port='6'/>
</hostdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='2'/>
</redirdev>
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='3'/>
</redirdev>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</memballoon>
</devices>
<qemu:commandline>
<qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
<qemu:env name='QEMU_PA_SERVER' value='/var/run/pulse/native'/>
<qemu:env name='QEMU_PA_SAMPLES' value='8192'/>
<qemu:env name='QEMU_AUDIO_TIMER_PERIOD' value='99'/>
</qemu:commandline>
</domain>

A few people expressed interest in finding out how this went, so I thought I'd do a writeup of my experiences getting PCIe passthrough working with multiseat.

One of the more interesting things to note is that hot-plugging aside, it works fine with Vega, despite the card not being shown as resettable by the Arch wiki script.

I've mostly followed the Arch wiki, with additional sources linked throughout.
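
For reference, a listing like the IOMMU group dump in this gist can be produced by walking sysfs. This is a minimal sketch, not the wiki script itself: the function name and the `SYSFS_ROOT` override are hypothetical (the override only exists so it can be pointed at a fake tree), and it assumes a device counts as resettable when the kernel exposes a `reset` attribute for it.

```shell
#!/usr/bin/env bash

# Print every PCI device grouped by IOMMU group, prefixing devices that
# expose a sysfs "reset" attribute with [RESET].
# SYSFS_ROOT is a hypothetical override (defaults to /sys) for dry runs.
list_iommu_groups() {
    local root="${SYSFS_ROOT:-/sys}"
    local group dev addr tag
    for group in "$root"/kernel/iommu_groups/*; do
        [ -d "$group" ] || continue
        echo "IOMMU group ${group##*/}"
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            addr="${dev##*/}"
            tag=""
            # The attribute is only present when the kernel knows a reset method.
            [ -e "$dev/reset" ] && tag="[RESET] "
            if [ "$root" = "/sys" ] && command -v lspci >/dev/null 2>&1; then
                # -nn adds the [class] and [vendor:device] IDs, -s selects one device.
                echo "${tag}$(lspci -nns "$addr")"
            else
                echo "${tag}${addr}"
            fi
        done
    done
}
```

Calling `list_iommu_groups` with no arguments reads the real `/sys`. Groups come out in glob order rather than numeric order.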

Why would you want to do this?

Multiseat enables multiple people to use the same computer simultaneously. This can reduce setup costs (you only need one motherboard, CPU, etc.) and improve resource utilization (if one seat is idle, the other can make full use of the CPU and memory). The only parts needed per seat are a screen, peripherals, a graphics card and, optionally, a USB sound card.

PCIe passthrough allows you to connect one of the seats to virtualized Windows. My motivation for this was to play games without preventing my wife from using her seat.

How well does it work?

I wasn't able to get re-binding to work with Vega, which means that I need to edit /etc/modprobe.d/vfio.conf, regenerate my initrd and reboot each time I want to switch the seat between Linux and Windows. In practice, this isn't particularly onerous. (I could reduce the effort needed by adding a GRUB entry with different kernel args, but I haven't gotten around to it yet.) I wouldn't be surprised if this changes in a future kernel (4.18 or later?), given that 4.17 fixes some of the issues, and re-binding already works perfectly for the RX 570.
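
The config edit can be scripted. The sketch below only toggles the comment marker on the `options vfio-pci ids=...` line shown in the vfio.conf in this gist. The function name is hypothetical, it assumes GNU sed (for `-i -E`), and it deliberately stops short of regenerating the initrd, since that command differs between distros.

```shell
#!/usr/bin/env bash

# Comment or uncomment the "options vfio-pci ids=..." line.
# Usage: set_vfio_ids on|off [file]   (file defaults to /etc/modprobe.d/vfio.conf)
set_vfio_ids() {
    local state="$1" file="${2:-/etc/modprobe.d/vfio.conf}"
    case "$state" in
        on)
            # Strip a leading "#" so vfio-pci claims the listed IDs at boot.
            sed -i -E 's/^#[[:space:]]*(options[[:space:]]+vfio-pci[[:space:]]+ids=)/\1/' "$file"
            ;;
        off)
            # Comment the line out so the normal GPU driver binds instead.
            sed -i -E 's/^(options[[:space:]]+vfio-pci[[:space:]]+ids=)/# \1/' "$file"
            ;;
        *)
            echo "usage: set_vfio_ids on|off [file]" >&2
            return 2
            ;;
    esac
}
```

After `set_vfio_ids on` (as root), the initrd still has to be regenerated with whatever tool the distro provides, followed by a reboot. Both directions are idempotent, so running the same state twice leaves the file unchanged.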

Apart from that, it works pretty flawlessly. I haven't implemented any optimizations like CPU pinning, but the gaming performance is subjectively as good as native. (To be fair, I haven't been playing any CPU-intensive games, and the 1800X is overkill for almost every workload I've thrown at it.)

Hardware

  • Ryzen 7 1800X
  • ASRock X370 Taichi
  • RX Vega 64
  • RX 570

Software

  • distro: Sabayon
  • kernel: Linux 4.16
  • display manager: lightdm
  • guest OS: Windows 10

PCIe Passthrough (Gotchas)

  • QEMU doesn't support exposing a hyperthreaded AMD CPU yet, so I had to configure it with 8 cores, 1 thread each

  • I needed to add options vfio_iommu_type1 allow_unsafe_interrupts=1 to /etc/modprobe.d/vfio.conf; otherwise I got this kernel error:

      vfio_iommu_type1_attach_group: No interrupt remapping support.  Use the module param "allow_unsafe_interrupts" to enable VFIO IOMMU support on this platform
    

    This is explained here. TL;DR: it creates a vulnerability if you don't trust the guest OS. EDIT: This is only needed if you do not have CONFIG_IRQ_REMAP enabled in your kernel. See here for more info, or here.

  • If the TianoCore logo appears to freeze and you're dropped into an EFI shell after a full minute, you probably need to fix the boot order.

  • Migrating an existing Windows installation from the host to the guest was unsuccessful for me. No idea why, but I got a black screen the moment it loaded the drivers. A clean installation using the virtio drivers from the start worked flawlessly.

  • Getting the network working was surprisingly painful - see this site for details. Note that using the pre-existing Docker bridge interface didn't work.

    • If your VM isn't able to get a DHCP address, you might need to set the following sysctl:

      net.ipv4.ip_forward = 1

  • Sound

    • Add the following lines to the <domain> element of your VM

        <qemu:commandline>
            <qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
            <qemu:env name='QEMU_PA_SERVER' value='/var/run/pulse/native'/>
      
            <!-- Without this, you get horrible crackling -->
            <qemu:env name='QEMU_PA_SAMPLES' value='8192'/>
            <qemu:env name='QEMU_AUDIO_TIMER_PERIOD' value='99'/>
        </qemu:commandline>
      
    • If you're running Pulseaudio in system mode (common for cooperative multiseat), you'll probably have this in /etc/pulse/system.pa:

        load-module module-native-protocol-unix auth-group=pulse-access auth-group-enable=1

      Since the VM runs as the qemu user by default, you'll need to add it to the pulse-access group to get audio working.

    • Audio is very choppy out of the box - you need to set Windows to use the same sample rate as Pulseaudio, as documented here. Even after this, I found the audio choppy until another seat had started.

  • If only one CPU core is detected, you might need to edit your VM like so to ensure that it starts with all the cores actually online: 4 4
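
Several of the host-side gotchas above (interrupt remapping, IP forwarding, the qemu user's group membership) can be checked before starting the VM. This is a rough sketch under a few assumptions: the function names and the `PROC_ROOT` override are hypothetical, and the kernel config path is left to the caller because its location varies by distro.

```shell
#!/usr/bin/env bash

# Succeeds if the given (uncompressed) kernel config has CONFIG_IRQ_REMAP=y.
# Example argument: /boot/config-$(uname -r), if your distro installs one.
has_irq_remap() {
    grep -q '^CONFIG_IRQ_REMAP=y$' "$1"
}

# Succeeds if IPv4 forwarding is currently enabled.
# PROC_ROOT is a hypothetical override (defaults to /proc) for dry runs.
ip_forward_enabled() {
    [ "$(cat "${PROC_ROOT:-/proc}/sys/net/ipv4/ip_forward")" = "1" ]
}

# Succeeds if the space-separated group list in $1 contains group $2.
# Intended use: group_list_has "$(id -nG qemu)" pulse-access
group_list_has() {
    local g
    for g in $1; do   # unquoted on purpose: split the list on whitespace
        [ "$g" = "$2" ] && return 0
    done
    return 1
}
```

If a check fails, the usual fixes are `sysctl -w net.ipv4.ip_forward=1` for forwarding and `usermod -a -G pulse-access qemu` for the group (both as root). A failing `has_irq_remap` means the allow_unsafe_interrupts option is still needed.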

Dynamically rebinding the graphics card

The idea here was to be able to switch a seat between Windows and Linux without rebooting the host. This works flawlessly for the RX 570, but not for the RX Vega 64, which produces null pointer dereference errors in the kernel. (This is improved but not fixed in 4.17-rc2, where unbinding works but rebinding is still broken.)

The Gentoo wiki and this blog have good info on how to do this dynamically. Note that any args set for the vfio-pci module at boot are just the defaults, and can be removed using the unbind files in /sys.

The general approach is:

  • remove userspace consumers of the graphics card with loginctl terminate-seat seat1
  • unbind the card from amdgpu and rebind it to vfio-pci
  • start the VM

To move back to Linux, shut down the VM, unbind the card from vfio-pci, and re-scan to re-bind it to amdgpu. Logind will automatically recreate the seat, login screen and all.
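
The unbind/rebind steps can be sketched with the kernel's PCI sysfs files. This is an illustration rather than the exact commands I ran: the function name and `SYSFS_ROOT` override are hypothetical, and it uses `driver_override` plus `drivers_probe` in both directions instead of the re-scan described above. The addresses in the usage comments are the RX 570 and its audio function from the IOMMU listing.

```shell
#!/usr/bin/env bash

# Hand PCI device $1 (full address, e.g. 0000:0e:00.0) to driver $2.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) for dry runs.
rebind_pci_device() {
    local addr="$1" driver="$2"
    local pci="${SYSFS_ROOT:-/sys}/bus/pci"
    local dev="$pci/devices/$addr"

    # Restrict the device to the requested driver for the next probe.
    echo "$driver" > "$dev/driver_override" || return 1

    # Detach whichever driver currently owns the device, if any.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind" || return 1
    fi

    # Ask the kernel to probe the device again; it honours driver_override.
    echo "$addr" > "$pci/drivers_probe" || return 1
}

# Usage sketch (as root), moving seat1's RX 570 to the VM:
#   loginctl terminate-seat seat1
#   rebind_pci_device 0000:0e:00.0 vfio-pci
#   rebind_pci_device 0000:0e:00.1 vfio-pci
#   virsh start win10
# And back again once the VM has shut down:
#   rebind_pci_device 0000:0e:00.0 amdgpu
#   rebind_pci_device 0000:0e:00.1 snd_hda_intel
```

Every device in the card's IOMMU group has to be handed over, which is why the audio function is rebound alongside the GPU.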

Limitations

  • If you terminate seat0, LightDM will shut down all the seats. This means that this seat must always run Linux.
  • If there are any consumers of the graphics card when you attempt to rebind it, you'll get an error in the kernel log with a stack trace
  • The kernel logs to the framebuffer count as a consumer, so you either need to disable them with video=efifb:off, or ensure they go to seat0.
  • Despite running Linux 4.16, I suffered from the PCI reinit bug: I couldn't get the VM to start a second time without putting the host to sleep for a second. (Sometimes it woke up by itself, sometimes I had to push the power button to wake it.)
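
To find leftover userspace consumers before rebinding, one option is to scan process file descriptors for the card's device node. This is a sketch with hypothetical names (`list_device_consumers`, the `PROC_ROOT` override), and which `/dev/dri/cardN` node belongs to which card is something you would need to confirm on your own system. It cannot see in-kernel consumers such as the framebuffer console.

```shell
#!/usr/bin/env bash

# Print the PIDs of processes holding an open descriptor on device node $1.
# PROC_ROOT is a hypothetical override (defaults to /proc) for dry runs.
list_device_consumers() {
    local node="$1" proc="${PROC_ROOT:-/proc}"
    local fd pid
    for fd in "$proc"/[0-9]*/fd/*; do
        # readlink fails for descriptors we are not allowed to inspect; skip them.
        [ "$(readlink "$fd" 2>/dev/null)" = "$node" ] || continue
        pid="${fd#"$proc"/}"   # strip the /proc/ prefix
        echo "${pid%%/*}"      # keep only the PID component
    done | sort -un
}
```

For example, `list_device_consumers /dev/dri/card1` (run as root so other users' descriptors are readable) printing nothing suggests no process still has that node open.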