@jpotier
Forked from CRTified/README.md
Created December 6, 2019 17:12
VFIO Passthrough on NixOS

VFIO Setup on NixOS

Disclaimer: Nobody else has tested my setup so far, so this is a "works on my machine" scenario. I am not responsible for anything you break on your machine (although I would not expect much harm).

Hardware

My system has the following hardware:

  • Board: Asus ROG Strix Z270G
  • Processor: Intel i7-7700K (4 x 4.2 GHz)
  • GPU (Primary): Palit GeForce GTX1080 Dual OC (at 0000:01:00.*, IDs are 10de:1b80 and 10de:10f0)
  • GPU (Secondary): AMD Radeon Pro WX3100 (at 0000:02:00.*, IDs are not relevant)

Files

  • usage.nix contains the relevant snippet of my configuration where I use preexisting modules and my custom ones.
  • win_vm.xml contains the libvirtd xml dump of my Windows 10 guest
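    • It is customized to my setup; I only use this one for gaming
    • Line ~37 (hiding KVM from the guest) was required to prevent the classic NVIDIA driver problem under QEMU, where the driver refuses to start once it detects a hypervisor (commonly reported as Code 43)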
    • GPU Passthrough in Line 124-130
    • I use a custom rom file for my GPU (Line 128) - this might be dangerous to do
    • looking-glass in the section from Line 147
    • HID passthrough in Line 154-161
    • Audio Passthrough in Line 162/163
  • virtualisation.nix is a module that extends the virtualisation option subtree with the ability to create shared memory files. This is required for looking-glass and pulseaudio-scream (or pulseaudio-ivshmem)
    • I have not tested audio integration. Last time I checked, neither pa-scream nor pa-ivshmem was available, and it did not annoy me enough to play around with it
  • vfio.nix contains the biggest part of the config. It allows setting a few properties:
    • IOMMUType, either intel or amd, sets the appropriate kernel parameters for IOMMU
    • devices, which is a list of PCI IDs that shall be bound to vfio-pci
    • disableEFIfb disables the EFI framebuffer. I pass through my primary GPU, so I need to prevent the kernel from touching it
    • blacklistNvidia additionally blacklists nvidia and nouveau kernel modules
    • ignoreMSRs toggles kvm.ignore_msrs as a kernel parameter
    • applyACSpatch applies the well-known ACS override patch to weaken the IOMMU grouping. IMPORTANT: In most cases this means the kernel is compiled locally.
  • libvirt.nix adds two virtualisation.libvirtd options
    • deviceACL adds devices to the cgroup_device_acl option, which is often required to access the devices from qemu
    • clearEmulationCapabilities toggles the clear_emulation_capabilities setting for qemu

TODO

  • DANGER: qemu still runs as root
  • Add any shared memory audio solution
  • Bake the libvirt xml file into the system configuration
  • I have a strange bug where I need to destroy the VM once, otherwise it won't boot
    • After doing that once, I can start/stop as often as I want (although I cannot use the reboot function from within the guest system)
    • I suspect that there is some windows/nvidia/pcie voodoo required to resolve it, but it did not bother me enough to try to fix it
    • Basically, I start the VM once per boot with these steps:
virsh -c 'qemu:///system' start vm_gaming; 
sleep 3; 
virsh -c 'qemu:///system' destroy vm_gaming; 
sleep 3; 
virsh -c 'qemu:///system' start vm_gaming;
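
The same start/destroy/start sequence can be wrapped in a small function. This is only a sketch of the steps above: the function name start_vm_twice and the VM_DELAY variable are made up for illustration, and the 3 second default simply mirrors the sleep values used there.

```shell
# Start, destroy, and start a libvirt domain again.
# $1        domain name (defaults to vm_gaming, as in the steps above)
# VM_DELAY  hypothetical override for the pause between steps, in seconds
start_vm_twice() {
    domain=${1:-vm_gaming}
    delay=${VM_DELAY:-3}
    virsh -c 'qemu:///system' start "$domain" || return 1
    sleep "$delay"
    virsh -c 'qemu:///system' destroy "$domain" || return 1
    sleep "$delay"
    virsh -c 'qemu:///system' start "$domain"
}

# Example: start_vm_twice vm_gaming
```

Stopping at the first failing virsh call avoids a second start attempt when the domain could not be started or destroyed.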

libvirt.nix

{ lib, pkgs, config, ... }:
with lib;
let
  cfg = config.virtualisation.libvirtd;
  boolToZeroOne = x: if x then "1" else "0";
  aclString = with lib.strings;
    concatMapStringsSep ",\n" escapeNixString cfg.deviceACL;
in {
  options.virtualisation.libvirtd = {
    deviceACL = mkOption {
      type = types.listOf types.str;
      default = [ ];
    };
    clearEmulationCapabilities = mkOption {
      type = types.bool;
      default = true;
    };
  };
  config.virtualisation.libvirtd.qemuVerbatimConfig = ''
    clear_emulation_capabilities = ${
      boolToZeroOne cfg.clearEmulationCapabilities
    }
    cgroup_device_acl = [
      ${aclString}
    ]
  '';
  #config.services.udev.extraRules = ''
  #  SUBSYSTEM=="usb", ATTR{"DEVPATH"}=="/dev/input/by-id/usb-04d9_USB_Keyboard-event-kbd", GROUP="wheel"
  #'';
}

usage.nix

{
  hardware.pulseaudio.extraConfig = ''
    # Local socket for QEMU
    load-module module-native-protocol-unix auth-anonymous=1 socket=/tmp/pulse-socket
  '';
  virtualisation = {
    sharedMemoryFiles = {
      looking-glass = {
        size = 32; # Needs to be a power of 2 for looking-glass
        user = "root";
        group = "root";
        mode = "666";
      };
    };
    libvirtd = {
      enable = true;
      qemuOvmf = true;
      clearEmulationCapabilities = false;
      deviceACL = [
        "/dev/input/by-path/pci-0000:00:14.0-usb-0:6.3:1.0-event-mouse" # Trackball
        "/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.0-event-kbd" # Keyboard
        "/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.1-event-kbd" # Keyboard
        "/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.1-event" # Keyboard
        "/dev/vfio/vfio"
        "/dev/vfio/14"
        "/dev/vfio/15"
        "/dev/kvm"
        "/dev/shm/looking-glass"
      ];
    };
    vfio = {
      enable = true;
      IOMMUType = "intel";
      devices = [ "10de:1b80" "10de:10f0" ];
      blacklistNvidia = true;
      disableEFIfb = true;
      ignoreMSRs = true;
      applyACSpatch = true;
    };
  };
}
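
The comment on size says the value needs to be a power of 2 for looking-glass. A quick way to check a candidate value before rebuilding is sketched below; the function name is illustrative and the check is plain integer arithmetic, nothing specific to looking-glass.

```shell
# Succeed if $1 is a positive power of two (1, 2, 4, 8, ...).
# A power of two has exactly one bit set, so n & (n - 1) is zero.
is_power_of_two() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Example: is_power_of_two 32 && echo "32 is fine"
```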

vfio.nix

{ lib, pkgs, config, ... }:
with lib;
let cfg = config.virtualisation.vfio;
in {
  options.virtualisation.vfio = {
    enable = mkEnableOption "VFIO Configuration";
    IOMMUType = mkOption {
      type = types.enum [ "intel" "amd" ];
      example = "intel";
      description = "Type of the IOMMU used";
    };
    devices = mkOption {
      type = types.listOf (types.strMatching "[0-9a-f]{4}:[0-9a-f]{4}");
      default = [ ];
      example = [ "10de:1b80" "10de:10f0" ];
      description = "PCI IDs of devices to bind to vfio-pci";
    };
    disableEFIfb = mkOption {
      type = types.bool;
      default = false;
      example = true;
      description = "Disables the usage of the EFI framebuffer on boot.";
    };
    blacklistNvidia = mkOption {
      type = types.bool;
      default = false;
      description = "Add Nvidia GPU modules to blacklist";
    };
    ignoreMSRs = mkOption {
      type = types.bool;
      default = false;
      example = true;
      description =
        "If set, KVM ignores guest accesses to unhandled model-specific registers (kvm.ignore_msrs=1)";
    };
    applyACSpatch = mkOption {
      type = types.bool;
      default = false;
      description = ''
        If set, the following things will happen:
        - The ACS override patch is applied
        - The i915-vga-arbiter patch is applied
        - pcie_acs_override=downstream,multifunction is added to the command line
      '';
    };
  };
  config = lib.mkIf cfg.enable {
    boot.kernelParams = (if cfg.IOMMUType == "intel" then [
      "intel_iommu=on"
      "intel_iommu=igfx_off"
    ] else
      [ "amd_iommu=on" ]) ++ (optional (builtins.length cfg.devices > 0)
        ("vfio-pci.ids=" + builtins.concatStringsSep "," cfg.devices))
      ++ (optional cfg.applyACSpatch
        "pcie_acs_override=downstream,multifunction")
      ++ (optional cfg.disableEFIfb "video=efifb:off")
      ++ (optional cfg.ignoreMSRs "kvm.ignore_msrs=1");
    boot.kernelModules = [ "vfio_virqfd" "vfio_pci" "vfio_iommu_type1" "vfio" ];
    boot.initrd.kernelModules =
      [ "vfio_virqfd" "vfio_pci" "vfio_iommu_type1" "vfio" ];
    boot.blacklistedKernelModules =
      optionals cfg.blacklistNvidia [ "nvidia" "nouveau" ];
    boot.kernelPatches = optionals cfg.applyACSpatch [
      {
        name = "add-acs-overrides";
        patch = pkgs.fetchurl {
          name = "add-acs-overrides.patch";
          url =
            "https://aur.archlinux.org/cgit/aur.git/plain/add-acs-overrides.patch?h=linux-vfio&id=6f5c5ff2e42abf6606564383d5cb3c56b13d895e";
          sha256 = "1qd68s9r0ppynksbffqn2qbp1whqpbfp93dpccp9griwhx5srx6v";
        };
      }
      {
        name = "i915-vga-arbiter";
        patch = pkgs.fetchurl {
          name = "i915-vga-arbiter.patch";
          url =
            "https://aur.archlinux.org/cgit/aur.git/plain/i915-vga-arbiter.patch?h=linux-vfio&id=6f5c5ff2e42abf6606564383d5cb3c56b13d895e";
          sha256 = "1mg06dmlsdzf9w6jy73izjpa8ma7yh80k48rjj6iq30qs4jw1d5g";
        };
      }
    ];
  };
}
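
After a rebuild and reboot, it can be useful to confirm that the devices listed in devices really ended up on vfio-pci. The sketch below reads the driver symlink from sysfs; it assumes the standard /sys/bus/pci/devices layout, and the function name and the SYSFS_PCI override are illustrative.

```shell
# Print the kernel driver bound to a PCI device, or "none" if it is unbound.
# SYSFS_PCI is a hypothetical override (useful for testing).
pci_driver() (
    dev=$1
    root=${SYSFS_PCI:-/sys/bus/pci/devices}
    if [ -L "$root/$dev/driver" ]; then
        # The symlink points at .../drivers/<name>; keep only the last part.
        target=$(readlink "$root/$dev/driver")
        echo "${target##*/}"
    else
        echo none
    fi
)

# Example: pci_driver 0000:01:00.0
```

For a device that was claimed as intended this prints vfio-pci; any other driver name suggests another module grabbed the device first.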

virtualisation.nix

{ lib, pkgs, config, ... }:
with lib;
let
  cfg = config.virtualisation;
  functionBlock = name: f: ''
    fallocate -l ${toString f.size}M /dev/shm/${name};
    chown ${f.user}:${f.group} /dev/shm/${name};
    chmod ${f.mode} /dev/shm/${name};
  '';
in {
  options.virtualisation.sharedMemoryFiles = mkOption {
    type = types.attrsOf (types.submodule ({ name, ... }: {
      options = {
        name = mkOption {
          visible = false;
          default = name;
          type = types.str;
        };
        size = mkOption {
          type = types.int;
          default = 0;
          description = "Size in MiB.";
        };
        user = mkOption {
          type = types.str;
          default = "root";
          description = "Owner of the memory file";
        };
        group = mkOption {
          type = types.str;
          default = "root";
          description = "Group of the memory file";
        };
        mode = mkOption {
          type = types.str;
          default = "0600";
          description = "Permission mode of the memory file";
        };
      };
    }));
    default = { };
  };
  config.system.activationScripts.sharedMemoryFiles = {
    text = concatStringsSep "\n"
      (mapAttrsToList (functionBlock) cfg.sharedMemoryFiles);
    deps = [ ];
  };
}
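
To confirm that the activation script produced the file as configured, its size, mode, and ownership can be inspected. This is a small sketch rather than part of the module: it assumes a stat that supports the -c option (GNU coreutils or busybox), and the function name is illustrative.

```shell
# Print "<size in MiB> <octal mode> <owner>:<group>" for a file.
# Assumes GNU or busybox stat; -c is not available on every platform.
shm_file_info() (
    file=$1
    bytes=$(stat -c '%s' "$file") || exit 1
    perms=$(stat -c '%a %U:%G' "$file") || exit 1
    echo "$(( bytes / 1048576 )) $perms"
)

# Example: shm_file_info /dev/shm/looking-glass
```

With the values from usage.nix, the expected result would be a 32 MiB file with mode 666 owned by root:root.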

win_vm.xml

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>vm_gaming</name>
  <uuid>fd888352-4004-4047-a6e9-1aa45d6cd461</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <vcpu placement='static'>8</vcpu>
  <iothreads>1</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='4'/>
    <vcpupin vcpu='1' cpuset='5'/>
    <vcpupin vcpu='2' cpuset='6'/>
    <vcpupin vcpu='3' cpuset='7'/>
    <emulatorpin cpuset='0-3'/>
    <iothreadpin iothread='1' cpuset='0-3'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-3.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/run/libvirt/nix-ovmf/OVMF_CODE.fd</loader>
    <nvram template='/run/libvirt/nix-ovmf/OVMF_VARS.fd'>/var/lib/libvirt/VARS_vm_gaming.fd</nvram>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vpindex state='on'/>
      <runtime state='on'/>
      <synic state='on'/>
      <stimer state='on'/>
      <reset state='on'/>
      <vendor_id state='on' value='whatever'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
  </features>
  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
    <topology sockets='1' cores='2' threads='4'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup' track='guest'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='hypervclock' present='yes'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/run/libvirt/nix-emulators/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback' io='threads' discard='unmap' detect_zeroes='on'/>
      <source dev='/dev/nvme_vg/vm_windows'/>
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x8'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x9'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='ioh3420'/>
      <target chassis='3' port='0xa'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0xb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='nec-xhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver queues='8' iothread='1'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:b4:07:69'/>
      <source network='default'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='keyboard' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
    </input>
    <input type='mouse' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <input type='tablet' bus='virtio'>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </input>
    <sound model='ac97'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </sound>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <rom file='/etc/nixos/misc/10de:1b80.rom'/>
      <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='yes'>
      <source>
        <vendor id='0x28de'/>
        <product id='0x1142'/>
      </source>
      <address type='usb' bus='0' port='3'/>
    </hostdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='2'/>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
      <address type='usb' bus='0' port='4'/>
    </redirdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </memballoon>
    <shmem name='looking-glass'>
      <model type='ivshmem-plain'/>
      <size unit='M'>32</size>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </shmem>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=mouse0,evdev=/dev/input/by-path/pci-0000:00:14.0-usb-0:6.3:1.0-event-mouse'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=keyboard0,evdev=/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.0-event-kbd,grab_all=on,repeat=on'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=keyboard1,evdev=/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.1-event-kbd'/>
    <qemu:arg value='-object'/>
    <qemu:arg value='input-linux,id=keyboard2,evdev=/dev/input/by-path/pci-0000:00:14.0-usb-0:6.1:1.1-event'/>
    <qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
    <qemu:env name='QEMU_PA_SERVER' value='unix:/tmp/pulse-socket'/>
  </qemu:commandline>
</domain>
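
Every evdev path handed to QEMU in the qemu:commandline section also has to be reachable by QEMU, which is what the deviceACL option is for. The sketch below extracts those paths from a domain XML and reports the ones missing from a list of allowed paths. The function names and the idea of keeping the ACL in a plain text file (one path per line) are assumptions made for illustration.

```shell
# Print every evdev path referenced by an input-linux object in a domain XML.
evdev_paths() {
    # Capture the text after "evdev=" up to the next comma or closing quote.
    sed -n "s/.*evdev=\([^,']*\).*/\1/p" "$1"
}

# Print the evdev paths from XML file $1 that do not appear as a full line
# in ACL file $2 (hypothetical plain text file, one device path per line).
missing_from_acl() {
    evdev_paths "$1" | while IFS= read -r path; do
        grep -qxF "$path" "$2" || echo "$path"
    done
}

# Example:
#   virsh -c 'qemu:///system' dumpxml vm_gaming > vm.xml
#   missing_from_acl vm.xml acl.txt
```

An empty result means every evdev device in the XML is covered by the list.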