@matt22207
Last active April 28, 2024 14:46
Proxmox 5700G APU GPU Passthrough Notes

Some random notes on trying (and failing) to get Proxmox as host with 5700G APU GPU PCI Passthrough to Ubuntu guest VM working:

References:

Made some progress, but i am thinking this may not be possible (for now without more support from AMD?) due to the shared memory of the GPU and system. Various AMD features (like PSP and TMZ) are meant for security and ensuring the programs/cpu can't read your GPU memory. I am totally unknowledgeable here, so I am speculating based off various threads. I suspect the stability issues various people mentioned may be due to issues with reserving memory for the GPU. Not sure, but it seems the memory sharing happens inside the VM at the driver level (rather than on the host at a higher level of hardware), as when i give the VM 10GB, i can see the amdgpu driver grabs 2GB and only 8GB are left to the OS on the host. So maybe this is all self-contained cleanly in the VM, and not really an issue. Not sure.

Anyway, I'm trying Proxmox and I was able to get an external display working (only on first boot??) inside an Ubuntu guest VM, and it seems like various 3d apps run at high fps (vkcube, glxgears) although i haven't installed any games yet to test, and not sure if this is really running on the GPU or CPU since the graphics driver is shown as llvmpipe (Mesa's software renderer, which points to the CPU). Additionally, the amdgpu driver is still not loading and is giving some errors even though the monitor and those 3d apps work. Also, no audio yet either :( .

In case it helps others solve this, here's some things i've learned:

  • On the Host, modify /etc/default/grub and then run update-grub :
GRUB_DEFAULT=0
GRUB_TIMEOUT=0
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt video=efifb:off pcie_acs_override=downstream,multifunction amdgpu.exp_hw_support=1 modprobe.blacklist=amdgpu,snd_hda_intel,ccp textonly loglevel=0 silent text nomodeset"
GRUB_TERMINAL=serial
#GRUB_TERMINAL_OUTPUT -- comment this out along with GRUB_TERMINAL_INPUT if included, since GRUB_TERMINAL overrides
GRUB_FORCE_HIDDEN_MENU="true"
  • Not sure if all of these kernel params are needed, and some may be excessive, but most of this is random guesses trying to get the host not to touch any sort of display during boot, etc. Also, the pcie_acs_override is to split the PCI IOMMU groups since by default the GPU group included USB devices that i did not want to pass through. I am not sure if this is a problem, and possibly the full original group (including USB) needs to be passed through.
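To confirm the host actually booted with the parameters set above, a small check along these lines may help. It is only a sketch: `missing_params` is a made-up helper name, and the sample command line is illustrative rather than copied from a real host.

```shell
#!/bin/sh
# Print each expected parameter that is absent from a kernel command line.
# Usage: missing_params "<cmdline>" param1 param2 ...
missing_params() {
    cmdline=" $1 "                         # pad so every word has a space on both sides
    shift
    for param in "$@"; do
        case "$cmdline" in
            *" $param "*) ;;               # present as a whole word
            *) printf '%s\n' "$param" ;;   # absent
        esac
    done
}

# Hypothetical sample; on a real host pass "$(cat /proc/cmdline)" instead.
sample="BOOT_IMAGE=/boot/vmlinuz root=/dev/example ro amd_iommu=on iommu=pt video=efifb:off"
missing_params "$sample" amd_iommu=on iommu=pt video=efifb:off nomodeset
# prints: nomodeset
```

An empty result means every listed parameter made it onto the command line; anything printed was probably lost because update-grub was not run or a different boot entry was used.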

  • On the host, blacklist these via /etc/modprobe.d/blacklist.conf (or similar) to avoid having your host initialize the AMD GPU and related devices (snd_hda_intel for audio and ccp for PSP security device i think) :

echo "blacklist amdgpu" >> /etc/modprobe.d/pve-blacklist.conf
echo "blacklist snd_hda_intel" >> /etc/modprobe.d/pve-blacklist.conf
echo "blacklist ccp" >> /etc/modprobe.d/pve-blacklist.conf
  • reboot

  • ensure these weren't activated by checking the boot logs such as sudo dmesg | grep amdgpu

    • note, ccp still gets loaded by kvm_amd no matter what i tried. not sure if this is a blocker
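One way to see which of the blacklisted modules are still loaded, and what pulled them in, is to filter `lsmod` output. This is a sketch only: `loaded_modules` is a hypothetical helper, and the module sizes in the sample are invented for illustration.

```shell
#!/bin/sh
# Read lsmod-style text on stdin and report which of the named modules are
# loaded, along with the "Used by" column that shows what depends on them.
loaded_modules() {
    awk -v want="$*" '
        BEGIN {
            n = split(want, names, " ")
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        ($1 in wanted) {
            printf "%s loaded (used by: %s)\n", $1, ($4 == "" ? "-" : $4)
        }
    '
}

# Illustrative sample; on a real host run: lsmod | loaded_modules amdgpu snd_hda_intel ccp
sample='Module                  Size  Used by
kvm_amd               155648  0
ccp                   106496  1 kvm_amd
kvm                  1028096  1 kvm_amd'
printf '%s\n' "$sample" | loaded_modules amdgpu snd_hda_intel ccp
# prints: ccp loaded (used by: kvm_amd)
```

With a sample like this the output matches the note above: ccp shows up as a dependency of kvm_amd even though it is blacklisted.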
  • Confirm your IOMMU groups are split if needed so you can pass in the GPU+audio on its own by looking for different groups via this script: https://gist.github.com/flungo/428c374c040de1d0a30fd4a593d39040
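If you'd rather not pull in the linked script, the same information can be read from sysfs. The following is a minimal sketch, not the linked gist's code: `list_iommu_groups` is a hypothetical name, and the root directory is a parameter so it can be tried on a mock tree (the group numbers in the demo are made up).

```shell
#!/bin/sh
# Print every PCI device under an iommu_groups tree as "group <n>: <address>".
# On a real host the tree lives at /sys/kernel/iommu_groups (the default here).
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue          # glob matched nothing
        group=${dev#"$root"/}              # drop the root prefix
        group=${group%%/*}                 # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done | sort -t' ' -k2,2n -k3,3         # numeric by group, then by address
}

# Demo on a mock tree with invented group numbers.
mock=$(mktemp -d)
mkdir -p "$mock/16/devices/0000:05:00.0" \
         "$mock/16/devices/0000:05:00.1" \
         "$mock/17/devices/0000:05:00.3"
list_iommu_groups "$mock"
rm -rf "$mock"
```

On the host, calling `list_iommu_groups` with no argument reads the live tree. The GPU and its audio function should end up in a group that does not also contain the USB controllers.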

  • Install/Setup your VM using q35 and OVMF (UEFI) without setting up PCI passthrough yet .. for me in Proxmox, this is just for initial setup in the noVNC console as hardware display may not work yet.. I had to switch to SeaBIOS after setup was complete to get the physical display output working. I would hope/assume we could get UEFI working, but it didn't seem to work for me at all with this vbios.

  • After you have a working VM, install a backup way to get in such as regular VNC or openssh-server, then shut it down, and switch the VM to SeaBIOS .. i left the UEFI disk in place so i could switch back and forth as needed.

  • To add the PCI Passthrough you'll need to get a vbios rom:

  • I found a copy of the 5700G VBIOS from a similar machine here: https://rog.asus.com/us/desktops/mid-tower/rog-strix-g10dk-series/helpdesk_bios ..

  • Extract using VBiosFinder : https://github.com/coderobe/VBiosFinder

    • Install all the dependencies. Especially UEFIExtract [ binaries ] and rom-parser
  • This will output a few rom's, and vbios_1002_1638_1.rom seemed to work the best for me. (Sorry, not sure of the legality of posting the actual rom file)

  • Copy this rom onto your Host in the appropriate dir. For me on proxmox, this was in /usr/share/kvm
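Before pointing the VM at the copied file, a quick sanity check can catch a wrong or truncated extraction. This sketch assumes the file is a standard PCI option ROM image, which starts with the signature bytes 55 AA; `is_option_rom` is a hypothetical helper name, and passing this check says nothing about whether the ROM is the right one for your GPU.

```shell
#!/bin/sh
# Succeed if the file begins with the PCI option ROM signature 0x55 0xAA.
is_option_rom() {
    sig=$(od -An -tx1 -N2 "$1" | tr -d ' \n')   # first two bytes as hex
    [ "$sig" = "55aa" ]
}

# Example use on the host (path taken from the step above):
# is_option_rom /usr/share/kvm/vbios_1002_1638_1.rom && echo "signature ok"
```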

  • Get your PCI id's via lspci on the host, which for me are:

05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [1002:1638] (rev c8)
05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
05:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
05:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
05:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir USB 3.1 [1022:1639]
05:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir Audio Processor [1022:15e2] (rev 01)
05:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller [1022:15e3]
  • I think I only want #0 and #1 and #2 to passthrough (although again, i wonder if the others are required). Not sure about the #2 PSP (especially since i'm disabling PSP later) and 5/6 audio if they are duplicate of #1 audio. I did confirm that including the USB controllers will crash the host system, so if this is even possible you'd need the pcie_acs_override in grub to exclude those into different IOMMU groups. In your host's terminal, add your PCI id's to /etc/pve/qemu-server/100.conf (assuming this is your first VM #100) since you can't add the romfile setting via the proxmox web screens:
hostpci0: 0000:05:00.0;0000:05:00.1;0000:05:00.2,pcie=1,x-vga=1,romfile=vbios_1002_1638_1.rom
  • Note that x-vga=1 will force this to be the primary display, and proxmox's built in "noVNC" viewer will stop working.

    • Ensure you have another way to get in to your VM before this, such as passing through a usb keyboard/mouse, or openssh-server, regular VNC, etc.
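Since the hostpci line has to be typed by hand, a tiny generator can help avoid separator mistakes (semicolons between devices, commas between options). This is a sketch under assumptions: `hostpci_line` is a made-up helper, it assumes PCI domain 0000, and it hard-codes the same options used in the line above.

```shell
#!/bin/sh
# Build a hostpci0 entry from a ROM file name and one or more bus:dev.fn addresses.
hostpci_line() {
    romfile=$1
    shift
    devices=
    for addr in "$@"; do
        devices="${devices:+$devices;}0000:$addr"   # join devices with ';'
    done
    printf 'hostpci0: %s,pcie=1,x-vga=1,romfile=%s\n' "$devices" "$romfile"
}

hostpci_line vbios_1002_1638_1.rom 05:00.0 05:00.1 05:00.2
# prints: hostpci0: 0000:05:00.0;0000:05:00.1;0000:05:00.2,pcie=1,x-vga=1,romfile=vbios_1002_1638_1.rom
```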
  • Start your guest VM and you may get video output on your external display (i'm using the HDMI output), and various errors in the guest's dmesg if you grep for amdgpu ..

  • To fix the error complaining about PSP unable to load rom.. Inside the Guest (NOT THE HOST!!!) , add these kernel boot params (for me via /etc/default/grub and then update-grub) : amdgpu.fw_load_type=0

    • This disables PSP which was failing to read the vbios rom, not sure if this has other side effects or removes needed functionality, but it gets some of the errors to go away.
GRUB_CMDLINE_LINUX_DEFAULT="amdgpu.fw_load_type=0"
  • Also, I played with disabling TMZ in the Guest's /etc/default/grub kernel params via amdgpu.tmz=0 .. not sure this did anything, so i left it out

  • reboot the guest and you should get rid of many errors. I am left with:

[    0.000000] Linux version 5.13.0-27-generic (buildd@lcy02-amd64-014) (gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.37) #29-Ubuntu SMP Wed Jan 12 17:36:47 UTC 2022 (Ubuntu 5.13.0-27.29-generic 5.13.19)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-27-generic root=/dev/mapper/vgubuntu-root ro amdgpu.fw_load_type=0 amdgpu.dc=1 radeon.cik_support=0 amdgpu.cik_support=1
[    0.017576] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-27-generic root=/dev/mapper/vgubuntu-root ro amdgpu.fw_load_type=0 amdgpu.dc=1 radeon.cik_support=0 amdgpu.cik_support=1
[   11.189820] [drm] amdgpu kernel modesetting enabled.
[   11.189897] amdgpu: CRAT table not found
[   11.189899] amdgpu: Virtual CRAT table created for CPU
[   11.189906] amdgpu: Topology: Add CPU node
[   11.189947] fb0: switching to amdgpudrmfb from VESA VGA
[   11.190056] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[   11.190329] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature enabled
[   11.197046] amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
[   11.197048] amdgpu: ATOM BIOS: 113-CEZANNE-017
[   11.229736] amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
[   11.229738] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
[   11.229739] amdgpu 0000:01:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[   11.229791] [drm] amdgpu: 2048M of VRAM memory ready
[   11.229793] [drm] amdgpu: 3072M of GTT memory ready.
[   12.708341] amdgpu 0000:01:00.0: amdgpu: SMU is initialized successfully!
[   12.872471] amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[   12.872675] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
[   12.872825] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v9_0> failed -110
[   12.872939] amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
[   12.872942] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
[   12.872993] amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
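When comparing boots, it can help to reduce the guest's dmesg to just the amdgpu failures. The filter below is a rough sketch (`amdgpu_errors` is a hypothetical name) that keeps lines marked *ERROR* or "Fatal error" and strips the timestamps so two runs can be diffed. For reference, the -110 in the messages above is the Linux errno ETIMEDOUT, so the ring test is timing out rather than being rejected outright.

```shell
#!/bin/sh
# Keep amdgpu lines flagged *ERROR* or "Fatal error" from dmesg-style text on
# stdin, with the leading "[   12.345678] " timestamp removed.
amdgpu_errors() {
    grep 'amdgpu' | grep -E '\*ERROR\*|Fatal error' | sed -E 's/^\[ *[0-9.]+\] //'
}

# Demo using lines copied from the log above; in the guest run: sudo dmesg | amdgpu_errors
amdgpu_errors <<'EOF'
[   12.708341] amdgpu 0000:01:00.0: amdgpu: SMU is initialized successfully!
[   12.872471] amdgpu 0000:01:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
[   12.872675] [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
[   12.872942] amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
EOF
```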

open issues:

  • amdgpu driver on guest still doesn't load as shown above
  • audio doesn't work, likely same issue
  • general stability. haven't tracked this thoroughly, but maybe only the first boot of the VM works.. after that, subsequent boots of the VM can freeze, and even once my proxmox host froze. again, i think the shared memory allocation may be a problem, but no clue.
  • On the host, i think the USB controllers in the original IOMMU group really are tied to the APU, and there is an error trying to reset the extra audio devices after the VM shuts down. I had to remove these devices from the pci passthrough to get rid of this error.
Jan 23 17:54:55 minispve kernel: usb 1-2: reset low-speed USB device number 2 using xhci_hcd
Jan 23 17:59:30 minispve QEMU[7222]: kvm: vfio: Cannot reset device 0000:05:00.6, depends on group 17 which is not owned.
Jan 23 17:59:30 minispve QEMU[7222]: kvm: vfio: Cannot reset device 0000:05:00.5, depends on group 17 which is not owned.
  • I want to try a Windows10 VM as well to see if there's any difference
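For the vfio reset errors listed under open issues, a short filter over the host journal can summarize which passed-through functions depend on a group the VM does not own. This is a sketch only; `vfio_reset_failures` is a hypothetical name and the pattern is written against the exact message wording in the log lines above.

```shell
#!/bin/sh
# From host log text on stdin, print "<device> needs group <n>" for every
# "vfio: Cannot reset device" message.
vfio_reset_failures() {
    sed -n -E 's/.*vfio: Cannot reset device ([0-9a-f:.]+), depends on group ([0-9]+) which is not owned\..*/\1 needs group \2/p'
}

# Demo using the log lines above; on the host, feed it your kernel/QEMU journal output.
vfio_reset_failures <<'EOF'
Jan 23 17:54:55 minispve kernel: usb 1-2: reset low-speed USB device number 2 using xhci_hcd
Jan 23 17:59:30 minispve QEMU[7222]: kvm: vfio: Cannot reset device 0000:05:00.6, depends on group 17 which is not owned.
Jan 23 17:59:30 minispve QEMU[7222]: kvm: vfio: Cannot reset device 0000:05:00.5, depends on group 17 which is not owned.
EOF
```

Any device listed here either needs its whole dependent group passed through or has to be dropped from the hostpci line, which is what was done above.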

elandwg commented Mar 1, 2024

Thanks for sharing your files. I didn't see any difference using your files vs mine. However, it led me to try a bunch of other things and now IT WORKS!!! (With caveats, of course).

When it started working, the GPU output a video signal immediately when the VM started instead of showing the stale Proxmox boot log. This is what I saw with a successful Nvidia GPU passthrough as well. So don't even worry about trying to solve the Error 43 until you have a video signal output over HDMI / DP. The problem lies upstream of the Windows VM.

The vbios I had previously extracted using the method here worked fine. It was the same size as yours, but slightly different in binary. Not sure if that's important. (Edit: The vbios extracted by UBU also works.)

What seems to have been very important was to extract the correct AMDGopDriver.rom for my specific device. I downloaded the motherboard bios from the mini pc maker's website, extracted the GOP driver .efi from the bios rom file with UBU, then converted the extracted GOP .efi file to a rom with EfiRom.exe from here.

I also made a number of other changes to the grub startup line, blacklist, and vfio.conf, so I don't know yet whether any of those changes are necessary. I'll post back here if I find any of those settings were important. (Edit: None of these changes I made had any effect. All have been removed and it still works.).

Generally, I used various bits of information from the following sources: https://forum.proxmox.com/threads/pci-gpu-passthrough-on-proxmox-ve-8-installation-and-configuration.130218/ https://www.youtube.com/watch?v=iWwdf66JpxE https://www.youtube.com/watch?v=BElSsyLSX5c https://github.com/isc30/ryzen-7000-series-proxmox

The last link is the most complete guide, however it does not tell you how to extract AMDGopDriver.efi and convert it into a rom.

Caveats:

  1. GPU can only be initialized (a Windows VM launched) once per boot of the host machine. All subsequent attempts fail to initialize the GPU.
  2. On some boots of the Windows VM, the GPU does not initialize correctly until the device is disabled and then re-enabled in device manager. I haven't tried RadeonResetBugFixService yet.

Edit: By the way, I am using OVMF (UEFI).

My dude, you're the hero. It works for me too. Idk if AMDGopDriver is different for every PC, but I uploaded mine here. Also, here is the short guide that I made:

  1. You'll need to have the BIOS file for your PC to extract AMDGopDriver.rom, or, if you don't have the BIOS, try mine from the Google Drive link above.
  2. If you want to extract yours, then get UBU here and extract the AMDGopDriver.efi.
  3. Get EfiRom.exe from here. Run it: EfiRom.exe -f VendorId -i DeviceId -e AMDGopDriver.efi -o AMDGopDriver.rom. To get Vendor and Device ID run lspci -nn | grep -e 'AMD/ATI' on your Proxmox and you'll get (...) Renoir Radeon High Definition Audio Controller [1002:1637] (...), where 1002 is your Vendor ID and 1637 is Device ID.
  4. In Powershell scp [PATH TO FOLDER]\AMDGopDriver.rom root@[YOUR IP]:/usr/share/kvm/
  5. Follow this guide including the optional step of getting UEFI BIOS to work (obv use AMDGopDriver that we extracted). You must use OVMF (UEFI), otherwise no video output (at least on my PC).
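To avoid mixing up the two numbers in step 3, the IDs can be pulled out of `lspci -nn` lines mechanically. This is a small sketch; `pci_ids` is a hypothetical helper, and the sample lines are the VGA and audio entries from the lspci listing in the gist above.

```shell
#!/bin/sh
# For each `lspci -nn` line on stdin, print the [vendor:device] pair as
# "vendor=<id> device=<id>". Class codes like [0300] contain no colon and are skipped.
pci_ids() {
    sed -n -E 's/.*\[([0-9a-f]{4}):([0-9a-f]{4})\].*/vendor=\1 device=\2/p'
}

# On the Proxmox host: lspci -nn | grep -e 'AMD/ATI' | pci_ids
pci_ids <<'EOF'
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [1002:1638] (rev c8)
05:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1637]
EOF
# prints:
# vendor=1002 device=1638
# vendor=1002 device=1637
```

The first number in the bracket is always the vendor and the second the device, so these map directly onto the -f and -i arguments of EfiRom.exe.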

Edit: I tried RadeonResetBugFixService and the system survived 3 VM reboots. No code 43 or instability so far.


augiem commented Mar 1, 2024

My dude, you're the hero. It works for me too. Idk if AMDGopDriver is different for every PC, but I uploaded mine here. Also, here is the short guide that I made:

  1. You'll need to have the BIOS file for your PC to extract AMDGopDriver.rom, or try mine from the Google Drive link above (or if you don't have the BIOS).
  2. If you want to extract yours, then get UBU here and extract the AMDGopDriver.efi.
  3. Get EfiRom.exe from here. Run it: EfiRom.exe -f VendorId -i DeviceId -e AMDGopDriver.efi -o AMDGopDriver.rom. To get Vendor and Device ID run lspci -nn | grep -e 'AMD/ATI' on your Proxmox and you'll get (...) Renoir Radeon High Definition Audio Controller [1002:1637] (...), where 1002 is your Device ID and 1637 is Vendor ID.
  4. In Powershell scp [PATH TO FOLDER]\AMDGopDriver.rom root@[YOUR IP]:/usr/share/kvm/
  5. Follow this guide including the optional step of getting UEFI BIOS to work (obv use AMDGopDriver that we extracted). You must use OVMF (UEFI), otherwise no video output (at least on my PC).

Edit: I tried RadeonResetBugFixService and the system survived 3 VM reboots. No code 43 or instability so far.

That's awesome that it worked!

I just compared your current GOP driver to mine and they're almost identical. I wonder if the difference is just because of the product id difference? It looks like you used the product id of the audio controller and I used the product id of the VGA controller. I just tested your GOP driver and it works for me too. I've also tried using the product id of the vbios_1638.dat file UBU creates and that also works.

It seems to me that if the only difference between our two GOP drivers is the product id we used, then maybe the GOP driver is the same for anything that uses the same GPU as the 5700U. That would seem to make sense.

The owner of this repo added my GOP file today, but at the time, I was under the impression that it was motherboard specific. Out of curiosity, what computer / motherboard are you using? Since your driver works for me as well, I'll ask him to change the name of the GOP driver file to _5700U rather than the name of my mini pc.

Thanks for writing the extraction guide. One small correction: the 2nd number (1002:1637) is the product/device id, in your case 1637.

I also wrote up a guide on how to extract the vbios and GOP driver here. I was thinking of making a PR to update the main readme with the instructions. The repo already has some similar tools in the Tools directory, but this isn't mentioned at all in the main readme.

As for RadeonResetBugFixService, I haven't had much luck yet getting it working correctly. It doesn't seem to be detecting the VirtIO virtual display driver so it can disable it as it's supposed to when it enables the Radeon. Also, I had the iGPU fail to initialize on one boot and the host crashed once when I tried to reset the VM. Without the bug fix, with the iGPU set as the primary GPU, I haven't seen it fail to initialize once, nor have I seen it crash on rebooting the VM. Without the bug fix, it's only when I shut down the VM and try to start it again that it fails.


elandwg commented Mar 1, 2024

One small correction: the 2nd number (1002:1637) is the product/device id, in your case 1637.

Didn't see that, fixed it.

Out of curiosity, what computer / motherboard are you using?

Awoostar R7 mini PC. Bios ver.: 2.22.1282.
