@numinit
Last active Oct 29, 2020
patch -p0 < fix-vega-reset.patch
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44c4ae1abd00..27840129e4b0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3433,6 +3433,14 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset);
  */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);
 
+/*
+ * Radeon RX Vega and Navi devices break on bus reset. Oi...
+ * This is *not a real workaround* - disabling bus reset
+ * for your GPU may have unintended consequences.
+ */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0x687f, quirk_no_bus_reset);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0xaaf8, quirk_no_bus_reset);
+
 static void quirk_no_pm_reset(struct pci_dev *dev)
 {
 	/*
@gnif

gnif commented Jul 17, 2019

Because this issue exists in all modern Radeon cards and if we just implement a quirk AMD will just keep releasing broken cards.

@gnif

gnif commented Jul 18, 2019

@numinit
Author

numinit commented Jul 18, 2019

Please don't upstream my terrible patch. :-(

@numinit
Author

numinit commented Jul 18, 2019

Unless there's a really good reason to work around broken hardware (i.e. it's non-fixable even in the VBIOS), this problem is AMD's to solve.

@slade87

slade87 commented Jul 18, 2019

So the alternative is to wait for something that might never happen?

@gnif

gnif commented Jul 18, 2019

To include this is to promote bad behaviour on AMD's part. Not only that, this is not a fix: if your guest VM crashes or fails to shut down, or the guest AMDGPU driver crashes (which happens often), or your physical BIOS POSTs the AMD GPU before you can POST it inside your VM, this patch does nothing.

@numinit
Author

numinit commented Jul 18, 2019

Renamed the patch file to hopefully be more clear that this is not a real workaround. Sure, it works for some people, hopefully people get mileage out of it, but bus reset itself being broken is a bad problem that needs to be fixed.

@slade87

slade87 commented Jul 29, 2019

Incredible work, thank you. I'll test this out on my Vega 56 on the weekend!

@pdc4444

pdc4444 commented Aug 7, 2019

FYI: I'm new to kernel patching so this might just be my inexperience, but the current patch fails when running the patch command on Ubuntu 18.04 using Kernel 5.2.7.

I ended up grabbing the code from your initial commit which worked.

peter@ElephantBox:~/Downloads/linux-5.2.7$ patch -p1 < ~/Downloads/patch_for_vega/fix-vega-reset.patch 
patching file drivers/pci/quirks.c
patch: **** malformed patch at line 18:  {

peter@ElephantBox:~/Downloads/linux-5.2.7$ nano ~/Downloads/patch_for_vega/fix-vega-reset.patch 
peter@ElephantBox:~/Downloads/linux-5.2.7$ patch -p1 < ~/Downloads/patch_for_vega/fix-vega-reset.patch 
patching file drivers/pci/quirks.c
Hunk #1 succeeded at 3433 with fuzz 1 (offset 60 lines).
peter@ElephantBox:~/Downloads/linux-5.2.7$ 

@stefanleh

stefanleh commented Sep 19, 2019

Is it correct to just add

/*
 * Radeon RX Vega and Navi devices break on bus reset. Oi...
 * This is *not a real workaround* - disabling bus reset
 * for your GPU may have unintended consequences.
 */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0x687f, quirk_no_bus_reset);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0xaaf8, quirk_no_bus_reset);

after

DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset);

?

(I've got Kernel 5.2.13, so the patch won't apply in its current state.)

@gnif

gnif commented Sep 19, 2019

yes
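
For kernels where the diff no longer applies, the manual edit confirmed above can be scripted. The following is a minimal sketch, not part of the gist: the function name `add_vega_quirk` is hypothetical, and it assumes the Cavium line appears in the file exactly as quoted in the question. It writes the modified file to stdout and leaves the original untouched.

```shell
# Hypothetical helper: print quirks.c with the two ATI fixups inserted
# after the Cavium quirk_no_bus_reset declaration.
# Assumption: the Cavium line is present and spelled as in the question above.
add_vega_quirk() {
    awk '
    { print }
    /PCI_VENDOR_ID_CAVIUM, 0xa100, quirk_no_bus_reset\);/ {
        print ""
        print "/*"
        print " * Radeon RX Vega and Navi devices break on bus reset. Oi..."
        print " * This is *not a real workaround* - disabling bus reset"
        print " * for your GPU may have unintended consequences."
        print " */"
        print "DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0x687f, quirk_no_bus_reset);"
        print "DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, 0xaaf8, quirk_no_bus_reset);"
    }
    ' "$1"
}

# Usage, from the root of the kernel source tree:
#   add_vega_quirk drivers/pci/quirks.c > quirks.c.new
```

Review the output before replacing the original file; if the Cavium line is missing, the file comes back unchanged.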

@aiberia

aiberia commented Sep 20, 2019

@numinit This patch has invalid syntax since you added two comment lines without updating the line count (12->14 on the @@ line). Here is a fixed version (diffed from 5.3.0): https://gist.github.com/aiberia/dee39e883defbcb430994c2abc7d9fff
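
The mismatch described here can be checked mechanically before running patch. The following is a minimal sketch, assuming a single-file unified diff like this one; `check_hunk_counts` is a hypothetical helper, not an option of patch. It counts only body lines that start with a space, `+`, or `-`, so context lines whose leading space has been stripped are not counted.

```shell
# Hypothetical helper: read a unified diff on stdin and report every hunk whose
# "@@ -start,count +start,count @@" header disagrees with the lines in its body.
# Exits non-zero if any hunk is inconsistent.
check_hunk_counts() {
    awk '
    function report() {
        if (in_hunk && (nold != want_old || nnew != want_new)) {
            printf "hunk at line %d: header says -%d +%d, body has -%d +%d\n", hdr, want_old, want_new, nold, nnew
            bad = 1
        }
        in_hunk = 0
    }
    /^diff / { report(); next }
    /^@@ / {
        report()
        # $2 looks like "-3433,6" and $3 like "+3433,14"; a missing count means 1.
        n = split(substr($2, 2), o, ",")
        want_old = (n > 1) ? o[2] : 1
        n = split(substr($3, 2), p, ",")
        want_new = (n > 1) ? p[2] : 1
        nold = 0; nnew = 0; hdr = NR; in_hunk = 1
        next
    }
    in_hunk && /^ /  { nold++; nnew++ }   # context line: present in old and new
    in_hunk && /^-/  { nold++ }           # removed line: old only
    in_hunk && /^\+/ { nnew++ }           # added line: new only
    END { report(); exit bad + 0 }
    '
}

# Usage:
#   check_hunk_counts < fix-vega-reset.patch
```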

@numinit
Author

numinit commented Sep 22, 2019

@aiberia Thank you, fixed it.

@methanoid

methanoid commented Dec 12, 2019

> Awesome, but please do not upstream this patch. I am working with AMD to produce a proper reset of the device also as a PCI quirk.

This is unfixed a year or more later. Who should we be chasing at AMD? Are they working with you on a proper fix?

@salcin

salcin commented Dec 15, 2019

Hi,

Is this patch still relevant? I compiled kernel 5.4.3 with this patch, and my system no longer even detects the graphics card.

On the standard kernel 5.3.0-3 (without the patch), Windows boots, but I frequently get the error "pci header type '127' for device" when I try to reboot my VM.

I have to disconnect my PC's power cable before I can reboot.

@ImreBrassai

ImreBrassai commented Dec 26, 2019

This isn't working for me; it keeps asking for a file to patch:

[imre@localhost Desktop]$ patch -p0 < fix-vega-reset.patch
can't find file to patch at input line 5
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
|index 44c4ae1abd00..27840129e4b0 100644
|--- a/drivers/pci/quirks.c
|+++ b/drivers/pci/quirks.c
--------------------------
File to patch: 

@gnif

gnif commented Dec 26, 2019

@ImreBrassai patch -p0 != patch -p1
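
The difference comes from how `-pN` strips leading path components from the file names in the diff header. The helper below is a hypothetical illustration, not part of patch itself, that mimics the stripping rule for simple paths.

```shell
# Hypothetical illustration of patch's -pN rule: drop the first N
# slash-separated components from a path taken from a diff header.
strip_path() {
    n=$1
    path=$2
    while [ "$n" -gt 0 ]; do
        case $path in
            */*) path=${path#*/} ;;   # remove everything up to the first slash
            *) break ;;               # no slash left: nothing more to strip
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

strip_path 0 a/drivers/pci/quirks.c   # prints a/drivers/pci/quirks.c
strip_path 1 a/drivers/pci/quirks.c   # prints drivers/pci/quirks.c
```

With `-p0`, patch looks for `a/drivers/pci/quirks.c`, which does not exist in a kernel tree. With `-p1` it looks for `drivers/pci/quirks.c`, which resolves only when the command is run from the root of the kernel source.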

@ImreBrassai

ImreBrassai commented Dec 27, 2019

what

@ImreBrassai

ImreBrassai commented Dec 27, 2019

[imre@localhost ~]$ patch -p1 < vega.patch 
can't find file to patch at input line 5
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
|index 44c4ae1abd00..27840129e4b0 100644
|--- a/drivers/pci/quirks.c
|+++ b/drivers/pci/quirks.c
--------------------------
File to patch: 

@gnif

gnif commented Dec 27, 2019

You need to patch the kernel source and recompile, you don't just run the command provided...

Please stop bolding your text too, just use the insert code button.

@ImreBrassai

ImreBrassai commented Dec 28, 2019

OK, sorry about that. It bolds automatically; I don't know why.

@ImreBrassai

ImreBrassai commented Dec 29, 2019

So I did what you said: I downloaded the kernel, patched it, and installed it. When I run your script to test whether it worked, it gives me this:

[imre@localhost linux-5.4.1]$ ~/Downloads/reset-test 0000:0a:00.0
============================================================================

AMD Vega 10/12 Reset Application (Version: 1.0)
Copyright (c) 2019 Geoffrey McRae <geoff@hostfission.com>

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

This tool is intended as an interim workaround while I port this into the
kernel driver. If you like my work and want to support it you can contribute
using the following methods:

* Ko-Fi   - https://ko-fi.com/lookingglass
* Patreon - https://www.patreon.com/gnif
* BTC     - 14ZFcYjsKPiVreHqcaekvHGL846u3ZuT13

============================================================================

Unsupported device 1002:731f
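
The ID in that last line, 1002:731f, is not one of the two device IDs the patch registers (0x687f and 0xaaf8 under PCI_VENDOR_ID_ATI, which is 0x1002). The following is a minimal sketch for checking a card against the patch, assuming the standard sysfs layout under /sys/bus/pci/devices; the function names `pci_id` and `quirk_covers` are hypothetical.

```shell
# Hypothetical helper: print "vendor:device" for a PCI address such as 0000:0a:00.0.
# The second argument overrides the sysfs root (an assumption made for testing).
pci_id() {
    addr=$1
    root=${2:-/sys/bus/pci/devices}
    vendor=$(cat "$root/$addr/vendor") || return 1
    device=$(cat "$root/$addr/device") || return 1
    # sysfs reports values like 0x1002; drop the 0x prefix.
    printf '%s:%s\n' "${vendor#0x}" "${device#0x}"
}

# Succeeds only for the two IDs listed in the patch above.
quirk_covers() {
    case $1 in
        1002:687f|1002:aaf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   id=$(pci_id 0000:0a:00.0) && quirk_covers "$id" && echo "covered by the patch"
```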

@gnif

gnif commented Dec 30, 2019

@ImreBrassai

ImreBrassai commented Dec 30, 2019

OH! i see now

@NateTheGreatt

NateTheGreatt commented Jan 10, 2020

@gnif Just wanted to thank you for your hard work; you saved my brand new Threadripper build. Really can't thank you enough. New Patreon incoming.

Has there been any progress made on an upstream fix for this?

@gnif

gnif commented Jan 11, 2020

Thanks mate.

> any progress made on an upstream fix for this

Not yet, things slowed down across the holiday break, contacts have gone quiet for now ;)
In the interim work is progressing on Looking Glass :)

@Transistor4aCPU

Transistor4aCPU commented Jan 11, 2020

Which file should you patch? How do I apply the patch?

@NateTheGreatt

NateTheGreatt commented Jan 17, 2020

> Which file should you patch? How do I apply the patch?

First you must download the source code of the Linux kernel. The patch is applied in the root directory of the kernel source, before compiling. Please google how to apply patches to the Linux kernel using your distro of choice. This thread should be for information pertinent to the patch, not generic questions about the Linux kernel itself.
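
As a rough sketch of that step, the function below dry-runs the patch from the root of a source tree and applies it only if the dry run is clean. The name `apply_patch_checked` is hypothetical; `--dry-run` is a GNU patch option, and the patch file must be given as an absolute path because the function changes directory first. Building and installing the kernel afterwards is distro-specific and not shown.

```shell
# Hypothetical helper: apply a -p1 patch inside a source tree, but only
# if a dry run succeeds first.
#   $1 = root of the source tree, $2 = absolute path to the patch file
apply_patch_checked() {
    src_dir=$1
    patch_file=$2
    (
        cd "$src_dir" || exit 1
        # --dry-run reports what would happen without modifying any file.
        patch -p1 --dry-run < "$patch_file" >/dev/null || exit 1
        patch -p1 < "$patch_file"
    )
}

# Usage, with the paths from the session earlier in this thread:
#   apply_patch_checked ~/Downloads/linux-5.2.7 ~/Downloads/patch_for_vega/fix-vega-reset.patch
```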

@c0d3st0rm

c0d3st0rm commented Apr 12, 2020

Could this be applied with kpatch/live patching?
