Tentative fix for NVIDIA 470.199.02 driver for Linux 6.6-rc1
From a1879549b0bf049de790c0775c25971c82da8638 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Joan=20Bruguera=20Mic=C3=B3?= <joanbrugueram@gmail.com>
Date: Sat, 15 Jul 2023 22:26:18 +0000
Subject: [PATCH] Tentative fix for NVIDIA 470.199.02 driver for Linux 6.6-rc1

You will also need to apply this patch for Linux 6.5 support:
https://gist.github.com/joanbm/dfe8dc59af1c83e2530a1376b77be8ba
---
nvidia-drm/nvidia-drm-drv.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/nvidia-drm/nvidia-drm-drv.c b/nvidia-drm/nvidia-drm-drv.c
index b93642a..1b310f3 100644
--- a/nvidia-drm/nvidia-drm-drv.c
+++ b/nvidia-drm/nvidia-drm-drv.c
@@ -808,8 +808,12 @@ static struct drm_driver nv_drm_driver = {
.ioctls = nv_drm_ioctls,
.num_ioctls = ARRAY_SIZE(nv_drm_ioctls),
+// Rel. commit "drm/prime: Unexport helpers for fd/handle conversion" (Thomas Zimmermann, 20 Jun 2023)
+// Those functions are no longer exported, but leaving them to NULL is equivalent
+#if LINUX_VERSION_CODE < KERNEL_VERSION(6, 6, 0)
.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
+#endif
.gem_prime_import = nv_drm_gem_prime_import,
.gem_prime_import_sg_table = nv_drm_gem_prime_import_sg_table,
--
2.41.0
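
The comment in the patch says that leaving the two hooks set to NULL is equivalent to assigning the default helpers. A minimal userspace sketch of that pattern follows, assuming the core falls back to its own built-in helper whenever a driver leaves the hook unset. Every name here (`sketch_driver`, `sketch_default_handle_to_fd`, `sketch_core_handle_to_fd`) is hypothetical and only stands in for the real DRM structures; this is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver ops table (not the real struct drm_driver). */
struct sketch_driver {
    int (*handle_to_fd)(unsigned int handle); /* optional hook, may be NULL */
};

/* Hypothetical stand-in for the core's built-in default helper. */
static int sketch_default_handle_to_fd(unsigned int handle)
{
    return (int)handle + 100;
}

/* Core-side dispatch: call the driver's hook if set, otherwise the default. */
static int sketch_core_handle_to_fd(const struct sketch_driver *drv,
                                    unsigned int handle)
{
    assert(drv != NULL);
    if (drv->handle_to_fd)
        return drv->handle_to_fd(handle);
    return sketch_default_handle_to_fd(handle);
}

/* Older style: the driver names the default helper explicitly. */
static const struct sketch_driver sketch_explicit = {
    .handle_to_fd = sketch_default_handle_to_fd,
};

/* Newer style: the hook is left NULL and the core supplies the default. */
static const struct sketch_driver sketch_implicit = {
    .handle_to_fd = NULL,
};
```

Under that assumption, calling `sketch_core_handle_to_fd` with either table gives the same result, which is why the patch can simply compile the two assignments out on 6.6 and later instead of replacing them.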

joanbm commented Jan 9, 2024

@blastwave If this also happens with recent but non-cutting-edge kernels like 6.2.x to 6.6.x, it may be worth reporting it to NVIDIA with reproduction steps. Not sure how responsive they are with problems with those "old" drivers but as far as I can tell, those kernel versions are officially supported.

@blastwave

> @blastwave If this also happens with recent but non-cutting-edge kernels like 6.2.x to 6.6.x

Get the 6.7.0 kernel. The issue here is that the NVIDIA devs are doing nasty wrapper calls in
their code.

> it may be worth reporting it to NVIDIA with reproduction steps

How? The NVIDIA folks do not really have a Bugzilla.

> Not sure how responsive they are with problems with those "old" drivers

The real issue is that NVIDIA wants to drop support for all the Kepler hardware that
can perform FP64 floating-point operations at full speed. It is about money,
of course.

However, nothing will get around the nasty code tricks that NVIDIA devs perform inside
their secret proprietary driver code. That is why we get a NULL pointer dereference from
the code. There were some changes in the way things are done in the Linux kernel after
6.1.x, where people really need to stick to the well-known _syscallX (for X = 0 ... 6)
type calls and nothing else: no digging around in __this_funky_non_API type calls and no
weird wrapper calls. However, the NVIDIA folks seem to just want to do whatever they do,
and we get garbage drivers.

I can give 545.29.06 a try and see what happens.

Dennis Clarke
RISC-V/SPARC/PPC/ARM/CISC
UNIX and Linux spoken
