@s41m0n
Last active January 31, 2024 22:23
This guide is meant to help people who have problems managing their dedicated Nvidia graphics card.

Linux - Nvidia switchable setup guide

The aim of this guide is to provide a working strategy to make your dedicated graphics card turn on/off correctly in a Linux environment (with Xorg).

The following scripts were created by tyrells, and this guide is a remake of Graff's.

Required Packages

The following two packages are strictly required:

  • nvidia
  • bumblebee (to use optirun)

In addition, this guide also covers the scenario in which these packages are installed:

  • tlp
  • powertop (mostly used for verification)

Configuration

First of all, if you have tlp installed you need to tell it not to manage the power of the Nvidia card, otherwise we would not be able to turn the card on/off ourselves. To find the PCI address of your Nvidia graphics card:

~ lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)

Let's insert the value 01:00.0 in the blacklist of tlp:

/etc/default/tlp
RUNTIME_PM_BLACKLIST="01:00.0"
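
The two steps above (reading the address from lspci and writing it into the tlp blacklist) can be sketched as a pair of small shell functions. This is only an illustration: nvidia_pci_addr and tlp_blacklist_line are hypothetical helper names, not part of tlp or pciutils, and the sketch assumes a single Nvidia device whose lspci line contains the string NVIDIA.

```shell
#!/bin/sh
# Hypothetical helpers, not part of tlp or pciutils.

# Read `lspci` output on stdin and print the address (first field) of the
# first line that mentions NVIDIA, e.g. "01:00.0".
nvidia_pci_addr() {
    awk '/NVIDIA/ { print $1; exit }'
}

# Print the line to place in /etc/default/tlp for the given address.
tlp_blacklist_line() {
    printf 'RUNTIME_PM_BLACKLIST="%s"\n' "$1"
}
```

A possible use is tlp_blacklist_line "$(lspci | nvidia_pci_addr)", checking the printed line by eye before copying it into /etc/default/tlp.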

Once done, we need to modify the bumblebee configuration file to select the nvidia driver and to disable bumblebee's own power-saving method, since we switch the card on/off manually:

/etc/bumblebee/bumblebee.conf
...
Driver=nvidia
...

And in the [driver-nvidia] section:

...
PMMethod=none
...

Then, we need to create the following file so that the GPU is allowed to power off at boot. First, retrieve the correct device name to insert by typing:

~ ls /sys/bus/pci/devices | grep 01:00.0        
0000:01:00.0

And now create the file:

/etc/tmpfiles.d/nvidia_pm.conf
w /sys/bus/pci/devices/0000:01:00.0/power/control - - - - auto
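
As a rough sketch, the tmpfiles.d line can be generated from the short address instead of being typed by hand. The function names are hypothetical, and the sketch assumes the PCI domain is 0000, as in the output shown above.

```shell
#!/bin/sh
# Hypothetical helpers; assumes PCI domain 0000 when only "bus:dev.fn" is given.

# Turn "01:00.0" into the sysfs device name "0000:01:00.0".
# A name that already carries a domain is printed unchanged.
sysfs_name() {
    case "$1" in
        ????:??:??.?) printf '%s\n' "$1" ;;
        *)            printf '0000:%s\n' "$1" ;;
    esac
}

# Print the line for /etc/tmpfiles.d/nvidia_pm.conf.
tmpfiles_line() {
    printf 'w /sys/bus/pci/devices/%s/power/control - - - - auto\n' "$(sysfs_name "$1")"
}
```

After writing the file, systemd-tmpfiles --create /etc/tmpfiles.d/nvidia_pm.conf should apply it without a reboot, and cat /sys/bus/pci/devices/0000:01:00.0/power/control should then print auto.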

The following two files configure your Xorg environment so that it does not automatically add a GPU when one is detected. You also have to specify the driver of your integrated card (in my case Intel):

/etc/X11/xorg.conf.d/01-noautogpu.conf
Section "ServerFlags"
	Option "AutoAddGPU" "off"
EndSection

/etc/X11/xorg.conf.d/20-intel.conf
Section "Device"
 Identifier  "Intel Graphics"
 Driver      "intel"
EndSection

Blacklist files

Now that the general configuration is in place, let's focus on blacklisting some modules. Some modules must be prevented from loading automatically, since we want to turn the GPU on/off manually, so we add this file:

/etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nv
blacklist nvidia
blacklist nvidia-drm
blacklist nvidia-modeset
blacklist nvidia-uvm
blacklist ipmi_msghandler
blacklist ipmi_devintf 

Moreover, some modules are loaded automatically together with nvidia and block its unloading. To avoid ending up in this situation, we disable them by creating:

/etc/modprobe.d/disable-ipmi.conf
install ipmi_msghandler /usr/bin/false
install ipmi_devintf /usr/bin/false

And the same thing for the nvidia module:

/etc/modprobe.d/disable-nvidia.conf
install nvidia /bin/false
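
To double-check which modules are locked by the install rules above, a small helper can scan the modprobe.d directory. This is a sketch only: blocked_modules is a hypothetical name, and it only recognises rules of the exact form used in this guide (install <module> followed by a path ending in false).

```shell
#!/bin/sh
# Hypothetical helper: list modules whose loading is disabled by an
# "install <module> .../false" rule in the *.conf files of a directory.
# The directory defaults to /etc/modprobe.d; plain "blacklist" lines are ignored.
blocked_modules() {
    dir=${1:-/etc/modprobe.d}
    cat "$dir"/*.conf 2>/dev/null |
        awk '$1 == "install" && $3 ~ /(^|\/)false$/ { print $2 }' |
        sort -u
}
```

With the files from this guide in place, blocked_modules should list ipmi_devintf, ipmi_msghandler and nvidia. A dry run such as modprobe -n -v nvidia should also show the install rule instead of an insmod line.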

GPU management scripts

The following two scripts switch the GPU on/off when called from a terminal. They are responsible not only for switching the state of the correct PCI device, but also for unloading/reloading all the needed modules.

/bin/enableGpu.sh
#!/bin/sh
# allow the nvidia module to be loaded
mv /etc/modprobe.d/disable-nvidia.conf /etc/modprobe.d/disable-nvidia.conf.disable

# remove NVIDIA card (currently in power/control = auto)
echo -n 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove
sleep 1
# change PCIe power control
echo -n on > /sys/bus/pci/devices/0000\:00\:01.0/power/control
sleep 1
# rescan for NVIDIA card (defaults to power/control = on)
echo -n 1 > /sys/bus/pci/rescan

/bin/disableGpu.sh
#!/bin/sh
# unload the nvidia modules, dependents first
modprobe -r nvidia_drm
modprobe -r nvidia_uvm
modprobe -r nvidia_modeset
modprobe -r nvidia

# change NVIDIA card power control
echo -n auto > /sys/bus/pci/devices/0000\:01\:00.0/power/control
sleep 1
# change PCIe power control
echo -n auto > /sys/bus/pci/devices/0000\:00\:01.0/power/control
sleep 1

# lock system from loading nvidia module
mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf

Please note: for the scripts to work they must be executable (chmod +x /bin/enableGpu.sh /bin/disableGpu.sh) and must be run as root (e.g. with sudo), since they write to /sys and /etc/modprobe.d.
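
Both scripts hardcode the device 0000:01:00.0, so before running them it may help to confirm that this device exists and to see its current power state. The following is a minimal sketch: gpu_power_state is a hypothetical helper, and SYSFS_PCI is an invented override that exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# Hypothetical helper: print "<power/control> <power/runtime_status>" for a
# PCI device, e.g. "auto suspended". SYSFS_PCI is an assumption of this
# sketch (not a kernel or bumblebee setting) and defaults to the real sysfs.
gpu_power_state() {
    dev=${1:-0000:01:00.0}
    base=${SYSFS_PCI:-/sys/bus/pci/devices}/$dev/power
    if [ ! -r "$base/control" ]; then
        echo "device $dev not present" >&2
        return 1
    fi
    printf '%s %s\n' \
        "$(cat "$base/control")" \
        "$(cat "$base/runtime_status" 2>/dev/null || echo unknown)"
}
```

If gpu_power_state fails for your address, the remove and power/control paths in the scripts will fail with "No such file or directory" too, and the address in both scripts needs to be adapted.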

Service to lock GPU on shutdown

The unit we are going to create is a service which locks the GPU on shutdown/restart in case it has not been disabled yet. This is necessary because otherwise, on the next boot, both the nvidia and ipmi modules would be loaded and it would no longer be possible to unload them, even though we have created the blacklist file.

/etc/systemd/system/disable-nvidia-on-shutdown.service
[Unit]
Description=Disables Nvidia GPU on OS shutdown

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c "mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf || true"

[Install]
WantedBy=multi-user.target

Reload systemd daemons and enable the new service:

systemctl daemon-reload 
systemctl enable disable-nvidia-on-shutdown.service
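
The scripts above toggle a modprobe.d file by renaming it, and the service repeats the locking half on shutdown. A plain mv fails with "cannot stat" when the file has already been renamed, so one possible variation is to make both directions idempotent. This is a sketch, not what the scripts currently do: lock_nvidia and unlock_nvidia are hypothetical names, and MODPROBE_DIR is an invented override used only for testing.

```shell
#!/bin/sh
# Hypothetical helpers. MODPROBE_DIR is an assumption of this sketch and
# defaults to /etc/modprobe.d.

# Make sure disable-nvidia.conf is active (nvidia cannot be loaded).
lock_nvidia() {
    dir=${MODPROBE_DIR:-/etc/modprobe.d}
    if [ -e "$dir/disable-nvidia.conf.disable" ]; then
        mv "$dir/disable-nvidia.conf.disable" "$dir/disable-nvidia.conf"
    fi
    # Succeed only if the lock file is now in place.
    [ -e "$dir/disable-nvidia.conf" ]
}

# Make sure disable-nvidia.conf is inactive (nvidia may be loaded).
unlock_nvidia() {
    dir=${MODPROBE_DIR:-/etc/modprobe.d}
    if [ -e "$dir/disable-nvidia.conf" ]; then
        mv "$dir/disable-nvidia.conf" "$dir/disable-nvidia.conf.disable"
    fi
    # Succeed only if the lock file is no longer active.
    [ ! -e "$dir/disable-nvidia.conf" ]
}
```

Calling either function twice in a row is harmless, which is the property the shutdown service relies on with its || true.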

Final remarks

To make the system work:

  1. Reboot and verify that the nvidia module is not loaded: lsmod | grep nvidia
  2. Verify with powertop, under Device stats, that the Nvidia card shows 0% power usage
  3. Enable your GPU by using the script enableGpu.sh
  4. Verify again that this time the power usage is 100%
  5. Check that the GPU is loaded by using nvidia-smi
  6. Try to run any program with optirun, for example optirun glxspheres64
  7. Disable the Nvidia card with disableGpu.sh
  8. Check the power consumption once again to be sure
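
The lsmod checks in steps 1 and 7 can be wrapped in a small filter that lists exactly the nvidia modules still loaded. This is a sketch with a hypothetical function name; it reads lsmod output on stdin so it can also be tried on saved output.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output on stdin and print the names of
# loaded modules that start with "nvidia" (nvidia, nvidia_drm, ...).
# The first line of lsmod output is a header and is skipped.
loaded_nvidia_modules() {
    awk 'NR > 1 && $1 ~ /^nvidia/ { print $1 }'
}
```

Running lsmod | loaded_nvidia_modules should print nothing after a clean boot and after disableGpu.sh, and should list the nvidia modules while the GPU is in use.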

IMPORTANT: if you dual boot with a Windows installation that uses the Nvidia card (and of course manages it differently), I noticed that on the next boot of Linux the Nvidia card may already be loaded. In this case we are not able to unload the nvidia module manually, so we have to undo the action performed by the service disable-nvidia-on-shutdown.service, which is mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf. All we have to do is open a terminal and perform the opposite renaming:

sudo mv /etc/modprobe.d/disable-nvidia.conf /etc/modprobe.d/disable-nvidia.conf.disable

Restart the computer and check whether the Nvidia card is actually loaded; then try to disable it by using the script and hopefully it will work! I did not dig deeper into this scenario, so any tips are welcome :D

notdodo commented Jan 21, 2019

It's working for me! 👍
I also added

blacklist nvidia-update
blacklist nvidia-experimental

in the blacklist.conf and the X11 Intel conf should not be necessary (Xorg "should" be able to pick up the correct driver).
Anyway, thank you!

EDIT: Why can't I use /usr/share/acpi_call/examples/turn_off_gpu.sh to disable the GPU?

@s41m0n

@SamuelAlev

Hello,
Is it normal that in blacklist.conf, we have '-' between 'nvidia' and the right part

blacklist nvidia-drm
blacklist nvidia-modeset
blacklist nvidia-uvm

and in disableGpu.sh we have '_'

modprobe -r nvidia_drm
modprobe -r nvidia_uvm
modprobe -r nvidia_modeset

s41m0n commented Nov 14, 2019

Hi Samuel,
yes, for those modules it is completely normal, since you can use both their name property (run modinfo nvidia-drm and you will see name=nvidia_drm) and the file name (modprobe will look for the .ko file named nvidia-drm).

Hope I've been clear enough, for any doubt ask :)
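
As a small illustration of the point above: the modprobe man page states that there is no difference between _ and - in module names, because an automatic underscore conversion is performed. A sketch of that conversion, with a hypothetical function name, could look like this:

```shell
#!/bin/sh
# Hypothetical helper: map a module name written with dashes to the
# underscore form that lsmod displays (nvidia-drm -> nvidia_drm).
normalize_module_name() {
    printf '%s\n' "$1" | tr '-' '_'
}
```

So blacklist nvidia-drm and modprobe -r nvidia_drm refer to the same module.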

@SamuelAlev

Hello,
Yes it is totally clear thank you for that !

I think etc/tempfiles.d/nvidia_pm.conf should be etc/tmpfiles.d/nvidia_pm.conf (no 'e' to tmp) as the folder doesn't exist

s41m0n commented Nov 17, 2019

I'm really glad :)
You're totally right, I've just fixed it.
Thanks again!

sursu commented Jun 18, 2020

In order to switch between graphics cards I usually run e.g. sudo prime-select nvidia to select the NVIDIA GPU and I need to restart my machine in order for the changes to take effect.

Do I understand it right that by following this guide I will no longer have to restart if I want to switch to the other GPU?
If so, why isn't this already part of Ubuntu?

On the NVIDIA Optimus Bumblebee, it is written:

With recent releases of Xorg and the NVIDIA driver, Bumblebee is no longer required.

Do we still need to use Bumblebee?

s41m0n commented Jun 18, 2020

Hi,

by following this guide you will be able to manually switch the NVIDIA graphics card on/off and use it with optirun.
Unfortunately, it has been a few months since I last played with NVIDIA, so I am not aware of the latest updates.
All I can tell is that now in Ubuntu, thanks to NVIDIA-SETTINGS, you can choose one of the following modes:

  • Performance (full NVIDIA usage)
  • On-demand (nvidia used only on specific apps)
  • Integrated card (no NVIDIA)

That said, I tried the On-demand mode, but it seems that what it effectively does is keep the Nvidia card on even though you are not using it (xorg is booted using the Nvidia), resulting in a higher power consumption than with the Integrated card mode.

I hope they'll fix it.

@adrian-venegasreynoso

Hello,
when I run the script enableGpu.sh I get the following errors:
mv: cannot stat '/etc/modprobe.d/disable-nvidia.conf': No such file or directory
./enableGpu.sh: line 6: /sys/bus/pci/devices/0000:02:00.0/remove: No such file or directory

So when I run optirun glxspheres I get:
Cannot access secondary GPU - error: Could not load GPU driver
Aborting because fallback start is disabled

Do you know what could be happening? Thanks in advance!

s41m0n commented Oct 27, 2020

Hi advery,
are you sure your PCI is 0000:02:00.0?

@thebabush

I had Cannot access secondary GPU - error: Could not load GPU driver Aborting because fallback start is disabled, but then used prime-select nvidia and it worked.
But then after doing prime-select intel + disableGPU.sh the nvidia module is still loaded :/

thebabush commented Nov 8, 2020

Ok, so in enableGPU.sh I have to add prime-select nvidia.
In disableGPU.sh I have to add prime-select nvidia and run sudo systemctl stop nvidia-persistenced.service before removing the drivers.

Seems to be working now :)

EDIT: And now I'm in this weird state in which if I run optirun glxgears sometimes it brings the dGPU back on, sometimes not. But whatever good enough, as disableGpu.sh now can kill it.
