@s41m0n
Last active January 31, 2024 22:23
This guide is meant to help people who have problems managing a dedicated Nvidia graphics card.

Linux - Nvidia switchable setup guide

The aim of this guide is to provide a working strategy to make your dedicated graphics card turn on/off correctly in a Linux environment (with xorg).

The following scripts were created by tyrells, and this guide is a reworking of Graff's.

Required Packages

The following two packages are strictly required:

  • nvidia
  • bumblebee (to use optirun)

In addition, this guide also covers the scenario in which these packages are installed:

  • tlp
  • powertop (mostly used for verification)

Configuration

First of all, if you have tlp installed, you need to tell it not to manage the power state of the Nvidia card, otherwise we would not be able to turn it on/off. To find out the PCI address of your Nvidia graphics card:

~ lspci | grep NVIDIA
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)

Let's insert the value 01:00.0 in the blacklist of tlp:

/etc/default/tlp
RUNTIME_PM_BLACKLIST="01:00.0"
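
The two manual steps above (finding the address and writing the tlp entry) can be sketched as a small shell helper. This is only an illustration: the function names nvidia_pci_addr and tlp_blacklist_line are made up for this example, and it assumes the lspci line starts with the address, as in the output shown above, and that the first NVIDIA entry is the card you want.

```shell
#!/bin/sh
# Hypothetical helpers (these names are not part of tlp or bumblebee).

# Print the PCI address (e.g. 01:00.0) of the first NVIDIA device found
# in lspci-style output read from stdin.
nvidia_pci_addr() {
    awk '/NVIDIA/ { print $1; exit }'
}

# Print the tlp setting that excludes the given address from runtime PM.
tlp_blacklist_line() {
    printf 'RUNTIME_PM_BLACKLIST="%s"\n' "$1"
}

# Typical use on a real machine (left commented out on purpose):
#   addr=$(lspci | nvidia_pci_addr)
#   tlp_blacklist_line "$addr"    # paste the result into /etc/default/tlp
```

The helper only prints the line; editing /etc/default/tlp is still left to you, so nothing is changed by accident.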

Once done, we need to modify the bumblebee configuration file to select the nvidia driver and to set the power management method bumblebee uses for the card (none here, since the scripts below do the switching):

/etc/bumblebee/bumblebee.conf
...
Driver=nvidia
...

And in the [driver-nvidia] section:

...
PMMethod=none
...

Then we need to create the following file so that the GPU is allowed to power off at boot. First, be sure to retrieve the correct value to insert by typing:

~ ls /sys/bus/pci/devices | grep 01:00.0        
0000:01:00.0

And now create the file:

/etc/tmpfiles.d/nvidia_pm.conf
w /sys/bus/pci/devices/0000:01:00.0/power/control - - - - auto
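
Once the file is in place and the machine has rebooted, the value can be read back from sysfs. A minimal sketch, assuming the address 0000:01:00.0 used above; power_control is a hypothetical helper name, and its optional second argument (the sysfs root) exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime PM policy ("auto" or "on") of a PCI device.
# $1 = full PCI address, e.g. 0000:01:00.0
# $2 = sysfs root (defaults to /sys; overridable for dry runs)
power_control() {
    file="${2:-/sys}/bus/pci/devices/$1/power/control"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "no such device: $1" >&2
        return 1
    fi
}

# Typical use:
#   power_control 0000:01:00.0    # should print "auto" if the tmpfiles.d entry was applied
```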

The following two files configure xorg not to add a GPU automatically when one is detected, and specify the driver of your integrated card (in my case Intel):

/etc/X11/xorg.conf.d/01-noautogpu.conf
Section "ServerFlags"
	Option "AutoAddGPU" "off"
EndSection

/etc/X11/xorg.conf.d/20-intel.conf
Section "Device"
 Identifier  "Intel Graphics"
 Driver      "intel"
EndSection

Blacklist files

Now that the general configuration is in place, let's focus on blacklisting some modules. Some modules must be prevented from loading automatically, since we want to turn the GPU on/off manually, so we have to add this file:

/etc/modprobe.d/blacklist.conf
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv
blacklist nv
blacklist nvidia
blacklist nvidia-drm
blacklist nvidia-modeset
blacklist nvidia-uvm
blacklist ipmi_msghandler
blacklist ipmi_devintf 

Moreover, there are modules which are automatically loaded together with nvidia and block its unloading. Since we do not want to end up in this situation, we disable them by creating:

/etc/modprobe.d/disable-ipmi.conf
install ipmi_msghandler /usr/bin/false
install ipmi_devintf /usr/bin/false

And the same thing for the nvidia module:

/etc/modprobe.d/disable-nvidia.conf
install nvidia /bin/false
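
To double-check which modules are blocked by an install override, the modprobe.d directory can be scanned. A rough sketch: blocked_modules is a hypothetical name, and the function only recognises install lines whose command is false, /bin/false or /usr/bin/false, which is the form used in the two files above.

```shell
#!/bin/sh
# Hypothetical helper: list modules whose "install" command is false.
# $1 = modprobe.d directory (defaults to /etc/modprobe.d)
blocked_modules() {
    dir="${1:-/etc/modprobe.d}"
    # Fields of a matching line: install <module> <command>
    cat "$dir"/*.conf 2>/dev/null |
        awk '$1 == "install" && ($3 == "false" || $3 == "/bin/false" || $3 == "/usr/bin/false") { print $2 }' |
        LC_ALL=C sort -u
}

# Typical use:
#   blocked_modules    # with the files above: ipmi_devintf, ipmi_msghandler, nvidia
```

Plain blacklist lines are ignored on purpose, since they do not stop a module from being loaded explicitly or as a dependency.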

GPU management scripts

The following two scripts switch the GPU on/off when called from a terminal. They are not only responsible for switching the power state of the relevant PCI devices, but they also unload/reload all the needed modules.

/bin/enableGpu.sh
#!/bin/sh
# allow to load nvidia module
mv /etc/modprobe.d/disable-nvidia.conf /etc/modprobe.d/disable-nvidia.conf.disable

# remove NVIDIA card (currently in power/control = auto)
echo -n 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove
sleep 1
# change PCIe power control
echo -n on > /sys/bus/pci/devices/0000\:00\:01.0/power/control
sleep 1
# rescan for NVIDIA card (defaults to power/control = on)
echo -n 1 > /sys/bus/pci/rescan

/bin/disableGpu.sh
#!/bin/sh
# unload the nvidia modules (dependent modules first)
modprobe -r nvidia_drm
modprobe -r nvidia_uvm
modprobe -r nvidia_modeset
modprobe -r nvidia

# change NVIDIA card power control
echo -n auto > /sys/bus/pci/devices/0000\:01\:00.0/power/control
sleep 1
# change PCIe power control
echo -n auto > /sys/bus/pci/devices/0000\:00\:01.0/power/control
sleep 1

# lock system from loading nvidia module
mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf

Please note: for the scripts to work they must be executable: chmod +x /bin/enableGpu.sh /bin/disableGpu.sh. They must also be run as root (for example with sudo), since they write to /sys and /etc/modprobe.d.
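
Since both scripts work by renaming disable-nvidia.conf, it can help to see which state the system is in before running them. A small sketch, with gpu_lock_state as a hypothetical helper name; the directory argument is there only for dry runs.

```shell
#!/bin/sh
# Hypothetical helper: report whether loading the nvidia module is currently blocked.
# $1 = modprobe.d directory (defaults to /etc/modprobe.d)
gpu_lock_state() {
    dir="${1:-/etc/modprobe.d}"
    if [ -e "$dir/disable-nvidia.conf" ]; then
        echo locked        # state left by disableGpu.sh
    elif [ -e "$dir/disable-nvidia.conf.disable" ]; then
        echo unlocked      # state left by enableGpu.sh
    else
        echo unknown       # neither file exists; the setup is incomplete
    fi
}

# Typical use:
#   [ "$(gpu_lock_state)" = locked ] && echo "run enableGpu.sh first"
```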

Service to lock GPU on shutdown

The unit we are going to create is a service which locks the GPU on shutdown/restart in case it has not been disabled yet. This is necessary because otherwise, on the next boot, both the nvidia and ipmi modules would be loaded and it would no longer be possible to unload them, even though we have created the blacklist file.

/etc/systemd/system/disable-nvidia-on-shutdown.service
[Unit]
Description=Disables Nvidia GPU on OS shutdown

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash -c "mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf || true"

[Install]
WantedBy=multi-user.target

Reload systemd daemons and enable the new service:

systemctl daemon-reload 
systemctl enable disable-nvidia-on-shutdown.service

Final remarks

To make the system work:

  1. Reboot and verify that the nvidia module is not loaded: lsmod | grep nvidia
  2. Verify with powertop, under Device stats, that the Nvidia card is at 0%
  3. Enable your GPU by using the script enableGpu.sh
  4. Verify again that this time the value is 100%
  5. Check that the GPU is available by using nvidia-smi
  6. Try to run a program with optirun, for example optirun glxspheres64 (any program will do)
  7. Disable the Nvidia card with disableGpu.sh
  8. Check the power consumption once again to be sure
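
Step 1 comes down to checking whether any nvidia module is loaded. A sketch of that check, assuming the /proc/modules format (module name in the first column); nvidia_modules is a hypothetical name, and the file argument is only there for dry runs.

```shell
#!/bin/sh
# Hypothetical helper: print loaded modules named nvidia or starting with nvidia_.
# $1 = modules file in /proc/modules format (defaults to /proc/modules)
nvidia_modules() {
    awk '$1 == "nvidia" || $1 ~ /^nvidia_/ { print $1 }' "${1:-/proc/modules}"
}

# Typical use:
#   if [ -z "$(nvidia_modules)" ]; then echo "nvidia modules not loaded"; fi
```

Matching on the first column avoids false positives from modules that merely list nvidia as a user in another column.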

IMPORTANT: if you dual boot with a Windows installation that uses the Nvidia card (and of course manages it differently), I noticed that on the next boot of Linux the Nvidia card may already be loaded. In this case we are not able to unload the nvidia module manually, so we have to undo the action performed by the disable-nvidia-on-shutdown.service unit, which is mv /etc/modprobe.d/disable-nvidia.conf.disable /etc/modprobe.d/disable-nvidia.conf. All we have to do is open a terminal and perform the opposite renaming:

sudo mv /etc/modprobe.d/disable-nvidia.conf /etc/modprobe.d/disable-nvidia.conf.disable

Restart the computer and check whether the Nvidia card is actually loaded; if not, try to disable it by using the script and hopefully it will work! I did not dig deeper into this scenario, so any tips are welcome :D

@s41m0n
Author

s41m0n commented Oct 27, 2020

Hi advery,
are you sure your PCI is 0000:02:00.0?

@thebabush

I had Cannot access secondary GPU - error: Could not load GPU driver Aborting because fallback start is disabled, but then used prime-select nvidia and it worked.
But then after doing prime-select intel + disableGPU.sh the nvidia module is still loaded :/

@thebabush

thebabush commented Nov 8, 2020

Ok, so in enableGPU.sh I have to add prime-select nvidia.
In disableGPU.sh I have to add prime-select nvidia and run sudo systemctl stop nvidia-persistenced.service before removing the drivers.

Seems to be working now :)

EDIT: And now I'm in this weird state in which if I run optirun glxgears sometimes it brings the dGPU back on, sometimes not. But whatever good enough, as disableGpu.sh now can kill it.
