@bmcbm
Last active March 18, 2023 23:33
NVIDIA Suspend fix
# Use systemd for managing NVIDIA driver suspend in drivers ====>>> PRIOR to version 470 <<<=====
# https://download.nvidia.com/XFree86/Linux-x86_64/450.66/README/powermanagement.html
# https://forums.developer.nvidia.com/t/unable-to-set-nvidia-kernel-module-parameters/161306
# Please note: On Fedora Linux you may only need to install the xorg-x11-drv-nvidia-power package,
# as suggested by @goombah88 in the comments below.
TMP_PATH=/var/tmp
TMPL_PATH=/usr/share/doc/nvidia-driver-460/
echo "options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=${TMP_PATH}" | sudo tee /etc/modprobe.d/nvidia-power-management.conf
sudo install --mode 644 "${TMPL_PATH}/nvidia-suspend.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/nvidia-hibernate.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/nvidia-resume.service" /etc/systemd/system
sudo install "${TMPL_PATH}/nvidia" /lib/systemd/system-sleep
sudo install "${TMPL_PATH}/nvidia-sleep.sh" /usr/bin
sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
sudo systemctl enable nvidia-resume.service
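
To check that the module options written above were actually picked up after a reboot, a small helper like the following can read them back. This is a minimal sketch: the function name nvreg_param is hypothetical, and it assumes the driver exposes its parameters as "Name: value" lines in /proc/driver/nvidia/params (without the NVreg_ prefix), which may differ between driver versions.

```shell
#!/bin/sh
# Print the value of one NVIDIA module parameter.
# Assumption: the params file contains "Name: value" lines.
nvreg_param() {
    # $1 = parameter name without the NVreg_ prefix
    # $2 = params file (optional, defaults to the live driver's file)
    file="${2:-/proc/driver/nvidia/params}"
    # Exit status is non-zero when the parameter is not listed.
    awk -F': ' -v key="$1" \
        '$1 == key { print $2; found = 1 } END { exit !found }' "$file"
}
```

For example, nvreg_param PreserveVideoMemoryAllocations should print 1 if the option from /etc/modprobe.d/nvidia-power-management.conf is in effect.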
@apienk

apienk commented Nov 25, 2021

For me, with nvidia-driver-495, the simple solution was to remove the damaged symlinks from systemd. You most likely have them if you upgraded from nvidia-driver-470, because 470 still included the .service files in /lib/systemd/system/. The files are no longer included in 495 but the postinst script does not remove the symlinks. So, remove them with:

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

Works instantly, no need of rebooting or logging off.
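
A slightly more cautious variant of the same cleanup is sketched below: it removes each of the four links only if it is a symlink whose target no longer exists, so a working setup is left alone. The function name and the optional directory argument are hypothetical additions for illustration; the four paths are the ones listed above.

```shell
#!/bin/sh
# Remove the four nvidia links only when they are dangling symlinks.
# $1 = systemd directory (optional, defaults to /etc/systemd/system);
# pass a copy of the directory to try it out without touching the system.
remove_dangling_nvidia_links() {
    root="${1:-/etc/systemd/system}"
    for link in \
        systemd-hibernate.service.requires/nvidia-resume.service \
        systemd-hibernate.service.requires/nvidia-hibernate.service \
        systemd-suspend.service.requires/nvidia-resume.service \
        systemd-suspend.service.requires/nvidia-suspend.service
    do
        path="$root/$link"
        # -L: path is a symlink; ! -e: the link target does not exist
        if [ -L "$path" ] && [ ! -e "$path" ]; then
            echo "removing dangling link: $path"
            rm -- "$path"
        fi
    done
}
```

On a real system the function has to run as root, since the links live under /etc/systemd/system.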

@khteh

khteh commented Nov 25, 2021 via email

@apienk

apienk commented Nov 25, 2021

@khteh Here are detailed guides for all major distributions: https://linuxconfig.org/install-the-latest-nvidia-linux-driver

@khteh

khteh commented Nov 26, 2021

I ran sudo apt install -y nvidia-driver-495 and rebooted the laptop, but ubuntu-drivers devices still shows nvidia-driver-470 as selected. What am I missing?

@apienk

apienk commented Nov 27, 2021

@khteh nvidia-driver-470 must have been automatically uninstalled when you installed nvidia-driver-495, because apt treats them as conflicting. For me, ubuntu-drivers devices does not show the selected driver, only the recommended one. A better way to check which driver is being used:

if using dual-GPU in On-Demand mode:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxinfo | grep "OpenGL version string"

if using just NVIDIA:

glxinfo | grep "OpenGL version string"

@khteh

khteh commented Nov 30, 2021

But why is it that when I apt install nvidia-driver-495, ubuntu-drivers devices still shows 470 as the recommended one, and nvidia-smi and nvidia-settings do NOT work properly?

$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

@devent

devent commented Nov 30, 2021

You'd better ask the Ubuntu devs. This is what I get. I'm using Mint, which is based on Ubuntu.

cat /etc/os-release 
NAME="Linux Mint"
VERSION="20.2 (Uma)"

ubuntu-drivers devices
WARNING:root:_pkg_get_support nvidia-driver-390: package has invalid Support Legacyheader, cannot determine support level
== /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001C8Dsv00001043sd00001B30bc03sc02i00
vendor   : NVIDIA Corporation
model    : GP107M [GeForce GTX 1050 Mobile]
driver   : nvidia-driver-495 - third-party non-free recommended
driver   : nvidia-driver-465 - third-party non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-460-server - distro non-free
driver   : nvidia-driver-455 - third-party non-free
driver   : nvidia-driver-418-server - distro non-free
driver   : nvidia-driver-460 - third-party non-free
driver   : nvidia-driver-450 - third-party non-free
driver   : nvidia-driver-470 - third-party non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

@bmcbm
Author

bmcbm commented Nov 30, 2021

@khteh You may try to use ubuntu-drivers to install the nvidia driver:

$ sudo ubuntu-drivers install nvidia:495

@khteh

khteh commented Dec 1, 2021

It doesn't do or help anything at all, since I have already installed the driver using apt.

@ajayyy

ajayyy commented Dec 19, 2021

Still working for me on latest Manjaro KDE in a fresh install

@sarka9000

Thanks, this fixed my suspend problem which probably came as I was messing with drivers and Steam VR (I think nvidia-dkms-470 emptied those service files?).

For me, pm-suspend worked fine, but xfce4-session-logout --suspend and systemctl suspend only took my network down for a few seconds without suspending.

@Trikenstein

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

@apienk Thanks, that is the solution that worked in my case (Ubuntu 20.04.3). I had installed nvidia driver v470, then switched to a Radeon card. On a routine system update, all the nvidia packages were removed. But somehow, systemd-hibernate and systemd-suspend still invoked the nvidia services. journalctl -rb -1 shows:

Jan 15 11:11:00 mycomputer-abc systemd-logind[1437]: Error during inhibitor-delayed operation (already returned success to client): Unit nvidia-suspend.service is masked.

Strangely enough, this is enough to hang the system. Removing all the orphan nvidia services as you showed fixed the issue.

@cnske

cnske commented Feb 8, 2022

@bmcbm I'm on Debian 11 stable running the 470.94 version of the driver. My system sometimes resumes from hibernation fine and sometimes does not, and I assume the setting discussed here may be the cause.

I tried to follow your description from this comment, but no files are located there. They are also not located in /usr/share/doc/nvidia-driver on my system. I wonder if there is another place where I could look for the files.

@bmcbm
Author

bmcbm commented Mar 14, 2022

The above steps were required for Nvidia Linux drivers prior to 470. The 470 driver improved suspend/resume, but I found it was still not very stable.

In my experience suspend/resume seems to have stabilized since driver version 495, and the steps mentioned above are no longer necessary (and may actually break suspend/resume).

CC @cnske

@392781

392781 commented Apr 20, 2022

@bmcbm This is simply not true. I'm on Linux Mint 20.3 using Nvidia drivers 510 and the only way to make sure that I can wake from a suspend is to disable /usr/bin/nvidia-sleep.sh by setting exit 0 at the top of the file.

@cnske

cnske commented Apr 20, 2022

@bmcbm I have to agree it is not very stable, and it still sometimes cannot resume from sleep or hibernation.
@392781 would love to add this line, but this file does not exist on my system. I'm running version 470.103.01 of the driver.

@bmcbm
Author

bmcbm commented Apr 20, 2022

@392781 - you are aware that you should NOT use above scripts with the 510 driver, right? As stated above that may actually break suspend/resume as you seem to experience.

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.

@cnske

cnske commented Apr 20, 2022

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.
Yep I did. Since I'm on Debian stable it is the most recent version I can get

Edit: just thought I should mention this. Waking up from sleep or returning from hibernation, works in ~90% of the cases, just in 10% it goes horribly wrong. Currently I'm on X11, even though the default would be Wayland on Debian 11. However, the installation of the NVidia driver changed this so I assumed it may be the more stable option.
If someone has different experience with using Wayland I'm happy to hear, since I would actually prefer Wayland over X11.

@392781

392781 commented Apr 20, 2022

@392781 - you are aware that you should NOT use above scripts with the 510 driver, right? As stated above that may actually break suspend/resume as you seem to experience.

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.

I'm not. Half of the files mentioned don't exist. I just know that with each update, I need to go and add exit 0 to that file to get suspend/hibernate to work again properly. I was just replying to you stating that a fix is no longer needed to get it to work... which isn't true.

@bmcbm
Author

bmcbm commented Apr 21, 2022

@392781 OK. Thanks. Odd that you need to do that. I do not have to change that file for sleep/resume to work on my system (Ubuntu 20.04, nvidia driver v510). But whatever works. Maybe others need this tip as well.

@han-nes

han-nes commented May 5, 2022

Hey there, for nvidia-340 I cannot find the required services in the corresponding directory /usr/share/doc/nvidia-340. Is there a workaround so I can use the services from other nvidia drivers? Any ideas?

@bmcbm
Author

bmcbm commented May 6, 2022

@han-nes The 340 series dates back to 2014. Sorry, can't help you there.

@mike-lawrence

So, for those of us now on 510, but were previously on 470, does the procedure for getting suspend to work start with deleting the 470 files via

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

and then running the code of your gist (replacing nvidia-driver-460 with nvidia-driver-510), or is merely deleting the files sufficient?

@bmcbm
Author

bmcbm commented May 17, 2022

@mike-lawrence The instructions in the gist above are for drivers PRIOR to v470, as stated in the comment at the top.

You should not use them for v470 and later drivers. Since v470 all required scripts are installed and updated automatically (as far as I am aware).
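
To make that version cut-off explicit before running the gist, a guard along these lines could be used. It is only a sketch: both function names are hypothetical, and it assumes the first line of /proc/driver/nvidia/version has the "NVRM version: ... Kernel Module  515.48.07 ..." layout quoted elsewhere in this thread (other driver builds may format that line differently, in which case the guard simply reports that the gist does not apply).

```shell
#!/bin/sh
# Print the major version of the loaded NVIDIA kernel module, e.g. 515.
# $1 = version file (optional, defaults to the live driver's file)
nvidia_major_version() {
    file="${1:-/proc/driver/nvidia/version}"
    # Keep only the digits before the first dot of the version number.
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9]*\)\..*/\1/p' "$file"
}

# Succeed only when a version was found and it is older than 470.
gist_applies() {
    major=$(nvidia_major_version "$1")
    [ -n "$major" ] && [ "$major" -lt 470 ]
}
```

A possible use is: gist_applies || echo "driver is 470 or newer, skipping the gist steps".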

@7h3ju57

7h3ju57 commented May 17, 2022

@392781 wow, adding exit 0 to nvidia-sleep.sh fixed my sleep issues!
Curious if it's my old GTX970 causing issues.
Running NVIDIA510 on PopOS 22.04

@bmharper

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend.
The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths.
But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.
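
Since the exact file names were not recorded, a listing step like the one below can show the candidates before anything is deleted. This is a rough sketch under assumptions: the function name is hypothetical, and it assumes the leftover files are all named nvidia-* and contain suspend, hibernate, or resume in their names. It only prints paths and never deletes.

```shell
#!/bin/sh
# List nvidia suspend/hibernate/resume unit files and links under a
# systemd directory, leaving out the two services that must be kept.
# $1 = directory to search (optional, defaults to /etc/systemd)
list_nvidia_sleep_units() {
    root="${1:-/etc/systemd}"
    # Patterns are quoted so the shell does not expand them itself.
    find "$root" \( -type f -o -type l \) -name 'nvidia-*' \
        \( -name '*suspend*' -o -name '*hibernate*' -o -name '*resume*' \) \
        ! -name 'nvidia-powerd.service' \
        ! -name 'nvidia-persistenced.service'
}
```

After reviewing the printed list and removing the entries by hand, run systemctl daemon-reload as described above.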

@Akib-Alvee

@apienk Thank you very much. I was a bit upset as I hadn't found any solution. Then your procedure worked perfectly for me.

@marani

marani commented Jun 13, 2022

After installing the latest nvidia driver and CUDA toolkit, I was unable to suspend or hibernate with sudo systemctl suspend or sudo systemctl hibernate as usual. The log shows this:

> journalctl -b | grep suspend
... systemd-logind[1841]: Error during inhibitor-delayed operation (already returned success to client): Unit nvidia-suspend.service not found.
...

Which is correct because

> systemctl status nvidia-suspend nvidia-hibernate nvidia-resume
Unit nvidia-suspend.service could not be found.
Unit nvidia-hibernate.service could not be found.
Unit nvidia-resume.service could not be found.

My nvidia libs version

> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  515.48.07  Fri May 27 03:26:43 UTC 2022
GCC version:  gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

My /usr/share/doc/nvidia-driver-515 looks pretty empty

> ls /usr/share/doc/nvidia-driver-515/
LICENSE.gz                        changelog.Debian.gz               nvidia-persistenced-init.tar.bz2
NVIDIA_Changelog.gz               copyright
README.txt.gz                     html/

find /usr/share/doc/ -print | grep "suspend" shows no result

So does anyone know what TMPL_PATH should be for the latest driver 515.48.07? Or is my installation broken?

Some findings so far:

  • This comment above says the nvidia-suspend.service files were no longer needed and should be deleted, driver version was 495.
  • However, one guy in this post somehow found the old service config and script file and then put it in the correct location, thus resolved the issue, driver version was also 495.
  • Same issue was investigated here. The OP's resolution was to set NVreg_PreserveVideoMemoryAllocations=0, driver version 470.
  • Some relevant links

@bmcbm
Author

bmcbm commented Jun 13, 2022

@marani I also experienced issues with hibernate after installing Nvidia CUDA drivers.

See the comment by @bmharper above.

I found that there were some broken links to the missing systemd service units for suspend and hibernate. Check if you have such broken links and remove them if you do.

find /etc/systemd -type l -exec file {} \; | grep broken | grep nvidia

@marani

marani commented Jun 13, 2022

Thanks for the suggestion, I tried deleting and it worked.

However, according to the v515 docs, the nvidia systemd config files are still used; the documentation mentions them as if they are supposed to be there (unless they forgot to update the docs). Looking at the shell script in this post, it seems to trigger the nvidia driver's suspend, so would removing the files remove this behavior?

/usr/bin/nvidia-sleep.sh

#!/bin/bash

if [ ! -f /proc/driver/nvidia/suspend ]; then
    exit 0
fi

RUN_DIR="/var/run/nvidia-sleep"
XORG_VT_FILE="${RUN_DIR}"/Xorg.vt_number

PATH="/bin:/usr/bin"

case "$1" in
    suspend|hibernate)
        mkdir -p "${RUN_DIR}"
        fgconsole > "${XORG_VT_FILE}"
        chvt 63
        if [[ $? -ne 0 ]]; then
            exit $?
        fi
        echo "$1" > /proc/driver/nvidia/suspend
        exit $?
        ;;
    resume)
        echo "$1" > /proc/driver/nvidia/suspend 
        #
        # Check if Xorg was determined to be running at the time
        # of suspend, and whether its VT was recorded.  If so,
        # attempt to switch back to this VT.
        #
        if [[ -f "${XORG_VT_FILE}" ]]; then
            XORG_PID=$(cat "${XORG_VT_FILE}")
            rm "${XORG_VT_FILE}"
            chvt "${XORG_PID}"
        fi
        exit 0
        ;;
    *)
        exit 1
esac

@rdominguez89

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

This works perfectly

@AugustineYang

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend. The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths. But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.

This works! Thanks a lot!

@albertomercurio

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend. The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths. But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.

This didn't work for me.
I removed all the nvidia files you mentioned. Indeed, if I run systemctl list-unit-files | grep nvidia it returns

nvidia-persistenced.service                                               enabled         enabled
nvidia-powerd.service                                                     enabled         enabled

and find /etc/systemd -iname 'nv*' returns

/etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service
/etc/systemd/system/multi-user.target.wants/nvidia-powerd.service

but I'm still not able to resume the laptop.

My nvidia-smi is

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     2W /  N/A |    634MiB /  4096MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2112      G   /usr/lib/xorg/Xorg                216MiB |
|    0   N/A  N/A      2468      G   /usr/bin/gnome-shell               50MiB |
|    0   N/A  N/A      3944      G   ...589584913286235972,131072      366MiB |
+-----------------------------------------------------------------------------+

@fxmarty

fxmarty commented Sep 19, 2022

@bmharper Your solution works perfectly for me (driver 515)!

@mikeshiyan

Thanks @bmharper !!! Worked for me too, in the Nvidia On-Demand mode. I haven't tried other modes yet, but I'm fine with this one.

My story is: Linux Mint 20.3 Cinnamon, kernel 5.4.0-126. I recently installed the CUDA toolkit, which updated my nvidia driver from 470 (if I'm not mistaken) to 515, which in turn broke my auto-suspend on lid close. The Quit->Suspend button didn't work either. The only way I could suspend was by running pm-suspend from a terminal.

@yingtanairbussv

@bmharper That works for alienware laptop with nvidia driver 515

@yingtanairbussv

It works. The power drain I saw before was not related to the NVIDIA graphics card.

I put the Alienware laptop into airplane mode and pulled out the USB dongle before suspending, then left it in the backpack for the whole night. The battery level only dropped by 9%, which is normal, and there was no extra heat in the backpack.

@yingtanairbussv

BTW: I also updated the driver to version 520 and upgraded Ubuntu to 22.04.1 LTS. Works fine.

@sbwcwso

sbwcwso commented Oct 28, 2022

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

worked for me. Thank you very much.

@mtz29

mtz29 commented Jan 14, 2023

For me, with nvidia-driver-495, the simple solution was to remove the damaged symlinks from systemd. You most likely have them if you upgraded from nvidia-driver-470, because 470 still included the .service files in /lib/systemd/system/. The files are no longer included in 495 but the postinst script does not remove the symlinks. So, remove them with:

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

Works instantly, no need of rebooting or logging off.

This does the job. Thank you.

@kolubex

kolubex commented Feb 16, 2023

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

Thank you, This worked in my case.

@ermenkov

ermenkov commented Mar 8, 2023

The method works. In case you need fresh files to install the services, you can download the installer for your driver version from NVIDIA. Use nvidia-smi to get the exact version, and then run the .run file with the --extract-only flag: sudo bash NVIDIA-Linux-x86_64-515.65.01.run --extract-only.

Then run the given commands with TMPL_PATH set to the path where you extracted the files. Keep in mind that the files for this driver (515.65.01) are in slightly different locations than stated in the original post, so the updated commands should look like this:

TMPL_PATH=/home/user/Downloads/NVIDIA-Linux-x86_64-515.65.01/systemd

sudo install --mode 644 "${TMPL_PATH}/system/nvidia-suspend.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/system/nvidia-hibernate.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/system/nvidia-resume.service" /etc/systemd/system
sudo install "${TMPL_PATH}/system-sleep/nvidia" /lib/systemd/system-sleep
sudo install "${TMPL_PATH}/nvidia-sleep.sh" /usr/bin

sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
sudo systemctl enable nvidia-resume.service
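
Before running the install commands, it may help to confirm that the extracted tree really contains the five files at the locations used above, since the layout can change between driver versions. A minimal sketch follows; the function name is hypothetical and the file list is taken from the commands in this comment.

```shell
#!/bin/sh
# Check that the extracted systemd template directory has the files
# the install commands expect. Returns non-zero if any are missing.
# $1 = TMPL_PATH (the "systemd" directory of the extracted installer)
check_nvidia_templates() {
    tmpl="$1"
    missing=0
    for f in \
        system/nvidia-suspend.service \
        system/nvidia-hibernate.service \
        system/nvidia-resume.service \
        system-sleep/nvidia \
        nvidia-sleep.sh
    do
        if [ ! -f "$tmpl/$f" ]; then
            echo "missing: $tmpl/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example: check_nvidia_templates "${TMPL_PATH}" && echo "templates look complete".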
