@bmcbm
Last active September 18, 2024 17:09
NVIDIA Suspend fix
# Use systemd for managing NVIDIA driver suspend in drivers ====>>> PRIOR to version 470 <<<=====
# https://download.nvidia.com/XFree86/Linux-x86_64/450.66/README/powermanagement.html
# https://forums.developer.nvidia.com/t/unable-to-set-nvidia-kernel-module-parameters/161306
# Please note: In Fedora Linux you may need to just install the xorg-x11-drv-nvidia-power package
# as suggested by @goombah88 in the comments below.
TMP_PATH=/var/tmp
TMPL_PATH=/usr/share/doc/nvidia-driver-460/
echo "options nvidia NVreg_PreserveVideoMemoryAllocations=1 NVreg_TemporaryFilePath=${TMP_PATH}" | sudo tee /etc/modprobe.d/nvidia-power-management.conf
sudo install --mode 644 "${TMPL_PATH}/nvidia-suspend.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/nvidia-hibernate.service" /etc/systemd/system
sudo install --mode 644 "${TMPL_PATH}/nvidia-resume.service" /etc/systemd/system
sudo install "${TMPL_PATH}/nvidia" /lib/systemd/system-sleep
sudo install "${TMPL_PATH}/nvidia-sleep.sh" /usr/bin
sudo systemctl enable nvidia-suspend.service
sudo systemctl enable nvidia-hibernate.service
sudo systemctl enable nvidia-resume.service
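
Since the steps above only apply to drivers before 470, a guard on the driver's major version may help avoid running them by accident. The following is a minimal sketch and not part of the original gist: the function names `driver_major` and `needs_manual_setup` are hypothetical, and it assumes the version string has the usual `MAJOR.MINOR[.PATCH]` form, for example as printed by `modinfo -F version nvidia`.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original gist.

# Print the major part of a driver version string such as "460.91.03".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed (exit status 0) only when the major version is below 470,
# i.e. when the manual systemd setup above is meant to apply.
needs_manual_setup() {
    [ "$(driver_major "$1")" -lt 470 ]
}

# Example use (assumes the nvidia module is installed so modinfo can find it):
#   version=$(modinfo -F version nvidia)
#   if needs_manual_setup "$version"; then echo "manual setup applies"; fi
```

Keeping the check in a function means the install commands can be wrapped in a single `if` without touching the rest of the script.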
@bmcbm
Author

bmcbm commented Nov 30, 2021

@khteh You may try to use ubuntu-drivers to install the nvidia driver:

$ sudo ubuntu-drivers install nvidia:495

@khteh

khteh commented Dec 1, 2021

It doesn't do/help anything at all since I have installed using apt

@ajayyy

ajayyy commented Dec 19, 2021

Still working for me on latest Manjaro KDE in a fresh install

@sarka9000

Thanks, this fixed my suspend problem which probably came as I was messing with drivers and Steam VR (I think nvidia-dkms-470 emptied those service files?).

For me, pm-suspend worked fine but xfce4-session-logout --suspend and systemctl suspend only took my network down for a few seconds without suspending

@Trikenstein

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

@apienk Thanks. That is the solution that worked in my case (Ubuntu 20.04.3). I had installed nvidia driver v470, then switched to a Radeon card. On a routine system update, all the nvidia packages were removed. But somehow, systemd-hibernate and systemd-suspend still invoked the nvidia services. journalctl -rb -1 shows

Jan 15 11:11:00 mycomputer-abc systemd-logind[1437]: Error during inhibitor-delayed operation (already returned success to client): Unit nvidia-suspend.service is masked.

Strangely enough, this is enough to hang the system. Removing all the orphan nvidia services as you showed fixed the issue.

@cnske

cnske commented Feb 8, 2022

@bmcbm I'm on Debian 11 stable running the 470.94 version of the driver. My system sometimes resumes fine from hibernation and sometimes does not, and I assume that the setting discussed here may be the cause.

I tried to follow your description from this comment, but no files are located there. They are also not located in /usr/share/doc/nvidia-driver on my system. I wonder if there is another place where I could look for the files.

@bmcbm
Author

bmcbm commented Mar 14, 2022

The steps above were required for Nvidia Linux drivers prior to 470. The 470 driver improved suspend/resume, but I found it was still not very stable.

In my experience suspend/resume seems to have stabilized since driver version 495, and the steps mentioned above are no longer necessary (and may actually break suspend/resume).

CC @cnske

@392781

392781 commented Apr 20, 2022

@bmcbm This is simply not true. I'm on Linux Mint 20.3 using Nvidia drivers 510 and the only way to make sure that I can wake from a suspend is to disable /usr/bin/nvidia-sleep.sh by setting exit 0 at the top of the file.
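
The workaround described here can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `disable_sleep_script` is hypothetical, the path `/usr/bin/nvidia-sleep.sh` is the one named in the comment, and the effect is that NVIDIA's own suspend/resume hook is skipped entirely. A driver update may restore the original file, in which case the edit would have to be repeated.

```shell
#!/bin/sh
# Hypothetical helper illustrating the "exit 0 at the top" workaround.

# Insert "exit 0" right after the first line (the shebang) of the given
# script, keeping a .bak copy. Does nothing if it is already there.
disable_sleep_script() {
    script=$1
    # Skip if the second line is already "exit 0".
    if [ "$(sed -n '2p' "$script")" = "exit 0" ]; then
        return 0
    fi
    cp -p "$script" "$script.bak" || return 1
    # Write through the existing file so its mode and owner are preserved.
    awk 'NR == 1 { print; print "exit 0"; next } { print }' "$script.bak" > "$script"
}

# Example use (path taken from the comment above; needs root):
#   disable_sleep_script /usr/bin/nvidia-sleep.sh
```

The `.bak` copy makes it possible to restore the original script with a plain `cp` if the workaround does not help.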

@cnske

cnske commented Apr 20, 2022

@bmcbm I have to agree it is not very stable and still sometimes can not resume from sleep or hibernation.
@392781 would love to add this line, but this file does not exist on my system. I'm running version 470.103.01 of the driver.

@bmcbm
Author

bmcbm commented Apr 20, 2022

@392781 - you are aware that you should NOT use above scripts with the 510 driver, right? As stated above that may actually break suspend/resume as you seem to experience.

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.

@cnske

cnske commented Apr 20, 2022

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.
Yep I did. Since I'm on Debian stable it is the most recent version I can get

Edit: just thought I should mention this. Waking up from sleep or returning from hibernation, works in ~90% of the cases, just in 10% it goes horribly wrong. Currently I'm on X11, even though the default would be Wayland on Debian 11. However, the installation of the NVidia driver changed this so I assumed it may be the more stable option.
If someone has different experience with using Wayland I'm happy to hear, since I would actually prefer Wayland over X11.

@392781

392781 commented Apr 20, 2022

@392781 - you are aware that you should NOT use above scripts with the 510 driver, right? As stated above that may actually break suspend/resume as you seem to experience.

@cnske Did you see my comment from 14 Mar above? Maybe try to update to a more recent driver.

I'm not. Half of the files mentioned don't exist. I just know that with each update, I need to go and add exit 0 to that file to get suspend/hibernate to work again properly. I was just replying to you stating that a fix is no longer needed to get it to work... which isn't true.

@bmcbm
Author

bmcbm commented Apr 21, 2022

@392781 OK. Thanks. Odd that you need to do that. I do not have to change that file for sleep/resume to work on my system (Ubuntu 20.04, nvidia driver v510). But whatever works. Maybe others need this tip as well.

@wiesenklee

wiesenklee commented May 5, 2022

Hey there, for nvidia-340 I cannot find the required services in the corresponding directory /usr/share/doc/nvidia-340. Is there a workaround so I can use the services from other nvidia drivers? Any ideas?

@bmcbm
Author

bmcbm commented May 6, 2022

@Han-nes The 340 series dates back to 2014. Sorry, can't help you there.

@mike-lawrence

So, for those of us now on 510, but were previously on 470, does the procedure for getting suspend to work start with deleting the 470 files via

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

and then running the code of your gist (replacing nvidia-driver-460 with nvidia-driver-510), or is merely deleting the files sufficient?

@bmcbm
Author

bmcbm commented May 17, 2022

@mike-lawrence The instructions in the gist above are for drivers PRIOR to v470, as stated in the comment at the top.

You should not use it for v470 and later drivers. Since v470 all required scripts are installed and updated automatically (as far as I am aware)

@7h3ju57

7h3ju57 commented May 17, 2022

@392781 wow, adding exit 0 to nvidia-sleep.sh fixed my sleep issues!
Curious if it's my old GTX970 causing issues.
Running NVIDIA510 on PopOS 22.04

@bmharper

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend.
The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths.
But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.
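
Because the exact file names are not recorded above, a dry run that only lists candidates may be a safer first step. The sketch below is an assumption-laden illustration: the function name `list_nvidia_sleep_units` is hypothetical, and it assumes the files in question are named after nvidia-suspend, nvidia-hibernate and nvidia-resume, as in the gist.

```shell
#!/bin/sh
# Hypothetical dry-run helper: only lists candidates, deletes nothing.

# Print every file or symlink under the given directory whose name starts
# with nvidia-suspend, nvidia-hibernate or nvidia-resume. Units such as
# nvidia-powerd.service and nvidia-persistenced.service never match.
list_nvidia_sleep_units() {
    find "$1" \( -name 'nvidia-suspend*' \
              -o -name 'nvidia-hibernate*' \
              -o -name 'nvidia-resume*' \) -print | sort
}

# Example use: review the list before deleting anything, then reload systemd.
#   list_nvidia_sleep_units /etc/systemd
#   sudo systemctl daemon-reload
```

Matching on explicit name prefixes rather than a broad `nv*` pattern keeps the two services that must stay out of the list.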

@Akib-Alvee

@apienk Thank you very much. I was a bit upset as I couldn't find any solution. Then your procedure worked perfectly for me.

@marani

marani commented Jun 13, 2022

After installing the latest nvidia driver and CUDA toolkit I was unable to suspend or hibernate with sudo systemctl suspend or sudo systemctl hibernate as usual. The log shows this:

> journalctl -b | grep suspend
... systemd-logind[1841]: Error during inhibitor-delayed operation (already returned success to client): Unit nvidia-suspend.service not found.
...

Which is correct because

> systemctl status nvidia-suspend nvidia-hibernate nvidia-resume
Unit nvidia-suspend.service could not be found.
Unit nvidia-hibernate.service could not be found.
Unit nvidia-resume.service could not be found.

My nvidia libs version

> cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  515.48.07  Fri May 27 03:26:43 UTC 2022
GCC version:  gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0

My /usr/share/doc/nvidia-driver-515 looks pretty empty

> ls /usr/share/doc/nvidia-driver-515/
LICENSE.gz                        changelog.Debian.gz               nvidia-persistenced-init.tar.bz2
NVIDIA_Changelog.gz               copyright
README.txt.gz                     html/

find /usr/share/doc/ -print | grep "suspend" shows no result

So does anyone know what TMPL_PATH should be for the latest driver 515.48.07? Or is my installation broken?

Some findings so far:

  • This comment above says the nvidia-suspend.service files were no longer needed and should be deleted, driver version was 495.
  • However, one guy in this post somehow found the old service config and script file and then put it in the correct location, thus resolved the issue, driver version was also 495.
  • Same issue was investigated here. The OP's resolution was to set NVreg_PreserveVideoMemoryAllocations=0, driver version 470.
  • Some relevant links
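
The third finding refers to NVreg_PreserveVideoMemoryAllocations, the option the gist writes to /etc/modprobe.d/nvidia-power-management.conf. To see which value a given modprobe.d file sets, something like the following sketch could be used. The function name `preserve_video_memory_setting` is hypothetical, and the helper reads only the one file it is given; other files under /etc/modprobe.d may set the option as well.

```shell
#!/bin/sh
# Hypothetical helper: report the NVreg_PreserveVideoMemoryAllocations value
# configured in a modprobe.d file. Prints nothing if the option is absent.
preserve_video_memory_setting() {
    awk '
        # Only look at lines of the form "options nvidia KEY=VALUE ...".
        $1 == "options" && $2 == "nvidia" {
            for (i = 3; i <= NF; i++) {
                if (index($i, "NVreg_PreserveVideoMemoryAllocations=") == 1) {
                    split($i, kv, "=")
                    value = kv[2]   # the last occurrence in the file wins
                }
            }
        }
        END { if (value != "") print value }
    ' "$1"
}

# Example use (file name taken from the gist above):
#   preserve_video_memory_setting /etc/modprobe.d/nvidia-power-management.conf
```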

@bmcbm
Author

bmcbm commented Jun 13, 2022

@marani I also experienced issues with hibernate after installing Nvidia CUDA drivers.

See the comment by @bmharper above.

I found that there were some broken links to the missing systemd service units for suspend and hibernate. Check if you have such broken links and remove them if you do.

find /etc/systemd -type l -exec file {} \; | grep broken | grep nvidia
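
The pipeline above depends on the wording of the `file` output. A variant that tests the link targets directly might look like the sketch below; the function name `list_broken_nvidia_links` is hypothetical, and the helper only prints paths so they can be reviewed before anything is removed.

```shell
#!/bin/sh
# Hypothetical helper: list symlinks under a directory whose name contains
# "nvidia" and whose target no longer exists. It only prints; nothing is removed.
list_broken_nvidia_links() {
    # "test -e" follows the link, so it fails for dangling links.
    find "$1" -type l -name '*nvidia*' ! -exec test -e {} \; -print | sort
}

# Example use:
#   list_broken_nvidia_links /etc/systemd
```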

@marani

marani commented Jun 13, 2022

Thanks for the suggestion, I tried deleting and it worked.

However, according to the v515 docs, the nvidia systemd config files are still used; the documentation reads as if the files are supposed to be there (unless they forgot to update the docs). Looking at the shell script in this post, it seems to call the nvidia driver's suspend interface, so would removing the services remove this behavior?

/usr/bin/nvidia-sleep.sh

#!/bin/bash

if [ ! -f /proc/driver/nvidia/suspend ]; then
    exit 0
fi

RUN_DIR="/var/run/nvidia-sleep"
XORG_VT_FILE="${RUN_DIR}"/Xorg.vt_number

PATH="/bin:/usr/bin"

case "$1" in
    suspend|hibernate)
        mkdir -p "${RUN_DIR}"
        fgconsole > "${XORG_VT_FILE}"
        chvt 63
        if [[ $? -ne 0 ]]; then
            exit $?
        fi
        echo "$1" > /proc/driver/nvidia/suspend
        exit $?
        ;;
    resume)
        echo "$1" > /proc/driver/nvidia/suspend 
        #
        # Check if Xorg was determined to be running at the time
        # of suspend, and whether its VT was recorded.  If so,
        # attempt to switch back to this VT.
        #
        if [[ -f "${XORG_VT_FILE}" ]]; then
            XORG_PID=$(cat "${XORG_VT_FILE}")
            rm "${XORG_VT_FILE}"
            chvt "${XORG_PID}"
        fi
        exit 0
        ;;
    *)
        exit 1
esac
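
One detail in the script as quoted: in the suspend|hibernate branch, `exit $?` runs after the `[[ $? -ne 0 ]]` test has itself succeeded, so at that point `$?` is 0 and a failed `chvt` would end the script with status 0. The sketch below only illustrates that shell behavior and is not NVIDIA's code: the function names are hypothetical, and it uses the POSIX `[ ]` test in place of `[[ ]]`, which behaves the same way in this respect.

```shell
#!/bin/sh
# Illustrative only; these hypothetical functions are not part of nvidia-sleep.sh.

# Mirrors the pattern in the quoted script: by the time "return $?" runs,
# $? holds the status of the "[ ... ]" test (0), not of the command.
guard_as_written() {
    "$@"
    if [ $? -ne 0 ]; then
        return $?
    fi
    return 0
}

# Saves the status first, so a failure is passed on to the caller.
guard_with_saved_status() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        return "$status"
    fi
    return 0
}

# Example use:
#   guard_as_written false;        echo $?   # prints 0
#   guard_with_saved_status false; echo $?   # prints 1
```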

@rdominguez89

sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-hibernate.service.requires/nvidia-hibernate.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-resume.service
sudo rm /etc/systemd/system/systemd-suspend.service.requires/nvidia-suspend.service

This works perfectly

@AugustineYang

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend. The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths. But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.

This works! Thanks a lot!

@albertomercurio

I'm on Ubuntu 22.04, and after installing CUDA I was unable to suspend. The following steps worked for me:

  • Inside /etc/systemd, delete all of the files that include nvidia and suspend or hibernate. There are three services inside /etc/systemd/system, which I deleted. There are also some other dead nvidia links inside systemd-hibernate.service.requires, systemd-suspend.service.requires, which I deleted.

I wish I knew the exact names of the files that I deleted, but since they're now gone, I can't remember their exact paths. But in a nutshell, go into /etc/systemd, and do find . -iname 'nv*', and then delete all of the suspend, resume, and hibernate scripts.

DO NOT DELETE nvidia-powerd.service and nvidia-persistenced.service.

Once you're done with that, do systemctl daemon-reload.

This didn't work for me.
I removed all the nvidia files you mentioned. Indeed, if I run systemctl list-unit-files | grep nvidia it returns

nvidia-persistenced.service                                               enabled         enabled
nvidia-powerd.service                                                     enabled         enabled

and with find /etc/systemd -iname 'nv*'

/etc/systemd/system/multi-user.target.wants/nvidia-persistenced.service
/etc/systemd/system/multi-user.target.wants/nvidia-powerd.service

but I'm still not able to resume the laptop.

My nvidia-smi is

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     2W /  N/A |    634MiB /  4096MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2112      G   /usr/lib/xorg/Xorg                216MiB |
|    0   N/A  N/A      2468      G   /usr/bin/gnome-shell               50MiB |
|    0   N/A  N/A      3944      G   ...589584913286235972,131072      366MiB |
+-----------------------------------------------------------------------------+

@fxmarty

fxmarty commented Sep 19, 2022

@bmharper Your solution works perfectly for me (driver 515)!

@mikeshiyan

Thanks @bmharper !!! Worked for me too, in the Nvidia On-Demand mode. I haven't tried other modes yet, but I'm fine with this one.

My story: Linux Mint 20.3 Cinnamon, kernel 5.4.0-126. I recently installed the CUDA toolkit, which updated my nvidia driver from 470 (if I'm not mistaken) to 515, which in turn broke auto-suspend on lid close. The Quit->Suspend button didn't work either. The only way I could suspend was by running pm-suspend from a terminal.

@yingtanairbussv

@bmharper That works for alienware laptop with nvidia driver 515

@yingtanairbussv

It works. The power drain I saw before is not related to the nvidia graphics card.

I put the alienware laptop into airplane mode and pulled out the USB dongle before suspending it, then left it in the backpack for the whole night. The battery level only dropped by 9%, which is normal, and there was no extra heat in the backpack.
