@DavidAce · Last active November 16, 2024
Nvidia power limit at boot

The original inspiration for this gist can be found here.

Set Nvidia power limit after boot

These instructions create a systemd service that runs shortly after boot and automatically invokes nvidia-smi to set the power limit.

Motivation

Normally, any power limit you set is reset on reboot.

Reducing the power limit to ~80-90% of the default may increase longevity in cases where the GPU is expected to run 24/7 at 100% load, for instance in scientific computing or mining. On the other hand, an increased power limit may improve performance and yield better overclocking results.

Check your current power settings

First, check your current power settings with

>  sudo nvidia-smi -q -d POWER

Example output on an RTX 2080 Ti:

==============NVSMI LOG==============

Timestamp                                 : Sun Nov  7 17:02:10 2021
Driver Version                            : 495.44
CUDA Version                              : 11.5

Attached GPUs                             : 1
GPU 00000000:41:00.0
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 291.63 W
        Power Limit                       : 300.00 W
        Default Power Limit               : 300.00 W
        Enforced Power Limit              : 300.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 366.00 W
    Power Samples
        Duration                          : 2.40 sec
        Number of Samples                 : 119
        Max                               : 313.22 W
        Min                               : 284.31 W
        Avg                               : 296.45 W

The field of interest is Power Limit, which will be reduced from 300 W to 275 W below.
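
To try a new limit right away (it will not survive a reboot), you can apply it manually. A minimal sketch using the 275 W target from this guide:

> sudo nvidia-smi -pm 1     # enable persistence mode
> sudo nvidia-smi -pl 275   # set the power limit to 275 W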

Set up a service for systemd

  • Save the files nvidia-tdp.service and nvidia-tdp.timer below to /etc/systemd/system
    • -pm 1 enables persistence mode
    • Edit -pl 275 to whatever is appropriate for your GPU
  • Run sudo systemctl daemon-reload to make the service available to systemd (rerun this step after editing services)
  • Run sudo systemctl enable --now nvidia-tdp.timer to enable and start the new timer. The command written to ExecStart in nvidia-tdp.service will then run shortly after every boot.
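
To confirm that the timer is registered and that the service ran without errors (assuming the unit names used above), the standard systemd checks apply:

> systemctl list-timers nvidia-tdp.timer
> systemctl status nvidia-tdp.service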

Example output after reboot:

> sudo nvidia-smi -q -d POWER

==============NVSMI LOG==============

Timestamp                                 : Sun Nov  7 17:21:16 2021
Driver Version                            : 495.44
CUDA Version                              : 11.5

Attached GPUs                             : 1
GPU 00000000:41:00.0
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 276.79 W
        Power Limit                       : 275.00 W
        Default Power Limit               : 300.00 W
        Enforced Power Limit              : 275.00 W
        Min Power Limit                   : 100.00 W
        Max Power Limit                   : 366.00 W
    Power Samples
        Duration                          : 2.40 sec
        Number of Samples                 : 119
        Max                               : 283.82 W
        Min                               : 237.65 W
        Avg                               : 273.43 W

Multiple GPUs

The procedure above sets the same power limit for all NVIDIA GPUs on your system. Use the flag -i 0, -i 1, and so on when calling nvidia-smi to select GPU 0, GPU 1, etc. For instance:

ExecStart=/usr/bin/nvidia-smi -pm 1 && /usr/bin/nvidia-smi -pl 275 -i 0  && /usr/bin/nvidia-smi -pl 250 -i 1 

would set the first GPU to 275 W and the second to 250 W.
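
Note that plain ExecStart= lines are not interpreted by a shell, so the && chaining above may not work on your system (see the comments below). A minimal sketch of the same per-GPU limits using separate directives instead, which a Type=oneshot service runs in order:

[Service]
Type=oneshot
ExecStartPre=/usr/bin/nvidia-smi -pm 1
ExecStart=/usr/bin/nvidia-smi -i 0 -pl 275
ExecStart=/usr/bin/nvidia-smi -i 1 -pl 250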

nvidia-tdp.service

[Unit]
Description=Set NVIDIA power limit above default
[Service]
Type=oneshot
ExecStartPre=/usr/bin/nvidia-smi -pm 1
ExecStart=/usr/bin/nvidia-smi -pl 275

nvidia-tdp.timer

[Unit]
Description=Set power limit 5 seconds after boot
[Timer]
OnBootSec=5
[Install]
WantedBy=timers.target
@MarvinTeichmann commented Dec 12, 2022

Thanks for this great gist. The section about multiple GPUs does not work. On the two systems I tested (Ubuntu 18.04 & 22.04 LTS), ExecStart does not accept the && syntax. However, it is possible to use the following command to set the power limit of GPUs 1, 2 and 3 to 220 W:

/usr/bin/nvidia-smi -i 1,2,3 -pl 220

It has the downside, however, that you have to use the same power limit for all GPUs.

@MarvinTeichmann

Also, as a hint, the following command can be helpful for debugging any issues in the setup:

systemctl status nvidia-tdp.service

@MarkRose commented Feb 10, 2023

@MarvinTeichmann You can have multiple ExecStartPre= lines in a systemd service file if you need to configure your GPUs independently. You can also leave out the device selection if you want to apply the same setting to all devices. I don't believe && works unless you shell out, e.g. ExecStartPre=/bin/bash -c 'command1 && command2'.

@DavidAce Thanks for these!

@gabr1elt

There is another (slightly older but still works fine) way available here:

https://www.pugetsystems.com/labs/hpc/quad-rtx3090-gpu-power-limiting-with-systemd-and-nvidia-smi-1983/

FYI: on Debian 12 at least, persistence is handled by a daemon, so it's not needed here.
