A script to control Nvidia GPU fan speed on headless (non-X) linux nodes
#!/bin/bash
# cool_gpu2.sh: enable or disable a fixed GPU fan speed
#
# Description: A script to control GPU fan speed on headless (non-X) linux nodes
# Original Script by Axel Kohlmeyer <akohlmey@gmail.com>
# https://sites.google.com/site/akohlmey/random-hacks/nvidia-gpu-coolness
#
# Modified for newer drivers and removed old work-arounds
# Tested on Ubuntu 14.04 with driver 352.41
# Copyright 2015, squadbox
# Requirements:
# * An Nvidia GPU
# * Nvidia Driver V285 or later
# * xorg
# * Coolbits enabled and an empty initial configuration allowed:
#     nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration
# You may have to run this as root or with sudo if the current user is not authorized to start X sessions.
# Paths to the utilities we will need
SMI='/usr/bin/nvidia-smi'
SET='/usr/bin/nvidia-settings'
# Determine major driver version
VER=$(awk '/NVIDIA/ {print $8}' /proc/driver/nvidia/version | cut -d . -f 1)
# Drivers from 285.x.y on allow persistence mode setting
if [ "${VER}" -lt 285 ]
then
    echo "Error: Current driver version is ${VER}. Driver version must be 285 or later."; exit 1;
fi
# Read a numerical command line arg between 40 and 100
if [ "$1" -eq "$1" ] 2>/dev/null && [ "0$1" -ge "40" ] && [ "0$1" -le "100" ]
then
    $SMI -pm 1 # enable persistence mode
    speed=$1 # set speed
    echo "Setting fan to $speed%."
    # how many GPUs are in the system?
    NUMGPU="$($SMI -L | wc -l)"
    # loop through each GPU and individually set fan speed
    # (assumes one fan per GPU, so that fan n belongs to GPU n)
    n=0
    while [ "$n" -lt "$NUMGPU" ]
    do
        # start an X session, and call nvidia-settings to enable fan control and set speed
        xinit ${SET} -a "[gpu:${n}]/GPUFanControlState=1" -a "[fan:${n}]/GPUTargetFanSpeed=$speed" -- :0 -once
        n=$((n + 1))
    done
    echo "Complete"; exit 0;
elif [ "x$1" = "xstop" ]
then
    $SMI -pm 0 # disable persistence mode
    echo "Enabling default auto fan control."
    # how many GPUs are in the system?
    NUMGPU="$($SMI -L | wc -l)"
    # loop through each GPU and restore automatic fan control
    n=0
    while [ "$n" -lt "$NUMGPU" ]
    do
        # start an X session, and call nvidia-settings to disable manual fan control
        xinit ${SET} -a "[gpu:${n}]/GPUFanControlState=0" -- :0 -once
        n=$((n + 1))
    done
    echo "Complete"; exit 0;
else
    echo "Error: Please pick a fan speed between 40 and 100, or 'stop'."; exit 1;
fi
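
The two pre-flight checks in the script above (driver major version and the fan speed argument) can be pulled out into small functions, which makes them testable on a machine without a GPU. This is only a sketch: `driver_major` and `is_valid_speed` are hypothetical helper names that do not appear in the original script, and the 40 to 100 range simply mirrors the check above.

```shell
# Hypothetical helpers (not part of the original script).

# Print the major part of a dotted driver version string, e.g. "352.41" -> "352".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed only if $1 is a plain decimal integer between 40 and 100 inclusive.
is_valid_speed() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty strings and anything non-numeric
    esac
    [ "$1" -ge 40 ] && [ "$1" -le 100 ]
}
```

With these in place, the main `if` could read `if is_valid_speed "$1"`, which avoids relying on the `"0$1"` prefix trick.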

aschoenauer-sebag commented Feb 25, 2017

Hi,

Thanks for this hack. However, I get the following errors (for each gpu):
ERROR: Error querying enabled displays on GPU 0 (Missing Extension).
ERROR: Error querying connected displays on GPU 0 (Missing Extension).
ERROR: Error resolving target specification 'gpu:1' (No targets match target specification), specified in assignment '[gpu:3]/GPUFanControlState=1'.
I am using the 375.26 nvidia drivers on a headless tower. If you have any ideas on how to solve this it'd be great.

Thanks!

renaudcerrato commented Jun 16, 2017

Same here.

teisho commented Jun 20, 2017

Same here.

mattics commented Jun 30, 2017

This seems to work great for setting the fan speed on three 1070s, but it causes the second and third cards to slow down drastically. When mining, they run at about 10% of their normal rate. Is this due to them still being attached to a screen?

Macrum commented Jul 1, 2017

Hello,
first I want to thank you for sharing this script.
Works great, but unfortunately only for my first GPU.

For all other GPUs I get the following error:

ERROR: Error assigning value 100 to attribute 'GPUTargetFanSpeed'
       (hostname:0[fan:1]) as specified in assignment
       '[fan:1]/GPUTargetFanSpeed=100' (Unknown Error).

Would be really nice if you could have a look into it.

Thanks in advance!

raoulh commented Oct 30, 2017

As a workaround for the error when setting the fan speed on GPU 1 or 2, you can try this:

nvidia-xconfig -s -a --force-generate --allow-empty-initial-configuration --cool-bits=12 --registry-dwords="PerfLevelSrc=0x2222" --no-sli --connected-monitor="DFP-0"

After that, it worked on my rig.
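
The `--cool-bits` value is a bitmask, which would explain why both 28 (used in the script header) and 12 (used here) can enable fan control. As commonly documented in the NVIDIA driver README, the bit with value 4 enables manual fan control, 8 enables clock offsets, and 16 enables overvoltage; treat those meanings as an assumption and check the README for your driver version. The helper name `coolbits_allows_fan_control` is made up for this sketch:

```shell
# Hypothetical helper: succeed if the given Coolbits value has the
# manual-fan-control bit set (assumed here to be the bit with value 4).
coolbits_allows_fan_control() {
    [ $(( $1 & 4 )) -ne 0 ]
}

# Compare the two values from this thread with one that lacks the bit.
for value in 28 12 8; do
    if coolbits_allows_fan_control "$value"; then
        echo "Coolbits=$value: fan control bit set"
    else
        echo "Coolbits=$value: fan control bit NOT set"
    fi
done
```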

khavernathy commented Mar 16, 2018

@raoulh Lifesaver. Same here. Thanks.

wlara commented Mar 18, 2018

Has anyone found a solution for these errors?

ERROR: Error querying enabled displays on GPU 0 (Missing Extension).
ERROR: Error querying connected displays on GPU 0 (Missing Extension).
ERROR: Error resolving target specification 'gpu:0' (No targets match target
       specification), specified in assignment '[gpu:0]/GPUFanControlState=1'.

streslab commented Mar 25, 2018

@wlara try running:
export DISPLAY=:0.0
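
The "No targets match target specification" error is often a sign that `nvidia-settings` could not reach an X server, and exporting `DISPLAY` only helps if a server is already running on that display. A minimal sketch, assuming such a server exists on `:0.0`; the function name `display_or_default` is hypothetical:

```shell
# Hypothetical helper: print the caller's DISPLAY, or :0.0 if it is unset or empty.
display_or_default() {
    printf '%s\n' "${DISPLAY:-:0.0}"
}

# Example use (requires a running X server and the NVIDIA driver):
#   DISPLAY="$(display_or_default)" nvidia-settings -q gpus -q fans
```

Listing the GPU and fan targets this way is a quick check that the indices used in the script actually exist before trying to assign to them.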

tinfever commented Feb 26, 2019

I'm attempting to use this on an Ubuntu server install, and it does work after installing xinit and related packages. However, after X is killed, the GPUs become stuck in the low-power state P8, which is essentially idle. This doesn't occur if I install and run lightdm so that an X instance stays running on each of the GPUs. Any thoughts?

I know it works this way but it seems like blasphemy to have to install lightdm on a headless machine.
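
One way to confirm the symptom described above is to ask `nvidia-smi` for each GPU's performance state after X exits. The sketch below only parses the CSV output; it assumes the `index` and `pstate` query fields are available in your driver's `nvidia-smi`, and `gpus_in_pstate` is a hypothetical helper name:

```shell
# Hypothetical helper: read "index, pstate" CSV lines on stdin and print the
# indices of GPUs whose performance state equals $1 (for example P8).
gpus_in_pstate() {
    awk -F', *' -v want="$1" '$2 == want { print $1 }'
}

# Example use on a machine with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=index,pstate --format=csv,noheader | gpus_in_pstate P8
```

Running this before and after the script would show whether the cards drop to P8 only once the X session started by `xinit` has ended.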

isarandi commented Nov 15, 2019

I finally solved my problem. Previously I got the error

ERROR: Error assigning value 100 to attribute 'GPUTargetFanSpeed'
       (hostname:0[fan:1]) as specified in assignment
       '[fan:1]/GPUTargetFanSpeed=100' (Unknown Error).

In my case (Titan RTX), each GPU has two individually tunable fans! So fan:0 and fan:1 have to be set together with gpu:0, and fan:2 and fan:3 with gpu:1.

nvidia-settings -a [gpu:0]/GPUFanControlState=1 -a [fan:0]/GPUTargetFanSpeed=100 -a [fan:1]/GPUTargetFanSpeed=100 -c :0
nvidia-settings -a [gpu:1]/GPUFanControlState=1 -a [fan:2]/GPUTargetFanSpeed=100 -a [fan:3]/GPUTargetFanSpeed=100 -c :0

Hope it helps!
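
The mapping described above generalizes if every GPU has the same number of fans and fan indices are assigned consecutively in GPU order. Both are assumptions drawn from this comment, not something the driver is known to guarantee, so compare against `nvidia-settings -q fans` on your system. The function `fan_assignments` is hypothetical and only prints the assignments, one per line:

```shell
# Hypothetical helper: print nvidia-settings assignments for NUM_GPUS GPUs,
# each with FANS_PER_GPU fans, at SPEED percent.
# Assumes fans are numbered consecutively in GPU order (fans 0,1 -> gpu 0, ...).
fan_assignments() {
    num_gpus=$1 fans_per_gpu=$2 speed=$3
    gpu=0
    while [ "$gpu" -lt "$num_gpus" ]; do
        echo "[gpu:${gpu}]/GPUFanControlState=1"
        fan=$(( gpu * fans_per_gpu ))      # first fan index of this GPU
        last=$(( fan + fans_per_gpu ))     # one past its last fan index
        while [ "$fan" -lt "$last" ]; do
            echo "[fan:${fan}]/GPUTargetFanSpeed=${speed}"
            fan=$(( fan + 1 ))
        done
        gpu=$(( gpu + 1 ))
    done
}

# Two GPUs with two fans each, as in the Titan RTX case above.
fan_assignments 2 2 100
```

Each printed line can then be passed to `nvidia-settings` after an `-a` flag, in the same way as the two commands above.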
