@Wampa842
Last active November 6, 2022 00:46
Qtile widget to display arbitrary data from Nvidia GPU(s)

NvidiaSensors2

A Qtile widget to display arbitrary sensor information about Nvidia GPU(s). Based on, but not backwards compatible with, libqtile.widget.NvidiaSensors.

Requires nvidia-smi (from nvidia-utils).

Sensor data

The key difference between NvidiaSensors2 and NvidiaSensors is that this widget lets you specify arbitrary fields to be queried by nvidia-smi, using the sensors keyword argument when instantiating the widget. The queried field names can then be referenced in the format kwarg with dots (.) replaced by underscores (_), so utilization.gpu becomes {utilization_gpu}.

Run nvidia-smi --help-query-gpu to get a full list of supported field names. Some fields of interest are listed below. Note that not all fields are available on all models.

Every field referenced in format must be queried, but queried fields do not have to be printed. This matters for the temperature alert, which needs temperature.gpu to be queried even when the regular format string does not display it.

Example:

sensors = ["utilization.gpu", "temperature.gpu", "fan.speed"],
format = "GPU {utilization_gpu}% {temperature_gpu}°C ({fan_speed}%)"
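
The naming rule can be tried out in plain Python, independent of Qtile. This is a minimal sketch: the helper names placeholder_name and unqueried_fields and the sample readings are illustrative assumptions, not part of the widget.

```python
from string import Formatter


def placeholder_name(field):
    """Convert an nvidia-smi field name to its format placeholder name."""
    return field.replace(".", "_")


def unqueried_fields(fmt, sensors):
    """Return placeholders used in fmt that have no matching queried sensor."""
    available = {placeholder_name(s) for s in sensors}
    used = {name for _, name, _, _ in Formatter().parse(fmt) if name}
    return used - available


sensors = ["utilization.gpu", "temperature.gpu", "fan.speed"]
fmt = "GPU {utilization_gpu}% {temperature_gpu}°C ({fan_speed}%)"

# Made-up readings, in the same order as `sensors`.
values = ["45", "62", "30"]
data = dict(zip(map(placeholder_name, sensors), values))

print(fmt.format(**data))                           # GPU 45% 62°C (30%)
print(unqueried_fields(fmt, sensors))               # set()
print(unqueried_fields("{power_draw} W", sensors))  # {'power_draw'}
```

A check like unqueried_fields could be run once at configuration time to catch a format string that refers to a field missing from sensors.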

Temperature alert

The widget can switch to different formatting when any GPU's core temperature exceeds a threshold. This requires the temperature.gpu field to be queried. During an alert, format_alert replaces format and format_all_alert replaces format_all; an alert string left as None falls back to its regular counterpart. The alert formatting applies to every GPU, not only the one that is over the threshold.

Example (using Pango markup):

threshold = 70,
sensors = ["utilization.gpu", "temperature.gpu"],
format = "GPU {utilization_gpu}%",
format_alert = "<span color='#ffa000'>HOT HOT HOT! {utilization_gpu}% {temperature_gpu}°C</span>"
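
The alert decision can be sketched as follows, outside of Qtile. The functions is_alert and pick_format and the two sample GPUs are hypothetical names and data chosen for illustration; the comparison is strictly "greater than", matching the "exceeds" wording above.

```python
def is_alert(gpus, threshold):
    """Return True if any GPU reports a core temperature above threshold."""
    if threshold is None:
        return False
    for gpu in gpus:
        temp = gpu.get("temperature_gpu", "")
        # Skip non-numeric readings such as "[N/A]".
        if temp.isdigit() and int(temp) > threshold:
            return True
    return False


def pick_format(gpus, threshold, fmt, fmt_alert=None):
    """Use the alert format only when it is set and an alert is active."""
    if fmt_alert is not None and is_alert(gpus, threshold):
        return fmt_alert
    return fmt


# Made-up readings for two GPUs; the second one is over the threshold.
gpus = [
    {"utilization_gpu": "12", "temperature_gpu": "55"},
    {"utilization_gpu": "97", "temperature_gpu": "81"},
]
fmt = "GPU {utilization_gpu}%"
fmt_alert = "HOT! {utilization_gpu}% {temperature_gpu}°C"

chosen = pick_format(gpus, 70, fmt, fmt_alert)
# The chosen format is applied to every GPU, including the cool one.
print([chosen.format(**gpu) for gpu in gpus])
```

With a threshold of 70, both GPUs are rendered with the alert string because one of them reports 81.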

Constructor arguments

Only new and altered arguments are listed. Refer to NvidiaSensors for the rest.

| key | default | description |
| --- | --- | --- |
| sensors | ["utilization.gpu", "temperature.gpu"] | A list of fields to be queried by nvidia-smi. |
| format | "{utilization_gpu}% {temperature_gpu}°C" | Format string applied to values from individual GPUs. It can only refer to fields that are also defined in the sensors list. |
| format_all | "{}" | Format string applied to the splatted list of individual results processed by format. Only displays the first GPU by default - it must be changed to include multiple GPUs. |
| format_alert | None | If not None, this format string overrides the format argument in case of a temperature alert. |
| format_all_alert | None | If not None, overrides format_all in case of a temperature alert. |
| threshold | 70 | If any one GPU core's temperature exceeds this value, format_alert and format_all_alert override their respective non-alerting format strings. |
| foreground_alert | | This argument has been removed. |
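
The interaction between format and format_all on a multi-GPU system comes down to positional str.format arguments. The sketch below uses a hypothetical helper join_gpus and made-up per-GPU strings to show why the default "{}" prints only the first GPU.

```python
def join_gpus(format_all, per_gpu):
    """Apply format_all to the splatted list of per-GPU strings."""
    return format_all.format(*per_gpu)


# Made-up results of applying `format` to two GPUs.
per_gpu = ["12% 55°C", "97% 81°C"]

# The default consumes only the first positional argument.
print(join_gpus("{}", per_gpu))                       # 12% 55°C

# One placeholder per GPU shows both.
print(join_gpus("{} | {}", per_gpu))                  # 12% 55°C | 97% 81°C

# Indexed placeholders allow reordering.
print(join_gpus("second: {1}, first: {0}", per_gpu))  # second: 97% 81°C, first: 12% 55°C
```

A format_all with more placeholders than there are GPUs makes str.format raise IndexError; in the widget, the generic exception handler in poll would then display the error message instead of sensor data.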

Some interesting fields

| name | note |
| --- | --- |
| fan.speed | Fan speed in percent, as set by the firmware. May not match real fan speed. |
| pstate | Performance state from P0 to P12. |
| memory.used | Current allocated VRAM. |
| memory.free | Current unallocated VRAM. |
| utilization.gpu | Percent of time during which the GPU is busy. |
| utilization.memory | Percent of time during which the memory is being read or written. |
| encoder.stats.sessionCount | Number of running encoder sessions. |
| encoder.stats.averageFps | Average framerate (1/second) over all sessions. |
| encoder.stats.averageLatency | Average latency (microseconds) over all sessions. |
| temperature.gpu | GPU core temperature (degrees Celsius). |
| temperature.memory | HBM memory temperature (degrees Celsius). |
| power.draw | Last measured power draw (watts). |
| clocks.gr | Graphics clock frequency. |
| clocks.sm | Streaming multiprocessor clock frequency. |
| clocks.mem | Memory clock frequency. |
| clocks.video | Video encoder/decoder clock frequency. |
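
Values arrive from nvidia-smi as CSV text, one row per GPU. The following sketch mirrors the parsing steps used by the widget (remove spaces, split into rows, strip percent signs) on a made-up sample; parse_output is a hypothetical helper and the sample text is an assumption about typical output, not captured from real hardware.

```python
import csv


def parse_output(raw, sensors):
    """Parse nvidia-smi CSV text into one dict per GPU."""
    names = [s.replace(".", "_") for s in sensors]
    rows = csv.reader(raw.strip().replace(" ", "").split("\n"))
    return [
        dict(zip(names, (val.replace("%", "").strip() for val in row)))
        for row in rows
    ]


sensors = ["utilization.gpu", "temperature.gpu", "memory.used", "power.draw"]
# Made-up sample for two GPUs.
sample = "45 %, 62, 1024 MiB, 38.50 W\n3 %, 41, 12 MiB, 9.75 W\n"

gpus = parse_output(sample, sensors)
print(gpus[0])
print(gpus[1]["memory_used"])
```

Only the percent sign is stripped, so under this parsing other units stay attached to the value (for example 1024MiB) and should not be repeated in the format string.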

import csv
import re
from subprocess import CalledProcessError

from libqtile.widget import base


class NvidiaSensors2(base.ThreadPoolText):
    """
    Displays arbitrary sensor data from Nvidia GPU(s).
    Not backwards-compatible with ``libqtile.widget.NvidiaSensors``.
    """

    # TODO: Try backwards compatibility? Might not be possible
    defaults = [
        (
            "format",
            "{utilization_gpu}% {temperature_gpu}°C",
            "Display string format applied to individual GPUs. Available "
            "options are as defined in the ``sensors`` kwarg, except dots (.) "
            "are replaced with underscores (_)."
        ),
        (
            "format_all",
            "{}",
            "Format string applied to the splatted list of results that are "
            "already formatted (individually) by ``format``. Shows only the first "
            "GPU by default - MUST CHANGE TO DISPLAY MULTIPLE GPUS!"
        ),
        (
            "format_alert",
            None,
            "Format string that replaces ``format`` if temperature above threshold."
        ),
        (
            "format_all_alert",
            None,
            "Format string that replaces ``format_all`` if temperature above threshold."
        ),
        (
            "threshold",
            70,
            "If any GPU's core temperature is above this value, ``format_alert`` "
            "and ``format_all_alert`` replace the regular format strings.",
        ),
        (
            "gpu_bus_id",
            "",
            "GPU's Bus ID, ex: ``01:00.0``. If left empty, all available GPUs "
            "are displayed.",
        ),
        (
            "update_interval",
            2,
            "Update interval in seconds."
        ),
        (
            "sensors",
            ["utilization.gpu", "temperature.gpu"],
            "List of sensor names to query. Run 'nvidia-smi --help-query-gpu' for full list."
        ),
    ]

    def __init__(self, **config):
        base.ThreadPoolText.__init__(self, "", **config)
        self.add_defaults(NvidiaSensors2.defaults)
        self.foreground_normal = self.foreground
        # If format(_all)_alert is not defined, default to the non-alerting formats.
        if self.format_alert is None:
            self.format_alert = self.format
        if self.format_all_alert is None:
            self.format_all_alert = self.format_all

    def _get_sensors_data(self, command):
        # Strip all spaces, then parse each output line as one CSV row (one row per GPU).
        return csv.reader(
            self.call_process(command, shell=True).strip().replace(" ", "").split("\n")
        )

    def _parse_format_string(self):
        # Names of the placeholders used in ``format``.
        return set(re.findall(r"{(.+?)}", self.format))

    def _temperature_alert_check(self, data):
        # Return False if 'threshold' is unset or the 'temperature.gpu' field is not queried
        if self.threshold is None or "temperature.gpu" not in self.sensors:
            return False
        # Return True if any core temperature exceeds the threshold
        for gpu in data:
            if gpu["temperature_gpu"].isnumeric() and int(gpu["temperature_gpu"]) > self.threshold:
                return True
        # Otherwise return False
        return False

    def poll(self):
        # Command to retrieve GPU info
        bus_id = f"-i {self.gpu_bus_id}" if self.gpu_bus_id else ""
        command = "nvidia-smi {} --query-gpu={} --format=csv,noheader".format(
            bus_id,
            ",".join(self.sensors)
        )
        try:
            result = self._get_sensors_data(command)
            # Replace dots with underscores to avoid conflict with str.format
            sensors_alt_names = [name.replace(".", "_") for name in self.sensors]
            # List items represent individual GPUs.
            # Dict items represent sensor name/value pairs.
            sensors_data = [
                dict(zip(sensors_alt_names, [val.replace("%", "").strip() for val in gpu]))
                for gpu in result
            ]
            # If any GPU's core temp is above the threshold, use the alert formats
            if self._temperature_alert_check(sensors_data):
                formatted_per_gpu = [self.format_alert.format(**gpu) for gpu in sensors_data]
                return self.format_all_alert.format(*formatted_per_gpu)
            else:
                formatted_per_gpu = [self.format.format(**gpu) for gpu in sensors_data]
                return self.format_all.format(*formatted_per_gpu)
        except CalledProcessError as ex:  # Invalid sensor name
            return ex.stdout
        except Exception as ex:
            return str(ex)