@uaudith
Created August 10, 2023 15:52
Too many people sharing the same GPUs? Instead of waiting and checking availability by hand, use this script to launch your command automatically as soon as a GPU's utilization drops to or below a set threshold.
#!/bin/bash
# Written by uaudith[uaudith.eu.org]
# Read the environment variable $CUDA_VISIBLE_DEVICES inside your program to get a comma-separated list of free devices
command_to_run="$*" # Joined into one string for logging only
utilization_threshold=0 # Run the command once a device's utilization percentage is at or below this value
get_free_gpus() {
    mapfile -t util_array < <(nvidia-smi --query-gpu=utilization.gpu --format=csv,nounits,noheader)
    free_gpus=""
    for idx in "${!util_array[@]}"; do
        if [[ "${util_array[idx]}" -le $utilization_threshold ]]; then
            if [[ -n "$free_gpus" ]]; then
                free_gpus+=",$idx"
            else
                free_gpus="$idx"
            fi
        fi
    done
    echo "$free_gpus"
}
low_utilization_count=0
while true; do
    free_gpu_idx=$(get_free_gpus)
    if [[ -n "$free_gpu_idx" ]]; then
        # Require several consecutive idle readings so a momentary dip does not trigger the run
        if [[ "$low_utilization_count" -ge 3 ]]; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') - Running command: $command_to_run on GPUs: $free_gpu_idx"
            # Expose only the idle GPUs to the command; "$@" keeps each argument a separate word
            CUDA_VISIBLE_DEVICES=$free_gpu_idx "$@"
            break
        else
            low_utilization_count=$((low_utilization_count + 1))
            echo "Low GPU utilization detected. Count: $low_utilization_count"
        fi
    else
        #echo "All GPUs are currently in use. Resetting low utilization counter..."
        low_utilization_count=0
    fi
    sleep 5 # Adjust the sleep interval as needed
done
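
The selection step in get_free_gpus can be tried on a machine without a GPU by feeding it sample utilization values. Below is a minimal sketch, not part of the original script: pick_free_gpus is a hypothetical helper name, and it assumes its input has the same shape that nvidia-smi prints with --format=csv,nounits,noheader (one integer percentage per line, one line per GPU).

```shell
#!/bin/bash
# pick_free_gpus THRESHOLD  (hypothetical helper, for illustration only)
# Reads one utilization percentage per line on stdin and prints the
# zero-based indices of the lines at or below THRESHOLD, comma separated.
pick_free_gpus() {
    local threshold=$1 idx=0 util free=""
    while IFS= read -r util; do
        util=${util//[[:space:]]/}   # drop any stray whitespace
        # Skip non-numeric readings rather than failing on them
        if [[ "$util" =~ ^[0-9]+$ ]] && (( util <= threshold )); then
            free+="${free:+,}$idx"   # prepend a comma only after the first hit
        fi
        idx=$((idx + 1))
    done
    echo "$free"
}

# Sample data standing in for nvidia-smi output: GPUs 0 and 2 are idle.
printf '%s\n' 0 87 0 12 | pick_free_gpus 0   # prints 0,2
```

With the same sample data, a threshold of 15 would also select index 3, which is one way to tolerate light background load on a shared machine.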