@SSARCandy
Last active August 26, 2021 17:18
Show GPU occupancy status on a DGX-1 (using nvidia-smi)
#!/bin/bash
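# Print the container name and the command line of the GPU process with PID $1.
function show_gpu_user {
    # Outermost bash ancestor of the process; pstree -g prints process group IDs.
    pid=$(pstree -sg "$1" | grep -Eo 'bash\([0-9]+\)' | head -1 | grep -Eo '[0-9]+');
    # Name of the container whose init PID equals that ID (exact match, not substring).
    docker ps -q | xargs docker inspect --format "{{.Name}} {{.State.Pid}}" | awk -v pid="$pid" '$2 == pid {printf "%-24s", $1}';
    # Full command line of the GPU process itself.
    ps -o args= -p "$1";
}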
echo " ";
gpu_status=$(nvidia-smi --query-compute-apps=gpu_serial,pid --format=csv,noheader);
# Count the distinct (serial, pid) pairs, i.e. the compute tasks currently on the GPUs.
busy_gpu=$(nvidia-smi --query-compute-apps=gpu_serial,pid --format=csv,noheader | sort -u | wc -l);
echo "Total GPUs: 8";
echo "Tasks on GPU: $busy_gpu";
echo " ";
if [[ $(docker ps -q) ]]; then
    echo -e "GPU_serial\tProcess_id\tContainer_name\t\tProcess_name";
    echo "======================================================================================";
    # One row per "serial, pid" line reported by nvidia-smi.
    while read -r line; do
        process=$(echo "$line" | awk '{print $2}');
        echo -en "$line\t\t" | sed 's/,\s/\t/';
        show_gpu_user "$process";
    done <<< "$gpu_status"
else
    echo "No process found."
fi
echo " ";
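
The loop above splits each `serial, pid` line with `awk` and `sed`. As a rough alternative, the same split can be done with the shell's own `read`. This is only a sketch: `format_gpu_line` is a hypothetical helper name, not part of the script, and it assumes every line has exactly the `serial, pid` shape that `nvidia-smi --format=csv,noheader` produces for these two fields.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): split one "serial, pid"
# CSV line and print the two fields separated by a tab.
format_gpu_line() {
    local serial pid
    # IFS=', ' makes read treat a comma plus surrounding spaces as one delimiter.
    IFS=', ' read -r serial pid <<< "$1"
    printf '%s\t%s\n' "$serial" "$pid"
}

# Usage: format one line in the "serial, pid" shape.
format_gpu_line "0323616134471, 18567"
```

Doing the split once with `read` avoids spawning `awk` and `sed` for every row, which may matter little here but keeps the parsing in one place.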

SSARCandy commented Apr 27, 2017

Sample output:

r05725031@dgx1:~$ gpu-status.sh 
 
Total GPUs: 8
Avalible GPUs: 2
 
GPU_serial      Porcess_id      Container_name          Process_name
======================================================================================
0323616134471   18567           /big_pike               python model_cvpr_dgx.py 
0323716019841   29383           /b02902054_tf           python -u tf_seqs2seq_model.py
0323716019841   6684            /b02902054_tf           python -u tf_seqs2seq_model.py 
0323716019794   18784           /r05944047_gan2         ./build/tools/caffe train --solver=./face_solver.prototxt -gpu=3,4 
0323716019794   1130            /r05922005-crn          python train_crn3d.py 
0323716019809   18784           /r05944047_gan2         ./build/tools/caffe train --solver=./face_solver.prototxt -gpu=3,4 
0323716019809   51266           /r05922005-crn          python train_crn3d.py --data perturbed 
0323616133931   43952           /r03922007_MTK          python3 -u MTK-Flickr-exp-G2-4/TF.py 
0323716019820   70648           /r03922007_MTK          python3 -u EXP-G1/LPGAN-exp-G1-8/TF.py 
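
The table above contains 6 distinct serials, which matches the 2 available GPUs reported out of 8. A minimal sketch of that arithmetic follows. It assumes the input is the raw `serial, pid` CSV from `nvidia-smi --query-compute-apps=gpu_serial,pid --format=csv,noheader`. Both `count_busy_gpus` and `count_free_gpus` are hypothetical helper names that do not appear in the script.

```bash
#!/bin/bash
# Hypothetical helper: count distinct GPU serials in "serial, pid" CSV lines
# read from stdin. Empty lines are skipped, so empty input yields 0.
count_busy_gpus() {
    awk -F', *' 'NF { seen[$1] = 1 } END { n = 0; for (s in seen) n++; print n }'
}

# Hypothetical helper: free GPUs = total GPUs - busy GPUs.
count_free_gpus() {
    local total=$1 busy=$2
    echo $(( total - busy ))
}

# Usage with three tasks running on two GPUs (serials taken from the table above).
busy=$(printf '%s\n' "0323616134471, 18567" "0323716019841, 29383" "0323716019841, 6684" | count_busy_gpus)
echo "Busy GPUs: $busy"
echo "Available GPUs: $(count_free_gpus 8 "$busy")"
```

On a real machine the `printf` would be replaced by the `nvidia-smi` query, and the total of 8 is specific to the DGX-1.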
