
Random Grid Engine tips and tricks

The following work with Son of Grid Engine (SGE) 8.1.9 as configured on the University of Sheffield's ShARC and Iceberg clusters.

Jobs with dependencies

You can use the -hold_jid <job-id or job-name> option to make jobs run only when other jobs have finished, rather than having jobs start then sit waiting for other tasks to complete.
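
For example (job names and scripts here are hypothetical), the second job below only starts once the first has finished:

qsub -N preprocess preprocess.sh
qsub -N analyse -hold_jid preprocess analyse.sh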

Query accounting files if using log rotation

You can query multiple SGE log (accounting) files in series using:

for f in $SGE_ROOT/default/common/accounting $SGE_ROOT/default/common/accounting-archive/accounting-*; do
    qacct -f "$f" -j '*'
done
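
To look up a single job (hypothetical job ID 123456) across all rotated accounting files, suppressing errors from files that don't mention it:

for f in $SGE_ROOT/default/common/accounting $SGE_ROOT/default/common/accounting-archive/accounting-*; do
    qacct -f "$f" -j 123456
done 2>/dev/null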

Show jobs running on host plus load, CPU use etc

qhost -j -h hal9000-node099

Show jobs running on all hosts in host group

function sge_jobs_host_grp () 
{ 
    if [[ $# -ne 1 || "$1" == '-h' ]]; then
        echo "Show all jobs running on all hosts of a Grid Engine host group." 1>&2;
        echo "usage: sge_jobs_host_grp name_of_sge_host_group" 1>&2;
        return 1;
    fi;
    # Flatten the host group tree into a comma-separated list of hosts
    all_hosts="$(qconf -shgrp_tree "$1" | grep "$SGE_CLUSTER_NAME" | xargs | tr ' ' ',')";
    qhost -j -h "$all_hosts"
}
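
e.g. (for a hypothetical host group; host group names start with @):

sge_jobs_host_grp @gpu-nodes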

Summary of unallocated GPUs per node

Here gpu resources are general-purpose GPUs (GPGPUs) and gfx resources are for hardware-accelerated visualisation.

qhost -F gpu,gfx | grep -B1 'Host Resource'

Summary of jobs using GPGPUs

qhost -l gpu=1 -u '*'

Check whether any queues on any nodes are unavailable

qstat -j | grep 'not available'

Info on all queues

qstat -f

What all users have running in a given queue

qstat -u \* -q some.q -t -s r

Here:

  • -t - extended information about the controlled sub-tasks of the displayed parallel jobs
  • -s r - job state is running

Check if particular queue in error state

qstat -f -explain E -q somequeue.q

Clear error state for queue instance on a node

From an Administrative Host:

sudo -u root -i qmod -cq somequeue.q@hal9000-node099

Check if all queue instances disabled on a node

e.g. because someone else is doing maintenance

qhost -q -h hal9000-node103

Disable and re-enable a queue instance (e.g. before reboot)

From an Administrative Host:

sudo -u root -i qmod -d somequeue.q@hal9000-node126
# Fix stuff :)
sudo -u root -i qmod -e somequeue.q@hal9000-node126

NB these operations support wildcards
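
e.g. to disable the matching queue instances on a range of nodes in one go:

sudo -u root -i qmod -d 'somequeue.q@hal9000-node1*'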

Add a host or a host group to a cluster queue

From an Administrative Host:

sudo -u root -i qconf -mq somequeue.q

What hosts are associated with a job?

qstat -s r -g t 

Show all host groups a host belongs to

Note that the loop needs wrapping in a function (or script) so that $1, the host of interest, is defined:

function sge_host_grps () 
{ 
    for aa in $(qconf -shgrpl) ; do 
        qconf -shgrp $aa | grep -q "$1" && echo $aa ; 
    done
}
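
e.g.

sge_host_grps hal9000-node099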

Find nodes that match a spec

qconf -sobjl exechost complex_values '*gfx=*'
qconf -sobjl exechost complex_values '*gfx=2,*'
qconf -sobjl exechost load_values '*num_proc=20,*'

Find queues that match some criteria

Similar to the above.

qselect -l gfx
qselect -l gfx=2

and possibly limit the output to queues that a user has access to:

qselect -l gfx=2 -u te1st

Set a complex on a node

sudo -u root -i qconf -mattr exechost complex_values my_complex=some_value node666
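
You can then check that the complex has been set by inspecting the execution host's configuration:

qconf -se node666 | grep complex_values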

Really nice summary of resource requests of pending/running jobs

qstat -u $USER -r

or

$ qstat -q 'gpu*.q' -u '*' -r

Setting env vars for qrsh sessions

For environment variables set from prolog scripts etc. to be picked up, users need to start qrsh sessions with:

qrsh -pty y bash -li

or

exec qrsh -pty y bash

(the second being effectively what qrshx does).
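
A minimal sketch of a qrshx-style wrapper script, assuming the real qrshx does little more than this (it may also handle e.g. X forwarding):

#!/bin/bash
# Request a pseudo-terminal and replace this process with an interactive bash session
exec qrsh -pty y "$@" bash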

SGE logs

Node log on node nodeX (includes messages from user-process-supervising shepherd processes):

/var/spool/sge/nodeX/messages

qmaster log:

$SGE_ROOT/default/spool/qmaster/messages

Location of hostfile on master node of parallel job:

/var/spool/sge/${HOSTNAME}/active_jobs/${JOB_ID}.1/pe_hostfile

Test the MUNGE uid/gid validation mechanism between two nodes

munge -n | ssh node666 unmunge

Resource Quota Sets (RQS)

A chain of rules for limiting resources (pre-defined or custom complexes) per resource consumer (e.g. user, host, queue, project, department, parallel env).

See sge_resource_quota(5).
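
e.g. a hypothetical rule set (added/edited with qconf -arqs / -mrqs) that limits each user to two gpu complexes at a time:

{
   name         max_gpus_per_user
   description  "Limit each user to 2 GPUs at a time"
   enabled      true
   limit        users {*} to gpu=2
}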

Forced resources

If a complex is defined so that FORCED is true then these complexes need to be explicitly requested by the user for them to be used.
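
e.g. if the requestable field for the gfx complex is set to FORCED in the complex configuration (see qconf -mc) then a job can only be allocated a gfx resource if it asks for one explicitly (job script name is hypothetical):

qsub -l gfx=1 myjob.sh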

Reserving resources for larger jobs

You typically need to ensure that resources are reserved for larger (e.g. parallel) jobs so that they're not forever waiting behind smaller (e.g. single-core) jobs. E.g.

$ qconf -ssconf
...
max_reservation                   20
default_duration                  8760:00:00

You then need to submit the larger jobs using -R y.
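
e.g. (parallel environment name and job script are hypothetical):

qsub -R y -pe mpi 64 bigjob.sh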

What cores are available in your job's cpuset on the current node?

Not SGE-specific but relevant and useful. Works even in a qrsh session where JOB_ID is not set.

cat /dev/cpuset/$(awk -F: '/cpuset/ { print $3 }' /proc/$$/cgroup)/cpuset.cpus

NB hyperthreads are enumerated in cgroups
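
To count how many cores that is (a sketch: the cpuset list is comma-separated and entries may be ranges, e.g. 0-3,8-11):

cat /dev/cpuset/$(awk -F: '/cpuset/ { print $3 }' /proc/$$/cgroup)/cpuset.cpus |
    tr ',' '\n' |
    awk -F- '{ n += (NF == 2) ? $2 - $1 + 1 : 1 } END { print n }'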

Did a job request a gfx resource?

Works even in a qrsh session where JOB_ID is not set:

qstat -xml -j $(awk -F'/' '/sge/ {print $3}' /proc/$$/cpuset) | xmllint --xpath '/detailed_job_info/djob_info/element/JB_project[text()="gfx"]' - | grep -q gfx

qstat output without truncated job names (Univa Grid Engine only)

SGE_LONG_JOB_NAMES=1 qstat -u someuser

Get per-job unix group

Grid Engine creates a per-job unix group to help track resources associated with a job. This group has an integer ID but not a name, as can be seen if you run groups from a GE job:

$ groups
groups: cannot find name for group ID 20016
somegroup anothergroup 20016 yetanothergroup

To learn the ID of the unix group for the current Grid Engine job:

awk -F= '/^add_grp_id/ { print $2 }' "${SGE_JOB_SPOOL_DIR}/config"
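
That ID can then be used e.g. to list the processes belonging to the current job (a sketch; supgid is ps's comma-separated list of supplementary group IDs):

jobgid=$(awk -F= '/^add_grp_id/ { print $2 }' "${SGE_JOB_SPOOL_DIR}/config")
ps -eo pid,supgid,args | awk -v g="$jobgid" '$2 ~ g'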
@jkwmoore commented:
Not as useful, but grabbing the number of nodes in a queue on SLURM (Bessemer):

sinfo -N | grep sheffield | awk '/node/ {print $3}' | sort | uniq -c

@jkwmoore commented on Mar 8, 2021:

For SLURM: getting a list of jobs from a partition (dcs-gpu) and showing the number of GPUs in use (cheers @PeterHeywood):

squeue -p dcs-gpu -o "%.18i %.12j %.12u %.12b %.2t %.10M %.10l %R"
