Gord Sissons (GJSissons)

GJSissons / tower_gistfile_3
Created July 24, 2023 12:29
Automating Tower workflows 3
$ wget https://github.com/seqeralabs/tower-cli/releases/download/v0.8.0/tw-linux-x86_64
$ mv tw-* tw
$ chmod +x ./tw
$ sudo mv tw /usr/local/bin/
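A quick post-install sanity check (the CLI reads a personal access token from the TOWER_ACCESS_TOKEN environment variable; in recent tower-cli releases, tw info reports client/server versions and connection status):
$ export TOWER_ACCESS_TOKEN=<your token>
$ tw info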
GJSissons / tower_gistfile_2
Created July 24, 2023 12:26
Automating Tower workflows 2
{
  "user": {
    "id": 4610,
    "userName": "fred",
    "email": "frederick.sanger@seqera.io",
    "firstName": null,
    "lastName": null,
    ...
    "lastAccess": "2023-07-10T13:06:01Z",
GJSissons / gist:23f970d50760a5361f4f3797d8ea199b
Last active July 24, 2023 13:18
Automating Tower workflows 1
$ curl -X GET "https://api.tower.nf/user-info" \
-H "Accept: application/json" \
-H "Authorization: Bearer <your token>" \
| jq
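To avoid pasting the token inline, curl can reuse the same TOWER_ACCESS_TOKEN environment variable exported earlier:
$ curl -s -H "Accept: application/json" \
    -H "Authorization: Bearer $TOWER_ACCESS_TOKEN" \
    "https://api.tower.nf/user-info" | jq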
GJSissons / gpu_sharing_3_17
Created September 15, 2020 14:45
3.17 - Reserving resources and scheduling a four-way parallel MPI job requiring 28 slots and four GPUs per host
$ qrsub -pe mpi 112 -l hgpu=4 -d 1:0:0
$ qsub -ar <id> -pe mpi 4 -par 4 -l hgpu=4 gpu_mpi_job.sh
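The contents of gpu_mpi_job.sh are not shown in the gist; a minimal sketch, assuming a tightly integrated "mpi" parallel environment (the application binary is a placeholder):
#!/bin/bash
# Grid Engine exports NSLOTS for parallel jobs; launch one MPI rank per granted slot
mpirun -np $NSLOTS ./my_gpu_app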
GJSissons / gpu_sharing_3_16
Created September 15, 2020 14:42
3.16 - Scheduling all cores with affinity to a selected GPU to prevent workload conflicts
$ qsub -l gpu=1 cudajob.py
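Once the job is running, the binding the scheduler actually granted can be inspected (illustrative; the exact output format varies by Grid Engine version):
$ qstat -j <job_id> | grep -i binding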
GJSissons / gpu_sharing_3_14
Created September 15, 2020 14:40
3.14 - Using a binding policy based on a topology_mask in a resource map
$ qsub -binding linear:1 -l gpu=1 cudajob.py
GJSissons / gpu_sharing_3_13
Created September 15, 2020 14:38
3.13 - Configuring topology_masks as part of a gpu resource map
[root@master]# qconf -me host1
hostname        host1
load_scaling    NONE
complex_values  m_mem_free=815.000000M, gpu=4 \
                (gpu0[cuda_id=0,device=/dev/nvidia0] \
                 gpu1[cuda_id=1,device=/dev/nvidia1] \
                 gpu2[cuda_id=2,device=/dev/nvidia2] \
                 gpu3[cuda_id=3,device=/dev/nvidia3])
complex_values  gpu=4 \
                (gpu0:SCCCCccccScccccccc \
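Reading the mask (annotation added, not part of the gist): "S" marks a socket boundary, and upper/lowercase "C"/"c" mark cores a job may or may not be bound to. So:
gpu0:SCCCCccccScccccccc   # jobs using gpu0 bind to cores 0-3 of socket 0; socket 1 is masked off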
GJSissons / gpu_sharing_3_12
Created September 15, 2020 14:35
3.12 - Protecting access to GPU devices using cgroups in Univa Grid Engine
[root@master]# qconf -mconf host1
cgroup_params cgroup_path=/sys/fs/cgroup \
devices=/dev/nvidia[0-3]
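With device cgroups in place, a single-GPU job should see only the device node it was granted; a quick interactive check (quoting keeps the glob from expanding on the submit host):
$ qrsh -l gpu=1 'ls -l /dev/nvidia*'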
GJSissons / gpu_sharing_3_11
Created September 15, 2020 14:33
3.11 - Enabling the SET_CUDA_VISIBLE_DEVICES feature
[root@master]# qconf -mconf global
..
qmaster_params none
execd_params KEEP_ACTIVE=ERROR UGE_DCGM_PORT=5555 \
SET_CUDA_VISIBLE_DEVICES=true
reporting_params accounting=true reporting=false \
flush_time=00:00:15 joblog=false sharelog=00:00:00
finished_jobs 0
..
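With SET_CUDA_VISIBLE_DEVICES=true, the scheduler exports CUDA_VISIBLE_DEVICES into each job's environment; a job script can confirm which devices it was assigned (script name illustrative):
$ cat check_gpus.sh
#!/bin/bash
# Print the GPU IDs Grid Engine assigned to this job
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
$ qsub -l gpu=2 check_gpus.sh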
GJSissons / gpu_sharing_3_10
Last active September 22, 2020 12:25
3.10 - With DCGM enabled, we can request devices by name and ensure scheduling affinity
$ qsub -l gpu="2(V100)[affinity=2]" volta_gpu_job.sh