Intel Colfax Cluster - Notes - Index Page
Notes taken from Colfax Deepdive Video Series: Session 01 - Intel Architecture and Modern Code.
Usually, we can SSH into the login node (c001-001), but not into the cluster nodes (the ones that run our jobs after we submit them with qsub).
In Deep Dive Lecture 1, Andrey showed a pretty cool (and potentially handy) trick that lets us "peek" into a cluster node and even SSH into it.
Submit a sleep job to a cluster node. For example:
[userxxx@c001] $ echo sleep 600 | qsub -l nodes=1:knl:flat
21083.c001
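The later steps only need the numeric part of the job identifier that qsub prints. A minimal sketch of capturing it in a script follows; the function name job_number is hypothetical, and it assumes the identifier has the "number.server" form shown above.

```shell
# Hypothetical helper: strip the server suffix from a job identifier
# such as "21083.c001", leaving only the numeric part.
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Possible usage on the login node (assumes qsub prints the identifier on stdout):
#   job=$(echo sleep 600 | qsub -l nodes=1:knl:flat)
#   job_number "$job"
```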
While the job is running, we can look up the name of the node it is running on:
[userxxx@c001] $ qstat -f 21083 | grep exec_host
exec_host = c001-n029/0
... and SSH into it:
[userxxx@c001] $ ssh c001-n029
Notice that the prompt now changes to the remote node:
[userxxx@c001-n029 ~]$
Once we are on the remote node, we can run some queries, such as lscpu, top, numactl -H, etc.
When we are done, just log out (or hit Ctrl+D):
[userxxx@c001-n029 ~]$ logout
Notice that we are now back to the login node:
[userxxx@c001 lec-01]$
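The lookup step above (grep for exec_host, then read off the node name by eye) could be scripted. The following is only a sketch: the function name exec_host_node is hypothetical, and it assumes the exec_host line has the "exec_host = node/slot" form shown in the qstat output above.

```shell
# Hypothetical helper: read `qstat -f` output on stdin and print the name
# of the first node listed on the exec_host line (the text before the "/").
# Assumes a line of the form "exec_host = c001-n029/0"; prints nothing if
# no such line is present (for example, if the job is not running yet).
exec_host_node() {
    sed -n 's|^[[:space:]]*exec_host = \([^/]*\)/.*$|\1|p' | head -n 1
}
```

With this defined on the login node, the lookup and login could be combined as ssh "$(qstat -f 21083 | exec_host_node)", using the job number from the example above.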
- Colfax Deepdive Video Series: Session 01 - Intel Architecture and Modern Code. From around 84:50.