Colfax Intel Cluster - Note - Deepdive - Lecture 1 - Trick - SSH to Cluster Node

Notes taken from Colfax Deepdive Video Series: Session 01 - Intel Architecture and Modern Code.


Usually, we can SSH to the login node (c001-001), but not to the cluster nodes (the ones that run our jobs after we submit them with qsub).

In Deep Dive Lecture 1, Andrey showed a pretty cool (and potentially handy) trick that lets us "peek" into a cluster node and even SSH into it.


Submit a sleep job to a cluster node. For example:

[userxxx@c001] $ echo sleep 600 | qsub -l nodes=1:knl:flat

While the job is running on the remote node, we can obtain the node name from qstat, using the job ID that qsub printed (21083 in this example):

[userxxx@c001] $ qstat -f 21083 | grep exec_host
exec_host = c001-n029/0

... and SSH into it:

[userxxx@c001] $ ssh c001-n029

Notice that the prompt now changes to the remote node:

[userxxx@c001-n029 ~]$

Once we are on the remote node, we can run some queries, such as lscpu, top, numactl -H, etc.
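
The same kind of query can also be run without an interactive session, by passing a command to ssh. This is only a minimal sketch, assuming the sleep job is still running on the node. The function name node_report and the SSH override variable are hypothetical, introduced here for illustration:

```shell
# Hypothetical helper: run a few read-only queries on a cluster node.
# The SSH variable is an assumption of this sketch: set SSH=echo to
# preview the command that would be run, without actually connecting.
node_report() {
    node="$1"
    ${SSH:-ssh} "$node" 'lscpu; numactl -H'
}

# Usage (node name taken from the exec_host line of qstat -f):
# node_report c001-n029
```

top is left out here because it expects an interactive terminal by default.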

When we are done, just log out (or hit Ctrl+D):

[userxxx@c001-n029 ~]$ logout

Notice that we are now back on the login node:

[userxxx@c001 lec-01]$
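
The steps above could be scripted roughly as follows. This is a sketch rather than anything shown in the lecture: exec_host_from_qstat is a hypothetical helper name, and it assumes the exec_host line has the "exec_host = node/slot" form shown in the qstat output earlier.

```shell
# Hypothetical helper: read `qstat -f` output on stdin and print the node
# name from the exec_host line (the part before the first "/").
exec_host_from_qstat() {
    sed -n 's/^[[:space:]]*exec_host = \([^/]*\)\/.*/\1/p' | head -n 1
}

# Usage sketch (needs a working qsub/qstat on the login node, and assumes
# qsub prints the job ID on standard output):
# jobid=$(echo sleep 600 | qsub -l nodes=1:knl:flat)
# ... wait until the job is actually running ...
# node=$(qstat -f "$jobid" | exec_host_from_qstat)
# ssh "$node"
```

If the job has not started yet, qstat -f has no exec_host line, so the helper prints nothing and the node variable ends up empty.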


