@goweiting
Last active November 20, 2017 16:17
Running longjob on DICE machines

A quick overview of how to run longjob on DICE, focusing on keeping a Jupyter notebook running for a long period of time (28 days).

Setup

  1. ssh through the network gateway: ssh sXXX@student.ssh.inf.ed.ac.uk, then ssh into any compute server you wish to use.
  2. create a screen session so that the process isn't killed after you log out:
$ screen -S <session-name> # name the screen session
$ screen -S mlp # e.g. a session named mlp

This opens a new screen terminal (more on screen below).

  3. activate your virtual environment: source activate mlp
  4. Start the longjob:
$ longjob -28day -c <jobname/command> # start a long job executing the jobname/command
$ longjob -28day -c "(nohup nice -n 19 jupyter notebook --no-browser --port=<remoteport>)"

<remoteport> refers to the port that you wish your notebook to run on the remote server
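
To open the notebook in a browser on your own machine, the remote port usually has to be forwarded over ssh. The following is a minimal sketch, not a DICE-documented recipe. It assumes OpenSSH 7.3 or later (for the -J jump-host option) and that the compute server is reachable from the gateway. The helper name tunnel_cmd and the host name compute-server are hypothetical placeholders. The helper only prints the command, so you can check it before running it:

```shell
# Hypothetical helper: prints (does not run) an ssh command that forwards a
# local port to the notebook port on the compute server, via the gateway.
# Assumes OpenSSH 7.3+ for -J; replace the placeholder names with your own.
tunnel_cmd() {
    user="$1"                        # your DICE username, e.g. sXXX
    server="$2"                      # compute server running the notebook (placeholder)
    remote_port="$3"                 # the <remoteport> given to jupyter
    local_port="${4:-$remote_port}"  # local port; defaults to the same number
    printf 'ssh -N -L %s:localhost:%s -J %s@student.ssh.inf.ed.ac.uk %s@%s\n' \
        "$local_port" "$remote_port" "$user" "$user" "$server"
}

tunnel_cmd sXXX compute-server 8888
```

Run the printed command on your own machine (not on DICE), then browse to http://localhost:<localport>.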

Troubleshooting

  1. If you run into problems generating a Kerberos key, it is possible that a cached key already exists. This might happen:
Waiting for job to start...
krenew: unable to run command (nohup: No such file or directory
krenew: error reading ticket cache: No credentials cache found (filename: /tmp/krb5cc_14asdas42427_GBltpqweqweqeF)
krenew: cannot destroy ticket cache: No credentials cache found (filename: /tmp/krb5cc_14asdas42427_GBltasdasdqwepF)

Because of how the longjob command works (see [1]), we can destroy the current key so that a new one will be kinit-ed:

(mlp) $ kdestroy # destroy all the keys generated!

Try running the longjob command again.
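
Before and after kdestroy, it can help to check whether a usable ticket cache exists. This is a small sketch, assuming the MIT Kerberos klist, whose -s option runs silently and reports the result through its exit status. The function name ticket_status is hypothetical:

```shell
# Hypothetical helper: reports whether a valid Kerberos ticket cache exists.
# Assumes MIT Kerberos klist, where -s prints nothing and exits 0 only if
# the credentials cache can be read and has not expired.
ticket_status() {
    if ! command -v klist >/dev/null 2>&1; then
        echo "klist not available"
    elif klist -s; then
        echo "valid ticket cache found"
    else
        echo "no valid ticket cache"
    fi
}

ticket_status
```

If this reports no valid ticket cache right after kdestroy, that is expected. The aim is for the next longjob run to obtain a fresh one.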

screen

Here is a good tutorial: https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/
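
The setup above relies on detaching from the screen session and coming back to it later. To detach without stopping the job, press Ctrl-a, then d. The sketch below lists sessions and prints the reattach command. It uses the session name mlp from the setup step and guards against screen not being installed:

```shell
# Session name from the setup step; change it to match your own.
session=mlp

if command -v screen >/dev/null 2>&1; then
    # List existing sessions; screen may exit non-zero when there are none.
    screen -ls || true
    # -r reattaches to a detached session with the given name.
    echo "reattach with: screen -r $session"
else
    echo "screen is not installed on this machine"
fi
```

After logging back in to the same compute server, screen -r mlp returns you to the terminal where the longjob was started.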

References:

[1] http://computing.help.inf.ed.ac.uk/afs-top-ten-tips
