@vsoch
Last active September 23, 2019 09:48
A quick tutorial on using the forward tool with sherlock/py3-jupyter, transferring a notebook to the cluster first

The user issue was asking how to work with notebooks that are on the host. This is the response, associated with this release


Let's just make sure we are working from the same thing. Make sure your forward repository is up to date with the latest on GitHub, and that you have run setup.sh so that your params.sh has CONTAINERSHARE and RESOURCE variables:

USERNAME="vsochat"
PORT="43453"
PARTITION="russpold"
RESOURCE="sherlock"
MEM="20G"
TIME="8:00:00"
CONTAINERSHARE="/scratch/users/vsochat/share"
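
If you want to double check that params.sh has everything, here is a minimal sketch. Note that check_params is a hypothetical helper (it is not part of forward), and the list of required variables is just the set shown in the example above, so adjust it to your setup.

```shell
# Hypothetical helper, not part of forward: report variables that are
# missing or empty in a params.sh file.
# The function body is a subshell, so sourcing the file does not touch
# the variables of the calling shell.
check_params() (
    params_file="$1"
    # Assumption: these are the variables shown in the example params.sh.
    required="USERNAME PORT PARTITION RESOURCE MEM TIME CONTAINERSHARE"

    [ -f "$params_file" ] || { echo "not found: $params_file" >&2; exit 2; }

    # Clear inherited values so only the file's own settings count.
    unset $required
    # Pass a path containing a slash (e.g. ./params.sh) so that "." does
    # not search PATH for the file.
    . "$params_file"

    status=0
    for name in $required; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            status=1
        fi
    done
    exit "$status"
)
```

Running check_params ./params.sh from the forward repository prints one "missing: NAME" line per absent variable and returns a non-zero status if any are missing.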

You should also have run hosts/ssh_sherlock.sh so that your ssh configuration in ~/.ssh/config looks like:

Host sherlock
    User vsochat
    Hostname sh-ln06.stanford.edu
    GSSAPIDelegateCredentials yes
    GSSAPIAuthentication yes
    ControlMaster auto
    ControlPersist yes
    ControlPath ~/.ssh/%l%r@%h:%p

This means that if you type ssh sherlock, you can issue a command after it! E.g.:

ssh sherlock squeue -u vsochat

The first time you do that in a terminal, you will have to authenticate. The times after that you won't :)
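
If you aren't sure whether that master connection is still open, ssh can tell you with its -O check control command. Here is a small sketch; master_status is a made-up helper name, not something that ships with forward, and it assumes the host alias has a ControlPath configured as in the config above.

```shell
# Hypothetical helper: print "open" if an ssh master connection exists
# for the given host alias, otherwise "closed".
# "ssh -O check" only talks to the local control socket; it does not
# open a new connection to the cluster.
master_status() {
    if ssh -O check "$1" >/dev/null 2>&1; then
        echo "open"
    else
        echo "closed"
    fi
}
```

For example, master_status sherlock tells you whether the next ssh sherlock command will ask you to authenticate, and ssh -O exit sherlock closes the master connection if you want to force a fresh login.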

Okay let's step back for a second and talk about your use case.

Use Case 1: Shared Reproducible Notebook

If you are creating an environment and notebooks that you want to move around, publish, or otherwise share, then you would want to use a jupyter template and build your own container (already with the notebooks inside). You do this by copying repo2docker-julia, adding your notebooks, and building. Then you run the command pointing to your build, e.g.:

bash start.sh sherlock/singularity-notebook docker://<username>/<repository>

And actually, you might still want to do this when your notebook is done and shiny and ready to submit to a paper, but I intuit from your post that you want more of a working environment, brought up on the fly, without much work in advance. So let's talk about this use case (and we can get back to the first when you are ready to publish!)

Use Case 2: Working (Not Reproducible) Notebook

This second use case is what I think you want. It's only non-reproducible because we aren't going to use a container (we will use modules on sherlock, which may not always be there, might change, etc.), and because you want to specify the notebook files sort of "on the fly." The bug I see in what you are describing (and this is also a bug in my documentation not making it clear) is that the folder BPA would need to already be on the cluster somewhere: the path that you provide is relative to the cluster and not the local machine. BUT we can add a quick command to make this easy. Let's write up an example for how we would get this from the host.

Let's say I have a folder at /tmp/analysis with an analysis of interest! This already isn't reproducible because theoretically I've created this notebook with some jupyter notebook on my host that might have a mismatch in kernel with one on the cluster. Let's assume that it's the same. Here is the folder:

tree /tmp/analysis
   numpy_notebook.ipynb

This numpy notebook uses a Python 3 kernel, and its code is pretty simple and useless, but will run something:

import numpy
stuffy = numpy.zeros((4,4))
print(stuffy)

Okay, so now I want to use this on sherlock, using forward. The first thing I want to do is copy the entire directory somewhere on the cluster, and I can use scp for that:

# Here is how we can make a directory to move our stuff to!
ssh sherlock mkdir -p /scratch/users/vsochat/my-analysis

# Now let's copy everything from the local folder there
scp /tmp/analysis/* vsochat@login.sherlock.stanford.edu:/scratch/users/vsochat/my-analysis
numpy_notebook.ipynb                                                                                                        100%  846     0.8KB/s   00:00    

If you want you can do another ssh sherlock command to check that it worked!

$ ssh sherlock ls /scratch/users/vsochat/my-analysis
numpy_notebook.ipynb
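
If you find yourself doing this a lot, the three steps (make the directory, copy, check) could be wrapped up in a function. This is only a sketch: stage_notebooks and the DRY_RUN switch are hypothetical and not part of forward, it assumes the sherlock host alias from the ssh config above, and it assumes paths without spaces.

```shell
# Hypothetical helper, not part of forward.
# Usage: stage_notebooks <local_dir> <remote_dir> [host_alias]
# Set DRY_RUN=1 to print the commands instead of running them.
stage_notebooks() (
    local_dir="$1"
    remote_dir="$2"
    host="${3:-sherlock}"   # assumption: alias defined in ~/.ssh/config

    if [ ! -d "$local_dir" ]; then
        echo "not a directory: $local_dir" >&2
        exit 1
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh $host mkdir -p $remote_dir"
        echo "scp -r $local_dir/* $host:$remote_dir"
        echo "ssh $host ls $remote_dir"
        exit 0
    fi

    # Create the target folder, copy the contents (-r also copies any
    # subfolders), then list the result to confirm it worked.
    ssh "$host" mkdir -p "$remote_dir" &&
        scp -r "$local_dir"/* "$host:$remote_dir" &&
        ssh "$host" ls "$remote_dir"
)
```

With the example above that would be stage_notebooks /tmp/analysis /scratch/users/vsochat/my-analysis, and putting DRY_RUN=1 in front lets you see the commands first.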

Okay cool! Now we want to start the notebook there!

bash start.sh sherlock/py3-jupyter /scratch/users/vsochat/my-analysis

Here is the output. If you don't see something like this, you probably have an older version and should pull from master (I need to do tags / versions properly; I'm developing pretty quickly and haven't yet!)

== Finding Script ==
Looking for sbatches/sherlock/sherlock/py3-jupyter.sbatch
Looking for sbatches/sherlock/py3-jupyter.sbatch
Script      sbatches/sherlock/py3-jupyter.sbatch

== Checking for previous notebook ==
No existing sherlock/py3-jupyter jobs found, continuing...

== Getting destination directory ==

== Uploading sbatch script ==
py3-jupyter.sbatch                                                                                                          100%  146     0.1KB/s   00:00    

== Submitting sbatch ==
sbatch --job-name=sherlock/py3-jupyter --partition=russpold --output=/home/users/vsochat/forward-util/py3-jupyter.sbatch.out --error=/home/users/vsochat/forward-util/py3-jupyter.sbatch.err --mem=20G --time=8:00:00 /home/users/vsochat/forward-util/py3-jupyter.sbatch 43453 "/scratch/users/vsochat/my-analysis"
Submitted batch job 23423516

== View logs in separate terminal ==
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err

== Waiting for job to start, using exponential backoff ==
Attempt 0: not ready yet... retrying in 1..
Attempt 1: not ready yet... retrying in 2..
Attempt 2: resources allocated to sh-01-31!..
sh-01-31
sh-01-31
notebook running on sh-01-31

== Setting up port forwarding ==
ssh -L 43453:localhost:43453 sherlock ssh -L 43453:localhost:43453 -N sh-01-31 &
== Connecting to notebook ==


== View logs in separate terminal ==
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.out
ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err

== Instructions ==
1. Password, output, and error printed to this terminal? Look at logs (see instruction above)
2. Browser: http://sh-02-21.int:43453/ -> http://localhost:43453/...
3. To end session: bash end.sh sherlock/py3-jupyter
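
In case the port forwarding line in that output looks like magic: it is two tunnels chained together. The first ssh -L forwards the port on your machine to the same port on the login node, and the second ssh (which runs on the login node) forwards that port on to the compute node where jupyter is listening; -N just means "don't run a remote command." Here is a sketch that rebuilds the command from its parts. tunnel_command is a hypothetical name for illustration, not a function in forward.

```shell
# Hypothetical helper: print the two-hop tunnel command for a port,
# a login host alias, and the compute node the job landed on.
tunnel_command() {
    port="$1"
    login="$2"
    node="$3"
    echo "ssh -L $port:localhost:$port $login ssh -L $port:localhost:$port -N $node"
}

# Example, with the values from the output above:
#   tunnel_command 43453 sherlock sh-01-31
# prints:
#   ssh -L 43453:localhost:43453 sherlock ssh -L 43453:localhost:43453 -N sh-01-31
```

This can be handy if the tunnel drops while the job is still running and you want to bring it back by hand.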

Now since this isn't a container, the password is the one that I've set up in advance for jupyter notebook (by loading the same module on sherlock and setting the password). Let me know if you haven't done this and need the instructions again; I believe they're in the README. You can check the error log to see that the notebook is running:

$ ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err
[I 15:20:15.124 NotebookApp] Writing notebook server cookie secret to /tmp/jupyter/notebook_cookie_secret
[I 15:20:29.269 NotebookApp] Serving notebooks from local directory: /scratch/users/vsochat/my-analysis
[I 15:20:29.270 NotebookApp] 0 active kernels 
[I 15:20:29.270 NotebookApp] The Jupyter Notebook is running at: http://localhost:43453/
[I 15:20:29.270 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
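
If you only care about the address and not the rest of the log, a little filter can pull it out. This is a sketch that assumes the log contains a "The Jupyter Notebook is running at:" line like the one above (other notebook versions may word it differently); notebook_url is a hypothetical helper name.

```shell
# Hypothetical helper: read a jupyter log on stdin and print the URL
# from the last "The Jupyter Notebook is running at:" line, if any.
notebook_url() {
    sed -n 's/^.*The Jupyter Notebook is running at: *//p' | tail -n 1
}
```

You would pipe the log into it, for example ssh sherlock cat /home/users/vsochat/forward-util/py3-jupyter.sbatch.err | notebook_url. It prints nothing if the notebook hasn't started yet.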

Then when I open the browser (and enter my password) I get the web interface, and there is my little notebook <3

[Image: screenshot of the Jupyter web interface]

That should be the complete instructions to get the functionality that you need. When your notebook is done, you would want to make a container (so you don't have the potential for errors with versioning, etc.).

Let me know if you have more questions!
