This guide documents how we set up an easy workflow for using the IPython Notebook on our compute cluster managed with Sun Grid Engine (SGE).
Summary: We provide cluster users with a script that runs qrsh to schedule an ipython notebook job using SSL and password protection.
Of course IPython has to be installed on the login and compute nodes. We use IPython 1.0 and installed it on the shared filesystem using pip.
Copy the following shell script as notebook to a directory in every user's $PATH. We used /usr/local/bin, which is on a shared filesystem.
#!/bin/bash
qrsh -cwd -V -N notebook \
ipython notebook \
--no-browser \
--ip=\$\(hostname --fqdn\)
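The backslashes in the --ip argument keep the submitting shell from expanding $(hostname --fqdn) itself, so the text reaches qrsh unchanged and is presumably expanded by a shell on the compute node. The sketch below illustrates the difference. It makes two assumptions that are not part of the launcher: fake_qrsh is a hypothetical stand-in for qrsh that simply re-evaluates its arguments in a second local shell, and uname -n stands in for hostname --fqdn.

```shell
#!/bin/bash
# Hypothetical stand-in for qrsh: joins its arguments into one command
# line and hands it to a fresh shell, as a shell on a remote node might.
fake_qrsh() {
    bash -c "$*"
}

# With the backslashes, the first shell passes the text through untouched.
literal=$(echo --ip=\$\(uname -n\))

# The second shell then performs the command substitution.
deferred=$(fake_qrsh echo --ip=\$\(uname -n\))

echo "argument as submitted : $literal"
echo "after second expansion: $deferred"
```

Without the backslashes, the name of the login node would be substituted before the job is even submitted.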
Only a few nodes in our cluster are accessible from our desktop machines. We made a dedicated SGE queue (notebook.q) that includes only these nodes.
#!/bin/bash
QUEUE=notebook.q
qrsh -cwd -V -N notebook -q $QUEUE \
ipython notebook \
--no-browser \
--ip=\$\(hostname --fqdn\)
It is highly recommended to password-protect the IPython Notebook server. If you don't, anyone who can reach the server can connect to any session and run arbitrary Python and shell commands on the cluster.
We store the password hash in .notebook.password in the user's home directory. With the following update, the launcher will create it for the user on the first run:
#!/bin/bash
QUEUE=notebook.q
PASSWORD_FILE=~/.notebook.password
if [ ! -f $PASSWORD_FILE ]; then
python -c 'from IPython.lib import passwd; print(passwd())' | tail -1 > $PASSWORD_FILE
fi
PASSWORD=$(cat $PASSWORD_FILE)
qrsh -cwd -V -N notebook -q $QUEUE \
ipython notebook \
--no-browser \
--ip=\$\(hostname --fqdn\) \
--NotebookApp.password=$PASSWORD
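Two details of the password file may be worth hardening: the launcher does not restrict who can read the hash, and an aborted first run can leave an empty file that the -f test will never regenerate. The following is a minimal sketch of one way to handle both, not part of the original launcher. The helper name store_hash and the placeholder hash are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: write a password hash to a file that only the
# owner can read. Refuses an empty hash and leaves no file behind then.
store_hash() {
    local file=$1 hash=$2
    if [ -z "$hash" ]; then
        rm -f "$file"
        return 1
    fi
    # umask covers a newly created file, chmod covers an existing one.
    ( umask 077; printf '%s\n' "$hash" > "$file" ) && chmod 600 "$file"
}

# Demo in a temporary directory with a placeholder hash.
demo_dir=$(mktemp -d)
demo_file=$demo_dir/.notebook.password
store_hash "$demo_file" 'sha1:0123456789ab:placeholder'
perms=$(ls -l "$demo_file" | cut -c1-10)
echo "permissions: $perms"

if ! store_hash "$demo_dir/empty" ""; then
    echo "empty hash rejected"
fi
```

In the launcher, testing the file with -s instead of -f would also trigger regeneration when the file exists but is empty.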
Although our cluster is only reachable from the institute's internal network, it is still a good idea to communicate with the notebook server over a secure connection. For this, we created a self-signed certificate that is valid for all nodes on which we intend to run the IPython Notebook (using Subject Alternative Names), roughly following this guide.
First create a modified OpenSSL configuration:
cp /etc/ssl/openssl.cnf .
emacs openssl.cnf
Make sure the following content is present:
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = node1.cluster.intern
DNS.2 = node2.cluster.intern
DNS.3 = node3.cluster.intern
Substitute the names of your own cluster nodes.
Generate a private key, a certificate signing request, and a certificate:
openssl genrsa -out notebook.key 2048
openssl req -new -out notebook.csr -key notebook.key -config openssl.cnf
openssl x509 -req -days 3650 -in notebook.csr -signkey notebook.key -out notebook.crt -extensions v3_req -extfile openssl.cnf
When asked for Common Name, enter your first cluster node name.
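Before distributing the certificate, it can help to confirm that the Subject Alternative Names actually ended up in it. The sketch below repeats the three commands non-interactively in a temporary directory and then lists the embedded DNS names. It assumes a trimmed-down configuration with prompt = no and a CN entry (both added here only to avoid the interactive questions), the placeholder node names from above, and a one-day validity for the throwaway certificate.

```shell
#!/bin/bash
# Throwaway demo: everything is created in a temporary directory.
workdir=$(mktemp -d)

cat > "$workdir/openssl.cnf" <<'EOF'
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no

[req_distinguished_name]
CN = node1.cluster.intern

[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[alt_names]
DNS.1 = node1.cluster.intern
DNS.2 = node2.cluster.intern
EOF

# Same three steps as above, with progress messages silenced.
openssl genrsa -out "$workdir/notebook.key" 2048 2>/dev/null
openssl req -new -out "$workdir/notebook.csr" -key "$workdir/notebook.key" \
    -config "$workdir/openssl.cnf"
openssl x509 -req -days 1 -in "$workdir/notebook.csr" \
    -signkey "$workdir/notebook.key" -out "$workdir/notebook.crt" \
    -extensions v3_req -extfile "$workdir/openssl.cnf" 2>/dev/null

# List the DNS names embedded in the certificate.
san_names=$(openssl x509 -in "$workdir/notebook.crt" -noout -text | grep 'DNS:')
echo "$san_names"
```

On a real certificate, only the last command is needed; every node that will run the notebook server should appear in its output.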
Copy notebook.key and notebook.crt somewhere on the cluster filesystem and update the launcher script to use them:
#!/bin/bash
QUEUE=notebook.q
PASSWORD_FILE=~/.notebook.password
CERT_FILE=/usr/local/notebook/cert/notebook.crt
KEY_FILE=/usr/local/notebook/cert/notebook.key
if [ ! -f $PASSWORD_FILE ]; then
python -c 'from IPython.lib import passwd; print(passwd())' | tail -1 > $PASSWORD_FILE
fi
PASSWORD=$(cat $PASSWORD_FILE)
qrsh -cwd -V -N notebook -q $QUEUE \
ipython notebook \
--no-browser \
--ip=\$\(hostname --fqdn\) \
--NotebookApp.password=$PASSWORD \
--NotebookApp.certfile=$CERT_FILE \
--NotebookApp.keyfile=$KEY_FILE
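If the certificate or key cannot be read, the problem only shows up after the job has been scheduled. A launcher could check this up front. The following is a minimal sketch of such a pre-flight check; the function name require_readable and the error message are assumptions for illustration, and the demo uses temporary stand-in files rather than the real paths.

```shell
#!/bin/bash
# Hypothetical pre-flight helper: succeeds only if every given file is
# readable by the current user, and reports the first one that is not.
require_readable() {
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "notebook: cannot read $f" >&2
            return 1
        fi
    done
}

# Demo: the certificate exists, the key is deliberately missing.
demo_dir=$(mktemp -d)
touch "$demo_dir/notebook.crt"
if require_readable "$demo_dir/notebook.crt" "$demo_dir/notebook.key"; then
    preflight=ok
else
    preflight=failed
fi
echo "preflight: $preflight"
```

In the launcher, the same call with $CERT_FILE and $KEY_FILE followed by || exit 1 would stop before qrsh is invoked.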
This setup has a known weakness: every cluster user can read the private key. We accept this trade-off because we don't want every user to have to create their own certificate.
Any user can now start a session by typing notebook and opening the printed link in a local browser. The session is ended by pressing Ctrl-C.