IPython Notebook on an SGE cluster
This guide documents how we set up an easy workflow for using the IPython Notebook on our compute cluster managed with Sun Grid Engine (SGE).
Summary: We provide a script to the cluster users that runs qrsh to schedule an ipython notebook job using SSL and password protection.
Of course IPython has to be installed on the login and compute nodes. We use IPython 1.0 and installed it on the shared filesystem using
Installing the launcher script
Copy the following shell script as notebook to a directory in every user's $PATH. We used /usr/local/bin which is on a shared filesystem.
#!/bin/bash
qrsh -cwd -V -N notebook \
    ipython notebook \
    --no-browser \
    --ip=\$\(hostname --fqdn\)
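The backslashes in the --ip argument matter: they keep the shell on the login node from expanding $(hostname --fqdn) itself, so the text reaches qrsh unexpanded and is presumably evaluated only by the shell that runs the job on the compute node. A minimal sketch of the quoting effect, using a hypothetical show_args function as a stand-in for qrsh:

```shell
#!/bin/bash
# show_args is a hypothetical stand-in for qrsh: it only prints the
# arguments it receives, joined by spaces.
show_args() { printf '%s\n' "$*"; }

# Escaped as in the launcher: $ ( ) are passed on as literal characters,
# so no command substitution happens in this shell.
show_args --ip=\$\(hostname --fqdn\)
# prints: --ip=$(hostname --fqdn)
```

Without the backslashes, the substitution would run immediately in the calling shell and the login node's name would be passed instead.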
Using a dedicated SGE queue
Only a few nodes in our cluster are accessible from our desktop machines. We made a dedicated SGE queue (notebook.q) that includes only these nodes.
#!/bin/bash
QUEUE=notebook.q
qrsh -cwd -V -N notebook -q $QUEUE \
    ipython notebook \
    --no-browser \
    --ip=\$\(hostname --fqdn\)
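If some users need a different queue, the queue name could be made overridable instead of hard-coded. This is only a sketch of one option: NOTEBOOK_QUEUE and pick_queue are hypothetical names that do not appear in our launcher.

```shell
#!/bin/bash
# NOTEBOOK_QUEUE is a hypothetical environment variable (an assumption,
# not part of the launcher above). ${VAR:-default} expands to the default
# when VAR is unset or empty.
pick_queue() {
    echo "${NOTEBOOK_QUEUE:-notebook.q}"
}

QUEUE=$(pick_queue)
echo "Submitting to queue: $QUEUE"
```

A user would then run the launcher with NOTEBOOK_QUEUE set to another queue name, while everyone else keeps the default.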
Protecting the IPython Notebook with a password
It is highly recommended to password-protect the IPython Notebook server. If you don't do this, anyone who can reach the node can connect to any session and run arbitrary Python and shell commands on the cluster.
We store the password hash in .notebook.password in the user's home directory. With the following update, the launcher will create it for the user on the first run:
#!/bin/bash
QUEUE=notebook.q
PASSWORD_FILE=~/.notebook.password
if [ ! -f $PASSWORD_FILE ]; then
    python -c 'from IPython.lib import passwd; print passwd()' | tail -1 > $PASSWORD_FILE
fi
PASSWORD=$(cat $PASSWORD_FILE)
qrsh -cwd -V -N notebook -q $QUEUE \
    ipython notebook \
    --no-browser \
    --ip=\$\(hostname --fqdn\) \
    --NotebookApp.password=$PASSWORD
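Because the launcher stores whatever the last output line of the password prompt was, an aborted or failed prompt may leave a file that is empty or does not contain a hash. A hedged sketch of a sanity check, assuming the sha1:<salt>:<digest> format that IPython.lib.passwd() produces by default; both function names are hypothetical:

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original launcher.

# Succeeds if $1 looks like "sha1:<hex salt>:<40 hex digits>".
is_password_hash() {
    [[ "$1" =~ ^sha1:[0-9a-f]+:[0-9a-f]{40}$ ]]
}

# Succeeds if the file is non-empty and its content is a plausible hash.
check_password_file() {
    local file=$1
    [ -s "$file" ] && is_password_hash "$(cat "$file")"
}
```

The launcher could call check_password_file on $PASSWORD_FILE before starting qrsh and delete the file when the check fails, so that the next run prompts again.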
Using a self-signed certificate
Although our cluster is only reachable from the institute's internal network, it is still a good idea to communicate with the notebook server over a secure connection. For this, we created a self-signed certificate that is valid for all nodes on which we intend to run the IPython Notebook (using Subject Alternative Names), roughly following this guide.
First create a modified OpenSSL configuration:
cp /etc/ssl/openssl.cnf .
emacs openssl.cnf
In the copy, make sure the following content is present:
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = node1.cluster.intern
DNS.2 = node2.cluster.intern
DNS.3 = node3.cluster.intern
Substitute the names of your own cluster nodes.
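With more than a handful of nodes, typing the DNS.<n> entries by hand gets tedious. The section can be generated from a list of host names instead. A small sketch, where alt_names_section is a hypothetical helper and the host names are the placeholders used in this guide:

```shell
#!/bin/bash
# Hypothetical helper: print an [alt_names] section with one DNS.<n>
# entry per host name given as an argument.
alt_names_section() {
    local i=1 host
    echo "[alt_names]"
    for host in "$@"; do
        echo "DNS.$i = $host"
        i=$((i + 1))
    done
}

alt_names_section node1.cluster.intern node2.cluster.intern
```

The output can be pasted into openssl.cnf in place of a hand-written [alt_names] section.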
Generate a private key, a certificate signing request, and a certificate:
openssl genrsa -out notebook.key 2048
openssl req -new -out notebook.csr -key notebook.key -config openssl.cnf
openssl x509 -req -days 3650 -in notebook.csr -signkey notebook.key -out notebook.crt -extensions v3_req -extfile openssl.cnf
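It is worth checking that the Subject Alternative Names actually ended up in the certificate, since a missing -extensions or -extfile option drops them silently. One way is to search the text dump of the certificate for a node name. A sketch, assuming the usual "DNS:<name>, DNS:<name>" formatting of that dump; cert_text_lists_host is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: reads the output of `openssl x509 -noout -text`
# on stdin and succeeds if host $1 is listed as a DNS alternative name.
cert_text_lists_host() {
    # Split the comma-separated SAN list into lines, trim surrounding
    # whitespace, then look for an exact, literal "DNS:<host>" line.
    tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fxq "DNS:$1"
}

# Intended use (requires openssl and the certificate created above):
#   openssl x509 -in notebook.crt -noout -text \
#       | cert_text_lists_host node1.cluster.intern && echo "SAN present"
```

The match is exact and literal, so a prefix such as node1 does not match node1.cluster.intern.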
When asked for Common Name, enter your first cluster node name.
Copy notebook.key and notebook.crt somewhere on the cluster filesystem and update the launcher script to use them:
#!/bin/bash
QUEUE=notebook.q
PASSWORD_FILE=~/.notebook.password
CERT_FILE=/usr/local/notebook/cert/notebook.crt
KEY_FILE=/usr/local/notebook/cert/notebook.key
if [ ! -f $PASSWORD_FILE ]; then
    python -c 'from IPython.lib import passwd; print passwd()' | tail -1 > $PASSWORD_FILE
fi
PASSWORD=$(cat $PASSWORD_FILE)
qrsh -cwd -V -N notebook -q $QUEUE \
    ipython notebook \
    --no-browser \
    --ip=\$\(hostname --fqdn\) \
    --NotebookApp.password=$PASSWORD \
    --NotebookApp.certfile=$CERT_FILE \
    --NotebookApp.keyfile=$KEY_FILE
This setup is a trade-off: every cluster user can read the private key, which could be improved, but in return no user has to create a certificate of their own.
Starting the IPython Notebook
Any user can now start a session by typing notebook and opening the printed link in a local browser. The session is ended by pressing