@widdowquinn
Created December 9, 2017 16:33
Set up JupyterHub on AWS

JupyterHub on AWS

EC2 Setup

  • Log in to AWS
  • Go to a sensible region
  • Start a new instance with Ubuntu Trusty (14.04) - compute-optimised instances have a high vCPU:memory ratio, and the lowest-cost CPU time. c4.2xlarge is a decent choice.
  • Set security group (firewall) to have ports 22, 80, and 443 open (SSH, HTTP, HTTPS)
  • If you want a static IP address (for long-running instances) then select Elastic IP for this VM
  • If you want to use HTTPS, you'll probably need either a paid certificate or a registered, non-Amazon domain name (to avoid region blocking); you can get one through Amazon's Route 53.
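
Once the instance is running, the security group rule above can be checked from your own machine. The following is a minimal sketch, not part of the original notes: `port_is_open` and `check_ports` are hypothetical helper names, and the address in the usage comment is a documentation placeholder. Note that a port only reports as open if something is listening on it, so 80 and 443 may show as closed until a service is running even when the security group allows them.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports=(22, 80, 443)) -> dict:
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port) for port in ports}


# Usage (replace the placeholder with your instance's public or Elastic IP):
#   for port, is_open in check_ports("203.0.113.10").items():
#       print(port, "open" if is_open else "closed or filtered")
```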

Route 53

  • Open Route 53 on Amazon
  • If you don't already have a domain name registered, register/transfer a domain name.
  • This will create a new Hosted Zone and Record Set for you, for the 'parent' domain

To use the parent domain

  • Create a new record set
  • Enter the subdomain name, and choose no for Alias. Enter the IP address for the EC2 instance you set up above.
  • Accept the other defaults (TTL, Routing Policy) and click Create.

To use a new subdomain

  • Create a new Hosted Zone, and give it the full subdomain name (subdomain.parent.domain)
  • Copy the nameservers from the NS (nameserver) Record Set in the subdomain Hosted Zone.
  • Create a new NS (nameserver) Record Set in the parent domain Hosted Zone, with the full subdomain name (subdomain.parent.domain), and paste in the subdomain nameservers you copied above.
  • Return to the subdomain Hosted Zone, and create a new Record Set of type A.
  • Enter the subdomain name, and choose no for Alias. Enter the IP address for the EC2 instance you set up above.
  • Accept the other defaults (TTL, Routing Policy) and click Create.
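
Before requesting certificates, it can help to confirm that the new A record actually points at the instance. This is a small sketch under stated assumptions: `resolves_to` is a hypothetical helper, the hostname and address in the usage comment are placeholders, and only IPv4 resolution through the standard library is checked.

```python
import socket


def resolves_to(hostname: str, expected_ip: str, resolver=socket.gethostbyname) -> bool:
    """Return True if hostname resolves (IPv4) to expected_ip."""
    try:
        return resolver(hostname) == expected_ip
    except socket.gaierror:
        # The name does not resolve (yet); new records can take time to propagate.
        return False


# Usage (placeholders, substitute your own values):
#   resolves_to("subdomain.parent.domain", "203.0.113.10")
```

The `resolver` parameter exists only so the check can be exercised without network access; in normal use the default is enough.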

Set up server

  • SSH into your new server
  • Create server directory, and perform some updates
sudo mkdir /srv/jupyterhub
sudo chown -R ubuntu:ubuntu /srv/jupyterhub
sudo apt-get update
sudo apt-get install git
  • Get SSL keys using LetsEncrypt (this requires a registered domain name)
git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
./letsencrypt-auto certonly --standalone -v -d <host.domain.name>
# You'll need to enter an email, and I'd recommend sharing info with EFF
  • Store the keys in the server directory
mkdir /srv/jupyterhub/ssl
sudo cp /etc/letsencrypt/live/<host.domain.name>/fullchain.pem /etc/letsencrypt/live/<host.domain.name>/privkey.pem /srv/jupyterhub/ssl
  • Install and start Docker
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo "deb https://apt.dockerproject.org/repo ubuntu-trusty main" | sudo tee /etc/apt/sources.list.d/docker.list >/dev/null
sudo apt-get update
sudo apt-get install docker-engine
sudo usermod -aG docker ubuntu
sudo service docker start
  • Log out and then back in, and test docker:
docker run hello-world
  • Install pip for Python 3
sudo apt-get install python3-pip
  • Install npm and its dependencies
sudo apt-get install npm nodejs-legacy
sudo npm install -g configurable-http-proxy
  • Install jupyterhub
sudo pip3 install jupyterhub
sudo pip3 install --upgrade notebook
  • Install and configure OAuth
sudo pip3 install oauthenticator
  • Visit https://github.com/settings/applications/new and enter the following:

  • Application name: something to identify your site

  • Homepage URL: https://<your_host>

  • Application description: some text describing your site

  • Callback URL: https://<your_host>/hub/oauth_callback

  • Click on Register application

  • Create a new jupyterhub_config.py file:

jupyterhub --generate-config
  • The settings required are:
# jupyterhub_config.py
c = get_config()
        
import os
pjoin = os.path.join

runtime_dir = pjoin('/srv/jupyterhub')
ssl_dir = pjoin(runtime_dir, 'ssl')
if not os.path.exists(ssl_dir):
    os.makedirs(ssl_dir)

# https on :8443
c.JupyterHub.port = 8443
c.JupyterHub.ssl_key = '/etc/letsencrypt/live/<HOSTNAME>/privkey.pem'
c.JupyterHub.ssl_cert = '/etc/letsencrypt/live/<HOSTNAME>/fullchain.pem'

# put the JupyterHub cookie secret and state db
# in runtime_dir (/srv/jupyterhub)
c.JupyterHub.cookie_secret_file = pjoin(runtime_dir, 'cookie_secret')
c.JupyterHub.db_url = pjoin(runtime_dir, 'jupyterhub.sqlite')
# or `--db=/path/to/jupyterhub.sqlite` on the command-line

# use GitHub OAuthenticator for local users

c.JupyterHub.authenticator_class = 'oauthenticator.LocalGitHubOAuthenticator'
c.GitHubOAuthenticator.oauth_callback_url = 'https://<HOSTNAME>/hub/oauth_callback'
c.GitHubOAuthenticator.client_id = '<GITHUB_CLIENT_ID>'
c.GitHubOAuthenticator.client_secret = '<GITHUB_CLIENT_SECRET>'

# specify users and admin
c.Authenticator.whitelist = {'<USERNAME>', }
c.Authenticator.admin_users = {'<USERNAME>', }

# start single-user notebook servers in ~/assignments,
# with ~/assignments/Welcome.ipynb as the default landing page
c.Spawner.notebook_dir = '~/assignments'
c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']

c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'
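
Hard-coding the GitHub client ID and secret in the config file is easy to get wrong and easy to leak. One possible alternative, sketched here as an assumption rather than as part of the original setup, is to read them from environment variables and fail at startup if either is missing. `require_env` is a hypothetical helper, and the variable names `GITHUB_CLIENT_ID` and `GITHUB_CLIENT_SECRET` are a naming choice for this example; the service that launches JupyterHub would need to export them.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, failing loudly if unset."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# In jupyterhub_config.py this could replace the hard-coded credentials
# (assumed variable names, set them in the environment of the hub process):
#   c.GitHubOAuthenticator.client_id = require_env('GITHUB_CLIENT_ID')
#   c.GitHubOAuthenticator.client_secret = require_env('GITHUB_CLIENT_SECRET')
```

Failing early with a clear message is usually easier to debug than an OAuth error in the browser after the hub has started.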
  • Redirect the HTTPS port (443) to port 8443:
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to 8443
  • Configure JupyterHub as a service
wget https://gist.githubusercontent.com/lambdalisue/f01c5a65e81100356379/raw/ecf427429f07a6c2d6c5c42198cc58d4e332b425/jupyterhub
sudo mv jupyterhub /etc/init.d/jupyterhub
sudo mkdir /etc/jupyterhub
sudo jupyterhub --generate-config -f /etc/jupyterhub/jupyterhub_config.py
  • Allow notebook widget extensions
sudo pip3 install ipywidgets
sudo jupyter nbextension enable --py --sys-prefix widgetsnbextension
  • Start JupyterHub
sudo service jupyterhub start
  • Logging in
$ ssh -i ".ssh/<YOUR_PEM>.pem" <YOUR_SERVER>
$ ssh <USERNAME>@<HOSTNAME>
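
After the service has been started, one way to confirm that the Hub is answering over HTTPS is to query the root of its REST API, which reports the running version. This is a rough sketch, assuming the Hub is reachable at the default `/hub/` prefix and that the certificate is trusted by your machine; `hub_version` is a hypothetical helper and the URL in the usage comment is a placeholder.

```python
import json
from urllib.request import urlopen


def hub_version(base_url: str, opener=urlopen, timeout: float = 10.0) -> str:
    """Fetch the version reported by the Hub's REST API root (<base_url>/hub/api/)."""
    url = base_url.rstrip("/") + "/hub/api/"
    with opener(url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["version"]


# Usage (placeholder hostname, substitute your own):
#   print(hub_version("https://subdomain.parent.domain"))
```

If this call raises a connection or certificate error, the port redirect, the SSL paths in the config, and /var/log/jupyterhub.log are reasonable places to look first.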
@btomtom5

wow! Thank you so much. This helped a ton!

@itzceekay

Hey, thanks, you saved my capstone project. But I get a "400 : Bad Request
OAuth state missing from cookies" error when I log in using my GitHub ID. Have you experienced this before?

@maneeshdisodia

Great work, sir.
I'm stuck on a "500 : Internal Server Error", even though I have replicated all the steps mentioned in the gist.
The service setup also needs some more clarification.
Thanks

@yuyueugene84

yuyueugene84 commented Mar 24, 2019

Thanks for writing this guide!

Make sure the nodejs you installed supports ES6 syntax, otherwise you will get an error like this: winstonjs/winston#1256

I suggest you add the following into this guide:

curl -sL https://deb.nodesource.com/setup_10.x | sudo -E bash -
sudo apt-get install -y nodejs

Cheers!

@JeremyMcCormick

My god, what a configuration nightmare. :(

@phaustin

Note the official docs now include AWS: https://zero-to-jupyterhub.readthedocs.io/en/latest/

@anujonthemove

Where did you make use of docker in the whole process?

@widdowquinn
Author

Fair question, @anujonthemove. I don't remember. These were notes to myself for setting up JupyterHub for students to use in a training course. It's possible that the Docker instructions are there so I didn't forget them, but that I installed it for a completely different reason.
