I found it rather difficult to set up an AWS EC2 server and use fast.ai in a Jupyter notebook on it, especially since I use a Windows 10 computer (with a permanent VPN). I therefore documented the steps that made it work for me. Most of it is a combination of the fast.ai tutorial, the AWS tutorials, and the very helpful tutorial by Baligh Mnassri (https://github.com/mnassrib).
Follow steps 1 to 5 detailed here: https://course.fast.ai/start_aws
The SSH connection is made with PuTTY (https://www.putty.org/).
PuTTY cannot use the .pem private key file directly; it first needs to be converted to PuTTY's own .ppk format. The following tutorial is from AWS (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html)
- From the Start menu, choose All Programs, PuTTY, PuTTYgen.
- Under Type of key to generate, choose RSA. If your version of PuTTYgen does not include this option, choose SSH-2 RSA.
- Choose Load. By default, PuTTYgen displays only files with the extension .ppk. To locate your .pem file, choose the option to display files of all types.
- Select your .pem file for the key pair that you specified when you launched your instance and choose Open. PuTTYgen displays a notice that the .pem file was successfully imported. Choose OK.
- To save the key in the format that PuTTY can use, choose Save private key. PuTTYgen displays a warning about saving the key without a passphrase. Choose Yes.
- Specify the same name for the key that you used for the key pair (for example, my-key-pair) and choose Save. PuTTY automatically adds the .ppk file extension.
Your private key is now in the correct format for use with PuTTY. You can now connect to your instance using PuTTY's SSH client.
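If you are unsure which format a key file is in, the first line of the file tells you. The following is a minimal Python sketch based on that: PEM files start with `-----BEGIN`, and PuTTY key files start with `PuTTY-User-Key-File-`. The function name `detect_key_format` is my own invention and not part of PuTTY or the AWS tooling.

```python
def detect_key_format(first_line: str) -> str:
    """Guess the private key format from the first line of the key file."""
    line = first_line.strip()
    if line.startswith("PuTTY-User-Key-File-"):
        return "ppk"  # PuTTY's own format, usable by PuTTY directly
    if line.startswith("-----BEGIN"):
        return "pem"  # PEM format, must be converted with PuTTYgen first
    return "unknown"


# Example with literal header lines instead of real key files.
print(detect_key_format("-----BEGIN RSA PRIVATE KEY-----"))  # pem
print(detect_key_format("PuTTY-User-Key-File-2: ssh-rsa"))   # ppk
```

With a real file you would pass its first line, for example `detect_key_format(open("my-key-pair.ppk").readline())`.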
The following steps for connecting to the instance are from the same AWS tutorial (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html)
- Start PuTTY (from the Start menu, choose All Programs, PuTTY, PuTTY).
- In the Category pane, choose Session and complete the following fields:
- In the Host Name box, enter my-instance-user-name@my-instance-public-dns-name, using your instance's public DNS name.
- Ensure that the Port value is 22.
- Under Connection type, select SSH.
- (Optional) You can configure PuTTY to automatically send 'keepalive' data at regular intervals to keep the session active. This is useful to avoid disconnecting from your instance due to session inactivity. In the Category pane, choose Connection, and then enter the required interval in the Seconds between keepalives field. For example, if your session disconnects after 10 minutes of inactivity, enter 180 to configure PuTTY to send keepalive data every 3 minutes.
- In the Category pane, expand Connection, expand SSH, and then choose Auth. Complete the following:
- Choose Browse.
- Select the .ppk file that you generated for your key pair and choose Open.
- In the Category pane, expand Connection, and then choose Data. Add ubuntu to the Auto-login username field.
- (Optional) If you plan to start this session again later, you can save the session information for future use. Under Category, choose Session, enter a name for the session in Saved Sessions, and then choose Save.
- Choose Open.
The following part is mainly copied from https://course.fast.ai/start_aws
sudo apt update && sudo apt -y install git
git clone https://github.com/fastai/fastsetup.git
cd fastsetup
sudo ./ubuntu-initial.sh
Reboot when prompted. Then reconnect using SSH and run:
cd fastsetup
./setup-conda.sh
source ~/.bashrc
There is currently a problem when installing mamba with Python 3.9, which the fast.ai scripts install by default. Therefore, I downgraded miniconda to Python 3.8 (source: https://stackoverflow.com/a/53300120/6220045):
conda install python=3.8
Afterwards, you can install mamba:
conda install -yq mamba
Find out which NVIDIA drivers you need:
ubuntu-drivers devices
... and install them. There is one entry with "recommended" at the end. Use the corresponding version number and add the -server suffix. For me this was 470:
sudo apt-fast install -y nvidia-driver-470-server
sudo modprobe nvidia
nvidia-smi
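The rule for picking the package name (take the version number of the "recommended" entry and append -server) can be sketched in Python as follows. The sample output is only an illustration that I made up to resemble what `ubuntu-drivers devices` prints; your real output will differ. The function name `recommended_server_package` is likewise hypothetical.

```python
import re
from typing import Optional

# Illustrative sample only; the real output of `ubuntu-drivers devices` will differ.
SAMPLE_OUTPUT = """\
vendor   : NVIDIA Corporation
driver   : nvidia-driver-460 - distro non-free
driver   : nvidia-driver-470 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin
"""


def recommended_server_package(output: str) -> Optional[str]:
    """Return the -server package name for the driver marked 'recommended'."""
    for line in output.splitlines():
        if line.rstrip().endswith("recommended"):
            match = re.search(r"nvidia-driver-(\d+)", line)
            if match:
                return f"nvidia-driver-{match.group(1)}-server"
    return None  # no recommended NVIDIA driver found


print(recommended_server_package(SAMPLE_OUTPUT))  # nvidia-driver-470-server
```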
Now you’re ready to install all needed packages for the fast.ai course:
mamba install -y fastbook
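As an optional check that the GPU is also visible from Python, something like the following sketch can be run on the instance. It assumes that PyTorch was installed as a dependency of fastai; the helper name `gpu_status` is mine. If PyTorch is missing, the function reports that instead of failing.

```python
def gpu_status() -> str:
    """Report whether PyTorch can see a CUDA GPU."""
    try:
        import torch  # assumed to be installed as a dependency of fastai
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return f"GPU found: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA GPU is visible"


print(gpu_status())
```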
To download the notebooks, run:
cd
git clone https://github.com/fastai/fastbook
This part of the instructions differs from the fast.ai tutorial because I was not able to connect to the locally run notebook.
The tutorial is mainly copied from https://mnassrib.github.io/jupyter-putty-aws-ec2/
- Re-load your .bashrc:
source .bashrc
- Create a password hash:
ipython
from IPython.lib import passwd
passwd()
Enter a password when prompted. Note the password and the displayed SHA1 hash (for example sha1:592a57cc3224:f190f1a25eb5f878e329f5...), then leave IPython:
exit()
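The hash string has three parts separated by colons: the algorithm, a random salt, and the digest. As far as I understand this legacy format, the digest is computed over the password followed by the salt. The sketch below rebuilds and checks such a string with the standard library only. It is an illustration of the format under that assumption, not the actual IPython or Jupyter code, and the names `make_hash` and `check_password` are hypothetical. For the real setup, keep using `passwd()` as described above.

```python
import hashlib
import hmac


def make_hash(passphrase: str, salt: str, algorithm: str = "sha1") -> str:
    """Build an 'algorithm:salt:digest' string (assumed legacy format)."""
    h = hashlib.new(algorithm)
    h.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, h.hexdigest()))


def check_password(hashed: str, passphrase: str) -> bool:
    """Check a passphrase against an 'algorithm:salt:digest' string."""
    try:
        algorithm, salt, digest = hashed.split(":", 2)
        expected = make_hash(passphrase, salt, algorithm)
    except ValueError:
        return False  # malformed string or unknown algorithm
    return hmac.compare_digest(expected, ":".join((algorithm, salt, digest)))


# The salt here is fixed for the example; a real hash uses a random salt.
example = make_hash("my secret", salt="0123456789ab")
print(check_password(example, "my secret"))  # True
print(check_password(example, "wrong"))      # False
```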
- Create the configuration profile for your Jupyter notebook server:
jupyter notebook --generate-config
- Create an SSL certificate:
mkdir certs
cd certs
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mycert.key -out mycert.pem
sudo chmod 400 mycert.pem
- Update the Jupyter configuration:
cd ~/.jupyter/
nano jupyter_notebook_config.py
- Add the following lines at the end of the file. Replace the SHA1 hash in the last line with yours:
c = get_config()
# Kernel config
c.IPKernelApp.pylab = 'inline'  # if you want plotting support always in your notebook
# Notebook config
c.NotebookApp.certfile = u'/home/ubuntu/certs/mycert.pem'  # location of your certificate file
c.NotebookApp.keyfile = u'/home/ubuntu/certs/mycert.key'  # location of your certificate key
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False  # so that the ipython notebook does not open a browser by default
c.NotebookApp.password = u'sha1:68c136a5b064...'  # the encrypted password you generated above
The following step seems to be unnecessary, but I list it for completeness. In order to be able to connect to the Jupyter notebook server on port 8888, you can tell the firewall to allow it:
sudo ufw allow 8888
Tutorial copied from AWS (https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-connect-master-node-proxy.html#switchyomega).
The following example demonstrates how to set up the SwitchyOmega extension for Google Chrome. SwitchyOmega lets you configure, manage, and switch between multiple proxies.
- Install SwitchyOmega from the Google Chrome extensions store.
- Choose New profile and enter emr-socks-proxy as the profile name.
- Choose PAC profile and then Create. Proxy Auto-Configuration (PAC) files help you define an allow list for browser requests that should be forwarded to a web proxy server.
- In the PAC Script field, replace the contents with the following script that defines which URLs should be forwarded through your web proxy server.
function FindProxyForURL(url, host) {
  if (shExpMatch(url, "*ec2*.amazonaws.com*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "*ec2*.compute*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "http://10.*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "*10*.compute*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "*10*.amazonaws.com*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "*.compute.internal*")) return 'SOCKS5 localhost:8157';
  if (shExpMatch(url, "*ec2.internal*")) return 'SOCKS5 localhost:8157';
  return 'DIRECT';
}
- Under Actions, choose Apply changes to save your proxy settings.
- On the Chrome toolbar, choose SwitchyOmega and select the emr-socks-proxy profile.
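To see which URLs the PAC script sends through the proxy, its matching rule can be approximated in Python. This is only a sketch: it uses `fnmatch.fnmatchcase` as a stand-in for the browser's `shExpMatch`, which is close enough for the simple `*` patterns used here. The example hostname is made up.

```python
from fnmatch import fnmatchcase

PROXY = "SOCKS5 localhost:8157"

# The same patterns as in the PAC script.
PATTERNS = [
    "*ec2*.amazonaws.com*",
    "*ec2*.compute*",
    "http://10.*",
    "*10*.compute*",
    "*10*.amazonaws.com*",
    "*.compute.internal*",
    "*ec2.internal*",
]


def find_proxy_for_url(url: str) -> str:
    """Return the proxy string if any pattern matches, otherwise 'DIRECT'."""
    if any(fnmatchcase(url, pattern) for pattern in PATTERNS):
        return PROXY
    return "DIRECT"


# Made-up EC2 hostname: goes through the SOCKS proxy.
print(find_proxy_for_url("https://ec2-203-0-113-25.compute-1.amazonaws.com:8888/"))
# Any other site: connects directly.
print(find_proxy_for_url("https://course.fast.ai/start_aws"))
```

The practical effect is that only traffic to EC2 hosts is tunneled, while all other browsing bypasses the proxy.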
- In PuTTY's Category pane, expand Connection, expand SSH, and then choose Tunnels.
- Complete the following to add a new forwarded port. This creates the SOCKS proxy on localhost:8157 that the PAC script refers to:
- Source port = 8157
- Select 'Dynamic'
- Keep 'Auto'
- Choose 'Add'
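For reference, the same dynamic port forwarding can also be requested from the command line with plink, the command-line tool that ships with PuTTY. The Python sketch below only assembles such a command; it is not the setup used in this guide. The helper name `plink_tunnel_command` is hypothetical, and the host and key file names are the placeholders used earlier.

```python
from typing import List


def plink_tunnel_command(host: str, key_file: str, user: str = "ubuntu",
                         ssh_port: int = 22, socks_port: int = 8157) -> List[str]:
    """Assemble a plink command that opens a dynamic (SOCKS) tunnel."""
    return [
        "plink", "-ssh",
        "-i", key_file,          # the .ppk private key
        "-P", str(ssh_port),     # SSH port on the instance
        "-D", str(socks_port),   # local port for dynamic forwarding
        "-N",                    # do not start a shell, tunnel only
        f"{user}@{host}",
    ]


command = plink_tunnel_command("my-instance-public-dns-name", "my-key-pair.ppk")
print(" ".join(command))
```

Because `-N` opens no shell, you would still need a regular PuTTY session to start the notebook server on the instance.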
Finally, start the Jupyter notebook server on the instance:
cd ~/fastbook
jupyter notebook
Open Google Chrome and connect to Jupyter using your instance's public DNS name: https://master-public-dns-name:8888/. Ignore the security warning, which is caused by the self-signed certificate.
Enjoy.