gushart/gcp_setup.md

## gcp_setup.md

      
GCP setup for deep learning

There are many guides on the Internet on how to get started with deep learning on GCP, but all of them contain mistakes, are incomplete, or are out of date. I tried to collect the most important information on how to launch a VM instance and start working with deep learning.
Please cite this gist or my GitHub profile if you use it or part of it.

I am also ready to answer your questions, and I would be very happy if you rate my work. Thank you.
Contents:

VM setup starting steps

Creating and configuring a VM instance

Workflow setting up
VM setting up

Starting steps

1. Go to console.cloud.google.com

2. Accept the terms of use.

3. Click Activate in the top-left corner.

4. Fill in all the Customer info fields, including the business address (you can fill it in as you want) and the payment method. Confirm, and you will be redirected to the main page of GCP.

5. Now select Compute Engine in the left-side panel (Navigation menu).

6. Click Enable billing so that you can create and use VM instances.

7. Optional, if you want to use a GPU:

Go to Navigation menu -> IAM & admin -> Quotas. Here you must click Upgrade account. Then click Metric -> None, type GPU in the search bar, and select GPUs (all regions).

Then click the Edit Quotas button and choose the selected GPU quota. Enter your phone number and go to the next step. Then enter the number of GPUs you want to request and state the reason why you need them (studying Keras, for example). Click Submit request.

Your request should be approved within 2-4 hours.

Creating and configuring a VM instance

1. Go to Compute Engine in the Navigation menu and choose VM instances.

2. Click Create.

3. Now you are free to configure your VM instance.

4. Allow HTTP/HTTPS traffic and click Create.


5. Static IP configuration:

Go to Navigation menu -> VPC network -> External IP addresses. You need to reserve a static IP address: click Ephemeral and change the type to Static.

6. Firewall configuration to work with Jupyter Notebook:

Go to Navigation menu -> VPC network -> Firewall rules. Click Create firewall rule.
Here we configure the direction of traffic (we need Ingress) and the action on match (Allow). You can configure the targets of this rule; as an example, I chose All instances in the network. Set the source filter to IP ranges and the source IP ranges value to 0.0.0.0/0 (any address).
Finally, specify the TCP port through which you will connect to the Jupyter Notebook server.

Be careful not to use the default TCP ports of other services (HTTP 80, HTTPS 443, SSH 22).
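The port rule above can be checked before you create the firewall rule. The following is a minimal sketch, not part of the original setup: `check_jupyter_port` is a hypothetical helper, and the extra warning about ports below 1024 is my own assumption about what makes a port inconvenient on Linux.

```python
# Default ports the guide warns about: SSH, HTTP, HTTPS.
RESERVED_PORTS = {22: "ssh", 80: "http", 443: "https"}


def check_jupyter_port(port):
    """Return a reason string if the port is a poor choice, else None."""
    if not 1 <= port <= 65535:
        return "outside the valid TCP range 1-65535"
    if port in RESERVED_PORTS:
        return "already the default port for " + RESERVED_PORTS[port]
    if port < 1024:
        # Assumption: privileged ports need root to bind on Linux.
        return "privileged port (needs root to bind on Linux)"
    return None


print(check_jupyter_port(8890))  # the example port used later in this guide
print(check_jupyter_port(443))
```

The first call prints None (the port is fine); the second prints the reason why 443 should be avoided.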

Workflow setting up

I recommend using the tmux utility to run your tasks in parallel.

1. Establishing a connection to a remote server:

You can do it in two ways.
The first one:

Install and configure the Google Cloud SDK (the gcloud command-line tool) on your local machine.
The process is simple enough that I do not describe it here.
After installation, you can connect to the VM instance with a single gcloud command, shown in the console for your instance.

Copy the command and paste it into the terminal.

The second:

Enter the following command in the terminal to start the key generation process: ssh-keygen -t rsa

Now run the following command and copy the output (your public SSH key): cat ~/.ssh/id_rsa.pub

Now go to Compute Engine -> Metadata. Click the SSH keys field -> Edit -> Add item, paste your SSH key, and save it.

To check that SSH works, run in your terminal:

ssh <VM IP address>
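When a key is pasted into the console, the login name is, as far as I know, taken from the comment at the end of the public key (the part before @), so it can help to look at what your key contains before running ssh. The sketch below only parses a public key line. `parse_public_key` and `login_name` are hypothetical helpers, and the sample key is a dummy value, not a real key.

```python
def parse_public_key(line):
    """Split an OpenSSH public key line into (key_type, key_body, comment)."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64 key> [comment]'")
    key_type, key_body = parts[0], parts[1]
    comment = parts[2] if len(parts) == 3 else ""
    return key_type, key_body, comment


def login_name(comment):
    """Assumption: the login name is the comment text before '@'."""
    return comment.split("@", 1)[0]


# Dummy key for illustration; in practice read the line from ~/.ssh/id_rsa.pub
sample = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC7 alice@laptop"
key_type, _, comment = parse_public_key(sample)
print(key_type, login_name(comment))
```

With the dummy key this prints `ssh-rsa alice`, in which case you would connect with ssh alice@<VM IP address>.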

2. Anaconda download & install:

 wget https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
 sh Anaconda3-2019.03-Linux-x86_64.sh

At the end of the installation, choose yes to initialize Anaconda by running conda init.

For the changes to take effect, close and re-open your current shell.

3. Additional packages:

conda install keras tensorflow tensorflow-gpu

Tip: you can create a new tmux session and run the command in it, for example:
 tmux new -s p

4. Jupyter notebook server:

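jupyter notebook --generate-config

The command prints the path of the generated config file (by default ~/.jupyter/jupyter_notebook_config.py). Open it in nano:

nano ~/.jupyter/jupyter_notebook_config.py

Insert these lines into the config file:
 c.NotebookApp.ip = '*'
 c.NotebookApp.open_browser = False
 c.NotebookApp.allow_remote_access = True
 c.NotebookApp.port = 8890  # replace 8890 with your configured TCP port

That is all; save and close the file.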
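If you prefer not to edit the file by hand, the same settings can be appended with a few lines of Python. This is a sketch under assumptions: `render_jupyter_settings` and `append_settings` are hypothetical helper names, and 8890 is only the example port that appears later in this guide.

```python
from pathlib import Path


def render_jupyter_settings(port):
    """Build the config lines from this step for the given TCP port."""
    lines = [
        "c.NotebookApp.ip = '*'",
        "c.NotebookApp.open_browser = False",
        "c.NotebookApp.allow_remote_access = True",
        "c.NotebookApp.port = {}".format(int(port)),
    ]
    return "\n".join(lines) + "\n"


def append_settings(config_path, port):
    """Append the settings to the config file, creating it if needed."""
    path = Path(config_path).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write("\n" + render_jupyter_settings(port))
    return path


# Preview the lines for the example port; nothing is written here.
print(render_jupyter_settings(8890), end="")
```

To actually write the file, call append_settings("~/.jupyter/jupyter_notebook_config.py", 8890) on the VM, replacing the port with the one you opened in the firewall rule.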

Tip (less secure): set a Jupyter password so that you do not have to use the Jupyter token:
  jupyter notebook password

Now you can test your Jupyter server in two steps:
1. Run jupyter notebook on your VM instance.

2. Open a browser on your local machine and enter the remote server's IP address and the reserved port as <IP address>:<port>,

for example: 31.154.45.132:8890
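If the page does not load, it helps to know whether the port is reachable at all, which separates firewall problems from Jupyter problems. A minimal sketch using only the standard library follows; `port_is_open` is a hypothetical helper name and the 3-second timeout is an arbitrary choice.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call from your local machine (address and port from this guide):
# port_is_open("31.154.45.132", 8890)
```

A result of False while Jupyter is running on the VM suggests that the firewall rule or the static IP is not configured as expected.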
5. CUDA download, installation & verification:

wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

To verify that the NVIDIA driver and CUDA are installed and running properly:
nvidia-smi

If everything is properly installed and running on your Linux system, you will see a table listing your GPU, the driver version, and the CUDA version.
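The same check can be scripted, for example to run it after every VM restart. The sketch below assumes nothing beyond nvidia-smi being on the PATH when the driver is installed; `query_gpus` is a hypothetical helper name.

```python
import shutil
import subprocess


def query_gpus():
    """Return a list of 'name, driver_version' strings, or None on failure."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Driver not installed, or nvidia-smi is not on the PATH.
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(query_gpus())
```

On a machine without a working driver this prints None; otherwise it prints one entry per GPU.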

6. cuDNN download, installation & verification:

Go to the cuDNN page and create an NVIDIA Developer account. Download the cuDNN libraries (runtime and dev) and the code samples for your Ubuntu version.
wget <your download link for the runtime library>
wget <your download link for the dev library>
wget <your download link for the code samples>
Install the libraries and code samples with dpkg:
sudo dpkg -i libcudnn7_7.6.0.64-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.6.0.64-1+cuda10.1_amd64.deb
sudo dpkg -i libcudnn7-doc_7.6.0.64-1+cuda10.1_amd64.deb

To verify that cuDNN is installed and running properly:
cp -r /usr/src/cudnn_samples_v7/ $HOME
cd  $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN

If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
Test passed!

7. Testing that TensorFlow uses the GPU:

python3 -c 'import tensorflow as tf; print(tf.test.is_gpu_available())'
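The one-liner above only prints True or False. A slightly longer sketch can also list the devices TensorFlow sees. It is written defensively because the available API depends on the installed TensorFlow version: it tries tf.config.list_physical_devices, then the experimental variant, and falls back to the call used above. `gpu_report` is a hypothetical helper name.

```python
def gpu_report():
    """Describe the GPUs TensorFlow can see, tolerating old and new APIs."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow is not installed in this environment"

    config = getattr(tf, "config", None)
    lister = getattr(config, "list_physical_devices", None)
    if lister is None:
        experimental = getattr(config, "experimental", None)
        lister = getattr(experimental, "list_physical_devices", None)

    if lister is not None:
        gpus = lister("GPU")
        return "{} GPU(s) visible: {}".format(len(gpus), [g.name for g in gpus])

    # Oldest fallback: the same call as in the one-liner above.
    return "GPU available: {}".format(tf.test.is_gpu_available())


print(gpu_report())
```

If the report shows zero GPUs although nvidia-smi works, the CUDA or cuDNN version most likely does not match what the installed TensorFlow build expects.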

8. Saving config and data:

After all installations, you can take a snapshot of your disk if you want to transfer all settings to a new VM instance.

Go to Compute Engine -> Snapshots and click Create snapshot.
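If the Google Cloud SDK is installed, the same snapshot can be taken from the command line with gcloud compute disks snapshot. The following minimal sketch only builds the command. The disk name, zone, and snapshot name prefix are placeholders I made up, so substitute the values of your own instance; `snapshot_command` is a hypothetical helper.

```python
import datetime


def snapshot_command(disk, zone, prefix="dl-setup", today=None):
    """Build the gcloud argument list for snapshotting one disk."""
    today = today or datetime.date.today()
    # Snapshot name such as dl-setup-20190601 (prefix plus date).
    name = "{}-{}".format(prefix, today.strftime("%Y%m%d"))
    return [
        "gcloud", "compute", "disks", "snapshot", disk,
        "--snapshot-names", name,
        "--zone", zone,
    ]


# Hypothetical disk name and zone; print the command instead of running it.
print(" ".join(snapshot_command("instance-1", "us-west1-b")))
```

To create the snapshot, pass the returned list to subprocess.run(cmd, check=True) on a machine where gcloud is authenticated.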