@tnguyenv

tnguyenv/blog.md

Created November 29, 2018 15:18

How to setup a deep-learning-ready server with Intel NUC 8 + Nvidia eGPU

Interested in learning deep learning with PyTorch and/or Fast.ai? The easiest way to get started is to use a cloud-based solution (Google it, there are a lot!). However, if you want to invest for the longer run, or simply want to get your hands dirty setting up a personal server, this post is for you!

After spending quite a lot of time researching and setting up my new Intel NUC Hades Canyon with an Nvidia GTX 1080 eGPU, I decided to write this blog so that people like me can save some time and get up and running quickly.

To make the steps easy to follow, I have outlined the general process below. However, before answering 'how', I would like to explain 'why' first. Feel free to skip the first section if you trust my decisions. ;)

  1. Why?
  2. Install Ubuntu Server 18.04
  3. Install Bolt
  4. Install Nvidia driver
  5. Install Docker CE
  6. Install Nvidia Docker
  7. Run a ML-ready docker image

1. Why eGPU, Intel NUC, and Docker?

External GPU

The name already implies its advantage: portability. With a single gaming box, I can use it with any laptop that supports Thunderbolt/USB-C (e.g., MacBook, NUC 7/8) without worrying about compatibility. Here I use the Aorus GTX 1080 Gaming Box.

Intel NUC 8 Hades Canyon

This mini PC brings you an Intel Core i7 processor, AMD Vega graphics power, and a spicy port mix. PCMag pointed out the NUC 8's top pros:

  • Compact and quiet-running
  • Excellent overall CPU and GPU performance
  • AMD Vega graphics are VR-ready
  • Bristling with connectivity for its size (of course, it has a pair of Thunderbolt/USB-C ports)
  • Dual M.2 slots. VESA-mountable chassis

Docker

I would love to refer you to another blog post, “How Docker Can Help You Become A More Effective Data Scientist”, which explains everything you need to know about Docker and, of course, why it is so important for data scientists. If you don't have time, here is my brief explanation:

Docker is like a virtual machine, but we call it a container. It is worth mentioning that these are actually two different technologies (virtualization vs. containerization). Containers are more efficient at resource allocation: you don't have to carve out 4GB of physical memory for each container, as you would for a virtual machine, so you can run many containers simultaneously. Moreover, containers are lightweight; you can start one in a couple of seconds. Last but not least, you can easily share your development environment (the container image), which might include dozens of libraries.

Docker has an image repository called DockerHub, where you can share your Docker container images with others publicly (or privately, you decide). And this is literally how we can set up a machine learning development environment (with numpy, pandas, scikit-learn, pytorch, and everything you need) in a couple of lines of code.
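As a tiny illustration (my own example, not from the original gists), pulling and running a public image from DockerHub takes a single command:

```shell
# Docker looks for the image locally first; if it is not cached,
# it is downloaded from DockerHub automatically, then run.
docker run --rm hello-world
```

The `--rm` flag simply removes the container again once it exits.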

2. Ubuntu 18.04 Server

It works flawlessly. I just want to mention this in case you are still concerned that CUDA 9 doesn't officially support Ubuntu 18.04, while PyTorch doesn't support CUDA 10 yet. That's it; in practice it works, so feel free to proceed.

Don’t forget to update and upgrade everything after the installation. https://gist.github.com/bbd30bf1610951c9e83eb36621d12112
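The gist is embedded rather than inlined above; the standard update-and-upgrade step is along these lines:

```shell
# Refresh package lists, then upgrade all installed packages
sudo apt update && sudo apt upgrade -y
```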

3. Bolt

Installing

As I am using an eGPU that connects to the computer via a Thunderbolt port, I need to install the bolt daemon first. https://gist.github.com/8ce075c2824a9beb1a2d228770d36a11
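The embedded gist likely boils down to installing bolt from the Ubuntu repositories (an assumption on my part; the gist itself is not inlined here):

```shell
# Install the Thunderbolt device manager daemon (boltd) and its CLI
sudo apt install bolt
```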

To check whether the GPU is recognized, run: https://gist.github.com/020616d1151268bffceedf798e6ae31f

The output should display your eGPU information, such as: https://gist.github.com/300498682ceace0a7f0ac771a42d8674
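The check is presumably done with boltctl, the CLI that ships with bolt:

```shell
# List Thunderbolt devices known to boltd; the eGPU enclosure
# should appear here along with its authorization status
boltctl list
```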

Authorizing

If the eGPU is unauthorized, you need to authorize it by manually changing the content of /sys/bus/thunderbolt/devices/0-0/0-1/authorized from 0 to 1. This can be done with nano: https://gist.github.com/ba5ead33f4de1a20ed26db580f1e2e8e
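An alternative to editing the file in nano is a one-liner with tee (the 0-0/0-1 device path is the one from the text above; yours may differ, so check what appears under /sys/bus/thunderbolt/devices/):

```shell
# Write 1 into the sysfs authorization flag for the Thunderbolt device;
# tee is needed because plain shell redirection would not run under sudo
echo 1 | sudo tee /sys/bus/thunderbolt/devices/0-0/0-1/authorized
```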

4. Nvidia driver

Now we will install the driver; it will take a while. You can try the latest drivers, but I went with version 396: https://gist.github.com/9ca023863a9534031148f3858d43d70e

Don’t forget to reboot after the driver installation: https://gist.github.com/5d0352f7d5ede0608fe40adefb766297

Now you can check if the driver has been installed correctly with: https://gist.github.com/86c39a79273d8ae04d8c98ea533b9cc5

The output should be your eGPU status, something like: https://gist.github.com/7bc6f461fc4d522f9e183806c1b88df7
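The whole driver step can be sketched as follows. The graphics-drivers PPA and the exact package name are my assumptions (the gists are embedded, not inlined), so check what is current for your system:

```shell
# Add the community PPA that packages recent Nvidia drivers (assumption)
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update

# Install driver version 396, the one used in this post
sudo apt install nvidia-driver-396

# Reboot so the new kernel modules are loaded
sudo reboot

# After the reboot, verify the driver sees the eGPU
nvidia-smi
```

`nvidia-smi` should list the GTX 1080 together with the driver version.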

5. Docker CE

You can find the full instructions on their website. I will only list the commands needed here.

Setting up repository

https://gist.github.com/f1377c6a7532e4c162307f0b10cee0dd

Installing

https://gist.github.com/037d038ddb905b2512426b426fd83863
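The two gists above are embedded, not inlined. At the time of writing, the official Docker CE instructions for Ubuntu boiled down to roughly the following (verify against the current docs before running):

```shell
# --- Set up the repository ---
sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common

# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

# Add the stable repository for this Ubuntu release
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

# --- Install Docker CE ---
sudo apt update
sudo apt install docker-ce
```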

6. Nvidia Docker

To take advantage of the GPU's power, we also need Nvidia Docker. The official instructions are available on their GitHub. Similarly, we need to add their repository first, then install the library. I list all the needed commands here.

Adding repository

https://gist.github.com/2802c356ec5f7612d2b0b5b68eb0eacd

Installing

https://gist.github.com/2f87e329724e31686dd4f357717bd949
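Again, the gists are embedded rather than inlined. The nvidia-docker2 setup of that era looked roughly like this (check the project's GitHub README for the current procedure):

```shell
# Add Nvidia's GPG key and repository for this distribution
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt update
sudo apt install nvidia-docker2
sudo pkill -SIGHUP dockerd
```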

7. Run a machine-learning-ready Docker image

Now the magic happens: we don't need to install everything from scratch. First, you need to ask yourself what kind of environment you want; there are several interesting images to choose from.

Here I will show you how to run the paperspace/fastai image: https://gist.github.com/842d676e47d5f52c18ddfae905325a9b
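The exact command lives in the gist above; a plausible shape, assuming the image starts Jupyter itself on port 8888 and nvidia-docker2 is installed, is:

```shell
# --runtime=nvidia exposes the GPU inside the container (nvidia-docker2);
# -d runs detached; -p publishes Jupyter's port 8888 to the host
sudo docker run --runtime=nvidia -d -p 8888:8888 paperspace/fastai:cuda9_pytorch0.3.0
```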

To fully understand the parameters, I again strongly recommend reading Hamel Husain's article “How Docker Can Help You Become A More Effective Data Scientist”. When Docker cannot find the image paperspace/fastai:cuda9_pytorch0.3.0 locally, it automatically looks it up on DockerHub, downloads it, and initializes a corresponding container. This will take a while, as the Fast.ai image includes a large dataset of cat and dog photos. But no worries, this only happens the first time.

Once the container is up and running, you might not notice any changes (I told you, it is lightweight). To get the list of running containers, we can use the command: https://gist.github.com/4d21298f257c5c11ea25ce66c5d320aa

The output should be something like: https://gist.github.com/6397e716f94bee510c9d799125956ece
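The listing command is Docker's standard one:

```shell
# Show running containers; the first column is the container ID
docker ps
```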

Remember the container ID; you will need it later.

When you run this container, it also starts a Jupyter Notebook on port 8888. If you are also running Ubuntu Server like me, you will need the Jupyter Notebook's token to access it from another computer. To execute the command jupyter notebook list and get the token, we can call: https://gist.github.com/0aed35d7c94155478874d3b45d1157bf
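The command is presumably of the form below, running jupyter inside the container via docker exec:

```shell
# Hypothetical invocation: replace your_container_id with the real ID
# from `docker ps`; prints the running notebook's URL including its token
sudo docker exec your_container_id jupyter notebook list
```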

Remember to replace your_container_id with the ID given in the previous step.

That’s it! It’s time for you to test your setup in Jupyter Notebook. ;)


Thank you for reading to this point!

I hope you find this tutorial useful. Feel free to reach out to me on Facebook or GitHub.
