@tnguyenv

tnguyenv/blog.md

Created November 29, 2018 15:18

How to setup a deep-learning-ready server with Intel NUC 8 + Nvidia eGPU

Interested in learning deep learning with PyTorch and/or Fast.ai? The easiest way to get started is to use a cloud-based solution (Google it, there are a lot!). However, if you want to invest for the longer run, or simply want to get your hands dirty setting up a personal server, this post is for you!

After spending quite a lot of time researching and setting up my new Intel NUC Hades Canyon with an Nvidia GTX 1080 eGPU, I decided to write this blog so that people like me can save some time and get up and running quickly.

To make the steps easy to follow, I have outlined the general process below. However, before answering 'how', I would like to explain 'why' first. Feel free to skip the first section if you trust my decisions. ;)

  1. Why?
  2. Install Ubuntu Server 18.04
  3. Install Bolt
  4. Install Nvidia driver
  5. Install Docker CE
  6. Install Nvidia Docker
  7. Run a ML-ready docker image

1. Why eGPU, Intel NUC, and Docker?

External GPU

The name already implies its advantage: portability. With a single gaming box, I can use it with any laptop that supports Thunderbolt/USB-C (e.g., MacBook, NUC 7/8) without worrying about compatibility. Here I use the Aorus GTX 1080 Gaming Box.

Intel NUC 8 Hades Canyon

This mini PC brings you an Intel Core i7 processor, AMD Vega graphics power, and a spicy port mix. PCMag pointed out the NUC 8's top pros:

  • Compact and quiet-running
  • Excellent overall CPU and GPU performance
  • AMD Vega graphics are VR-ready
  • Bristling with connectivity for its size (of course, it has a pair of Thunderbolt/USB-C ports)
  • Dual M.2 slots. VESA-mountable chassis

Docker

I would love to refer you to another blog post, “How Docker Can Help You Become A More Effective Data Scientist”, which explains everything you need to know about Docker and, of course, why it is so important for data scientists. If you don't have time, here is my brief explanation:

Docker is like a virtual machine, but we call it a container. It is worth mentioning that these are actually two different technologies (virtualization vs. containerization). Containers are more efficient at resource allocation: you don't have to carve out 4GB of physical memory for each container, as you would for a virtual machine, so you can run many containers simultaneously. Moreover, containers are lightweight; you can start one in a couple of seconds. Last but not least, you can easily share your development environment (the container image), which might include dozens of libraries.

Docker has an image repository called DockerHub, where you can share your Docker container images with others publicly (or privately, you decide). And this is literally how we can set up a machine learning development environment (with numpy, pandas, scikit-learn, pytorch, and everything you need) in a couple of lines of code.
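As a tiny illustration (my own example, not from the original gists), pulling and running a public image from DockerHub takes a single command:

```shell
# Docker looks for the image locally first; if it is not cached,
# it is downloaded from DockerHub automatically, then run.
docker run --rm hello-world
```

The `--rm` flag simply removes the container again once it exits.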

2. Ubuntu 18.04 Server

It works flawlessly. I just want to mention this in case you are still concerned that CUDA 9 doesn't officially support Ubuntu 18.04, while PyTorch doesn't support CUDA 10 yet. That's it; in practice it works, so feel free to proceed.

Don’t forget to update and upgrade everything after the installation. https://gist.github.com/bbd30bf1610951c9e83eb36621d12112
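The gist is embedded rather than inlined above; the standard update-and-upgrade step is along these lines:

```shell
# Refresh package lists, then upgrade all installed packages
sudo apt update && sudo apt upgrade -y
```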

3. Bolt

Installing

As I am using an eGPU that connects to the computer via a Thunderbolt port, I need to install the bolt daemon first. https://gist.github.com/8ce075c2824a9beb1a2d228770d36a11
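The embedded gist likely boils down to installing bolt from the Ubuntu repositories (an assumption on my part; the gist itself is not inlined here):

```shell
# Install the Thunderbolt device manager daemon (boltd) and its CLI
sudo apt install bolt
```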

To check whether the GPU is recognized, run: https://gist.github.com/020616d1151268bffceedf798e6ae31f

The output should display your eGPU information, such as: https://gist.github.com/300498682ceace0a7f0ac771a42d8674
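The check is presumably done with boltctl, the CLI that ships with bolt:

```shell
# List Thunderbolt devices known to boltd; the eGPU enclosure
# should appear here along with its authorization status
boltctl list
```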

Authorizing

If the eGPU is unauthorized, you need to authorize it by manually changing the content of /sys/bus/thunderbolt/devices/0-0/0-1/authorized from 0 to 1. This can be done with nano: https://gist.github.com/ba5ead33f4de1a20ed26db580f1e2e8e
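An alternative to editing the file in nano is a one-liner with tee (the 0-0/0-1 device path is the one from the text above; yours may differ, so check what appears under /sys/bus/thunderbolt/devices/):

```shell
# Write 1 into the sysfs authorization flag for the Thunderbolt device;
# tee is needed because plain shell redirection would not run under sudo
echo 1 | sudo tee /sys/bus/thunderbolt/devices/0-0/0-1/authorized
```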

4. Nvidia driver

Now we will install the driver; it will take a while. You can try the latest drivers, but I went with version 396: https://gist.github.com/9ca023863a9534031148f3858d43d70e

Don’t forget to reboot after the driver installation: https://gist.github.com/5d0352f7d5ede0608fe40adefb766297

Now you can check if the driver has been installed correctly with: https://gist.github.com/86c39a79273d8ae04d8c98ea533b9cc5

The output should be your eGPU status, something like: https://gist.github.com/7bc6f461fc4d522f9e183806c1b88df7
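The whole driver step can be sketched as follows. The graphics-drivers PPA and the exact package name are my assumptions (the gists are embedded, not inlined), so check what is current for your system:

```shell
# Add the community PPA that packages recent Nvidia drivers (assumption)
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update

# Install driver version 396, the one used in this post
sudo apt install nvidia-driver-396

# Reboot so the new kernel modules are loaded
sudo reboot

# After the reboot, verify the driver sees the eGPU
nvidia-smi
```

`nvidia-smi` should list the GTX 1080 together with the driver version.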

5. Docker CE

You can find the full instructions on their website. I will only list the commands needed here.

Setting up repository

https://gist.github.com/f1377c6a7532e4c162307f0b10cee0dd

Installing

https://gist.github.com/037d038ddb905b2512426b426fd83863
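The two gists above are embedded, not inlined. At the time of writing, the official Docker CE instructions for Ubuntu boiled down to roughly the following (verify against the current docs before running):

```shell
# --- Set up the repository ---
sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common

# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

# Add the stable repository for this Ubuntu release
sudo add-apt-repository \
  "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

# --- Install Docker CE ---
sudo apt update
sudo apt install docker-ce
```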

6. Nvidia Docker

To take advantage of the GPU's power, we also need Nvidia Docker. The official instructions are available on their GitHub. Similarly, we need to add their repository first, then install the library. I list all the needed commands here.

Adding repository

https://gist.github.com/2802c356ec5f7612d2b0b5b68eb0eacd

Installing

https://gist.github.com/2f87e329724e31686dd4f357717bd949
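Again, the gists are embedded rather than inlined. The nvidia-docker2 setup of that era looked roughly like this (check the project's GitHub README for the current procedure):

```shell
# Add Nvidia's GPG key and repository for this distribution
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt update
sudo apt install nvidia-docker2
sudo pkill -SIGHUP dockerd
```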

7. Run a machine-learning-ready Docker image

Now the magic happens: we don't need to install everything from scratch. First, you need to ask yourself what kind of environment you want; there are several interesting images to choose from.

Here I will show you how to run the paperspace/fastai image: https://gist.github.com/842d676e47d5f52c18ddfae905325a9b
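The exact command lives in the gist above; a plausible shape, assuming the image starts Jupyter itself on port 8888 and nvidia-docker2 is installed, is:

```shell
# --runtime=nvidia exposes the GPU inside the container (nvidia-docker2);
# -d runs detached; -p publishes Jupyter's port 8888 to the host
sudo docker run --runtime=nvidia -d -p 8888:8888 paperspace/fastai:cuda9_pytorch0.3.0
```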

To fully understand the parameters, I again strongly recommend reading Hamel Husain's article “How Docker Can Help You Become A More Effective Data Scientist”. When Docker cannot find the image paperspace/fastai:cuda9_pytorch0.3.0 locally, it automatically looks it up on DockerHub, downloads it, and initializes a corresponding container. This will take a while, as the Fast.ai image includes a large dataset of cat and dog photos. But no worries, this only happens the first time.

Once the container is up and running, you might not notice any changes (I told you, it is lightweight). To get the list of running containers, we can use the command: https://gist.github.com/4d21298f257c5c11ea25ce66c5d320aa

The output should be something like: https://gist.github.com/6397e716f94bee510c9d799125956ece
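The listing command is Docker's standard one:

```shell
# Show running containers; the first column is the container ID
docker ps
```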

Remember the container ID; you will need it later.

When you run this container, it also starts a Jupyter Notebook on port 8888. If you are also running Ubuntu Server like me, you will need the Jupyter Notebook's token to access it from another computer. To execute the command jupyter notebook list and get the token, we can call: https://gist.github.com/0aed35d7c94155478874d3b45d1157bf
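The command is presumably of the form below, running jupyter inside the container via docker exec:

```shell
# Hypothetical invocation: replace your_container_id with the real ID
# from `docker ps`; prints the running notebook's URL including its token
sudo docker exec your_container_id jupyter notebook list
```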

Remember to replace your_container_id with the ID given in the previous step.

That’s it! It’s time for you to test your setup in Jupyter Notebook. ;)


Thank you for reading to this point!

I hope you find this tutorial useful. Feel free to reach out to me on Facebook or GitHub.
