@soareschen
Last active August 8, 2018 01:04
CUDA setup on Ubuntu 16.04 and LXD

This gist explains the steps required to install CUDA on Ubuntu 16.04, as well as how to enable it inside LXD containers.

The setup assumes GTX 10 series hardware, tested with my GTX 1070.

Driver Installation

Download the latest Nvidia driver at http://www.nvidia.com/Download/index.aspx.

On 64-bit systems, install the 32-bit OpenGL libraries first so that the driver can also install its 32-bit compatibility libraries. This is required for programs such as Steam to work.

host# apt-get install libgl1-mesa-dri:i386

To install the driver, you have to shut down the desktop GUI and install through the terminal.

Switch to a text console with Ctrl + Alt + F1 and run the following commands with root permission:

host# service lightdm stop
host# init 3
host# sh ./NVIDIA-Linux-x86_64-367.35.run
host# shutdown -r now
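
After the reboot, a quick sanity check is to run nvidia-smi; it should list the GPU and the installed driver version:

host$ nvidia-smi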

CUDA Host Installation

We will use CUDA 7.5, as there is a driver conflict in the CUDA 8 RC that causes a blank screen after restart.

Download CUDA Toolkit at https://developer.nvidia.com/cuda-downloads. Choose Ubuntu 15.04 with runfile (local).

It is recommended to download the .run installer instead of the .deb package. With the runfile we can customize some options, such as not installing the Nvidia driver again during CUDA installation.

Run the installer with the --override option, as our Ubuntu and GCC versions are not officially supported.

host# sh ./cuda_7.5.18_linux.run --override

Remember to choose no when asked whether to install the driver.

After installation, restart and verify that the desktop can still be loaded as usual.
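
The installer will typically also remind you to add CUDA to your environment. A minimal sketch for ~/.bashrc, assuming the default install prefix of /usr/local/cuda-7.5:

host$ echo 'export PATH=/usr/local/cuda-7.5/bin:$PATH' >> ~/.bashrc
host$ echo 'export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
host$ source ~/.bashrc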

To compile any code, we need to force CUDA to work with the latest GCC 5.4; officially, CUDA only supports GCC up to version 4.9.

Edit the file /usr/local/cuda-7.5/include/host_config.h, search for the following line and comment it out:

// before:
#error -- unsupported GNU version! gcc versions later than 4.9 are not supported!

// after:
// #error -- unsupported GNU version! gcc versions later than 4.9 are not supported!
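
Alternatively, if an older GCC is also installed, nvcc can be pointed at it with the -ccbin (--compiler-bindir) flag instead of patching the header. A sketch, assuming gcc-4.9 is available at /usr/bin/gcc-4.9:

host$ nvcc -ccbin /usr/bin/gcc-4.9 vectorAdd.cu -o vectorAdd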

Try compiling and running the CUDA example code to verify that CUDA is working properly.

host$ cd NVIDIA_CUDA-7.5_Samples/0_Simple/vectorAdd
host$ make
host$ ./vectorAdd
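
If everything is working, the sample should report success with output roughly like the following (exact wording may differ between sample versions):

[Vector addition of 50000 elements]
...
Test PASSED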

LXD CUDA Installation

Now that CUDA is working on the host, we can set up an Ubuntu LXD container and install CUDA inside it.

First, set up LXD by following this tutorial: http://insights.ubuntu.com/2016/03/14/the-lxd-2-0-story-prologue/

Make sure that the following three device files exist on your system:

$ ls /dev/nvidia*
/dev/nvidia0
/dev/nvidiactl
/dev/nvidia-uvm

In my experience, /dev/nvidia-uvm is missing on Ubuntu 16.04 with the latest driver. The workaround, according to the CUDA installation guide, is to execute the following commands:

host# /sbin/modprobe nvidia-uvm
host# D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
host# mknod -m 666 /dev/nvidia-uvm c $D 0
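
Note that the device node created this way does not survive a reboot. One way to persist the workaround, assuming the stock /etc/rc.local that Ubuntu 16.04 still ships, is to add the same commands there:

# in /etc/rc.local, before the final "exit 0" line
/sbin/modprobe nvidia-uvm
D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
[ -e /dev/nvidia-uvm ] || mknod -m 666 /dev/nvidia-uvm c $D 0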

Next, initialize an LXD container with the Nvidia devices mounted into the container.

host$ CONTAINER=ubuntu-cuda
host$ lxc init ubuntu: $CONTAINER
host$ lxc config device add $CONTAINER nvidia0 unix-char path=/dev/nvidia0
host$ lxc config device add $CONTAINER nvidiactl unix-char path=/dev/nvidiactl
host$ lxc config device add $CONTAINER nvidia-uvm unix-char path=/dev/nvidia-uvm
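
You can list the configured devices to confirm everything was added correctly:

host$ lxc config device show $CONTAINER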

Also, to make file sharing easier, I usually mount a shared directory into the container to access the installer and example files.

host$ lxc config set $CONTAINER security.privileged true
host$ lxc config device add $CONTAINER shareName disk source=/home/$USER/share path=/share
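
With the share in place, copy the installers into it so the container can reach them later (the download location is an assumption; adjust to wherever you saved the files):

host$ mkdir -p /home/$USER/share
host$ cp ~/Downloads/NVIDIA-Linux-x86_64-367.35.run ~/Downloads/cuda_7.5.18_linux.run /home/$USER/share/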

Exec into the container and install the driver first. Note that we need to install the driver without the kernel module, as the module is installed on the host OS already.

host$ lxc start $CONTAINER
host$ lxc exec $CONTAINER bash
container# sh /share/NVIDIA-Linux-x86_64-367.35.run --no-kernel-module

After that, install CUDA in the container using the same steps as above. You should now be able to access CUDA both on the host and in the container.
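
As a final check, nvidia-smi and the vectorAdd sample should also work inside the container, assuming you installed the CUDA samples there as well:

container$ nvidia-smi
container$ cd NVIDIA_CUDA-7.5_Samples/0_Simple/vectorAdd
container$ make
container$ ./vectorAdd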

@wangruohui

I have just tried the same thing on my server, with the host running Ubuntu 17.04 and the container initialized with Ubuntu 16.04. I used the latest Nvidia driver (currently 381.22) and CUDA 8.0.61. Some alternative approaches, for reference:
1. The nvidia-uvm device will appear after running some GPU program, for example the deviceQuery sample from the CUDA samples.
2. With LXD later than version 2.5, lxc config device add $CONTAINER gpu gpu will mount /dev/nvidia0 (and any other GPU devices) and /dev/nvidiactl into the container, but it seems nvidia-uvm still has to be mounted into the container manually.
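
For reference, the shortcut from point 2 combined with the manual nvidia-uvm mount would look roughly like this:

host$ lxc config device add $CONTAINER gpu gpu
host$ lxc config device add $CONTAINER nvidia-uvm unix-char path=/dev/nvidia-uvm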

@fungtion commented Aug 8, 2018

Why must I install CUDA on both the host and the container?
