@ErikGartner
Last active February 22, 2021 00:36
A CUDA installation guide

A Magical Guide to Installing CUDA

This guide tries to make sense of installing NVIDIA CUDA on Ubuntu.

Disclaimer: Installing CUDA is somewhat tedious and can be error-prone. This guide worked for me, but if you have an unusual configuration you might need additional preparation to make it work. My machines are mostly blank Ubuntu installs.

For reference, see NVIDIA's official installation guides for CUDA and cuDNN.

Last updated: 2019-07-27

Versions

  • Ubuntu 16.04
  • NVIDIA driver 396.37
  • CUDA 9.2 Patch 1
  • cuDNN v7.2.1

Older versions of this guide

Preparations

  1. Blacklist Nouveau drivers:
sudo -i
rm -f /etc/modprobe.d/blacklist-nouveau.conf
echo 'blacklist nouveau' >> /etc/modprobe.d/blacklist-nouveau.conf
echo 'options nouveau modeset=0' >> /etc/modprobe.d/blacklist-nouveau.conf
update-initramfs -u
exit
  1. If you previously installed CUDA using the runfile, you need to remove it first. Check by listing /usr/local/ and looking for a folder named cuda-X.X:
ls /usr/local/

If CUDA is installed, run the following after replacing X.X with your version:

cd /usr/local/cuda-X.X/
sudo ./bin/uninstall_cuda_X.X.pl
sudo /usr/bin/nvidia-uninstall
  1. Purge all NVIDIA drivers:
dpkg --get-selections | grep nvidia
sudo apt remove --purge [packages ..]
  1. Purge any CUDA leftovers:
dpkg --get-selections | grep cuda
sudo apt purge '*cuda*'
  1. Purge any cuDNN leftovers:
dpkg --get-selections | grep libcudnn
sudo apt purge '*cudnn*'
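To confirm that the purge steps above left nothing behind, the package list can be filtered in one pass. This is only a minimal sketch: `list_gpu_leftovers` is a hypothetical helper name (not part of dpkg or apt), and it only prints package names without removing anything.

```shell
# Hypothetical helper: read `dpkg --get-selections` output on stdin and
# print the names of packages that look like NVIDIA, CUDA, or cuDNN leftovers.
list_gpu_leftovers() {
    awk '$1 ~ /nvidia|cuda|cudnn/ { print $1 }'
}

# Usage on a real system (prints nothing when the machine is clean):
#   dpkg --get-selections | list_gpu_leftovers
```

Packages in the `deinstall` state are listed too, since they can still have configuration files that `apt purge` would remove.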

Installation

This guide uses the local deb file and installs a held-back version together with the drivers. In plain English: it installs the Debian package along with the graphics drivers in a way that disables automatic updates.

  1. Download the CUDA 9.2 local deb installer and its Patch 1 package from NVIDIA's CUDA downloads page.

  2. Install the packages using:

sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.148-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604-9-2-148-local-patch-1_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
sudo apt update
sudo apt install cuda-9-2
  1. Download cuDNN (requires an NVIDIA Developer account). You need both the cuDNN v7.2.1 Runtime Library for Ubuntu16.04 and the cuDNN v7.2.1 Developer Library for Ubuntu16.04.

  2. Install the packages:

sudo dpkg -i libcudnn7_7.2.1.38-1+cuda9.2_amd64.deb
sudo dpkg -i libcudnn7-dev_7.2.1.38-1+cuda9.2_amd64.deb
sudo apt install libcudnn7-dev libcudnn7
  1. Set up the shell environment. Add the following lines to ~/.bashrc or similar, then restart your terminal/shell.
export CUDA_HOME=/usr/local/cuda-9.2
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export PATH=${CUDA_HOME}/bin:${PATH}
  1. Restart the computer.
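Because ~/.bashrc is sourced by every new shell, including nested ones, the environment step above can add the same CUDA directories to the search paths more than once. One way to avoid that is sketched below; `prepend_unique` is a hypothetical helper name, and the approach is just one option, not something the CUDA installer requires.

```shell
# Hypothetical helper: print DIR prepended to a colon-separated LIST,
# leaving LIST unchanged when DIR is already one of its entries.
prepend_unique() {
    pu_dir=$1
    pu_list=$2
    case ":${pu_list}:" in
        *":${pu_dir}:"*) printf '%s\n' "$pu_list" ;;            # already present
        "::")            printf '%s\n' "$pu_dir" ;;             # list was empty
        *)               printf '%s\n' "${pu_dir}:${pu_list}" ;;
    esac
}

# Same variables as in the environment step, built without duplicates.
CUDA_HOME=/usr/local/cuda-9.2
PATH=$(prepend_unique "${CUDA_HOME}/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_unique "${CUDA_HOME}/lib64" "${LD_LIBRARY_PATH:-}")
export CUDA_HOME PATH LD_LIBRARY_PATH
```

Sourcing this snippet repeatedly leaves both variables unchanged after the first run.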

Test everything (fingers crossed):

nvcc -V

Should output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
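If you would rather check the version in a script than by eye, the release number can be pulled out of that output. A minimal sketch, assuming the output keeps the `release X.Y` wording shown above; `nvcc_release` is a hypothetical helper name, not an nvcc feature.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the release
# number (for example 9.2) taken from the "release X.Y" field.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Usage on a real system:
#   [ "$(nvcc -V | nvcc_release)" = "9.2" ] && echo "nvcc matches CUDA 9.2"
```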

Run:

nvidia-smi

Should output something similar to:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:04:00.0 Off |                  N/A |
| 23%   30C    P0    61W / 250W |      0MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
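The driver version in that banner can also be checked in a script. The sketch below makes a few assumptions: the helper names `driver_version` and `version_at_least` are hypothetical, the banner keeps the `Driver Version:` label shown above, and 396.37 is used as the threshold only because it is the driver version this guide was written against.

```shell
# Hypothetical helper: read `nvidia-smi` output on stdin and print the
# driver version from the "Driver Version:" field of the banner.
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort), which GNU coreutils on Ubuntu provides.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a real system:
#   version_at_least "$(nvidia-smi | driver_version)" 396.37 \
#       || echo "driver is older than the one used in this guide"
```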

If you want to test CUDA further, run the samples:

cp -r /usr/local/cuda-9.2/samples $HOME
cd $HOME/samples
make


./bin/x86_64/linux/release/deviceQuery
./bin/x86_64/linux/release/bandwidthTest

They should output:

 CUDA Device Query (Runtime API) version (CUDART static linking)

[...]

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.2, CUDA Runtime Version = 9.2, NumDevs = 1
Result = PASS
[CUDA Bandwidth Test] - Starting...
Running on...

[...]

Result = PASS
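Both samples print a lot of text, so it can be easier to test only for the final PASS line. A minimal sketch; `sample_passed` is a hypothetical helper name, and the match is case-insensitive so it does not depend on the exact capitalisation of the result line.

```shell
# Hypothetical helper: succeed if the sample output read on stdin contains a
# "Result = PASS" line; matching is case-insensitive.
sample_passed() {
    grep -qi 'result = pass'
}

# Usage from the samples directory on a real system:
#   for sample in deviceQuery bandwidthTest; do
#       ./bin/x86_64/linux/release/$sample | sample_passed || echo "$sample FAILED"
#   done
```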

Troubleshooting / Common Issues

The Xserver doesn't properly start

The root cause can be unclear, and simply reinstalling the NVIDIA drivers might not help.

Here are some suggestions:

  • Creating a new Xorg.conf, see this old guide.
  • Checking the default display manager (cat /etc/X11/default-display-manager), making sure it's correctly configured and perhaps reinstalling it.
  • If you have multiple displays it might help to temporarily only use one display until the problem is resolved.
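Before trying the suggestions above, it can help to look at what the X server itself reported. The sketch below rests on assumptions that are not from this guide: that the log lives at /var/log/Xorg.0.log and that entries start with a bracketed timestamp followed by a marker such as (EE) for errors. `xorg_errors` is a hypothetical helper name.

```shell
# Hypothetical helper: read an Xorg log on stdin and print only the entries
# marked (EE), skipping the legend line near the top that merely lists markers.
xorg_errors() {
    grep -E '^\[ *[0-9.]+\] \(EE\)'
}

# Usage on a real system (the log path is an assumption, see above):
#   xorg_errors < /var/log/Xorg.0.log
```

If it prints nothing, the log contains no error entries in that format, and the display manager configuration is the next place to look.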