Install NVIDIA Driver and CUDA on Ubuntu / CentOS / Fedora Linux OS

In this article, I share some of my experience installing the NVIDIA driver and CUDA on Linux. I mainly use Ubuntu as the example, and I provide notes for CentOS/Fedora where I can.

Install NVIDIA Graphics Driver via apt-get

On Ubuntu, drivers for NVIDIA graphics cards are already provided in the official repository, so installation takes a single command.

For Ubuntu 14.04.5 LTS, the latest version is 352. To install the driver, execute sudo apt-get install nvidia-352 nvidia-modprobe, and then reboot the machine.

For Ubuntu 16.04.3 LTS, the latest version is 375. To install the driver, execute sudo apt-get install nvidia-375 nvidia-modprobe, and then reboot the machine.

The nvidia-modprobe utility loads the NVIDIA kernel modules and creates the NVIDIA character device files automatically every time the machine boots.
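
As a rough sketch of the two cases above, the helper below maps an Ubuntu release to the driver package named in this article and prints the matching install command. The function name driver_pkg_for_release is made up for illustration, and the version numbers are only the ones quoted above, so they will be out of date on newer releases.

```shell
#!/bin/sh
# Hypothetical helper: map an Ubuntu release (e.g. "16.04") to the driver
# package mentioned in this article. Versions are the ones quoted above.
driver_pkg_for_release() {
    case "$1" in
        14.04*) echo "nvidia-352" ;;
        16.04*) echo "nvidia-375" ;;
        *)      echo "unknown"; return 1 ;;
    esac
}

# lsb_release -rs prints the release number on Ubuntu, e.g. "16.04".
release="${1:-$(lsb_release -rs 2>/dev/null)}"
if pkg=$(driver_pkg_for_release "$release"); then
    # Only print the command; run it yourself once it looks right.
    echo "sudo apt-get install $pkg nvidia-modprobe"
else
    echo "No driver version listed in this article for release '$release'" >&2
fi
```

The script only prints the command, so it is safe to run on any machine before deciding to install.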

This method is recommended for new users because it is simple. However, it has some drawbacks:

  1. The driver in the official Ubuntu repository is usually not the latest.
  2. Naming conflicts can arise when other repositories (e.g. the ones from CUDA) are added to the system.
  3. The driver has to be reinstalled after the Linux kernel is updated.

Install NVIDIA Graphics Driver via runfile

For advanced users who want the latest version of the driver, want to avoid the reinstallation issue by using dkms, or use a Linux distribution that does not provide NVIDIA drivers in its repositories, installing from the runfile is recommended.

Remove Previous Installations (Important)

You might have already installed the driver via apt-get, so previous installations must be removed before installing the driver from the runfile. Execute the following commands carefully, one by one.

# Quote the pattern so the shell does not expand it against file names in the current directory
sudo apt-get purge 'nvidia*'

# Note this might remove your cuda installation as well
sudo apt-get autoremove 

# Recommended if .deb files from NVIDIA were installed
# Change 1404 to the exact system version or use tab autocompletion
# After executing this command, /etc/apt/sources.list.d should contain no files related to nvidia or cuda
sudo dpkg -P cuda-repo-ubuntu1404
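
To confirm the cleanup described in the comments above, a small check like the following can list any source files that still mention nvidia or cuda. This is only a sketch: the function name leftover_nvidia_sources is my own, and it matches on file names only, not on file contents.

```shell
#!/bin/sh
# List apt source files whose name mentions nvidia or cuda.
# The directory argument defaults to the standard apt location.
leftover_nvidia_sources() {
    dir="${1:-/etc/apt/sources.list.d}"
    ls "$dir" 2>/dev/null | grep -i -E 'nvidia|cuda'
}

if leftover_nvidia_sources; then
    echo "The files above still reference NVIDIA or CUDA repositories."
else
    echo "No NVIDIA or CUDA source files found."
fi
```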

Download the Driver

The latest NVIDIA driver for Linux can be fetched from NVIDIA's official website. The first one in the list, i.e. the Latest Long Lived Branch version for Linux x86_64/AMD64/EM64T, is suitable for most cases.

If you want to download the driver directly from a Linux shell, the commands below are useful.

cd ~
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/384.69/NVIDIA-Linux-x86_64-384.69.run
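
If you need a different driver version, the URL above appears to follow a simple pattern. The sketch below builds the URL from a version string. The helper name nvidia_runfile_url is hypothetical, and I am assuming other versions live at the same location, which may not hold for every release.

```shell
#!/bin/sh
# Build the download URL for a given driver version, following the pattern
# of the URL shown above. Other versions may not exist at this location.
nvidia_runfile_url() {
    version="$1"
    echo "http://us.download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run"
}

VERSION="384.69"   # the version used in this article
# Print the command instead of downloading; remove "echo" to fetch the file.
echo wget "$(nvidia_runfile_url "$VERSION")"
```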

Detailed installation instructions can be found on the download page via the README hyperlink in the ADDITIONAL INFORMATION tab. I have also summarized the key steps below.

Install Dependencies

The software required by the runfile is officially listed here, but that page seems to be stale and is not easy to follow.

For Ubuntu, installing the following dependencies is enough.

  1. build-essential -- For building the driver
  2. (Optional) gcc-multilib -- For providing 32-bit support
  3. dkms -- For providing dkms support
  4. (Optional) xorg and xorg-dev. On a workstation with a GUI, these are required but have usually been installed already, since the graphical display is working. On headless servers without a GUI, they are not needed.

In summary, execute sudo apt-get install build-essential gcc-multilib dkms to install all dependencies.

Required packages for CentOS are epel-release dkms libstdc++.i686. Execute yum install epel-release dkms libstdc++.i686.

Required packages for Fedora are dkms libstdc++.i686 kernel-devel. Execute dnf install dkms libstdc++.i686 kernel-devel.
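
The three package lists above can be collected into one small helper that prints the right command for the current distribution. This is a sketch under a few assumptions: the function name deps_install_cmd is invented, the distro is read from the ID field of /etc/os-release, and sudo is added to the yum and dnf commands on the assumption that you are not already root.

```shell
#!/bin/sh
# Print the dependency install command from this article for a distro ID
# as found in the ID= field of /etc/os-release (ubuntu, centos, fedora).
deps_install_cmd() {
    case "$1" in
        ubuntu) echo "sudo apt-get install build-essential gcc-multilib dkms" ;;
        centos) echo "sudo yum install epel-release dkms libstdc++.i686" ;;
        fedora) echo "sudo dnf install dkms libstdc++.i686 kernel-devel" ;;
        *)      echo "unsupported distro: $1" >&2; return 1 ;;
    esac
}

# Read the distro ID in a subshell so the current shell stays clean.
distro_id=$( . /etc/os-release 2>/dev/null && echo "$ID" ) || distro_id="unknown"
deps_install_cmd "$distro_id" || true
```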

Create Blacklist for Nouveau Driver

Create a file at /etc/modprobe.d/blacklist-nouveau.conf with the following contents:

blacklist nouveau
options nouveau modeset=0

Note: The NVIDIA installation runfile can also create this blacklist file automatically. Execute the runfile and follow the instructions when an error related to Nouveau appears.

Then,

  1. for Ubuntu 14.04 LTS, reboot the computer;
  2. for Ubuntu 16.04 LTS, execute sudo update-initramfs -u and reboot the computer;
  3. for CentOS/Fedora, execute sudo dracut --force and reboot the computer.
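
The blacklist step can be scripted, and it is worth checking after the reboot that nouveau is really gone. The following is a minimal sketch. Both function names are my own, and the check simply looks for a nouveau line in the output of lsmod.

```shell
#!/bin/sh
# Write the two blacklist lines from this article to the given file.
write_nouveau_blacklist() {
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$1"
}

# Succeeds if lsmod-style text on stdin lists the nouveau module.
nouveau_listed() {
    grep -q '^nouveau[[:space:]]'
}

# As root, before the reboot:
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
# After the reboot, check whether the module is still loaded:
if lsmod 2>/dev/null | nouveau_listed; then
    echo "nouveau is still loaded"
else
    echo "nouveau is not loaded"
fi
```

If nouveau is still reported as loaded, the initramfs was probably not regenerated, so repeat the distribution-specific step above.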

Stop lightdm/gdm/kdm

After the computer has rebooted, we need to stop the display manager before executing the runfile to install the driver. lightdm is the default display manager in Ubuntu. If the GNOME or KDE desktop environment is used, the installed display manager will be gdm or kdm instead.

  1. For Ubuntu 14.04 / 16.04, execute sudo service lightdm stop (or use gdm or kdm instead of lightdm).
  2. For Ubuntu 16.04 / Fedora / CentOS, execute sudo systemctl stop lightdm (or use gdm or kdm instead of lightdm).
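
If you are not sure which display manager is installed, Debian and Ubuntu record the default one in /etc/X11/default-display-manager. The sketch below reads that file and prints the matching stop command. The function name default_dm is made up, and the file is not guaranteed to exist on CentOS or Fedora, so treat this as an Ubuntu-only convenience.

```shell
#!/bin/sh
# On Debian/Ubuntu, this file holds the path of the default display manager,
# e.g. /usr/sbin/lightdm. Other distributions may not have it.
default_dm() {
    file="${1:-/etc/X11/default-display-manager}"
    [ -r "$file" ] || return 1
    basename "$(cat "$file")"
}

if dm=$(default_dm); then
    # Only print the command; run it from a text console, not the desktop.
    echo "sudo service $dm stop"
else
    echo "Could not detect the display manager; try lightdm, gdm or kdm" >&2
fi
```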

Executing the Runfile

After all this preparation, we can finally execute the runfile. This is why I recommended, from the very beginning, that new users install the driver via apt-get.

cd ~
chmod +x NVIDIA-Linux-x86_64-384.69.run
sudo ./NVIDIA-Linux-x86_64-384.69.run --dkms -s

Note:

  1. The --dkms option registers the driver as a dkms module so that a kernel update will not require reinstalling the driver. This option should normally be turned on.
  2. The -s option performs a silent installation, which should be used for batch installation. When installing on a single computer, leave this option off to get more installation information.
  3. The --no-opengl-files option can also be added if non-NVIDIA (AMD or Intel) graphics are used for display while NVIDIA graphics are used for computation.
  4. The installer may print a warning on a system without X.Org installed. In my experience, it is safe to ignore.
WARNING: nvidia-installer was forced to guess the X library path '/usr/lib' and X module path '/usr/lib/xorg/modules'; these paths were not queryable from the system.  If X fails to find the NVIDIA X driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your distribution and reinstall the driver.

Check the Installation

After a successful installation, the nvidia-smi command will report all CUDA-capable devices in the system.
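
For a quick scripted check, nvidia-smi -L prints one line per device, so counting those lines tells you how many GPUs the driver sees. This is a small sketch. The function name count_gpus is my own, and it assumes each device line starts with "GPU <index>:".

```shell
#!/bin/sh
# Count GPUs in `nvidia-smi -L` style output read from stdin.
# Each device is expected on a line starting with "GPU <index>:".
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    echo "GPUs visible to the driver: $(nvidia-smi -L | count_gpus)"
else
    echo "nvidia-smi not found; the driver tools are not on PATH"
fi
```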

Common Errors and Solutions

  1. ERROR: Unable to load the 'nvidia-drm' kernel module.
  • One probable reason is that the system boots via UEFI and the Secure Boot option is turned on in the BIOS settings. Turn it off and the problem will be solved.
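
Before rebooting into the BIOS settings, you can check from the shell whether the system booted via UEFI and, if the mokutil tool happens to be installed, whether Secure Boot is enabled. This is a sketch only. The function name boot_mode is made up, and mokutil is not installed on every system.

```shell
#!/bin/sh
# /sys/firmware/efi exists only when the system was booted through UEFI.
boot_mode() {
    if [ -d /sys/firmware/efi ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

echo "Boot mode: $(boot_mode)"
# mokutil, where installed, reports the Secure Boot state directly.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
fi
```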

Additional Notes

nvidia-smi -pm 1 enables persistence mode, which saves some of the time otherwise spent loading the driver. The effect is significant on machines with more than 4 GPUs.

nvidia-smi -e 0 disables ECC on Tesla products, which provides about 1/15 more video memory. A reboot is required for it to take effect. nvidia-smi -e 1 enables ECC again.

nvidia-smi -pl <some power value> increases or decreases the TDP limit of the GPU. Increasing it encourages a higher GPU Boost frequency, but is somewhat DANGEROUS and HARMFUL to the GPU. Decreasing it helps to save some power, which is useful for machines whose power supply is insufficient and that shut down unexpectedly when all GPUs are pulled to their maximum load.

-i <GPUID> can be added after the above commands to target an individual GPU.

These commands can be added to /etc/rc.local to execute them at system boot.
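
As an illustration of what the rc.local additions might look like, the sketch below prints persistence mode plus a per-GPU power limit. The function name rc_local_gpu_lines is invented, and the 200 W value and GPU indices are example numbers only, not recommendations.

```shell
#!/bin/sh
# Print the lines one might add to /etc/rc.local: persistence mode for all
# GPUs, then a power limit (in watts) for each listed GPU index.
rc_local_gpu_lines() {
    limit="$1"
    shift
    echo "nvidia-smi -pm 1"
    for id in "$@"; do
        echo "nvidia-smi -pl $limit -i $id"
    done
}

# Example values only: 200 W on GPUs 0 and 1.
rc_local_gpu_lines 200 0 1
```

nvidia-smi -q -d POWER shows the power limits a card supports, which is worth checking before picking a value.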

Install CUDA

Installing CUDA from the runfile is much simpler and smoother than installing the NVIDIA driver. It just involves copying files to system directories and has nothing to do with the system kernel or online compilation. Removing CUDA is simply a matter of removing the installation directory. So I personally do not recommend adding NVIDIA's repositories and installing CUDA via apt-get or other package managers: it does not reduce the complexity of installation or uninstallation, but it increases the risk of messing up the repository configuration.

The CUDA runfile installer can be downloaded from NVIDIA's website. But what you download is a package containing the following three components:

  1. an NVIDIA driver installer, but usually a stale version;
  2. the actual CUDA installer;
  3. the CUDA samples installer.

To extract the three components above, execute the runfile installer with the --extract option. Then, executing the second one completes the CUDA installation. Installing the samples is also recommended because they provide useful tools such as deviceQuery and p2pBandwidthLatencyTest.

The commands for installing the CUDA Toolkit are summarized below.

cd ~
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
chmod +x cuda_7.5.18_linux.run
./cuda_7.5.18_linux.run --extract=$HOME
sudo ./cuda-linux64-rel-7.5.18-19867135.run

After the installation finishes, configure the runtime library path.

sudo bash -c "echo /usr/local/cuda/lib64/ > /etc/ld.so.conf.d/cuda.conf"
sudo ldconfig

It is also recommended for Ubuntu users to append /usr/local/cuda/bin to the PATH entry in the system file /etc/environment so that nvcc is included in $PATH. This takes effect after a reboot.
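
To make nvcc available in the current shell without rebooting, the directory can be added to PATH directly. The sketch below does this without creating duplicate entries. The function name add_to_path is my own, and the change lasts only for the current shell session.

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

add_to_path /usr/local/cuda/bin
export PATH

if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc found at $(command -v nvcc)"
else
    echo "nvcc not found; check that CUDA is installed in /usr/local/cuda"
fi
```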

Install cuDNN

The recommended way to install cuDNN is to first copy the tgz file to /usr/local, extract it there, and then remove the tgz file if necessary. This method preserves symbolic links. Finally, execute sudo ldconfig to update the shared library cache.
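
The extraction step can be sketched as below. Instead of copying the archive first, it uses tar's -C option to extract straight into the target directory, which should be equivalent. The function name install_cudnn and the archive path are placeholders, since the actual file name depends on the cuDNN version you downloaded.

```shell
#!/bin/sh
# Extract a cuDNN archive under a prefix. tar restores the symbolic links
# stored in the archive, which copying individual files might not do.
install_cudnn() {
    archive="$1"
    prefix="${2:-/usr/local}"
    tar -xzf "$archive" -C "$prefix"
}

# Example, to be run as root (the archive path is a placeholder):
#   install_cudnn /path/to/cudnn.tgz /usr/local
#   ldconfig
```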

I can not thank you enough for this information. Finally I can use my GPU after weeks of trying various solutions and reinstalling ubuntu a few times. This solution worked perfectly and without a single glitch.

I tried this and the installation succeeded, but my desktop is gone; it only shows a blinking cursor at the top left of the screen. I've tried many things, like stopping and starting gdm and lightdm. None worked for me. Eventually I installed nvidia-current through sudo apt-get install nvidia-current, and after a successful installation and reboot I got my desktop back. However, this time I could not use the GPU. When I try nvidia-smi, I get:
nvidia-smi: command not found

any help to fix the issue greatly appreciated. Thank you

Owner

wangruohui commented Nov 4, 2016

@shravankumar147

I am not quite sure about what happens on your desktop. But I suggest you try apt-get install nvidia-361 --reinstall (ubuntu 1604) or apt-get install nvidia-352 --reinstall (Ubuntu 1404). The nvidia-current seems to be pointed to a considerably old version.

Owner

wangruohui commented Nov 4, 2016

@shravankumar147

and apt-get install nvidia-modprobe as I write in part Install NVIDIA Graphics Driver via apt-get

cseeker commented Nov 24, 2016

I'm using ASUS mainboard and ubuntu 14.04.
If someone meet "ERROR: Unable to load the 'nvidia-drm' kernel module."
Check uefi secure boot.
Maybe your bios setting of it is for "windows".
Just change it to for "other OS"
Then, save and exit.
You can see the installed driver after reboot.

thank you so much

yoelfme commented Apr 5, 2017

Thank you so much @wangruohui, after more than 20 attempts with other guides, yours has saved my life.

Thank you so much. You really save my life after I struggled with many errors

working well,thanks..

Kif11 commented May 5, 2017

Very good comprehensive manual. Thanks.

visha-l commented Jun 2, 2017

I followed all the instruction but getting this error when i run make command in darknet.

ubuntu@ip-10-0-0-226:~/darknet$ make
nvcc  -gencode arch=compute_20,code=[sm_20,sm_21] -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=[sm_50,compute_50] -gencode arch=compute_52,code=[sm_52,compute_52]  -DGPU -I/usr/local/cuda-7.0/include/ --compiler-options "-Wall -Wfatal-errors  -Ofast -DGPU" -c ./src/convolutional_kernels.cu -o obj/convolutional_kernels.o
/bin/sh: 1: nvcc: not found
make: *** [obj/convolutional_kernels.o] Error 127

Done. The second version. And it worked, but now i am stuck in a login loop and can't find a solution. I am using Ubuntu 17.04

Owner

wangruohui commented Aug 29, 2017

@visha-l This is because /usr/local/cuda/bin is not in your path. You can try export PATH=$PATH:/usr/local/cuda/bin to solve your problem.

@vimac001 I am also stuck there. I hope someone can help us (without uninstalling nvidia)

this is awesome, thx @wangruohui

sli888 commented Sep 21, 2017

I have to say this is the best instruction out there for Nvidia Linux Driver installation! Thank you.

wa1618i commented Sep 30, 2017

THANKS

The first option that actually worked, thanks! Before this, my drivers were installed to /usr/lib/nvidia for some reason and adding those folders to PATH wouldn't help.

I do have one problem still - only one card out of 5 is visible, nothing changed to hardware setup. My friend told me it's possible that some default config was used on Ubuntu to enable only 1 card instead of all available, but I don't have enough knowledge about that. Any ideas what may be the issue?

michaelmbwang commented Nov 12, 2017

In the section Install NVIDIA Graphics Driver via apt-get, the command missing 'install', which should be sudo apt-get install nvidia-xxx nvidia-modprobe.
Not a big deal but save time.
But thx a lot for saving my life. I spent days on configuring what was going wrong with my Ubuntu.

Hi
Thanks for the detailed steps. I tried installing nvidia drivers using the 1st step (apt-get). I was installing nvidia-384 on Ubuntu 16.04 and I have Geforce GTX 960. When I tried downloading the official driver from the Nvidia website, I got the runfile corresponding to 384.98. However I see that apt-get nvidia-384 installs 384.90. I don't know if this is an issue :/
When I try the command nvidia-smi, it gives me the error:-
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I have purged all nvidia drivers right now. Could you suggest what I should do?

Thanks. Great work.

Salzie commented Nov 23, 2017

This is the best and complete guide. After weeks of trying and getting errors and shit, this is the one stop solution to all problems. Can't thank you enough for this.

htcai commented Nov 27, 2017

This post is amazingly helpful!

A minor note: I am wondering whether it has been pointed out somewhere else. In my Latest Fedora 27 Workstation, at the step of Excuting the Runfile, I experienced freezing of the desktop after I entered the command to execute the runfile. I guess it is due to the stopping of gdm. My solution is to switch to the virtual console by pressing Ctrl + Alt + 2 before stopping gdm, which worked perfectly following the description in the post.

thanks a million

kunth commented Dec 11, 2017

Great work. Thanks.

Your post is awesome !!!

Thanks for explaining almost of what I haven't know before 🥇

Cysu commented Dec 16, 2017

@wangruohui, Ruohui, thank you very much for the step-by-step guidelines. Eventually I have to use Ubuntu =]

bearpaw commented Dec 18, 2017

@Cysu I happened to get here today and found it's written by Bro Hui 23333

I do the Nvidia driver installation using ssh from another machine in case things go south...
