@nicolasrosa
Forked from morgangiraud/nvidia-reinstall.sh
Last active February 14, 2022 02:42
NVIDIA, CUDA, CUDNN and Tensorflow Installation

Successful Installations

[Ok] (Ubuntu) CUDA 8.0 + CuDNN 6.0 + tensorflow-gpu (0.12.1)

[Ok] (Ubuntu) CUDA 8.0 + CuDNN 6.0 + tensorflow-gpu (1.13.0)*

[Ok] (Fedora) CUDA 8.0 + CuDNN 6.0 + tensorflow-gpu (1.13.0)*

[ ] (Fedora 27) CUDA 8.0, not supported

[ ] (Fedora 27) CUDA 9.1 + CuDNN 7.0 + tensorflow-gpu (1.4), minimum compute capability 3.7**

* Only worked after running source ~/.bashrc

** CUDA works, but the latest TensorFlow version at the time (1.4) supports only CUDA 8.0, not CUDA 9. Installation works with either the *.rpm packages or the *.run file.

NVIDIA Driver Installation

ATTENTION! The following steps may cause a black screen of death!

(Fedora) Follow the tutorial: Fedora 28/27/26 nVidia Drivers Install Guide

OR

chmod +x /path/to/NVIDIA-Linux-*.run
su -
dnf update  # reboot afterwards, if necessary
dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc dkms acpid libglvnd-glx libglvnd-opengl libglvnd-devel pkgconfig
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
cat /etc/modprobe.d/blacklist.conf  # check the modification
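
Because `>>` appends unconditionally, re-running the echo line adds a duplicate entry each time. A minimal sketch of an idempotent alternative follows; `append_line_once` is a hypothetical helper name, not part of the original guide.

```shell
# Append a line to a file only if that exact line is not already present.
# Hypothetical helper, shown for illustration only.
append_line_once() {
    local file="$1" line="$2"
    # -q quiet, -x match whole lines, -F treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (as root):
#   append_line_once /etc/modprobe.d/blacklist.conf "blacklist nouveau"
```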

Edit '/etc/sysconfig/grub':

Append 'rd.driver.blacklist=nouveau' to the end of the 'GRUB_CMDLINE_LINUX="…"' value.
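
The same edit can be scripted instead of done by hand. The sketch below assumes GNU sed (as shipped with Fedora) and a single-line GRUB_CMDLINE_LINUX="..." entry; `append_kernel_arg` is a hypothetical helper name.

```shell
# Append a kernel argument inside GRUB_CMDLINE_LINUX="..." unless it is already there.
# Hypothetical helper; assumes GNU sed and a one-line, double-quoted value.
append_kernel_arg() {
    local file="$1" arg="$2"
    # Skip the edit if the argument already appears on the GRUB_CMDLINE_LINUX line.
    if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$arg"; then
        return 0
    fi
    # Keep a backup, then insert the argument just before the closing quote.
    cp "$file" "${file}.bak"
    sed -i "s/^\(GRUB_CMDLINE_LINUX=\"[^\"]*\)\"/\1 ${arg}\"/" "$file"
}

# Usage (as root):
#   append_kernel_arg /etc/sysconfig/grub rd.driver.blacklist=nouveau
```

The grub2-mkconfig step is still required afterwards for the change to take effect.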

Update grub2 conf:

## BIOS ##
grub2-mkconfig -o /boot/grub2/grub.cfg

## UEFI ##
grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg
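
If it is unclear which of the two commands applies, the boot mode can be detected: the directory /sys/firmware/efi exists only when the system was booted via UEFI. A small sketch, with `grub_cfg_path` as a hypothetical helper name and the two output paths taken from the commands above:

```shell
# Print the grub.cfg path matching the current boot mode.
# The optional argument overrides the probed directory (useful for testing).
grub_cfg_path() {
    local efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo /boot/efi/EFI/fedora/grub.cfg   # UEFI boot
    else
        echo /boot/grub2/grub.cfg            # legacy BIOS boot
    fi
}

# Usage (as root):
#   grub2-mkconfig -o "$(grub_cfg_path)"
```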

dnf remove xorg-x11-drv-nouveau

Edit /etc/dnf/dnf.conf. If it exists, remove the following line:

exclude=xorg-x11*

Generate initramfs:

## Backup old initramfs nouveau image ##
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img

## Create new initramfs image ##
dracut /boot/initramfs-$(uname -r).img $(uname -r)

Run:

systemctl set-default multi-user.target
reboot
su -
./NVIDIA-Linux-x86_64-387.34.run  # Attention!
nvidia-smi  # check that the NVIDIA driver was installed correctly
systemctl set-default graphical.target
reboot
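
Switching back to graphical.target when the driver did not install correctly is one way to end up at the black screen. A cautious sketch that only switches when the probe succeeds; `driver_ok` is a hypothetical helper, and nvidia-smi is used as the probe as in the steps above:

```shell
# Succeed only if the probe command exists and exits with status 0.
# Defaults to nvidia-smi; another command can be passed for testing.
driver_ok() {
    local probe="${1:-nvidia-smi}"
    command -v "$probe" >/dev/null 2>&1 && "$probe" >/dev/null 2>&1
}

# Usage (as root):
#   if driver_ok; then
#       systemctl set-default graphical.target && reboot
#   else
#       echo "nvidia-smi failed; staying in multi-user.target" >&2
#   fi
```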

Known Errors

Black Screen of Death

Fix:

systemctl set-default multi-user.target
reboot
sudo dnf reinstall 'xorg-*' 'mesa*'
systemctl set-default graphical.target
reboot

CUDA Installation from Repositories

Cuda 9

su -c 'dnf install wget make gcc-c++ freeglut-devel libXi-devel libXmu-devel mesa-libGLU-devel'
sudo dnf install http://developer.download.nvidia.com/compute/cuda/repos/fedora25/x86_64/cuda-repo-fedora25-9.0.176-1.x86_64.rpm
sudo dnf install cuda cuda-devel cuda-cudnn-devel

Helpful Links: https://fedoraproject.org/wiki/Cuda

CUDA Installation from *.run file

Cuda 8

Follow tutorial: Installing Nvidia’s CUDA 8.0 on Fedora 25

Edit the '~/.bashrc' file:

# CUDA 8.0 (active)
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/nvvm/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
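
Two side effects of plain appends in ~/.bashrc are worth knowing: every `source ~/.bashrc` adds the same directories again, and when LD_LIBRARY_PATH starts out empty the result begins with a colon, which the dynamic loader treats as the current directory. A sketch that avoids both; `path_append` is a hypothetical helper name:

```shell
# Print a colon-separated list with a directory appended, unless already present.
# $1 = current list value (may be empty), $2 = directory to add.
path_append() {
    local list="$1" dir="$2"
    case ":${list}:" in
        *":${dir}:"*) printf '%s\n' "$list" ;;                 # already listed
        *) printf '%s\n' "${list:+${list}:}${dir}" ;;          # no leading colon when empty
    esac
}

# Usage in ~/.bashrc:
#   export PATH="$(path_append "$PATH" /usr/local/cuda/bin)"
#   export LD_LIBRARY_PATH="$(path_append "$LD_LIBRARY_PATH" /usr/local/cuda/lib64)"
```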

Run:

source ~/.bashrc
nvcc -V

Cuda 9.1

Dependencies:

sudo dnf install libGLU-devel libXi-devel libXmu-devel glut glut-devel
sudo dnf groupinstall 'C Development Tools and Libraries'

Run:

mkdir /home/nicolas/Downloads/cuda_tmp
sudo sh cuda_9.1.85_387.26_linux.run --override --tmpdir=/home/nicolas/Downloads/cuda_tmp

Edit the '~/.bashrc' file:

# CUDA 9.1
# export PATH=/usr/local/cuda-9.1/bin:$PATH
# export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64:$LD_LIBRARY_PATH

export PATH=$PATH:$HOME/.local/bin:$HOME/bin:/usr/local/cuda-9.1/bin
export LIBRARY_PATH=$LIBRARY_PATH:$HOME/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.1/lib64:$HOME/lib
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:$HOME/NVIDIA_CUDA-9.1_Samples/common/inc:$HOME/include

Check that the installation was successful:

source ~/.bashrc
nvcc -V  # check that nvcc is working
cd ~/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/
make
./deviceQuery  # should print "Result = PASS"
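
The "Result = PASS" check can also be automated rather than read off the screen. A minimal sketch that reads the deviceQuery output on stdin; `device_query_passed` is a hypothetical helper name:

```shell
# Succeed if the text on stdin contains the deviceQuery success marker.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage (from the deviceQuery directory):
#   ./deviceQuery | device_query_passed && echo "CUDA device OK"
```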

CUDNN Installation

Run:

chmod +x install_cudnn.sh
./install_cudnn.sh  # see note (**) below

** You may need to change the cuDNN file's name.
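
One way to avoid editing the file name for each release is to look the archive up by pattern. The contents of install_cudnn.sh are not shown here, so the following is only a sketch under the assumption that the download is named like cudnn-9.0-linux-x64-v7.tgz; `find_cudnn_archive` is a hypothetical helper name.

```shell
# Print the first cudnn-*.tgz archive found in a directory (default: current directory).
# Returns non-zero when no archive is present.
find_cudnn_archive() {
    local dir="${1:-.}" f
    for f in "$dir"/cudnn-*.tgz; do
        # If nothing matches, the pattern stays literal, so check for a real file.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Usage:
#   archive="$(find_cudnn_archive ~/Downloads)" && tar -zxvf "$archive"
```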

Tensorflow Installation

Dependencies:

sudo apt-get install python3-pip python3-dev

Latest:

sudo -H pip2 install --upgrade tensorflow-gpu  # Python 2.7
sudo -H pip3 install --upgrade tensorflow-gpu  # Python 3.x

Older Versions:

# Choose ONE of the wheel URLs below; if both lines are run, the second export overrides the first.
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0-cp35-cp35m-linux_x86_64.whl
sudo -H pip3 install --upgrade "$TF_BINARY_URL"

sudo -H pip3 install tensorflow-gpu==1.2.0rc0
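
The cp35 part of the wheel file names above is the CPython tag, so those wheels install only under Python 3.5. The sketch below compares a wheel's tag with the running interpreter before installing. Both function names are hypothetical, and the parsing assumes a wheel name without the optional build tag (name-version-pytag-abitag-platform.whl).

```shell
# Print the Python tag (third dash-separated field) of a wheel file name or URL.
# Assumes the wheel name carries no optional build tag.
wheel_py_tag() {
    basename "$1" | cut -d- -f3
}

# Print the tag of the python3 on PATH, e.g. cp35 for CPython 3.5.
current_py_tag() {
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
}

# Usage:
#   if [ "$(wheel_py_tag "$TF_BINARY_URL")" != "$(current_py_tag)" ]; then
#       echo "wheel was built for a different Python version" >&2
#   fi
```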

Check if it's working:

python3 -c "import tensorflow"
python3 test_tensorflow.py
# Script to manually reinstall NVIDIA drivers, CUDA 9.0 and cuDNN 7.1 on Ubuntu 16.04
# Remove anything linked to nvidia
sudo apt-get remove --purge 'nvidia*'
sudo apt-get autoremove
# Search for your driver
apt search nvidia
# Select one driver (the last one is a decent choice)
sudo apt install nvidia-370
# ERROR - Problem with the NVIDIA driver (black screen of death, Ubuntu could not load the account)
# Run the commands below only if you hit this error.
# To reconfigure xorg.conf, first move your current /etc/X11/xorg.conf out of the way.
# If things go wrong you might need it again later:
#   sudo mv /etc/X11/xorg.conf /etc/X11/xorg.conf.BACKUP
# The following steps install the nouveau driver and configure the X server accordingly:
#   sudo apt-get install nouveau-firmware
#   sudo dpkg-reconfigure xserver-xorg
# Follow the on-screen steps, answering the wizard's questions, and you should be able
# to restore or reconfigure the previous nouveau state.
# Test the driver (run nvidia-smi after the reboot)
sudo shutdown -r now
nvidia-smi
# If it doesn't work, Secure Boot may be the cause: disable it in your firmware (BIOS/UEFI) settings and test again
# Install cuda
# Get your deb cuda file from https://developer.nvidia.com/cuda-downloads
sudo dpkg -i cuda-repo-ubuntu<Version>.deb
sudo apt update
sudo apt install cuda
# Add cuda to your PATH and install the toolkit
# Also add them to your .bashrc file
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-9.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
nvcc --version
# Use the toolkit to check your CUDA capable devices
cuda-install-samples-9.0.sh ~/.
cd ~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery
make
sudo shutdown -r now
# Test cuda
cd ~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery
./deviceQuery
# Download the cuDNN archive (.tgz) from the NVIDIA website:
# https://developer.nvidia.com/rdp/cudnn-download
# Install cudnn
tar -zxvf cudnn-9.0-linux-x64-v7.tgz
sudo mv cuda/include/* /usr/local/cuda-9.0/include/.
sudo mv cuda/lib64/* /usr/local/cuda-9.0/lib64/.
# Reload your shell
. ~/.bashrc
# Tensorflow Installation
sudo apt-get install cuda-command-line-tools-9-0
sudo pip3 install tensorflow-gpu
sudo pip3 install opencv-python
sudo pip3 install matplotlib
sudo apt install python3-tk