-
-
Save Mahedi-61/2a2f1579d4271717d421065168ce6a73 to your computer and use it in GitHub Desktop.
#!/bin/bash | |
### steps #### | |
# Verify the system has a cuda-capable gpu | |
# Download and install the nvidia cuda toolkit and cudnn | |
# Setup environmental variables | |
# Verify the installation | |
### | |
### to verify your gpu is cuda enable check | |
lspci | grep -i nvidia | |
### If you have previous installation remove it first. | |
sudo apt-get purge nvidia* | |
sudo apt remove nvidia-* | |
sudo rm /etc/apt/sources.list.d/cuda* | |
sudo apt-get autoremove && sudo apt-get autoclean | |
sudo rm -rf /usr/local/cuda* | |
# system update | |
sudo apt-get update | |
sudo apt-get upgrade | |
# install other import packages | |
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev | |
# first get the PPA repository driver | |
sudo add-apt-repository ppa:graphics-drivers/ppa | |
sudo apt update | |
# install nvidia driver with dependencies | |
sudo apt install libnvidia-common-470 | |
sudo apt install libnvidia-gl-470 | |
sudo apt install nvidia-driver-470 | |
# installing CUDA-11.8 | |
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.0-1_all.deb | |
sudo dpkg -i cuda-keyring_1.0-1_all.deb | |
sudo apt-get update | |
sudo apt-get -y install cuda | |
# setup your paths | |
echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc | |
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc | |
source ~/.bashrc | |
sudo ldconfig | |
# install cuDNN v8.9.7 | |
# First register here: https://developer.nvidia.com/developer-program/signup | |
CUDNN_TAR_FILE="cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz" | |
wget https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.7/local_installers/11.x/cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz | |
tar -xvf ${CUDNN_TAR_FILE} | |
# copy the following files into the cuda toolkit directory. | |
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include | |
$ sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64 | |
$ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn* | |
# Finally, to verify the installation, check | |
nvidia-smi | |
nvcc -V | |
# install Pytorch (an open source machine learning framework) | |
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 |
If you run this on shell, tensorflow recognizes gpus?
I ran this shell script, and seemed like there was no problem running it, but tensorflow-gpu still doesn't recognize gpus.
Tensorflow-gpu version is 2.3.0 and this version must also be compatible with cuda 10.1 and cudnn 7.6.
Try this
sudo apt-get install -y --no-install-recommends
cuda-10-1
libcudnn7=7.6.0.64-1+cuda10.1
libcudnn7-dev=7.6.0.64-1+cuda10.1;
sudo apt-get install -y --no-install-recommends
libnvinfer6=6.0.1-1+cuda10.1
libnvinfer-dev=6.0.1-1+cuda10.1
libnvinfer-plugin6=6.0.1-1+cuda10.1;
also as someone above said, cuda 10.1 install some cuda 10.2 components
it works on my pc. Thanks!
can you tell what changes should be done for cuda 11.1 in 18.04 ubuntu system.
Thanks
Was having issues getting the TensorFlow Object Detection API to work without errors. This guide worked for Ubuntu 20.04, CUDA 11.2, CuDNN 8.1.0 and TensorFlow 2.6.
Thanks a lot!
@mnielsen There is extra i in sudo apt install libnividia-gl-470. I think it should be sudo apt install libnvidia-gl-470
@yummyKnight Thanks for your correction.
Can someone tell me is sudo ubuntu-drivers autoinstall
the same as three following commands? Do they do the same job?
sudo apt install libnvidia-common-470
sudo apt install libnvidia-gl-470
sudo apt install nvidia-driver-470
After installing this I was getting the following (non-fatal) warning
>>> import tensorflow as tf
>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-11-24 09:01:58.877869: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.899255: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-24 09:01:58.900051: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Num GPUs Available: 1
I resolved it by following tensorflow/tensorflow#53184
for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a $a/numa_node; done
Thanks for nice repo
I have installed using your instruction. but when type nvidia-smi it shows 11.5. Why, how can I install 11.2?
Works great up till cuDNN, and then I get the following
$ wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
--2022-02-13 13:24:40-- https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.1.1.33/11.2_20210301/cudnn-11.2-linux-x64-v8.1.1.33.tgz
Resolving developer.nvidia.com (developer.nvidia.com)... 152.195.19.142
Connecting to developer.nvidia.com (developer.nvidia.com)|152.195.19.142|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-02-13 13:24:41 ERROR 403: Forbidden.
EDIT: This link worked: wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.1.1/cudnn-11.2-linux-x64-v8.1.1.33.tgz
I want to install CUDA 11.3 or higher version on Ubuntu 18.04 (which is installed using a Virtual Machine). Which instructions should I follow?
Thanks for nice repo I have installed using your instruction. but when type nvidia-smi it shows 11.5. Why, how can I install 11.2?
I had to implement the end of this tutorial:
https://towardsdatascience.com/installing-multiple-cuda-cudnn-versions-in-ubuntu-fcb6aa5194e2
I used his edit of bash so tensorflow (in my case) can choose what cuda toolkit use, and it worked.
Thank you very much @Mahedi-61, much appreciated
RTX 3090 requires driver version of 515 (not 470).
# install nvidia driver with dependencies
sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515
I am wondering whether these work for installing cuda 11.3 on ubuntu 22.04 also?
Will it work for nvidia-server on ubuntu 20.04 server ?
install nvidia driver with dependencies
sudo apt install libnvidia-common-470-server
sudo apt install libnvidia-gl-470-server
sudo apt install nvidia-driver-470-server
Will it work for nvidia-server on ubuntu 20.04 server ?
install nvidia driver with dependencies
sudo apt install libnvidia-common-470-server sudo apt install libnvidia-gl-470-server sudo apt install nvidia-driver-470-server
@saravananpsg It's works for server. I tested. I also changed 470 to 515 to support 3090.
I also had to change the version from 470 to 515 for a 1070 TI.
sudo apt install libnvidia-common-515
sudo apt install libnvidia-gl-515
sudo apt install nvidia-driver-515
After installing, if nvidia-smi
gives a kernel/client version mismatch error, reboot.
This helped A LOT! Thanks!
Thank you! It was veeery helpful!
Thank you verry much you just forgotten a star character after cudnn here :
sudo cp -P cuda/include/cudnn*.h /usr/local/cuda-11.3/include
Verry important because else an error can be encountered while compiling for example pytorch "cudnn_version.h" not found.
Regards
tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar.xz
xz: (stdin): File format not recognized
tar: Child returned status 1
tar: Error is not recoverable: exiting now
I get this error
I have my tar and xz installed
with this command :
$ sudo cp cudnn--archive/include/cudnn.h /usr/local/cuda/include
message error :
cp: cannot stat 'cudnn--archive/include/cudnn.h': No such file or directory
the same with the other commands :
$ sudo cp cudnn--archive/include/cudnn.h /usr/local/cuda/include
====> cp: cannot stat 'cudnn--archive/include/cudnn.h': No such file or directory
$sudo cp cudnn--archive/include/cudnn.h /usr/local/cuda/include
====> cp: cannot stat 'cudnn--archive/include/cudnn.h': No such file or directory
If you run this on shell, tensorflow recognizes gpus?
I ran this shell script, and seemed like there was no problem running it, but tensorflow-gpu still doesn't recognize gpus.
Tensorflow-gpu version is 2.3.0 and this version must also be compatible with cuda 10.1 and cudnn 7.6.