@davideuler
Forked from denguir/cuda_install.md
Created October 28, 2023 13:20
Installation procedure for CUDA & cuDNN

How to install CUDA & cuDNN on Ubuntu 22.04

Install NVIDIA drivers

Update & upgrade

sudo apt update && sudo apt upgrade

Remove previous NVIDIA installation

sudo apt autoremove --purge 'nvidia*'

Check Ubuntu devices

ubuntu-drivers devices

Note the driver version tagged recommended in the output; that is the one to install.
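Picking the recommended line out of the listing by eye is easy to get wrong when several drivers are offered. The following is a minimal sketch of doing it programmatically. The sample text is an illustrative assumption about what `ubuntu-drivers devices` prints, not output captured from the author's machine, and `recommended_driver` is a hypothetical helper name.

```python
import re
from typing import Optional

# Illustrative sample only (an assumption); real output varies by GPU and release.
SAMPLE_OUTPUT = """\
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
vendor   : NVIDIA Corporation
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-525 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin
"""


def recommended_driver(devices_output: str) -> Optional[str]:
    """Return the package on the 'driver' line tagged 'recommended', if any."""
    for line in devices_output.splitlines():
        # Expected shape: "driver   : <package> - <tags...>"
        match = re.match(r"\s*driver\s*:\s*(\S+)\s+-\s+(.*)", line)
        if match and "recommended" in match.group(2).split():
            return match.group(1)
    return None


print(recommended_driver(SAMPLE_OUTPUT))
```

To run it against a real system, pass in the standard output of `subprocess.run(["ubuntu-drivers", "devices"], capture_output=True, text=True)` instead of the sample string.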

Install Ubuntu drivers

sudo ubuntu-drivers autoinstall

Install NVIDIA drivers

The recommended version on my machine was 525; substitute the one recommended for yours.

sudo apt install nvidia-driver-525

Reboot & Check

sudo reboot

After the restart, verify that the following command works:

nvidia-smi
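If you want to run this check from a script, for example in a provisioning job, the sketch below wraps `nvidia-smi` and returns the GPU name and driver version. It assumes the `--query-gpu` and `--format=csv,noheader` options are available in your `nvidia-smi`; the function names are hypothetical.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple


def parse_gpu_query(csv_text: str) -> List[Tuple[str, str]]:
    """Parse 'name, driver_version' CSV rows into (name, driver_version) pairs."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma inside the GPU name is preserved.
        name, _, driver = line.rpartition(",")
        rows.append((name.strip(), driver.strip()))
    return rows


def query_gpus() -> Optional[List[Tuple[str, str]]]:
    """Return GPU info, or None if nvidia-smi is missing, fails, or times out."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version",
             "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return parse_gpu_query(result.stdout)


gpus = query_gpus()
print(gpus if gpus else "nvidia-smi is unavailable or cannot reach the driver")
```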

Install CUDA

Update & upgrade

sudo apt update && sudo apt upgrade

Install CUDA toolkit

sudo apt install nvidia-cuda-toolkit

Check CUDA install

nvcc --version
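The release number in this output is what later steps depend on, so it can help to extract it rather than read it off the banner. A minimal sketch follows; the sample text is an illustrative assumption about the shape of `nvcc --version` output, and `cuda_release` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Illustrative sample (an assumption); the release on your machine will differ.
SAMPLE_NVCC = """\
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.5, V11.5.119
"""


def cuda_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from the 'release X.Y' field, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(cuda_release(SAMPLE_NVCC))  # (11, 5)
```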

Install cuDNN

Download cuDNN .deb file

Download the cuDNN .deb file from the NVIDIA developer site; you will need an NVIDIA account. Select the cuDNN build that matches your CUDA version, which is the release number printed by:

nvcc --version

Install cuDNN

sudo apt install ./<filename.deb>
sudo cp /var/cudnn-<something>.gpg /usr/share/keyrings/

My cuDNN version is 8, adapt the following to your version:

sudo apt update
sudo apt install libcudnn8
sudo apt install libcudnn8-dev
sudo apt install libcudnn8-samples
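The steps above have no verification command, unlike the driver and toolkit sections. One possible check, sketched below, loads the shared library and calls `cudnnGetVersion()`. It assumes the cuDNN 8 soname `libcudnn.so.8` and the cuDNN 8 encoding of the version number (major*1000 + minor*100 + patch); both function names are hypothetical, and other major versions would need adjusting.

```python
import ctypes
from typing import Optional, Tuple


def decode_cudnn8_version(number: int) -> Tuple[int, int, int]:
    """Split a cuDNN 8 version number (major*1000 + minor*100 + patch)."""
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def installed_cudnn8_version(
    soname: str = "libcudnn.so.8",
) -> Optional[Tuple[int, int, int]]:
    """Return the installed cuDNN 8 version, or None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        return None
    # cudnnGetVersion takes no arguments and returns a size_t.
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn8_version(lib.cudnnGetVersion())


print(installed_cudnn8_version())
```

A result of `None` means the dynamic loader could not find or load the library, which usually points at a failed or incomplete install.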

Test CUDA with PyTorch

Create a virtualenv and activate it

sudo apt-get install python3-pip
python3 -m pip install --user virtualenv
python3 -m virtualenv -p python3.10 venv
source venv/bin/activate

Install PyTorch

pip3 install torch torchvision torchaudio

Open Python and execute a test

import torch
print(torch.cuda.is_available()) # should be True

t = torch.rand(10, 10).cuda()
print(t.device) # should be cuda:0
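When the test above prints False, it helps to know whether the wheel is a CPU-only build or whether the driver is the problem. The sketch below gathers that information and, if a GPU is usable, compares a small matrix product against the CPU result. The function name `cuda_report` and the tolerance values are assumptions for illustration.

```python
from typing import Any, Dict


def cuda_report() -> Dict[str, Any]:
    """Collect basic facts about the PyTorch/CUDA setup without raising."""
    try:
        import torch
    except ImportError:
        return {"torch": None}  # PyTorch is not installed in this environment

    report: Dict[str, Any] = {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
        # Run the same matmul on CPU and GPU; loose tolerances allow for
        # reduced-precision GPU math modes.
        a = torch.rand(64, 64)
        b = torch.rand(64, 64)
        on_gpu = (a.cuda() @ b.cuda()).cpu()
        report["matmul_matches_cpu"] = torch.allclose(
            a @ b, on_gpu, rtol=1e-2, atol=1e-2
        )
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `built_for_cuda` is `None`, the installed wheel has no CUDA support and reinstalling the driver will not help.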
davideuler commented Nov 7, 2023

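If nvidia-smi fails with either of these errors:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

No devices were found

purge the existing NVIDIA and CUDA packages, then reinstall from the CUDA runfile: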

sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt autoremove

wget https://developer.download.nvidia.com/compute/cuda/12.3.0/local_installers/cuda_12.3.0_545.23.06_linux.run
sudo sh cuda_12.3.0_545.23.06_linux.run

@davideuler

$ conda create -n pytorch
$ conda activate pytorch

$ conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

Check whether CUDA is available:

$ python -m torch.utils.collect_env

NVIDIA Container Toolkit installation guide:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

davideuler commented Nov 8, 2023

Pip packages for PyTorch that work with CUDA

PyTorch and CUDA version compatibility list:

https://pytorch.org/get-started/previous-versions/

Q1: The driver reports CUDA 11.4 (the version shown by nvidia-smi). How do I install torch?
A: With a CUDA 11.4/11.6 driver, you can use the pip packages built for cu113 or cu118. Detailed discussion:
pytorch/pytorch#75992

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

You can also download the wheel from a mirror and install it; this is the build for Python 3.10:
wget https://mirror.sjtu.edu.cn/pytorch-wheels/cu113/torch-1.12.0+cu113-cp310-cp310-linux_x86_64.whl

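Q2: The system only has a CUDA 11.4 driver (per nvidia-smi) and I need torch 2.0.1 or later. How do I install torch?
A: If you need torch 2.0.1 or later but the Ubuntu system only has a CUDA 11.4 driver, use the torch 2.0.1 build compiled for CUDA 11 (cu118):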
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

Reference: https://discuss.pytorch.org/t/which-pytorch-version-2-0-1-support-cuda-11-4/190446/3
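Both answers pair an 11.4 driver with wheels built for another CUDA 11.x toolkit (cu113, cu118). A rough sketch of that rule of thumb follows. It assumes that sharing the major version is the criterion, which is a simplification of the linked discussions and not an official compatibility check. The helper names are hypothetical; the index URL pattern is the one used in the pip commands above.

```python
import re
from typing import Optional, Tuple


def driver_cuda_version(smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def wheel_cuda_version(tag: str) -> Tuple[int, int]:
    """Turn a wheel tag such as 'cu118' into (11, 8); the last digit is the minor."""
    match = re.fullmatch(r"cu(\d+)(\d)", tag)
    if match is None:
        raise ValueError(f"unrecognised wheel tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def same_major(driver: Tuple[int, int], tag: str) -> bool:
    """Assumed rule of thumb: the wheel and the driver share a CUDA major version."""
    return driver[0] == wheel_cuda_version(tag)[0]


def index_url(tag: str) -> str:
    """Build the wheel index URL in the form used by the pip commands above."""
    return f"https://download.pytorch.org/whl/{tag}"


banner = "| NVIDIA-SMI 470.82.01  Driver Version: 470.82.01  CUDA Version: 11.4 |"
driver = driver_cuda_version(banner)
for tag in ("cu113", "cu118", "cu121"):
    print(tag, same_major(driver, tag), index_url(tag))
```

The banner string is an illustrative sample, not output from the author's machine.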
