mantasu/install-cuda-tf-pytorch.md

## install-cuda-tf-pytorch.md

      
Install TensorFlow & PyTorch with CUDA [Linux | WSL2]

Overview

This guide walks through installing TensorFlow and PyTorch with NVIDIA GPU support in a Linux environment, including WSL2 (Windows Subsystem for Linux). It focuses on Ubuntu 22.04 and WSL2 on Windows 11, but the steps should also work on other reasonably recent versions. From what I've checked, there is no single consistent set of guidelines, so hopefully this one clears things up (and serves as a reminder for me).

To install purely on Windows 10/11 (no WSL), I suggest following this tutorial.

GPU Setup

NVIDIA Driver

Please install the newest NVIDIA drivers. There are plenty of tutorials on how to do this; here are some examples:

Linux - guide by Shahriar Shovon on how to install & uninstall NVIDIA drivers on Ubuntu 22.04 LTS
Windows 11 (WSL2 users) - use GeForce Experience for automatic updates or update manually by following the official guide
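
Once the driver is installed, it can help to confirm that `nvidia-smi` is reachable before moving on. The snippet below is a minimal sketch of such a check; the helper name `driver_version` is made up for illustration, and it assumes `nvidia-smi` is on your `PATH` (on WSL2 it is provided by the Windows driver).

```python
import shutil
import subprocess


def driver_version():
    """Return the NVIDIA driver version string, or None if it cannot be determined."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # driver tools are not installed or not on PATH
    try:
        out = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        return None  # nvidia-smi exists but could not talk to the driver
    lines = out.strip().splitlines()
    # One line is printed per GPU; the driver version is the same for all of them
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print(driver_version() or "No working NVIDIA driver found")
```

If this prints a version number, the driver side is most likely fine; if not, revisit the driver installation before installing CUDA.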

CUDA

The CUDA Toolkit can be installed by following the official guides. NVIDIA drivers are backward compatible with applications built against older CUDA versions, so please install the newest version.

Linux - install the deb/rpm package (preferably the network version) from the NVIDIA page. If needed, check the guide
WSL2 - install WSL2 on Windows 11 by following this guide, then install the CUDA Toolkit by following the subsequent guide


Notice for WSL2 users: as mentioned in the official Ubuntu guide, "the CUDA driver used is part of the Windows driver installed on the system" so make sure to follow those steps since installation is not the same as on a separate Linux system.
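
If you are unsure whether a shell is running under WSL, the kernel release string usually gives it away. The following is a rough sketch, not an official detection method: it assumes the release string contains "microsoft" (true for typical WSL kernels, but it does not distinguish WSL1 from WSL2) and that the Windows driver exposes its CUDA libraries under `/usr/lib/wsl/lib`. The function names are hypothetical.

```python
import platform
from pathlib import Path

# Directory where the Windows driver exposes CUDA libraries inside WSL (assumed)
WSL_LIB_DIR = Path("/usr/lib/wsl/lib")


def is_wsl(release=None):
    """Guess whether the kernel release string belongs to a WSL kernel."""
    release = platform.release() if release is None else release
    return "microsoft" in release.lower()


def wsl_cuda_libraries(lib_dir=WSL_LIB_DIR):
    """List libcuda files found in the WSL library directory (empty if none)."""
    if not lib_dir.is_dir():
        return []
    return sorted(p.name for p in lib_dir.glob("libcuda.so*"))


if __name__ == "__main__":
    print("Running under WSL:", is_wsl())
    print("Driver libraries:", wsl_cuda_libraries() or "none found")
```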

cuDNN

From here the installation is the same for both Linux and WSL2 users. All there is to do is, again, to follow the official guide. To keep things simple:

Ensure you're registered for the NVIDIA Developer Program
Install Zlib as specified here
Download the cuDNN deb*/rpm package for the newest CUDA version here (complete the survey and accept the terms)
Install the downloaded file as specified here (don't install libcudnn8-samples)


*For Debian releases, check the architecture type (to download the correct deb file):

$ dpkg --print-architecture
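
After the cuDNN steps above, one way to confirm which version ended up on the system is to read the `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` macros from `cudnn_version.h`. This is only a sketch: the two header locations are assumptions (they depend on how the package was installed), and the function names are invented for this example.

```python
import re
from pathlib import Path

# Assumed header locations; adjust to wherever your package placed the file
CANDIDATE_HEADERS = [
    Path("/usr/include/cudnn_version.h"),
    Path("/usr/local/cuda/include/cudnn_version.h"),
]


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from cudnn_version.h contents, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{key}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version():
    """Return the version from the first readable candidate header, or None."""
    for header in CANDIDATE_HEADERS:
        if header.is_file():
            version = parse_cudnn_version(header.read_text(errors="ignore"))
            if version is not None:
                return version
    return None


if __name__ == "__main__":
    print("cuDNN version:", installed_cudnn_version() or "header not found")
```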
Path Setup & Version Management

Add the following lines to your ~/.profile or ~/.bash_profile file (alternatively, ~/.bashrc also works):
# Add CUDA binaries and shared libraries to the search paths
export PATH=$PATH:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
# Optional: include/ holds headers only, so the dynamic linker does not need it
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/include

# UNCOMMENT if you use WSL2
# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/wsl/lib

Note that /usr/local/ should contain the newest CUDA directory, e.g., currently, cuda-11.7. There may also be /usr/local/cuda-11 and /usr/local/cuda, which are simply symbolic links to the newest cuda-X.Y directory. The exports use /usr/local/cuda because, if the version changes, that link should point to the selected one.
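
After opening a new shell, you may want to verify that the exports took effect and see where `/usr/local/cuda` actually points. The sketch below checks the two essential entries from the exports above; `missing_entries` and `EXPECTED` are illustrative names, not part of any tool.

```python
import os


def path_entries(value):
    """Split a colon-separated search path into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def missing_entries(env, expected):
    """Return {variable: [missing dirs]} for each variable lacking an expected dir."""
    missing = {}
    for var, dirs in expected.items():
        present = {os.path.normpath(p) for p in path_entries(env.get(var, ""))}
        absent = [d for d in dirs if os.path.normpath(d) not in present]
        if absent:
            missing[var] = absent
    return missing


# Directories taken from the exports in this guide
EXPECTED = {
    "PATH": ["/usr/local/cuda/bin"],
    "LD_LIBRARY_PATH": ["/usr/local/cuda/lib64"],
}

if __name__ == "__main__":
    print("Missing entries:", missing_entries(os.environ, EXPECTED) or "none")
    # realpath resolves symbolic links, revealing the versioned target directory
    print("/usr/local/cuda ->", os.path.realpath("/usr/local/cuda"))
```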

You can switch between CUDA and cuDNN versions (if they were installed from deb/rpm) with the following commands:
$ sudo update-alternatives --config cuda # switch CUDA version
$ sudo update-alternatives --config libcudnn # switch cuDNN version
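
To confirm which CUDA version is active after switching, one option is to ask the `nvcc` found on `PATH` for its release number. This is a small sketch that assumes `nvcc --version` prints a line containing `release X.Y`; the helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the 'X.Y' release number from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def active_cuda_release():
    """Return the release reported by the nvcc found on PATH, or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # /usr/local/cuda/bin is probably not on PATH yet
    try:
        result = subprocess.run(
            [nvcc, "--version"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("Active CUDA release:", active_cuda_release() or "nvcc not found")
```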


Reboot as a final step of the GPU setup.

Package & Library Setup

Conda

The installation for Anaconda is as follows (official guide):

Download the Anaconda installer for your Linux distribution
Run the installer (replace 2022.05 and x86_64 with the downloaded version)
$ bash Anaconda3-2022.05-Linux-x86_64.sh # type `yes` at the end to init conda

You can disable automatic activation of the base environment:
$ conda config --set auto_activate_base false


In case you want to remove it later, just remove the entire installation directory:
$ rm -rf "$(conda info --base)" # base install path, regardless of the active environment

You don't have to install Anaconda, as you can simply create a virtual environment (e.g., with venv) for every project. However, Anaconda or Miniconda makes environments easier to manage and alleviates some issues with GPU setup, if there are any.
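
Whichever tool you pick, it is easy to install packages into the wrong environment by accident. The sketch below reports which interpreter and environment are currently in use; it relies on the `CONDA_DEFAULT_ENV` and `CONDA_PREFIX` variables that `conda activate` sets, and `describe_environment` is a name invented for this example.

```python
import os
import sys


def describe_environment(env=None):
    """Summarise which Python interpreter and environment are in use."""
    env = os.environ if env is None else env
    return {
        "python": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; absent outside conda
        "conda_env": env.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": env.get("CONDA_PREFIX"),
        # venv/virtualenv make sys.prefix differ from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it right before `pip install` shows whether the packages will land in the environment you expect.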

TensorFlow

Create a new conda environment and install TensorFlow (CUDA 11.7 or newer should be backwards compatible):
$ conda create -n tf-ws python=3.10 # currently Python 3.10.4 is the newest
$ conda activate tf-ws
$ pip install tensorflow
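To test if TensorFlow supports GPU:
$ python
>>> import tensorflow as tf
>>> tf.config.list_physical_devices('GPU') # tf.test.is_gpu_available() is deprecated
...
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]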
>>> exit()
$ conda deactivate

For WSL2 users: there should be warnings about NUMA support but they can be ignored as it is a small side effect of WSL2.
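
Seeing a GPU listed does not guarantee that kernels actually run on it, so a slightly stronger check is to execute a small computation on the device. The following is a minimal sketch under the assumption that TensorFlow was installed in the active environment; `tensorflow_gpu_report` is a hypothetical helper name.

```python
def tensorflow_gpu_report():
    """Return GPU device names seen by TensorFlow, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        # Run a small matrix product on the first GPU to confirm kernels execute
        with tf.device("/GPU:0"):
            x = tf.random.normal((256, 256))
            _ = tf.linalg.matmul(x, x).numpy()
    return [gpu.name for gpu in gpus]


if __name__ == "__main__":
    report = tensorflow_gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment")
    else:
        print("GPUs visible to TensorFlow:", report or "none")
```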

PyTorch

Create a new conda environment and install PyTorch (CUDA 11.7 or newer should be backwards compatible with the cu116 wheels used below):
$ conda create -n torch-ws python=3.10 # currently Python 3.10.4 is the newest
$ conda activate torch-ws
$ pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
$ pip install pytorch-lightning # optional; alternatively: pip install pytorch-ignite
To test if PyTorch supports GPU:
$ python
>>> import torch
>>> torch.cuda.is_available()
True
>>> exit()
$ conda deactivate
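
For a slightly more detailed check than `torch.cuda.is_available()`, the sketch below also reports the CUDA version the wheel was built with and runs a small matrix product on the GPU. It assumes PyTorch is installed in the active environment; `torch_gpu_report` is a name made up for illustration.

```python
def torch_gpu_report():
    """Return a dict describing PyTorch's GPU support, or None if it is not installed."""
    try:
        import torch
    except ImportError:
        return None
    report = {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device": None,
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        # Run a small matrix product on the GPU to confirm kernels execute
        x = torch.rand(256, 256, device="cuda")
        report["checksum"] = float((x @ x).sum())
    return report


if __name__ == "__main__":
    print(torch_gpu_report() or "PyTorch is not installed in this environment")
```

If `built_with_cuda` is `None`, a CPU-only wheel was installed, which usually means the `--extra-index-url` option was missed during installation.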