Skip to content

Instantly share code, notes, and snippets.

@PeterPirog
Forked from khansun/LinuxCUDAtoolkits.md
Created December 3, 2022 17:39
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save PeterPirog/20f7fab72208ddb9487ab0f82c6565e8 to your computer and use it in GitHub Desktop.
Save PeterPirog/20f7fab72208ddb9487ab0f82c6565e8 to your computer and use it in GitHub Desktop.
Multiple versions of CUDA toolkit and CUDNN installation guide for Linux and WSL2

Solve python environment related issues to utilize CUDA enabled GPUs

NVIDIA GPU Driver

  • Check status from terminal: nvidia-smi.

  • Skip this step if you have the following output:


+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.52       Driver Version: 511.79       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8    N/A /  N/A |      0MiB /  2048MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Remove previous NVIDIA drivers if needed: Linux

  • sudo dpkg -P $(dpkg -l | grep nvidia-driver | awk '{print $2}')

  • sudo apt autoremove

  • sudo lshw -C display

NVIDIA Apmere cards including 3070, 3080 and 3090 do not work with CUDA 10.
You have to use CUDA 11.0 or higher.
Right now, the only way to do so is by installing tf-nightly or building yourself.
Works with TensorFlow version 2.5

Install proper NVIDIA driver: Linux

  • sudo ubuntu-drivers devices

  • sudo ubuntu-drivers autoinstall

Install proper NVIDIA driver: WSL2

Easy way: Recommended

Using Anaconda/Miniconda:

  • Create virtual environment 'tf': conda create -n tf -c nvidia -c defaults -c anaconda python=3.9 cudatoolkit cudnn
  • Activate tf environment: conda activate tf
  • Install TensorFlow GPU: pip install tensorflow-gpu

If importing tensorflow appears to have missing libraries:

  • vim ~/.bashrc
  • Append the following lines:
# Missing dynamic libraries for conda env
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/home/[khansun]*/miniconda3/envs/tf/lib/"
  • source ~/.bashrc

Test Tensorflow GPU:

  • Run the following python code:
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

def cpu():
  with tf.device('/cpu:0'):
    random_image_cpu = tf.random.normal((100, 100, 100, 3))
    net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
    return tf.math.reduce_sum(net_cpu)

def gpu():
  with tf.device('/device:GPU:0'):
    random_image_gpu = tf.random.normal((100, 100, 100, 3))
    net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
    return tf.math.reduce_sum(net_gpu)
  
# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

# Run the op several times.
print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of ten runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=10, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=10, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))

print("TensorFlow Version: " + str(tf.__version__))
  • Output For Intel Core i7-7700HQ CPU and GeForce GTX 1050:
CPU (s):
1.2489470999998957
GPU (s):
0.06826900000032765
GPU speedup over CPU: 18x
TensorFlow Version: 2.8.0

Rigorous way: If you know what you're doing!

Install CUDA toolkit: 10.1

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin

sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"

sudo apt-get update

sudo apt-get -y install cuda-10.1

Install CUDA toolkit: 11.2

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin

sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"

sudo apt-get update
sudo apt-get -y install cuda-11.2

Install CUDNN dynamic libraries:

 tar -xzvf cudnn-10.1-linux-x64-v7.6.5.32.tgz

 sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
 
 sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
 
 sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

Update environment variable

vim ~/.bashrc

  • Append the following lines to it:
# CUDA related exports
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

export PATH=/usr/local/cuda-11.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# cubalas

export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/cuda-10.2/targets/x86_64-linux/include/
export PATH=/usr/local/cuda-10.2/targets/x86_64-linux/include/${PATH:+:${PATH}}

export LD_LIBRARY_PATH="/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export PATH="/usr/local/cuda/bin:${PATH:+:${PATH}}"
export LIBRARY_PATH="/usr/local/cuda-10.1/lib64:${LIBRARY_PATH:+:${LIBRARY_PATH}}


export PATH=/usr/local/cuda-11.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
"
  • Create CUDA library Path file and append the following text:

    sudo touch /etc/profile.d/cuda.sh

export PATH=/usr/local/cuda-11.2/bin:$PATH
export CUDADIR=/usr/local/cuda-11.2
  • Update changes to ENV valiable:

    source ~/.bashrc

    source /etc/profile.d/cuda.sh

The NVCC should have the following config:

  • Run from terminal: nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment