mikaelhg/cuda-install.md

## cuda-install.md

      
| date | title | tags |
| --- | --- | --- |
| 2020-02-29 | Proper CUDA and cuDNN installation | tech |
  
You're here, so you're probably already hurting from CUDA and cuDNN compatibility problems,
and I don't have to motivate you or explain why you'd want standalone CUDA and cuDNN
installations if you're going to develop with TensorFlow in the long term.

### 1. Download your CUDA runfile (local) packages

Check the TensorFlow compatibility matrix for the CUDA and cuDNN versions each release was built against.
Then go to NVIDIA's CUDA Toolkit Archive.
Download the runfile (local) installers you need.
wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
wget https://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.2.2/local_installers/cuda_11.2.2_460.32.03_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.5.1/local_installers/cuda_11.5.1_495.29.05_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.6.0/local_installers/cuda_11.6.0_510.39.01_linux.run

### 2. Install the toolkits only

sudo bash cuda_10.1.243_418.87.00_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-10.1.243 --librarypath=/usr/local/cuda-10.1.243

sudo bash cuda_10.2.89_440.33.01_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-10.2.89 --librarypath=/usr/local/cuda-10.2.89

sudo bash cuda_11.0.3_450.51.06_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-11.0.3 --librarypath=/usr/local/cuda-11.0.3

sudo bash cuda_11.2.2_460.32.03_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-11.2.2 --librarypath=/usr/local/cuda-11.2.2

sudo bash cuda_11.5.1_495.29.05_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-11.5.1 --librarypath=/usr/local/cuda-11.5.1

sudo bash cuda_11.6.0_510.39.01_linux.run --no-man-page --override --silent \
  --toolkit --toolkitpath=/usr/local/cuda-11.6.0 --librarypath=/usr/local/cuda-11.6.0
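
The invocations above differ only in the version string, so they can be generated from the runfile names. A minimal sketch, assuming the runfiles sit in the current directory and follow the `cuda_<version>_<driver>_linux.run` naming used above; the function names `cuda_prefix_for` and `install_toolkits` are made up for this example and are not part of any NVIDIA tooling:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the install prefix for a runfile name, e.g.
# cuda_10.1.243_418.87.00_linux.run -> /usr/local/cuda-10.1.243
cuda_prefix_for() {
    name=$(basename "$1")
    version=${name#cuda_}      # strip the leading "cuda_"
    version=${version%%_*}     # keep everything before the next "_"
    printf '/usr/local/cuda-%s\n' "$version"
}

# Install only the toolkit from every runfile in the current directory,
# using the same flags as the commands above.
install_toolkits() {
    for runfile in cuda_*_linux.run; do
        [ -e "$runfile" ] || continue   # the glob matched nothing
        prefix=$(cuda_prefix_for "$runfile")
        sudo bash "$runfile" --no-man-page --override --silent \
            --toolkit --toolkitpath="$prefix" --librarypath="$prefix"
    done
}
```

Nothing runs until you call `install_toolkits`, so you can check the prefixes with `cuda_prefix_for` first.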

### 3. Download and extract your cuDNN tarballs

wget https://developer.download.nvidia.com/compute/redist/cudnn/v7.6.5/cudnn-10.1-linux-x64-v7.6.5.32.tgz
wget https://developer.download.nvidia.com/compute/redist/cudnn/v7.6.5/cudnn-10.2-linux-x64-v7.6.5.32.tgz
wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.0.5/cudnn-11.0-linux-x64-v8.0.5.39.tgz
wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.1.1/cudnn-11.2-linux-x64-v8.1.1.33.tgz
wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.3.2/local_installers/11.5/cudnn-linux-x86_64-8.3.2.44_cuda11.5-archive.tar.xz
sudo mkdir /usr/local/cudnn-10.1-7.6.5.32
sudo tar -xzf cudnn-10.1-linux-x64-v7.6.5.32.tgz -C /usr/local/cudnn-10.1-7.6.5.32 --strip 1

sudo mkdir /usr/local/cudnn-10.2-7.6.5.32
sudo tar -xzf cudnn-10.2-linux-x64-v7.6.5.32.tgz -C /usr/local/cudnn-10.2-7.6.5.32 --strip 1

sudo mkdir /usr/local/cudnn-11.0-8.0.5.39
sudo tar -xzf cudnn-11.0-linux-x64-v8.0.5.39.tgz -C /usr/local/cudnn-11.0-8.0.5.39 --strip 1

sudo mkdir /usr/local/cudnn-11.2-8.1.1.33
sudo tar -xzf cudnn-11.2-linux-x64-v8.1.1.33.tgz -C /usr/local/cudnn-11.2-8.1.1.33 --strip 1
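
The 8.3.2 download above is a `.tar.xz` archive, which the `-xzf` commands do not cover. Here is a sketch of a generic extraction step, assuming GNU tar (which detects the compression itself when reading with `-xf`) and `xz` installed. `extract_cudnn` and the destination name `/usr/local/cudnn-11.5-8.3.2.44` are my own choices, following the naming pattern above. As far as I know, the newer archives ship their libraries in `lib/` rather than `lib64/`, so check the extracted tree before writing your `LD_LIBRARY_PATH`.

```shell
#!/bin/sh
# Hypothetical helper: unpack a cuDNN archive into its own directory.
# Set SUDO=sudo when the destination is under /usr/local.
#   $1 = archive file (.tgz or .tar.xz)
#   $2 = destination directory
extract_cudnn() {
    archive=$1
    dest=$2
    ${SUDO:-} mkdir -p "$dest" || return 1
    # -xf lets GNU tar auto-detect gzip or xz; --strip-components=1 drops
    # the archive's single top-level directory.
    ${SUDO:-} tar -xf "$archive" -C "$dest" --strip-components=1
}

# Usage (the destination name is an assumption, not from NVIDIA):
#   SUDO=sudo extract_cudnn cudnn-linux-x86_64-8.3.2.44_cuda11.5-archive.tar.xz \
#       /usr/local/cudnn-11.5-8.3.2.44
```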

### 4. Run your application or training with LD_LIBRARY_PATH

Your TensorFlow application loads the TF language binding's .so,
which loads libtensorflow.so, which in turn dynamically loads the various CUDA libraries.
We'll use the LD_LIBRARY_PATH environment variable to tell the dynamic loader
where to look for shared libraries first. That way we force TensorFlow
to load the specific CUDA and cuDNN libraries that are compatible with it.
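
To make the idea concrete, here is a sketch of a helper that assembles the search path from the directories created in steps 2 and 3. `cuda_ld_path` is a hypothetical name, and the `lib64` and `extras/CUPTI/lib64` subdirectories are assumed from the layout that the runfile installers and the `.tgz` cuDNN tarballs above produce:

```shell
#!/bin/sh
# Hypothetical helper: print an LD_LIBRARY_PATH value for one CUDA/cuDNN pair.
#   $1 = CUDA version, e.g. 10.1.243
#   $2 = cuDNN directory suffix, e.g. 10.1-7.6.5.32
cuda_ld_path() {
    cuda=/usr/local/cuda-$1
    cudnn=/usr/local/cudnn-$2
    # Entries are searched left to right, ahead of the default system directories.
    printf '%s:%s:%s\n' "$cuda/lib64" "$cuda/extras/CUPTI/lib64" "$cudnn/lib64"
}

# Usage: the variable is set for this one command only.
#   LD_LIBRARY_PATH=$(cuda_ld_path 10.1.243 10.1-7.6.5.32) python -c 'import tensorflow'
```

Setting the variable per command, rather than exporting it in your shell profile, keeps different projects free to use different CUDA versions side by side.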
In addition, the TensorFlow developers seem to like hardcoding paths and values,
such as the path for the ptxas binary, so you'll encounter this warning:
2020-03-01 13:19:42.121134: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.

Then your program will hang for a good minute or so. So nice.
As @DawyD informs us in his comment to #33375,
the currently accepted solution is to create a symbolic link named bin, pointing at the CUDA version's bin directory,
in the working directory from which you run your TensorFlow application.
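
A sketch of that workaround, assuming the toolkit lives under `/usr/local/cuda-<version>` as installed in step 2; `link_cuda_bin` is a made-up name. It refuses to create the link when the target has no `ptxas`, which catches typos in the version string:

```shell
#!/bin/sh
# Hypothetical helper: point ./bin at a CUDA toolkit's bin directory so that
# the relative lookup of ./bin/ptxas succeeds.
#   $1 = toolkit prefix, e.g. /usr/local/cuda-10.1.243
link_cuda_bin() {
    if [ ! -x "$1/bin/ptxas" ]; then
        echo "no executable ptxas under $1/bin" >&2
        return 1
    fi
    # -s: symbolic link; -n: replace an existing ./bin symlink instead of
    # following it; -f: overwrite without asking.
    ln -snf "$1/bin" bin
}
```

Run it from the directory in which you start your application, for example `link_cuda_bin /usr/local/cuda-10.1.243`.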
Clarification.
You remember the TF compatibility matrix?
If you hit
2020-03-04 18:49:06.955169: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

you need to work around that TensorFlow issue by setting this environment variable:
export TF_FORCE_GPU_ALLOW_GROWTH=true
or alternatively you can modify your Python code with
import tensorflow as tf
# Enable on-demand memory allocation on every GPU; call this before any GPU is initialized.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
TensorFlow 1.15.0 (assumes CUDA 10.0.130 and cuDNN 7.4.2.24 installed the same way as the versions above)

$ ln -snf /usr/local/cuda-10.0.130/bin bin
$ LD_LIBRARY_PATH=/usr/local/cuda-10.0.130/lib64:/usr/local/cudnn-10.0-7.4.2.24/lib64 python -c 'import tensorflow'
TensorFlow 2.0.0

$ ln -snf /usr/local/cuda-10.1.243/bin bin
$ LD_LIBRARY_PATH=/usr/local/cuda-10.1.243/lib64:/usr/local/cuda-10.1.243/extras/CUPTI/lib64:/usr/local/cudnn-10.1-7.6.5.32/lib64 python -c 'import tensorflow'
TensorFlow 2.1.0

$ ln -snf /usr/local/cuda-10.2.89/bin bin
$ LD_LIBRARY_PATH=/usr/local/cuda-10.2.89/lib64:/usr/local/cuda-10.2.89/extras/CUPTI/lib64:/usr/local/cudnn-10.2-7.6.5.32/lib64 python -c 'import tensorflow'
TensorFlow 2.4.1

$ ln -snf /usr/local/cuda-11.0.3/bin bin
$ LD_LIBRARY_PATH=/usr/local/cuda-11.0.3/lib64:/usr/local/cuda-11.0.3/extras/CUPTI/lib64:/usr/local/cudnn-11.0-8.0.5.39/lib64 python -c 'import tensorflow'
TensorFlow 2.5 nightly

$ ln -snf /usr/local/cuda-11.2.2/bin bin
$ LD_LIBRARY_PATH=/usr/local/cuda-11.2.2/lib64:/usr/local/cuda-11.2.2/extras/CUPTI/lib64:/usr/local/cudnn-11.2-8.1.1.33/lib64 python -c 'import tensorflow'