@ljaraque
Last active March 12, 2022 10:32
Install tensorflow-gpu in ubuntu

Install tensorflow-gpu1.8 in ubuntu18.04 with CUDA9.2, cuDNN7.2.1 and NVIDIA Driver 396

ljaraque@yahoo.com

Overview

This is a summary of the process I went through to get CUDA 9.2, cuDNN 7.2.1 and TensorFlow 1.8 working on my system with an NVIDIA GeForce GTX 860M GPU. You can skip the steps marked FAILED. I kept them in case they are useful to others who tried those paths too.

FAILED (the next section is successful) Install NVIDIA driver (FAILED: this installs driver 390, which is not compatible with CUDA 9.2):

ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
sudo poweroff
lshw
lspci
nvidia-smi

Install NVIDIA driver 396:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt install nvidia-driver-396
sudo apt install nvidia-settings
sudo reboot
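
After the reboot it is worth confirming that the loaded driver is new enough before installing CUDA. The sketch below assumes the minimum is 396.37, the driver version embedded in the name of the CUDA 9.2 installer file used in the next section. `version_ge` is a hypothetical helper written for this check, not part of any NVIDIA tool.

```shell
# Hypothetical helper: succeeds when dotted version $1 >= $2 (numeric fields).
version_ge() {
    a=$1; b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}
        # Drop the field just read; clear the string once it is exhausted.
        if [ "$a" = "$x" ]; then a=; else a=${a#*.}; fi
        if [ "$b" = "$y" ]; then b=; else b=${b#*.}; fi
        x=${x:-0}; y=${y:-0}
        if [ "$x" -gt "$y" ]; then return 0; fi
        if [ "$x" -lt "$y" ]; then return 1; fi
    done
    return 0
}

# Assumed minimum, taken from the installer name cuda_9.2.148_396.37_linux.
required=396.37

if command -v nvidia-smi >/dev/null 2>&1; then
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$driver" "$required"; then
        echo "driver $driver is sufficient for CUDA 9.2"
    else
        echo "driver $driver is older than $required"
    fi
else
    echo "nvidia-smi not found: the driver is not installed or not loaded"
fi
```

With the 390 driver from the failed attempt this reports that the driver is too old, which matches the incompatibility described above.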

Install CUDA9.2:

Download CUDA from Link, and install it with:
(In this case CUDA9.2)

sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libglfw3-dev libgles2-mesa-dev
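# Adjust the file name to match your download. The installer writes to /usr/local, so run it as root.
sudo chmod +x cuda_9.2.148_396.37_linux\(2\).run
sudo ./cuda_9.2.148_396.37_linux\(2\).run --override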

When the installer warns about an unsupported configuration, choose to continue anyway.

Choose NOT to install the NVIDIA driver, since it is already installed.

If any patches are available, install them with the same procedure.
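
Once the installer finishes, the toolkit version can be confirmed from `nvcc --version`, whose last line has the form `Cuda compilation tools, release 9.2, V9.2.148`. This is a minimal sketch, assuming the default install prefix `/usr/local/cuda-9.2`. `cuda_release` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `nvcc --version` output on stdin and prints
# the release number (for example 9.2).
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

nvcc_bin=/usr/local/cuda-9.2/bin/nvcc   # assumed default install location
if [ -x "$nvcc_bin" ]; then
    echo "installed CUDA release: $("$nvcc_bin" --version | cuda_release)"
else
    echo "nvcc not found at $nvcc_bin"
fi
```

The full path is used because `PATH` is only extended later, in the `~/.bashrc` step.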

Install cuDNN7.2.1:

Download the cuDNN version corresponding to the installed CUDA version from Link:
(In this case cuDNN v7.2.1 (August 7, 2018), for CUDA 9.2)

tar -zxvf cudnn-9.2-linux-x64-v7.2.1.38.tgz
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-9.2/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda-9.2/include/
sudo chmod a+r /usr/local/cuda-9.2/include/cudnn.h /usr/local/cuda-9.2/lib64/libcudnn*
sudo apt-get install libcupti-dev
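
To confirm that the copied header is the expected release, the version macros in `cudnn.h` can be read back. The sketch assumes the cuDNN 7 header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` on separate `#define` lines. `cudnn_version` is a hypothetical helper.

```shell
# Hypothetical helper: prints MAJOR.MINOR.PATCHLEVEL from a cuDNN 7 header.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { ma = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { mi = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { pa = $3 }
         END { if (ma != "") print ma "." mi "." pa }' "$1"
}

header=/usr/local/cuda-9.2/include/cudnn.h
if [ -r "$header" ]; then
    echo "cuDNN header reports version $(cudnn_version "$header")"
else
    echo "cannot read $header"
fi
```

For the archive used here the reported version should be 7.2.1, the value given later to the TensorFlow configure script.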

Add to ~/.bashrc the following:

#for Tensorflow
export PATH=/usr/local/cuda-9.2/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
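
The `${VAR:+:${VAR}}` form expands to a colon followed by the old value only when the variable is already non-empty. Written directly after the new directory, with no extra colon in between, it keeps the list free of empty entries, which the dynamic linker would otherwise treat as the current directory. The sketch below isolates that behaviour. `prepend_path` is a hypothetical helper and `/opt/lib` is just an example value.

```shell
# Hypothetical helper: prints "$1" prepended to the path list "$2",
# adding the colon separator only when "$2" is non-empty.
prepend_path() {
    new=$1; old=$2
    printf '%s\n' "${new}${old:+:${old}}"
}

prepend_path /usr/local/cuda/lib64 ""          # prints /usr/local/cuda/lib64
prepend_path /usr/local/cuda/lib64 /opt/lib    # prints /usr/local/cuda/lib64:/opt/lib
```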

FAILED (the next section is successful) Install Tensorflow-gpu with pip install (FAILED because the prebuilt package requires CUDA 9.0 libraries):

Install Tensorflow-gpu:

pip3 install --upgrade tensorflow-gpu

Test it with:

import tensorflow as tf
hello = tf.constant('Hello, Tensorflow!')
sess = tf.Session()
print(sess.run(hello))
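
The mismatch can be seen by listing which versions of a CUDA library are actually present. This sketch assumes the prebuilt package looks for files ending in `.so.9.0` (for example `libcublas.so.9.0`), as the CUDA 9.0 requirement implies. `lib_versions` is a hypothetical helper.

```shell
# Hypothetical helper: prints the versioned file names of library $2
# found in directory $1 (nothing when there are none).
lib_versions() {
    dir=$1; name=$2
    for f in "$dir"/"$name".so.*; do
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# With only CUDA 9.2 installed, expect names ending in .so.9.2 and none in .so.9.0.
lib_versions /usr/local/cuda-9.2/lib64 libcublas
```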

Building tensorflow-gpu from source

Install bazel 0.13.1 by any method. It might already be installed; otherwise the following can help:

sudo apt-get install openjdk-8-jdk
wget https://github.com/bazelbuild/bazel/releases/download/0.13.1/bazel_0.13.1-linux-x86_64.deb
sudo dpkg -i bazel_0.13.1-linux-x86_64.deb
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
# If any package is broken try:
sudo apt --fix-broken install
# Install again:
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
# in another terminal get nccl_2.2.12-1+cuda9.2_x86_64.txz from NVIDIA and do:
tar -xf nccl_2.2.12-1+cuda9.2_x86_64.txz
cd nccl_2.2.12-1+cuda9.2_x86_64/
# Create the target directory if it does not exist, then copy NCCL into it:
sudo mkdir -p /usr/local/cuda-9.2/targets/x86_64-linux
sudo cp -R * /usr/local/cuda-9.2/targets/x86_64-linux/

# Verify that the following directory exists, otherwise create it:
# /usr/lib/x86_64-linux-gnu
sudo ldconfig
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
git pull
git checkout r1.8
./configure

# Specify the following options:
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: Y
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: Y
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.2
Please specify the location where CUDA 9.2 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-9.2
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.2.1
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/lib/x86_64-linux-gnu
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: 2.2
Please specify the location where NCCL 2 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-9.2]: /usr/local/cuda-9.2/targets/x86_64-linux
# If ubuntu18.04 does not come with /usr/bin/python, you will get an error, so create it as symlink to /usr/bin/python3 or /usr/bin/python3.6 with:
sudo ln -s /usr/bin/python3.6 /usr/bin/python

Then run the configure step again.
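
The CUDA-related answers listed above can also be supplied through environment variables before starting the script, which saves retyping them when the configure step has to be repeated. This is a sketch only: the variable names are assumed to match those read by `configure.py` in the r1.8 branch, so check the script in your checkout before relying on them. Prompts without a matching variable are still asked interactively.

```shell
# Assumed variable names; verify against configure.py in the r1.8 checkout.
# The values repeat the answers given in the interactive session above.
export PYTHON_BIN_PATH=/usr/bin/python3
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.2
export CUDA_TOOLKIT_PATH=/usr/local/cuda-9.2
export TF_CUDNN_VERSION=7.2.1
export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
export TF_NCCL_VERSION=2.2
export NCCL_INSTALL_PATH=/usr/local/cuda-9.2/targets/x86_64-linux

# Run configure only from inside the TensorFlow source tree.
if [ -x ./configure ] && [ -f ./configure.py ]; then
    ./configure
else
    echo "run this from the tensorflow/ checkout"
fi
```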

bazel build --config=opt --config=mkl --config=monolithic --verbose_failures //tensorflow/tools/pip_package:build_pip_package

The build should end with output like the following:

INFO: Elapsed time: 7684.887s, Critical Path: 253.30s
INFO: 12300 processes, local.
INFO: Build completed successfully, 15101 total actions

Create PIP Package:

bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg

A directory tensorflow_pkg will be created.
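
Before installing, it can help to check that the wheel was built for the Python version of the target virtualenv, since pip refuses wheels whose tag does not match. The sketch relies on the standard wheel naming scheme `distribution-version-pythontag-abitag-platformtag.whl`. `wheel_python_tag` is a hypothetical helper, and the exact file name produced on your machine may differ.

```shell
# Hypothetical helper: prints the Python tag of a wheel file name,
# i.e. the third field from the end when split on "-".
wheel_python_tag() {
    name=$(basename "$1" .whl)
    printf '%s\n' "$name" | awk -F- '{ print $(NF-2) }'
}

for whl in tensorflow_pkg/tensorflow*.whl; do
    if [ -e "$whl" ]; then
        echo "$whl targets $(wheel_python_tag "$whl")"
    fi
done

# Tag of the interpreter that will install the wheel, for comparison.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
fi
```

With Python 3.6, as used in the configure step, both lines should show `cp36`.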

Installation of Built PIP Tensorflow-1.8 Package in a Virtualenv:

(Activate Virtualenv here)

cd tensorflow_pkg
pip3 install tensorflow*.whl

We should get something ending with:

...
Installing collected packages: tensorboard, tensorflow
  Found existing installation: tensorboard 1.10.0
    Uninstalling tensorboard-1.10.0:
      Successfully uninstalled tensorboard-1.10.0
Successfully installed tensorboard-1.8.0 tensorflow-1.8.0

Then test the installation with:

import tensorflow as tf
hello = tf.constant('Hello, Tensorflow!')
sess = tf.Session()
print(sess.run(hello))
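
The hello-world test also passes on a CPU-only build, so it does not show that the GPU is being used. The sketch below lists the devices TensorFlow can see. It assumes `device_lib.list_local_devices()` from `tensorflow.python.client` is available in this build. `list_tf_devices` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: prints the type and name of every device TensorFlow
# can see, or a short message when TensorFlow cannot be imported.
list_tf_devices() {
    if python3 -c 'import tensorflow' >/dev/null 2>&1; then
        python3 - <<'EOF'
from tensorflow.python.client import device_lib

for dev in device_lib.list_local_devices():
    print(dev.device_type, dev.name)
EOF
    else
        echo "tensorflow is not importable in this environment"
    fi
}

list_tf_devices
```

Run it inside the activated virtualenv. A working GPU build should list a device of type GPU in addition to the CPU.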

Note: If you get errors saying that the driver version is insufficient, it means there is an incompatibility between the CUDA/Tensorflow versions and the NVIDIA driver.

References:

Reference_1: https://medium.com/@taylordenouden/installing-tensorflow-gpu-on-ubuntu-18-04-89a142325138
Reference_2: https://hk.saowen.com/a/a9cc5b7c90a6f350850d8554c018f7415981fc8d470b481c90afd7573f5e12cd
Reference_3: http://machinelearninguru.com/deep_learning/tensorflow/installation/install_from_the_source.html
Reference_4: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1710&target_type=runfilelocal
Reference_5: https://medium.com/@asmello/how-to-install-tensorflow-cuda-9-1-into-ubuntu-18-04-b645e769f01d
Reference_6: Compute Capability of NVIDIA GPUs
Reference_7: Install NVIDIA Driver 396 from PPA (Luis Alvarado Answer)
Reference_8: Compatibility CUDA vs NVIDIA Driver Version
