To check whether TensorFlow can detect any GPU, run:
python -c "import tensorflow as tf; print(tf.test.is_gpu_available(True))"
If a GPU is available it should print True (the True argument restricts the check to CUDA-capable devices).
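Note that tf.test.is_gpu_available is deprecated in recent TensorFlow releases (2.1 and later); the supported equivalent lists the visible devices instead:

```python
import tensorflow as tf

# Lists the GPUs TensorFlow can see; an empty list means no usable GPU
gpus = tf.config.list_physical_devices("GPU")
print(gpus)
print(len(gpus) > 0)  # True when at least one GPU is detected
```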
The easiest way to set up CUDA and tensorflow-gpu is to install everything with Anaconda. Anaconda does not require root permissions and creates a self-contained installation in $HOME/anaconda3.
# Download anaconda
wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh -O anaconda.sh
# Quiet setup, the default installation path is $HOME/anaconda3
bash ./anaconda.sh -b
# Make anaconda's python take precedence over the system one
echo "export PATH=$HOME/anaconda3/bin:\$PATH" >> $HOME/.bashrc
# Configure the shell so conda and its base environment are available
$HOME/anaconda3/bin/conda init
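conda init only edits the shell configuration files, so the change takes effect in a new shell. To verify without logging out (paths assume the default $HOME/anaconda3 install):

```shell
# Reload the configuration written by `conda init` (if present)
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

# Both should resolve to binaries under $HOME/anaconda3
which conda || echo "conda not on PATH yet; open a new shell instead"
which python || which python3 || echo "no python on PATH"
```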
If everything went well, the shell prompt (in a new shell) should look something like this:
(base) user$
To double check, just run python:
(base) user$ python
Python 3.8.3 (default, Jul 2 2020, 16:21:59)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
The header should mention the GCC version and Anaconda, confirming that the Anaconda python is the one being used.
This step is not needed on SLURM clusters, where the NVIDIA drivers are usually installed by the administrators.
If the drivers are available and correctly installed, you should be able to run:
nvidia-smi
Which should print something like:
Tue Aug  4 08:17:14 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:09:00.0  On |                  N/A |
|  0%   40C    P8    26W / 235W |    546MiB /  7979MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1228      G   /usr/lib/Xorg                                262MiB |
|    0      1473      G   /usr/bin/gnome-shell                         133MiB |
|    0      2653      G   ...AAAAAAAAAAAACAAAAAAAAAA= --shared-files    61MiB |
|    0     44439      G   ...uest-channel-token=13355294416564996066    57MiB |
|    0     45602      G   /usr/bin/alacritty                            13MiB |
|    0     50497      G   /usr/bin/alacritty                           13MiB  |
+-----------------------------------------------------------------------------+
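The table above is meant for humans; for scripts, nvidia-smi also has a query mode that prints one CSV line per GPU:

```shell
# Machine-readable query: one CSV line per GPU with name, driver and memory
if command -v nvidia-smi >/dev/null; then
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
else
    echo "nvidia-smi not found: the NVIDIA driver is not installed (or not on PATH)"
fi
```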
Before installing the CUDA toolkit we must install all the dependencies:
sudo apt-get install build-essential freeglut3 freeglut3-dev libxi-dev libxmu-dev linux-headers-$(uname -r)
To install the CUDA 11.0 toolkit on Ubuntu 18.04 run:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget http://developer.download.nvidia.com/compute/cuda/11.0.2/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
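The packages install the toolkit under /usr/local/cuda (a symlink to the versioned directory, e.g. /usr/local/cuda-11.0). To use nvcc from the shell it must be on your PATH; a quick check, assuming the default install location:

```shell
# Make the CUDA compiler and libraries visible to the shell
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:${LD_LIBRARY_PATH:-}"

# Should report the toolkit version, e.g. "Cuda compilation tools, release 11.0"
if command -v nvcc >/dev/null; then
    nvcc --version
else
    echo "nvcc not found: check the toolkit installation"
fi
```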
For other CUDA versions and distributions, follow the instructions on NVIDIA's CUDA downloads page.
First of all, uninstall TensorFlow if it is already installed. If you install tensorflow-gpu on top of a previous CPU-only version, the installation will appear to succeed, but you won't be able to use the GPUs.
pip uninstall tensorflow
Install the CUDA runtime libraries inside the conda environment (this is the toolkit, not the NVIDIA driver, which must already be installed on the system):
conda install cudatoolkit
Then install tensorflow-gpu:
conda install tensorflow-gpu
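Installing into the base environment works, but a dedicated environment keeps the CUDA/cuDNN builds that conda pulls in isolated from the rest of your setup (the environment name tf-gpu below is just an example):

```shell
# Create a fresh environment with tensorflow-gpu; conda resolves
# compatible cudatoolkit and cudnn packages automatically
conda create -y -n tf-gpu tensorflow-gpu

# Switch to it before running any python code
conda activate tf-gpu
```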
If everything went well, this should print True:
python -c "import tensorflow as tf;print(tf.test.is_gpu_available(True))"
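For a check that goes beyond the boolean, run a small computation and inspect where TensorFlow placed it (with a working GPU setup the device string ends in GPU:0; on a CPU-only machine it falls back to the CPU):

```python
import tensorflow as tf

# Physical devices detected at startup
print(tf.config.list_physical_devices("GPU"))

# TensorFlow places ops on the GPU automatically when one is available
a = tf.random.uniform((256, 256))
b = tf.random.uniform((256, 256))
c = tf.matmul(a, b)
print(c.device)  # ends in "GPU:0" when the GPU was used
```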