- Install tensorflow:
sudo apt install tensorflow-cuda-latest
- Install cuda:
sudo apt install system76-cuda-latest
- Download latest miniconda installer from https://docs.conda.io/en/latest/miniconda.html
- Install it:
bash Miniconda3-latest-Linux-x86_64.sh
- Create conda environment with tensorflow-gpu support:
  - v2:
    conda create --name ENV_NAME_HERE tensorflow-gpu
  - v1:
    conda create --name ENV_NAME_HERE tensorflow-gpu==1.15
- Activate new environment:
conda activate ENV_NAME_HERE
(Source: tensorflow/tensorflow#35860 (comment))
- Add the following line to /etc/modprobe.d/nvidia-kernel-common.conf:
options nvidia "NVreg_RestrictProfilingToAdminUsers=0"
- Reboot
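Running the edit twice would leave a duplicate line in the config file. Here is a minimal sketch of an idempotent append. `ensure_line` is a hypothetical helper name, not part of any NVIDIA or system tooling, and it assumes the target file ends with a newline (or does not exist yet).

```shell
#!/bin/sh
# ensure_line (hypothetical helper): append LINE to FILE only if FILE
# does not already contain exactly that line.
ensure_line() {
    line=$1
    file=$2
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (run as root for the real file):
# ensure_line 'options nvidia "NVreg_RestrictProfilingToAdminUsers=0"' \
#     /etc/modprobe.d/nvidia-kernel-common.conf
```

The reboot is still needed afterwards so that the nvidia kernel module is reloaded with the new option.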
For some reason TensorFlow 2 looks for ptxas at a relative path instead of searching the system PATH. The workaround is to create a symlink to it inside the project directory:
mkdir -p bin
ln -s "$(which ptxas)" bin/
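If ptxas is not on the PATH, the commands above fail with an unhelpful error or create a broken link. The following sketch does the same thing with an explicit check. `link_tool` is a hypothetical helper name, and the default target directory `bin` simply mirrors the workaround above.

```shell
#!/bin/sh
# link_tool (hypothetical helper): symlink an executable found on PATH
# into a local directory, and fail loudly if it cannot be found.
link_tool() {
    tool=$1
    dest=${2:-bin}
    tool_path=$(command -v "$tool") || {
        echo "error: $tool not found on PATH" >&2
        return 1
    }
    mkdir -p "$dest"
    # -s: symbolic link, -f: replace an existing link, -n: do not follow it
    ln -sfn "$tool_path" "$dest/$tool"
}

# Example, run from the project directory:
# link_tool ptxas bin
```

Because of `-f`, rerunning the helper after a CUDA upgrade replaces a stale link instead of failing.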
(Sources: https://dmitry.ai/t/topic/50/2 and https://stackoverflow.com/questions/44232898/memoryerror-in-tensorflow-and-successful-numa-node-read-from-sysfs-had-negativ/44233285#44233285)
Add to /etc/crontab:
@reboot root for a in /sys/bus/pci/devices/*; do echo 0 | tee -a $a/numa_node; done > /dev/null
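The crontab entry writes 0 to the numa_node file of every PCI device at boot. For readability, the same idea can be sketched as a standalone function. This is a variation, not the original one-liner: `fix_numa_nodes` is a hypothetical name, the root directory is a parameter so the function can be tried on a scratch directory, and it only touches files that currently report -1.

```shell
#!/bin/sh
# fix_numa_nodes (hypothetical helper): for each device directory under
# ROOT, write 0 to its numa_node file if the file currently contains -1.
fix_numa_nodes() {
    root=${1:-/sys/bus/pci/devices}
    for dev in "$root"/*; do
        node_file=$dev/numa_node
        # Skip entries that have no numa_node file
        [ -f "$node_file" ] || continue
        if [ "$(cat "$node_file")" = "-1" ]; then
            echo 0 > "$node_file"
        fi
    done
}

# Example (as root, against the real sysfs tree):
# fix_numa_nodes
```

The values under /sys are reset on every boot, which is why the original fix is scheduled with @reboot rather than run once.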