Skip to content

Instantly share code, notes, and snippets.

What would you like to do?
Installing TensorFlow on EC2
# Note – this is not a bash script (some of the steps require reboot)
# I named it .sh just so Github does correct syntax highlighting.
# This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
# The CUDA part is mostly based on this excellent blog post:
# Install various packages
sudo apt-get update
sudo apt-get upgrade -y # choose “install package maintainers version”
sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-jdk zip zlib1g-dev
# Blacklist Noveau which has some kind of conflict with the nvidia driver
echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot # Reboot (annoying you have to do this in 2015!)
# Some other annoying thing we have to do
sudo apt-get install -y linux-image-extra-virtual
sudo reboot # Not sure why this is needed
# Install latest Linux headers
sudo apt-get install -y linux-source linux-headers-`uname -r`
# Install CUDA 7.0 (note – don't use any other version)
chmod +x
./ -extract=`pwd`/nvidia_installers
cd nvidia_installers
sudo ./
sudo modprobe nvidia
sudo ./
# Install CUDNN 6.5 (note – don't use any other version)
# YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE – it's not available online.
# You need to register and get approved to get a download link. Very annoying.
tar -xzf cudnn-6.5-linux-x64-v2.tgz
sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/
# At this point the root mount is getting a bit full
# I had a lot of issues where the disk would fill up and then Bazel would end up in this weird state complaining about random things
# Make sure you don't run out of disk space when building Tensorflow!
sudo mkdir /mnt/tmp
sudo chmod 777 /mnt/tmp
sudo rm -rf /tmp
sudo ln -s /mnt/tmp /tmp
# Note that /mnt is not saved when building an AMI, so don't put anything crucial on it
# Install Bazel
cd /mnt/tmp
git clone
cd bazel
git checkout tags/0.1.0
sudo cp output/bazel /usr/bin
# Install TensorFlow
cd /mnt/tmp
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
git clone --recurse-submodules
cd tensorflow
# Patch to support older K520 devices on AWS
# wget ""
# git apply cuda_30.patch
# According to this patch is no longer needed
# Instead, you need to run ./configure like below (not tested yet)
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
# Build Python package
# Note: you have to specify --config=cuda here - this is not mentioned in the official docs
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
# Test it!
cd tensorflow/models/image/cifar10/
# On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
# On a g2.8xlarge: step 100, loss = 4.49 (337.9 examples/sec; 0.379 sec/batch)
# doesn't seem like it is able to use the 4 GPU cards unfortunately :(
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.