Installing Tensorflow on CENTOS 6.8 Cluster without Root Access


OS: CENTOS 6.8 (No root access)

GCC: locally installed 5.2.0 (Cluster default is 4.4.7)

Bazel: 0.4.0-2016-11-06 (@fa407e5)

Tensorflow: v0.11.0rc2

CUDA: 8.0

CUDNN: 5.1.5


You should be able to modify the script below to do these steps automatically, but I list the details here as well.

Installing Java Locally:

Follow this Tutorial, or download your preferred version of JDK 8 and set the proper environment variables as described in the tutorial.

Compiling Bazel, Compiling and Installing Tensorflow:

Great Tutorial that got me to the error below!

Note: After changing the linker line to your local or module GCC, if you get errors about finding ld or other executables that are stored in /usr/bin, here is the workaround I used (it isn't pretty and you might not need it, but just in case):

  1. Copy your compiler directory (/opt/gcc/5.2.0) to a local directory that you have permissions to modify.

  2. Then run:

cp `which ld` /opt/gcc/5.2.0/bin/ld (repeat for any command listed in the crosstools that doesn't already reside in your gcc /bin directory)
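The copy step above can be repeated for several tools with a small loop. This is only a sketch: the helper name copy_missing_tools and the example tool list (ld, as, nm) are assumptions, so check your own CROSSTOOL.tpl for the tools it actually references.

```shell
#!/usr/bin/env bash
# Copy each named tool into a GCC bin directory unless one is already there.
# copy_missing_tools is a hypothetical helper, not part of bazel or tensorflow.
copy_missing_tools() {
  local gcc_bin="$1"; shift
  local tool src
  for tool in "$@"; do
    if [ -e "$gcc_bin/$tool" ]; then
      echo "skip $tool (already in $gcc_bin)"
      continue
    fi
    # Resolve the tool on PATH; report and move on if it is not installed.
    src=$(command -v "$tool") || { echo "missing $tool on PATH" >&2; continue; }
    cp "$src" "$gcc_bin/$tool" && echo "copied $src"
  done
}

# Example (assumed tool list; adjust to your CROSSTOOL.tpl):
# copy_missing_tools /opt/gcc/5.2.0/bin ld as nm
```

Skipping tools that already exist keeps the compiler's own binaries from being overwritten by the system copies.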

Note2: I downloaded newer releases of bazel and tensorflow, as noted above, and the latest versions of the crosstool require fewer changes than described in the tutorial.

  1. modify /tensorflow/third_party/gpus/crosstool/CROSSTOOL.tpl as described in tutorial above

  2. modify /tensorflow/third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl as described in tutorial above. I did not change the first line: #!/usr/bin/env python (but the tutorial does!)

Again these steps led to the below error which took me forever to get past:

GLIBCXX_3.4.18 not found error
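This error generally means that the libstdc++ picked up at run time does not export that version tag. A quick way to see which tags a given library provides is to search it for GLIBCXX strings. The sketch below is not from the original write-up: list_glibcxx is a hypothetical helper, and the example library path is an assumption about where a local GCC keeps its libstdc++.

```shell
#!/usr/bin/env bash
# Print the GLIBCXX version tags embedded in a libstdc++ shared object.
# list_glibcxx is a hypothetical helper name.
list_glibcxx() {
  # -a: treat the binary as text; -o: print only the matching part
  grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u
}

# Example (path is an assumption; point it at your local GCC's libstdc++):
# list_glibcxx /opt/gcc/5.2.0/lib64/libstdc++.so.6 | grep -Fx 'GLIBCXX_3.4.18'
```

If the tag shows up in your local GCC's library but not in the system one, the build is most likely running with an environment that does not point at the local GCC.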

Getting Past GLIBCXX_3.4.18 Error:

As described in gbkedar's comment from Jul 12, you have to find the protobuf.bzl file.

But until the compile fails, this file is harder to find. (The comment re-runs the compile after modifying the file following the first failure.) The failure creates the bazel-tensorflow shortcut in the /tensorflow directory. I was running into issues re-attempting the compile and had to run ./configure almost every time. Therefore, I had to find this file before the first failure of my compile attempt. The file should be located somewhere similar to this after running ./configure from the /tensorflow directory:

~/.cache/bazel/_bazel_YOURUSERNAME/YOURHASH(i.e. f81f1107f96c7515450fc43e0dbb6ed5)/external/protobuf/protobuf.bzl

If you have several hashes, check the files that were modified at the time corresponding to your ./configure run.
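When there are several hashes, listing the candidates newest-first can save some digging. A minimal sketch, assuming GNU find (for -printf) and the default cache location shown above; find_protobuf_bzl is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List candidate protobuf.bzl files under a bazel cache root, newest first.
# find_protobuf_bzl is a hypothetical helper; -printf requires GNU find.
find_protobuf_bzl() {
  local root="${1:-$HOME/.cache/bazel}"
  # Prefix each path with its modification time, sort descending, drop the time.
  find "$root" -path '*/external/protobuf/protobuf.bzl' -printf '%T@ %p\n' \
    | sort -rn | cut -d' ' -f2-
}

# Example: the first line is the most recently modified candidate.
# find_protobuf_bzl | head -n 1
```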

As described in the error link above, search for ctx.action and add env=ctx.configuration.default_shell_env, at the bottom of the call like so:

  if args:
    ctx.action(
        arguments=args + import_flags + [s.path for s in srcs],
        ...
        mnemonic="ProtoCompile",
        env=ctx.configuration.default_shell_env,
    )
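The same edit can be applied from the shell. The sketch below reuses the sed substitution from the install script further down, wrapped so that it keeps a backup and does nothing if the file is already patched. patch_protobuf_bzl is a hypothetical helper name, and it assumes GNU sed (for -i) and that the ctx.action call contains the line mnemonic="ProtoCompile", as it did in the version I built.

```shell
#!/usr/bin/env bash
# Add env=ctx.configuration.default_shell_env after the ProtoCompile mnemonic,
# keeping a backup and doing nothing if the argument is already present.
# patch_protobuf_bzl is a hypothetical helper name.
patch_protobuf_bzl() {
  local file
  # Resolve symlinks first so sed -i edits the real file, not the link.
  file=$(readlink -f -- "$1") || return 1
  if grep -q 'env=ctx.configuration.default_shell_env' "$file"; then
    echo "already patched: $file"
    return 0
  fi
  cp "$file" "$file.orig"
  sed -i 's/mnemonic="ProtoCompile",/mnemonic="ProtoCompile", env=ctx.configuration.default_shell_env,/' "$file"
}

# Example (fill in your own user name and hash):
# patch_protobuf_bzl ~/.cache/bazel/_bazel_YOURUSERNAME/YOURHASH/external/protobuf/protobuf.bzl
```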

You will then likely hit this error: error trying to exec 'as': execvp: No such file or directory. Since I am a self-confessed Linux noob, I reused the one trick I know (I didn't follow gbkedar's 2nd comment):

cp `which as` /opt/gcc/5.2.0/bin/as

After this change, tensorflow finally compiled successfully for me!

Building .whl file:

Going back to our tutorial I ran this command:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

and received the bdist_wheel not found error... I solved this by using pip install to install a new version of wheel locally:

pip install --target=/home/thpaul/python27-packages wheel

and then added that directory to my $PYTHONPATH variable:

export PYTHONPATH=/home/thpaul/python27-packages/:$PYTHONPATH

Re-running the command builds the proper .whl file which you can install via pip.
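If you re-source your profile often, PYTHONPATH can pick up the same directory many times. A small sketch that prepends a directory only when it is missing; prepend_pythonpath is a hypothetical helper name, not something from the tutorial.

```shell
#!/usr/bin/env bash
# Prepend a directory to PYTHONPATH unless it is already listed.
# prepend_pythonpath is a hypothetical helper name.
prepend_pythonpath() {
  local dir="${1%/}"   # drop a trailing slash so comparisons are consistent
  case ":${PYTHONPATH:-}:" in
    *":$dir:"*) ;;     # already present, nothing to do
    *) export PYTHONPATH="$dir${PYTHONPATH:+:$PYTHONPATH}" ;;
  esac
}

# Example (directory taken from the commands above):
# prepend_pythonpath /home/thpaul/python27-packages
# python -c 'import wheel'   # should exit without an ImportError
```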

Hope this helps anyone trying to compile tensorflow from source!

#installs TF and all required dependencies except CUDNN* without root!
#*Requires signing up for account to download! (Pretty easy, but do this first!)
#Original Environment: CENTOS 6.8, non-standard GCC = 5.2.0
#To note, I copied every binary (ld, as, etc..) required by BAZEL (see tensorflow CROSSTOOL.tpl)
#into my GCC_DIR!
#TODO: There are a couple TODO's listed that will be system specific!
# Ensure we can load CUDA drivers.
module load cuda/8.0 || { echo 'Failed to load CUDA drivers. Are you not on a compute node?' ; exit 1; }
#TODO: GCC_DIR/LOCAL_INCLUDE/LOCAL_LIBRARY if not standard system gcc (which gcc)
BAZEL_BIN_DIR=/work/thpaul/bin #/bin where to copy bazel binary
JAVA_DIR=jdk1.8.0_102 #Directory your jdk.tar file extracts to (depends on which version you DL)
JAVA_FILE=jdk-8u102-linux-x64 #Update Java version in DOWNLOADS too...
BAZEL_VERSION=0.4.0 #TAG from github, don't use if latest release
TF_VERSION=v0.11.0rc2 #TAG from github, don't use if latest release
wget --no-check-certificate -O setuptools-1.4.2.tar.gz
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie"
wget #TODO: update if newer version needed (3.8.6)
echo "Building Directories"
mkdir -p $STARTDIR
cd ..
# Set tmp directory to userspace
mkdir -p tmp
cd tmp
cd ..
#Unzip archives
echo "Decompressing archives"
tar zxvf ../Python-$PYTHON_VERSION.tgz
tar --totals -xvf ../setuptools-1.4.2.tar.gz
tar --totals -xvf ../$JAVA_FILE.tar.gz
tar --totals -xvf ../sqlite-autoconf-3150100.tar.gz
cd sqlite-autoconf-3150100
echo "Installing sqlite3 libs at `pwd`!"
./configure --enable-shared --prefix=$SQLITE_INSTALL_DIR
make install
cp ./include/* $LOCAL_INCLUDE
cp ./lib/* $LOCAL_LIBRARY
cd ..
echo "Installing python at $PYTHON_INSTALL_DIR"
#TODO: Have to change to look in local include file for sqlite3 libraries
sed -i 's#/usr/local/include/sqlite3#'$LOCAL_INCLUDE'#g' ./
./configure --enable-shared --prefix=$PYTHON_INSTALL_DIR --enable-loadable-sqlite-extensions #TODO: need sqlite3 for nltk and others
make altinstall
cd ..
echo "----- Installing Pip"
cd setuptools-1.4.2
$PYTHON_INSTALL_DIR/bin/python2.7 install
curl | $PYTHON_INSTALL_DIR/bin/python2.7 -
pip install --no-cache-dir numpy
pip install -U nltk
cd ..
echo "Installing JAVA at `pwd`"
#Save JAVA variables:
export JAVA_JRE=$JAVA_INSTALL_DIR/jdk1.8.0_102/jre
export PATH=$PATH:$JAVA_INSTALL_DIR/jdk1.8.0_102/bin:$JAVA_INSTALL_DIR/jdk1.8.0_102/jre/bin
cd ..
echo "Compiling bazel in `pwd`/bazel"
git clone
git checkout $BAZEL_VERSION #TODO: Format Specific to your git version, only need if not using bazel latest-release
cd bazel
wait #TODO: Seems to want to compile twice???
cp ./output/bazel $BAZEL_BIN_DIR/bazel
cd ..
echo "Compiling tensorflow in `pwd`/tensorflow"
git clone
git checkout $TF_VERSION #TODO: Format Specific to your git version if not latest-release
cd tensorflow
# TODO: Adjust the configure file only if .cache is on an NFS and clean fails:
cp configure configure_orig #just in case
sed -i 's/bazel clean --expunge/bazel clean --expunge_async/g' configure
#Modify tensorflow CROSSTOOL.tpl file:
cp $TF_INSTALL_DIR/third_party/gpus/crosstool/CROSSTOOL.tpl ./third_party/gpus/crosstool/CROSSTOOL_ORIG.tpl
sed -i 's#/usr/bin#'$GCC_DIR'/bin#g' $TF_INSTALL_DIR/third_party/gpus/crosstool/CROSSTOOL.tpl
#Modify tensorflow crosstool_wrapper_driver_is_not_gcc.tpl file
cp $TF_INSTALL_DIR/third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl \
sed -i 's#/usr/bin/gcc/#'$GCC_DIR'/bin/gcc#g' \
#Can't just use the basic call from install directions:
bazel build -c opt --config=cuda --genrule_strategy=standalone --spawn_strategy=standalone //tensorflow/tools/pip_package:build_pip_package
#TODO: When/If fails with GLIBCXX... error, run this afterwards:
PROTOFILE=$(readlink -f -- "$TF_INSTALL_DIR/bazel-tensorflow/external/protobuf/protobuf.bzl")
cp $TF_INSTALL_DIR/bazel-tensorflow/external/protobuf/protobuf.bzl $TF_INSTALL_DIR/bazel-tensorflow/external/protobuf/ORIG_protobuf.bzl
sed -i 's/mnemonic="ProtoCompile",/mnemonic="ProtoCompile", env=ctx.configuration.default_shell_env,/g' \
bazel build -c opt --config=cuda --genrule_strategy=standalone --spawn_strategy=standalone //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package $TMPDIR/tensorflow_pkg
#Get name of the created whl file:
for filename in $TMPDIR/tensorflow_pkg/*; do
  export TF_WHEEL_FILE=$filename
done
#Finally install TF!
pip install $TF_WHEEL_FILE
echo "====================CAVEATS=============================="
echo "Don't forget to update necessary Environment Variables in .bash_profile!"
echo 'echo "export PATH='$JAVA_INSTALL_DIR'/bin:'$JAVA_INSTALL_DIR'/jre/bin:$PATH" >> ~/.bash_profile'
echo 'echo "export PATH='$PYTHON_INSTALL_DIR'/bin:'$BAZEL_BIN_DIR':$PATH" >> ~/.bash_profile'
echo 'echo "export JAVA_HOME='$JAVA_INSTALL_DIR'" >> ~/.bash_profile'
echo 'echo "export JAVA_JRE='$JAVA_INSTALL_DIR'/jre" >> ~/.bash_profile'

Worked for me with minor tweaks on Scientific Linux 6.6 -- thank you!

i3v commented Dec 1, 2016

This answer describes another possible approach to fixing the problem with "as" - to hardlink "as","ld", and "nm" when building gcc. There's also a link to a related issue on TF github, where you can find a link to an issue on bazel github. For now - it is still open, so maybe they would fix it sometime.

mrdivine commented Feb 9, 2017

I have been fighting with this for the past two days. Looking through your script, I begin to recount the hardships I have endured. Quick question: would this work for CUDA 7.5? The reason I ask is because that's what's installed on the cluster already, and I can't seem to install updated drivers, i.e. CUDA 8, without root privileges. Any suggestions?


taylorpaul commented Mar 2, 2017 edited

@mrdivine. Sorry for taking so long to get back! I asked the maintainers of my cluster to install cuda 8.0. They were willing since we already had 7.5 installed. That being said, I am pretty sure this should work with older versions of cuda. You just have to provide the location of your cuda library anytime the tensorflow install asks for it. And be sure to install CUDNN locally following the link at the top of the script.
