Build TensorFlow 1.3 with SSE4.1/SSE4.2/AVX/AVX2/FMA and NVIDIA CUDA support on macOS Sierra 10.12

Build TensorFlow 1.8 with SSE4.1/SSE4.2/AVX/AVX2/FMA on Linux

These instructions were inspired by Mistobaan's gist, ageitgey's gist, and mattiasarro's tutorial.

Background

I always encountered the following warnings when running my scripts using the precompiled TensorFlow Python package:

I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

I realized I could make these warnings go away, and improve training speed as well, by compiling TensorFlow from source. It was not as easy or straightforward as I expected, but I eventually produced a working build. Here I outline the steps I took, in the hope that they help others who have run into similar problems.
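
Before rebuilding, it can help to confirm which of the instruction sets named in the warning the CPU actually reports. The following is a minimal sketch for Linux, assuming an x86 machine whose /proc/cpuinfo contains a "flags" line. The function name supported_extensions and the mapping table are illustrative choices, not part of TensorFlow.

```python
import os

# Names as they appear in the TensorFlow warning, paired with the
# flag names Linux reports in /proc/cpuinfo on x86 machines.
WARNING_TO_CPUINFO = [
    ("SSE4.1", "sse4_1"),
    ("SSE4.2", "sse4_2"),
    ("AVX", "avx"),
    ("AVX2", "avx2"),
    ("FMA", "fma"),
]


def supported_extensions(cpuinfo_text):
    """Return the extensions from the table above that the CPU reports."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # The line looks like "flags\t\t: fpu vme ... avx2 ..."
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
            break  # cores normally report identical flags, so one line is enough
    return [name for name, flag in WARNING_TO_CPUINFO if flag in flags]


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            print(supported_extensions(f.read()))
    else:
        print("No /proc/cpuinfo on this system")
```

Any extension missing from the result should not be enabled at build time, since a binary compiled for an instruction the CPU lacks will crash when it reaches that instruction.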

Machine setup

Hardware

  • Model: Dell XPS
  • Processor: ...
  • Memory: ...
  • Graphics: ...

Software

  • OS: Ubuntu Linux 16.04
  • TensorFlow version: 1.8.0rc1
  • Python version: 3.5.2
  • Bazel version: 0.12.0
  • CUDA/cuDNN version: ...

Prerequisites

_

Steps

Note: Many steps were based on https://www.tensorflow.org/install/install_sources

  • Verify that the following packages are installed:
    • six
    • numpy (version 1.13 or later; with an older version, bazel build later fails with ModuleNotFoundError: No module named 'numpy.lib.mixins')
    • wheel
  • Clone the TensorFlow repository (instructions) and be sure to check out the r1.8 release branch
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow
    git checkout r1.8
  • Configure the installation
    bazel clean
    ./configure
    My configure settings (enter N for CUDA support if you do not want it or do not have an NVIDIA GPU):
    Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
    Found possible Python library paths:  
      /usr/local/lib/python3.5/dist-packages
      /usr/lib/python3/dist-packages
    Please input the desired Python library path to use.  Default is [/usr/local/lib/python3.5/dist-packages]
    
    Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: 
    jemalloc as malloc support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n    
    No Google Cloud Platform support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
    No Hadoop File System support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
    No Amazon S3 File System support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
    No Apache Kafka Platform support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with XLA JIT support? [y/N]: 
    No XLA JIT support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with GDR support? [y/N]: 
    No GDR support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with VERBS support? [y/N]: 
    No VERBS support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 
    No OpenCL SYCL support will be enabled for TensorFlow.
    
    Do you wish to build TensorFlow with CUDA support? [y/N]: 
    No CUDA support will be enabled for TensorFlow.
    
    Do you wish to download a fresh release of clang? (Experimental) [y/N]: 
    Clang will not be downloaded.
    
    Do you wish to build TensorFlow with MPI support? [y/N]: 
    No MPI support will be enabled for TensorFlow.
    
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
    
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: 
    Not configuring the WORKSPACE for Android builds.
    
    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
        --config=mkl           # Build with MKL support.
        --config=monolithic    # Config for mostly static monolithic build.
    Configuration finished
    
    
  • Build the pip package (reference: https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions). Expect this step to take a long time.
    bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 --verbose_failures -k //tensorflow/tools/pip_package:build_pip_packa
  • Refer to tensorflow/tensorflow#6729 if you run into any other problems
  • Build the wheel (.whl) file
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
  • Install the pip package
    pip3 install --upgrade --ignore-installed /tmp/tensorflow_pkg/tensorflow-1.8.0rc1-cp35-cp35m-linux_x86_64.whl
  • Validate your installation (instructions)
    • Change directory to any directory on your system other than the tensorflow subdirectory from which you ran ./configure
      cd ~
    • Start the Python 3 interactive shell (the same interpreter the package was installed for with pip3)
      python3
    • Type in the following script
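      import tensorflow as tf

      # This build was configured without CUDA, so pin the ops to the CPU.
      # Use '/gpu:0' instead only if you built with CUDA support.
      with tf.device('/cpu:0'):
          a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
          b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
          c = tf.matmul(a, b)

      # The product is the 2x2 matrix with rows (22, 28) and (49, 64).
      with tf.Session() as sess:
          print(sess.run(c))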
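
The package check in the first step can also be scripted. The following is a rough sketch, assuming each package exposes a `__version__` attribute (six, numpy, and wheel do). The helper names version_tuple and check_package are hypothetical, and only the NumPy lower bound comes from this guide.

```python
import importlib

# Packages to check; a minimum of None means any version is accepted.
# Only the NumPy bound (1.13) comes from the guide.
REQUIRED = [("six", None), ("numpy", (1, 13)), ("wheel", None)]


def version_tuple(version):
    """Turn '1.13.3' or '1.14.0rc1' into leading integers, e.g. (1, 13, 3)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading number
        parts.append(int(digits))
    return tuple(parts)


def check_package(name, minimum=None):
    """Return (ok, message) describing whether `name` is usable."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, "{} is not installed".format(name)
    version = getattr(module, "__version__", "unknown")
    if minimum is not None and version_tuple(version) < minimum:
        return False, "{} {} is older than required".format(name, version)
    return True, "{} {}".format(name, version)


if __name__ == "__main__":
    for name, minimum in REQUIRED:
        ok, message = check_package(name, minimum)
        print(("OK   " if ok else "FAIL ") + message)
```

Run it with the same interpreter you gave to ./configure (here /usr/bin/python3), because that is the environment the build uses.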

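The wheel file name in the install step depends on the TensorFlow version and the Python ABI tag, so it will differ on other setups. The following is a small sketch for finding the wheel without typing the full name, assuming the output directory /tmp/tensorflow_pkg used above. The helper name newest_wheel is hypothetical.

```python
import glob
import os


def newest_wheel(directory, package="tensorflow"):
    """Return the most recently modified wheel for `package`, or None."""
    pattern = os.path.join(directory, package + "-*.whl")
    wheels = glob.glob(pattern)
    if not wheels:
        return None
    # If several builds are present, pick the latest one by modification time.
    return max(wheels, key=os.path.getmtime)


if __name__ == "__main__":
    wheel = newest_wheel("/tmp/tensorflow_pkg")
    if wheel is None:
        print("No TensorFlow wheel found; run build_pip_package first.")
    else:
        # Print the command rather than running it, so it can be reviewed.
        print("pip3 install --upgrade --ignore-installed " + wheel)
```

Printing the command instead of executing it keeps the sketch free of side effects; paste the output into the shell once it looks right.
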
Have fun training your models!
