Build tensorflow on macOS with NVIDIA CUDA support (GPU acceleration)

Software versions

  • TensorFlow 1.5
  • Python 2.7, 3.3-3.6
  • Command Line Tools 8.3.2
  • Bazel 0.7.0
  • CUDA 9.0
  • cuDNN 7.0.4

[TOC]

Requirements

OS X 10.10 (Yosemite) or newer

I tested these instructions on macOS v10.13.6.

Xcode Command-Line Tools

Download the Command Line Tools for Xcode 8.3.2 from the Apple developer site, because CUDA 9.0 doesn't support clang versions >= 9.0.0:

https://download.developer.apple.com/Developer_Tools/Command_Line_Tools_for_Xcode_8.3.2/CommandLineToolsforXcode8.3.2.dmg

Then install it and set it as the default toolchain:

sudo xcode-select --switch /Library/Developer/CommandLineTools

Verify clang version:

$ clang -v
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
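If you want to script this check, here is a small sketch that parses the banner above. It assumes Apple clang's "Apple LLVM version X.Y.Z" banner format; a newer Xcode prints a different banner and would need an adjusted pattern.

```shell
# Extract the major version from clang's banner and warn when it is too
# new for CUDA 9.0. Assumes the "Apple LLVM version X.Y.Z" banner format.
major=$(clang -v 2>&1 | sed -n 's/^Apple LLVM version \([0-9][0-9]*\)\..*/\1/p')
if [ -n "$major" ] && [ "$major" -ge 9 ]; then
  echo "warning: clang $major.x is too new for CUDA 9.0; switch to the 8.3.2 tools"
fi
```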

homebrew

To compile tensorflow on OS X, you need several dependent libraries. The easiest way to get them is to install them with the homebrew package manager.

If you don't already have brew installed, you can install it like this:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

If you don't want to blindly run a ruby script loaded from the internet, they have alternate install options.

coreutils, swig

First, make sure you have brew up to date with the latest available packages:

brew update
brew upgrade

Then install these tools:

brew install coreutils
brew install swig

bazel

Install bazel from the binary installer, following the bazel installation guide. Download bazel-0.7.0-installer-darwin-x86_64.sh:

https://github.com/bazelbuild/bazel/releases/download/0.7.0/bazel-0.7.0-installer-darwin-x86_64.sh

chmod +x bazel-0.7.0-installer-darwin-x86_64.sh
./bazel-0.7.0-installer-darwin-x86_64.sh --user

Check the version to make sure you installed bazel 0.7.0 or greater.

$ bazel version
Build label: 0.7.0
Build target: bazel-out/darwin_x86_64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Oct 18 14:25:46 2017 (1508336746)
Build timestamp: 1508336746
Build timestamp as int: 1508336746
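To guard against an incompatible bazel sneaking in later (newer bazel releases sometimes remove flags that old TensorFlow versions rely on), a hypothetical check sketched against the "Build label" line above:

```shell
# Parse the "Build label" line from bazel's version output and check that
# it is on the 0.7 line used in this guide. Purely a sketch.
label=$(bazel version 2>/dev/null | sed -n 's/^Build label: \([0-9.]*\).*/\1/p')
case "$label" in
  0.7.*) echo "bazel $label looks good" ;;
  *)     echo "unexpected bazel version: ${label:-not found}" ;;
esac
```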

OpenMP library

The clang that ships with Xcode 8 has no built-in OpenMP support, which tensorflow needs at link time (-lgomp). We have to install OpenMP ourselves. First, download it from here:

http://releases.llvm.org/5.0.0/openmp-5.0.0.src.tar.xz

Then uncompress and build with cmake:

$ cd openmp-5.0.0.src
$ mkdir build && cd build && cmake ..
$ make && make install

NVIDIA's CUDA & cuDNN

NVIDIA's CUDA libraries

NVIDIA requires you to sign up and be approved before you can download this.

First, go sign up here:

https://developer.nvidia.com/accelerated-computing-developer

When you sign up, make sure you provide accurate information. A human at NVIDIA will review your application. If it's a business day, hopefully you'll get approved quickly.

Then go here to download CUDA Toolkit 9.0:

https://developer.nvidia.com/cuda-toolkit-archive

Then install it.

Then upgrade the CUDA Driver so that it is compatible with your GPU driver version.

Open "System Preferences" > "CUDA Preference", then click "Install CUDA Update".

For example, my current CUDA and GPU version (2018-08-25):

CUDA Driver Version: 396.148
GPU Driver Version: 387.10.10.10.40.105

NVIDIA's cuDNN library

Then go here to download cuDNN:

https://developer.nvidia.com/rdp/cudnn-archive

On that page, download cuDNN v7.0.4 (Nov 13, 2017), for CUDA 9.0:

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.4/prod/9.0_20171031/cudnn-9.0-osx-x64-v7

Next, you need to manually install it by copying over some files:

tar zxvf ~/Downloads/cudnn-9.0-osx-x64-v7.tgz
sudo cp ./cuda/cudnn.h /usr/local/cuda/include/
sudo cp ./cuda/libcudnn* /usr/local/cuda/lib/

Finally, you need to make sure the library is in your library load path. Edit your ~/.bash_profile file and add these lines at the bottom:

export PATH=/Developer/NVIDIA/CUDA-9.0/bin${PATH:+:${PATH}}
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=$CUDA_HOME/lib${DYLD_LIBRARY_PATH:+:${DYLD_LIBRARY_PATH}}
export DYLD_LIBRARY_PATH=$CUDA_HOME/extras/CUPTI/lib${DYLD_LIBRARY_PATH:+:${DYLD_LIBRARY_PATH}}

After that, close and reopen your terminal window to apply the change.
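To confirm the variables took effect in the new shell, a quick sanity check (assuming the export lines above are now in your ~/.bash_profile):

```shell
# Print CUDA_HOME and check that its lib directory made it onto the
# dynamic library search path.
echo "CUDA_HOME=${CUDA_HOME:-unset}"
case ":${DYLD_LIBRARY_PATH}:" in
  *":${CUDA_HOME:-/usr/local/cuda}/lib:"*)
    echo "CUDA lib dir is on DYLD_LIBRARY_PATH" ;;
  *)
    echo "CUDA lib dir is NOT on DYLD_LIBRARY_PATH" ;;
esac
```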

Verify CUDA installation

There is no nvidia-smi tool on macOS; use cuda-smi as an alternative.

git clone https://github.com/phvu/cuda-smi.git
cd cuda-smi
./compile.sh
mkdir -p $HOME/bin && cp cuda-smi $HOME/bin

Run cuda-smi to verify CUDA driver works:

$ cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GTX 970 (CC 5.2): 365.28 of 4095.7 MB (i.e. 8.92%) Free

Look up your NVIDIA card's Compute Capability on the CUDA website

Before you start, open up System Report in macOS:

Apple Menu > About this Mac > System Report...

In System Report, click on "Graphics/Displays" and find out the exact model NVIDIA card you have:

NVIDIA GeForce GTX 970:

  Chipset Model:	NVIDIA GeForce GTX 970

Then go to https://developer.nvidia.com/cuda-gpus and find that exact model name in the list:

CUDA-Enabled GeForce Products > GeForce GTX 970

There it will list the Compute Capability for your card. For the GeForce GTX 970, it is 5.2. Write this down as it's critical to have this number for the next step.

Prepare tensorflow source code

Download the tensorflow v1.5 branch:

https://github.com/tensorflow/tensorflow/tree/r1.5

Uncompress the downloaded code package and remove ALL __align__(sizeof(T)) qualifiers from the following files:

  • tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
  • tensorflow/core/kernels/split_lib_gpu.cu.cc
  • tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc

For example, extern __shared__ __align__(sizeof(T)) unsigned char smem[]; becomes extern __shared__ unsigned char smem[];.

$ for i in tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc \
           tensorflow/core/kernels/split_lib_gpu.cu.cc \
           tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc; do \
    sed -i '' -e 's/__align__(sizeof(T))//g' $i; done
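As a sanity check of what that substitution does, run it on the example line from above (it drops the qualifier and leaves the rest of the declaration alone; the harmless double space it leaves behind does not matter to the compiler):

```shell
# Demonstrate the substitution on the example line: the __align__
# qualifier is removed, the rest of the declaration is untouched.
line='extern __shared__ __align__(sizeof(T)) unsigned char smem[];'
echo "$line" | sed -e 's/__align__(sizeof(T))//g'
```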

Configure and Build tensorflow

You will first need to configure the tensorflow build options:

./configure

During the config process, it will ask you a bunch of questions. You can use the answers below except make sure to use the Compute Capability for your NVIDIA card you looked up in the previous step:

You have bazel 0.7.0 installed.
Please specify the location of python. [Default is /usr/local/opt/python@2/bin/python2.7]:

Found possible Python library paths:
  /usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
Please input the desired Python library path to use.  Default is [/usr/local/Cellar/python@2/2.7.14_3/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages]

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:

Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]5.2

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

Configuration finished

Now you can actually build and install tensorflow! Enable the --verbose_failures option to trace error messages.

bazel build --config=opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build --verbose_failures --config=opt --config=cuda --action_env PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-1.5.1-cp27-cp27m-macosx_10_13_x86_64.whl

Verify Installation

You need to exit the tensorflow build folder to test your installation.

cd ~

Now, run python and paste in this test script:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))

You should get output that looks something like this:

2018-08-25 21:07:11.405558: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2018-08-25 21:07:11.405662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0
with properties:
name: GeForce GTX 970 major: 5 minor: 2 memoryClockRate(GHz): 1.329
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 1.75GiB
2018-08-25 21:07:11.405678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0, compute
capability: 5.2)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0, compute capability: 5.2
2018-08-25 21:07:11.700406: I tensorflow/core/common_runtime/direct_session.cc:297] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0, compute capability: 5.2
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2018-08-25 21:07:32.102781: I tensorflow/core/common_runtime/placer.cc:874] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-08-25 21:07:32.102795: I tensorflow/core/common_runtime/placer.cc:874] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2018-08-25 21:07:32.102803: I tensorflow/core/common_runtime/placer.cc:874] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]

Yay! Now you can train your models with GPU acceleration!
