Largely based on the Tensorflow 1.6 gist, this should hopefully simplify things a bit. Mixing homebrew python2/python3 with pip ends up being a mess, so here's an approach to uses the built-in python27.
- NVIDIA Web-Drivers 387.10.10.10.25.156 for 10.13.3
- CUDA-Drivers 387.178
- CUDA 9.1 Toolkit
- cuDNN 7.0.5 (latest release for mac os)
- Python 2.7
- XCode 8.3.2
- bazel 0.10.0
- Tensorflow 1.7
Download and install from http://www.nvidia.com/download/driverResults.aspx/130460/en-us
Download and install from http://www.nvidia.com/object/macosx-cuda-387.178-driver.html
I was able to compile all of it on XCode9, but tensorflow promptly segfaults if you actually try to do anything on the gpu. You may need a developer account to grab the old version https://developer.apple.com/download/more/
If you have newer Xcode installed, rename the XCode.app to something like Xcode9.app Unpack XCode 8.3.2 and switch the tool chain over to it:
sudo xcode-select -s /Applications/Xcode.app
Download the binary here
chmod 755 bazel-0.10.0-installer-darwin-x86_64.sh
./bazel-0.10.0-installer-darwin-x86_64.sh
It should be something along the lines of cuda_9.1.128_mac.dmg
Edit ~/.bash_profile and add the following:
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH:/Developer/NVIDIA/CUDA-9.1/bin
you may have to run source ~/.bash_profile to verify the LD paths are set:
source .bash_profile
echo $LD_LIBRARY
pmalik@MacPro:~$ echo $LD_LIBRARY_PATH
/Users/pmalik/lib:/usr/local/opt/libomp/lib:/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
We want to compile some CUDA sample to check if the GPU is correctly recognized and supported.
cd /Developer/NVIDIA/CUDA-9.1/samples
chown -R YOURUSERNAMEHERE *
make -C 1_Utilities/deviceQuery
./Developer/NVIDIA/CUDA-9.1/samples/bin/x86_64/darwin/release/deviceQuery
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 1060 6GB"
CUDA Driver Version / Runtime Version 9.1 / 9.1
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 6144 MBytes (6442254336 bytes)
(10) Multiprocessors, (128) CUDA Cores/MP: 1280 CUDA Cores
GPU Max Clock rate: 1709 MHz (1.71 GHz)
Memory Clock rate: 4004 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 1572864 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 195 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
If not already done, register at https://developer.nvidia.com/cudnn Download cuDNN 7.0.5
Change into your download directory and follow the post installation steps.
tar -xzvf cudnn-9.1-osx-x64-v7-ga.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn*
Download get-pip and run it in python. More info here
python get-pip.py
If I remeber correctly, pip will automatically install the tensorflow dependencies (wheel, six etc)
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v1.7.0
Apply the following patch to fix a couple build issues:
git apply xtensorflow17macos.patch
Except CUDA support, CUDA SDK version and Cuda compute capabilities, I left the other settings untouched.
./configure
You have bazel 0.10.0 installed.
Please specify the location of python. [Default is /usr/bin/python]:
Found possible Python library paths:
/Library/Python/2.7/site-packages
Please input the desired Python library path to use. Default is [/Library/Python/2.7/site-packages]
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]6.1
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
Configuration finished
Takes about 20 minutes on my machine
bazel build --config=cuda --config=opt --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package ~/
pip install ~/tensorflow-1.7.0-cp27-cp27m-macosx_10_13_intel.whl
It's useful to leave the .whl file lying around in case you want to install it for another environment.
See if everything got linked correctly
>>> import tensorflow as tf
>>> tf.Session()
2018-04-05 23:04:20.457912: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2018-04-05 23:04:20.458122: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:05:00.0
totalMemory: 4.00GiB freeMemory: 2.75GiB
2018-04-05 23:04:20.458143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-05 23:04:20.821699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-05 23:04:20.821728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-05 23:04:20.821736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-05 23:04:20.821856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2467 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
<tensorflow.python.client.session.Session object at 0x10e186990>
pip install keras
git clone https://github.com/fchollet/keras.git
cd keras/examples
python mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2018-04-05 22:38:30.156464: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2018-04-05 22:38:30.156645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:05:00.0
totalMemory: 4.00GiB freeMemory: 2.98GiB
2018-04-05 22:38:30.156672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-05 22:38:30.519346: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-05 22:38:30.519376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-05 22:38:30.519383: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-05 22:38:30.519499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2697 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
2018-04-05 22:38:30.649987: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
2018-04-05 22:38:30.693399: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
2018-04-05 22:38:30.761824: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
59648/60000 [============================>.] - ETA: 0s - loss: 0.2698 - acc: 0.91682018-04-05 22:38:42.071923: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
You can use cuda-smi to watch the GPU memory usages. In case the of the mnist example in keras, you should see the free memory drop down to maybe 2% and the fans spin up. Not quite sure what the grappler/clusters/utils.cc:127 warning is, however.
pmalik@MacPro:~/cuda-smi$ ./cuda-smi
Device 0 [PCIe 0:5:0.0]: GeForce GTX 1050 Ti (CC 6.1): 2901.6 of 4095.8 MB (i.e. 70.8%) Free
pmalik@MacPro:~/cuda-smi$ ./cuda-smi
Device 0 [PCIe 0:5:0.0]: GeForce GTX 1050 Ti (CC 6.1): 2893.1 of 4095.8 MB (i.e. 70.6%) Free
pmalik@MacPro:~/cuda-smi$ ./cuda-smi
Device 0 [PCIe 0:5:0.0]: GeForce GTX 1050 Ti (CC 6.1): 223.86 of 4095.8 MB (i.e. 5.47%) Free
pmalik@MacPro:~/cuda-smi$ ./cuda-smi
Device 0 [PCIe 0:5:0.0]: GeForce GTX 1050 Ti (CC 6.1): 97.852 of 4095.8 MB (i.e. 2.39%) Free
Tested on a 2010 Mac Pro (Mid 2010) 10.13.3 (17D47) 2 x 2.93 GHz 6-Core Intel Xeon and NVIDIA GeForce GTX 1050 Ti 4 GB
If you'd like to build tensorflow with openmp (multi-cpu support), grab the open mp library via homebrew
brew install cliutils/apple/libomp
and uncomment the -lgomp line /third_party/gpus/cuda/BUILD.tpl
Also you can build the binary to your specific cpu architecure, run this to get a list
bazel build --config=cuda --config=opt --copt=-march=native --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package
You can run this command to see what instruction sets are getting built
echo | clang -E - -march=native -###
@ZexuanTHU It would seem you're missing the libcuda env path - it's complaining about not being able to find the cuda library. Try running source ~/.bash_profile and then echo $LD_LIBRARY_PATH
Here's where the libraries live on my box:
You shouldn't need to disable SIP at all; everything builds on my machine with SIP in place.