Build tensorflow on OSX with NVIDIA CUDA support (GPU acceleration)

These instructions are based on Mistobaan's gist but expanded and updated to work with the latest tensorflow OSX CUDA PR.

Requirements

OS X 10.10 (Yosemite) or newer

I tested these instructions on OS X v10.10.5. They will probably work on OS X v10.11 (El Capitan), too.

Xcode Command-Line Tools

These instructions assume you have Xcode installed and your machine is already set up to compile C/C++ code.

If not, simply type gcc into a terminal and it will prompt you to download and install the Xcode Command-Line Tools.

homebrew

To compile tensorflow on OS X, you need several dependent libraries. The easiest way to get them is to install them with the homebrew package manager.

If you don't already have brew installed, you can install it like this:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

If you don't want to blindly run a ruby script loaded from the internet, they have alternate install options.

coreutils, swig, bazel

First, make sure you have brew up to date with the latest available packages:

brew update
brew upgrade

Then install these tools:

brew install coreutils
brew install swig
brew install bazel

Check the version to make sure you installed bazel 0.1.4 or greater. bazel 0.1.3 or below will fail when building tensorflow.

$ bazel version

Build label: 0.1.4-homebrew
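If you want to script this check, a small helper (hypothetical, not part of the gist) can parse the Build label and enforce the 0.1.4 minimum:

```python
def bazel_version_ok(build_label, minimum=(0, 1, 4)):
    """Parse a Homebrew-style label like '0.1.4-homebrew' and
    compare it numerically against the minimum version tuple."""
    core = build_label.split("-")[0]           # drop the '-homebrew' suffix
    parts = tuple(int(p) for p in core.split("."))
    return parts >= minimum

print(bazel_version_ok("0.1.4-homebrew"))  # True
print(bazel_version_ok("0.1.3"))           # False
```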

NVIDIA's CUDA libraries

Also installed from brew:

brew cask install cuda

Check the version to make sure you installed CUDA 7.5. Older versions will fail.

$ brew cask info cuda

cuda: 7.5.20
Nvidia CUDA

NVIDIA's cuDNN library

NVIDIA requires you to sign up and be approved before you can download this.

First, go sign up here:

https://developer.nvidia.com/accelerated-computing-developer

When you sign up, make sure you provide accurate information. A human at NVIDIA will review your application. If it's a business day, hopefully you'll get approved quickly.

Then go here to download cuDNN:

https://developer.nvidia.com/cudnn

Click 'Download' to fill out their survey and agree to their Terms. Finally, you'll see the download options.

However, you'll only see download options for cuDNN v4 and cuDNN v3. You'll want to scroll to the very bottom and click "Archived cuDNN Releases".

This will take you to this page where you can download cuDNN v2:

https://developer.nvidia.com/rdp/cudnn-archive

On that page, download "cuDNN v2 Library for OSX".

Next, you need to manually install it by copying over some files:

tar zxvf ~/Downloads/cudnn-6.5-osx-v2.tar.gz
sudo cp ./cudnn-6.5-osx-v2/cudnn.h /usr/local/cuda/include/
sudo cp ./cudnn-6.5-osx-v2/libcudnn* /usr/local/cuda/lib/

Finally, you need to make sure the library is in your library load path. Edit your ~/.bash_profile file and add this line at the bottom:

export DYLD_LIBRARY_PATH="/usr/local/cuda/lib":$DYLD_LIBRARY_PATH

After that, close and reopen your terminal window to apply the change.
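To double-check the copy step, a short Python sketch (assuming the default /usr/local/cuda destination used above) can confirm the header and libraries landed where the build expects them:

```python
import os

def cudnn_installed(cuda_root="/usr/local/cuda"):
    """Return True if the cuDNN header and at least one libcudnn*
    library are present under the given CUDA root."""
    header = os.path.join(cuda_root, "include", "cudnn.h")
    lib_dir = os.path.join(cuda_root, "lib")
    libs = []
    if os.path.isdir(lib_dir):
        libs = [f for f in os.listdir(lib_dir) if f.startswith("libcudnn")]
    return os.path.isfile(header) and bool(libs)

if __name__ == "__main__":
    print("cuDNN in place:", cudnn_installed())
```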

Checkout tensorflow

Since OS X CUDA support is still an unmerged pull request (#664), you need to check out that specific branch:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
git fetch origin pull/664/head:cuda_osx
git checkout cuda_osx

Look up your NVIDIA card's Compute Capability on the CUDA website

Before you start, open up System Report in OSX:

Apple Menu > About this Mac > System Report...

In System Report, click on "Graphics/Displays" and find out the exact model NVIDIA card you have:

NVIDIA GeForce GT 650M:

  Chipset Model:	NVIDIA GeForce GT 650M

Then go to https://developer.nvidia.com/cuda-gpus and find that exact model name in the list:

 CUDA-Enabled GeForce Products > GeForce GT 650M

There it will list the Compute Capability for your card. For the GeForce GT 650M used in 2012 Retina MacBook Pros, it is 3.0. Write this number down; you'll need it in the next step.
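If you are scripting your setup, you could keep a small lookup table. Only the GT 650M entry below comes from this guide; the GT 750M value is illustrative, so verify any card against https://developer.nvidia.com/cuda-gpus.

```python
# Illustrative lookup for a few NVIDIA chipsets found in Macs of this era.
COMPUTE_CAPABILITY = {
    "NVIDIA GeForce GT 650M": "3.0",
    "NVIDIA GeForce GT 750M": "3.0",  # assumed; verify on nvidia.com
}

def capability_for(chipset_model):
    """Return the compute capability string, or None if unknown."""
    return COMPUTE_CAPABILITY.get(chipset_model)

print(capability_for("NVIDIA GeForce GT 650M"))  # 3.0
```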

Configure and Build tensorflow

You will first need to configure the tensorflow build options:

TF_UNOFFICIAL_SETTING=1 ./configure

During the config process, it will ask you a bunch of questions. You can use the answers below except make sure to use the Compute Capability for your NVIDIA card you looked up in the previous step:

WARNING: You are configuring unofficial settings in TensorFlow. Because some external libraries are not backward compatible, these settings are largely untested and unsupported.

Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify the Cuda SDK version you want to use. [Default is 7.0]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Default is 6.5]:
Please specify the location where cuDNN 6.5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Setting up Cuda include
Setting up Cuda lib
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished

Now you can actually build and install tensorflow!

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-py2-none-any.whl

Verify Installation

You need to exit the tensorflow build folder to test your installation.

cd ~

Now, run python and paste in this test script:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))  # parentheses work under both Python 2 and 3

You should get output that looks something like this:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.6.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.7.5.dylib locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] OS X does not support NUMA - returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GT 650M
major: 3 minor: 0 memoryClockRate (GHz) 0.9
pciBusID 0000:01:00.0
Total memory: 1023.69MiB
Free memory: 452.21MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:705] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.00MiB
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
I tensorflow/core/common_runtime/direct_session.cc:142] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
b: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:73] Allocating 252.21MiB bytes.
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:83] GPU 0 memory begins at 0x700a80000 extends to 0x7106b6000

[[ 22.  28.]
 [ 49.  64.]]
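As a sanity check on those numbers, the same product can be reproduced with a plain-Python matrix multiply of the constants from the test script:

```python
def matmul(a, b):
    """Naive matrix multiply over nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape [3, 2]
print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```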

Yay! Now you can train your models using a GPU!

If you are using a Retina MacBook Pro with only a 1GB GeForce GT 650M, you will probably run into out-of-memory errors with medium to large models. But at least it will make small-scale experimentation faster.
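One way to ease the pressure on a 1GB card is to cap how much GPU memory TensorFlow grabs up front. This is a configuration sketch, assuming the per_process_gpu_memory_fraction option exists in your build (it was present in early TensorFlow releases); the 0.6 fraction is an arbitrary starting point to tune, and running it requires the TensorFlow you just built plus a CUDA GPU:

```python
import tensorflow as tf

# Ask TensorFlow to claim only ~60% of GPU memory instead of nearly all of it.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.6)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options,
                                        log_device_placement=True))
```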

esd100 commented Apr 24, 2016

Awesome. It worked for me!

I used python 3.5.1 for my installation and ran into a couple hiccups along the way, but nothing that wasn't fixable.

In particular, I ran into issues with permissions and versionings of my python software.
So, I ran the following:
sudo -H pip3 freeze --local | sudo -H grep -v '^\-e' | sudo -H cut -d = -f 1 | xargs sudo -H pip3 install -U
sudo -H pip freeze --local | sudo -H grep -v '^\-e' | sudo -H cut -d = -f 1 | xargs sudo -H pip install -U
sudo -H pip3 install --upgrade pip
sudo -H pip install --upgrade pip

And finally,
sudo -H pip3 install /private/tmp/tensorflow_pkg/tensorflow-0.8.0rc0-py3-none-any.whl

Then, in the little test script, I needed to use
print(sess.run(c))
otherwise, Python 3 gives an error because of the difference in how it treats the print function.
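For reference, the usual way to write a script that runs under both interpreters is the __future__ import, which makes print a function in Python 2 as well:

```python
from __future__ import print_function

# Under Python 2 this import disables the print statement, so the
# parenthesized form below behaves identically on 2.x and 3.x.
print("hello from both Pythons")
```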

Hi Adam,

I am also trying to install tensorflow on Mac OS X (version 10.11.5) with NVIDIA GPU support. I am following the instructions given on this page and received the following errors.
When I run the python code I get the following output:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:102] Couldn't open CUDA library libcuda.1.dylib. LD_LIBRARY_PATH: :/usr/local/cuda/lib
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: sfocfjgn32.ads.autodesk.com
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
Segmentation fault: 11

Hi esd, since I am using python 3.5.1, I used your suggestions. I tried running:

sudo -H pip3 freeze --local | sudo -H grep -v '^-e' | sudo -H cut -d = -f 1 | xargs sudo -H pip3 install -U

I received the following error:
Collecting conda-build
Using cached conda-build-1.20.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "/private/tmp/pip-build-7tya1d46/conda-build/setup.py", line 10, in
import versioneer
ImportError: No module named 'versioneer'

Can you suggest something that I might be doing wrong?

Thanks in advance!!

Hi, many thanks for your tutorial! Are you using the cuDNN v2 library on purpose? I tried to install tensorflow with GPU support using cuDNN v5 but I could not make it work :s ...

bakercp commented Jul 25, 2016

Thanks for this. It worked great with a few tweaks.

My setup:

  • OSX 10.11.5, MacBook Pro (15-inch, Mid 2012), NVIDIA GeForce GT 650M 512 MB
  • I'm using python Python 2.7.11 :: Anaconda 2.4.1 (x86_64),
  • brew, Homebrew 0.9.9 (git revision 1455a; last commit 2016-07-24)
  • [bazel release 0.3.0-homebrew], etc. (just ran brew update, brew upgrade)
  • cuda: 7.5.27, Nvidia CUDA (from brew)

A few additions:

  • During the Checkout Tensorflow section, even after calling git clone --recurse-submodules https://github.com/tensorflow/tensorflow I had to manually call git submodule update --init to bring in the submodules.
  • To install the package I had to call sudo -H pip install /private/tmp/tensorflow_pkg/tensorflow-0.8.0rc0-py2-none-any.whl (note the version difference).

slcott commented Aug 5, 2016

I had to run this command:

cp -r bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/__main__/* bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/

before these two:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-py2-none-any.whl

@bakercp
I am using the same Mac as you, but when I run the tensorflow examples I run out of GPU memory. Have you encountered the same problem?

I have a 13-inch MacBook Pro which shows "Intel Iris" under Graphics/Displays -> Chipset Model. Can I also use tensorflow with GPU support? If so, how? Please help.

keevee09 commented Oct 9, 2016

@KSanthanam
No. You need an NVIDIA graphics card. Your GPU is from Intel.

kadkaz commented Oct 10, 2016

@KSanthanam
Your Mac may have two cards; look at About this Mac -> System Report -> Graphics/Displays.
The NVIDIA driver can usually tell you as well; otherwise it will not install.

For OS X v10.11 and 10.12, it will fail to link the NVIDIA libs, since DYLD_LIBRARY_PATH is disabled for protection reasons (System Integrity Protection).

Yay! I tried to build tensorflow for mac+gpu in September and had all sorts of issues. This list seems to work.
The only thing I had to do is:

git submodule update --init

before I could do 'bazel build'. The problem is mentioned here: tensorflow/tensorflow#1069

actually, I see an error trying to do bazel build:

tensorflow/stream_executor/cuda/cuda_dnn.cc:662:7: error: expected body of lambda expression
SHARED_LOCKS_REQUIRED(dnn_handle_mutex_) {
^

for those of you who succeeded:

  1. what version of CUDA did you use?
  2. what version of cuDNN did you use?
  3. what version of xcode did you use?

djwbrown commented Jan 6, 2017

@ady477 I had a similar issue. Check out this part of the setup documentation.
https://www.tensorflow.org/get_started/os_setup#mac_os_x_segmentation_fault_when_import_tensorflow

Couldn't open CUDA library libcuda.1.dylib
...
Segmentation fault: 11

I think you can fix this with the following symlink:
ln -sf /usr/local/cuda/lib/libcuda.dylib /usr/local/cuda/lib/libcuda.1.dylib

While following this tutorial, when I run the command
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

it gives me the following error:
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:59:16: in includes attribute of cc_library rule //google/protobuf:protobuf_lite: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:124:16: in includes attribute of cc_library rule //google/protobuf:protobuf: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:266:16: in includes attribute of cc_library rule //google/protobuf:protoc_lib: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
INFO: Found 1 target...
INFO: From Executing genrule //third_party/gpus/cuda:cuda_config_check [for host]:
/bin/bash: greadlink: command not found
INFO: From Executing genrule //third_party/gpus/cuda:cuda_config_check:
/bin/bash: greadlink: command not found
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: declared output 'third_party/gpus/cuda/cuda.config' was not created by genrule. This is probably because the genrule actually didn't create this output, or because the output was a directory and the genrule was run remotely (note that only the contents of declared file outputs are copied from genrules run remotely).
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: declared output 'third_party/gpus/cuda/cuda.config' was not created by genrule. This is probably because the genrule actually didn't create this output, or because the output was a directory and the genrule was run remotely (note that only the contents of declared file outputs are copied from genrules run remotely).
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: not all outputs were created.
Target //tensorflow/cc:tutorials_example_trainer failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 22.253s, Critical Path: 0.25s

Can someone suggest what I am doing wrong here? Thanks in advance!

Has anyone successfully compiled r1.2 with CUDA support on macOS?

jeff3dx commented Jul 1, 2017

Here is a new setup guide published recently on Medium. (February 2017) https://medium.com/@fabmilo/how-to-compile-tensorflow-with-cuda-support-on-osx-fd27108e27e1

  • Found this link while examining the pull request referenced by the old guide (on this page)
  • During the configuration process use all defaults (do NOT select clang compiler for instance)
  • The instructions specify a command-line switch to change to Xcode 7.2. This is necessary for the build; to set it back later, run: sudo xcode-select -r

had problems with cask. This solved it:

brew tap caskroom/drivers

Created a setup guide for TensorFlow r1.3 with SSE4.1/SSE4.2/AVX/AVX2/FMA instructions and CUDA support on macOS Sierra 10.12
https://gist.github.com/philster/042fabcf73f2269e6b1e6d28fbeeb7e3

Please let me know if it works/not works for you. Thanks!

brightbytes-dude commented Sep 3, 2017

I followed directions exactly up to this point on El Capitan 10.11.6, updating and upgrading brew, but I get this trying to install cuda:

$ brew cask install cuda
Error: Cask 'cuda' is unavailable: No Cask with this name exists. Did you mean one of:
cuda-z
Error: Install incomplete.

Going to try cuda-z, but this seems off ...

*** Update ***

LOL, I have an AMD Radeon card on my machine. Tensorflow doesn't support it. Nevermind. Sorry for the noise. :-(

My slightly more detailed version for TensorFlow 1.3.0 and fixes for all the errors I've encountered can be found here.

doctorpangloss commented Sep 18, 2017

Following @metakermit 's instructions, I built a Python 2.7 wheel of TF 1.3 for Mac:

pip install https://www.dropbox.com/s/18cy4fqrneovnuo/tensorflow-1.3.0-cp27-cp27m-macosx_10_12_intel.whl?dl=1

This was built against the system numpy because it uses Accelerate.framework for BLAS.

Tested with a Maxwell Titan X in an AKITIO Node enclosure. I built it with support for the 650M, 750M, Kepler, Maxwell and Pascal external GPUs.

ultraN commented Sep 22, 2017

@doctorpangloss Can you please build one for python 3.6.2? I'm having trouble building it myself.

@doctorpangloss Could you please refresh the link to your wheel? Thanks in advance!

tslater commented Oct 17, 2017

@ultraN, I have one here for Python 3.6.2/MacOS v10.13, I'm not sure if it will work for MacOS v10.12.

pip3 install https://github.com/tslater/tensorflow-gpu-support-pythonv3.6.2-macosv10.13/raw/master/tensorflow-1.3.1-cp36-cp36m-macosx_10_13_x86_64.whl

jasilberman commented Nov 2, 2017

@tslater, we appreciate the work.
Any chance we can get a build for Tensorflow 1.4?

Also, did you test your install with CUDA 9? That is, using the latest NVIDIA video driver and CUDA install?

Thank you!
-Jack

@jasilberman
I uploaded a TF 1.4 build for Python 3.6 using CUDA 9 and cuDNN 7 on macOS 10.13.1 to my repo.
Check it out if it works for you:
https://github.com/norman-thomas/tensorflow-gpu-mac

jasilberman commented Nov 17, 2017

@norman-thomas

Thanks for the attention. Any chance you saved it with the incorrect file name?

It shows Mac OS 10.9.
I tried to install it on 10.13.1, no luck.

tensorflow-1.4.0-cp36-cp36m-macosx_10_9_x86_64
https://github.com/norman-thomas/tensorflow-gpu-mac/blob/master/tensorflow-1.4.0-cp36-cp36m-macosx_10_9_x86_64.whl

Thanks,
-Jack

ihenry commented Nov 27, 2017

Great guide. A couple of modifications were required for me too. Previously I had the error

Error: Cask 'cuda' is unavailable: No Cask with this name exists. Did you mean “cuda-z”?

brew tap caskroom/drivers
brew cask install nvidia-cuda

@mishaaskerka The link is fixed.
