@ageitgey
Last active September 11, 2023 13:08

Build tensorflow on OSX with NVIDIA CUDA support (GPU acceleration)

These instructions are based on Mistobaan's gist but expanded and updated to work with the latest tensorflow OSX CUDA PR.

Requirements

OS X 10.10 (Yosemite) or newer

I tested these instructions on OS X v10.10.5. They will probably work on OS X v10.11 (El Capitan), too.

Xcode Command-Line Tools

These instructions assume you have Xcode installed and your machine is already set up to compile C/C++ code.

If not, simply type gcc into a terminal and it will prompt you to download and install the Xcode Command-Line Tools.

homebrew

To compile tensorflow on OS X, you need several dependent libraries. The easiest way to get them is to install them with the homebrew package manager.

If you don't already have brew installed, you can install it like this:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

If you don't want to blindly run a ruby script loaded from the internet, they have alternate install options.

coreutils, swig, bazel

First, make sure you have brew up to date with the latest available packages:

brew update
brew upgrade

Then install these tools:

brew install coreutils
brew install swig
brew install bazel

Check the version to make sure you installed bazel 0.1.4 or greater. bazel 0.1.3 or below will fail when building tensorflow.

$ bazel version

Build label: 0.1.4-homebrew
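
If you want to script that check, here is a minimal Python sketch (not part of the original instructions). It parses the "Build label:" line shown above and compares the version as a tuple of integers, because a plain string comparison would sort "0.15" before "0.4". The function names and the MIN_BAZEL constant are made up for this example; they are not part of bazel or tensorflow.

```python
import re

MIN_BAZEL = (0, 1, 4)  # minimum version named in this guide


def parse_build_label(output):
    """Return the version from `bazel version` output as a tuple of ints, or None."""
    match = re.search(r"Build label:\s*(\d+(?:\.\d+)*)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def bazel_is_new_enough(output, minimum=MIN_BAZEL):
    """True if the parsed version is at least `minimum`."""
    version = parse_build_label(output)
    return version is not None and version >= minimum


# Sample text copied from the output shown above.
sample = "Build label: 0.1.4-homebrew"
print(parse_build_label(sample))    # (0, 1, 4)
print(bazel_is_new_enough(sample))  # True
```

To use it on a real install, pass in the text printed by bazel version instead of the sample string.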

NVIDIA's CUDA libraries

Also installed from brew:

brew cask install cuda

Check the version to make sure you installed CUDA 7.5. Older versions will fail.

$ brew cask info cuda

cuda: 7.5.20
Nvidia CUDA

NVIDIA's cuDNN library

NVIDIA requires you to sign up and be approved before you can download this.

First, go sign up here:

https://developer.nvidia.com/accelerated-computing-developer

When you sign up, make sure you provide accurate information. A human at NVIDIA will review your application. If it's a business day, hopefully you'll get approved quickly.

Then go here to download cuDNN:

https://developer.nvidia.com/cudnn

Click 'Download' to fill out their survey and agree to their Terms. Finally, you'll see the download options.

However, you'll only see download options for cuDNN v4 and cuDNN v3. You'll want to scroll to the very bottom and click "Archived cuDNN Releases".

This will take you to this page where you can download cuDNN v2:

https://developer.nvidia.com/rdp/cudnn-archive

On that page, download "cuDNN v2 Library for OSX".

Next, you need to manually install it by copying over some files:

tar zxvf ~/Downloads/cudnn-6.5-osx-v2.tar.gz
sudo cp ./cudnn-6.5-osx-v2/cudnn.h /usr/local/cuda/include/
sudo cp ./cudnn-6.5-osx-v2/libcudnn* /usr/local/cuda/lib/
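
To double-check that both copies landed where the build expects them, something like the following Python sketch may help. It only looks for the two things copied above (cudnn.h under include/ and any libcudnn* file under lib/). The helper name missing_cudnn_files is hypothetical, and the default path assumes CUDA lives in /usr/local/cuda as in this guide.

```python
import glob
import os


def missing_cudnn_files(cuda_root="/usr/local/cuda"):
    """Return a list of cuDNN pieces that are not present under cuda_root."""
    missing = []
    # The header copied by the first `cp` command.
    if not os.path.isfile(os.path.join(cuda_root, "include", "cudnn.h")):
        missing.append("include/cudnn.h")
    # Any library file copied by the second `cp` command.
    if not glob.glob(os.path.join(cuda_root, "lib", "libcudnn*")):
        missing.append("lib/libcudnn*")
    return missing


# An empty list means both pieces were found.
print(missing_cudnn_files())
```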

Finally, you need to make sure the library is in your library load path. Edit your ~/.bash_profile file and add this line at the bottom:

export DYLD_LIBRARY_PATH="/usr/local/cuda/lib":$DYLD_LIBRARY_PATH

After that, close and reopen your terminal window to apply the change.
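
Once the terminal is reopened, you can confirm the variable actually contains the CUDA directory. This is a small Python sketch under the assumption that the path is /usr/local/cuda/lib as above; path_contains is a made-up helper name. Note that a comment further down this page reports DYLD_LIBRARY_PATH being disabled on OS X 10.11 and 10.12, so a False result there may not mean your profile is wrong.

```python
import os


def path_contains(path_value, directory):
    """True if directory is one of the colon-separated entries in path_value."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in path_value.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


# Reads the variable from the environment this script was started in.
current = os.environ.get("DYLD_LIBRARY_PATH", "")
print(path_contains(current, "/usr/local/cuda/lib"))
```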

Checkout tensorflow

Since OS X CUDA support is still an unmerged pull request (#664), you need to check out that specific branch:

git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
git fetch origin pull/664/head:cuda_osx
git checkout cuda_osx

Look up your NVIDIA card's Compute Capability on the CUDA website

Before you start, open up System Report in OSX:

Apple Menu > About this Mac > System Report...

In System Report, click on "Graphics/Displays" and find out the exact model NVIDIA card you have:

NVIDIA GeForce GT 650M:

  Chipset Model:	NVIDIA GeForce GT 650M

Then go to https://developer.nvidia.com/cuda-gpus and find that exact model name in the list:

 CUDA-Enabled GeForce Products > GeForce GT 650M

There it will list the Compute Capability for your card. For the GeForce GT 650M used in mid-2012 Retina MacBook Pros, it is 3.0. Write this number down; you will need it in the next step.

Configure and Build tensorflow

You will first need to configure the tensorflow build options:

TF_UNOFFICIAL_SETTING=1 ./configure

During the config process, it will ask you a series of questions. You can use the answers below, but make sure to enter the Compute Capability you looked up for your NVIDIA card in the previous step:

WARNING: You are configuring unofficial settings in TensorFlow. Because some external libraries are not backward compatible, these settings are largely untested and unsupported.

Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify the Cuda SDK version you want to use. [Default is 7.0]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Default is 6.5]:
Please specify the location where cuDNN 6.5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Setting up Cuda include
Setting up Cuda lib
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished
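
The compute capability answer is just a comma-separated list of major.minor values. If you ever build for more than one card, a small helper like this hypothetical one (it is not part of tensorflow's configure script) can format the list and drop duplicates, which matters because the prompt warns that each extra entry adds build time and binary size.

```python
def capabilities_answer(capabilities):
    """Format (major, minor) pairs as the comma-separated answer for ./configure."""
    unique = sorted(set(capabilities))  # drop duplicates, keep a stable order
    return ",".join("{}.{}".format(major, minor) for major, minor in unique)


# The GeForce GT 650M from this guide has compute capability 3.0.
print(capabilities_answer([(3, 0)]))  # 3.0
```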

Now you can actually build and install tensorflow!

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-0.6.0-py2-none-any.whl
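
The wheel file name depends on the tensorflow version and Python you built with, so the exact name in the last command may differ on your machine. As a sketch, this Python snippet lists whatever tensorflow wheels the build step left in /tmp/tensorflow_pkg and prints the matching pip command. The find_wheels name is made up for this example.

```python
import glob
import os


def find_wheels(package_dir="/tmp/tensorflow_pkg"):
    """Return the tensorflow wheel files in package_dir, sorted by name."""
    pattern = os.path.join(package_dir, "tensorflow-*.whl")
    return sorted(glob.glob(pattern))


# Prints one install command per wheel found (nothing if the folder is empty).
for wheel in find_wheels():
    print("pip install " + wheel)
```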

Verify Installation

You need to exit the tensorflow build folder to test your installation.

cd ~

Now, run python and paste in this test script:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))

You should get output that looks something like this:

I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.6.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.7.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.dylib locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.7.5.dylib locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] OS X does not support NUMA - returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GT 650M
major: 3 minor: 0 memoryClockRate (GHz) 0.9
pciBusID 0000:01:00.0
Total memory: 1023.69MiB
Free memory: 452.21MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:705] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 512.0KiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 1.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 2.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 4.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 8.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 16.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 32.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 64.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 128.00MiB
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:51] Creating bin of max chunk size 256.00MiB
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
I tensorflow/core/common_runtime/direct_session.cc:142] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GT 650M, pci bus id: 0000:01:00.0
b: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:304] MatMul: /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:73] Allocating 252.21MiB bytes.
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:83] GPU 0 memory begins at 0x700a80000 extends to 0x7106b6000

[[ 22.  28.]
 [ 49.  64.]]

Yay! Now you can train your models using a GPU!

If you are using a Retina Macbook Pro with only a 1GB GeForce 650M, you will probably run into Out of Memory errors with medium to large models. But at least it will make small-scale experimentation faster.
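
If you save the log output from the test script, the device placement lines can be checked automatically instead of by eye. The following is a rough Python sketch that assumes the placement lines look exactly like the ones printed above ("MatMul: /job:.../gpu:0"). The names op_placements and all_on_gpu are invented for this example and are not tensorflow APIs.

```python
import re

# Matches lines such as "MatMul: /job:localhost/replica:0/task:0/gpu:0".
PLACEMENT = re.compile(r"^(\w+): (/job:\S+)$", re.MULTILINE)


def op_placements(log_text):
    """Map each op name to the device string reported for it."""
    return dict(PLACEMENT.findall(log_text))


def all_on_gpu(log_text):
    """True if at least one op was found and every op landed on a GPU device."""
    placements = op_placements(log_text)
    return bool(placements) and all("/gpu:" in dev for dev in placements.values())


# Placement lines copied from the sample output above.
sample = """\
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
"""
print(all_on_gpu(sample))  # True
```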

@keevee09

keevee09 commented Oct 9, 2016

@KSanthanam
No. You need an NVIDIA graphics card. Your GPU is from Intel.

@kadkaz

kadkaz commented Oct 10, 2016

@KSanthanam
Your Mac may have two cards. Check under "About This Mac" -> System Report -> Graphics/Displays.
Usually the NVIDIA driver can tell you as well; otherwise it will not install.

@gangliao

On OS X v10.11 and 10.12, linking against the NVIDIA libraries will fail because DYLD_LIBRARY_PATH is disabled for security reasons.

@asimonov

yay! I tried to build tensorflow for mac+gpu in september and had all sorts of issues. this list seems to work.
the only thing I had to do is:

git submodule update --init

before I could do 'bazel build'. The problem is mentioned here: tensorflow/tensorflow#1069

@asimonov

actually, I see an error trying to do bazel build:

tensorflow/stream_executor/cuda/cuda_dnn.cc:662:7: error: expected body of lambda expression
SHARED_LOCKS_REQUIRED(dnn_handle_mutex_) {
^

for those of you who succeeded:

  1. what version of CUDA did you use?
  2. what version of cuDNN did you use?
  3. what version of xcode did you use?

@djwbrown

djwbrown commented Jan 6, 2017

@ady477 I had a similar issue. Check out this part of the setup documentation.
https://www.tensorflow.org/get_started/os_setup#mac_os_x_segmentation_fault_when_import_tensorflow

Couldn't open CUDA library libcuda.1.dylib
...
Segmentation fault: 11

I think you can fix this with the following symlink:
ln -sf /usr/local/cuda/lib/libcuda.dylib /usr/local/cuda/lib/libcuda.1.dylib

@ansu1234

While following this tutorial, when I run the command
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

it gives me the following error:
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:59:16: in includes attribute of cc_library rule //google/protobuf:protobuf_lite: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:124:16: in includes attribute of cc_library rule //google/protobuf:protobuf: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
WARNING: /Users/asuma2/tensorflow/google/protobuf/BUILD:266:16: in includes attribute of cc_library rule //google/protobuf:protoc_lib: 'src/' resolves to 'google/protobuf/src' not in 'third_party'. This will be an error in the future.
INFO: Found 1 target...
INFO: From Executing genrule //third_party/gpus/cuda:cuda_config_check [for host]:
/bin/bash: greadlink: command not found
INFO: From Executing genrule //third_party/gpus/cuda:cuda_config_check:
/bin/bash: greadlink: command not found
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: declared output 'third_party/gpus/cuda/cuda.config' was not created by genrule. This is probably because the genrule actually didn't create this output, or because the output was a directory and the genrule was run remotely (note that only the contents of declared file outputs are copied from genrules run remotely).
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: declared output 'third_party/gpus/cuda/cuda.config' was not created by genrule. This is probably because the genrule actually didn't create this output, or because the output was a directory and the genrule was run remotely (note that only the contents of declared file outputs are copied from genrules run remotely).
ERROR: /Users/asuma2/tensorflow/third_party/gpus/cuda/BUILD:196:1: not all outputs were created.
Target //tensorflow/cc:tutorials_example_trainer failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 22.253s, Critical Path: 0.25s

Can someone suggest what I am doing wrong here? Thanks in advance!!!

@andrewrech

Has anyone successfully compiled r1.2 with CUDA support on macOS?

@jeff3dx

jeff3dx commented Jul 1, 2017

Here is a new setup guide published recently on Medium. (February 2017) https://medium.com/@fabmilo/how-to-compile-tensorflow-with-cuda-support-on-osx-fd27108e27e1

  • Found this link while examining the pull request referenced by the old guide (on this page)
  • During the configuration process use all defaults (do NOT select clang compiler for instance)
  • The instructions specify a command line switch to change to XCode 7.2. This is necessary for the build, but to set it back later do this: sudo xcode-select -r

@minusplusminus

had problems with cask. This solved it:

brew tap caskroom/drivers

@philster

Created a setup guide for TensorFlow r1.3 with SSE4.1/SSE4.2/AVX/AVX2/FMA instructions and CUDA support on macOS Sierra 10.12
https://gist.github.com/philster/042fabcf73f2269e6b1e6d28fbeeb7e3

Please let me know whether it works for you. Thanks!

@brightbytes-dude

brightbytes-dude commented Sep 3, 2017

I followed directions exactly up to this point on El Capitan 10.11.6, updating and upgrading brew, but I get this trying to install cuda:

$ brew cask install cuda
Error: Cask 'cuda' is unavailable: No Cask with this name exists. Did you mean one of:
cuda-z
Error: Install incomplete.

Going to try cuda-z, but this seems off ...

*** Update ***

LOL, I have an AMD Radeon card on my machine. Tensorflow doesn't support it. Nevermind. Sorry for the noise. :-(

@metakermit

My slightly more detailed version for TensorFlow 1.3.0 and fixes for all the errors I've encountered can be found here.

@doctorpangloss

doctorpangloss commented Sep 18, 2017

Following @metakermit 's instructions, I built a Python 2.7 wheel of TF 1.3 for Mac:

pip install https://www.dropbox.com/s/18cy4fqrneovnuo/tensorflow-1.3.0-cp27-cp27m-macosx_10_12_intel.whl?dl=1

This was built against the system numpy because it uses Accelerate.framework for BLAS.

Tested with a Maxwell Titan X in an AKITIO Node enclosure. I built it with support for the 650M, 750M, Kepler, Maxwell and Pascal external GPUs.

@AnonymousArthur

@doctorpangloss Can you please build one for python 3.6.2? I am having trouble building it.

@mishaaskerka

@doctorpangloss Could you please refresh the link to your wheel? Thanks in advance!

@tslater

tslater commented Oct 17, 2017

@ultran, I have one here for Python 3.6.2/MacOS v10.13, I'm not sure if it will work for MacOS v10.12.

pip3 install https://github.com/tslater/tensorflow-gpu-support-pythonv3.6.2-macosv10.13/raw/master/tensorflow-1.3.1-cp36-cp36m-macosx_10_13_x86_64.whl

@jasilberman

jasilberman commented Nov 2, 2017

@tslater, we appreciate the work.
Any chance we can get a build for Tensorflow 1.4?

Also, did you test your install with CUDA 9? That is, using the latest NVIDIA video driver and CUDA install?

Thank you!
-Jack

@norman-thomas

@jasilberman
I uploaded a TF 1.4 build for Python 3.6 using CUDA 9 and cuDNN 7 on macOS 10.13.1 to my repo.
Check it out if it works for you:
https://github.com/norman-thomas/tensorflow-gpu-mac

@jasilberman

jasilberman commented Nov 17, 2017

@norman-thomas

Thanks for the attention. Any chance you saved it with the incorrect file name?

It is showing for macOS 10.9.
I tried to install it on 10.13.1, no game.

tensorflow-1.4.0-cp36-cp36m-macosx_10_9_x86_64
https://github.com/norman-thomas/tensorflow-gpu-mac/blob/master/tensorflow-1.4.0-cp36-cp36m-macosx_10_9_x86_64.whl

Thanks,
-Jack

@ihenry

ihenry commented Nov 27, 2017

Great guide - A couple of modifications required for me too. Previously I had the error

Error: Cask 'cuda' is unavailable: No Cask with this name exists. Did you mean “cuda-z”?

brew tap caskroom/drivers
brew cask install nvidia-cuda

@doctorpangloss

@mishaaskerka The link is fixed.

@gingerbeardman

Thanks for your wheel @slater

@knasim

knasim commented Mar 23, 2018

This errors out with:

ERROR: /Users/nasimk/git/tensorflow/tensorflow/tensorflow.bzl:418:42: name 'DATA_CFG' is not defined
ERROR: error loading package '': Extension 'tensorflow/tensorflow.bzl' has errors
ERROR: error loading package '': Extension 'tensorflow/tensorflow.bzl' has errors

@niccottrell

I second the brew cask install change from cuda to nvidia-cuda

@Einsley

Einsley commented Aug 3, 2018

Hi!
"tar zxvf ~/Downloads/cudnn-6.5-osx-v2.tar.gz" gives error: Failed to open '/Users/yili/Downloads/cudnn-6.5-osx-v2.tar.gz'. Does anyone know how to fix this? Thanks!

@racekiller

Hello all,
I am getting the following error

Current Bazel version is 0.15.2-homebrew, expected at least 0.4.5

when running this line

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

Can anyone help?

Thanks

@peisungtsai

When I tried to run $ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
I got this error:
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/Users/peisungtsai/tensorflow/tools/bazel.rc

Not sure what to do. Thanks.

@peisungtsai

Sorry, the complete error message while running $ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/Users/peisungtsai/tensorflow/tools/bazel.rc
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=80
ERROR: Config value cuda is not defined in any .rc file
INFO: Invocation ID: 80a35643-a47b-4d78-95db-dc799800258b

@lp74

lp74 commented Aug 17, 2019

Sorry, the complete error message while running $ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/Users/peisungtsai/tensorflow/tools/bazel.rc
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=80
ERROR: Config value cuda is not defined in any .rc file
INFO: Invocation ID: 80a35643-a47b-4d78-95db-dc799800258b

I've got the same error. Did you solve it?
