Tensorflow GPU 1.8 with MacOS 10.13.6

A guide to installing and running a precompiled build of Tensorflow 1.8 with GPU support on MacOS 10.13.6.

PREREQUISITE: An Nvidia GPU or eGPU that is already working.

These are the required steps:

(Note: follow this guide at your own risk.
Note 2: A large part of this guide is taken from this other guide.)

1. Install Homebrew:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install wget

2. Install Nvidia Web Drivers:

https://images.nvidia.com/mac/pkg/387/WebDriver-387.10.10.10.40.105.pkg

3. Install Nvidia Cuda Drivers:

https://www.nvidia.com/object/macosx-cuda-387.178-driver.html

4. Download Xcode 8.2.xip and Xcode 9.4.xip, extract both .app files, rename them to Xcode8.2.app and Xcode9.4.app respectively, and move them to the Applications folder:

https://developer.apple.com/download/more/

You need to search for them on that page; they are about 4.2GB and 5.2GB. Xcode 9.4 is needed to install OpenMP, which recommends that version. I don't know whether the latest Xcode works in place of 9.4; if you already have the latest version, you could try it. Xcode 8.2 is essential in any case.

5. Set Xcode8.2 as default:

sudo xcode-select -s /Applications/Xcode8.2.app

6. Install bazel:

brew install bazel

7. Install cuda 9.1.128:

https://developer.nvidia.com/cuda-91-download-archive?target_os=MacOSX&target_arch=x86_64&target_version=1013&target_type=dmglocal

8. Download and install nccl 1.3.4:

https://storage.googleapis.com/74thopen/tensorflow_osx/nccl_osx_1.3.4.tar.gz

Unarchive it, open a terminal window in the folder that contains the extracted directory, and move its contents into /usr/local/nccl by running:

sudo mkdir -p /usr/local/nccl
cd nccl_2.1.15-1+cuda9.1_x86_64
sudo mv * /usr/local/nccl
sudo mkdir -p /usr/local/include/third_party/nccl
sudo ln -s /usr/local/nccl/include/nccl.h /usr/local/include/third_party/nccl

9. Add the following lines to ~/.bash_profile, then open a new terminal window so they take effect:

export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH:/Developer/NVIDIA/CUDA-9.1/bin
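
As a quick sanity check on the variables from step 9, here is a minimal sketch that reports which of them are missing or different from the values in this guide. The function name check_cuda_env and the constants are made up for illustration; the expected values are simply copied from step 9 and should be adjusted if CUDA lives elsewhere on your machine.

```python
import os

# Expected values copied from step 9 of this guide (assumption: default CUDA location).
EXPECTED_CUDA_HOME = "/usr/local/cuda"
EXPECTED_LIB_DIRS = ["/usr/local/cuda/lib", "/usr/local/cuda/extras/CUPTI/lib"]


def check_cuda_env(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    cuda_home = env.get("CUDA_HOME")
    if cuda_home != EXPECTED_CUDA_HOME:
        problems.append("CUDA_HOME is %r, expected %r" % (cuda_home, EXPECTED_CUDA_HOME))
    # DYLD_LIBRARY_PATH is a colon-separated list of directories.
    lib_dirs = env.get("DYLD_LIBRARY_PATH", "").split(":")
    for wanted in EXPECTED_LIB_DIRS:
        if wanted not in lib_dirs:
            problems.append("DYLD_LIBRARY_PATH is missing %s" % wanted)
    return problems


if __name__ == "__main__":
    found = check_cuda_env(os.environ)
    for line in found or ["environment matches step 9"]:
        print(line)
```

Run it with python3 from a freshly opened terminal; an empty problem list only means the variables are set as written above, not that CUDA itself works.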

10. Compile CUDA samples to test if GPU is working correctly:

cd /Developer/NVIDIA/CUDA-9.1/samples
chown -R $(whoami) *
make -C 1_Utilities/deviceQuery
./bin/x86_64/darwin/release/deviceQuery

You should get this result at the bottom of the terminal:

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
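
If you would rather check that summary from a script than by eye, the sketch below parses it. It assumes the summary consists of a line starting with "deviceQuery," followed by a separate "Result = ..." line; the function name parse_device_query_summary is hypothetical.

```python
import re


def parse_device_query_summary(output):
    """Collect the key/value pairs and the final result from deviceQuery output."""
    summary = {}
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("deviceQuery,"):
            # Fields after the first comma look like "CUDA Driver Version = 9.1".
            for part in line.split(",")[1:]:
                key, _, value = part.partition("=")
                summary[key.strip()] = value.strip()
        match = re.match(r"Result\s*=\s*(\w+)", line)
        if match:
            summary["Result"] = match.group(1)
    return summary


sample = (
    "deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, "
    "CUDA Runtime Version = 9.1, NumDevs = 1\n"
    "Result = PASS"
)
info = parse_device_query_summary(sample)
print(info["Result"], info["CUDA Driver Version"], info["NumDevs"])
```

For this guide you would want Result to be PASS, both versions to be 9.1, and NumDevs to be at least 1.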

11. Register here and download cuDNN 7.0.5:

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.0.5/prod/9.1_20171129/cudnn-9.1-osx-x64-v7-ga

Extract the archive and copy the required files into the CUDA install folder:

tar -xzvf cudnn-9.1-osx-x64-v7-ga.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib/libcudnn*
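
To confirm that the header you copied really is cuDNN 7.0.5, one option is to read the version macros out of cudnn.h. This is a sketch under the assumption that the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integers, which is how cuDNN 7 headers are laid out as far as I know; the function name cudnn_version is made up.

```python
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    numbers = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7".
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name, header_text, re.MULTILINE)
        if match is None:
            return None
        numbers.append(int(match.group(1)))
    return tuple(numbers)


# Tiny stand-in for the real header, used only to demonstrate the parsing.
sample_header = (
    "#define CUDNN_MAJOR 7\n"
    "#define CUDNN_MINOR 0\n"
    "#define CUDNN_PATCHLEVEL 5\n"
)
print(cudnn_version(sample_header))
```

On a real install you would pass in the contents of /usr/local/cuda/include/cudnn.h, for example cudnn_version(open("/usr/local/cuda/include/cudnn.h").read()).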

12. Download and install Python 3.6.4:

https://www.python.org/ftp/python/3.6.4/python-3.6.4-macosx10.6.pkg

This is where I stopped following the other guide.

13. Install Tensorflow 1.8 (other versions HERE):

pip3 install https://storage.googleapis.com/74thopen/tensorflow_osx/tensorflow-1.8.0-cp36-cp36m-macosx_10_13_x86_64.whl

14. Set Xcode9.4 as default:

sudo xcode-select -s /Applications/Xcode9.4.app

15. Install OpenMP:

brew install cliutils/apple/libomp

16. Finally, test the whole installation. In a terminal, run:

python3

then

>>> import tensorflow as tf
>>> tf.Session()

You should see log messages about your GPU, its memory, and other device details (### I will insert the exact returned message ###).
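
Besides reading the session log, you can ask TensorFlow which devices it sees. The sketch below uses device_lib.list_local_devices() from tensorflow.python.client, which I believe is available in TensorFlow 1.x; the helper names gpu_names and local_gpu_names are made up, and the demo list only imitates the name and device_type fields of the real device descriptions.

```python
from types import SimpleNamespace


def gpu_names(devices):
    """Return the names of all entries whose device_type is GPU."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_names():
    """Return GPU device names seen by TensorFlow, or None if it is not installed."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return gpu_names(device_lib.list_local_devices())


# Offline demo with stand-in objects, so the filter can be tried without a GPU.
demo = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:GPU:0", device_type="GPU"),
]
print(gpu_names(demo))  # prints ['/device:GPU:0']
```

Calling local_gpu_names() in the python3 session from step 16 should return a non-empty list if the GPU build is working; an empty list suggests TensorFlow is falling back to the CPU.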

17. If you get the "Symbol not found: _ncclAllReduce" error:

  1. Download the file here:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/nccl/kernels/nccl_ops.cc

  2. Execute:
gcc -c -fPIC nccl_ops.cc -o hello_world.o
  3. Execute:
gcc hello_world.o -shared -o _nccl_ops.so
  4. Replace the existing "_nccl_ops.so" with the generated file, at this path inside the TensorFlow install:
tensorflow/contrib/nccl/python/ops

To find where TF is installed:

pip3 show tensorflow

you will get:

Name: tensorflow
Version: 1.8.0
Summary: TensorFlow helps the tensors flow
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: opensource@google.com
License: Apache 2.0
Location: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages
Requires: grpcio, tensorboard, wheel, astor, gast, protobuf, termcolor, numpy, six, absl-py
Required-by:
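
The Location line is the part that matters for step 17: the file to replace sits under that directory at tensorflow/contrib/nccl/python/ops/_nccl_ops.so. As a small sketch, the helper below (the name nccl_ops_path is made up) builds the full path from the text printed by pip3 show tensorflow.

```python
import posixpath

# Relative location of the library inside the TensorFlow package, as given in step 17.
NCCL_OPS_RELATIVE = "tensorflow/contrib/nccl/python/ops/_nccl_ops.so"


def nccl_ops_path(pip_show_output):
    """Return the full path of _nccl_ops.so, or None if no Location line is found."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            site_packages = line.split(":", 1)[1].strip()
            return posixpath.join(site_packages, NCCL_OPS_RELATIVE)
    return None


sample_output = (
    "Name: tensorflow\n"
    "Version: 1.8.0\n"
    "Location: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages\n"
)
print(nccl_ops_path(sample_output))
```

It may be worth keeping a backup copy of the original _nccl_ops.so before overwriting it, so the change can be undone.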

Then repeat step 16. If everything works, congratulations: you have Tensorflow 1.8 with GPU support installed!

If you also want to run some sample code to be sure everything really works, download and run:

https://github.com/antoniopioricciardi/Tensorflow-MacOS-10.13.6-eGPU/blob/master/TFtest.py
@pylearndl

Hi, I'm getting the error below. I have no clue what is wrong or where. Please help.

Aagnyas-Air:Nails Aagnya$ python3 custom.py train --dataset=customImages --weights=coco
Using TensorFlow backend.
Weights: coco
Dataset: customImages
Logs: /Users/Aagnya/Documents/AppCodes/Applications/logs

Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 2
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.9
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 2
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 800
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME nail
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 100
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001

Loading weights /Users/Aagnya/Documents/AppCodes/Applications/mask_rcnn_coco.h5
2019-01-06 22:25:18.813208: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2019-01-06 22:25:18.814461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1070 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:c3:00.0
totalMemory: 8.00GiB freeMemory: 7.24GiB
2019-01-06 22:25:18.814514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-01-06 22:25:20.799216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-01-06 22:25:20.799247: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-01-06 22:25:20.799256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-01-06 22:25:20.801938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:c3:00.0, compute capability: 6.1)
2019-01-06 22:25:20.804368: E tensorflow/stream_executor/cuda/cuda_driver.cc:936] failed to allocate 6.83G (7329701632 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2019-01-06 22:25:22.579117: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
2019-01-06 22:25:23.544083: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
2019-01-06 22:25:25.234273: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
Training network heads
Traceback (most recent call last):
  File "custom.py", line 365, in <module>
    train(model)
  File "custom.py", line 201, in train
    layers='heads')
  File "/Users/Aagnya/Documents/AppCodes/Applications/untitled folder/Nails/mrcnn/model.py", line 2348, in train
    histogram_freq=0, write_graph=True, write_images=False),
  File "/usr/local/lib/python3.6/site-packages/keras/callbacks.py", line 745, in __init__
    from tensorflow.contrib.tensorboard.plugins import projector
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/__init__.py", line 36, in <module>
    from tensorflow.contrib import distribute
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/distribute/__init__.py", line 22, in <module>
    from tensorflow.contrib.distribute.python.cross_tower_ops import *
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/distribute/python/cross_tower_ops.py", line 23, in <module>
    from tensorflow.contrib.distribute.python import cross_tower_utils
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/distribute/python/cross_tower_utils.py", line 23, in <module>
    from tensorflow.contrib import nccl
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/nccl/__init__.py", line 30, in <module>
    from tensorflow.contrib.nccl.python.ops.nccl_ops import all_max
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/nccl/python/ops/nccl_ops.py", line 30, in <module>
    resource_loader.get_path_to_datafile('_nccl_ops.so'))
  File "/usr/local/lib/python3.6/site-packages/tensorflow/contrib/util/loader.py", line 56, in load_op_library
    ret = load_library.load_op_library(path)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/usr/local/lib/python3.6/site-packages/tensorflow/contrib/nccl/python/ops/_nccl_ops.so, 6): Symbol not found: _ncclAllReduce
  Referenced from: /usr/local/lib/python3.6/site-packages/tensorflow/contrib/nccl/python/ops/_nccl_ops.so
  Expected in: flat namespace
 in /usr/local/lib/python3.6/site-packages/tensorflow/contrib/nccl/python/ops/_nccl_ops.so
Aagnyas-Air:Nails Aagnya$

@pylearndl

I followed the nccl_ops.cc instructions and it works, but after about a minute Python crashes with the error stack below:

Process: Python [24192]
Path: /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 3.6.5 (3.6.5)
Code Type: X86-64 (Native)
Parent Process: bash [18085]
Responsible: Python [24192]
User ID: 501

Date/Time: 2019-01-06 22:58:16.089 +0000
OS Version: Mac OS X 10.13.6 (17G65)
Report Version: 12
Anonymous UUID: E9E6C993-F608-2A21-3104-5D27FD5EAFEF

Time Awake Since Boot: 94000 seconds

System Integrity Protection: disabled

Crashed Thread: 14

Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY

Termination Signal: Illegal instruction: 4
Termination Reason: Namespace SIGNAL, Code 0x4
Terminating Process: exc handler [0]

Thread 0:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138c7180b nsync::nsync_mu_semaphore_p_with_deadline(nsync::nsync_semaphore_s_*, timespec) + 363
4 libtensorflow_framework.so 0x0000000138c6dfc7 nsync::nsync_cv_wait_with_deadline_generic(nsync::nsync_cv_s_*, void*, void (*)(void*), void (*)(void*), timespec, nsync::nsync_note_s_*) + 439
5 libtensorflow_framework.so 0x0000000138c6e721 nsync::nsync_cv_wait(nsync::nsync_cv_s_*, nsync::nsync_mu_s_*) + 49
6 _pywrap_tensorflow_internal.so 0x000000012bb174ab tensorflow::DirectSession::WaitForNotification(tensorflow::Notification*, long long) + 219
7 _pywrap_tensorflow_internal.so 0x000000012bb0ea16 tensorflow::DirectSession::WaitForNotification(tensorflow::DirectSession::RunState*, tensorflow::CancellationManager*, long long) + 38
8 _pywrap_tensorflow_internal.so 0x000000012bb0e1d2 tensorflow::DirectSession::RunInternal(long long, tensorflow::RunOptions const&, tensorflow::CallFrameInterface*, tensorflow::DirectSession::ExecutorsAndKeys*, tensorflow::RunMetadata*) + 2114
9 _pywrap_tensorflow_internal.so 0x000000012bb17f4e tensorflow::DirectSession::RunCallable(long long, std::__1::vector<tensorflow::Tensor, std::__1::allocator<tensorflow::Tensor> > const&, std::__1::vector<tensorflow::Tensor, std::__1::allocator<tensorflow::Tensor> >*, tensorflow::RunMetadata*) + 1086
10 _pywrap_tensorflow_internal.so 0x000000012890d215 tensorflow::(anonymous namespace)::RunCallableHelper(tensorflow::Session*, long long, _object*, TF_Status*, tensorflow::gtl::InlinedVector<_object*, 8>*, TF_Buffer*) + 805
11 _pywrap_tensorflow_internal.so 0x000000012890d65c tensorflow::TF_SessionRunCallable(TF_Session*, long long, _object*, TF_Status*, tensorflow::gtl::InlinedVector<_object*, 8>*, TF_Buffer*) + 12
12 _pywrap_tensorflow_internal.so 0x00000001288cccc4 _wrap_TF_SessionRunCallable(_object*, _object*) + 548
13 org.python.python 0x0000000106a9f015 _PyCFunction_FastCallDict + 166
14 org.python.python 0x0000000106b050f6 call_function + 491
15 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
16 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
17 org.python.python 0x0000000106b06129 _PyFunction_FastCallDict + 449
18 org.python.python 0x0000000106a66718 _PyObject_FastCallDict + 196
19 org.python.python 0x0000000106a6683b _PyObject_Call_Prepend + 156
20 org.python.python 0x0000000106a66599 PyObject_Call + 101
21 org.python.python 0x0000000106ab1b96 slot_tp_call + 50
22 org.python.python 0x0000000106a66599 PyObject_Call + 101
23 org.python.python 0x0000000106afd835 _PyEval_EvalFrameDefault + 2175
24 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
25 org.python.python 0x0000000106b050cd call_function + 450
26 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
27 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
28 org.python.python 0x0000000106a66718 _PyObject_FastCallDict + 196
29 org.python.python 0x0000000106a6683b _PyObject_Call_Prepend + 156
30 org.python.python 0x0000000106a66599 PyObject_Call + 101
31 org.python.python 0x0000000106ab1b96 slot_tp_call + 50
32 org.python.python 0x0000000106a666e3 _PyObject_FastCallDict + 143
33 org.python.python 0x0000000106b050c6 call_function + 443
34 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
35 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
36 org.python.python 0x0000000106b05f59 fast_function + 218
37 org.python.python 0x0000000106b050cd call_function + 450
38 org.python.python 0x0000000106afd6c8 _PyEval_EvalFrameDefault + 1810
39 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
40 org.python.python 0x0000000106b05f59 fast_function + 218
41 org.python.python 0x0000000106b050cd call_function + 450
42 org.python.python 0x0000000106afd6c8 _PyEval_EvalFrameDefault + 1810
43 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
44 org.python.python 0x0000000106afcf7b PyEval_EvalCodeEx + 57
45 org.python.python 0x0000000106a87537 function_call + 339
46 org.python.python 0x0000000106a66599 PyObject_Call + 101
47 org.python.python 0x0000000106afd835 _PyEval_EvalFrameDefault + 2175
48 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
49 org.python.python 0x0000000106b05f59 fast_function + 218
50 org.python.python 0x0000000106b050cd call_function + 450
51 org.python.python 0x0000000106afd6c8 _PyEval_EvalFrameDefault + 1810
52 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
53 org.python.python 0x0000000106b05f59 fast_function + 218
54 org.python.python 0x0000000106b050cd call_function + 450
55 org.python.python 0x0000000106afd6c8 _PyEval_EvalFrameDefault + 1810
56 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
57 org.python.python 0x0000000106b050cd call_function + 450
58 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
59 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
60 org.python.python 0x0000000106afcf3c PyEval_EvalCode + 42
61 org.python.python 0x0000000106b25acf run_mod + 54
62 org.python.python 0x0000000106b24ade PyRun_FileExFlags + 164
63 org.python.python 0x0000000106b241c9 PyRun_SimpleFileExFlags + 283
64 org.python.python 0x0000000106b38faa Py_Main + 3466
65 org.python.python 0x0000000106a59e1d 0x106a58000 + 7709
66 libdyld.dylib 0x00007fff599ab015 start + 1

Thread 1:
0 libsystem_kernel.dylib 0x00007fff59afc28a __workq_kernreturn + 10
1 libsystem_pthread.dylib 0x00007fff59cc3009 _pthread_wqthread + 1035
2 libsystem_pthread.dylib 0x00007fff59cc2be9 start_wqthread + 13

Thread 2:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138883066 Eigen::EventCount::CommitWait(Eigen::EventCount::Waiter*) + 278
4 libtensorflow_framework.so 0x0000000138882dbd Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) + 941
5 libtensorflow_framework.so 0x0000000138882447 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 583
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 3:
0 libsystem_platform.dylib 0x00007fff59cbcf49 _platform_memmove$VARIANT$Haswell + 41
1 _pywrap_tensorflow_internal.so 0x000000012af1b70a std::__1::__function::__func<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long), std::__1::allocator<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long)>, void (long long, long long)>::operator()(long long&&, long long&&) + 266
2 libtensorflow_framework.so 0x0000000138883c9c std::__1::__function::__func<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long), std::__1::allocator<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long)>, void (long, long)>::operator()(long&&, long&&) + 44
3 libtensorflow_framework.so 0x0000000138883ac4 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const + 196
4 libtensorflow_framework.so 0x0000000138883bde std::__1::__function::__func<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'(), std::__1::allocator<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'()>, void ()>::operator()() + 46
5 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 4:
0 libsystem_platform.dylib 0x00007fff59cbcf49 _platform_memmove$VARIANT$Haswell + 41
1 _pywrap_tensorflow_internal.so 0x000000012af1b70a std::__1::__function::__func<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long), std::__1::allocator<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long)>, void (long long, long long)>::operator()(long long&&, long long&&) + 266
2 libtensorflow_framework.so 0x0000000138883c9c std::__1::__function::__func<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long), std::__1::allocator<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long)>, void (long, long)>::operator()(long&&, long&&) + 44
3 libtensorflow_framework.so 0x0000000138883ac4 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const + 196
4 libtensorflow_framework.so 0x0000000138883bde std::__1::__function::__func<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'(), std::__1::allocator<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'()>, void ()>::operator()() + 46
5 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 5:
0 libsystem_platform.dylib 0x00007fff59cbcf49 _platform_memmove$VARIANT$Haswell + 41
1 _pywrap_tensorflow_internal.so 0x000000012af1b70a std::__1::__function::__func<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long), std::__1::allocator<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long)>, void (long long, long long)>::operator()(long long&&, long long&&) + 266
2 libtensorflow_framework.so 0x0000000138883c9c std::__1::__function::__func<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long), std::__1::allocator<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long)>, void (long, long)>::operator()(long&&, long&&) + 44
3 libtensorflow_framework.so 0x0000000138883ac4 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const + 196
4 libtensorflow_framework.so 0x0000000138883bde std::__1::__function::__func<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'(), std::__1::allocator<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'()>, void ()>::operator()() + 46
5 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 6:
0 libsystem_platform.dylib 0x00007fff59cbcf49 _platform_memmove$VARIANT$Haswell + 41
1 _pywrap_tensorflow_internal.so 0x000000012af1b70a std::__1::__function::__func<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long), std::__1::allocator<void tensorflow::ConcatCPUImpl<float, tensorflow::(anonymous namespace)::MemCpyCopier >(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, long long, tensorflow::(anonymous namespace)::MemCpyCopier, tensorflow::TTypes<float, 2, long>::Matrix*)::'lambda'(long long, long long)>, void (long long, long long)>::operator()(long long&&, long long&&) + 266
2 libtensorflow_framework.so 0x0000000138883c9c std::__1::__function::__func<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long), std::__1::allocator<tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>)::'lambda'(long, long)>, void (long, long)>::operator()(long&&, long&&) + 44
3 libtensorflow_framework.so 0x0000000138883ac4 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const + 196
4 libtensorflow_framework.so 0x0000000138883b19 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const + 281
5 libtensorflow_framework.so 0x0000000138883bde std::__1::__function::__func<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'(), std::__1::allocator<Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const::'lambda'(long, long)::operator()(long, long) const::'lambda'()>, void ()>::operator()() + 46
6 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
7 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
8 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
9 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
10 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
11 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 7:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138883066 Eigen::EventCount::CommitWait(Eigen::EventCount::Waiter*) + 278
4 libtensorflow_framework.so 0x0000000138882dbd Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) + 941
5 libtensorflow_framework.so 0x0000000138882447 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 583
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 8:
0 libsystem_kernel.dylib 0x00007fff59af220a mach_msg_trap + 10
1 libsystem_kernel.dylib 0x00007fff59af1724 mach_msg + 60
2 libcuda_387.10.10.10_mercury.dylib 0x000000015cd45b3e 0x15cbdc000 + 1481534
3 libcuda_387.10.10.10_mercury.dylib 0x000000015cd9974c 0x15cbdc000 + 1824588
4 libcuda_387.10.10.10_mercury.dylib 0x000000015cd476c9 0x15cbdc000 + 1488585
5 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
6 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
7 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 9:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libcuda_387.10.10.10_mercury.dylib 0x000000015cd47527 0x15cbdc000 + 1488167
3 libcuda_387.10.10.10_mercury.dylib 0x000000015ccf8ece 0x15cbdc000 + 1167054
4 libcuda_387.10.10.10_mercury.dylib 0x000000015cd476c9 0x15cbdc000 + 1488585
5 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
6 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
7 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 10:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138c7180b nsync::nsync_mu_semaphore_p_with_deadline(nsync::nsync_semaphore_s_*, timespec) + 363
4 libtensorflow_framework.so 0x0000000138c6dfc7 nsync::nsync_cv_wait_with_deadline_generic(nsync::nsync_cv_s_*, void*, void (*)(void*), void (*)(void*), timespec, nsync::nsync_note_s_*) + 439
5 libtensorflow_framework.so 0x0000000138c6e721 nsync::nsync_cv_wait(nsync::nsync_cv_s_*, nsync::nsync_mu_s_*) + 49
6 libtensorflow_framework.so 0x0000000138c6c401 tensorflow::EventMgr::PollLoop() + 161
7 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
8 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
9 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
10 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
11 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
12 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 11:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138883066 Eigen::EventCount::CommitWait(Eigen::EventCount::Waiter*) + 278
4 libtensorflow_framework.so 0x0000000138882dbd Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) + 941
5 libtensorflow_framework.so 0x000000013888295c Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1884
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 12:
0 libsystem_pthread.dylib 0x00007fff59cc4598 _pthread_cond_wait + 747
1 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
2 libtensorflow_framework.so 0x0000000138883066 Eigen::EventCount::CommitWait(Eigen::EventCount::Waiter*) + 278
3 libtensorflow_framework.so 0x0000000138882dbd Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) + 941
4 libtensorflow_framework.so 0x000000013888295c Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1884
5 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
6 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
7 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
8 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
9 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 13:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x00000001388838cb Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const + 923
4 libtensorflow_framework.so 0x00000001388834c6 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<void (long, long)>) const + 134
5 libtensorflow_framework.so 0x0000000138880f6f tensorflow::thread::ThreadPool::Impl::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>) + 159
6 libtensorflow_framework.so 0x0000000138880e92 tensorflow::thread::ThreadPool::ParallelFor(long long, long long, std::__1::function<void (long long, long long)>) + 114
7 libtensorflow_framework.so 0x00000001387fb0b9 tensorflow::Shard(int, tensorflow::thread::ThreadPool*, long long, long long, std::__1::function<void (long long, long long)>) + 825
8 _pywrap_tensorflow_internal.so 0x000000012af15806 void tensorflow::ConcatCPU(tensorflow::DeviceBase*, std::__1::vector<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> >, std::__1::allocator<std::__1::unique_ptr<tensorflow::TTypes<float, 2, long>::ConstMatrix, std::__1::default_delete<tensorflow::TTypes<float, 2, long>::ConstMatrix> > > > const&, tensorflow::TTypes<float, 2, long>::Matrix*) + 454
9 _pywrap_tensorflow_internal.so 0x0000000128c75d80 tensorflow::ConcatBaseOp<Eigen::ThreadPoolDevice, float, (tensorflow::AxisArgumentName)0>::Compute(tensorflow::OpKernelContext*) + 1712
10 libtensorflow_framework.so 0x0000000138c6094e tensorflow::ThreadPoolDevice::Compute(tensorflow::OpKernel*, tensorflow::OpKernelContext*) + 334
11 libtensorflow_framework.so 0x0000000138c27462 tensorflow::(anonymous namespace)::ExecutorState::Process(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long) + 4722
12 libtensorflow_framework.so 0x0000000138c27d75 std::__1::__function::__func<tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady(tensorflow::gtl::InlinedVector<tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, 8> const&, tensorflow::(anonymous namespace)::ExecutorState::TaggedNodeReadyQueue*)::$_1, std::__1::allocator<tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady(tensorflow::gtl::InlinedVector<tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, 8> const&, tensorflow::(anonymous namespace)::ExecutorState::TaggedNodeReadyQueue*)::$_1>, void ()>::operator()() + 37
13 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
14 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
15 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
16 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
17 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
18 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 14 Crashed:
0 _pywrap_tensorflow_internal.so 0x000000012937bd25 std::__1::__function::__func<Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<int, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorStridingSlicingOp<Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::TensorMap<Eigen::Tensor<int const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const, Eigen::ThreadPoolDevice, false>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<int, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorStridingSlicingOp<Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::TensorMap<Eigen::Tensor<int const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const&, Eigen::ThreadPoolDevice const&)::'lambda'(long, long), std::__1::allocator<Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<int, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorStridingSlicingOp<Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::TensorMap<Eigen::Tensor<int const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const, Eigen::ThreadPoolDevice, false>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<int, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorStridingSlicingOp<Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::DSizes<long, 1> const, Eigen::TensorMap<Eigen::Tensor<int const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const&, Eigen::ThreadPoolDevice const&)::'lambda'(long, long)>, void (long, long)>::operator()(long&&, long&&) + 725
1 _pywrap_tensorflow_internal.so 0x0000000128d00f44 Eigen::ThreadPoolDevice::parallelFor(long, Eigen::TensorOpCost const&, std::__1::function<long (long)>, std::__1::function<void (long, long)>) const + 196
2 _pywrap_tensorflow_internal.so 0x00000001293725db tensorflow::functor::StridedSlice<Eigen::ThreadPoolDevice, int, 1>::operator()(Eigen::ThreadPoolDevice const&, Eigen::TensorMap<Eigen::Tensor<int, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorMap<Eigen::Tensor<int const, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::DSizes<long, 1> const&, Eigen::DSizes<long, 1> const&, Eigen::DSizes<long, 1> const&) + 315
3 _pywrap_tensorflow_internal.so 0x0000000129372373 void tensorflow::HandleStridedSliceCase<Eigen::ThreadPoolDevice, int, 1>(tensorflow::OpKernelContext*, tensorflow::gtl::ArraySlice const&, tensorflow::gtl::ArraySlice const&, tensorflow::gtl::ArraySlice const&, tensorflow::TensorShape const&, bool, tensorflow::Tensor*) + 483
4 _pywrap_tensorflow_internal.so 0x0000000129324c5a tensorflow::StridedSliceOp<Eigen::ThreadPoolDevice, int>::Compute(tensorflow::OpKernelContext*) + 1722
5 libtensorflow_framework.so 0x0000000138be7686 tensorflow::BaseGPUDevice::ComputeHelper(tensorflow::OpKernel*, tensorflow::OpKernelContext*) + 1318
6 libtensorflow_framework.so 0x0000000138c27462 tensorflow::(anonymous namespace)::ExecutorState::Process(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long) + 4722
7 libtensorflow_framework.so 0x0000000138c27d75 std::__1::__function::__func<tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady(tensorflow::gtl::InlinedVector<tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, 8> const&, tensorflow::(anonymous namespace)::ExecutorState::TaggedNodeReadyQueue*)::$_1, std::__1::allocator<tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady(tensorflow::gtl::InlinedVector<tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, 8> const&, tensorflow::(anonymous namespace)::ExecutorState::TaggedNodeReadyQueue*)::$_1>, void ()>::operator()() + 37
8 libtensorflow_framework.so 0x0000000138882982 Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1922
9 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
10 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
11 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
12 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
13 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 15:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 libc++.1.dylib 0x00007fff578ffcb0 std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 18
3 libtensorflow_framework.so 0x0000000138883066 Eigen::EventCount::CommitWait(Eigen::EventCount::Waiter*) + 278
4 libtensorflow_framework.so 0x0000000138882dbd Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) + 941
5 libtensorflow_framework.so 0x000000013888295c Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) + 1884
6 libtensorflow_framework.so 0x0000000138882104 std::__1::__function::__func<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'(), std::__1::allocator<tensorflow::thread::EigenEnvironment::CreateThread(std::__1::function<void ()>)::'lambda'()>, void ()>::operator()() + 52
7 libtensorflow_framework.so 0x00000001388a83e0 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::function<void ()> > >(void*) + 96
8 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
9 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
10 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 16:
0 libsystem_kernel.dylib 0x00007fff59afba16 __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff59cc4589 _pthread_cond_wait + 732
2 org.python.python 0x0000000106b369ff PyThread_acquire_lock_timed + 465
3 org.python.python 0x0000000106b3b7d0 acquire_timed + 104
4 org.python.python 0x0000000106b3b59c lock_PyThread_acquire_lock + 44
5 org.python.python 0x0000000106a9f13e _PyCFunction_FastCallDict + 463
6 org.python.python 0x0000000106b050f6 call_function + 491
7 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
8 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
9 org.python.python 0x0000000106b05f59 fast_function + 218
10 org.python.python 0x0000000106b050cd call_function + 450
11 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
12 org.python.python 0x0000000106b05876 _PyEval_EvalCodeWithName + 1747
13 org.python.python 0x0000000106b05f59 fast_function + 218
14 org.python.python 0x0000000106b050cd call_function + 450
15 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
16 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
17 org.python.python 0x0000000106b050cd call_function + 450
18 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
19 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
20 org.python.python 0x0000000106b050cd call_function + 450
21 org.python.python 0x0000000106afd631 _PyEval_EvalFrameDefault + 1659
22 org.python.python 0x0000000106b061f9 _PyFunction_FastCall + 121
23 org.python.python 0x0000000106a66718 _PyObject_FastCallDict + 196
24 org.python.python 0x0000000106a6683b _PyObject_Call_Prepend + 156
25 org.python.python 0x0000000106a66599 PyObject_Call + 101
26 org.python.python 0x0000000106b3bef7 t_bootstrap + 70
27 libsystem_pthread.dylib 0x00007fff59cc3661 _pthread_body + 340
28 libsystem_pthread.dylib 0x00007fff59cc350d _pthread_start + 377
29 libsystem_pthread.dylib 0x00007fff59cc2bf9 thread_start + 13

Thread 14 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000007 rbx: 0x0000000000000000 rcx: 0x0000000000000000 rdx: 0x0000000000000000
rdi: 0xffffffffffffffff rsi: 0x0000000000000000 rbp: 0x000070000288c310 rsp: 0x000070000288c2e8
r8: 0xfffffffffffffe77 r9: 0x0000000000000000 r10: 0x000000020720e700 r11: 0x0000000000000000
r12: 0x000000020720e700 r13: 0x0000000000000000 r14: 0x00007fbc8f6cac20 r15: 0x0000000000000000
rip: 0x000000012937bd25 rfl: 0x0000000000010246 cr2: 0x000000012cab7250

Logical CPU: 3
Error Code: 0x00000000
Trap Number: 6

Binary Images:
0x106a58000 - 0x106a59fff +org.python.python (3.6.5 - 3.6.5) /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/Resources/Python.app/Contents/MacOS/Python
0x106a5c000 - 0x106bc9fff +org.python.python (3.6.5, [c] 2001-2018 Python Software Foundation. - 3.6.5) <873198BF-CCB2-3D5E-9165-306A1E601866> /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/Python
0x106f59000 - 0x106f5afff +_heapq.cpython-36m-darwin.so (0) <8FE97434-8B32-3165-A0E4-BC437576A992> /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/_heapq.cpython-36m-darwin.so
0x106f9e000 - 0x106fa0fff +mmap.cpython-36m-darwin.so (0) /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/mmap.cpython-36m-darwin.so
0x1073de000 - 0x1073e3ff7 +_json.cpython-36m-darwin.so (0) <00ACBCF8-DC34-3DAA-890C-68F4097EEE44> /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/_json.cpython-36m-darwin.so
0x1073e8000 - 0x1073edfff +math.cpython-36m-darwin.so (0) /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/math.cpython-36m-darwin.so
0x1073f3000 - 0x1073fdff7 +_datetime.cpython-36m-darwin.so (0) <89E6D137-9C4D-3D1F-B1B3-CB5A6775E806> /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/_datetime.cpython-36m-darwin.so
0x107405000 - 0x107408ffb +_struct.cpython-36m-darwin.so (0) /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/_struct.cpython-36m-darwin.so
0x107410000 - 0x107413fff +zlib.cpython-36m-darwin.so (0) <71CA8E66-C850-3D2D-878B-EFC80617FF6E> /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/zlib.cpython-36m-darwin.so
0x107418000 - 0x107419fff +_bz2.cpython-36m-darwin.so (0) /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/lib-dynload/_bz2.cpython-36m-darwin.so
0x107508000 - 0x1076a3ff7 +multiarray.cpython-36m-darwin.so (???) <14268F66-60A2-3CAA-815C-05EBED6001A7> /usr/local/lib/python3.6/site-packages/numpy/core/multiarray.cpython-36m-darwin.so
0x10777b000 - 0x10b8fe5a7 +libopenblasp-r0.3.0.dev.dylib (0) /usr/local/lib/python3.6/site-packages/numpy/.dylibs/libopenblasp-r0.3.0.dev.dylib
0x10bc71000 - 0x10bd88ff7 +libgfortran.3.dylib (0) <9ABE5EDE-AD43-391A-9E54-866711FAC32A> /usr/local/lib/python3.6/site-packages/numpy/.dylibs/libgfortran.3.dylib

External Modification Summary:
Calls made by other processes targeting this process:
task_for_pid: 1
thread_create: 0
thread_set_state: 0
Calls made by this process:
task_for_pid: 0
thread_create: 0
thread_set_state: 0
Calls made by all processes on this machine:
task_for_pid: 35596
thread_create: 0
thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=919.2M resident=0K(0%) swapped_out_or_unallocated=919.2M(100%)
Writable regions: Total=3.6G written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=3.6G(100%)

                                VIRTUAL   REGION
REGION TYPE                        SIZE    COUNT (non-coalesced)
===========                     =======  =======
Activity Tracing                   256K        2
CoreUI image file                  116K        2
Dispatch continuations            8192K        2
Kernel Alloc Once                    8K        2
MALLOC                             2.7G      310
MALLOC guard page                   32K        9
MALLOC_LARGE (reserved)            256K        3  reserved VM address space (unallocated)
STACK GUARD                        100K       26
Stack                             68.7M       26
VM_ALLOCATE                       19.8G      130
VM_ALLOCATE (reserved)            96.0M        4  reserved VM address space (unallocated)
__DATA                            68.1M      585
__FONT_DATA                          4K        2
__LINKEDIT                       337.8M      252
__NV_CUDA                        578.4M       13
__TEXT                           581.5M      492
__UNICODE                          560K        2
mapped file                       34.4M        6
shared memory                    516.6M       14
===========                     =======  =======
TOTAL                             24.7G     1863
TOTAL, minus reserved VM space    24.6G     1863

Model: MacBookAir6,2, BootROM MBA61.0107.B00, 2 processors, Intel Core i5, 1.3 GHz, 4 GB, SMC 2.13f15
Graphics: Intel HD Graphics 5000, Intel HD Graphics 5000, Built-In
Graphics: NVIDIA GeForce GTX 1070 Ti, NVIDIA GeForce GTX 1070 Ti, PCIe
Memory Module: BANK 0/DIMM0, 2 GB, DDR3, 1600 MHz, 0x02FE, 0x45424A3230554638454455302D474E2D4620
Memory Module: BANK 1/DIMM0, 2 GB, DDR3, 1600 MHz, 0x02FE, 0x45424A3230554638454455302D474E2D4620
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x117), Broadcom BCM43xx 1.0 (7.77.37.31.1a9)
Bluetooth: Version 6.0.7f10, 3 services, 27 devices, 1 incoming serial ports
Network Service: Wi-Fi, AirPort, en0
PCI Card: NVIDIA GeForce GTX 1070 Ti, Display Controller, Thunderbolt@195,0,0
PCI Card: pci10de,10f0, Audio Device, Thunderbolt@195,0,1
Serial ATA Device: APPLE SSD SM0128F, 121.33 GB
USB Device: USB 3.0 Bus
USB Device: BRCM20702 Hub
USB Device: Bluetooth USB Host Controller
Thunderbolt Bus: MacBook Air, Apple Inc., 23.6
Thunderbolt Device: eGFX Breakaway Box, Sonnet Technologies, Inc., 1, 25.2

@antoniopioricciardi
Author

Hi, I updated the guide.
First, try running this sample program:
https://github.com/antoniopioricciardi/Tensorflow-MacOS-10.13.6-eGPU/blob/master/TFtest.py

Does it terminate correctly?
