@kmhofmann
Last active August 11, 2024 14:14
Building TensorFlow from source

Building TensorFlow from source (TF 2.3.0, Ubuntu 20.04)

Why build from source?

The official instructions on installing TensorFlow are here: https://www.tensorflow.org/install. If you want to install TensorFlow just using pip, you are running a supported Ubuntu LTS distribution, and you're happy to install the respective tested CUDA versions (which often are outdated), by all means go ahead. A good alternative may be to run a Docker image.

I am usually unhappy with installing what in effect are pre-built binaries. These binaries are often not compatible with the Ubuntu version I am running, the CUDA version that I have installed, and so on. Furthermore, they may be slower than binaries optimized for the target architecture, since certain instructions are not being used (e.g. AVX2, FMA).
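Whether the missing instructions matter on a given machine depends on what the CPU actually reports. A minimal sketch for checking this, assuming an x86 Linux system where /proc/cpuinfo carries a "flags" line; the helper name cpu_flags is made up for illustration:

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; all cores normally report the same set.
            return set(value.split())
    return set()


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux with procfs mounted
    flags = cpu_flags(cpuinfo.read_text()) if cpuinfo.exists() else set()
    for wanted in ("avx", "avx2", "fma"):
        print(f"{wanted}: {'yes' if wanted in flags else 'no'}")
```

If avx2 and fma show up as "yes", a build tuned for the local machine at least has something to gain over a generic binary.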

So installing TensorFlow from source becomes a necessity. The official instructions on building TensorFlow from source are here: https://www.tensorflow.org/install/install_sources.

What they don't mention there is that on supposedly "unsupported" configurations (i.e. up-to-date Linux systems), this can be a task from hell. So far, building TensorFlow has been a mostly terrible experience. With some TensorFlow versions and combinations of system properties (OS version, CUDA version), things may work out decently. But more often than not, issues have popped up during the many times I have tried to build TensorFlow.

Trying to compile TensorFlow 2.1.0 (on Ubuntu 19.10 at the time) was particularly difficult; read about this in a previous version of this Gist. I aptly described building TensorFlow as a clusterfuck, and that may still prove true. My conservative guess is that quite a few developer years have been wasted out there because of several odd choices that have been made during TensorFlow development.

With TensorFlow 2.2.0, however, some fixes seem to have been made to improve the experience given my particular system configuration. Building went almost(!) smoothly, if one ignores the nightmare of installing CUDA and the specific Bazel requirement. At least no code and/or build file patching was required... this time around.

With CUDA 11 available "officially" for Ubuntu 20.04, another roadblock has been moved out of the way. TensorFlow 2.3.0 so far seems to be compatible with CUDA 11 & cuDNN 8. The biggest issue that TensorFlow currently has w.r.t. building is a compilation error using GCC 10. This should be easy to fix if anyone at Google really bothered.

Described configuration

I am describing the steps necessary to build TensorFlow in (currently) the following configuration:

  • Ubuntu 20.04
  • NVIDIA driver v450.57
  • CUDA 11.0.2 / cuDNN v8.0.2.39
  • GCC 9.3.0 (system default; Ubuntu 9.3.0-10ubuntu2)
  • TensorFlow v2.3.0

At the time of writing (2020-08-06), these were the latest available versions, except for the GCC version. (There seem to be build issues with GCC 10.1.0.)
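Since the exact versions matter so much, it helps to write down what is installed before starting. A rough sketch that collects this in one go; it assumes the tools are on the PATH under their usual names, and the helper tool_version and the VERSION_COMMANDS table are made up for illustration:

```python
import shutil
import subprocess


def tool_version(command):
    """Run a version command and return its output, or None if unavailable."""
    if shutil.which(command[0]) is None:
        return None  # tool not installed or not on PATH
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    output = (result.stdout or result.stderr).strip()
    return output or None


# Assumption: these are the usual command names; adjust for your setup.
VERSION_COMMANDS = {
    "gcc": ["gcc", "--version"],
    "nvcc": ["nvcc", "--version"],
    "driver": ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    "bazel": ["bazel", "--version"],
}

if __name__ == "__main__":
    for name, command in VERSION_COMMANDS.items():
        print(f"== {name} ==")
        print(tool_version(command) or "not found")
```

Pasting this output into a bug report or a comment saves a round of "which versions did you use?".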

Note that I am not interested in running an outdated Ubuntu version (this includes the actually quite ancient 18.04 LTS), installing a CUDA/cuDNN version that is not the latest, or using a TensorFlow version that is not the latest. Regressing to either of these is nonsensical to me. Therefore, the below instructions may or may not be useful to you. Please also note that the instructions are likely outdated, since I only update them occasionally. Many of the comments from other users below will most certainly be outdated. Don't just copy these instructions, but check what the respective latest versions are and use these instead!

Prerequisites

Installing the NVIDIA driver, CUDA and cuDNN

Please refer to my instructions here.

System packages

According to the official instructions, TensorFlow requires Python and pip:

$ sudo apt install python3-dev python3-pip python3-venv

Installing Bazel

Bazel is Google's monster of a build system and is required to build TensorFlow.

Google apparently did not want to make developers' lives easy and use a de-facto standard build system such as CMake. Life could be so nice. No, Google is big and dangerous enough to force their own creation upon everyone and thus make everyone else's life miserable. I wouldn't complain if Bazel was nice and easy to use. But I don't think there were many times when I built TensorFlow and did not have issues with Bazel (or with a combination of the two). This may be a system that works very well inside Google, but outside of the company's infrastructure, it seems less of an asset and more of a hindrance.

Anyway... things got better since a previous version of this Gist. TensorFlow 2.3.0 now requires Bazel 3.1.0, which fixes the issues encountered previously.

Note that, at the time of writing, v3.4.1 was the latest released version of Bazel. But for unfathomable reasons TensorFlow refuses to cooperate with any other version than v3.1.0. Someone at Google needs to be taught the concepts of backward or forward compatibility!

Installing Bazel via apt

I really don't like installing things via third-party apt repositories, but hey, here we go. At least I hope/assume that this one is going to be decently maintained, being from the authors. Just follow these instructions. The following commands should do the trick:

$ sudo apt install curl gnupg
$ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
$ echo "deb [arch=amd64] https://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
$ sudo apt update && sudo apt install bazel-3.1.0

Note the explicit mention of bazel-3.1.0 in the last line. Usually, one would just install bazel, but nope, TensorFlow doesn't like that as it would install the respective latest version. So it's falling back to an explicit version.
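Given how picky TensorFlow is here, it can be worth checking which Bazel is actually picked up before kicking off a multi-hour build. A small sketch; the helper names are made up, and it assumes the binary is reachable as bazel (depending on how it was installed it may carry a different name, hence the parameter):

```python
import re
import shutil
import subprocess

REQUIRED_BAZEL = "3.1.0"  # the version this guide pins for TensorFlow 2.3.0


def parse_bazel_version(output):
    """Extract 'X.Y.Z' from text such as 'bazel 3.1.0'; None if absent."""
    match = re.search(r"bazel\s+(\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_bazel_version(binary="bazel"):
    """Return the version reported by the given binary, or None if unavailable."""
    if shutil.which(binary) is None:
        return None
    try:
        result = subprocess.run(
            [binary, "--version"], capture_output=True, text=True, timeout=60
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return parse_bazel_version(result.stdout)


if __name__ == "__main__":
    found = installed_bazel_version()
    if found == REQUIRED_BAZEL:
        print(f"bazel {found} matches")
    else:
        print(f"expected bazel {REQUIRED_BAZEL}, found {found}")
```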

Note that using this installation mechanism for Bazel, you cannot have multiple versions of Bazel installed on your system at the same time. Here's to hoping that this will never be necessary in practice. (Just don't try to build multiple versions of TensorFlow, ever. ;-))

Compiling Bazel from source

OK, this is much better -- we don't need to hook into the system's package management mechanism and can build completely user-locally. Official instructions here.

We first need to install some prerequisite dependencies:

$ sudo apt-get install build-essential openjdk-11-jdk python zip unzip

For a while now, Bazel has needed to be built using Bazel, unless we use a bootstrapped distribution archive. These are specific per version, so let's get the right one:

$ wget https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-dist.zip
$ mkdir bazel-3.1.0
$ unzip -d ./bazel-3.1.0 bazel-3.1.0-dist.zip

Oh wait, we had to do... what? Yep, you read that right. If you just unzip the downloaded file, it will relentlessly litter the directory. Insert giant sadface here.

Also, I'd argue that it would be much easier to simply add an older version of Bazel to the repository such that user expectations are not subverted and one can follow a usual git clone <bazel_repo_url> followed by make approach. But oh well. Let's go on to call make.

$ cd bazel-3.1.0
$ env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh

I was just joking above. Of course you don't just call make with Bazel. Nor do you call cmake and ninja or anything that you would expect. You call the very intuitive compilation command that is easy to figure out by yourself </s>.

Either way, Bazel should now be compiled. There is no equivalent to make install. (I don't want to repeat myself talking about user expectations, but...) The binary should be located in output/ relative to the unzipped Bazel source. Add that to your PATH.

Building TensorFlow

Cloning and patching

First clone the sources, and check out the desired branch. At the time of writing, v2.3.0 was the latest version; adjust if necessary.

  $ git clone https://github.com/tensorflow/tensorflow
  $ cd tensorflow
  $ git checkout v2.3.0

Configuration

Create a Python 3 virtual environment, if you have not done this yet. For example:

  $ python3 -m venv ~/.virtualenvs/tf_dev

Activate it with source ~/.virtualenvs/tf_dev/bin/activate. This can later be deactivated with deactivate.

Install the Python packages mentioned in the official instructions:

$ pip install -U pip six 'numpy<1.19.0' wheel setuptools mock 'future>=0.17.1'
$ pip install -U keras_applications --no-deps
$ pip install -U keras_preprocessing --no-deps

(If you choose to not use a virtual environment, you'll need to add --user to each of the above commands.)

Congratulations, they snuck another maximum version in there. This appears to be due to this issue, but it is highly annoying nonetheless.
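To see whether the virtual environment actually respects that upper bound, something like the following can be run inside it. This is only a sketch: the helper names are invented, the version comparison is deliberately naive (it ignores pre-release suffixes), and it needs Python 3.8+ for importlib.metadata:

```python
from importlib import metadata  # Python 3.8+


def version_tuple(version):
    """Turn '1.18.5' into (1, 18, 5); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies_numpy_pin(version, upper=(1, 19, 0)):
    """True if the version is strictly below the upper bound used above."""
    return version_tuple(version) < upper


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed in this environment")
    else:
        ok = satisfies_numpy_pin(installed)
        print(f"numpy {installed}: {'ok' if ok else 'too new for this build'}")
```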

Now run the TensorFlow configuration script:

  $ ./configure

We all like interactive scripts called ./configure, don't we? (Whoever devised this atrocity has never used GNU tools before.)

Carefully go through the options. You can leave most defaults, but do specify the required CUDA compute capabilities (as below, or similar):

  CUDA support -> Y
  CUDA compute capability -> 5.2,6.1,7.0

Some of the compute capabilities of popular GPU cards might be good to know:

  • Maxwell TITAN X: 5.2
  • Pascal TITAN X (2016): 6.1
  • GeForce GTX 1080 Ti: 6.1
  • Tesla V100: 7.0

(See here for the full list.)
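If you build for several machines, the value to type at the prompt can be derived from the cards you care about. A tiny sketch using only the entries from the list above; the table and the function name capabilities_for are made up for illustration, so extend the table with your own card:

```python
# Values taken from the list above; extend with your own card as needed.
COMPUTE_CAPABILITY = {
    "Maxwell TITAN X": "5.2",
    "Pascal TITAN X (2016)": "6.1",
    "GeForce GTX 1080 Ti": "6.1",
    "Tesla V100": "7.0",
}


def capabilities_for(cards):
    """Return a sorted, de-duplicated, comma-separated capability string."""
    unique = {COMPUTE_CAPABILITY[card] for card in cards}
    # Sort numerically by (major, minor) rather than as plain strings.
    ordered = sorted(unique, key=lambda cap: tuple(int(p) for p in cap.split(".")))
    return ",".join(ordered)


if __name__ == "__main__":
    print(capabilities_for(["GeForce GTX 1080 Ti", "Tesla V100", "Maxwell TITAN X"]))
```

Listing fewer capabilities should shorten the build, since less GPU code has to be generated.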

Building

Now we can start the TensorFlow build process.

$ bazel build --config=opt -c opt //tensorflow/tools/pip_package:build_pip_package

Totally intuitive, right? :-D This command will build TensorFlow using optimized settings for the current machine architecture.

  • Add -c dbg --strip=never in case you do not want debug symbols to be stripped (e.g. for debugging purposes). Usually, you won't need to add this option.

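  • Add --compilation_mode=dbg to build in debug instead of release mode, i.e. without optimizations. (-c is the short form of --compilation_mode, so -c dbg in the previous bullet already selects this mode.) You shouldn't do this unless you really want to.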

This will take some time. Have a coffee, or two, or three. Cook some dinner. Watch a movie.

Building & installing the Python package

Once the above build step has completed without error, the remainder is now easy. Build the Python package, which the build_pip_package script puts into a specified location.

  $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

And install the built wheel package:

  $ pip install /tmp/tensorflow_pkg/tensorflow-2.3.0-cp38-cp38-linux_x86_64.whl
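The exact wheel filename depends on the TensorFlow and Python versions (cp38 means CPython 3.8), so the name above will differ on other setups. Rather than guessing it, one can look it up. A small sketch, with newest_wheel being a made-up helper name:

```python
from pathlib import Path


def newest_wheel(directory):
    """Return the most recently modified .whl file in directory, or None."""
    wheels = sorted(Path(directory).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    return wheels[-1] if wheels else None


if __name__ == "__main__":
    wheel = newest_wheel("/tmp/tensorflow_pkg")
    print(f"pip install {wheel}" if wheel else "no wheel found")
```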

Testing the installation

Google suggests testing the TensorFlow installation with the following command:

$ python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

This does not make explicit use of CUDA yet, but will emit a whole bunch of initialization messages that can give an indication whether all libraries could be loaded. And it should print that requested sum.
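To check the GPU side more directly, one can ask TensorFlow what it sees. A minimal sketch, assuming TensorFlow 2.1 or later (where tf.config.list_physical_devices is available); the function name gpu_report is made up, and it returns None instead of failing when TensorFlow cannot be imported:

```python
def gpu_report():
    """Return a small dict describing GPU support, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [device.name for device in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    report = gpu_report()
    print(report if report is not None else "TensorFlow could not be imported")
```

An empty "gpus" list with "built_with_cuda" set to True usually points at the driver or library setup rather than at the build itself.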

It worked? Great! Be happy and hope you won't have to build TensorFlow again any time soon...

@RubinXnibu

YES !!!! I have TF 2.2 and TF 1.15.3 both running with GPU and the same CUDA and drivers on the same box!!! OH YEA OH YEA !!!! Thank you for your post!

(To anyone wanting to duplicate this, I had no problems building TF 2.2 with above instructions, and for building TF 1.15.3 it is almost as easy, you just have to take care of a couple issues. You just change the checkout command to "git checkout v1.15.3", install and use bazel 0.26.1, and deal with some compile errors. Yes, the source required fixing four things and one fix in bazel 0.26.1's cache: The four fixes are here:
themikem/tensorflow@cc5645f
And for the cache, edit this file:
~/.cache/bazel/bazel//external/grpc/src/core/lib/gpr/log_linux.cc
and rename the function static long gettid(void) to mygettid(void) to avoid a conflict with system function of the same name.

Thanks again kmhofmann !! )

@jarlostensen

jarlostensen commented Jun 8, 2020

Great post, thanks for this. My experience was that it was "not straight forward" but that's to be expected.
I used pipenv btw.

A couple of things I had to fix were;

  • A header file somewhere (I forgot where, sorry) throws a compile time error if the GCC version > 8 (unbelievable) but this line can simply be commented out.

  • I had no end of trouble getting keras_application and keras_preprocessing installed, tbh, and in the end I had to install it in the "host" environment (i.e. I ran pip install keras) outside of the pipenv I was running. Perhaps my choice of virtenv was the problem, I don't know, but when I did this the dogging "import keras_preprocessing" error went away.

  • You need TIME, at least 12 Gigs of RAM and ideally as many cores as you can find. Otherwise you will wait. For a long time.

  • Consider putting the PATH setup and bazel command line in a shell script, you will rerun frequently so having that handy is good.

  • Be strong.

@stanwin00

Hello, great tutorial! but i keep getting this gcc compile error. Anyway I can fix this?
tensorflow/compiler/mlir/tensorflow/BUILD:175:1: C++ compilation of rule '//tensorflow/compiler/mlir/tensorflow:tensorflow' failed (Exit 1)
gcc: fatal error: Killed signal terminated program cc1plus

@xenon3dfx

xenon3dfx commented Jul 26, 2020

Hello everybody.

First of all, thank you Hoffman and the people who commented here for your contributions. This page is one of the best resources I have found, if not the best.

The GPU in my new Intel i7 setup is an RTX 2060 SUPER. I would like to make CUDA work with TensorFlow. I have been following this guide for 6 days, actually doing nothing else.

  • I went through 4 complete fresh installs of Ubuntu 20.04. I tried different combinations of drivers: Nvidia 440.100, 450.51 and 450.57 , CUDA 10.1, 10.02 and 11.0, cuDNN 7 and 8. And everything installed from apt or from the downloaded packages.

In one of these 4 fresh installs, with driver 450, CUDA 11 and cuDNN 8, I was able to successfully compile TF 2.2. I successfully installed the wheel file with pip, but unfortunately it did not work: I ran different training processes, but when I opened nvidia-smi, the Python process did not show up, and only the CPU resources were used. Therefore, I concluded that my driver/CUDA/cuDNN installation was broken. I was not able to fix it, so I formatted and started again. Unfortunately, I was not able to compile it again. I tried different ways to install the NVIDIA drivers, CUDA and cuDNN, to no avail.

I would like to start again, paying attention to the very basics, in a 5th fresh install. But before formatting, I would like to know if I was doing anything wrong.

  1. Could you please let me know a combination of versions that worked for you in Ubuntu 20.04?
     • NVIDIA drivers:
     • CUDA drivers:
     • cuDNN:
  2. Could you please let me know exactly how you installed the NVIDIA drivers after a fresh install? Same for the CUDA drivers and cuDNN.

If you let me know how you installed these three items, I think I can take over and compile tensorflow again, to see if it works.

Again, thank you so much.

@kmhofmann
Author

Hi @xenon3dfx,
The exact versions of CUDA, cuDNN, and NVIDIA drivers that I used are mentioned above in the article (currently: NVIDIA driver 440.82, CUDA 10.2, cuDNN v7.6.5). As far as I remember I installed the NVIDIA driver through the Ubuntu-native built-in 'Additional Drivers' mechanism in this instance (since it was equal to the latest available version at the time). I installed the CUDA and cuDNN libraries using the files downloaded from the NVIDIA website (i.e. no apt!), as described here.

Note that the last update to the text was made in May, so apparently versions are outdated by now. Unfortunately I don't have time for another thorough update or any further testing at the moment, so this will have to wait until a later point in time (at least August).

If you use the documented (un)installation instructions for driver and CUDA/cuDNN libraries, you shouldn't have to do any fresh install of Ubuntu 20.04. Uninstalling using the official scripts (do consult the NVIDIA docs) will be sufficient to remove all parts, if necessary.

@jucendrero

Thank you for this amazing guide! I didn't expect that building TF from source would be so unintuitive. I've just installed TF 2.3 following your steps and everything worked perfectly fine for me in Ubuntu 20.04 (NVIDIA driver 450.51.05, CUDA 11, cuDNN v8.0.2).

@rsuprun

rsuprun commented Aug 28, 2020

Thank the heavens! This works!! This is the 4th time I've built TF from source, and everytime is an exploration in masochism. Just when I think I've figured out the gotchas, there's something new lying in wait. Thank you so much for such thorough instructions.

@FHermisch

Whoa, got it! For me it worked nearly completely as described, although I started with a slightly different setup: Ubuntu 18.04, Nvidia-Driver 450.51.06, CUDA 11.0.3, cuDNN 8.0.3.33, GCC 7.5.0.
And then there is the last step(!!!): I tried to execute the "python -c "import tensorflow as tf;" from the directory I was in (because I executed all the steps before there) - it failed for me. Some Googling brought up, that Python will mess with the directory structure and the imports and ends up with "ImportError: cannot import name 'function_pb2'".
Anyone who is stuck there: keep calm, change directory, (to ~ or something) and try again ;-)
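A quick way to tell whether you are about to hit this is to check if the current directory looks like the TensorFlow checkout. This is only a heuristic sketch: the function name is made up, and it assumes that WORKSPACE, configure.py and a tensorflow/ directory at the top level identify the source tree:

```python
from pathlib import Path


def looks_like_tf_source_tree(directory="."):
    """Heuristic: a TensorFlow checkout has WORKSPACE, configure.py and tensorflow/."""
    root = Path(directory)
    return (
        (root / "WORKSPACE").is_file()
        and (root / "configure.py").is_file()
        and (root / "tensorflow").is_dir()
    )


if __name__ == "__main__":
    if looks_like_tf_source_tree():
        print("This looks like the TensorFlow source tree; cd elsewhere before importing.")
    else:
        print("Current directory looks fine for 'import tensorflow'.")
```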

@piotrv

piotrv commented Sep 27, 2020

I failed compiling TF 2.3.1 (segmentation faults), but with TF 2.3.0, I got it right.
The build process took ages with my I7-4770, almost 4 hours.
I chose bazelisk instead of bazel directly, as it is more convenient to set up.

 $ bazel --version
 bazel 3.5.0

This version runs fine.

My config file:

 cat  .tf_configure.bazelrc

build --action_env PYTHON_BIN_PATH="/usr/bin/python3"
build --action_env PYTHON_LIB_PATH="/usr/lib/python3.8/site-packages"
build --python_path="/usr/bin/python3"
build --config=xla
build --action_env TF_CUDA_VERSION="11"
build --action_env TF_CUDNN_VERSION="8"
build --action_env TF_NCCL_VERSION=""
build --action_env TF_CUDA_PATHS="/opt/cuda,/usr/lib,/usr/include"
build --action_env CUDA_TOOLKIT_PATH="/opt/cuda"
build --action_env TF_CUDA_COMPUTE_CAPABILITIES="6.1"
build --action_env LD_LIBRARY_PATH="/opt/cuda/extras/CUPTI/lib64:/opt/cuda/extras/CUPTI/lib64"
build --action_env GCC_HOST_COMPILER_PATH="/usr/bin/gcc-9"
build --config=cuda
build:opt --copt=-march=native
build:opt --copt=-Wno-sign-compare
build:opt --host_copt=-march=native
build:opt --define with_default_optimizations=true
test --flaky_test_attempts=3
test --test_size_filters=small,medium
test --test_env=LD_LIBRARY_PATH
test:v1 --test_tag_filters=-benchmark-test,-no_oss,-no_gpu,-oss_serial
test:v1 --build_tag_filters=-benchmark-test,-no_oss,-no_gpu
test:v2 --test_tag_filters=-benchmark-test,-no_oss,-no_gpu,-oss_serial,-v1only
test:v2 --build_tag_filters=-benchmark-test,-no_oss,-no_gpu,-v1only
build --action_env TF_CONFIGURE_IOS="0"
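When comparing such files between machines, it can help to pull out just the environment settings. A small sketch that does this for lines of the form shown above; the function name is made up, and it only handles the space-separated "build --action_env NAME=value" form:

```python
import shlex
from pathlib import Path


def action_env_settings(bazelrc_text):
    """Collect NAME -> value from 'build --action_env NAME="value"' lines."""
    settings = {}
    for line in bazelrc_text.splitlines():
        tokens = shlex.split(line, comments=True)
        if len(tokens) >= 3 and tokens[0] == "build" and tokens[1] == "--action_env":
            name, _, value = tokens[2].partition("=")
            settings[name] = value
    return settings


if __name__ == "__main__":
    rc = Path(".tf_configure.bazelrc")  # written by ./configure in the source root
    if rc.exists():
        for name, value in sorted(action_env_settings(rc.read_text()).items()):
            print(f"{name} = {value!r}")
    else:
        print("no .tf_configure.bazelrc in the current directory")
```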

I always like to (roughly) compare CPU and GPU, so I "embellished" the little Python test a bit:

import time

import tensorflow as tf

cpu_slot = 0
gpu_slot = 0
random_values = [24000, 24000]   # lower values if you get an OOM error
gpus = tf.config.experimental.list_physical_devices('GPU')


def tensor_ops(values):
    return tf.reduce_sum(tf.random.normal(values))


if not gpus:
    print("Sorry, no GPU available")
else:
    # Using CPU at slot 0
    with tf.device('/cpu:' + str(cpu_slot)):
        # Starting a timer
        start = time.monotonic()
        # Doing operations on CPU
        tensor_ops(random_values)
        # Recording how long it took with CPU
        end_CPU = time.monotonic() - start
    # Using the GPU at slot 0
    with tf.device('/gpu:' + str(gpu_slot)):
        # Starting a timer
        start = time.monotonic()
        # Doing operations on GPU
        tensor_ops(random_values)
        # Recording how long it took with GPU
        end_GPU = time.monotonic() - start
        print(f"Executing {random_values[0]} x {random_values[1]} tensor operation:\n")
        print("CPU took:", end_CPU)
        print("GPU took:", end_GPU)
        print(f"\nGPU is {end_CPU / end_GPU:.3f} times faster than CPU !")

Result on my "old" desktop machine with a GTX 1070:

Executing 24000 x 24000 tensor operation:

CPU took: 2.706507972000054
GPU took: 0.11730803399996148

GPU is 23.072 times faster than CPU !
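A single measurement like this also includes one-time setup costs on first use, so the ratio is only a rough indication. One possible refinement is to do an untimed warm-up call and keep the fastest of a few runs. The helper below is a generic sketch (best_time is a made-up name), not part of the script above:

```python
import time


def best_time(fn, repeats=3, warmup=1):
    """Call fn a few times untimed, then return the fastest timed run in seconds."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = time.monotonic()
        fn()
        timings.append(time.monotonic() - start)
    return min(timings)


if __name__ == "__main__":
    # Stand-in workload; replace with the operation you want to measure.
    print(best_time(lambda: sum(range(100_000))))
```

With TensorFlow, the callable should fetch the result, for example by calling .numpy() on the returned tensor, so that (as far as I know) the timer does not stop before the computation has finished.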

@Jack-KW

Jack-KW commented Sep 28, 2020

Thanks! Helped a lot! Very detailed, comfortable to read article, and very easy to implement instructions.

@github-jeff

I think I am quite close. This is an excellent article. CUDA, and cuDNN appears to be installed, but I cannot seem to figure this error out when I am compiling via bazel 3.1.0. I'm hoping I just have a header file in the wrong location.

Loading: 0 packages loaded
currently loading: tensorflow/tools/pip_package
Fetching @local_config_cuda; fetching
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):

@sbatururimi

sbatururimi commented Nov 26, 2020

I was unable to compile either v2.3.1 or v2.3.0 with

  1. CUDA/cuDNN version: 11.1, 8.0.5, driver 455.45.01
  2. RTX2080Ti
    Always getting
ERROR: An error occurred during the fetch of repository 'local_config_cuda':

But seems to work with v2.4.0rc3 (still in progress). Any advice about how I could use 2.3.1 with the above settings?

Much appreciated!

Updates

  1. I was able to build v2.4.0rc3 successfully using the above CUDA/cuDNN and driver 455.46.01
  2. If, like me, you need an up to date version of tensorflow-text also, then follow the guide above and build v2.3.0 with the above drivers (worked for me)

@becageuse

Thanks for this! I'll keep that in mind if I ever need to use tf2 + cuda11.
I needed to use tf2 but my machine has cuda10.2. A friend suggested that I install tensorflow with conda, and it worked like a charm with GPU support! Apparently it automatically installs cudatoolkit 10.1 locally in your conda virtual env.

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

@sbatururimi

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet

@Abilay99

Abilay99 commented Dec 4, 2020

building tensorflow from source how long will it take and how many iterations will I have now 18674/19018

It took me around 2.5h for the whole process

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet

It took me around 13.5h for the whole process

@sbatururimi

Cool!, What devices do you have? I have gpu: gtx 1650ti for notebook cpu: core i5 10th gen ram: 8gb. I started the installation process yesterday, it hasn't finished yet
A PC build almost as a server (or Game PC configuration almost) :), RTX2080Ti, 64gb RAM,

Intel Core i9-9900KS 4 GHz 8-Core Processor

@jediRey

jediRey commented Dec 10, 2020

Hi! Thank you for the complete instructions!
I compiled bazel successfully, however when building tensorflow (this step: bazel build --config=opt -c opt //tensorflow/tools/pip_package:build_pip_package), I get the following error:

ERROR: An error occurred during the fetch of repository 'eigen_archive':
java.io.IOException: Error downloading [https://storage.googleapis.com/mirror.tensorflow.org/gitlab.com/libeigen/eigen/-/archive/386d809bde475c65b7940f290efe80e6a05878c4/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz, https://gitlab.com/libeigen/eigen/-/archive/386d809bde475c65b7940f290efe80e6a05878c4/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz] to /home/rey/.cache/bazel/_bazel_rey/7a2ccf9885a6b6731b7d3780dd19d183/external/eigen_archive/eigen-386d809bde475c65b7940f290efe80e6a05878c4.tar.gz: GET returned 406 Not Acceptable

Any ideas? I'm using ubuntu 20.4, bazel 3.1.0, and I have tried with both tensorflow 2.3.0 & 2.3.1
Thank you,
Rey

@pribadihcr

pribadihcr commented Dec 18, 2020

Whoa, got it! For me it worked nearly completely as described, although I started with a slightly different setup: Ubuntu 18.04, Nvidia-Driver 450.51.06, CUDA 11.0.3, cuDNN 8.0.3.33, GCC 7.5.0.
And then there is the last step(!!!): I tried to execute the "python -c "import tensorflow as tf;" from the directory I was in (because I executed all the steps before there) - it failed for me. Some Googling brought up, that Python will mess with the directory structure and the imports and ends up with "ImportError: cannot import name 'function_pb2'".
Anyone who is stuck there: keep calm, change directory, (to ~ or something) and try again ;-)

Hi, I still get the same error even though I have changed the directory.

@manojec054

Uffff, finally completed. Thanks for the detailed explanation. I had to spend most of the time adjusting Bazel parameters because of low RAM.
Here are a few details for anyone trying to build TensorFlow with 8 GB of RAM.

- Add "--jobs=2  --local_ram_resources 2048" to limit RAM consumption.
- I had to use gcc 7.0.
- If your GPU is a GeForce GTX 1650, use compute capability 7.5. It's not listed at https://developer.nvidia.com/cuda-gpus
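For repeated attempts on a low-memory machine, the limits can be kept in one place instead of being retyped. A small sketch that assembles the build command from this guide with the two flags mentioned above; the function name and parameters are made up, and shlex.join needs Python 3.8+:

```python
import shlex

TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def bazel_build_command(jobs=None, ram_mb=None):
    """Assemble the build command, optionally capping parallel jobs and RAM."""
    command = ["bazel", "build", "--config=opt", "-c", "opt"]
    if jobs is not None:
        command.append(f"--jobs={jobs}")
    if ram_mb is not None:
        command.append(f"--local_ram_resources={ram_mb}")
    command.append(TARGET)
    return command


if __name__ == "__main__":
    # Values from the comment above; tune them to your machine.
    print(shlex.join(bazel_build_command(jobs=2, ram_mb=2048)))
```

Fewer parallel jobs means a slower build, but it is still faster than a build that gets killed halfway for lack of memory.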

@iaroslavragel

Hi @kmhofmann, how do you run all those pip commands? The package python-pip is no longer present in Ubuntu 20.04. So is your pip actually pointing to pip3? Or did you install it from somewhere other than the Ubuntu repo?

@EnziinSystem

Building TensorFlow from source code is a real nightmare.
Also, I have a Core i7 CPU with 8 cores and 16GB RAM, but I halted the build after 6 hours because my computer hung.

@fouvy

fouvy commented Aug 10, 2021

For CUDA 11.0 and later, the include and lib files have been moved to
/usr/local/cuda-11.0/targets/x86_64-linux
so the fastest way to fix build errors such as a missing cuda.h or cusolver_common.h is to do this:

cp -r /usr/local/cuda-11.0/targets/x86_64-linux/lib/* /usr/local/cuda-11.0/lib64/
cp -r /usr/local/cuda-11.0/targets/x86_64-linux/include/* /usr/local/cuda-11.0/include/

@pnheinsohn

For any of those who are having trouble with @local_config_cuda because of some version incompatibility with libcudart.11.x.y, I suggest this Link: Step 3: Errors, where OP changes the version to 11.0 manually.

@mhoangvslev

I made a dockerised tensorflow-compiler that allows you to compile from source with minimal input from the end user
https://github.com/mhoangvslev/tensorflow-compiler

@clockzhong

I followed the instructions, but got the following errors:


WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/tensorflow/runtime/archive/093ed77f7d50f75b376f40a71ea86e08cedb8b80.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found
DEBUG: /home/clock/.cache/bazel/_bazel_clock/3e6fc66f8ed17456d842535e756a7be2/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:118:10: 
Auto-Configuration Warning: 'TMP' environment variable is not set, using 'C:\Windows\Temp' as default
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/rules_cc/archive/081771d4a0e9d7d3aa0eed2ef389fa4700dfb23e.tar.gz failed: class java.io.FileNotFoundException GET returned 404 Not Found


After that, there is no further progress.

@lakpa-tamang9

lakpa-tamang9 commented Dec 22, 2022

I have the following error while installing the tensorflow package at the final step. Please help me resolve this.

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

The error started from here.

Collecting tb-nightly<2.4.0a0,>=2.3.0a0
  Using cached tb_nightly-2.3.0a20200722-py3-none-any.whl (6.8 MB)
Collecting google-pasta>=0.1.8
  Using cached google_pasta-0.2.0-py3-none-any.whl (57 kB)
Collecting absl-py>=0.7.0
  Using cached absl_py-1.3.0-py3-none-any.whl (124 kB)
Collecting scipy==1.4.1
  Using cached scipy-1.4.1.tar.gz (24.6 MB)
  Installing build dependencies ... error
  error: subprocess-exited-with-error
  
  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [327 lines of output]
      Ignoring numpy: markers 'python_version == "3.5" and platform_system != "AIX"' don't match your environment
      Ignoring numpy: markers 'python_version == "3.6" and platform_system != "AIX"' don't match your environment

@g588928812

g588928812 commented Apr 16, 2023

Great guide! Just made TF work with CUDA 12.1 and a RTX 3090. Thanks!

@ilan1987

Thanks, you helped me a lot
