@michaelchughes
Last active April 7, 2024 09:48
Fixes for GLIBC errors when installing tensorflow or pytorch on older Red Hat or CentOS cluster environments

Goal

Install working tensorflow or pytorch via standard conda environment workflow.

Basic Setup: Install pytorch in a fresh conda environment

The recommended conda-based install process works smoothly:

$ # Create a fresh environment
$ conda create --name py37_torch python=3.7 --yes

$ # Activate new environment
$ source activate py37_torch

$ # Install tensorflow
$ conda install tensorflow --yes

$ # Install pytorch 
$ conda install pytorch-cpu torchvision-cpu -c pytorch --yes

Roadblock

The gotcha is that when we then try to use the package we just installed, we get a GLIBC error like this:

$ python -c "import torch"
ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by .../site-packages/torch/lib/libshm.so)

Badness! Clearly, the current computing system doesn't have a recent-enough GLIBC. However, if this is a cluster computing system, you often don't have root access and can't easily upgrade the GLIBC.
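
Before going further, it can help to confirm the mismatch. The following is a minimal sketch, assuming `getconf` and a `sort` that supports `-V` are available on the cluster. The helper name `version_ge` is illustrative, and the `required` value is simply the version named in the error above.

```shell
#!/usr/bin/env bash
# Return success if version $1 >= version $2 (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

required="2.14"   # version named in the ImportError above

# getconf prints something like "glibc 2.12"; keep only the number.
system_glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')" || true

if [ -z "$system_glibc" ]; then
    echo "Could not detect the system glibc version"
elif version_ge "$system_glibc" "$required"; then
    echo "System glibc $system_glibc satisfies $required"
else
    echo "System glibc $system_glibc is older than $required"
fi
```

If the last branch fires, the system libc cannot satisfy the package and the userspace workaround below applies.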

Step 1: Install recent copies of glibc and libc++ in userspace

Credit: StackOverflow answer by Theo T.

Step 1a: (NEW FOR PYTHON 3.7) Download and unpack some pre-compiled GLIBC shared libraries

This is for Python 3.7 (works for 3.6 too!) (See an older list for Python 2.7 at bottom of this doc).

$ # Make a folder within the environment to hold useful things
$ mkdir -p /path/to/conda/envs/py37_torch/custom_libs/
$ cd /path/to/conda/envs/py37_torch/custom_libs/

$ # Get libc files (URL verified by MCH on 2019/08/21)
$ wget http://mirrors.kernel.org/ubuntu/pool/main/g/glibc/libc6_2.23-0ubuntu10_amd64.deb
$ wget http://mirrors.kernel.org/ubuntu/pool/main/g/glibc/libc6-dev_2.23-0ubuntu10_amd64.deb

$ # Unpack files into current directory (will create usr/ and lib/ and lib64/ folders)
$ ar p libc6_2.23-0ubuntu10_amd64.deb data.tar.xz | tar xvJ
$ ar p libc6-dev_2.23-0ubuntu10_amd64.deb data.tar.xz | tar xvJ

What have we accomplished? You should have some new folders in your current directory, labeled usr/ and lib/ and lib64/.
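
The `data.tar.xz` member name is specific to these packages; the older 2.17 packages in the deprecated section ship `data.tar.gz` instead. Below is a small sketch for checking which one a package contains before unpacking, using `ar t` to list the members. The function names `deb_data_member` and `tar_flag_for` are illustrative, not part of any tool.

```shell
# Print the name of the data archive inside a .deb (e.g. data.tar.xz).
deb_data_member() {
    ar t "$1" | grep '^data\.tar' | head -n1
}

# Map the archive name to the matching tar decompression flag.
tar_flag_for() {
    case "$1" in
        *.xz) echo "J" ;;
        *.gz) echo "z" ;;
        *) echo "unknown archive type: $1" >&2; return 1 ;;
    esac
}

# Example usage (not run here):
#   deb=libc6_2.23-0ubuntu10_amd64.deb
#   member=$(deb_data_member "$deb")
#   ar p "$deb" "$member" | tar -x"$(tar_flag_for "$member")"
```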

We can verify that before, we had an OLD libc, and now we have a shiny new one!

Check the OLD location of libc.so.6

$ strings /lib64/libc.so.6 | grep GLIBC_2 | tail -n3
GLIBC_2.10
GLIBC_2.11
GLIBC_2.12

NEW version of libc.so.6 in working directory

$ strings lib/x86_64-linux-gnu/libc.so.6 | grep GLIBC_2 | tail -n3
GLIBC_2.18
GLIBC_2.22
GLIBC_2.23
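
Note that `tail -n3` only shows whichever entries happen to come last in the `strings` output. Here is a sketch that reports the highest versioned symbol instead, assuming a `grep` that supports `-a` and `-o` and a `sort` that supports `-V`. The name `max_glibc_version` is illustrative, and `grep -a` stands in for `strings` so that binutils is not needed.

```shell
# Print the highest GLIBC_2.x symbol version mentioned in a shared library.
max_glibc_version() {
    grep -ao 'GLIBC_2\.[0-9][0-9]*' "$1" | sort -uV | tail -n1
}

# Example usage (not run here):
#   max_glibc_version /lib64/libc.so.6
#   max_glibc_version lib/x86_64-linux-gnu/libc.so.6
```

The second call should report a version at least as high as the one named in the ImportError.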

Step 1b: Download and unpack some pre-compiled LIBSTDC++ shared libraries

# Get libstdc++ (URL verified by MCH on 2019/02/18)
wget ftp://195.220.108.108/linux/mageia/distrib/4/x86_64/media/core/updates/libstdc++6-4.8.2-3.2.mga4.x86_64.rpm

# Alternative URL:
# wget http://ftp.riken.jp/Linux/scientific/6.0/x86_64/os/Packages/libstdc++-4.4.4-13.el6.x86_64.rpm

# Unpack into current directory (will add content to lib/ and lib64/ folders)
rpm2cpio libstdc++6-4.8.2-3.2.mga4.x86_64.rpm | cpio -idmv

Step 2: Use patchelf to make your python install use these userspace libraries instead of the system defaults

Credit: StackOverflow answer by Evalds Urtans

Step 2a: Install patchelf into current conda env

# Be sure correct environment is active
$ source activate py37_torch

# Install patchelf
(py37_torch) $ conda install patchelf -c conda-forge --yes

Step 2b: Use attached script to alter the conda env's python executable to use the custom GLIBC libraries

(py37_torch) $ bash rewrite_python_exe_glibc_with_patchelf.sh
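
To check that the patch took effect, `patchelf` can print the interpreter and rpath back out of the binary. This is a minimal sketch; the function name `verify_patch` and the example path are placeholders, not part of the attached script.

```shell
# Succeed only if the binary's ELF interpreter matches the expected loader.
verify_patch() {
    local exe="$1" expected_ld="$2" actual_ld
    actual_ld="$(patchelf --print-interpreter "$exe")" || return 1
    echo "interpreter: $actual_ld"
    echo "rpath:       $(patchelf --print-rpath "$exe")"
    [ "$actual_ld" = "$expected_ld" ]
}

# Example usage (not run here; the path is a placeholder):
#   verify_patch "$(which python)" "$GLIBC_LD_PATH"
```

Pass the same `GLIBC_LD_PATH` string that the script used, since the comparison is a plain string match. After a successful patch, `python -c "import torch"` should no longer raise the GLIBC ImportError.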

-- DEPRECATED --

Step 1a: (OLD FOR PYTHON 2.7) Download and unpack some pre-compiled GLIBC shared libraries

$ # Make a folder within the environment to hold useful things
$ mkdir -p /path/to/conda/envs/py27_torch1.0/custom_libs/
$ cd /path/to/conda/envs/py27_torch1.0/custom_libs/

$ # Get libc files (URL verified by MCH on 2019/02/18)
$ wget https://launchpadlibrarian.net/137699828/libc6_2.17-0ubuntu5_amd64.deb
$ wget https://launchpadlibrarian.net/137699829/libc6-dev_2.17-0ubuntu5_amd64.deb

$ # Unpack files into current directory (will create usr/ and lib/ and lib64/ folders)
$ ar p libc6_2.17-0ubuntu5_amd64.deb data.tar.gz | tar zx
$ ar p libc6-dev_2.17-0ubuntu5_amd64.deb data.tar.gz | tar zx
rewrite_python_exe_glibc_with_patchelf.sh

#!/usr/bin/env bash
# TODO edit this line to specify location of new glibc
export GLIBC_PATH=/cluster/tufts/hugheslab/miniconda2/envs/ape/custom_libs/
export GLIBC_LD_PATH=$GLIBC_PATH/lib/x86_64-linux-gnu/ld-2.23.so
if [[ ! -f "$GLIBC_LD_PATH" ]]; then
    echo "ERROR: Provided GLIBC_LD_PATH not valid: $GLIBC_LD_PATH"
    exit 1
fi
echo "OVERWRITING PYTHON EXECUTABLE:"
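python_exe=$(which python)
echo "$python_exe"
# Only patch interpreters that live inside a conda env (path contains /envs/)
IS_CONDA_ENV=$(python -c "print('$python_exe'.count('/envs/') > 0)")
echo "IS_CONDA_ENV: $IS_CONDA_ENV"
# Use a string comparison here: -ne is arithmetic and would never trigger
if [[ "$IS_CONDA_ENV" != 'True' ]]; then
    echo "ERROR: Current python executable not in conda env. Will not alter to avoid problems."
    exit 1
fi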
CONDA_ENV_LIB=`python -c "print('$python_exe'.replace('/bin/python', '/lib'))"`
echo "CREATING BACKUP PYTHON"
# Append a suffix instead of replacing every 'python' substring in the path
python_tmp_exe="${python_exe}_backup"
cp "$python_exe" "$python_tmp_exe"
echo "$python_tmp_exe"
rpath=$GLIBC_PATH/lib/x86_64-linux-gnu:$CONDA_ENV_LIB:/usr/lib64:/lib64:/lib
echo "CALLING PATCHELF on 'python' binary"
patchelf --set-interpreter "$GLIBC_LD_PATH" --set-rpath "$rpath" "$python_exe"
echo "DONE! patchelf complete"
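
If the patched interpreter misbehaves (for example, with a segmentation fault), the backup made by the script can be copied back. This is a sketch under the assumption that the backup sits next to the executable with a `_backup` suffix, as the script above creates it; `restore_python` is an illustrative name.

```shell
# Copy <exe>_backup back over <exe>; refuse if no backup exists.
restore_python() {
    local exe="$1" backup="${1}_backup"
    if [ ! -f "$backup" ]; then
        echo "ERROR: no backup found at $backup" >&2
        return 1
    fi
    cp "$backup" "$exe"
}

# Example usage (not run here):
#   restore_python "$(which python)"
```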
@jung-youjin

@fangwei18 Any updates? I'm on a different OS but facing similar issues. I presume the segmentation fault is due to the export part.

@ausstein

@fangwei18 @jung-youjin

I fixed this by installing an older version of pytorch and then rerunning the script:
conda install pytorch==1.4.0 torchvision==0.5.0 cpuonly -c pytorch
conda install -c anaconda pip python==3.7
bash rewrite_python_exe_glibc_with_patchelf.sh

I am not sure if you need to go this far back, but pytorch 1.4 is sufficient for me.
I am on OpenSuse 11 and I am so happy to finally have this running. I hope this helps.

@KleinWang

Thank you, it helps me a lot.

@LitMSCTBB

Thanks for the solution; the process ran smoothly. But I'm having an error with opencv-python. Using conda install -c menpo opencv results in this:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: -
Found conflicts! Looking for incompatible packages.                                                                                                                   failed

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - opencv -> python[version='2.7.*|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0|>=2.7,<2.8.0a0']

Your python: python=3.9

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.12=0
  - feature:|@/linux-64::__glibc==2.12=0
  - opencv -> libgcc-ng[version='>=7.2.0'] -> __glibc[version='>=2.17']

Your installed version is: 2.12

Anyone been able to solve this issue?

@hzcheney

@LitMSCTBB I think you should downgrade your python to 3.6 and try again.

@ZhuofanShen

Hi, I just tried your solution to modify my anaconda environment on the university cluster. Things went well, but after running patchelf, the PyTorch package can no longer recognize the CUDA driver:
import torch
print(torch.cuda.device_count()) # --> 0
print(torch.cuda.is_available()) # --> False
print(torch.version.cuda) # --> 11.3
Do you have any idea what is happening? Thank you.

@KleinWang

Hi, thanks for your method. It almost works, but I get the following error. Do you have any idea how to fix it? Thank you very much.

(SG) klein@fawn:GPU$python -c 'import torch'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 198, in <module>
    _load_global_deps()
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 151, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/librt.so.1: symbol __vdso_clock_gettime, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference

@KleinWang

If I import torch twice in a Jupyter notebook, the second import succeeds. However, torch.nn.NLLLoss() then fails:

(SG) klein@fawn:GPU$python
Python 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 198, in <module>
    _load_global_deps()
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 151, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/librt.so.1: symbol __vdso_clock_gettime, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference
>>> import torch
>>> torch.nn.NLLLoss()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/nn/modules/loss.py", line 207, in __init__
    super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/nn/modules/loss.py", line 26, in __init__
    self.register_buffer('weight', weight)
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/nn/modules/module.py", line 308, in register_buffer
    elif not isinstance(name, torch._six.string_classes):
AttributeError: module 'torch' has no attribute '_six'

@zouguangxian

Based on this gist, I succeeded in installing Python 3.8 with conda on CentOS 6. GLIBC 2.17 and patchelf were compiled from source, and rewrite_python_exe_glibc_with_patchelf.sh was rewritten as a one-line command.

https://gist.github.com/zouguangxian/31856f63fe2ac1bad11f404728dfb305

@zwben

zwben commented Jul 28, 2022

Hi, thanks for your method. It almost works but I get the bug as follows. Do you have any idea to fix it? Thank you very much

(SG) klein@fawn:GPU$python -c 'import torch'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 198, in <module>
    _load_global_deps()
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/site-packages/torch/__init__.py", line 151, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/data/L/Brain/klein/anaconda3/envs/SG/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/librt.so.1: symbol __vdso_clock_gettime, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference

I got the same problem. I installed the latest pytorch (1.12.0) which requires glibc > 2.27. I installed glibc 2.31 and libstdc++6-9.3.1, but got this problem when I import torch.
