@vardaan123
Last active May 12, 2018 02:17
Library Loading Tutorial

Source: #clusters by Olexa Bilanuik

Important Environment Variables

  • PATH: Ordered, colon (:)-separated list of directories that the shell searches for executables.
  • LD_LIBRARY_PATH: Ordered, colon (:)-separated list of directories. It contributes to the list of directories searched by the dynamic linker for libraries to load, but it is not the only contributor.

Execution of a Program in a Shell

  • If you execute a command cmd arg arg arg... in a shell, your shell will:
    • Search the directories on the PATH, in order, for the first one that contains an executable file or symlink named cmd (if cmd contains a slash, it is used directly as a path and the PATH is not searched),
    • Execute it, passing the given arguments arg arg arg....
  • Typical problems at this stage:
    • bash: /usr/bin/blob: No such file or directory (Your cmd is a path to somewhere that doesn't exist)
    • bash: blob: command not found (Your cmd could not be found in any directory on the PATH)
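The lookup described above can be sketched in a few lines of Python. This is a simplified model and not what Bash literally runs: `find_on_path` is a hypothetical helper name, and the sketch skips empty PATH entries and ignores shell builtins, aliases and the shell's command hash. The standard library's `shutil.which` does a similar search and is shown for comparison.

```python
import os
import shutil

def find_on_path(cmd, path=None):
    """Return the first PATH entry holding an executable file named cmd, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # simplification: empty entries are skipped in this sketch
        candidate = os.path.join(directory, cmd)
        # First match wins, which is why the order of PATH matters.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None  # the shell would report "command not found" here

print(find_on_path("python3"))
print(shutil.which("python3"))  # standard-library equivalent, for comparison
```

If two directories on the PATH both contain a `python3`, swapping their order changes which one this returns.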

Library Loading

Any time a library is dynamically requested by name, such as

    1. at program startup, or
    2. when a Python module with native extensions is imported,

a search is begun for that library. If a library by that name is already loaded, that is what is used. Otherwise, a procedure is followed to search the filesystem, the library is loaded, and a handle to it is returned. This uses LD_LIBRARY_PATH, but that's not the whole story! This process may be recursive if the loaded object has its own dependencies.

Typical problems at this stage:

  • cmd: error while loading shared libraries: libwhatever.so.1: cannot open shared object file: No such file or directory (The cmd or shared library could not find libwhatever.so.1 after the search procedure).
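The same kind of failure can be triggered from Python, since `ctypes.CDLL` asks the dynamic loader to find and load a library by name. The sketch below is only an illustration: `try_load` is a hypothetical helper, the library name is made up so that the search fails, and the exact error text depends on the platform.

```python
import ctypes

def try_load(library_name):
    """Ask the dynamic loader for a library; return (loaded, error_message)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        # Raised when the search procedure finds nothing (or the load fails).
        return False, str(exc)
    return True, None

# Made-up name, assumed not to exist anywhere on the system.
loaded, message = try_load("libsurely_missing_xyz.so.1")
print(loaded, message)
```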

Assuming this succeeds, the dynamic linker attempts to link any symbols (functions, global variables, ...) that the loading object requested from the loaded object. Every symbol has a name, which in rare cases also includes a version.

Typical problems at this stage:

  • Error relocating /usr/lib/libwhatever.so.1: functionname: symbol not found (The library that was loaded doesn't have the symbol functionname in it. Probably because the wrong library was loaded.)
  • /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /usr/lib/libstdc++.so.6) (The library you loaded uses symbol versioning (this is almost always libc.so) and is too old to provide the requested symbol version. Probably because the wrong library was loaded, but it is also possible that your libraries really are too old.)
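A missing symbol can also be observed at run time from Python. The sketch below assumes a POSIX system, where `ctypes.CDLL(None)` returns a handle to what is already loaded into the process. It looks symbols up by name on demand rather than at load time, so it does not reproduce the messages above, but the underlying failure is the same: the name is not in the loaded objects. `has_symbol` is a hypothetical helper.

```python
import ctypes

def has_symbol(library, symbol_name):
    """Return True if symbol_name can be resolved in the loaded library."""
    try:
        getattr(library, symbol_name)  # ctypes resolves symbols on attribute access
    except AttributeError:
        return False  # the loaded object does not export this name
    return True

# Handle to the objects already loaded into this process (POSIX only).
process = ctypes.CDLL(None)
print(has_symbol(process, "strlen"))
print(has_symbol(process, "function_that_does_not_exist_xyz"))
```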

Python Import

  • A failure to find a Python module for import on the sys.path at all is a ModuleNotFoundError.
  • A failure to import a Python module that was found due to (e.g.) a failed shared-library dynamic load/link is an ImportError.
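The two cases can be told apart in code, as long as the more specific exception is caught first: `ModuleNotFoundError` is a subclass of `ImportError`. A minimal sketch, with `classify_import` as a hypothetical helper name:

```python
import importlib

def classify_import(module_name):
    """Import a module and report which kind of failure occurred, if any."""
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Must come first: ModuleNotFoundError is a subclass of ImportError.
        return f"not found on sys.path: {exc.name}"
    except ImportError as exc:
        # The module was found, but loading it failed (e.g. a shared library problem).
        return f"found, but failed to load: {exc}"
    return "ok"

print(classify_import("json"))
print(classify_import("surely_missing_module_xyz"))
```

Note that `exc.name` is the module that could not be found, which may be a dependency of the module you asked for rather than the module itself.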

Useful Command-Line Tools to Debug These Problems:

  • Which file will be selected by Bash for execution can be found by executing which cmd. This matters if you have multiple directories in which python is installed (like the system Python, the lab stack Python, or your own Conda envs).

  • Which libraries are requested (and found) by a shared library or executable loaded in isolation is given by ldd /path/to/cmd or ldd path/to/libwhatever.so.

  • Which symbols a shared library defines, exposes and requests is given by nm path/to/libwhatever.so (but see also nm's -D flag).

  • The dynamic section of a shared library or executable (requested libraries, RPATH, RUNPATH) is given by readelf -d path/to/libwhatever.so (readelf -a is extreme overkill).

💥 Detailed Library Search Procedure (WARNING: BATSHIT CRAZY) 💥

  • The search for libraries uses LD_LIBRARY_PATH, but that's not the only thing that influences the search order.

  • The dynamic loader can also use two optional PATHs that are directly embedded within the executable/shared library: RPATH and RUNPATH. That allows a program to "know" ahead-of-time where to find its libraries.

  • The differences are that 1) if an object has a RUNPATH, its RPATH is ignored, and 2) RUNPATH is searched after LD_LIBRARY_PATH, whereas RPATH is searched before it.

  • Anaconda binaries are specially-built to use a RUNPATH that points into the right place relative to all other Anaconda software.

  • Compute Canada's compilers are trained to insert a RUNPATH to the correct locations (where libc.so.6 is, for instance) in everything they compile.

  • Compute Canada's usual Linux utilities are built with those compilers, so they do find the right libraries and do work.

  • Software downloaded off the internet generally does not have the RUNPATH/RPATH set for this environment, so it will often fail to find the right libraries and will not work.

  • To fix this, Compute Canada has a script called setrpaths.sh that directly patches the binary to use the right RUNPATH.
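Whether a given binary carries an RPATH or a RUNPATH can be read from the output of `readelf -d`. The sketch below parses that text rather than the binary itself; `embedded_search_paths` is a hypothetical helper, and the sample output and its paths are invented for illustration.

```python
import re

def embedded_search_paths(readelf_output):
    """Extract the RPATH and RUNPATH directory lists from `readelf -d` text."""
    found = {"RPATH": [], "RUNPATH": []}
    pattern = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")
    for line in readelf_output.splitlines():
        match = pattern.search(line)
        if match:
            # Like PATH, the embedded value is a colon-separated list.
            found[match.group(1)].extend(p for p in match.group(2).split(":") if p)
    return found

# Invented sample in the layout printed by `readelf -d`; the paths are hypothetical.
sample = """\
 0x0000000000000001 (NEEDED)             Shared library: [libwhatever.so.1]
 0x000000000000001d (RUNPATH)            Library runpath: [/opt/stack/lib:/opt/stack/lib64]
"""
print(embedded_search_paths(sample))
```

On a real file, the input text could come from something like `subprocess.run(["readelf", "-d", path], capture_output=True, text=True).stdout`, assuming readelf is installed.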

The Search Procedure:

```
Unless loading object has RUNPATH:
    RPATH of the loading object,
        then the RPATH of its loader (unless it has a RUNPATH), ...,
        until the end of the chain, which is either the executable
        or an object loaded by dlopen
    Unless executable has RUNPATH:
        RPATH of the executable
LD_LIBRARY_PATH
RUNPATH of the loading object
ld.so.cache
default dirs
```
https://blog.qt.io/blog/2011/10/28/rpath-and-runpath/
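As a rough aid to reasoning about the order above, here is a deliberately simplified model in Python. It considers a single loading object only and ignores the chain of loaders; `search_order` is a hypothetical function, `<ld.so.cache>` is a placeholder string rather than a directory, and the default directories are an assumption that varies between systems.

```python
def search_order(rpath, runpath, ld_library_path, default_dirs=("/lib", "/usr/lib")):
    """Return the directories tried, in order, for one loading object (simplified)."""
    order = []
    if not runpath:
        # RPATH only counts when the object has no RUNPATH.
        order.extend(rpath)
    order.extend(d for d in ld_library_path.split(":") if d)
    order.extend(runpath)          # RUNPATH comes after LD_LIBRARY_PATH
    order.append("<ld.so.cache>")  # placeholder for the cache lookup
    order.extend(default_dirs)     # assumed defaults; these differ per system
    return order

print(search_order(["/opt/a/lib"], [], "/home/me/lib"))
print(search_order(["/opt/a/lib"], ["/opt/b/lib"], "/home/me/lib"))
```

The second call shows the practical consequence: once a RUNPATH is present, the RPATH entry disappears and LD_LIBRARY_PATH takes precedence over the embedded path.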