@bgeneto
Last active June 5, 2020 14:07
openblas-install-instructions
0. Purpose
This document gives step-by-step instructions for building and installing the openblas library (an optimized BLAS and LAPACK implementation) from source. Tested with Ubuntu 18.04 LTS and 20.04 LTS.
1. Install prerequisite software
sudo apt install build-essential g++ gfortran git -y
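Before moving on, it can help to confirm that the tools are actually on the PATH. The following is a minimal sketch; the helper name `missing_tools` is made up for illustration, and the tool list assumes that `make` and `gcc` come from `build-essential`.

```shell
# Print every tool from the argument list that is NOT found on PATH.
# "missing_tools" is a hypothetical helper name, not part of any package.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools gcc g++ gfortran make git)
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined on one line.
    echo "Missing tools:" $missing
else
    echo "All build tools found."
fi
```

If anything is reported as missing, rerun the apt command above before continuing.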
2. Create installation directory
OPENBLAS_DIR=/opt/openblas
sudo mkdir -p "$OPENBLAS_DIR"
3. Build and install openblas library from source
3.1. Download the latest openblas version
cd $HOME
git clone https://github.com/xianyi/OpenBLAS
Now decide which version of openblas you need: non-threaded (sequential), threaded (with native parallelism), or threaded with openmp parallelism.
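The three variants differ only in a handful of make options. The sketch below collects the options used in sections 3.2 to 3.4 in one place; the function name `openblas_variant_flags` is an illustrative assumption, not something provided by openblas.

```shell
# Print the make options that distinguish each variant (taken from 3.2-3.4).
# "openblas_variant_flags" is a hypothetical helper for illustration only.
openblas_variant_flags() {
    case "$1" in
        nonthreaded) echo "USE_THREAD=0 USE_OPENMP=0 USE_LOCKING=1 LIBNAMESUFFIX=nonthreaded" ;;
        threaded)    echo "USE_THREAD=1 LIBNAMESUFFIX=threaded" ;;
        openmp)      echo "USE_THREAD=1 USE_OPENMP=1 LIBNAMESUFFIX=openmp" ;;
        *)           echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

openblas_variant_flags openmp
```

The LIBNAMESUFFIX value ends up in the library file name, which is why the openmp build is later linked with -lopenblas_openmp.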
3.2. Build and install openblas nonthreaded version
cd $HOME/OpenBLAS
make -j DYNAMIC_ARCH=0 CC=gcc FC=gfortran HOSTCC=gcc BINARY=64 INTERFACE=64 \
NO_AFFINITY=1 NO_WARMUP=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 LIBNAMESUFFIX=nonthreaded
sudo make PREFIX=$OPENBLAS_DIR LIBNAMESUFFIX=nonthreaded install
3.3. Build and install openblas threaded version
cd $HOME/OpenBLAS
make clean
export USE_THREAD=1
export NUM_THREADS=64
export DYNAMIC_ARCH=0
export NO_WARMUP=1
export BUILD_RELAPACK=0
export COMMON_OPT="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export CFLAGS="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export FCOMMON_OPT="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export FCFLAGS="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
make -j DYNAMIC_ARCH=0 CC=gcc FC=gfortran HOSTCC=gcc BINARY=64 INTERFACE=64 LIBNAMESUFFIX=threaded
sudo make PREFIX=$OPENBLAS_DIR LIBNAMESUFFIX=threaded install
3.4. Build and install openblas openmp version
cd $HOME/OpenBLAS
make clean
export USE_THREAD=1
export NUM_THREADS=64
export DYNAMIC_ARCH=0
export NO_WARMUP=1
export BUILD_RELAPACK=0
export COMMON_OPT="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export CFLAGS="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export FCOMMON_OPT="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
export FCFLAGS="-O3 -ftree-vectorize -fprefetch-loop-arrays --param prefetch-latency=300"
make -j DYNAMIC_ARCH=0 CC=gcc FC=gfortran HOSTCC=gcc BINARY=64 INTERFACE=64 \
USE_OPENMP=1 LIBNAMESUFFIX=openmp
sudo make PREFIX=$OPENBLAS_DIR LIBNAMESUFFIX=openmp install
PS.: do NOT use -march=native or BUILD_RELAPACK=1; either one produces too many errors in the LAPACK tests (make -j lapack-test).
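After installing, a quick look at the lib directory shows which variants are actually in place. This is a small sketch; `list_openblas_libs` is a hypothetical helper, and it assumes the library files follow the libopenblas_SUFFIX naming implied by LIBNAMESUFFIX.

```shell
# List the openblas library files found under an install prefix.
# Defaults to /opt/openblas when no prefix is given.
list_openblas_libs() {
    libdir="${1:-/opt/openblas}/lib"
    for f in "$libdir"/libopenblas*; do
        # The glob stays unexpanded when nothing matches, so test for existence.
        [ -e "$f" ] && basename "$f"
    done
    return 0
}

list_openblas_libs "${OPENBLAS_DIR:-/opt/openblas}"
```

An empty listing means the install step did not write to that prefix.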
4. Testing
Ideally you should run the tests before installing :-)
make -j lapack-test
cd ./lapack-netlib; python3 ./lapack_testing.py -r -b TESTING
Example of an acceptable output:
                 -->   LAPACK TESTING SUMMARY  <--
SUMMARY             nb test run   numerical error     other error
================    ===========   =================   ================
REAL                1298655       3 (0.000%)          0 (0.000%)
DOUBLE PRECISION    1309007       0 (0.000%)          0 (0.000%)
COMPLEX             768366        0 (0.000%)          0 (0.000%)
COMPLEX16           769178        0 (0.000%)          0 (0.000%)
--> ALL PRECISIONS  4145206       3 (0.000%)          0 (0.000%)
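To check the totals without reading the table by eye, the "ALL PRECISIONS" line can be picked out with awk. This is a sketch that assumes the summary keeps the whitespace-separated layout shown above; `lapack_totals` is a made-up helper name.

```shell
# Read a LAPACK testing summary on stdin and print:
#   <tests run> <numerical errors> <other errors>
# Fields are counted on the "--> ALL PRECISIONS ..." line.
lapack_totals() {
    awk '/ALL PRECISIONS/ { print $4, $5, $7 }'
}

# Demonstration using the totals line from the sample output above.
printf '%s\n' '--> ALL PRECISIONS 4145206 3 (0.000%) 0 (0.000%)' | lapack_totals
# prints: 4145206 3 0
```

In practice the output of lapack_testing.py would be piped into the function instead of the sample line.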
5. Compiling and linking with openblas
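Required linker options (the library name carries the LIBNAMESUFFIX chosen at build time, e.g. -lopenblas_threaded or -lopenblas_openmp):
-L/opt/openblas/lib -lm -lpthread -lgfortran -lopenblas_<suffix>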
Required compiler options for openmp execution:
-I/opt/openblas/include -pthread -fopenmp
Compilation example (using openmp version):
ulimit -s unlimited
export MAX_THREADS=4
gfortran -I/opt/openblas/include -pthread -fopenmp -O3 -funroll-all-loops -fexpensive-optimizations -ftree-vectorize -fprefetch-loop-arrays -floop-parallelize-all \
-ftree-parallelize-loops=$MAX_THREADS -m64 -Wall example.f90 -o example -L/opt/openblas/lib -lm -lpthread -lgfortran -lopenblas_openmp
gcc compilation example (threaded version, not openmp):
gcc -I/opt/openblas/include -pthread -O3 -Wall example.c -o ~/bin/example -L/opt/openblas/lib -lm -lpthread -lgfortran -lopenblas_threaded
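The commands above refer to an example.c that is not included here. As a stand-in, the sketch below writes a tiny test program that multiplies two 2x2 matrices with cblas_dgemm, then compiles and runs it only if the threaded library is present. The program contents are an illustration, not the author's file; the library name libopenblas_threaded assumes the LIBNAMESUFFIX from section 3.3, and LD_LIBRARY_PATH is set because /opt/openblas/lib is assumed not to be in the default loader path.

```shell
# Write a minimal (hypothetical) example.c that computes C = A * B for 2x2 matrices.
cat > example.c <<'EOF'
#include <stdio.h>
#include <cblas.h>

int main(void) {
    /* Row-major 2x2 matrices */
    double A[4] = {1.0, 2.0, 3.0, 4.0};
    double B[4] = {5.0, 6.0, 7.0, 8.0};
    double C[4] = {0.0, 0.0, 0.0, 0.0};

    /* C = 1.0 * A * B + 0.0 * C */
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2, 1.0, A, 2, B, 2, 0.0, C, 2);

    printf("%g %g\n%g %g\n", C[0], C[1], C[2], C[3]);
    return 0;
}
EOF

# Compile and run only when gcc and the threaded library are available.
if [ -e /opt/openblas/lib/libopenblas_threaded.so ] && command -v gcc >/dev/null 2>&1; then
    gcc -I/opt/openblas/include -pthread -O3 -Wall example.c -o example \
        -L/opt/openblas/lib -lm -lpthread -lgfortran -lopenblas_threaded
    LD_LIBRARY_PATH=/opt/openblas/lib ./example
fi
```

With a working build, the product of these two matrices is 19 22 on the first row and 43 50 on the second.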
Environment variables set the maximum number of threads at run time. For example:
export MAX_THREADS=4
export OPENBLAS_NUM_THREADS=$MAX_THREADS
export GOTO_NUM_THREADS=$MAX_THREADS
export OMP_NUM_THREADS=$MAX_THREADS
export MKL_NUM_THREADS=$MAX_THREADS
export BLIS_NUM_THREADS=$MAX_THREADS
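To avoid repeating the exports, they can be wrapped in a small shell function. This is a sketch only; `set_blas_threads` is a made-up name, and the fallback to `nproc` (GNU coreutils) when no argument is given is an assumption, not a recommendation from openblas.

```shell
# Export all thread-count variables listed above in one call.
# Usage: set_blas_threads [N]   (defaults to the number of online CPUs)
set_blas_threads() {
    MAX_THREADS="${1:-$(nproc)}"
    export MAX_THREADS
    export OPENBLAS_NUM_THREADS="$MAX_THREADS"
    export GOTO_NUM_THREADS="$MAX_THREADS"
    export OMP_NUM_THREADS="$MAX_THREADS"
    export MKL_NUM_THREADS="$MAX_THREADS"
    export BLIS_NUM_THREADS="$MAX_THREADS"
}

set_blas_threads 4
echo "OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS"
```

Call the function in the same shell that launches the program so the exported values are inherited.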