Skip to content

Instantly share code, notes, and snippets.

@bgeneto
Last active May 5, 2024 22:15
Show Gist options
  • Save bgeneto/ed3b5dbd41ae28f2adb2c7b87b626c3f to your computer and use it in GitHub Desktop.
Save bgeneto/ed3b5dbd41ae28f2adb2c7b87b626c3f to your computer and use it in GitHub Desktop.
numpy eigen benchmark

Purpose

To benchmark Python (Numpy) default procedure to find all eigenvalues and eigenvectors of a complex (hermitian) matrix and compare the results with Fortran using either MKL and OpenBLAS libraries.

Python code

import time
import numpy as np

# record start time
start = time.time()

# square matrix dimension
n = 4096

# generate our random (real and imaginary) matrix elements
print('01. Generating random numbers for hermitian matrix... ')
A = np.random.rand(n,n)
B = np.random.rand(n,n)

print('02. Building our positive definite hermitian matrix...')
# build complex matrix from real ones
C = A + 1j*B

# ensure its hermitian
C = C * C.conj().T

# positive-definite
np.fill_diagonal(C, C.diagonal() + n)

# compute eigenvalues and eigenvectors
print('03. Computing all eigenvalues+eigenvectors using numpy eigh...')
w, v = np.linalg.eigh(C)

# save end time
end = time.time()
print(f"Numpy eigh took: {end-start:8.3f} seconds")

Bench results

A) numpy + openblas

A.1) build openblas

Custom build of the latest openblas with 8 byte interger support via the following:

wget https://github.com/xianyi/OpenBLAS/releases/download/v0.3.20/OpenBLAS-0.3.20.tar.gz
tar xf OpenBLAS-0.3.20.tar.gz
cd OpenBLAS-0.3.20
export USE_THREAD=1
export DYNAMIC_ARCH=0
export NO_WARMUP=1
export BUILD_RELAPACK=0
export BINARY=64
export INTERFACE64=1
export SYMBOLSUFFIX=64_
export NPY_USE_BLAS_ILP64=1
export NPY_BLAS_ILP64_ORDER=openblas64_
export NPY_LAPACK_ILP64_ORDER=openblas64_
make -j
sudo make install SYMBOLSUFFIX=64_

A.2) build numpy

Configure numpy to use our custom (and latest) openblas:

wget https://raw.githubusercontent.com/numpy/numpy/main/site.cfg.example -O ~/.numpy-site.cfg
vi ~/.numpy-site.cfg

Uncomment and edit the following lines:

[openblas64_]
libraries = openblas64_
library_dirs = /opt/OpenBLAS/lib
include_dirs = /opt/OpenBLAS/include
runtime_library_dirs = /opt/OpenBLAS/lib

Now build our numpy with the built openblas:

cd $HOME
python3 -m venv openblas-numpy.env
source ~/openblas-numpy.env/bin/activate
export NPY_USE_BLAS_ILP64=1
export NPY_BLAS_ILP64_ORDER=openblas64_
export NPY_LAPACK_ILP64_ORDER=openblas64_
pip install numpy --no-binary numpy --force-reinstall

A.3) Openblas results

Results:

(50.134+50.600+50.139)/3 = 50.291 seconds

(B) numpy + mkl

B.1) install intel mkl

Intel mkl installed from oneapi repo using the following commands:

curl -Lo- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | sudo gpg --dearmor -o /usr/share/keyrings/oneapi-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
sudo apt install intel-oneapi-mkl intel-oneapi-mkl-devel -y

B.2) build numpy + mkl

vi ~/.numpy-site.cfg

[mkl]
library_dirs = /opt/intel/oneapi/mkl/2022.1.0/lib/intel64
include_dirs = /opt/intel/oneapi/mkl/2022.1.0/include
libraries = mkl_rt

Now build numpy with mkl:

cd $HOME
python3 -m venv mkl-numpy.env
source ~/mkl-numpy.env/bin/activate
pip install numpy --no-binary numpy --force-reinstall

Check if numpy is really using mkl:

python
>>> import numpy as np
>>> np.__config__.show()

B.3) mkl results

Results:

(9.304+8.974+9.050)/3 = 9.11 seconds (x5.5 faster than openblas)

NOTE 1:

For comparison, intel fortran 2021.6 (ifort) with the same mkl took 10.8 seconds. gfortran 11 with openblas 0.3.20 took 12.6 seconds. So, in this case, Python is ~20% faster than Fortran!

NOTE 2:

You can use w, v = scipy.linalg.eigh(C, driver='evr') with numpy+openblas in order to avoid this performance issue. In this case, openblas will be only marginally slower than mkl.

NOTE 3:

Hardware used for this comparison: Intel i5 12600k with 64GB DDR4 3200MHz memory running Debian Linux testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment