@rmcgibbo
Last active October 14, 2021 00:17
Scientific Python From Source, with MKL

Scientific Python From Source

This document will walk you through compiling your own scientific python distribution from source, without sudo, on a linux machine. The core numpy and scipy libraries will be linked against Intel MKL for maximum performance.

This procedure has been tested with Rocks Cluster Linux 6.0 (Mamba) and CentOS 6.3.

Compiling Python From Source

Most scientific python software has not been ported to python3 yet, so we're going to use python 2.7, the final release series of python2.

To get started, download and compile python:

wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
tar -xzvf Python-2.7.3.tgz
cd Python-2.7.3
./configure --prefix=$HOME/local/python
make
make install
export PATH=$HOME/local/python/bin:$PATH
cd ..
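
Before moving on, it can help to confirm that the shell now picks up the freshly built interpreter rather than the system one. The following is a minimal sketch, assuming the `--prefix` used above; the helper name `under_prefix` is made up for illustration.

```shell
# Hypothetical helper: succeed if path $1 lives under directory $2.
under_prefix() {
  case "$1" in
    "$2"/*) return 0 ;;
    *)      return 1 ;;
  esac
}

# ASSUMPTION: same prefix as passed to ./configure above.
expected_prefix="$HOME/local/python"
# `command -v` prints the path the shell would execute; empty if not found.
python_path="$(command -v python || true)"

if [ -z "$python_path" ]; then
  echo "python not found on PATH"
elif under_prefix "$python_path" "$expected_prefix"; then
  echo "OK: python resolves to $python_path"
else
  echo "WARNING: python resolves to $python_path, not under $expected_prefix"
fi
```

If the warning appears, the `export PATH=...` line probably has not taken effect in the current shell.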

Numpy and Scipy with MKL

The core scientific python packages are numpy and scipy. To get the most performance, we want to build them with the Intel compilers and link them against the Intel Math Kernel Library (MKL), which provides highly optimized BLAS and LAPACK linear algebra routines.

Download The Intel Compilers

The first step is to download the intel compilers. They contain MKL as well -- you don't need to download a separate package for MKL.

  • Intel® Fortran Composer XE 2013 for Linux
  • Intel® C++ Composer XE 2013 for Linux

The compilers are free for academic use. You can sign the academic license and get the download links at this website.

When you sign up, they send you an email containing the links, and you can download files called l_fcompxe_2013.2.146.tgz for the fortran compiler, and l_ccompxe_2013.2.146.tgz for the C/C++ compiler. Untar these files.

The installation is done via an interactive shell script. When it asks, change the install location to /home/<your_user_name>/opt.

cd l_ccompxe_2013.2.146
./install.sh

Install the fortran compiler as well, also using /home/<your_user_name>/opt as the install path

cd ../l_fcompxe_2013.2.146
./install.sh

To make icc, ifort, and the MKL libraries available to other programs, you'll need to add some entries to your ~/.bashrc file:

# add the intel compilers to the PATH
export PATH=$HOME/opt/intel/bin:$PATH

# add MKL and the compiler libs to the path
export LD_LIBRARY_PATH=$HOME/opt/intel/mkl/lib/intel64/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOME/opt/intel/lib/intel64/:$LD_LIBRARY_PATH
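
After editing `~/.bashrc`, a quick check that the compiler drivers resolve can save a failed build later. This is only a sketch; `check_tools` is a hypothetical helper, not something shipped with the Intel installers.

```shell
# Hypothetical helper: report each named tool; return non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if tool_path="$(command -v "$tool")"; then
      echo "found $tool at $tool_path"
    else
      echo "MISSING: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# icc and ifort are the compiler drivers named in the text above.
check_tools icc ifort || echo "open a new shell or re-source ~/.bashrc, then retry"
```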

Installing Numpy

Intel has some good directions for installing numpy and scipy with MKL, available here

The latest version of numpy, 1.7.0, can be downloaded from the python package index at http://pypi.python.org/pypi/numpy/1.7.0.

Inside the numpy-1.7.0 directory, make a file called site.cfg, containing these lines

[mkl]
library_dirs = /home/<your_username>/opt/intel/mkl/lib/intel64
include_dirs = /home/<your_username>/opt/intel/mkl/include
mkl_libs = mkl_rt
lapack_libs =
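
Rather than typing the username by hand, the file can be generated so that the shell fills in the paths. The sketch below assumes the MKL libraries sit in `lib/intel64` under the MKL root (the directory added to `LD_LIBRARY_PATH` earlier) and the headers in `include`; `write_site_cfg` is a hypothetical helper name.

```shell
# Hypothetical helper: write a numpy site.cfg to file $1 for the MKL tree at $2.
# ASSUMPTION: libraries are in <mkl_root>/lib/intel64 and headers in
# <mkl_root>/include; check these against your own install.
write_site_cfg() {
  cat > "$1" <<EOF
[mkl]
library_dirs = $2/lib/intel64
include_dirs = $2/include
mkl_libs = mkl_rt
lapack_libs =
EOF
}

# Write to a temporary file first so the result can be inspected.
tmp_cfg="$(mktemp)"
write_site_cfg "$tmp_cfg" "$HOME/opt/intel/mkl"
cat "$tmp_cfg"
rm -f "$tmp_cfg"
```

Once the output looks right, running `write_site_cfg site.cfg "$HOME/opt/intel/mkl"` from inside the numpy-1.7.0 directory writes the real file.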

As the article describes, you want to add some compiler flags to how icc is invoked. Open numpy/distutils/intelccompiler.py in your editor and change the line that reads cc_exe='icc' (probably line 7) to instead read cc_exe='icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'

Now execute the python install command, specifying the intelem compiler.

python setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install
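
Once the install finishes, it is worth checking that numpy actually picked up MKL. `numpy.show_config()` prints the BLAS/LAPACK settings found at build time. The sketch below is a rough check only: it greps that output for the string "mkl", so an unrelated path containing "mkl" would also match. `mentions_mkl` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the text in $1 mentions MKL (case-insensitive).
mentions_mkl() {
  printf '%s\n' "$1" | grep -qi 'mkl'
}

# Errors are tolerated so the snippet still runs where numpy is absent.
config="$(python -c 'import numpy; numpy.show_config()' 2>/dev/null || true)"

if mentions_mkl "$config"; then
  echo "numpy build config mentions MKL"
else
  echo "numpy build config does not mention MKL (or numpy is not importable)"
fi
```

Run this from outside the numpy source directory, since importing numpy from within its own source tree fails.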

Installing Scipy

The latest version of scipy, currently 0.11.0, can be downloaded from the python package index at http://pypi.python.org/pypi/scipy/0.11.0.

The install command is:

python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install
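
The install line repeats the same two flags for each of the three `setup.py` subcommands, which makes typos easy. As a sketch, the command can be assembled from a single variable and printed for inspection before running it; the variable names here are illustrative only.

```shell
# Flags repeated for every setup.py subcommand in the install line above.
flags="--compiler=intelem --fcompiler=intelem"

# Assemble the command as text first so it can be inspected before running.
cmd="python setup.py config $flags build_clib $flags build_ext $flags install"
echo "$cmd"

# To run it from inside the scipy source directory (not executed here),
# leave $flags unquoted so it splits into two separate arguments:
#   python setup.py config $flags build_clib $flags build_ext $flags install
```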

Installing Python Packages

Now that we've got numpy and scipy installed, we can add in some more packages. One of the key packages, pytables, requires a C library, hdf5, that might not be installed by default.

Compiling hdf5 Manually

hdf5 is a C library for efficient storage of hierarchical datasets. Unfortunately, it's not installed by default on all clusters. We'll install it to $HOME/opt/hdf5.

wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.10-patch1.tar.gz
tar -xzvf hdf5-1.8.10-patch1.tar.gz
cd hdf5-1.8.10-patch1
./configure --prefix=$HOME/opt/hdf5
make
make install

You should add the following lines to your ~/.bashrc file:

# install hdf5 libraries and executables
export LD_LIBRARY_PATH=$HOME/opt/hdf5/lib:$LD_LIBRARY_PATH
export PATH=$HOME/opt/hdf5/bin:$PATH
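
Since `pip install tables` later depends on this prefix, a quick look for the header and library can catch a failed `make install` early. This is a sketch under the assumption that the prefix is `$HOME/opt/hdf5` as above; `looks_like_hdf5` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if prefix $1 contains the HDF5 header and a
# libhdf5 library file (shared or static).
looks_like_hdf5() {
  [ -f "$1/include/hdf5.h" ] || return 1
  for lib in "$1"/lib/libhdf5.*; do
    # If the glob matched nothing, $lib is the literal pattern and -e fails.
    [ -e "$lib" ] && return 0
  done
  return 1
}

prefix="$HOME/opt/hdf5"
if looks_like_hdf5 "$prefix"; then
  echo "HDF5 found under $prefix"
else
  echo "HDF5 header or library missing under $prefix"
fi
```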

Manually Install Python setuptools

We're going to need to get one python package called setuptools manually.

wget http://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.gz
tar -xzvf setuptools-0.6c11.tar.gz
cd setuptools-0.6c11
python setup.py install
cd ..

Install Remaining Packages With pip

Using setuptools we can automatically install some other python packages. Here are commands to paste into the shell that will install a wide range of core scientific python packages.

# First, get pip, a better python package manager
easy_install pip

# Virtualenv helps to manage dependencies via isolated virtual environments
pip install virtualenv
pip install virtualenvwrapper

# For command line interaction, we want the GNU readline wrappers
pip install readline

# Nose is a widely used unit testing library
pip install nose

# Cython is a python/c hybrid language, and numexpr is an inline
# mathematical expression compiler. Both are required by `tables`
pip install cython
pip install numexpr

# `tables` gives us hierarchical data sets. note: we have to tell
# it where to find our hdf5 libraries
export HDF5_DIR=$HOME/opt/hdf5
pip install tables

# Matplotlib is the standard 2d plotting library
pip install matplotlib
 
# IPython is great for interactive data analysis
pip install ipython

# pandas builds on numpy with a powerful numpy-backed DataFrame object for
# smart data analysis. sklearn and statsmodels give us key statistical
# algorithms
pip install pandas
pip install scikit-learn
pip install statsmodels

Testing Your Installation

Here are a few commands that can be executed from the shell to verify your installation:

python -c "import numpy; numpy.test()"
python -c "import scipy; scipy.test()"
python -c "import tables; tables.test()"
nosetests pandas
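
The full test suites can take a while, so a quicker first check is whether each package imports at all. The sketch below assumes the import names listed in the final command (note that scikit-learn is imported as `sklearn`); `check_imports` and the `PYTHON` override are hypothetical conveniences.

```shell
# Interpreter to test; override with PYTHON=/path/to/python if needed.
PYTHON="${PYTHON:-python}"

# Hypothetical helper: try to import each named module and report the result.
# Returns non-zero if any import fails.
check_imports() {
  failed=0
  for mod in "$@"; do
    if "$PYTHON" -c "import $mod" >/dev/null 2>&1; then
      echo "ok      $mod"
    else
      echo "FAILED  $mod"
      failed=1
    fi
  done
  return "$failed"
}

# Import names for the packages installed above.
check_imports numpy scipy tables matplotlib pandas sklearn statsmodels \
  || echo "at least one package failed to import"
```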

Compiling Gromacs

For reference, here are my notes on building gromacs and its dependencies.

OpenMPI w/ gcc

wget http://www.open-mpi.org/software/ompi/v1.6/downloads/openmpi-1.6.3.tar.gz
tar -xzvf openmpi-1.6.3.tar.gz
cd openmpi-1.6.3
# vt is not compatible with recent versions of CUDA
./configure  --disable-vt --prefix=$HOME/opt/openmpi163
make
make install
cd ..

OpenMPI w/ intel

cd openmpi-1.6.3
# vt is not compatible with recent versions of CUDA
./configure  --disable-vt --prefix=$HOME/opt/intel/openmpi/1.6.3/ CC=icc CXX=icpc F77=ifort FC=ifort
make -j4
make install

Add the following to your .bashrc

# install OpenMPI
export LD_LIBRARY_PATH=$HOME/opt/openmpi163/lib:$LD_LIBRARY_PATH
export PATH=$HOME/opt/openmpi163/bin:$PATH
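
The `.bashrc` entries in this document prepend unconditionally, so every time the file is re-sourced the variables grow by another copy. One way around that, as a sketch, is a small helper that only prepends when the directory is not already listed. `prepend_path` is a hypothetical name, and it compares strings literally, so `/a/lib` and `/a/lib/` count as different entries.

```shell
# Hypothetical helper: prepend directory $2 to the colon-separated variable
# named $1, unless it is already listed. Trailing slashes are not normalized.
prepend_path() {
  eval "_pp_current=\${$1:-}"
  case ":$_pp_current:" in
    *":$2:"*) ;;  # already present, leave the variable alone
    *) eval "export $1=\"\$2\${_pp_current:+:\$_pp_current}\"" ;;
  esac
}

prepend_path LD_LIBRARY_PATH "$HOME/opt/openmpi163/lib"
prepend_path PATH "$HOME/opt/openmpi163/bin"
```

The same helper could replace the other `export PATH=...` and `export LD_LIBRARY_PATH=...` lines, as long as it is defined near the top of `~/.bashrc`.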

Gromacs depends on FFTW3

wget http://www.fftw.org/fftw-3.3.3.tar.gz
tar -xzvf fftw-3.3.3.tar.gz
cd fftw-3.3.3
./configure --prefix=$HOME/opt/fftw --enable-float --enable-shared --enable-sse2
make
make install
cd ..

Gromacs 4.5

wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.5.5.tar.gz
tar -xzvf gromacs-4.5.5.tar.gz
cd  gromacs-4.5.5
export CPPFLAGS="-I$HOME/opt/fftw/include"
export LDFLAGS="-L$HOME/opt/fftw/lib"
./configure --prefix=$HOME/opt/gromacs455 --enable-mpi --enable-shared --enable-threads --with-fft=fftw3
make
make install
cd ..

Gromacs 4.6

wget ftp://ftp.gromacs.org/pub/gromacs/gromacs-4.6.tar.gz
tar -xzvf gromacs-4.6.tar.gz
mkdir build
cd build
export PKG_CONFIG_PATH=$HOME/opt/fftw/lib/pkgconfig/:$PKG_CONFIG_PATH
cmake -DGMX_MPI=ON -DCMAKE_INSTALL_PREFIX:PATH=$HOME/opt/gromacs46 ../gromacs-4.6 
make -j4
make install
cd ..
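
The `-j4` above hard-codes four parallel jobs. A sketch of picking the job count from the machine instead, assuming `getconf _NPROCESSORS_ONLN` is available (it is on typical Linux systems) and falling back to a single job otherwise:

```shell
# Number of online CPUs; fall back to 1 if getconf cannot report it.
njobs="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Guard against empty or non-numeric output.
case "$njobs" in
  ''|*[!0-9]*) njobs=1 ;;
esac

echo "would run: make -j$njobs"
```

On a shared cluster login node, using every core may be unwelcome, so a smaller fixed number can be the better choice there.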

Amber12, w/ GPU

Amber isn't open source, so you're going to have to obtain the source code yourself. It comes in two files: Amber12.tar.bz2 and AmberTools12.tar.bz2.

tar -xjvf Amber12.tar.bz2
tar -xjvf AmberTools12.tar.bz2
cd amber12
export AMBERHOME=`pwd`
export MKL_HOME=$HOME/opt/intel/mkl/
export CUDA_HOME=/usr/local/cuda/
export LD_LIBRARY_PATH=$HOME/opt/intel/openmpi/1.6.3/lib/:$LD_LIBRARY_PATH
export PATH=$HOME/opt/intel/openmpi/1.6.3/bin/:$PATH
echo 'Y' | ./configure -cuda -noX11 intel
make install
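
The Amber build depends on several environment variables pointing at real directories, and a wrong path may only surface partway through the build. As a sketch, a preflight check run just before `./configure` could look like this; `check_env_dirs` is a hypothetical helper.

```shell
# Hypothetical helper: for each variable name given, check that it is set and
# names an existing directory. Returns non-zero if any check fails.
check_env_dirs() {
  bad=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      bad=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      bad=1
    else
      echo "$name=$value"
    fi
  done
  return "$bad"
}

check_env_dirs AMBERHOME MKL_HOME CUDA_HOME \
  || echo "fix the variables above before running ./configure"
```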

fasiha commented Jan 15, 2016

+1 Thanks! Is the age of this document the only reason Python 2.7.11 or Numpy 1.10.4 or Scipy 0.16.1 aren't mentioned, and instead much older versions are cited? Or are these older versions carefully calibrated for your compiler setup?