How to install numpy on M1 Max, with the most accelerated performance (Apple's vecLib)? Here's the answer as of Dec 6 2021.
Install Miniforge so that your Python runs natively on arm64, not translated via Rosetta.
- Download Miniforge3-MacOSX-arm64.sh.
- Run the script, then open another shell:
$ bash Miniforge3-MacOSX-arm64.sh
- Create an environment (here I use the name np_veclib):
$ conda create -n np_veclib python=3.9
$ conda activate np_veclib
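Once the environment is active, a quick way to confirm the interpreter is native rather than translated is to look at what platform.machine() reports. This is a minimal sketch; the helper name is_native_arm64 is my own, and the check relies on macOS reporting arm64 for native builds and x86_64 for builds running under Rosetta.

```python
import platform
import sys


def is_native_arm64(machine: str) -> bool:
    """Return True when the reported machine type is arm64 (Apple Silicon)."""
    return machine.lower() == "arm64"


if __name__ == "__main__":
    machine = platform.machine()
    print(f"Python {sys.version.split()[0]} reports machine = {machine!r}")
    if is_native_arm64(machine):
        print("Interpreter is running natively on arm64.")
    else:
        # On an M1 this usually means an x86_64 Python running under Rosetta.
        print("Interpreter is not arm64; check which Python/conda you installed.")
```

Run it inside np_veclib; if it does not report arm64, the later build steps will not give you a native numpy.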
- To compile numpy, first install cython and pybind11:
$ conda install cython pybind11
- Compile numpy with pip (thanks to @Marijn's answer). Don't use conda install!
$ pip install --no-binary :all: --no-use-pep517 numpy
- An alternative to the pip command above is to build from source:
$ git clone https://github.com/numpy/numpy
$ cd numpy
$ cp site.cfg.example site.cfg
$ nano site.cfg
Edit the copied site.cfg and add the following lines:
[accelerate]
libraries = Accelerate, vecLib
Then build and install:
$ NPY_LAPACK_ORDER=accelerate python setup.py build
$ python setup.py install
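If you would rather not edit site.cfg by hand, the same two lines can be appended from Python. This is only a sketch of the editing step: the function name ensure_accelerate_section is hypothetical, and it simply appends the section when no uncommented [accelerate] header is present.

```python
from pathlib import Path

# The exact lines from the step above.
ACCELERATE_SECTION = "[accelerate]\nlibraries = Accelerate, vecLib\n"


def ensure_accelerate_section(cfg_path) -> bool:
    """Append the [accelerate] section to site.cfg if it is not already active.

    Returns True when the file was modified, False when nothing was needed.
    """
    path = Path(cfg_path)
    text = path.read_text() if path.exists() else ""
    # Ignore commented-out lines, since the example file is mostly comments.
    active = [
        line.strip()
        for line in text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    if "[accelerate]" in active:
        return False
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(separator + "\n" + ACCELERATE_SECTION)
    return True


if __name__ == "__main__":
    changed = ensure_accelerate_section("site.cfg")
    print("site.cfg updated" if changed else "site.cfg already has [accelerate]")
```

Run it from the numpy checkout after copying site.cfg.example, then continue with the build commands.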
- After either the pip install or the source build, test whether numpy is using vecLib:
>>> import numpy
>>> numpy.show_config()
Info such as /System/Library/Frameworks/vecLib.framework/Headers should then be printed.
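The same check can be scripted instead of read by eye. This is a rough sketch, assuming numpy.show_config() prints to stdout (it did for me); the helper names veclib_lines and numpy_config_text are my own, and the test is a plain substring search for vecLib, so treat it as a hint rather than proof.

```python
import contextlib
import io


def veclib_lines(config_text: str) -> list:
    """Return the lines of a show_config() dump that mention vecLib."""
    return [
        line.strip()
        for line in config_text.splitlines()
        if "veclib" in line.lower()
    ]


def numpy_config_text() -> str:
    """Capture what numpy.show_config() prints as a string."""
    import numpy  # imported lazily so veclib_lines() works without numpy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    hits = veclib_lines(numpy_config_text())
    if hits:
        print("vecLib is mentioned in the build config:")
        for line in hits:
            print("  ", line)
    else:
        print("No vecLib entry found; numpy is probably using another BLAS.")
```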
Make conda recognize packages installed by pip
$ conda config --set pip_interop_enabled true
This must be done; otherwise, if you run e.g. conda install pandas, numpy will appear in the "The following packages will be installed" list and be installed again. The newly installed one comes from the conda-forge channel and is slow.
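To see whether your compiled numpy survived a later conda install, you can ask Python which tool installed it. This is a sketch only: installer_of is a hypothetical helper, it reads the INSTALLER file from the package metadata, and I am assuming conda-installed packages record conda there while pip builds record pip.

```python
from importlib import metadata


def installer_of(package: str):
    """Return the recorded installer for a package, or None if not installed."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER is a one-line file in the .dist-info directory; it may be absent.
    installer = dist.read_text("INSTALLER")
    return installer.strip() if installer else "unknown"


if __name__ == "__main__":
    for name in ("numpy", "pandas"):
        print(f"{name}: installed by {installer_of(name)}")
```

If numpy reports something other than pip after installing pandas, it has most likely been replaced by the conda-forge build.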
Besides the optimal setup above, I also tried several other installations:
- A. np_default: conda create -n np_default python=3.9 numpy
- B. np_openblas: conda create -n np_openblas python=3.9 numpy blas=*=*openblas*
- C. np_netlib: conda create -n np_netlib python=3.9 numpy blas=*=*netlib*
The A, B, and C options above are installed directly from the conda-forge channel. numpy.show_config() will show identical results for all three. To see the difference, examine the output of conda list: e.g. openblas packages are installed in B. Note that mkl and blis are not supported on arm64.
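Since show_config() does not tell these environments apart, here is a small sketch that pulls the BLAS/LAPACK packages out of conda list for a given environment. The helper names are mine; it assumes conda is on your PATH and that the --json output is a list of records with name, version and build_string keys.

```python
import json
import subprocess


def blas_related(packages):
    """Keep (name, version, build_string) for packages that look BLAS/LAPACK related."""
    hints = ("blas", "lapack")
    return [
        (pkg["name"], pkg.get("version", ""), pkg.get("build_string", ""))
        for pkg in packages
        if any(hint in pkg["name"].lower() for hint in hints)
    ]


def conda_list(env_name: str):
    """Return the parsed output of `conda list -n ENV --json`."""
    result = subprocess.run(
        ["conda", "list", "-n", env_name, "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for env in ("np_default", "np_openblas", "np_netlib"):
        print(env)
        for name, version, build in blas_related(conda_list(env)):
            print(f"  {name} {version} {build}")
```

The build string is usually where the openblas or netlib variant shows up.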
- D. np_openblas_source: First install openblas with brew install openblas. Then add the [openblas] path /opt/homebrew/opt/openblas to site.cfg and build Numpy from source.
The benchmark table below also includes results for:
- M1 and i9–9880H in this post.
- My old i5-6360U (2 cores) on a MacBook Pro 2016 13in.
Here I use two benchmarks:
mysvd.py: My SVD decomposition
    import time
    import numpy as np

    np.random.seed(42)
    a = np.random.uniform(size=(300, 300))
    runtimes = 10

    timecosts = []
    for _ in range(runtimes):
        s_time = time.time()
        # Each run performs 100 SVDs of a slightly modified 300x300 matrix.
        for i in range(100):
            a += 1
            np.linalg.svd(a)
        timecosts.append(time.time() - s_time)

    print(f'mean of {runtimes} runs: {np.mean(timecosts):.5f}s')
dario.py: A benchmark script by Dario Radečić from the post above.
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| sec | np_veclib | np_default | np_openblas | np_netlib | np_openblas_source | M1 | i9–9880H | i5-6360U |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| mysvd | 1.02300 | 4.29386 | 4.13854 | 4.75812 | 12.57879 | / | / | 2.39917 |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| dario | 21 | 41 | 39 | 323 | 40 | 33 | 23 | 78 |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
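To make the table easier to read, here is a small sketch that turns the numbers above into slowdown factors relative to np_veclib. The values are copied from the table; the function name slowdown_vs is my own.

```python
# Seconds from the table above (missing M1 / i9 entries for mysvd are omitted).
mysvd_seconds = {
    "np_veclib": 1.02300,
    "np_default": 4.29386,
    "np_openblas": 4.13854,
    "np_netlib": 4.75812,
    "np_openblas_source": 12.57879,
    "i5-6360U": 2.39917,
}
dario_seconds = {
    "np_veclib": 21,
    "np_default": 41,
    "np_openblas": 39,
    "np_netlib": 323,
    "np_openblas_source": 40,
    "M1": 33,
    "i9–9880H": 23,
    "i5-6360U": 78,
}


def slowdown_vs(baseline: str, timings: dict) -> dict:
    """Return each timing divided by the baseline timing (1.0 = same speed)."""
    base = timings[baseline]
    return {name: seconds / base for name, seconds in timings.items()}


if __name__ == "__main__":
    for label, timings in (("mysvd", mysvd_seconds), ("dario", dario_seconds)):
        print(label)
        for name, factor in slowdown_vs("np_veclib", timings).items():
            print(f"  {name:20s} {factor:6.2f}x")
```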
micromamba install numpy "libblas=*=*accelerate" works well.