@MarkDana
Last active October 28, 2023 11:42
Install NumPy on M1 Max

How do you install numpy on an M1 Max so that it gets the best performance, i.e. accelerated by Apple's vecLib? Here's the answer as of Dec 6, 2021.


Steps

I. Install Miniforge

This ensures that your Python runs natively on arm64 rather than being translated via Rosetta.

  1. Download Miniforge3-MacOSX-arm64.sh.
  2. Run the script, then open another shell:
$ bash Miniforge3-MacOSX-arm64.sh
  3. Create an environment (here named np_veclib) and activate it:
$ conda create -n np_veclib python=3.9
$ conda activate np_veclib
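
One way to confirm that the interpreter in the new environment runs natively is to look at the architecture Python reports. This is only a minimal sketch: the helper name describe_arch is hypothetical, and it assumes that a Python running under Rosetta reports x86_64.

```python
import platform


def describe_arch(machine: str) -> str:
    """Classify the architecture string reported by platform.machine()."""
    if machine == "arm64":
        return "native arm64"
    if machine == "x86_64":
        # Assumption: on Apple silicon this means an Intel build under Rosetta.
        return "x86_64 (translated via Rosetta on Apple silicon)"
    return f"other ({machine})"


print(describe_arch(platform.machine()))
```

Run it with the python from the activated np_veclib environment.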

II. Install NumPy with the BLAS interface specified as vecLib

  1. Compiling numpy requires cython and pybind11, so install them first:
$ conda install cython pybind11
  2. Compile numpy with pip (thanks to @Marijn's answer). Don't use conda install!
$ pip install --no-binary :all: --no-use-pep517 numpy
  3. As an alternative to step 2, build from source:
$ git clone https://github.com/numpy/numpy
$ cd numpy
$ cp site.cfg.example site.cfg
$ nano site.cfg

Edit the copied site.cfg and add the following lines:

[accelerate]
libraries = Accelerate, vecLib

Then build and install:

$ NPY_LAPACK_ORDER=accelerate python setup.py build
$ python setup.py install
  4. After either step 2 or step 3, test whether numpy is using vecLib:
>>> import numpy
>>> numpy.show_config()

The output should then include paths such as /System/Library/Frameworks/vecLib.framework/Headers.
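
The same check can be scripted instead of read by eye. The sketch below captures what numpy.show_config() prints and searches it for vecLib or Accelerate. The helper names are hypothetical, and it assumes show_config() writes to standard output, which may differ between numpy versions.

```python
import contextlib
import io


def mentions_veclib(config_text: str) -> bool:
    """Return True if the text names Apple's vecLib or Accelerate."""
    lowered = config_text.lower()
    return "veclib" in lowered or "accelerate" in lowered


def numpy_config_text() -> str:
    """Capture what numpy.show_config() prints to stdout."""
    import numpy  # imported lazily so mentions_veclib works without numpy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    try:
        print("vecLib/Accelerate detected:", mentions_veclib(numpy_config_text()))
    except ImportError:
        print("numpy is not installed in this environment")
```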

III. Installing other packages with conda

Make conda recognize packages installed by pip:

conda config --set pip_interop_enabled true

This step is required. Otherwise, if you run e.g. conda install pandas, numpy will appear in the "The following packages will be installed" list and be installed again, and the newly installed copy comes from the conda-forge channel and is slow.
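
To check afterwards which tool owns the numpy in the environment, one option is to read the installer recorded in the package metadata. This is a hedged sketch: installer_of is a hypothetical helper, and it assumes the installing tool wrote an INSTALLER file (pip writes "pip"; conda packages typically record "conda").

```python
from importlib import metadata
from typing import Optional


def installer_of(package: str) -> Optional[str]:
    """Return the installer recorded in the package metadata, or None."""
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")  # None when the file is absent
    return text.strip() if text else None


print("numpy installed by:", installer_of("numpy"))
```

If this reports something other than pip after a later conda install, numpy was probably replaced by the conda-forge build.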


Comparisons to other installations:

1. Competitors:

Besides the optimal installation above, I also tried several others:

  • A. np_default: conda create -n np_default python=3.9 numpy
  • B. np_openblas: conda create -n np_openblas python=3.9 numpy blas=*=*openblas*
  • C. np_netlib: conda create -n np_netlib python=3.9 numpy blas=*=*netlib*

Options A, B, and C are installed directly from the conda-forge channel, and numpy.show_config() shows identical results for all three. To see the difference, inspect the output of conda list: for example, openblas packages are installed in B. Note that mkl and blis are not supported on arm64.
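
Since show_config() does not distinguish these environments, the BLAS-related packages can be filtered out of conda list instead. The sketch below assumes conda is on PATH and that conda list --json returns a list of objects with name and build_string fields. The helper names and the sample data are illustrative only.

```python
import json
import subprocess


def blas_related(packages):
    """Keep entries whose name mentions blas or lapack."""
    keywords = ("blas", "lapack")
    return [p for p in packages if any(k in p["name"].lower() for k in keywords)]


def conda_packages(env_name):
    """Parse the output of `conda list -n ENV --json` (requires conda on PATH)."""
    out = subprocess.run(
        ["conda", "list", "-n", env_name, "--json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Illustrative input only, not real output from any environment.
sample = [
    {"name": "numpy", "build_string": "example_build"},
    {"name": "libopenblas", "build_string": "example_build"},
]
print([p["name"] for p in blas_related(sample)])
```

On a real machine, blas_related(conda_packages("np_openblas")) would list the packages that differ between the environments.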

  • D. np_openblas_source: First install openblas with brew install openblas. Then add the [openblas] path /opt/homebrew/opt/openblas to site.cfg and build NumPy from source.
  • M1 and i9-9880H, as reported in this post.
  • i5-6360U: my old 2-core MacBook Pro 2016 13-inch.

2. Benchmarks:

Here I use two benchmarks:

  1. mysvd.py: my SVD benchmark
import time
import numpy as np

np.random.seed(42)  # fixed seed so every installation decomposes the same matrix
a = np.random.uniform(size=(300, 300))
runtimes = 10  # number of timed repetitions

timecosts = []
for _ in range(runtimes):
    s_time = time.time()
    # each repetition performs 100 SVDs of a slightly shifted matrix
    for i in range(100):
        a += 1
        np.linalg.svd(a)
    timecosts.append(time.time() - s_time)

print(f'mean of {runtimes} runs: {np.mean(timecosts):.5f}s')
  2. dario.py: a benchmark script by Dario Radečić, from the post above.

3. Results:

+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
|  sec  | np_veclib | np_default | np_openblas | np_netlib | np_openblas_source | M1 | i9-9880H | i5-6360U |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| mysvd |  1.02300  |   4.29386  |   4.13854   |  4.75812  |      12.57879      |  / |     /    |  2.39917 |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| dario |     21    |     41     |      39     |    323    |         40         | 33 |    23    |    78    |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
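
To read the table as relative slowdowns, each timing can be divided by the np_veclib timing. The snippet below is a small sketch of that arithmetic using the numbers from the table. The helper name slowdown_vs is hypothetical, and columns marked "/" are left out.

```python
# Timings in seconds, copied from the results table.
mysvd = {
    "np_veclib": 1.02300, "np_default": 4.29386, "np_openblas": 4.13854,
    "np_netlib": 4.75812, "np_openblas_source": 12.57879, "i5-6360U": 2.39917,
}
dario = {
    "np_veclib": 21, "np_default": 41, "np_openblas": 39, "np_netlib": 323,
    "np_openblas_source": 40, "M1": 33, "i9-9880H": 23, "i5-6360U": 78,
}


def slowdown_vs(baseline, timings):
    """Return each timing divided by the baseline timing."""
    return {name: t / timings[baseline] for name, t in timings.items()}


for label, timings in (("mysvd", mysvd), ("dario", dario)):
    for name, factor in slowdown_vs("np_veclib", timings).items():
        print(f"{label:5s} {name:20s} {factor:6.2f}x")
```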
@alexshmmy

alexshmmy commented Nov 12, 2022

Thank you for the tips. I followed them, and on my brand-new Mac M1 Max I did the following:

  1. Installed Miniforge3 (bash Miniforge3-MacOSX-arm64.sh)
  2. Initialized a conda base environment (conda init) with Python 3.10
  3. Installed numpy as:
    conda install numpy "libblas=*=*accelerate"

Then I ran the suggested benchmarks:

  1. The script mysvd.py reports: mean of 10 runs: 1.08088s

  2. The script dario.py gives:

Dotted two 4096x4096 matrices in 0.28 s.
Dotted two vectors of length 524288 in 0.11 ms.
SVD of a 2048x1024 matrix in 0.44 s.
Cholesky decomposition of a 2048x2048 matrix in 0.07 s.
Eigendecomposition of a 2048x2048 matrix in 3.83 s.

TOTAL TIME = 19 seconds

@alexshmmy

alexshmmy commented Nov 12, 2022

@placeless @RoyiAvital @MarkDana What about the packages related to numpy, i.e. scipy, pandas, and scikit-learn? Do they also need a specific conda installation with "libblas=*=*accelerate" to work efficiently?

@placeless

@RoyiAvital, openblas results:

Dotted two 4096x4096 matrices in 0.38 s.
Dotted two vectors of length 524288 in 0.07 ms.
SVD of a 2048x1024 matrix in 1.67 s.
Cholesky decomposition of a 2048x2048 matrix in 0.07 s.
Eigendecomposition of a 2048x2048 matrix in 9.43 s.

@alexshmmy, I've never heard of such a switch for scipy/pandas/scikit-learn. Installing numpy first would be a good choice, I think.

@alexshmmy

Thanks @placeless! After conda install numpy "libblas=*=*accelerate" I have now installed scipy and pandas in the same environment. Let me know if there is a benchmark I can use to test whether scipy and pandas also run efficiently.

@vlebert

vlebert commented Nov 18, 2022

So, when can we hope that a simple conda install numpy will do the job for M1 chips?
Do you know what is blocking it?

@QueryType

As of Jan 2023, is it possible to install numpy natively on an M1 Mac mini and get it to use the GPU? I am curious since I plan to purchase one and use it for vector math and machine learning algorithms. Thanks.

@maguzj

maguzj commented Feb 4, 2023

I've just followed the steps on an M1 machine and it worked perfectly: my code runs 60 times faster.
I tried the same on an M2 machine and the gain was smaller: a 20x improvement.

Any ideas on how to translate/update this info for an M2 MacBook?

@fmigas

fmigas commented Sep 19, 2023

It looks like pip install --no-binary :all: --no-use-pep517 numpy does not work anymore.
It returns an error:
ERROR: Disabling PEP 517 processing is invalid: project specifies a build backend of mesonpy in pyproject.toml

What can be done to repair it?

@justin-himself

It looks like pip install --no-binary :all: --no-use-pep517 numpy does not work anymore. It returns an error: ERROR: Disabling PEP 517 processing is invalid: project specifies a build backend of mesonpy in pyproject.toml

What can be done to repair it?

Same issue here.

@CoryKornowicz

You can omit the --no-use-pep517 flag altogether; the install should still work.

@fmigas

fmigas commented Oct 28, 2023

It looks like the solution is very simple. NumPy 1.26 does not accept this flag, but numpy 1.25.2 does.
Append ==1.25.2 to the package name (i.e. numpy==1.25.2) and it will work smoothly.
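
For anyone scripting the install, the pinned command can be assembled as below. This is only a sketch of the command discussed in this thread: the helper name pip_build_command is hypothetical, and the default version is simply the one reported to work above.

```python
import sys


def pip_build_command(version: str = "1.25.2"):
    """Build the pip invocation from the gist, pinned to one numpy version."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-binary", ":all:",
        "--no-use-pep517",
        f"numpy=={version}",
    ]


print(" ".join(pip_build_command()))
# To actually run it: subprocess.run(pip_build_command(), check=True)
```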
