@yuxuanzhuang
Last active October 14, 2020 15:27

yuxuanzhuang commented Oct 14, 2020

To sum up the code performance after numpy.dot (Line 6):

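| Case | Threads | Time (version_1) | Time (version_2) | Time (version_3) |
|---|---|---|---|---|
| without numpy.dot | 1 | 37070 | 37606 | 36344 |
| without numpy.dot | 2 | 37132 | 37527 | 36547 |
| without numpy.dot | 6 | 37139 | 37747 | 36638 |
| without numpy.dot | 12 | 37205 | 37493 | 36684 |
| with numpy.dot | 1 | 37360 | 36285 | 36369 |
| with numpy.dot | 2 | 38269 | 36373 | 37200 |
| with numpy.dot | 6 | 38038 | 39142 | 36863 |
| with numpy.dot | 12 | 72670 | 67931 | 36739 |
| non numpy operation after numpy.dot | 1 | 201422 | 208773 | 213470 |
| non numpy operation after numpy.dot | 2 | 200999 | 208948 | 212878 |
| non numpy operation after numpy.dot | 6 | 428546 | 216324 | 213259 |
| non numpy operation after numpy.dot | 12 | 440734 | 463529 | 213133 |

Note on the 12-thread rows: not really using 12 threads with MKL.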

Another interesting observation is that the non-numpy operation already performs pretty badly with 6 threads in version_1 (428546 vs. 201422 with 1 thread).
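To make the size of the slowdown easier to compare across versions, here is a minimal sketch that divides each timing by the 1-thread timing of the same version. The numbers are copied from the table above (the time unit is not stated there); `timings` and `slowdown` are names made up for this illustration, not part of the benchmark code.

```python
# Timings copied from the table above, keyed by case, version and thread count.
# The time unit is whatever the benchmark reported; it is not stated in the comment.
timings = {
    "with numpy.dot": {
        "version_1": {1: 37360, 2: 38269, 6: 38038, 12: 72670},
        "version_2": {1: 36285, 2: 36373, 6: 39142, 12: 67931},
        "version_3": {1: 36369, 2: 37200, 6: 36863, 12: 36739},
    },
    "non numpy operation after numpy.dot": {
        "version_1": {1: 201422, 2: 200999, 6: 428546, 12: 440734},
        "version_2": {1: 208773, 2: 208948, 6: 216324, 12: 463529},
        "version_3": {1: 213470, 2: 212878, 6: 213259, 12: 213133},
    },
}


def slowdown(by_threads):
    """Return each timing divided by the 1-thread timing (hypothetical helper)."""
    base = by_threads[1]
    return {n: round(t / base, 2) for n, t in by_threads.items()}


for case, versions in timings.items():
    for version, by_threads in versions.items():
        print(f"{case:36s} {version}: {slowdown(by_threads)}")
```

With these numbers, version_1 is roughly 2.1 times slower at 6 threads for the non-numpy operation, version_2 only shows a comparable slowdown at 12 threads, and version_3 stays flat.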

The test was done with

  • 6-core CPU (hyperthreading enabled)

version_1

  • numpy 1.19.1
  • python 3.8
threadpool_info()
[{'filepath': '/home/scottzhuang/anaconda3/envs/gsoc/lib/python3.8/site-packages/numpy.libs/libopenblasp-r0-ae94cfde.3.9.dev.so',
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'internal_api': 'openblas',
  'version': '0.3.9.dev',
  'num_threads': 12,
  'threading_layer': 'pthreads'},
 {'filepath': '/home/scottzhuang/anaconda3/envs/gsoc/lib/libgomp.so.1.0.0',
  'prefix': 'libgomp',
  'user_api': 'openmp',
  'internal_api': 'openmp',
  'version': None,
  'num_threads': 12},
 {'filepath': '/home/scottzhuang/anaconda3/envs/gsoc/lib/libmkl_rt.so',
  'prefix': 'libmkl_rt',
  'user_api': 'blas',
  'internal_api': 'mkl',
  'version': '2020.0.1',
  'num_threads': 6,
  'threading_layer': 'intel'},
 {'filepath': '/home/scottzhuang/anaconda3/envs/gsoc/lib/libiomp5.so',
  'prefix': 'libiomp',
  'user_api': 'openmp',
  'internal_api': 'openmp',
  'version': None,
  'num_threads': 12}]

version_2

  • numpy 1.20.0.dev0+4ccfbe6
  • python 3.8
threadpool_info()
[{'filepath': '/opt/OpenBLAS/lib/libopenblas_haswellp-r0.3.10.dev.so',
  'prefix': 'libopenblas',
  'user_api': 'blas',
  'internal_api': 'openblas',
  'version': '0.3.10.dev',
  'num_threads': 12,
  'threading_layer': 'pthreads'}]

version_3

  • numpy 1.19
  • python 3.8
threadpool_info()
[{'filepath': '/home/scottzhuang/anaconda3/lib/libmkl_rt.so',
  'prefix': 'libmkl_rt',
  'user_api': 'blas',
  'internal_api': 'mkl',
  'version': '2020.0.0',
  'num_threads': 6,
  'threading_layer': 'intel'},
 {'filepath': '/home/scottzhuang/anaconda3/lib/libiomp5.so',
  'prefix': 'libiomp',
  'user_api': 'openmp',
  'internal_api': 'openmp',
  'version': None,
  'num_threads': 12}]
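The "not really using 12 threads with MKL" note can be read directly off these dumps: the MKL entry reports `num_threads` of 6, while OpenBLAS reports 12. A minimal sketch, assuming the list has the shape printed above (in practice it would come from `threadpool_info()` in the threadpoolctl package); `blas_thread_caps` is a hypothetical helper written for this illustration.

```python
# threadpool_info() output for version_3 and version_2, copied from above.
version_3_info = [
    {'filepath': '/home/scottzhuang/anaconda3/lib/libmkl_rt.so',
     'prefix': 'libmkl_rt', 'user_api': 'blas', 'internal_api': 'mkl',
     'version': '2020.0.0', 'num_threads': 6, 'threading_layer': 'intel'},
    {'filepath': '/home/scottzhuang/anaconda3/lib/libiomp5.so',
     'prefix': 'libiomp', 'user_api': 'openmp', 'internal_api': 'openmp',
     'version': None, 'num_threads': 12},
]
version_2_info = [
    {'filepath': '/opt/OpenBLAS/lib/libopenblas_haswellp-r0.3.10.dev.so',
     'prefix': 'libopenblas', 'user_api': 'blas', 'internal_api': 'openblas',
     'version': '0.3.10.dev', 'num_threads': 12, 'threading_layer': 'pthreads'},
]


def blas_thread_caps(info):
    """Map each loaded BLAS backend to the thread count it reports."""
    return {lib['internal_api']: lib['num_threads']
            for lib in info if lib['user_api'] == 'blas'}


print("version_3:", blas_thread_caps(version_3_info))
print("version_2:", blas_thread_caps(version_2_info))
```

Only entries with `user_api` equal to `'blas'` are kept, since those are the pools numpy.dot would run on; the OpenMP runtime entries are skipped.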

Will check if it's version related.
EDIT: Updated with other clean-installed versions of numpy (OpenBLAS, MKL).
