
@mingfeima
Last active July 8, 2022 06:09

BKMs to check whether mkl or mkldnn is enabled on PyTorch

PyTorch can be installed via different channels: conda, pip, docker, source code...

By default, MKL and MKL-DNN are enabled, but this might not always be the case, so it is still useful to know how to check it yourself:

1. How to check whether mkl is enabled?

### check where your torch is installed
python -c 'import torch; print(torch.__path__)'

On my machine, it points to the conda env pytorch-cuda, which I created specifically for CUDA runs:

['/home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch']

Next,

cd /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch
cd lib
ldd libtorch.so

This lists all the .so libraries that PyTorch was compiled against:

linux-vdso.so.1 =>  (0x00007ffe5ef06000)
libgomp.so.1 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libgomp.so.1 (0x00007f0216544000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0216312000)
libnvToolsExt.so.1 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libnvToolsExt.so.1 (0x00007f0216108000)
libcudart.so.10.0 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libcudart.so.10.0 (0x00007f0215e8b000)
libcaffe2.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./libcaffe2.so (0x00007f0212c54000)
libcaffe2_gpu.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./libcaffe2_gpu.so (0x00007f01e71c7000)
libc10_cuda.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./libc10_cuda.so (0x00007f01e6fa2000)
libc10.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./libc10.so (0x00007f01e6d5e000)
libm.so.6 => /lib64/libm.so.6 (0x00007f01e6a5c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f01e6858000)
libstdc++.so.6 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libstdc++.so.6 (0x00007f01e6716000)
libgcc_s.so.1 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libgcc_s.so.1 (0x00007f01e6702000)
libc.so.6 => /lib64/libc.so.6 (0x00007f01e633f000)
/lib64/ld-linux-x86-64.so.2 (0x000056504e07a000)
librt.so.1 => /lib64/librt.so.1 (0x00007f01e6136000)
libmkl_intel_lp64.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libmkl_intel_lp64.so (0x00007f01e55e8000)
libmkl_gnu_thread.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libmkl_gnu_thread.so (0x00007f01e3d93000)
libmkl_core.so => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libmkl_core.so (0x00007f01dfc07000)
libcusparse.so.10.0 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libcusparse.so.10.0 (0x00007f01dc198000)
libcurand.so.10.0 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libcurand.so.10.0 (0x00007f01d8030000)
libcufft.so.10.0 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libcufft.so.10.0 (0x00007f01d1b79000)
libcublas.so.10.0 => /home/mingfeim/anaconda3/envs/pytorch-cuda/lib/python3.7/site-packages/torch/lib/./../../../../libcublas.so.10.0 (0x00007f01cd5e0000)

If you see libmkl_intel_lp64.so, libmkl_gnu_thread.so, and libmkl_core.so, your PyTorch has MKL enabled; otherwise it does not.

This is also the way to check which MKL is actually being used in case you have multiple versions installed on your machine, which is particularly useful for Intel employees.
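As a quick programmatic shortcut, recent PyTorch releases also expose a runtime flag for this (assuming your version provides torch.backends.mkl):

python -c 'import torch; print(torch.backends.mkl.is_available())'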

2. How to check whether mkl-dnn is enabled?

python -c 'import torch; a = torch.randn(10); print(a.to_mkldnn().layout)'

On my machine, this prints the tensor's layout, _mkldnn, which indicates that PyTorch is compiled with MKL-DNN:

torch._mkldnn

If MKL-DNN is not enabled, to_mkldnn() will raise a RuntimeError.
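If you prefer a check that does not rely on catching that exception, recent releases expose a similar runtime flag (assuming your version provides torch.backends.mkldnn):

python -c 'import torch; print(torch.backends.mkldnn.is_available())'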

Notes:

PyTorch now ships with gomp (GNU OpenMP) by default. If you want to use iomp (Intel OpenMP) instead, follow use-intel-openmp-library.
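To see which threading backend your build actually uses at runtime, you can also dump the parallelization settings (assuming your version provides torch.__config__.parallel_info):

python -c 'import torch; print(torch.__config__.parallel_info())'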

@nexgus

nexgus commented Feb 24, 2020

Hello mingfeima,

Your material is a good starting point for using optimized PyTorch on Intel CPUs. I ran into something interesting related to the topic you wrote about: How to check whether mkl is enabled?

According to https://software.intel.com/en-us/articles/getting-started-with-intel-optimization-of-pytorch,

Intel MKL-DNN has been integrated into official release of PyTorch by default, thus users can get performance benefit on Intel platform without additional installation steps.

So I installed PyTorch v1.4.0 with CUDA 10.1 support, and print the configuration by executing

$ python -c "import torch; print(torch.__config__.show())"

Yep, as expected, it is built with Intel MKL and MKL-DNN.
Then I followed your instructions to check the .so files used by libtorch.so, and found that there isn't any libmkl*!

$ ldd libtorch.so
linux-vdso.so.1 (0x00007ffcee840000)
libcudart-1b201d85.so.10.1 (0x00007f93b1d6e000)
libgomp-7c85b1e2.so.1 (0x00007f93b1b44000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f93b1925000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f93b171d000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f93b1505000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f93b1301000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f93b0f63000)
libc10_cuda.so (0x00007f93b0d36000)
libnvToolsExt-3965bdd0.so.1 (0x00007f93b0b2c000)
libc10.so (0x00007f93b08d7000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f93b054e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f93b015d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f93fc58e000)

The second check, using to_mkldnn(), verifies it:

$ python -c 'import torch; a = torch.randn(10); print(a.to_mkldnn().layout)'
torch._mkldnn

Sure, it is not a major issue, since in the normal case a customer will not use CUDA with MKL-DNN.
However, isn't it interesting?

Anyway, thanks for your material. It is a very practical guide.

Best regards,
Augustus

@mingfeima (Author)

Sure it is not a major issue since in normal case a customer will not use CUDA with MKL-DNN.

MKL and MKL-DNN don't conflict with CUDA or MIOpen; you can build your PyTorch with both MKL and CUDA.
For MKL, PyTorch searches for the MKL runtime in your environment at build time, and MKL is enabled by default if it is found. So you need to make sure your environment has MKL, e.g. conda install mkl mkl-include.
For MKL-DNN, it is now treated as a submodule in the third_party directory; MKL-DNN has no dependency on MKL and is enabled by default in PyTorch.
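For example, a minimal sketch to confirm what your build picked up (the exact field names in the config string can vary between versions):

import torch
### print only the build-config lines that mention MKL or MKL-DNN
cfg = torch.__config__.show()
print('\n'.join(line for line in cfg.splitlines() if 'MKL' in line.upper()))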

@schheda1

schheda1 commented Nov 1, 2021

Hi @mingfeima,

Thank you for sharing this instructive gist. However, I am running into some errors when I try to use MKL-DNN for training.

I have created a conda environment with Python 3.8.2; I wish to use MKL-DNN (oneDNN rls-v2.1) for training with PyTorch v1.7.0.
PyTorch 1.7.0 is built from source with the following flags (USE_MKLDNN=1, USE_NCCL=0, USE_CUDA=0, USE_NNPACK=0, USE_XNNPACK=0) with gcc-9.2.0.

I run torch.__config__.show() and it shows BLAS=MKL, and for the second step I get torch._mkldnn as the layout.

Next, I try to run a simple training program using Resnet50 from torchvision.models and a dummy input resembling an image. For this, I convert the inputs to mkldnn layout using to_mkldnn() and don't change the model layout. I run into this error --
RuntimeError: Input type (Mkldnntorch.FloatTensor) and weight type (torch.FloatTensor) should be the same

If I convert the model to mkldnn layout using torch.utils.mkldnn.to_mkldnn(model), I run into an assertion error triggered in mkldnn.py, assert(not dense_module.training), as expected.

I may be missing something during the installation. Can you help me resolve this issue?

Please let me know.

PS: I am able to use this build, 1.11.0a0+git39ad7b6, and perform training with the mkldnn layout without any error. But I am expected to use PyTorch 1.7.0.

Thanks in advance!

Regards,
Smeet.

@mingfeima (Author)

@smeetdc, v1.7 doesn't have the MkldnnTensor training feature (only the forward pass is enabled); v1.10 has the full set of training support for MkldnnTensor. I'm afraid v1.7 is not suitable for your request and is a dead end.

For inference, you need to convert both the input and the model, as below:

from torch.utils import mkldnn as mkldnn_utils
### this will recursively convert `weight` and `bias` to MkldnnTensor and do weight prepacking
model = mkldnn_utils.to_mkldnn(model)
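For example, a minimal inference sketch built on the same call (resnet50 and the input shape here are just placeholders):

import torch
import torchvision.models as models
from torch.utils import mkldnn as mkldnn_utils

model = models.resnet50().eval()
model = mkldnn_utils.to_mkldnn(model)

### convert the input as well, then run the forward pass
x = torch.randn(1, 3, 224, 224).to_mkldnn()
with torch.no_grad():
    y = model(x)
### convert the output back to a plain CPU tensor for further use
print(y.to_dense().shape)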

For training, you only need to convert the input (the weights are kept in plain CPU tensor layout, not MKL-DNN layout), as in the sketch below.
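A minimal training sketch of that approach, assuming PyTorch >= 1.10; the model, shapes, and hyperparameters are placeholders, and converting the output with .to_dense() before the loss is an assumption rather than a documented requirement:

import torch
import torchvision.models as models

### the model itself is NOT converted; weights stay plain CPU tensors
model = models.resnet50()
model.train()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

### only the input is converted to the mkldnn layout
x = torch.randn(8, 3, 224, 224).to_mkldnn()
target = torch.randint(0, 1000, (8,))

optimizer.zero_grad()
output = model(x)
loss = criterion(output.to_dense(), target)
loss.backward()
optimizer.step()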

@schheda1

schheda1 commented Nov 2, 2021

Hi @mingfeima, I see. I will work with PyTorch v1.10 to use Mkldnn during training.
Thanks a lot!
