- Create a conda environment named `nncf`:

  ```shell
  conda create -n nncf python=3.8  # your choice of Python version; 3.8 is recommended
  # always activate your environment before the following steps
  conda activate nncf
  ```
- Install NNCF for development:

  ```shell
  git clone https://github.com/openvinotoolkit/nncf
  cd nncf && git checkout 1210723ff2b9b9c6774d8fe6f8734f7edd13724e -b v2.0.0  # this commit corresponds to v2.0.0
  python setup.py develop
  pip3 install -r examples/torch/requirements.txt
  ```
- NNCF provides out-of-the-box examples for compressing models for image classification, object detection, and semantic segmentation. Here, we cover basic usage for image classification with quantization only. Please refer to `nncf/examples/torch` and the official documentation for more usage details.
- NNCF examples and NNCF-adapted scripts primarily work hand in hand with an NNCF config in JSON format. NNCF provides a number of configs for its examples to help users replicate the results tabulated on the landing page of the GitHub repository. We will go over the command lines to replicate a few of those results here.
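  For orientation, a minimal NNCF quantization config might look like the sketch below. The field names (`model`, `pretrained`, `input_info`, `sample_size`, `compression`, `algorithm`) follow my understanding of the NNCF config schema; treat the shipped configs as the authoritative reference for their exact contents.

  ```json
  {
      "model": "resnet50",
      "pretrained": true,
      "input_info": {
          "sample_size": [1, 3, 224, 224]
      },
      "compression": {
          "algorithm": "quantization"
      }
  }
  ```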
- Quantization configs for image classification can be found at `nncf/examples/torch/classification/configs/quantization`.
- Example 1: No quantization; evaluate torchvision's pretrained ResNet-50. Please go through the config to get a sense of the NNCF config format and its specifics. Important note: test mode only evaluates the model; the pretrained weights are loaded because of the key-value pair set in the NNCF config. You should get a top-1 accuracy of 76.15.

  ```shell
  cd nncf/examples/torch/classification
  python main.py \
      --mode test \
      --gpu-id 0 \
      --batch-size 64 \
      --log-dir /tmp/your-log-location \
      --data /path/to/your_imagenet_dataset \
      --config configs/quantization/resnet50_imagenet.json
  ```
- Example 2: We quantize ResNet-50 with uniform 8-bit quantization for weights and activations, using the provided config and no fine-tuning. This means we stay in test mode. With this config, NNCF transforms the model to add quantization layers, carries out calibration to set the saturation thresholds of the quantizers, and finally evaluates the network without fine-tuning. This usage is similar to post-training quantization, since fine-tuning is skipped, except that a typical post-training calibration set is 1-5 samples per class, whereas here we are allowed to increase the number of calibration batches to an arbitrary size. With test mode, you will notice a slight difference from the published results, but still within a percentage point.

  ```shell
  cd nncf/examples/torch/classification
  python main.py \
      --mode test \
      --gpu-id 0 \
      --batch-size 64 \
      --log-dir /tmp/your-log-location \
      --data /path/to/your_imagenet_dataset \
      --config configs/quantization/resnet50_imagenet_int8.json
  ```
- Try train mode to experience the fine-tuning process.
- Try other pretrained models from torchvision.
- Modify the config to run the examples above on CIFAR-10.
- Try other sample configs for quantization, sparsity, or pruning.
- Perhaps try other tasks, e.g., semantic segmentation or object detection.
- Perhaps integrate quantization with custom/third-party models.
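For the last suggestion, integration with a custom model goes through the NNCF Python API rather than the example scripts. The sketch below is a rough outline, not a verified recipe: the import paths (`nncf`, `nncf.torch`) and the helpers `NNCFConfig`, `register_default_init_args`, and `create_compressed_model` reflect the NNCF 2.x API as I recall it and may differ between releases; check the version you installed.

```python
# Sketch: wrapping a custom PyTorch model with NNCF quantization.
# Assumes NNCF v2.x; import paths and helper names may vary between releases.
import torch
from torchvision.models import resnet18

from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

model = resnet18(pretrained=True)

# Minimal config: input shape plus the quantization algorithm section.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"},
})

# Attach a calibration data loader so quantizer ranges can be initialized
# (placeholder: supply your own torch.utils.data.DataLoader here).
calibration_loader = ...
nncf_config = register_default_init_args(nncf_config, calibration_loader)

# Transform the model: quantization layers are inserted and calibration runs.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# The compressed model behaves like any nn.Module: fine-tune or evaluate it.
compressed_model.eval()
with torch.no_grad():
    out = compressed_model(torch.randn(1, 3, 224, 224))
```

From here, fine-tuning proceeds with your usual training loop on `compressed_model`, and `compression_ctrl` exposes compression statistics and export utilities.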