- https://huzi96.github.io/compression-bench.html
- https://paperswithcode.com/paper/quantisation-and-pruning-for-neural-network
- https://github.com/yoshitomo-matsubara/torchdistill
- https://arxiv.org/pdf/2006.03669.pdf
- https://web.stanford.edu/~jurafsky/slp3/ed3book_jan122022.pdf
Probably useful, but not directly relevant:
- https://paperswithcode.com/task/neural-network-compression
- https://sites.google.com/view/vnn20?pli=1
- https://arxiv.org/pdf/2210.14991.pdf
-
A Survey of Neural Network Compression [arxiv]
Transformer-based architectures that are commonly used in NLP and CV have millions of parameters for each fully-connected layer.
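For a sense of scale, a back-of-the-envelope sketch using BERT-base's published dimensions (hidden size 768, feed-forward size 3072) shows how a single fully-connected block already accounts for millions of parameters:

```python
# Parameter count of one fully-connected (dense) layer: weight matrix + bias vector.
def fc_params(in_features: int, out_features: int) -> int:
    return in_features * out_features + out_features

# BERT-base feed-forward block: 768 -> 3072 -> 768 (published dimensions).
up = fc_params(768, 3072)    # 2,362,368 parameters
down = fc_params(3072, 768)  # 2,360,064 parameters
print(f"One transformer FFN block: {up + down:,} parameters")  # ~4.7M
```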
-
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [arxiv]
we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.3% top-1 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters.
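The scaling rule behind that family is compound scaling: depth, width, and input resolution are scaled jointly by a single coefficient phi. A minimal sketch, using the alpha/beta/gamma constants the paper reports from its grid search on B0 (the released models round the resulting resolutions to hand-picked values):

```python
# Compound scaling from the EfficientNet paper: for a coefficient phi,
# depth scales by alpha**phi, width by beta**phi, resolution by gamma**phi,
# under the constraint alpha * beta**2 * gamma**2 ≈ 2, so FLOPs grow ~2**phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # constants reported for EfficientNet-B0

def compound_scale(phi: int, base_resolution: int = 224):
    depth_mult = ALPHA ** phi                       # multiplier on layer count
    width_mult = BETA ** phi                        # multiplier on channel count
    resolution = round(base_resolution * GAMMA ** phi)  # input image size
    return depth_mult, width_mult, resolution

for phi in range(4):  # B0 through roughly B3
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution ~{r}")
```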
-
Source code: GitHub (tensorflow/tpu, models/official/efficientnet; the Lite variants are linked below)
-
"Smaller Networks" (~5M parameters): https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/lite/README.md