kobybibas/pytorch_acceleration.md

## pytorch_acceleration.md

      
Summary of the following guide:
https://nvlabs.github.io/eccv2020-mixed-precision-tutorial/files/szymon_migacz-pytorch-performance-tuning-guide.pdf

In the DataLoader (not the dataset class), pin host memory to speed up host-to-GPU copies:

pin_memory=True
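
As a minimal sketch of where this flag goes, the snippet below builds a DataLoader over a toy dataset. The tensor shapes, batch size, and variable names are illustrative assumptions, not taken from the guide. Pinning is only enabled when CUDA is available, so the same code also runs on a CPU-only machine.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset with made-up shapes, for illustration only.
dataset = TensorDataset(torch.randn(64, 3, 8, 8), torch.randint(0, 10, (64,)))

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# pin_memory is an argument of DataLoader, not of the Dataset.
loader = DataLoader(dataset, batch_size=16, shuffle=True, pin_memory=use_cuda)

for images, labels in loader:
    # With pinned host memory, non_blocking=True allows the copy to the GPU
    # to overlap with other work; on CPU these calls are no-ops.
    images = images.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)
```

Pinned memory mainly pays off together with non_blocking=True transfers, which is why the two appear together here.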

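Enable the cuDNN autotuner, which benchmarks the available convolution algorithms for the current device and input shapes and picks the fastest (most useful when input sizes stay fixed):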

torch.backends.cudnn.benchmark = True


Increase the batch size to max out GPU memory. For large-batch training, consider an SGD modification designed for it, such as LARS.


Disable the bias for a convolution that is followed directly by batch norm: the batch norm's mean subtraction cancels the bias, so it only adds parameters.

Instead of

model.zero_grad()

use
for param in model.parameters():
    param.grad = None
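
The two tips above (a bias-free convolution before batch norm, and resetting gradients to None) can be sketched together in a small training loop. The model layout, tensor shapes, and hyperparameters are illustrative assumptions rather than anything prescribed by the guide.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # BatchNorm subtracts the per-channel mean, which cancels a conv bias,
    # so the bias is dropped here.
    nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Random stand-in batch (made-up shapes).
x = torch.randn(4, 3, 8, 8)
y = torch.randint(0, 10, (4,))

for step in range(2):
    # Drop the gradient tensors instead of filling them with zeros;
    # backward() then writes fresh gradients rather than accumulating.
    for param in model.parameters():
        param.grad = None
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```

In PyTorch 1.7 and later, optimizer.zero_grad(set_to_none=True) has the same effect as the explicit loop.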

Add the @torch.jit.script decorator to fuse pointwise CUDA kernels:

@torch.jit.script
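
As a hedged illustration, the sketch below scripts a hand-written GELU, a chain of pointwise operations of the kind that benefits from fusion. The function name fused_gelu is an assumption chosen for this example. Whether kernels are actually fused depends on the device and PyTorch version; on CPU the function simply runs as a scripted function.

```python
import torch

@torch.jit.script
def fused_gelu(x: torch.Tensor) -> torch.Tensor:
    # Unscripted, each pointwise op (div, erf, add, mul) may launch its own
    # kernel; scripting gives the JIT a chance to fuse them.
    return x * 0.5 * (1.0 + torch.erf(x / 1.41421356))

x = torch.randn(1024)
y = fused_gelu(x)
```

The result should match torch.nn.functional.gelu up to floating-point tolerance, which is an easy way to sanity-check a scripted function.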