Franck Dernoncourt Franck-Dernoncourt

## Batch Normalization.md

      
        
          
            
              
shagunsodhani / Batch Normalization.md

Last active July 25, 2023 18:07

Notes for the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"
              
          
        
      
        
  
      
The Batch Normalization paper describes a method for addressing several issues that arise when training deep neural networks. It makes normalization part of the network architecture itself and reports a significant reduction in the number of iterations required to train the network.
## Issues With Training Deep Neural Networks

### Internal Covariate Shift

Covariate shift refers to a change in the distribution of the inputs to a learning system. In a deep network, the input to each layer depends on the parameters of all the preceding layers, so even small changes to those parameters are amplified as they propagate through the network. The resulting change in the input distribution of the internal layers is known as internal covariate shift.
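As a rough illustration of this amplification, the sketch below pushes the same inputs through a hypothetical chain of scalar linear layers before and after a small weight update, and compares the spread of the input that each layer receives. The chain, the 5% update, and the helper name `layer_input_stds` are assumptions made for the example; they are not taken from the paper.

```python
import random
import statistics

def layer_input_stds(weights, xs):
    """Std of the input seen by each layer in a chain of scalar linear layers."""
    stds = []
    activations = list(xs)
    for w in weights:
        stds.append(statistics.pstdev(activations))  # what this layer receives
        activations = [w * a for a in activations]   # what it passes on
    return stds

random.seed(0)
# Hypothetical network inputs: 2000 samples from a standard normal.
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]

depth = 10
before = layer_input_stds([1.00] * depth, xs)  # weights before an update
after = layer_input_stds([1.05] * depth, xs)   # every weight nudged by 5%

for k, (b, a) in enumerate(zip(before, after)):
    print(f"layer {k}: input std {b:.3f} -> {a:.3f} (x{a / b:.3f})")
```

In this toy setting the scale of the input to layer k changes by a factor of 1.05^k, so a 5% change per layer becomes roughly a 55% change in the input seen by the last layer (k = 9). Real networks are nonlinear and the effect is less regular, but the dependence on all preceding parameters is the same.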
It is well established that networks converge faster when their inputs are whitened (i.e., have zero mean and unit variance and are decorrelated). Internal covariate shift works against this: the inputs to internal layers keep drifting away from such a distribution during training.
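To make "whitened" concrete, here is a minimal sketch that whitens two correlated features using only the Python standard library. It uses Cholesky whitening, which is one of several possible whitening transforms (PCA and ZCA whitening are others) and is chosen here only because it is short to write out. The sample data and the helper names `mean`, `covariance`, and `whiten_2d` are hypothetical.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def whiten_2d(x1, x2):
    """Cholesky-whiten two features: zero mean, unit variance, zero covariance."""
    m1, m2 = mean(x1), mean(x2)
    # Covariance matrix [[s11, s12], [s12, s22]] = L L^T, L lower-triangular.
    s11, s12, s22 = covariance(x1, x1), covariance(x1, x2), covariance(x2, x2)
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 ** 2)
    # Solve L z = (x - mean) by forward substitution for every sample.
    z1 = [(a - m1) / l11 for a in x1]
    z2 = [((b - m2) - l21 * za) / l22 for b, za in zip(x2, z1)]
    return z1, z2

random.seed(0)
# Hypothetical correlated inputs with non-zero means and unequal scales.
x1 = [random.gauss(5.0, 2.0) for _ in range(1000)]
x2 = [0.8 * a + random.gauss(-3.0, 0.5) for a in x1]
z1, z2 = whiten_2d(x1, x2)

print("means:", mean(z1), mean(z2))
print("variances:", covariance(z1, z1), covariance(z2, z2))
print("covariance:", covariance(z1, z2))
```

Up to floating-point error, the printed means and covariance are zero and the variances are one. Computing the full covariance matrix and its factorization is the expensive part of whitening, and that cost grows quickly with the number of features.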