Skip to content

Instantly share code, notes, and snippets.

@shagunsodhani
shagunsodhani / Batch Normalization.md
Last active July 25, 2023 18:07
Notes for "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" paper

The Batch Normalization paper describes a method to address the various issues related to training of Deep Neural Networks. It makes normalization a part of the architecture itself and reports significant improvements in terms of the number of iterations required to train the network.

Issues With Training Deep Neural Networks

Internal Covariate shift

Covariate shift refers to the change in the input distribution to a learning system. In the case of deep networks, the input to each layer is affected by parameters in all the input layers. So even small changes to the network get amplified down the network. This leads to change in the input distribution to internal layers of the deep network and is known as internal covariate shift.

It is well established that networks converge faster if the inputs have been whitened (ie zero mean, unit variances) and are uncorrelated and internal covariate shift leads to just the opposite.