Notes for 'Recurrent Neural Network Regularization' paper

Recurrent Neural Network Regularization

Introduction

  • The paper shows how to correctly apply dropout to LSTMs and how it reduces overfitting in tasks such as language modelling, speech recognition, image caption generation and machine translation.
  • Link to the paper: https://arxiv.org/abs/1409.2329
  • Dropout is a regularisation method that drops out (temporarily removes) units from the network, along with all their incoming and outgoing connections (sketched after this list).
  • Conventional dropout does not work well with RNNs because the recurrence amplifies the dropout noise, which hurts learning.
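
A minimal sketch of the dropout operator D referred to above, under the paper's description that D simply zeroes a random subset of its argument; p is the drop probability, and any train-time rescaling (e.g. the 1/(1 − p) of "inverted dropout") is an implementation convention not specified in these notes:

```latex
D(h) = m \odot h, \qquad m_i \sim \mathrm{Bernoulli}(1 - p)
```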

Regularization

  • The paper proposes to apply dropout to only the non-recurrent connections.
  • The dropout operator corrupts the information carried by some units (but not all), forcing them to perform their intermediate computations more robustly.
  • The information is corrupted by dropout exactly L + 1 times, where L is the number of layers; this count is independent of the number of timesteps the information traverses (see the sketch below).
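
A minimal PyTorch-style sketch of this scheme (class and variable names are illustrative, not from the paper's code): dropout is applied to the vertical, layer-to-layer inputs and to the final output, while the recurrent state (h, c) enters each LSTM cell untouched.

```python
import torch
import torch.nn as nn


class NonRecurrentDropoutLSTM(nn.Module):
    """Stacked LSTM with dropout on non-recurrent connections only (sketch)."""

    def __init__(self, input_size, hidden_size, num_layers, dropout=0.5):
        super().__init__()
        self.cells = nn.ModuleList(
            [nn.LSTMCell(input_size if l == 0 else hidden_size, hidden_size)
             for l in range(num_layers)]
        )
        self.drop = nn.Dropout(dropout)

    def forward(self, inputs):
        # inputs: (seq_len, batch, input_size)
        batch = inputs.size(1)
        h = [cell.weight_hh.new_zeros(batch, cell.hidden_size) for cell in self.cells]
        c = [state.clone() for state in h]
        outputs = []
        for x_t in inputs:                          # loop over timesteps
            layer_in = x_t
            for l, cell in enumerate(self.cells):
                # Dropout only on the vertical (non-recurrent) input;
                # the recurrent state (h[l], c[l]) is passed in unchanged.
                h[l], c[l] = cell(self.drop(layer_in), (h[l], c[l]))
                layer_in = h[l]                     # feeds the layer above
            outputs.append(self.drop(layer_in))     # dropout before the decoder/softmax
        return torch.stack(outputs), (h, c)


# Illustrative usage: (seq_len=35, batch=20, input_size=128) -> (35, 20, 256)
model = NonRecurrentDropoutLSTM(input_size=128, hidden_size=256, num_layers=2)
y, _ = model(torch.randn(35, 20, 128))
```

For comparison, the dropout argument of torch.nn.LSTM applies dropout between stacked layers only, so the extra dropout on the first layer's input and the last layer's output shown here would have to be added separately.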

Observation

  • In the context of language modelling, image caption generation, speech recognition and machine translation, dropout enables training larger networks and improves generalisation, e.g. lower test perplexity and higher frame accuracy.