@kaniblu
Created October 26, 2017 05:14
PyTorch LSTM and GRU Orthogonal Initialization and Positive Bias
import torch.nn.init as I


def init_gru(cell, gain=1):
    cell.reset_parameters()
    # orthogonal initialization of recurrent weights: hh stacks the
    # per-gate recurrent matrices, so initialize one
    # (hidden_size x hidden_size) gate block at a time
    for _, hh, _, _ in cell.all_weights:
        for i in range(0, hh.size(0), cell.hidden_size):
            # I.orthogonal was renamed I.orthogonal_ in PyTorch >= 0.4
            I.orthogonal(hh[i:i + cell.hidden_size], gain=gain)

def init_lstm(cell, gain=1):
    init_gru(cell, gain)
    # positive forget gate bias (Jozefowicz et al., 2015): PyTorch
    # orders LSTM gates as (input, forget, cell, output), so the
    # forget gate bias is the second quarter of each bias vector
    for _, _, ih_b, hh_b in cell.all_weights:
        l = len(ih_b)
        ih_b[l // 4:l // 2].data.fill_(1.0)
        hh_b[l // 4:l // 2].data.fill_(1.0)
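
A minimal usage sketch: the layer sizes below are arbitrary, and both functions assume the default bias=True so that all_weights yields four tensors per layer.

import torch.nn as nn

# hypothetical sizes, chosen only for illustration
gru = nn.GRU(input_size=100, hidden_size=128, num_layers=2)
lstm = nn.LSTM(input_size=100, hidden_size=128, num_layers=2)

init_gru(gru)
init_lstm(lstm)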