@braingineer
Last active April 28, 2016 02:02
Setup for a PTB language model with Keras (not a working example; missing personal libraries).
# imports needed for this fragment (Keras 1.x, Theano backend)
import theano
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense, TimeDistributed
from keras.regularizers import l2
from keras.optimizers import Adam

# model dimensions, pulled from the igor config object (personal library)
B = self.igor.batch_size
R = self.igor.rnn_size
S = self.igor.max_sequence_len
V = self.igor.vocab_size
E = self.igor.embedding_size
### embedding weights, loaded from GloVe
emb_W = self.igor.embeddings.astype(theano.config.floatX)
## dropout parameters
p_emb = self.igor.p_emb_dropout
p_W = self.igor.p_W_dropout
p_U = self.igor.p_U_dropout
p_dense = self.igor.p_dense_dropout
w_decay = self.igor.weight_decay

M = Sequential()
M.add(Embedding(V, E, batch_input_shape=(B, S), input_length=S,
                W_regularizer=l2(w_decay),
                weights=[emb_W], mask_zero=True, dropout=p_emb))
#for i in range(self.igor.num_lstms):
M.add(LSTM(R, return_sequences=True, dropout_W=p_W, dropout_U=p_U,
           U_regularizer=l2(w_decay), W_regularizer=l2(w_decay)))
M.add(Dropout(p_dense))
## from the dropout RNN paper: keep the same number of active connections
## as the earlier layer, hence scaling R by 1/p_dense (368 -> 736 at p=0.5)
M.add(LSTM(R * int(1 / p_dense), return_sequences=True,
           dropout_W=p_W, dropout_U=p_U))
M.add(Dropout(p_dense))
M.add(TimeDistributed(Dense(V, activation='softmax',
                            W_regularizer=l2(w_decay),
                            b_regularizer=l2(w_decay))))

optimizer = Adam(self.igor.LR, clipnorm=self.igor.max_grad_norm,
                 clipvalue=5.0)
# 'perplexity' is a custom metric from the personal libraries (see note below)
M.compile(loss='categorical_crossentropy', optimizer=optimizer,
          metrics=['accuracy', 'perplexity'])
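
Note: Keras has no built-in 'perplexity' metric; the string above presumably
resolves to one defined in the missing personal libraries. A minimal sketch of
what such a metric might look like against the Keras 1.x backend API
(hypothetical; the real implementation may differ, e.g. in how the zero-mask
is handled):

import keras.backend as K

def perplexity(y_true, y_pred):
    # exp of the mean token-level cross-entropy; note the Keras 1.x
    # backend argument order is (output, target)
    return K.exp(K.mean(K.categorical_crossentropy(y_pred, y_true)))

With a function like this in scope, it would be passed directly, i.e.
metrics=['accuracy', perplexity], unless the personal libraries register it
under the string name in keras.metrics.
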
"""
Configuration used (YAML file):
###########
## set in training
###########
max_sequence_len: -1
vocab_size: 0
###########
## training parameters
###########
num_epochs: 1500
max_grad_norm: 10
LR: 0.0005
max_sentence_length: 100
frequency_cutoff: null
size_cutoff: 10000
###########
## model parameters
###########
embedding_size: 300
rnn_size: 368
batch_size: 32
p_emb_dropout: 0.5
p_W_dropout: 0.5
p_U_dropout: 0.5
p_dense_dropout: 0.5
weight_decay: 1e-8
###########
## file stuff
###########
saving_prefix: ptb_april15
from_checkpoint: False
train_filepath: data/ptb.train.txt
dev_fp: data/ptb.valid.txt
test_fp: data/ptb.test.txt
glove_fp: /research/data/glove/glove.840B.300d.txt
embeddings_file: data/ptb_embeddings_april15.pkl
vocab_file: data/ptb_april15.vocab
###########
## logger stuff
###########
disable_logger: False
"""