

@ChunML
Created March 29, 2019 01:59
# Combine the context vector and the LSTM output.
# Before being combined, both have shape (batch_size, 1, rnn_size),
# so squeeze out axis 1 first.
# After concatenation, the result has shape (batch_size, 2 * rnn_size).
lstm_out = tf.concat([tf.squeeze(context, 1), tf.squeeze(lstm_out, 1)], 1)
# The wc layer projects the combined vector back down to (batch_size, rnn_size)
lstm_out = self.wc(lstm_out)
# Finally, ws converts it back to vocabulary space: (batch_size, vocab_size)
logits = self.ws(lstm_out)
return logits, state_h, state_c, alignment
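For context, a minimal sketch of the decoder layers the snippet relies on. The layer definitions below (embedding size, LSTM settings, and the exact activations of wc and ws) are assumptions inferred from the shape comments above, not something shown in this gist:

```python
import tensorflow as tf

class Decoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_size, rnn_size):
        super(Decoder, self).__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_size)
        self.lstm = tf.keras.layers.LSTM(
            rnn_size, return_sequences=True, return_state=True)
        # wc fuses the attention context with the LSTM output:
        # (batch_size, 2 * rnn_size) -> (batch_size, rnn_size)
        self.wc = tf.keras.layers.Dense(rnn_size, activation='tanh')
        # ws maps the fused vector back to vocabulary space:
        # (batch_size, rnn_size) -> (batch_size, vocab_size)
        self.ws = tf.keras.layers.Dense(vocab_size)
```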

entrpn commented Sep 3, 2021

Hello, thanks for writing your tutorial here https://trungtran.io/2019/03/29/neural-machine-translation-with-attention-mechanism/

I do have one question. Should line 11 not also have a softmax function applied?
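
For reference, one common pattern with decoders that return raw logits is to fold the softmax into the loss rather than the model. The sketch below assumes tf.keras.losses.SparseCategoricalCrossentropy with from_logits=True and padding-token masking; whether the tutorial's training loop does exactly this is not shown in this gist:

```python
import tensorflow as tf

# With raw logits from the decoder, the softmax is applied inside the
# loss (from_logits=True) instead of as a layer in the model.
loss_obj = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none')

def loss_func(targets, logits):
    # Mask padding positions (token id 0) so they do not contribute
    # to the averaged loss.
    mask = tf.cast(tf.not_equal(targets, 0), tf.float32)
    loss = loss_obj(targets, logits) * mask
    return tf.reduce_mean(loss)
```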
