import tensorflow as tf

class ScaledDotProductAttentionLayer():
    def calculate_output_weights(self, q, k, v, mask):
        # Raw attention scores: QK^T
        qk = tf.matmul(q, k, transpose_b=True)

        # Scale by sqrt of the key dimension to keep softmax gradients stable
        dk = tf.cast(tf.shape(k)[-1], tf.float32)
        scaled_attention = qk / tf.math.sqrt(dk)

        # Push masked positions toward -inf so softmax assigns them ~0 weight
        if mask is not None:
            scaled_attention += (mask * -1e9)

        # Normalize scores into attention weights, then apply them to the values
        weights = tf.nn.softmax(scaled_attention, axis=-1)
        output = tf.matmul(weights, v)
        return output, weights
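
# A minimal usage sketch. The tensor shapes below (batch=1, seq_len=3,
# depth=4) are illustrative assumptions, not part of the original gist.
layer = ScaledDotProductAttentionLayer()
q = tf.random.uniform((1, 3, 4))
k = tf.random.uniform((1, 3, 4))
v = tf.random.uniform((1, 3, 4))
output, weights = layer.calculate_output_weights(q, k, v, mask=None)
print(output.shape)   # (1, 3, 4) - one attended vector per query position
print(weights.shape)  # (1, 3, 3) - attention distribution over key positions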