@thomwolf
Created August 8, 2019 18:42
GPT-2 TensorFlow block class
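import tensorflow as tf  # TensorFlow 1.x API (tf.variable_scope, Dimension.value)

# norm, attn and mlp are expected to be defined elsewhere
# (in the GPT-2 source they sit in the same module as block).
def block(x, scope, *, past, hparams):
    with tf.variable_scope(scope):
        nx = x.shape[-1].value  # width of the hidden state
        # Self-attention sublayer: layer norm first, then a residual connection.
        a, present = attn(norm(x, 'ln_1'), 'attn', nx, past=past, hparams=hparams)
        x = x + a
        # Feed-forward sublayer (hidden width nx*4), same pre-norm residual pattern.
        m = mlp(norm(x, 'ln_2'), 'mlp', nx*4, hparams=hparams)
        x = x + m
        return x, present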
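
The data flow of the block above (normalize, apply a sublayer, add the result back to the input, twice) can be sketched without TensorFlow. This is a minimal NumPy illustration, not the GPT-2 implementation: `layer_norm` omits the learned gain and bias that the real `norm` has, and `residual_block`, `toy_attn` and `toy_mlp` are hypothetical stand-ins introduced only to make the example runnable.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance.

    Assumption: no learned gain/bias, unlike the real `norm` helper.
    """
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def residual_block(x, attn_fn, mlp_fn):
    """Pre-norm residual block with the same shape of data flow as `block`."""
    a, present = attn_fn(layer_norm(x))  # sublayer 1 sees the normalized input
    x = x + a                            # residual connection
    m = mlp_fn(layer_norm(x))            # sublayer 2 sees the normalized sum
    x = x + m                            # residual connection
    return x, present


# Hypothetical stand-ins for the real sublayers.
def toy_attn(h):
    # Returns a zero update and echoes its input as a placeholder "present".
    return np.zeros_like(h), h


def toy_mlp(h):
    return 0.5 * h


x = np.arange(12, dtype=np.float64).reshape(1, 3, 4)  # (batch, sequence, features)
out, present = residual_block(x, toy_attn, toy_mlp)
print(out.shape)  # same shape as x
```

Because every sublayer output is added to `x` rather than replacing it, the output always has the same shape as the input, which is what allows such blocks to be stacked.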