@kabirahuja2431
Last active October 8, 2019 14:36
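# Assumed input, reconstructed from the printed output below: a tokenized
# sentence already wrapped in BERT's [CLS] and [SEP] special tokens.
tokens = ['[CLS]', 'i', 'really', 'enjoyed', 'this', 'movie', 'a', 'lot', '.', '[SEP]']

T = 12  # fixed sequence length that every input is padded to
# Append '[PAD]' tokens until the sequence has length T (nothing is added if it is already T or longer).
padded_tokens = tokens + ['[PAD]'] * (T - len(tokens))
print(padded_tokens)
# Out: ['[CLS]', 'i', 'really', 'enjoyed', 'this', 'movie', 'a', 'lot', '.', '[SEP]', '[PAD]', '[PAD]']
# Attention mask: 1 for real tokens, 0 for padding.
attn_mask = [1 if token != '[PAD]' else 0 for token in padded_tokens]
print(attn_mask)
# Out: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]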
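
The padding and masking steps above can be wrapped into one helper. The following is a minimal sketch, not part of the original gist: the function name `pad_and_mask`, its parameters, and the truncation rule for sequences longer than the target length (keep the first `max_len - 1` tokens, then re-append the final token, assumed to be `[SEP]`) are all illustrative assumptions.

```python
def pad_and_mask(tokens, max_len, pad_token='[PAD]'):
    """Return (padded_tokens, attn_mask), each of length max_len.

    Hypothetical helper; assumes max_len >= 2 and that `tokens` already
    starts with '[CLS]' and ends with '[SEP]'.
    """
    # Assumed truncation rule: keep the first max_len - 1 tokens and
    # re-append the last token so the sequence still ends with '[SEP]'.
    if len(tokens) > max_len:
        tokens = tokens[:max_len - 1] + tokens[-1:]
    n_real = len(tokens)
    n_pad = max_len - n_real
    padded = tokens + [pad_token] * n_pad
    # 1 marks a real token, 0 marks padding.
    mask = [1] * n_real + [0] * n_pad
    return padded, mask


tokens = ['[CLS]', 'i', 'really', 'enjoyed', 'this', 'movie', 'a', 'lot', '.', '[SEP]']
padded, mask = pad_and_mask(tokens, 12)      # shorter than max_len: padded
short, short_mask = pad_and_mask(tokens, 6)  # longer than max_len: truncated
print(padded)
print(mask)
print(short)
print(short_mask)
```

Building the mask from the count of real tokens, rather than by comparing each token with `'[PAD]'`, gives the same result here and does not depend on the pad string never appearing in the input.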