@a7v8x
Created May 5, 2020 18:35
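# Assumes `tokenizer` (a BERT tokenizer from Hugging Face transformers),
# `test_sentence` and `max_length_test` are defined earlier.
bert_input = tokenizer.encode_plus(
    test_sentence,
    add_special_tokens=True,     # add [CLS] and [SEP]
    max_length=max_length_test,  # maximum sequence length passed to BERT
    padding='max_length',        # pad with [PAD] up to max_length; replaces the deprecated pad_to_max_length=True (transformers >= 3.0)
    truncation=True,             # explicitly truncate inputs longer than max_length
    return_attention_mask=True,  # mask is 1 for real tokens and 0 for [PAD] tokens
)
print('encoded', bert_input)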
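
To make the printed dictionary easier to read, here is a minimal sketch of what padding to `max_length` and the attention mask amount to. It is not the tokenizer's actual implementation: `pad_and_mask` is a hypothetical helper, the word ids in the example call are placeholders, and the special-token ids (101, 102, 0) are those of the `bert-base-uncased` vocabulary, so other checkpoints may use different values.

```python
from typing import Dict, List

# Special-token ids as found in the bert-base-uncased vocabulary (assumption:
# other vocabularies may use different ids).
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def pad_and_mask(word_ids: List[int], max_length: int) -> Dict[str, List[int]]:
    """Hypothetical helper that mimics the shape of encode_plus output."""
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    # Reserve two slots for [CLS] and [SEP], truncating the text if needed.
    ids = [CLS_ID] + word_ids[: max_length - 2] + [SEP_ID]
    n_pad = max_length - len(ids)
    return {
        "input_ids": ids + [PAD_ID] * n_pad,             # real tokens, then [PAD]
        "token_type_ids": [0] * max_length,              # single sentence: all zeros
        "attention_mask": [1] * len(ids) + [0] * n_pad,  # 1 = attend, 0 = ignore
    }


# Placeholder word ids standing in for a tokenized two-word sentence.
example = pad_and_mask([7592, 2088], max_length=6)
print("encoded", example)
```

The positions where `attention_mask` is 0 line up exactly with the `[PAD]` entries in `input_ids`, which is how the model is told to ignore the padding.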