Skip to content

Instantly share code, notes, and snippets.

@rishisidhu
Created September 1, 2020 03:14
Show Gist options
  • Save rishisidhu/9c3b10b632b89a14b22367bd4a07d675 to your computer and use it in GitHub Desktop.
Save rishisidhu/9c3b10b632b89a14b22367bd4a07d675 to your computer and use it in GitHub Desktop.
What happens when Python tokenizer comes across new words
# Unseen Words
test_data = [
'Grapes are sour but oranges are sweet',
]
test_seq = myTokenizer.texts_to_sequences(test_data)
print("\nTest Sequence = ", test_seq, " => ", [x for x in myTokenizer.sequences_to_texts_generator(test_seq)])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment