@rishisidhu
Last active August 25, 2020 05:38
Looking at how sequence length depends on the Tokenizer's num_words parameter
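from tensorflow.keras.preprocessing.text import Tokenizer

# Let's add custom sentences
sentences = [
    "One plus one is two!",
    "Two plus two is four!"
]

# Vary num_words to see how many of the most frequent words are kept
for num_w in range(1, 8):
    # Build and fit a fresh tokenizer for each vocabulary limit
    myTokenizer = Tokenizer(num_words=num_w)
    myTokenizer.fit_on_texts(sentences)
    # Only words whose index is below num_words appear in the sequences
    print(num_w, ": ", myTokenizer.texts_to_sequences(sentences))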
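
To see why the sequences grow as num_words increases, here is a minimal pure-Python sketch that imitates what the Tokenizer appears to do. It is not the Keras implementation. The helper names tokenize, build_word_index and to_sequences are hypothetical, and the regex is a simplified stand-in for the Tokenizer's default punctuation filter. The sketch assumes words are ranked by frequency starting at index 1, with ties kept in order of first appearance, and that only indices strictly below num_words survive.

```python
import re
from collections import Counter

sentences = [
    "One plus one is two!",
    "Two plus two is four!",
]

def tokenize(text):
    # Simplified assumption: lowercase and keep runs of letters, digits, apostrophes
    return re.findall(r"[a-z0-9']+", text.lower())

def build_word_index(texts):
    # Count words in order of first appearance
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    # Stable sort by descending count, so ties keep first-appearance order
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    # Indices start at 1 (assumed to match the Tokenizer's word_index)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

def to_sequences(texts, word_index, num_words):
    # Keep only words whose index is strictly below num_words
    return [
        [word_index[w] for w in tokenize(text) if word_index[w] < num_words]
        for text in texts
    ]

word_index = build_word_index(sentences)
print(word_index)  # {'two': 1, 'one': 2, 'plus': 3, 'is': 4, 'four': 5}

for num_words in range(1, 8):
    print(num_words, ":", to_sequences(sentences, word_index, num_words))
```

Under these assumptions, num_words=1 yields empty sequences, each increment admits one more word, and nothing changes past num_words=6 because the vocabulary has only five words. In other words, num_words=n keeps at most the n-1 most frequent words.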