@ferrygun
Created June 8, 2020 10:21
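from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# vocab_size, oov_tok, max_length, padding_type, trunc_type and the article lists are assumed to be defined earlier.
# Build the vocabulary from the training articles only; words outside it map to oov_tok.
tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(train_articles)
word_index = tokenizer.word_index
# Convert each article to a list of word indices, then pad or truncate it to max_length.
train_sequences = tokenizer.texts_to_sequences(train_articles)
train_padded = pad_sequences(train_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)
# Reuse the tokenizer fitted on the training data for the validation set.
validation_sequences = tokenizer.texts_to_sequences(validation_articles)
validation_padded = pad_sequences(validation_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type)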
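To make the effect of the steps above concrete, here is a minimal pure-Python sketch of the same idea: index words by frequency, map unseen words to an out-of-vocabulary token, then pad or truncate to a fixed length. It is a simplified stand-in, not the Keras implementation (it skips punctuation filtering, for instance). The helper names `build_word_index`, `to_sequences` and `pad`, the sample sentences, and the values `"<OOV>"`, `"post"`, `100` and `4` are all assumptions made for illustration; the gist does not show what its own settings are.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Rank words by frequency; index 0 is left for padding, 1 for the OOV token."""
    counts = Counter(word for text in texts for word in text.lower().split())
    word_index = {oov_token: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index

def to_sequences(texts, word_index, num_words, oov_token="<OOV>"):
    """Replace each word with its index; unknown or too-rare words get the OOV index."""
    oov_id = word_index[oov_token]
    sequences = []
    for text in texts:
        seq = []
        for word in text.lower().split():
            idx = word_index.get(word)
            seq.append(oov_id if idx is None or idx >= num_words else idx)
        sequences.append(seq)
    return sequences

def pad(sequences, maxlen, padding="post", truncating="post"):
    """Cut each sequence to maxlen, then fill the remainder with zeros."""
    padded = []
    for seq in sequences:
        seq = seq[:maxlen] if truncating == "post" else seq[-maxlen:]
        fill = [0] * (maxlen - len(seq))
        padded.append(seq + fill if padding == "post" else fill + seq)
    return padded

# Hypothetical sample data, standing in for train_articles / validation_articles.
sample_train = ["the cat sat", "the dog"]
sample_validation = ["the bird sat down quietly"]

sample_word_index = build_word_index(sample_train)
sample_train_padded = pad(to_sequences(sample_train, sample_word_index, num_words=100), maxlen=4)
sample_validation_padded = pad(to_sequences(sample_validation, sample_word_index, num_words=100), maxlen=4)

print(sample_word_index)
print(sample_train_padded)
print(sample_validation_padded)
```

Because the vocabulary is built from the training sentences only, "bird", "down" and "quietly" in the validation sentence all collapse to the OOV index, which is why the validation data is passed through the tokenizer that was already fitted rather than a fresh one.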