@bharatc9530
Created August 4, 2020 08:07
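# Assumes a TensorFlow/Keras version that still ships keras.preprocessing.text
from tensorflow.keras.preprocessing.text import Tokenizer, one_hot
# punctuation to strip; the raw string keeps the backslash literal
FILTERS = r'!"#$%&()*+,-./:;<=>?@[\]^_`{|}~'
# fit a tokenizer on docs only to measure the vocabulary size
t = Tokenizer()
t.fit_on_texts(docs)
vocab_size = len(t.word_index) + 1  # +1 because index 0 is reserved
print(vocab_size)
# integer encode the documents (one_hot hashes words, so collisions can occur)
X_train = [one_hot(d, vocab_size, filters=FILTERS, lower=True, split=' ') for d in X_train]
X_test = [one_hot(d, vocab_size, filters=FILTERS, lower=True, split=' ') for d in X_test]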
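Despite its name, one_hot returns a list of hashed integer indices per document, not one-hot vectors. The sketch below illustrates that idea in plain Python. `hashed_encode` is a hypothetical stand-in, not the Keras implementation, and it assumes md5 as the hash so that results are stable across runs.

```python
import hashlib

# Same punctuation set as the snippet above (raw string keeps the backslash literal)
FILTERS = r'!"#$%&()*+,-./:;<=>?@[\]^_`{|}~'

def hashed_encode(text, n, filters=FILTERS, lower=True, split=' '):
    """Hypothetical stand-in for one_hot: map each word to an index in [1, n - 1]."""
    if lower:
        text = text.lower()
    # Replace every filtered character with the split character
    text = text.translate({ord(c): split for c in filters})
    # Drop the empty strings left behind by consecutive separators
    words = [w for w in text.split(split) if w]
    # Assumption: md5 is used here so the mapping is stable across runs
    return [int(hashlib.md5(w.encode()).hexdigest(), 16) % (n - 1) + 1 for w in words]

encoded = hashed_encode("Well done! Good work, well done.", 50)
print(encoded)
```

Repeated words always get the same index, but two different words can also land on the same index when the index range is small. If collisions matter, the fitted tokenizer's `t.texts_to_sequences(...)` assigns each word in the fitted vocabulary its own index.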