@aravindpai
Last active January 27, 2020 13:18
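# Build the text vocabulary from the training set, keeping only tokens that
# occur at least 3 times, and attach pretrained 100-dimensional GloVe vectors.
# TEXT, LABEL and train_data are assumed to be defined earlier.
TEXT.build_vocab(train_data, min_freq=3, vectors="glove.6B.100d")
LABEL.build_vocab(train_data)
# Number of unique tokens in the text vocabulary
print("Size of TEXT vocabulary:", len(TEXT.vocab))
# Number of entries in the label vocabulary
print("Size of LABEL vocabulary:", len(LABEL.vocab))
# The 10 most frequent tokens with their raw counts
print(TEXT.vocab.freqs.most_common(10))
# Token-to-index mapping (string to integer id)
print(TEXT.vocab.stoi)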
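
To make the effect of `min_freq` and the `freqs` / `stoi` attributes more concrete, here is a minimal pure-Python sketch of frequency-filtered vocabulary building. It is an illustration only, not torchtext's implementation: `build_toy_vocab`, the toy `docs` list, and the choice of `<unk>` and `<pad>` as special tokens are all assumptions made for this example.

```python
from collections import Counter


def build_toy_vocab(tokenized_docs, min_freq=1, specials=("<unk>", "<pad>")):
    """Hypothetical helper: count tokens, drop rare ones, assign integer ids."""
    # Raw counts over every token in the corpus (rare tokens are kept here).
    freqs = Counter(tok for doc in tokenized_docs for tok in doc)

    # Special tokens come first, then the remaining tokens ordered by
    # descending frequency, with ties broken alphabetically.
    itos = list(specials)
    for tok, count in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_freq and tok not in specials:
            itos.append(tok)

    # Inverse mapping: token string -> integer id.
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return freqs, itos, stoi


# Toy corpus (assumed for illustration).
docs = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog"]]
freqs, itos, stoi = build_toy_vocab(docs, min_freq=2)

print("Size of toy vocabulary:", len(itos))
print(freqs.most_common(2))
# Tokens below min_freq are not in the vocabulary and fall back to <unk>.
print(stoi.get("dog", stoi["<unk>"]))
```

In this sketch the frequency counter still holds the tokens that were filtered out, while the vocabulary itself contains only the special tokens plus the tokens that met the threshold.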