@gachet
Forked from bhaettasch/gensim_word2vec_demo.py
Created June 13, 2017 10:52
Use gensim to load a word2vec model pretrained on Google News and run a few simple queries on the word vectors.
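from gensim.models import KeyedVectors
# Load the pretrained vectors. The file contains only the final word vectors, not the
# intermediate training state, so the model cannot be refined with additional data.
# load_word2vec_format lives on KeyedVectors in current gensim releases, and the old
# norm_only argument is no longer accepted; similarity queries normalize internally.
model = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)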
dog = model['dog']
print(dog.shape)
print(dog[:10])
# Handle an out-of-vocabulary word: Михаил (Michail)
if 'Михаил' in model:
    print(model['Михаил'].shape)
else:
    print('{0} is an out-of-vocabulary word'.format('Михаил'))
# Built-in queries: analogy (king - man + woman), odd one out, and cosine similarity
print(model.most_similar(positive=['woman', 'king'], negative=['man']))
print(model.doesnt_match("breakfast cereal dinner lunch".split()))
print(model.similarity('woman', 'man'))
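
The analogy and similarity queries above both rest on cosine similarity between word vectors. The sketch below illustrates the idea without needing the 3.6 GB model file. It is not gensim's implementation: the 2-d `toy_vectors` are made up for illustration (they are not real word2vec values), and `cosine`, `unit`, and `analogy` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

# Hypothetical toy vectors: axis 0 loosely means "royalty", axis 1 "gender".
toy_vectors = {
    'king':   [1.0, 1.0],
    'queen':  [1.0, -1.0],
    'prince': [1.0, 0.8],
    'man':    [0.0, 1.0],
    'woman':  [0.0, -1.0],
}

def analogy(vectors, positive, negative):
    """Add unit vectors of positive words, subtract those of negative words,
    and return the remaining word whose vector is closest by cosine."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    for word in positive:
        query = [q + x for q, x in zip(query, unit(vectors[word]))]
    for word in negative:
        query = [q - x for q, x in zip(query, unit(vectors[word]))]
    # Words used in the query are excluded from the candidates.
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

print(analogy(toy_vectors, positive=['woman', 'king'], negative=['man']))
print(cosine(toy_vectors['woman'], toy_vectors['man']))
```

With these toy values the analogy query picks 'queen' over 'prince', and 'woman' and 'man' point in opposite directions. Real embeddings have 300 dimensions and far less tidy geometry, so results from the pretrained model will differ.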