@tylerneylon
Created January 12, 2018 22:30
Quick reference on how to work with pre-trained word2vec vectors in Python.
# wordvec_example.py
#
# This file shows one way to work with word2vec data in Python.
#
# Setup:
#
# 1. Install gensim:
#
# pip install gensim
#
# 2. Download the pre-trained vectors:
#
# Use your favorite download tool (e.g. curl, wget, your browser) to download from:
# https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing
#
# That ought to give you a file called GoogleNews-vectors-negative300.bin (if it
# arrives gzipped as GoogleNews-vectors-negative300.bin.gz, decompress it first),
# which you can put anywhere you like; this code assumes it lives in '~/Downloads'.
#
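import os
import gensim
# '~' is not expanded automatically when the file is opened, so expand it here.
filepath = os.path.expanduser('~/Downloads/GoogleNews-vectors-negative300.bin')
# Loading reads the whole file (several GB) into memory, so this can take a while.
model = gensim.models.KeyedVectors.load_word2vec_format(filepath, binary=True)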
# As an example, print out words similar to 'chicken':
print(model.most_similar(positive=['chicken']))
# Docs covering some methods of `model` are here:
# https://radimrehurek.com/gensim/models/keyedvectors.html#gensim.models.keyedvectors.EuclideanKeyedVectors
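
To get a feel for what `most_similar` is doing without downloading the full model, here is a minimal pure-Python sketch that ranks words by cosine similarity. The vectors in `toy_vectors` and the helper names `cosine` and `toy_most_similar` are made up for illustration; they are not part of gensim, and gensim's real implementation is vectorized and supports more options.

```python
import math

# Toy 3-dimensional "embeddings". These values are invented for illustration;
# the real GoogleNews vectors have 300 dimensions.
toy_vectors = {
    'chicken': [0.9, 0.1, 0.0],
    'turkey':  [0.8, 0.2, 0.1],
    'beef':    [0.6, 0.4, 0.0],
    'car':     [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def toy_most_similar(vectors, word, topn=3):
    """Rank every other word by cosine similarity to `word`, best first."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Prints a list of (word, similarity) pairs, most similar first.
print(toy_most_similar(toy_vectors, 'chicken'))
```

As with `model.most_similar(positive=['chicken'])`, the result is a list of `(word, score)` pairs; with the toy data above, 'turkey' ranks first and 'car' last.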
@GiaAncona
GiaAncona commented Sep 25, 2020

Cool. I'll try to apply this for my project https://trustsession.com/
