@dav009
Created February 11, 2015 17:32
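import gensim
import os

# Importing NumPy-backed libraries can pin the process to a single core on some systems.
# Reset the CPU affinity to cores 0-7 (mask 0xff) so the worker threads can run in parallel.
os.system("taskset -p 0xff %d" % os.getpid())

def read_corpus(path_to_corpus):
    """Train a skip-gram word2vec model on the corpus at path_to_corpus and save it."""
    # LineSentence streams the file: one sentence per line, tokens separated by whitespace.
    sentences = gensim.models.word2vec.LineSentence(path_to_corpus)
    print("training word2vec...")
    # sg=1 selects skip-gram. Note: `size` was renamed to `vector_size` in gensim 4.0.
    model = gensim.models.Word2Vec(sentences, min_count=10, size=500, window=10, sg=1, workers=4)
    print("finished training word2vec")
    model.save("/mnt/data/word2vec.model")

read_corpus("/mnt/data/corpus")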
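The `taskset -p 0xff` call in the script shells out to an external tool. As a rough sketch only, the same affinity reset can probably be done from Python's standard library on Linux. The helper names `mask_to_cpus` and `reset_affinity` are hypothetical and not part of the original gist, and `os.sched_setaffinity` is not available on every platform, so the sketch falls back to doing nothing when it is missing.

```python
import os


def mask_to_cpus(mask):
    """Expand an affinity bitmask (e.g. 0xff) into the set of CPU indices it selects."""
    return {i for i in range(mask.bit_length()) if (mask >> i) & 1}


def reset_affinity(mask=0xff):
    """Try to allow the current process to run on the CPUs selected by `mask`.

    Returns the resulting affinity set, or None if the platform does not
    support this call or the request is rejected.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows: nothing to do in this sketch
    # Only request CPUs that exist on this machine.
    cpus = mask_to_cpus(mask) & set(range(os.cpu_count() or 1))
    if not cpus:
        return None
    try:
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    except OSError:
        return None  # e.g. restricted by a container's cpuset
    return os.sched_getaffinity(0)


print(sorted(mask_to_cpus(0xff)))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

If used, `reset_affinity()` would be called after `import gensim`, in the same place as the `taskset` line, since the point is to undo any pinning that happened during the import.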