@yaronv
Created November 26, 2018 13:27
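%%time
import multiprocessing
import os

import gensim
from gensim.models.doc2vec import Doc2Vec

# Make sure gensim's compiled (fast) training routines are available.
assert gensim.models.doc2vec.FAST_VERSION > -1
print('Training the model...')
cores = multiprocessing.cpu_count()
# MyTexts is defined elsewhere; Doc2Vec expects a restartable iterable of TaggedDocument objects.
texts = MyTexts()
doc2vec_model = Doc2Vec(vector_size=300, workers=cores, min_count=1, window=3, negative=5)
doc2vec_model.build_vocab(texts)
doc2vec_model.train(texts, total_examples=doc2vec_model.corpus_count, epochs=20)
# Save the full model, then export the vectors in word2vec format.
os.makedirs('models', exist_ok=True)
doc2vec_model.save('models/doc2vec.model')
doc2vec_model.save_word2vec_format('models/trained.word2vec')
print('Done!')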
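
The cell above relies on a `MyTexts` class that the gist does not show. A minimal sketch of what such a corpus might look like follows. Everything in it is an assumption made for illustration: the constructor argument `raw_docs`, the whitespace tokenization, and the integer tags are hypothetical and not the author's implementation. The one property `build_vocab()` and `train()` do need is that the object can be iterated more than once, which is why `__iter__` returns a fresh generator on each call. The fallback namedtuple exists only so the sketch runs without gensim installed.

```python
from collections import namedtuple

try:
    from gensim.models.doc2vec import TaggedDocument
except ImportError:
    # Stand-in with the same two fields, so the sketch runs without gensim.
    TaggedDocument = namedtuple('TaggedDocument', ['words', 'tags'])


class MyTexts:
    """Hypothetical corpus: a restartable iterable of TaggedDocument."""

    def __init__(self, raw_docs):
        # raw_docs: a list of plain strings (an assumption; the gist does not say).
        self.raw_docs = raw_docs

    def __iter__(self):
        # A fresh generator on every call, so the vocabulary scan and each
        # training epoch can read the corpus from the start.
        for i, text in enumerate(self.raw_docs):
            yield TaggedDocument(words=text.lower().split(), tags=[i])


texts = MyTexts(['The quick brown fox', 'jumps over the lazy dog'])
first_pass = list(texts)
second_pass = list(texts)  # iterating again yields the same documents
```

A plain generator function would be exhausted after the vocabulary pass, so a class with `__iter__` (or a list) is the safer choice here.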