@justindavies
Created June 11, 2017 14:11
Doc2Vec Iterator
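from gensim.models.doc2vec import TaggedDocument  # replaces the deprecated LabeledSentence


class DocIterator(object):
    """Re-iterable stream of tagged documents for Doc2Vec training."""

    def __init__(self, doc_list, labels_list):
        self.labels_list = labels_list  # one tag per document, same order as doc_list
        self.doc_list = doc_list        # raw document strings

    def __iter__(self):
        # Split each document on whitespace and attach its label as the only tag.
        for idx, doc in enumerate(self.doc_list):
            yield TaggedDocument(words=doc.split(), tags=[self.labels_list[idx]])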
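A short usage sketch may help show why the gist wraps the documents in a class with `__iter__` rather than a plain generator: Doc2Vec training reads the corpus more than once (a vocabulary pass, then the training epochs), so the corpus has to be re-iterable. To keep the sketch runnable without gensim, it uses a hypothetical `TaggedDoc` namedtuple as a stand-in for gensim's tagged-document type, and the sample `docs` and `labels` are made up for illustration.

```python
from collections import namedtuple

# Stand-in for gensim's tagged-document type (an assumption for this sketch,
# so the example runs without gensim installed).
TaggedDoc = namedtuple("TaggedDoc", ["words", "tags"])


class DocIterator(object):
    """Same shape as the gist's iterator, yielding the stand-in type."""

    def __init__(self, doc_list, labels_list):
        self.doc_list = doc_list
        self.labels_list = labels_list

    def __iter__(self):
        for idx, doc in enumerate(self.doc_list):
            yield TaggedDoc(words=doc.split(), tags=[self.labels_list[idx]])


# Hypothetical sample data.
docs = ["the quick brown fox", "jumps over the lazy dog"]
labels = ["doc_0", "doc_1"]

corpus = DocIterator(docs, labels)

# Each pass calls __iter__ again, so the corpus can be read repeatedly.
first_pass = list(corpus)
second_pass = list(corpus)

# A plain generator, by contrast, is exhausted after a single pass.
one_shot = (TaggedDoc(d.split(), [l]) for d, l in zip(docs, labels))
gen_first = list(one_shot)
gen_second = list(one_shot)
```

With gensim installed, an instance of the real `DocIterator` would typically be passed as the `documents` argument when constructing a `Doc2Vec` model; the remaining model parameters depend on the data and are not specified by the gist.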