@HeshamMeneisi
Last active June 3, 2018 23:16
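import gensim
from gensim.models.doc2vec import TaggedDocument

# `docs` is assumed to be an existing list of TaggedDocument objects.
# Learn bigram statistics from the raw token lists.
word_list = [doc.words for doc in docs]
bigram = gensim.models.Phrases(word_list)
bigram_phr = gensim.models.phrases.Phraser(bigram)
# Merge detected bigrams into single tokens, keeping each document's tags.
bi_docs = []
for doc in docs:
    words = bigram_phr[doc.words]
    tags = doc.tags
    bi_docs.append(TaggedDocument(words, tags))
# Again for N+1 Gram: repeat the step on the bigram output to form trigrams.
word_list = [doc.words for doc in bi_docs]
trigram = gensim.models.Phrases(word_list)
trigram_phr = gensim.models.phrases.Phraser(trigram)
tri_docs = []
for doc in bi_docs:
    words = trigram_phr[doc.words]
    tags = doc.tags
    tri_docs.append(TaggedDocument(words, tags))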
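
To see why a second pass over the bigram output yields trigrams, here is a minimal pure-Python sketch of the idea. It is not gensim's implementation: `learn_phrases`, `apply_phrases`, `merge_passes`, and the toy `corpus` are hypothetical names made up for illustration, and the score is a simplified stand-in loosely modeled on the count-based formula gensim describes for its default scorer.

```python
from collections import Counter


def learn_phrases(sentences, min_count=1, threshold=1.0):
    """Return adjacent token pairs whose co-occurrence score exceeds threshold.

    Simplified, assumed score: (count(a, b) - min_count) * vocab_size
    / (count(a) * count(b)).
    """
    unigrams = Counter(tok for sent in sentences for tok in sent)
    pairs = Counter(p for sent in sentences for p in zip(sent, sent[1:]))
    vocab_size = len(unigrams)
    phrases = set()
    for (a, b), n_ab in pairs.items():
        score = (n_ab - min_count) * vocab_size / (unigrams[a] * unigrams[b])
        if score > threshold:
            phrases.add((a, b))
    return phrases


def apply_phrases(sentence, phrases, delim="_"):
    """Greedily merge learned pairs, scanning left to right."""
    out, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) in phrases:
            out.append(sentence[i] + delim + sentence[i + 1])
            i += 2
        else:
            out.append(sentence[i])
            i += 1
    return out


def merge_passes(sentences, passes=2, **kwargs):
    """Each pass can join two existing tokens, so pass N builds (N+1)-grams."""
    for _ in range(passes):
        phrases = learn_phrases(sentences, **kwargs)
        sentences = [apply_phrases(s, phrases) for s in sentences]
    return sentences


# Hypothetical toy corpus, invented for this illustration.
corpus = [
    ["new", "york", "city", "is", "big"],
    ["i", "love", "new", "york", "city"],
    ["new", "york", "city", "never", "sleeps"],
]

print(merge_passes(corpus, passes=1)[1])  # ['i', 'love', 'new_york', 'city']
print(merge_passes(corpus, passes=2)[1])  # ['i', 'love', 'new_york_city']
```

The first pass can only join two raw tokens, so "new_york" is the longest unit it produces. The second pass treats "new_york" as a single token and can join it with "city", which is the same reason the gist trains a second `Phrases` model on `bi_docs` rather than on `docs`.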