@satomacoto
Last active August 29, 2015 14:14
doc2vec most_similar_labels and most_similar_words sample
>>> import gensim
>>> sentences = [
... ['human', 'interface', 'computer'], #0
... ['survey', 'user', 'computer', 'system', 'response', 'time'], #1
... ['eps', 'user', 'interface', 'system'], #2
... ['system', 'human', 'system', 'eps'], #3
... ['user', 'response', 'time'], #4
... ['trees'], #5
... ['graph', 'trees'], #6
... ['graph', 'minors', 'trees'], #7
... ['graph', 'minors', 'survey'] #8
... ]
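>>> # Label each token list; the labels come out as 'SENT_<index>' (see model.labels below)
>>> labeledSentences = gensim.models.doc2vec.LabeledListSentence(sentences)
>>> # min_count=0 keeps every word, even those that appear only once in this tiny corpus
>>> model = gensim.models.doc2vec.Doc2Vec(labeledSentences, min_count=0)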
>>> model.labels
{'SENT_5', 'SENT_0', 'SENT_2', 'SENT_3', 'SENT_6', 'SENT_4', 'SENT_1', 'SENT_8', 'SENT_7'}
>>> model.most_similar_labels('SENT_0')
[('SENT_7', 0.09040503203868866), ('SENT_8', 0.05388247221708298), ('SENT_3', 0.018625225871801376), ('SENT_6', 0.0021968595683574677), ('SENT_2', -0.005669509992003441), ('SENT_1', -0.034463658928871155), ('SENT_5', -0.044474877417087555), ('SENT_4', -0.11045961081981659)]
>>> model.most_similar_words('human')
[('eps', 0.0804225355386734), ('system', 0.0298603605479002), ('graph', 0.024964405223727226), ('user', 0.020017698407173157), ('computer', 0.00942305475473404), ('interface', 0.006561885587871075), ('response', -0.0009844079613685608), ('time', -0.02301063761115074), ('survey', -0.049963727593421936), ('trees', -0.11870135366916656)]
>>> model.most_similar_words(positive=['SENT_0', 'SENT_1'], negative=['SENT_2'], topn=5)
[('human', 0.10655776411294937), ('response', 0.0948006808757782), ('interface', 0.07383717596530914), ('eps', 0.04268331080675125), ('graph', 0.02581930160522461)]
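The scores in the session above look like cosine similarities, and the `positive`/`negative` form presumably combines unit-normalized vectors before ranking, as gensim's word2vec `most_similar` does. That is an assumption, not something the session confirms. A minimal pure-Python sketch of that ranking, where the `most_similar` helper and the `toy` vectors are hypothetical and unrelated to the trained model:

```python
import math

def unit(v):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def most_similar(vectors, positive, negative=(), topn=10):
    """Rank keys by cosine similarity to sum(positive) - sum(negative).

    Hypothetical helper for illustration; not gensim's implementation.
    Assumes the positive and negative vectors do not cancel out exactly.
    """
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    signed = [(k, 1.0) for k in positive] + [(k, -1.0) for k in negative]
    for key, sign in signed:
        for i, x in enumerate(unit(vectors[key])):
            query[i] += sign * x
    query = unit(query)

    # Keys used to build the query are left out of the results.
    exclude = set(positive) | set(negative)
    scores = [
        (key, sum(a * b for a, b in zip(query, unit(vec))))
        for key, vec in vectors.items()
        if key not in exclude
    ]
    scores.sort(key=lambda kv: kv[1], reverse=True)
    return scores[:topn]

# Made-up 2-dimensional vectors, only to exercise the function.
toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
result = most_similar(toy, positive=["a"], topn=2)
print(result)
```

With only nine short sentences, the similarities in the session stay close to zero and are likely to change between training runs, so the rankings there should be read as a demonstration of the calls rather than as meaningful neighbours.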