@rafaljanwojcik
Created November 26, 2019 18:47
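import numpy as np
import pandas as pd

# Assumes word_vectors is a gensim 3.x KeyedVectors and model is a fitted 2-cluster scikit-learn KMeans.
words = pd.DataFrame({'words': list(word_vectors.vocab.keys())})
words['vectors'] = words.words.apply(lambda x: word_vectors[x])
# Predict the cluster of each word vector, then unwrap the one-element result.
words['cluster'] = words.vectors.apply(lambda x: model.predict([np.array(x)]))
words.cluster = words.cluster.apply(lambda x: x[0])
# Cluster 0 is treated as positive (+1), the other as negative (-1).
words['cluster_value'] = [1 if i==0 else -1 for i in words.cluster]
# Closeness is the inverse of the distance to the nearest centroid.
words['closeness_score'] = words.apply(lambda x: 1/(model.transform([x.vectors]).min()), axis=1)
words['sentiment_coeff'] = words.closeness_score * words.cluster_value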
@bhavyadeep111

I tried to use this chunk, but I am getting "ValueError: Buffer dtype mismatch, expected 'double' but got 'float'" on this line: words['cluster'] = words.vectors.apply(lambda x: model.predict([np.array(x)]))

I tried explicitly casting the np.array to double and to float64, but neither worked.
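
One possible cause, not confirmed in this thread: gensim stores word vectors as float32, and some older scikit-learn releases raise this error when the fitted centroids and the vectors passed to predict have different dtypes. If the model was fitted on float32 data, casting only the input at predict time would then not help. The sketch below casts once before fitting and again before predicting. The random data, the `score` helper, and the KMeans settings are stand-ins for illustration, not the gist author's setup.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for word_vectors.vectors: float32, as gensim stores embeddings.
vectors32 = rng.normal(size=(200, 16)).astype(np.float32)

# Cast before fitting so the centroids are float64 as well.
vectors64 = vectors32.astype(np.float64)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors64)

def score(vec):
    """Return (cluster, closeness) for one vector, cast to float64 first."""
    x = np.asarray(vec, dtype=np.float64).reshape(1, -1)
    cluster = int(model.predict(x)[0])
    # Inverse of the distance to the nearest centroid.
    closeness = 1.0 / model.transform(x).min()
    return cluster, closeness

cluster, closeness = score(vectors32[0])
print(model.cluster_centers_.dtype, cluster, closeness)
```

With the gist's names, the same idea would be to fit on word_vectors.vectors.astype(np.float64) and cast each vector the same way before calling model.predict and model.transform.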
