@nokados
Created May 25, 2018 09:39
Wordcloud visualization of clusters
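%%time
from nltk.tokenize import word_tokenize
from wordcloud import WordCloud
import matplotlib.pyplot as plt
from tqdm import tqdm_notebook

# dbscan, doc2vec_list and dialogs_concatted come from earlier cells (not shown here).
clusters = dbscan.fit(doc2vec_list)
cl_labels = clusters.labels_.tolist()

def wordcloud_cluster_byIds(cluId):
    # Collect the tokens of every document assigned to this cluster.
    texts = []
    for i in range(0, len(cl_labels)):
        if cl_labels[i] == cluId:
            for word in word_tokenize(dialogs_concatted.iloc[i].TEXT):
                texts.append(word)
    # Render the word cloud and save it as "<cluster id>.png".
    wordcloud = WordCloud(max_font_size=40, relative_scaling=.8).generate(' '.join(texts))
    plt.figure(figsize=(10,10))
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.title('Cluster #{}'.format(cluId))
    plt.savefig(str(cluId)+".png")

# DBSCAN labels noise points as -1, so that group gets a word cloud as well.
for cluId in tqdm_notebook(set(cl_labels)):
    wordcloud_cluster_byIds(cluId)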
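
The function above rescans all labels once per cluster. A minimal sketch of an alternative, assuming the labels and texts are available as parallel sequences: group the tokens for every cluster in a single pass. The helper name group_tokens_by_cluster, its tokenize parameter, and the toy data are hypothetical and not part of the original gist; plain whitespace splitting stands in for word_tokenize so the sketch runs without NLTK.

```python
from collections import defaultdict

def group_tokens_by_cluster(labels, documents, tokenize=str.split):
    """Return {cluster_label: [tokens, ...]} using one pass over the documents."""
    grouped = defaultdict(list)
    for label, text in zip(labels, documents):
        grouped[label].extend(tokenize(text))
    return dict(grouped)

# Toy data standing in for cl_labels and the TEXT column (hypothetical values).
labels = [0, 1, 0, -1]
documents = ["hello there", "order status", "hello again", "misc"]

tokens_by_cluster = group_tokens_by_cluster(labels, documents)
print(tokens_by_cluster[0])  # ['hello', 'there', 'hello', 'again']
```

Each token list can then be joined with ' '.join(...) and passed to WordCloud(...).generate as in the cell above. Passing tokenize=word_tokenize should reproduce the original tokenization.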