@TheLoneNut
Created February 6, 2019 18:42
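# Assumes X (the feature matrix, one row per sample) is already defined.
import matplotlib.pyplot as plt
from sklearn.cluster import MiniBatchKMeans
# Older scikit-learn releases spelled this calinski_harabaz_score; that name has since been removed.
from sklearn.metrics import calinski_harabasz_score

num_clusters = range(10, 600, 10)
scores = []
for num_cluster in num_clusters:
    # init_size should exceed n_clusters: use 3x the cluster count, with a floor of 300.
    km = MiniBatchKMeans(n_clusters=num_cluster, init_size=max(300, 3*num_cluster)).fit(X)
    labels = km.labels_
    scores.append(calinski_harabasz_score(X, labels))

# Plot the score against the number of clusters.
fig, ax = plt.subplots()
ax.plot(num_clusters, scores)
ax.set(xlabel='number of clusters', ylabel='score',
       title='Calinski-Harabasz score vs number of clusters')
ax.grid()
plt.show()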
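
For readers wondering what the score plotted above measures, here is a minimal NumPy sketch of the Calinski-Harabasz index: the ratio of between-cluster dispersion to within-cluster dispersion, each scaled by its degrees of freedom. The function name `calinski_harabasz` and the toy data are illustrative assumptions and are not part of the gist. The code is meant as a reading aid, not a replacement for the scikit-learn function.

```python
import numpy as np


def calinski_harabasz(X, labels):
    """Between-cluster over within-cluster dispersion (higher is better)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_samples = X.shape[0]
    cluster_ids = np.unique(labels)
    k = len(cluster_ids)
    if not 1 < k < n_samples:
        raise ValueError("need at least 2 and fewer than n_samples clusters")

    overall_mean = X.mean(axis=0)
    between = 0.0  # weighted squared distance of centroids to the overall mean
    within = 0.0   # squared distance of points to their own centroid
    for cid in cluster_ids:
        members = X[labels == cid]
        centroid = members.mean(axis=0)
        between += len(members) * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)

    if within == 0.0:
        # Convention chosen for this sketch: avoid dividing by zero.
        return 1.0
    return (between / (k - 1)) / (within / (n_samples - k))


# Toy 1-D data (hypothetical): two tight, well-separated groups.
toy_X = [[0.0], [2.0], [10.0], [12.0]]
good_score = calinski_harabasz(toy_X, [0, 0, 1, 1])  # groups match the data
bad_score = calinski_harabasz(toy_X, [0, 1, 0, 1])   # groups cut across the data
print(good_score, bad_score)
```

A labelling that matches the structure of the data scores far higher than one that cuts across it, which is why the plot is read by looking for the number of clusters where the score peaks.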