@ground0state
Last active May 7, 2020 03:44
"""
MIT License
Copyright (c) 2017-2020 Packt, ground0state
https://github.com/PacktPublishing/Artificial-Intelligence-with-Python/blob/master/LICENSE
"""
from sklearn.cluster import KMeans
from sklearn import metrics
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
X, _ = make_blobs(n_samples=1000, centers=6, n_features=5, random_state=0)
scores = []
values = np.arange(2, 10)
# Fit k-means for each candidate cluster count and record its silhouette score
for num_clusters in values:
    kmeans = KMeans(init='k-means++', n_clusters=num_clusters, n_init=10)
    kmeans.fit(X)
    # sample_size=len(X) scores every sample rather than a random subset
    score = metrics.silhouette_score(X, kmeans.labels_,
                                     metric='euclidean', sample_size=len(X))
    print("\nNumber of clusters =", num_clusters)
    print("Silhouette score =", score)
    scores.append(score)
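# Plot the silhouette score for each candidate cluster count
plt.figure()
plt.bar(values, scores, width=0.7, color='black', align='center')
plt.title('Silhouette score vs number of clusters')
plt.xlabel('Number of clusters')
plt.ylabel('Silhouette score')
plt.show()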
# Index into values directly so the result stays correct if the range or step changes
num_clusters = values[np.argmax(scores)]
print('Optimal number of clusters =', num_clusters)
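
The script above relies on `metrics.silhouette_score` without showing what it measures. As a rough sketch only, the block below recomputes the mean silhouette coefficient from its standard definition, s(i) = (b - a) / max(a, b), where a is the mean distance from sample i to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The helper name `manual_silhouette` and the small demo dataset (`X_demo`, `labels_demo`) are illustrative assumptions, not part of the original gist.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances, silhouette_score


def manual_silhouette(X, labels):
    """Mean silhouette coefficient computed directly from its definition.

    Hypothetical helper for illustration; O(n^2) memory, so small data only.
    """
    labels = np.asarray(labels)
    dist = pairwise_distances(X, metric="euclidean")
    unique = np.unique(labels)
    per_sample = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            # A sample alone in its cluster keeps a score of 0
            continue
        # a: mean distance to the other members of the sample's own cluster
        # (the distance to itself is 0, so divide by n_same - 1)
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dist[i, labels == other].mean()
                for other in unique if other != labels[i])
        per_sample[i] = (b - a) / max(a, b)
    return per_sample.mean()


# Small demo dataset (an assumption, chosen to keep the distance matrix tiny)
X_demo, _ = make_blobs(n_samples=200, centers=3, n_features=2, random_state=0)
labels_demo = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_demo)

manual = manual_silhouette(X_demo, labels_demo)
library = silhouette_score(X_demo, labels_demo, metric="euclidean")
print("Manual silhouette  =", manual)
print("sklearn silhouette =", library)
```

The two printed values should agree up to floating-point error. Scores near 1 indicate compact, well-separated clusters, which is why the script picks the cluster count with the highest score.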