@milyasyousuf
Last active March 23, 2017 15:28
The KMeans algorithm clusters data by trying to separate samples into n groups of equal variance, minimizing a criterion known as the inertia, or within-cluster sum of squares. This algorithm requires the number of clusters to be specified. It scales well to a large number of samples and has been used across a large range of application areas in many…
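To make the inertia criterion concrete, here is a minimal sketch that recomputes it by hand and compares it with the value scikit-learn reports. The six sample points, `n_init=10`, and `random_state=0` are assumptions chosen for illustration; they are not part of the gist.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative data (assumed for this sketch): two well-separated groups.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.0, 0.6],
              [8.0, 8.0], [9.0, 10.0], [8.0, 9.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Inertia is the sum, over all samples, of the squared distance
# between a sample and the center of the cluster it was assigned to.
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]
manual_inertia = float(np.sum((X - assigned_centers) ** 2))

print("manual inertia: ", manual_inertia)
print("kmeans.inertia_:", kmeans.inertia_)
```

The two printed values should agree up to floating-point rounding, which shows that `inertia_` is the within-cluster sum of squares described above.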
##################################################################
# K-means algorithm basic implementation using scikit-learn #
# By: Muhammad Ilyas #
# email: m_ilyas@outlook.com #
# source: http://scikit-learn.org/stable/auto_examples/cluster/ #
##################################################################
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
x = [1,5,1.5,8,1,9,2,8,1.8,8,0.6,10]
y = [2,8,1.8,8,0.6,10,1,5,1.5,8,1,9]
def k_mean_function(x, y):
    # Pair the coordinates into an (n_samples, 2) array, keeping every point
    X = np.asarray(list(zip(x, y)))
    kmeans = KMeans(n_clusters=3)
    kmeans.fit(X)
    # Cluster centers and the cluster index assigned to each point
    cn = kmeans.cluster_centers_
    labels = kmeans.labels_
    # Print the centers and the labels
    print(cn)
    print(labels)
    # Marker style for the points of each cluster, indexed by label
    colors = ["g.", "r.", "b.", "g.", "r."]
    for i in range(len(X)):
        print("coordinate: ", X[i], "label:", labels[i])
        plt.plot(X[i][0], X[i][1], colors[labels[i]], markersize=10)
    # Plot the centroids
    plt.scatter(cn[:, 0], cn[:, 1], marker="x", s=150, linewidths=5, zorder=10)
    plt.show()

k_mean_function(x, y)
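Because KMeans needs the number of clusters up front, one common heuristic (not part of the original script) is to fit the model for several values of `k` and compare the resulting inertia. The sketch below reuses the gist's `x` and `y` data; the range of `k`, `n_init=10`, and `random_state=0` are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Same sample data as the script above.
x = [1, 5, 1.5, 8, 1, 9, 2, 8, 1.8, 8, 0.6, 10]
y = [2, 8, 1.8, 8, 0.6, 10, 1, 5, 1.5, 8, 1, 9]
X = np.column_stack([x, y])

# Fit one model per candidate k and record its inertia.
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_
    print("k =", k, "inertia =", round(model.inertia_, 2))
```

Inertia tends to fall as `k` grows, so the lowest value alone is not a useful target. A frequently used rule of thumb is to pick the `k` after which the decrease levels off, though this remains a judgment call rather than a guarantee.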