@sjqtentacles
Last active August 29, 2015 14:20
Calinski-Harabasz index function
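import numpy as np


def CH_index(X, labels, centroids):
    '''
    Calinski-Harabasz index of a clustering (higher means better-separated clusters).

    X: 2-D array of samples; for a pandas dataframe, pass dataframe.values
    labels: cluster label of each sample, e.g. KMeans(n_clusters=i).fit(X).labels_
    centroids: centroid values of the fitted KMeans model

    See: https://github.com/scampion/scikit-learn/blob/master/scikits/learn/cluster/__init__.py
    '''
    mean = np.mean(X, axis=0)
    # Between-cluster dispersion: squared centroid-to-mean distance, weighted by cluster size
    B = np.sum([np.sum(labels == i) * (c - mean) ** 2 for i, c in enumerate(centroids)])
    # Within-cluster dispersion: squared distance from each sample to its own centroid
    W = np.sum([(x - centroids[labels[i]]) ** 2 for i, x in enumerate(X)])
    k = len(centroids)  # number of clusters
    n = len(X)          # number of samples
    # Float division, so the result is also correct under Python 2
    return ((n - k) * B) / float((k - 1) * W)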
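A minimal worked example may help check the formula by hand. The sketch below is not the gist's function: it restates the same index as (B / (k - 1)) / (W / (n - k)) in a vectorized helper with the hypothetical name ch_index, and assumes labels are the integers 0..k-1 and that each centroid is the mean of its cluster. The four-point toy dataset is made up for illustration.

```python
import numpy as np


def ch_index(X, labels, centroids):
    """Calinski-Harabasz index, (B / (k - 1)) / (W / (n - k)).

    Hypothetical helper for illustration; assumes integer labels 0..k-1.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centroids = np.asarray(centroids, dtype=float)
    n, k = len(X), len(centroids)
    mean = X.mean(axis=0)
    # Number of samples in each cluster
    sizes = np.bincount(labels, minlength=k)
    # Between-cluster dispersion, weighted by cluster size
    B = np.sum(sizes * np.sum((centroids - mean) ** 2, axis=1))
    # Within-cluster dispersion: each sample against its own centroid
    W = np.sum((X - centroids[labels]) ** 2)
    return (B / (k - 1)) / (W / (n - k))


def cluster_means(X, labels, k):
    # Centroid of each cluster, computed as the mean of its members
    return np.array([X[labels == i].mean(axis=0) for i in range(k)])


# Toy data: two tight groups far apart along the first axis
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])

# Sensible grouping: B = 100, W = 4, so the index is (100 / 1) / (4 / 2)
good_labels = np.array([0, 0, 1, 1])
good_score = ch_index(X, good_labels, cluster_means(X, good_labels, 2))

# Poor grouping that pairs points across the gap: B = 4, W = 100
bad_labels = np.array([0, 1, 0, 1])
bad_score = ch_index(X, bad_labels, cluster_means(X, bad_labels, 2))

print(good_score)  # 50.0
print(bad_score)   # 0.08
```

The sensible grouping scores far higher than the poor one, which is the behaviour to look for when comparing KMeans fits with different n_clusters. Recent scikit-learn releases also provide sklearn.metrics.calinski_harabasz_score, which can serve as a cross-check.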