from sklearn.metrics import mutual_info_score
from sklearn.metrics.cluster import entropy


def normalized_information_distance(c1, c2):
    """
    Calculate the Normalized Information Distance (NID) between two
    clusterings, given as flat sequences of cluster labels.

    Taken from Vinh, Epps, and Bailey (2010). Information Theoretic Measures
    for Clusterings Comparison: Variants, Properties, Normalization and
    Correction for Chance. JMLR.
    <http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf>
    """
    denom = max(entropy(c1), entropy(c2))
    # If both clusterings have zero entropy, each puts every item into a
    # single cluster: the clusterings are identical, so the distance is 0.0.
    return (1.0 - mutual_info_score(c1, c2) / denom) if denom else 0.0
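

# A minimal usage sketch (not part of the original gist): the label
# sequences below are assumed example inputs. Each clustering assigns one
# cluster label per item, the format mutual_info_score and entropy expect.
if __name__ == "__main__":
    c1 = [0, 0, 1, 1, 2, 2]
    c2 = [0, 0, 0, 1, 1, 1]

    # Identical clusterings have distance 0.0.
    print(normalized_information_distance(c1, c1))

    # Different clusterings give a distance strictly between 0.0 and 1.0,
    # since 0 <= MI(c1, c2) <= max(H(c1), H(c2)).
    print(normalized_information_distance(c1, c2))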