Created
June 14, 2018 13:31
Gist: terrycojones/1953690f960e3da1d2e307cf918b8453
from sklearn.metrics import mutual_info_score
from sklearn.metrics.cluster import entropy

def normalized_information_distance(c1, c2):
    """
    Calculate the Normalized Information Distance (NID) between two
    clusterings.

    Taken from Vinh, Epps, and Bailey (2010). "Information Theoretic Measures
    for Clusterings Comparison: Variants, Properties, Normalization and
    Correction for Chance", JMLR.
    <http://jmlr.csail.mit.edu/papers/volume11/vinh10a/vinh10a.pdf>
    """
    denom = max(entropy(c1), entropy(c2))
    # If both clusterings have zero entropy, each assigns all items to a
    # single cluster, so the two clusterings are identical and their
    # distance is 0.0 (NID is 0 for identical clusterings, 1 for maximally
    # different ones).
    return (1.0 - (mutual_info_score(c1, c2) / denom)) if denom else 0.0
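As a quick sanity check on the formula, the same quantity can be computed from scratch without scikit-learn, since both the entropies and the mutual information follow directly from the label counts. The sketch below (not part of the original gist; the function name `nid` and its pure-Python implementation are my own) uses natural-log entropies, matching `mutual_info_score`:

```python
from collections import Counter
from math import log

def nid(c1, c2):
    """Normalized Information Distance: 1 - MI(c1, c2) / max(H(c1), H(c2))."""
    n = len(c1)
    p1, p2 = Counter(c1), Counter(c2)
    joint = Counter(zip(c1, c2))
    # Shannon entropy (natural log) of each clustering's label distribution.
    h1 = -sum((v / n) * log(v / n) for v in p1.values())
    h2 = -sum((v / n) * log(v / n) for v in p2.values())
    # Mutual information from the joint label distribution.
    mi = sum((v / n) * log((v / n) / ((p1[a] / n) * (p2[b] / n)))
             for (a, b), v in joint.items())
    denom = max(h1, h2)
    # Zero entropy on both sides means both clusterings are trivial and
    # identical, so the distance is 0.0.
    return (1.0 - mi / denom) if denom else 0.0

# Identical clusterings up to label renaming have distance ~0;
# independent clusterings have distance ~1.
print(nid([0, 0, 1, 1], ['a', 'a', 'b', 'b']))
print(nid([0, 1, 0, 1], [0, 0, 1, 1]))
```

Because NID only depends on the contingency table of the two labelings, relabeling clusters never changes the result.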