Last active
May 27, 2021 19:31
Jaccard Similarity: The Jaccard similarity of sets is the ratio of the size of the intersection of the sets to the size of the union. This measure of similarity is suitable for many applications, including textual similarity of documents and similarity of buying habits of customers.
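As a worked instance of the ratio, take two small sets of integers (chosen here for illustration). The intersection has three elements and the union has eight, so the similarity is 3/8:

```latex
A = \{0, 1, 2, 5, 6\}, \qquad B = \{0, 2, 3, 5, 7, 9\}

A \cap B = \{0, 2, 5\}, \qquad A \cup B = \{0, 1, 2, 3, 5, 6, 7, 9\}

J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{3}{8} = 0.375
```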
__author__ = 'renienj'

import numpy as np


def compute_jaccard_similarity_score(x, y):
    """
    Jaccard Similarity J(A, B) = |Intersection(A, B)| / |Union(A, B)|
    """
    set_x, set_y = set(x), set(y)
    intersection_cardinality = len(set_x.intersection(set_y))
    union_cardinality = len(set_x.union(set_y))
    # Two empty inputs have an empty union. Returning 0.0 avoids a division
    # by zero; this is a convention chosen here (some definitions use 1.0).
    if union_cardinality == 0:
        return 0.0
    return intersection_cardinality / float(union_cardinality)


if __name__ == "__main__":
    score = compute_jaccard_similarity_score(np.array([0, 1, 2, 5, 6]), np.array([0, 2, 3, 5, 7, 9]))
    print("Jaccard Similarity Score : %s" % score)
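The description mentions textual similarity of documents as one application. A minimal sketch of that idea follows, assuming each document is reduced to its set of lowercase, whitespace-separated words. The helper names `jaccard_similarity` and `word_set` are hypothetical and not part of the original gist, and real pipelines often use shingles or proper tokenization instead.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Assumption: two empty inputs score 0.0 instead of raising ZeroDivisionError.
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def word_set(text):
    """Hypothetical tokenizer: lowercase the text and split on whitespace."""
    return set(text.lower().split())


if __name__ == "__main__":
    doc_a = "the quick brown fox"
    doc_b = "the lazy brown dog"
    # Shared words: {"the", "brown"}; the union holds six distinct words.
    score = jaccard_similarity(word_set(doc_a), word_set(doc_b))
    print("Jaccard Similarity Score : %s" % score)
```

Here the two sentences share two of six distinct words, so the score is one third. Because the measure only looks at set membership, word order and repeated words do not affect it.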