Skip to content

Instantly share code, notes, and snippets.

@Renien
Last active May 27, 2021 19:31
Jaccard Similarity: The Jaccard similarity of sets is the ratio of the size of the intersection of the sets to the size of the union. This measure of similarity is suitable for many applications, including textual similarity of documents and similarity of buying habits of customers.
__author__ = 'renienj'
import numpy as np
def compute_jaccard_similarity_score(x, y):
"""
Jaccard Similarity J (A,B) = | Intersection (A,B) | /
| Union (A,B) |
"""
intersection_cardinality = len(set(x).intersection(set(y)))
union_cardinality = len(set(x).union(set(y)))
return intersection_cardinality / float(union_cardinality)
if __name__ == "__main__":
score = compute_jaccard_similarity_score(np.array([0, 1, 2, 5, 6]), np.array([0, 2, 3, 5, 7, 9]))
print "Jaccard Similarity Score : %s" %score
pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment