Skip to content

Instantly share code, notes, and snippets.

@Renien
Last active May 27, 2021 19:31
Show Gist options
  • Save Renien/9672f174e31b6f96f356da09eb481d2c to your computer and use it in GitHub Desktop.
Save Renien/9672f174e31b6f96f356da09eb481d2c to your computer and use it in GitHub Desktop.
Jaccard Similarity: The Jaccard similarity of sets is the ratio of the size of the intersection of the sets to the size of the union. This measure of similarity is suitable for many applications, including textual similarity of documents and similarity of buying habits of customers.
__author__ = 'renienj'
import numpy as np
def compute_jaccard_similarity_score(x, y):
"""
Jaccard Similarity J (A,B) = | Intersection (A,B) | /
| Union (A,B) |
"""
intersection_cardinality = len(set(x).intersection(set(y)))
union_cardinality = len(set(x).union(set(y)))
return intersection_cardinality / float(union_cardinality)
if __name__ == "__main__":
score = compute_jaccard_similarity_score(np.array([0, 1, 2, 5, 6]), np.array([0, 2, 3, 5, 7, 9]))
print "Jaccard Similarity Score : %s" %score
pass
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment