@kaiix
Created December 24, 2015 10:54
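import math
from collections import Counter

def tf(word, doc):
    # Term frequency: share of whitespace-separated tokens in doc equal to word.
    word_counter = Counter(doc.split())
    total = sum(word_counter.values())
    # A Counter is indexed, not called; an empty document yields 0.0.
    return word_counter[word] / total if total else 0.0

def idf(word, corpus):
    # Inverse document frequency; the +1 avoids division by zero for unseen words.
    # Match whole tokens: `word in doc` on a string would also match substrings.
    return math.log(len(corpus) / (1 + sum(1 for doc in corpus if word in doc.split())))

def tfidf(word, doc, corpus):
    return tf(word, doc) * idf(word, corpus)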
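A minimal, self-contained usage sketch of the same weighting scheme follows. It assumes Python 3 (true division) and whole-token matching on whitespace-split text. The helper name `rank_terms` and the three-sentence corpus are hypothetical, introduced only for illustration; they are not part of the gist.

```python
import math
from collections import Counter


def rank_terms(doc, corpus):
    """Return (word, score) pairs for doc, highest TF-IDF first.

    Hypothetical helper: applies tf * log(N / (1 + df)) to every
    distinct whitespace-separated token in doc.
    """
    tokens = doc.split()
    counts = Counter(tokens)
    total = len(tokens)
    # Pre-split each document once so membership tests match whole tokens.
    doc_sets = [set(d.split()) for d in corpus]
    scores = {}
    for word, count in counts.items():
        df = sum(1 for s in doc_sets if word in s)  # document frequency
        scores[word] = (count / total) * math.log(len(corpus) / (1 + df))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Illustrative corpus (made up for this example).
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
ranking = rank_terms(corpus[0], corpus)
for word, score in ranking:
    print(f"{word:>4}  {score:+.4f}")
```

With the `1 + df` denominator, a word that occurs in every document gets a negative IDF (here `the`, since log(3/4) < 0), and a word that occurs in all but one document gets an IDF of exactly zero (here `cat`). Words unique to the document rank highest.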