@herdrick
Created July 18, 2010 01:27
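;; A "pof" is either a java.io.File or a pair of pofs. The helpers used below
;; (to-words, frequencies-m, count-m, mean, sq, sqrt, word-list) are not defined in this gist.

;; Relative frequency of word in a single file: occurrences / total word count.
(def freq-files (memoize (fn [pof word]
  (/ (or (get (frequencies-m (to-words pof)) word) 0)
     (count-m (to-words pof))))))

;; Relative frequency of word in a pof, recursing into pairs.
(def freq (memoize (fn [pof word]
  (if (instance? java.io.File pof)
    (freq-files pof word)
    (mean (vector (freq (first pof) word) ; combine frequencies by taking their unweighted mean.
                  (freq (second pof) word)))))))

;; Euclidean distance between the word-frequency vectors of pof1 and pof2,
;; summed over every word in (word-list pofs).
(def euclidean (memoize (fn [pof1 pof2 pofs]
  (sqrt (reduce + (map (fn [word]
                         (sq (- (freq pof1 word)
                                (freq pof2 word))))
                       (word-list pofs)))))))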
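
The snippet above relies on helpers it does not define, so it cannot be run on its own. The following is a minimal, self-contained sketch of the same idea, under several assumptions: plain strings stand in for `java.io.File` leaves, memoization is left out for brevity, and `to-words`, `mean`, `sq` and `word-list` are hypothetical stand-ins whose real definitions may differ from the author's.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helpers: the gist does not show its own definitions.
(defn to-words [text]
  ;; Lower-case the text and split it into runs of ASCII letters.
  (re-seq #"[a-z]+" (str/lower-case text)))

(defn mean [xs] (/ (reduce + xs) (count xs)))
(defn sq [x] (* x x))

;; Assumption: a pof is a string (standing in for a file) or a pair of pofs.
(defn freq [pof word]
  (if (string? pof)
    (let [words (to-words pof)]
      (/ (get (frequencies words) word 0) (count words)))
    ;; Unweighted mean of the two halves, as in the gist.
    (mean [(freq (first pof) word) (freq (second pof) word)])))

(defn word-list [pofs]
  ;; Every distinct word found in any leaf string of the nested structure.
  (distinct (mapcat to-words
                    (filter string? (tree-seq sequential? seq pofs)))))

(defn euclidean [pof1 pof2 pofs]
  (Math/sqrt
   (double (reduce + (map (fn [word]
                            (sq (- (freq pof1 word) (freq pof2 word))))
                          (word-list pofs))))))

(def a "the cat sat")
(def b "the dog sat")

(freq a "cat")        ;; => 1/3
(freq [a b] "cat")    ;; => 1/6, the unweighted mean of 1/3 and 0
(euclidean a b [a b]) ;; sqrt of 2/9: only "cat" and "dog" differ, by 1/3 each
```

Because a pair's frequency is the unweighted mean of its two members, a short document counts as much as a long one once they are merged.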