@herdrick
Created June 6, 2010 02:04
(ns hc
  (:use [incanter.core :only (abs sq sqrt)]
        [incanter.stats :only (mean)]
        [clojure.contrib.combinatorics :only (combinations)]))
(def *interesting-words-count* 3)
;; Root directory of the text files to classify.
(def *directory-string* "/Users/herdrick/Dropbox/clojure/hierarchical-classifier/data/mixed")
;; All files under the root directory, searched recursively (nil = any extension).
(def *txt-files*
  (seq (org.apache.commons.io.FileUtils/listFiles (new java.io.File *directory-string*) nil true)))
;; Memoized: reads a file and returns its lowercased words (runs of a-z) as a seq.
(def file->seq
  (memoize
    (fn [file]
      (re-seq #"[a-z]+"
              (org.apache.commons.lang.StringUtils/lowerCase (slurp (.toString file)))))))