@sorenmacbeth
Created September 15, 2011 08:31
;; Count occurrences of each n-gram generated from the non-empty !kw values.
(defn ngram-counts [root]
  (let [ga (select-fields (gadata-tap root) "!kw")]
    (<- [?ngram ?count]
        (ga !kw)
        (not= !kw "")
        (gen-ngrams !kw 4 :> ?ngram)
        (:sort ?ngram)
        (c/count ?count))))
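;; Drop any n-gram that is a substring of the n-gram right after it.
;; partition-all keeps the final n-gram, whose successor b is nil.
(defbufferop collapse-ngrams [tuples]
  (->> (map first tuples)
       (partition-all 2 1)
       (remove (fn [[a b]] (and b (.contains b a))))
       (map first)))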
;; Keep only n-grams seen exactly once, then collapse adjacent contained ones.
(defn ones [root]
  (let [counts (ngram-counts root)]
    (<- [?filtered]
        (counts ?n ?c)
        (= ?c 1)
        (collapse-ngrams ?n :> ?filtered))))
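The pairwise comparison inside collapse-ngrams can be tried at a REPL without Cascalog. The following is only a sketch under assumptions: `collapse-sorted` is a hypothetical name that is not part of the gist, the input is assumed to be an already sorted sequence of n-gram strings, and `partition-all` is used so that the last n-gram is also emitted.

```clojure
;; Hypothetical stand-alone helper (not part of the gist): the same
;; adjacent-pair comparison as collapse-ngrams, applied to a plain
;; sequence of strings instead of Cascalog tuples.
(defn collapse-sorted [ngrams]
  (->> ngrams
       ;; Sliding window of size 2, step 1; the last window holds a
       ;; single element, so b destructures to nil there.
       (partition-all 2 1)
       ;; Remove a when the next n-gram b contains it as a substring.
       (remove (fn [[a b]] (and b (.contains ^String b ^String a))))
       (map first)))

;; Assumes the input is already sorted.
(collapse-sorted ["a" "a b" "a b c" "b c" "c d"])
;; => ("a b c" "b c" "c d")
```

Only neighbouring entries are compared, so "b c" survives even though "a b c" contains it; the result therefore depends on the order in which the n-grams arrive.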