@sorenmacbeth
Created January 12, 2012 22:27
(ns ybot.hadoop.pail
  (:use cascalog.api
        [cascalog.io :only (with-fs-tmp)])
  (:import [backtype.cascading.tap PailTap PailTap$PailTapOptions]
           [backtype.hadoop.pail Pail]))
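
;; Builds a PailTap at `path` for the given pail `structure`. `colls` is
;; passed through as the tap's attribute lists; the field name is "string".
(defn- pail-tap
  [path colls structure]
  (let [seqs (into-array java.util.List colls)
        spec (PailTap/makeSpec nil structure)
        opts (PailTap$PailTapOptions. spec "string" seqs nil)]
    (PailTap. path opts)))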

(defn string-tap [path & colls]
  (pail-tap path colls (ybot.hadoop.pail.StringPailStructure.)))

(defn ?pail-*
  "Executes the supplied query into a temporary pail built by calling
  `tap` on a temp path, then absorbs that pail into the pail located at
  `pail-path`."
  [tap pail-path query]
  (let [pail (Pail. pail-path)]
    (with-fs-tmp [_ tmp]
      (?- (tap tmp) query)
      (.absorb pail (Pail. tmp)))))
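
(defmacro ?pail-
  "Macro form of `?pail-*`. Takes the tap as a `(tap path)` form, splits
  it into the tap function and the path, and executes the supplied query
  into the pail located at that path, absorbing the results when
  finished."
  [[tap path] query]
  (list `?pail-* tap path query))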
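
;; A sketch of how the macro rewrites its arguments, for illustration only.
;; The path "/tmp/example-pail" and the symbol `query` are hypothetical.
;; The first argument is destructured as a form, not evaluated, so the tap
;; function and the path reach `?pail-*` as separate arguments:
(comment
  (macroexpand-1 '(?pail- (string-tap "/tmp/example-pail") query))
  ;; expands to a call of the shape:
  ;; (ybot.hadoop.pail/?pail-* string-tap "/tmp/example-pail" query)
  )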

(defn to-pail
  "Executes the supplied `query` into the pail at `pail-path`. This
  pail must make use of the `StringPailStructure`."
  [pail-path query]
  (?pail- (string-tap pail-path)
          query))
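
;; A usage sketch, not part of the original gist. It assumes a pail built
;; on `StringPailStructure` already exists at the hypothetical path
;; "/tmp/example-pail" (`?pail-*` opens the target with `(Pail. pail-path)`),
;; and that the query emits a single string field.
(comment
  (let [lines [["alpha"] ["beta"]]]   ; in-memory tuples used as a generator
    (to-pail "/tmp/example-pail"
             (<- [?line]
                 (lines ?line)))))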

(defmain consolidate [pail-path]
  (.consolidate (Pail. pail-path)))

(defmain absorb [from-pail to-pail]
  (.absorb (Pail. to-pail) (Pail. from-pail)))