Created February 2, 2011 17:38
(defn whole-file
  "Custom scheme for dealing with entire files."
  [field-name]
  (WholeFile. (w/fields field-name)))

(defn hfs-wholefile
  "Creates a tap on HDFS using the wholefile format. Guaranteed not
  to chop files up! Required for unsupported compression formats like HDF."
  [path]
  (w/hfs-tap (whole-file ["file"]) path))

(defn files-with-name
  "Query to return all files in the supplied directory, along with filenames."
  [dir]
  (let [source (hfs-wholefile dir)]
    (?<- (stdout) [?file ?filename]
         (source ?file)
         ((AddYearFunction.) ?file :> ?filename))))
Hey, something like this should really, really be part of Cascalog's standard taps. :)