@cmiles74
Created December 28, 2011 19:48
Load data from a file (line by line) into HBase
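(defn hfs-report
  ;; The docstring must precede the parameter vector; placed after it, the
  ;; string is evaluated as a body expression and discarded.
  "Loads the log data from an HDFS path into HBase."
  [path]
  ;; Execute the query, writing each output tuple to the HBase tap.
  (?<- (hbase-tap "urls" "?url-hash" "urls" "?url" "?crawl-date"
                  "?crawl-time" "?response-code" "?status" "?host")
       [?url-hash ?url ?crawl-date ?crawl-time ?response-code ?status ?host]
       ;; Read the file at `path` line by line; each line binds to ?text.
       [(hfs-textline path) ?text]
       ;; Extract the individual fields from each log line.
       (fetch-value-hash ?text :url :> ?url-hash)
       (fetch-value ?text :url :> ?url)
       (fetch-value ?text :crawl-date :> ?crawl-date)
       (fetch-value ?text :crawl-time :> ?crawl-time)
       (fetch-value ?text :response :> ?response-code)
       (fetch-value ?text :status :> ?status)
       (fetch-value ?text :host :> ?host)))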
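The query above calls fetch-value and fetch-value-hash, which are not defined in this snippet. What follows is only a minimal sketch of what such helpers might look like, not the original implementation. It assumes each log line is tab-separated with fields in a fixed order, and that the hash is a hex MD5 digest of the field value. The namespace name, field-order, parse-line, the field layout, and the choice of MD5 are all assumptions made for illustration.

```clojure
(ns example.log-fields
  (:require [clojure.string :as string])
  (:import [java.security MessageDigest]))

;; ASSUMPTION: log lines are tab-separated, with fields in this order.
;; The real log format is not shown in the snippet above.
(def field-order
  [:url :crawl-date :crawl-time :response :status :host])

(defn parse-line
  "Splits a tab-separated log line into a map keyed by field-order.
  (Hypothetical helper.)"
  [text]
  ;; A limit of -1 keeps trailing empty fields.
  (zipmap field-order (string/split text #"\t" -1)))

(defn fetch-value
  "Returns the value of the field `key` in the log line `text`, or nil
  if the line has no such field."
  [text key]
  (get (parse-line text) key))

(defn fetch-value-hash
  "Returns the hex MD5 digest of the field `key` in `text`, or nil if
  the field is missing. ASSUMPTION: MD5 is an arbitrary choice here."
  [text key]
  (when-let [value (fetch-value text key)]
    (let [digest (.digest (MessageDigest/getInstance "MD5")
                          (.getBytes ^String value "UTF-8"))]
      ;; Mask each signed byte to 0-255 before formatting as two hex digits.
      (apply str (map #(format "%02x" (bit-and % 0xff)) digest)))))
```

With these definitions, (fetch-value "http://example.com/\t2011-12-28\t19:48:00\t200\tOK\texample.com" :host) returns "example.com". Hashing the URL gives a fixed-length value, which is one common reason to use a digest rather than the raw URL as a key.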