@alexpw
Created November 29, 2012 19:30
cascalog w/ cascading.avro -- problem
(comment
  (defproject project "0.0.1-SNAPSHOT"
    :description "project: analysis framework"
    :dependencies [[org.clojure/clojure "1.4.0"]
                   [cheshire "5.0.0"]
                   [cascalog "1.10.0"]
                   [cascalog-math "0.1.0"]
                   [cascading.avro/avro-scheme "2.1.0"]]
    :dev-dependencies [[org.apache.hadoop/hadoop-core "0.20.2-dev"]]
    :repositories {"conjars.org" "http://conjars.org/repo"}
    :main project.core
    :aot [project.core]))
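;; Namespace setup the code below relies on. The t/ and d/ aliases are
;; inferred from usage (cascalog.tap/hfs-tap, cascalog.debug/*DEBUG*).
(ns project.core
  (:use cascalog.api)
  (:require [cascalog.tap :as t]
            [cascalog.debug :as d])
  (:import [cascading.avro AvroScheme]
           [org.apache.avro Schema]))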
(def avro-file "/project/data/data.avro")
(def avro-scheme (AvroScheme. (Schema/parse (slurp "/project/data/schema.json"))))
(def avro-tap (t/hfs-tap avro-scheme avro-file))
;(def reader (DataFileReader. (clojure.java.io/file avro-file) (GenericDatumReader. (Schema/parse (slurp "/project/data/schema.json")))))
;(doseq [rec (iterator-seq reader)] (prn rec))
(defn avro-q []
  (binding [d/*DEBUG* true]
    (?<- (stdout) [?key]
         ((select-fields avro-tap ["key"]) ?key))))
;; => (avro-q)
;; ...
;; 12/11/29 11:38:32 ERROR stream.TrapHandler: caught Throwable, no trap available, rethrowing
;; cascading.tuple.TupleException: unable to read from input identifier: file:/project/data/data.avro
;; at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:127)
;; at cascading.flow.stream.SourceStage.map(SourceStage.java:76)
;; at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
;; at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:124)
;; at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
;; at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
;; at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
;; FlowException local step failed cascading.flow.planner.FlowStepJob.blockOnJob (FlowStepJob.java:191)
;; Caused by: org.apache.avro.AvroRuntimeException: Unions may only consist of a concrete type and null in cascading.avro
;; at cascading.avro.AvroToCascading.fromAvroUnion(AvroToCascading.java:131)
;; at cascading.avro.AvroToCascading.fromAvro(AvroToCascading.java:58)
;; at cascading.avro.AvroToCascading.parseRecord(AvroToCascading.java:48)
;; at cascading.avro.AvroToCascading.fromAvro(AvroToCascading.java:73)
;; at cascading.avro.AvroToCascading.parseRecord(AvroToCascading.java:48)
;; at cascading.avro.AvroScheme.source(AvroScheme.java:222)
;; at cascading.tuple.TupleEntrySchemeIterator.getNext(TupleEntrySchemeIterator.java:140)
;; at cascading.tuple.TupleEntrySchemeIterator.hasNext(TupleEntrySchemeIterator.java:120)
;; ... 6 more
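
;; A sketch for locating the field(s) behind the "Unions may only consist of
;; a concrete type and null" error above. Assumptions (not from the original
;; gist): the helper names nullable-pair? and unsupported-unions are made up
;; for illustration, and the rule "exactly two branches, one of them null" is
;; inferred from the error message, not from the cascading.avro source.
(import '[org.apache.avro Schema Schema$Type])

(defn nullable-pair?
  "True when union schema s has exactly two branches and one of them is null."
  [^Schema s]
  (let [types (map #(.getType ^Schema %) (.getTypes s))]
    (and (= 2 (count types))
         (boolean (some #{Schema$Type/NULL} types)))))

(defn unsupported-unions
  "Returns the dotted paths of every union under schema s that is not a
  [null, T] pair. `seen` holds the record names on the current path so that
  recursive record definitions do not loop forever."
  ([s] (unsupported-unions s "" #{}))
  ([^Schema s path seen]
   (condp = (.getType s)
     Schema$Type/UNION
     (concat (when-not (nullable-pair? s) [path])
             (mapcat #(unsupported-unions % path seen) (.getTypes s)))

     Schema$Type/RECORD
     (let [n (.getFullName s)]
       (when-not (seen n)
         (mapcat (fn [f]
                   (unsupported-unions (.schema f)
                                       (str path "." (.name f))
                                       (conj seen n)))
                 (.getFields s))))

     Schema$Type/ARRAY (unsupported-unions (.getElementType s) (str path "[]") seen)
     Schema$Type/MAP   (unsupported-unions (.getValueType s) (str path "{}") seen)
     nil)))

;; Possible usage against the schema file from this gist:
;; (unsupported-unions (Schema/parse (slurp "/project/data/schema.json")))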