@matthayes
Last active December 24, 2015 07:59
DataFu's Hourglass: Quick Start
git clone git://git.apache.org/incubator-datafu.git datafu
cd datafu/contrib/hourglass
# Build the Hourglass jar and the test jar.
ant jar
ant testjar
# Collect the jars: comma-separated for -libjars, colon-separated for HADOOP_CLASSPATH.
export LIBJARS=$(find "lib/common" -name '*.jar' | xargs echo | tr ' ' ',')
export LIBJARS=$LIBJARS,$(find "build" -name '*.jar' | xargs echo | tr ' ' ',')
export HADOOP_CLASSPATH=$(echo "${LIBJARS}" | sed 's/,/:/g')
# Generate test data under /data/event for 2013/03/01 through 2013/03/14, then inspect one day.
hadoop jar build/datafu-hourglass-test.jar generate -libjars ${LIBJARS} /data/event 2013/03/01-2013/03/14
rm -f temp.avro
hadoop fs -copyToLocal /data/event/2013/03/01/part-00000.avro temp.avro
java -jar lib/test/avro-tools-jar-1.7.4.jar tojson temp.avro | head
# Run the countbyid job over /data/event, writing to /output, then inspect the result.
hadoop jar build/datafu-hourglass-test.jar countbyid -libjars ${LIBJARS} /data/event /output
rm -f temp.avro
hadoop fs -copyToLocal /output/20130314/part-r-00000.avro temp.avro
java -jar lib/test/avro-tools-jar-1.7.4.jar tojson temp.avro | head
# Add one more day of data (2013/03/15) and run the job again.
hadoop jar build/datafu-hourglass-test.jar generate -libjars ${LIBJARS} /data/event 2013/03/15
hadoop jar build/datafu-hourglass-test.jar countbyid -libjars ${LIBJARS} /data/event /output
# Inspect the new output for 2013/03/15.
rm -f temp.avro
hadoop fs -copyToLocal /output/20130315/part-r-00000.avro temp.avro
java -jar lib/test/avro-tools-jar-1.7.4.jar tojson temp.avro | head
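The LIBJARS and HADOOP_CLASSPATH exports in this listing can be sanity-checked without a cluster. The following is a minimal sketch, not part of DataFu: `join_jars` is a hypothetical helper, and the temporary directory only stands in for `lib/common` and `build`. It uses `paste` rather than `xargs echo | tr`, so jar paths containing spaces are not split.

```shell
# Hypothetical helper (an assumption, not part of DataFu): join every .jar
# found under the given directories into one comma-separated list, which is
# the form -libjars expects.
join_jars() {
  find "$@" -name '*.jar' | sort | paste -sd, -
}

# Throwaway directory tree standing in for lib/common and build.
demo=$(mktemp -d)
mkdir -p "$demo/lib/common" "$demo/build"
touch "$demo/lib/common/a.jar" "$demo/lib/common/b.jar" "$demo/build/c.jar"

LIBJARS=$(join_jars "$demo/lib/common" "$demo/build")
# HADOOP_CLASSPATH takes the same entries separated by colons.
HADOOP_CLASSPATH=$(printf '%s' "$LIBJARS" | tr ',' ':')

echo "LIBJARS=$LIBJARS"
echo "HADOOP_CLASSPATH=$HADOOP_CLASSPATH"
```

In the real checkout, `join_jars lib/common build` would replace the two `export LIBJARS=...` lines, assuming both directories exist after `ant jar` and `ant testjar`.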