@mattf
Last active August 29, 2015 13:59
sahara bigpetstore script
create node group
- master
- namenode, oozie, resourcemanager, historyserver
create node group
- worker
- datanode, nodemanager
create cluster
- master: 1
- worker: 4
launch cluster
- bigcluster
- keypair1
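The two node groups and the cluster layout above can be written down as plain data, which allows a quick sanity check of the topology before clicking through the UI. This is only a minimal sketch: the dictionary keys (`node_processes`, `count`, `keypair`) and the helper functions are illustrative names chosen here, not the Sahara REST schema.

```python
# Illustrative model of the cluster described above. Key names are
# assumptions made for this sketch, not the Sahara API schema.
node_groups = {
    "master": {
        "node_processes": ["namenode", "oozie", "resourcemanager", "historyserver"],
        "count": 1,
    },
    "worker": {
        "node_processes": ["datanode", "nodemanager"],
        "count": 4,
    },
}

cluster = {"name": "bigcluster", "keypair": "keypair1", "node_groups": node_groups}


def total_instances(cluster):
    """Number of instances the cluster will launch."""
    return sum(group["count"] for group in cluster["node_groups"].values())


def instances_running(cluster, process):
    """Number of instances that run the given Hadoop process."""
    return sum(
        group["count"]
        for group in cluster["node_groups"].values()
        if process in group["node_processes"]
    )


if __name__ == "__main__":
    print(total_instances(cluster))                # 5
    print(instances_running(cluster, "datanode"))  # 4
```

With this layout, `hadoop dfsadmin -report` in the next step should list four datanodes once the cluster is up.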
explore
- ssh to the cluster's IP
- export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/java/jdk1.7.0_51/bin:/opt/hadoop/bin:/opt/hadoop/sbin; cd; clear
- hdfs dfsadmin -report
create job binary
- bigpetstore.jar
- internal database
- upload
create job
- biggen
- java action
- libs
  - bigpetstore.jar
launch job on existing cluster
- config
  - main class: org.bigtop.bigpetstore.generator.BPSGenerator
  - arg: 1000000
  - arg: bigpetstore/gen
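The launch settings above amount to a main class plus an ordered argument list. A minimal sketch of that configuration as data follows. `edp.java.main_class` is assumed here to be the config key Sahara EDP uses for Java actions, and the surrounding `configs`/`args` layout and the `java_job_configs` helper are illustrative, so check them against the Sahara version in use.

```python
# Sketch of the values entered in the job launch dialog.
# Assumption: "edp.java.main_class" is the EDP key for a Java action's
# main class; the dict layout is illustrative, not a verified API payload.
def java_job_configs(main_class, args):
    """Bundle a Java action's main class and its positional arguments."""
    return {
        "configs": {"edp.java.main_class": main_class},
        # Arguments are passed to the main class as strings, in order.
        "args": [str(arg) for arg in args],
    }


biggen = java_job_configs(
    "org.bigtop.bigpetstore.generator.BPSGenerator",
    [1000000, "bigpetstore/gen"],
)
```

Argument order matters: the generator is given the record count first and the output directory second, exactly as listed above.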
explore
- hadoop fs -ls -R bigpetstore
- hadoop fs -cat bigpetstore/gen/part-r-00000 | head
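The `hadoop fs -cat ... | head` check can also be scripted. The sketch below runs a command and keeps only the first few lines of its output. The helpers `head` and `cat_part` are hypothetical names introduced here, and the code assumes `hadoop` is on the PATH of the host where it runs.

```python
import itertools
import subprocess


def head(argv, n=10):
    """Run argv and return its first n lines of output, like `argv | head`."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    try:
        # islice stops reading after n lines, so large files are not slurped.
        return [line.rstrip("\n") for line in itertools.islice(proc.stdout, n)]
    finally:
        proc.stdout.close()
        proc.terminate()  # stop the producer once enough lines were read
        proc.wait()


def cat_part(directory, part="part-r-00000"):
    """Build the argv for `hadoop fs -cat <directory>/<part>`."""
    return ["hadoop", "fs", "-cat", f"{directory}/{part}"]
```

On the cluster, `head(cat_part("bigpetstore/gen"))` would mirror the second command above.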
create job binary
- piglib.jar
- internal database
create job binary
- bps_analyze.pig
- internal database
- create a script
  - https://gist.github.com/mattf/10560429 (thanks @jayunit100)
create job
- bigetl
- java action
- libs
  - bigpetstore.jar
  - piglib.jar
  - bps_analyze.pig
launch job on existing cluster
- config
  - main class: org.bigtop.bigpetstore.etl.PigCSVCleaner
  - arg: bigpetstore/gen
  - arg: bigpetstore/clean
  - arg: bps_analyze.pig
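The two jobs form a small pipeline: the directory biggen writes is the directory bigetl reads. A hedged sketch of that data flow follows. The argument positions (generator output last, cleaner input first and output second) are read off the outline above and are an assumption about the BigPetStore classes, and `JavaJob` is a hypothetical helper type.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JavaJob:
    """One Java-action job as entered in the outline (hypothetical type)."""
    name: str
    main_class: str
    args: List[str]


biggen = JavaJob(
    "biggen",
    "org.bigtop.bigpetstore.generator.BPSGenerator",
    ["1000000", "bigpetstore/gen"],  # record count, output dir (assumed order)
)
bigetl = JavaJob(
    "bigetl",
    "org.bigtop.bigpetstore.etl.PigCSVCleaner",
    # input dir, output dir, pig script (assumed order)
    ["bigpetstore/gen", "bigpetstore/clean", "bps_analyze.pig"],
)


def is_chained(producer_output, consumer_input):
    """True when the consumer reads exactly what the producer wrote."""
    return producer_output.rstrip("/") == consumer_input.rstrip("/")


# biggen's last argument is its output; bigetl's first argument is its input.
pipeline_ok = is_chained(biggen.args[-1], bigetl.args[0])
```

If `pipeline_ok` were false, bigetl would be pointed at a directory the generator never wrote.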
explore
- hadoop fs -ls -R bigpetstore
- hadoop fs -cat bigpetstore/clean/part-r-00000 | head
finish
- hadoop fs -cat bigpetstore/pig_ad_hoc_script0/part-r-00000
- http://jayunit100.github.io/bigpetstore/