@endform
Created October 27, 2011 22:54
Multistep EMR
MY_BUCKET=my_bucket
S3_PATH=s3n://$MY_BUCKET/data/logins_data
HDFS_PATH=hdfs:///data/logins_data
OUTPUT_PATH=s3n://$MY_BUCKET/output/summarized
PIG_SCRIPT=s3n://$MY_BUCKET/scripts/provider_counts.pig
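# Start a long-running (--alive) 3-node job flow; debugging needs a log location, passed via --log-uri.
ruby elastic-mapreduce --create --alive --num-instances 3 --enable-debugging --log-uri s3n://$MY_BUCKET/logs
JOB_ID=X # replace X with the job flow ID (j-...) printed by the previous command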
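# Step 1: copy the input data from S3 into the cluster's HDFS with distcp.
ruby elastic-mapreduce --jar s3://elasticmapreduce/samples/distcp/distcp.jar --arg $S3_PATH --arg $HDFS_PATH -j $JOB_ID
# Step 2: run the Pig script against the HDFS copy and write the results back to S3.
ruby elastic-mapreduce --pig-script --args "$PIG_SCRIPT,-p,INPUT=$HDFS_PATH,-p,OUTPUT=$OUTPUT_PATH" -j $JOB_ID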
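
The script above expects JOB_ID to be copied by hand from the output of the --create command. A minimal sketch of capturing it automatically follows. It assumes the CLI prints a line of the form "Created job flow j-..." (an assumption, not confirmed by this gist), and extract_job_id is a hypothetical helper name, not part of elastic-mapreduce.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin and print the first token
# that looks like a job flow ID (j- followed by uppercase letters/digits).
extract_job_id() {
  grep -oE 'j-[A-Z0-9]+' | head -n 1
}

# Demo on a made-up sample line; the real output format is an assumption.
sample='Created job flow j-EXAMPLE123'
JOB_ID=$(printf '%s\n' "$sample" | extract_job_id)
echo "JOB_ID=$JOB_ID"

# Assumed real usage, piping the create command from the script above:
# JOB_ID=$(ruby elastic-mapreduce --create --alive --num-instances 3 | extract_job_id)
```

If no ID is found, JOB_ID ends up empty, so it is worth checking it with `[ -n "$JOB_ID" ]` before adding steps.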