@griggheo
Created November 4, 2011 20:43
EMR automation
#!/bin/bash
# Each run writes to its own timestamped log file
TIMESTAMP=$(date "+%Y%m%d%H%M")
EMR_DIR=/opt/emr
LOG_FILE=$EMR_DIR/run_emr_cluster.log.$TIMESTAMP
START=$(date "+%Y-%m-%d %H:%M")
echo "$START" > "$LOG_FILE"
SSH_KEY=/root/.ssh/emrdw.pem
NAME=nightly
CREDENTIALS=/opt/emr/credentials.json
NUM_INSTANCES=2
MASTER_INSTANCE_TYPE=m1.small
SLAVE_INSTANCE_TYPE=m1.small
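# Build the launch command; --alive keeps the cluster running after its steps finish
CMD="$EMR_DIR/elastic-mapreduce -c $CREDENTIALS --create --name $NAME --alive --num-instances $NUM_INSTANCES --master-instance-type $MASTER_INSTANCE_TYPE --slave-instance-type $SLAVE_INSTANCE_TYPE --hadoop-version 0.20 --hive-interactive --hive-versions 0.7.1"
echo "Launching EMR cluster with command $CMD" >> "$LOG_FILE"
# Extract the job flow ID (j-...) from the CLI output
JOBFLOWID=$($CMD | grep -Eo 'j-[A-Za-z0-9]+')
# Without an ID the polling loop below would never finish, so stop here
if [ -z "$JOBFLOWID" ]; then
    echo "Could not obtain a job flow ID; aborting" >> "$LOG_FILE"
    exit 1
fi
echo "JOBFLOWID: $JOBFLOWID" >> "$LOG_FILE"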
# Poll every 10 seconds until the job flow reports the WAITING state
until $EMR_DIR/elastic-mapreduce --list --jobflow "$JOBFLOWID" | grep WAITING; do
    sleep 10
done
# Look up the master node's public DNS name
MASTER=$($EMR_DIR/elastic-mapreduce --jobflow "$JOBFLOWID" --describe | grep MasterPublicDnsName | grep -Eo 'ec2.*com')
echo "Master node: $MASTER" >> "$LOG_FILE"
# Copy the Hive scripts to the master node and run them, capturing stdout and stderr
scp -i $SSH_KEY -r $EMR_DIR/hivescripts hadoop@$MASTER:
ssh -i $SSH_KEY hadoop@$MASTER "cd hivescripts; ./run_hive_scripts.sh" > /tmp/emr.log 2>&1
cat /tmp/emr.log >> "$LOG_FILE"
# Replace the previous run's outputs with the new ones, then shut the cluster down
rm -rf "$EMR_DIR/hive_outputs"
scp -i $SSH_KEY -r hadoop@$MASTER:hive_outputs $EMR_DIR/
$EMR_DIR/elastic-mapreduce --jobflow "$JOBFLOWID" --terminate
STOP=$(date "+%Y-%m-%d %H:%M")
echo "$STOP" >> "$LOG_FILE"
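
The polling loop in the script waits indefinitely for the WAITING state, so a cluster that fails to start would leave the script hanging. A minimal sketch of a bounded wait follows. The function name `wait_for_waiting`, its arguments, and its return codes are hypothetical and not part of the original script. Matching `FAILED`, `TERMINATED`, or `SHUTTING_DOWN` anywhere in the status output is a simplifying assumption about what `--list` prints.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): run a status command
# repeatedly until its output contains WAITING.
# Usage: wait_for_waiting MAX_ATTEMPTS SLEEP_SECONDS COMMAND [ARGS...]
# Returns 0 on WAITING, 1 on a failure-like state, 2 if attempts run out.
wait_for_waiting() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    local status
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the status command passed by the caller and capture its output
        status=$("$@" 2>&1)
        case "$status" in
            *WAITING*)
                return 0
                ;;
            *FAILED*|*TERMINATED*|*SHUTTING_DOWN*)
                # Assumption: any of these words means the cluster will not become usable
                echo "Cluster entered a failure state: $status" >&2
                return 1
                ;;
        esac
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 2
}
```

In the script this could replace the loop with something like `wait_for_waiting 180 10 $EMR_DIR/elastic-mapreduce --list --jobflow $JOBFLOWID || exit 1`, which gives up after roughly 30 minutes. The limit of 180 attempts is an arbitrary illustration, not a recommended value.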