@sujee
Created January 2, 2011 07:14
emr-wait-for-completion.sh
#!/bin/bash
## http://sujee.net/tech/articles/amazon-emr-beyond-basics/
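## JOBID, JOBNAME and TIMESTAMP are assumed to be set earlier by the calling script
t1=$(date +%s)  # start time; the elapsed-time reports below are measured from here
echo "=== $JOBID started...."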
LOGDIR="/var/logs/hadoop-logs/${JOBNAME}__${JOBID}__${TIMESTAMP}"
mkdir -p "${LOGDIR}"
## the code below waits until the job is done
# credit ekampf (https://gist.github.com/762371)
# poll once a minute while the job flow is still starting up
STATUS=$(elastic-mapreduce --list --no-steps | grep "$JOBID" | awk '{print $2}')
while [ "$STATUS" = "STARTING" ] || [ "$STATUS" = "BOOTSTRAPPING" ]
do
sleep 60
STATUS=$(elastic-mapreduce --list --no-steps | grep "$JOBID" | awk '{print $2}')
done
t2=$(date +%s)
echo "=== Job started RUNNING in $((t2 - t1)) seconds. status : $STATUS"
if [ "$STATUS" = "RUNNING" ]
then
elastic-mapreduce --list | grep "$JOBID"
MASTER_NODE=$(elastic-mapreduce --list | grep "$JOBID"| awk '{print $3}')
echo "Task tracker interface : http://$MASTER_NODE:9100"
echo "Namenode interface : http://$MASTER_NODE:9101"
fi
while [ "$STATUS" = "RUNNING" ]
do
sleep 60
STATUS=$(elastic-mapreduce --list --no-steps | grep "$JOBID" | awk '{print $2}')
s3cmd sync "s3://my_bucket/emr-logs/${JOBID}/" "${LOGDIR}/" > /dev/null 2> /dev/null
cp -f "${LOGDIR}/steps/1/syslog" "${LOGDIR}/mapreduce.log" 2> /dev/null
done
t3=$(date +%s)
diff=$((t3 - t1))
# whole hours, then the minutes left over after removing those hours
elapsed="$((diff / 3600))-hours-$(( (diff % 3600) / 60 ))-mins"
s3cmd sync "s3://my_bucket/emr-logs/${JOBID}/" "${LOGDIR}/" > /dev/null 2> /dev/null
cp "${LOGDIR}/steps/1/syslog" "${LOGDIR}/mapreduce.log" 2> /dev/null
s3cmd del -r "s3://my_bucket/emr-logs/${JOBID}/" > /dev/null 2> /dev/null
echo $(date +%Y%m%d.%H%M%S) " > $0 : finished in $elapsed. status: $STATUS"
touch "${LOGDIR}/job-finished-in-$elapsed"
echo "==========================================="
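The elapsed-time label at the end of the script splits a duration in seconds into whole hours and leftover minutes. A minimal sketch of that arithmetic follows. The function name `format_elapsed` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): format a duration
# given in seconds as "<H>-hours-<M>-mins".
format_elapsed() {
  local secs=$1
  local hours=$(( secs / 3600 ))          # whole hours
  local mins=$(( (secs % 3600) / 60 ))    # minutes left after removing whole hours
  echo "${hours}-hours-${mins}-mins"
}

format_elapsed 3725   # prints 1-hours-2-mins
```

In the script this could be used as `elapsed=$(format_elapsed $((t3 - t1)))`.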

ekampf commented Jan 2, 2011

Mind sharing job-status-parse.rb too?


ekampf commented Jan 2, 2011

Never mind... you can simplify this script by using this instead of the Ruby script:
STATUS=$(elastic-mapreduce --list --no-steps | grep $JOBID | awk '{print $2}')
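
(Editorial illustration, not part of the original comment.) The pipeline above picks the second whitespace-separated column of the row that mentions the job ID. Here is a minimal sketch of the same idea as a reusable function, using a single `awk` call with an exact match on the first column instead of `grep | awk`. The function name `job_status` and the sample rows are made up. The column layout (ID, state, master node) is an assumption based on how the script uses `$2` and `$3`.

```shell
#!/bin/bash
# Hypothetical helper: read `elastic-mapreduce --list` style output on stdin
# and print the second column (state) of the row whose first column equals
# the job ID passed as $1.
job_status() {
  awk -v id="$1" '$1 == id { print $2 }'
}

# Made-up sample rows, for illustration only; the real output may differ.
sample='j-AAAA1111     RUNNING     ec2-host.example.com     my-job
j-BBBB2222     COMPLETED   ec2-other.example.com    old-job'

printf '%s\n' "$sample" | job_status j-AAAA1111   # prints RUNNING
```

Against the real CLI this would be called as `STATUS=$(elastic-mapreduce --list --no-steps | job_status "$JOBID")`. Matching on the first column avoids picking up rows that only contain the ID as a substring.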


sujee commented Jan 3, 2011

Eran,
thanks, I am going to try your AWK solution.


ekampf commented Jan 3, 2011

Note that it changes the while conditions
"$STATUS" = "STARTING"
instead of
"$STATUS" = ""STARTING""


sujee commented Jan 3, 2011

Eran,
full writeup is at : http://sujee.net/tech/articles/amazon-emr-beyond-basics/
feedback appreciated....


ekampf commented Jan 3, 2011

Awesome stuff! I used a modified version of this script for my production scripts...
I just replaced the Ruby stuff with the awk version from the comment above...
I think it's simpler this way.
