@djamelz
Last active October 6, 2015 07:45
Fun with aws-cli
# Launch an EMR cluster that runs a single Spark job and then terminates.
# Replace every <placeholder> before running.
# Keep the space before each trailing backslash: without it the shell
# joins the next option onto the end of the previous word.
aws emr create-cluster --name "<cluster name>" \
  --region us-east-1 \
  --log-uri "s3://<bucket>/logs/<cluster name>/" \
  --enable-debugging \
  --release-label emr-4.0.0 \
  --applications Name=Spark \
  --ec2-attributes KeyName=<keyname> \
  --use-default-roles \
  --instance-groups \
      InstanceGroupType=MASTER,InstanceCount=1,InstanceType=c3.xlarge \
      InstanceGroupType=CORE,InstanceCount=2,InstanceType=r3.4xlarge \
  --bootstrap-actions Path=s3://<bucket>/scripts/S3Get.sh,Args=[s3://<bucket>/jars/<jar>,/tmp/] \
  --steps Type=Spark,Name="<jobName>",ActionOnFailure=TERMINATE_CLUSTER,Args=[--executor-memory,100G,--executor-cores,28,--class,<class>,/tmp/<jar>,<args>] \
  --auto-terminate
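
The bootstrap action above points at S3Get.sh, whose contents are not part of this gist. As a rough sketch only, assuming the script does nothing more than copy the S3 object named by its first argument into the local directory named by its second, it might look like the following. The function name `s3_get` and the usage message are made up for illustration.

```shell
#!/bin/bash
# Hypothetical S3Get.sh: the real script is not shown in the gist.
# Assumption: arg 1 is an S3 URI, arg 2 is a local destination directory.
set -eu

s3_get() {
  if [ "$#" -ne 2 ]; then
    echo "usage: S3Get.sh <s3-uri> <local-dir>" >&2
    return 1
  fi
  # Copy the object from S3 to the local path with the AWS CLI.
  aws s3 cp "$1" "$2"
}

# Run only when arguments are supplied, as EMR does via Args=[...].
if [ "$#" -gt 0 ]; then
  s3_get "$@"
fi
```

With the `Args=[s3://<bucket>/jars/<jar>,/tmp/]` value from the command above, this would place the jar under /tmp/ on each node, which is where the Spark step expects to find `/tmp/<jar>`.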