@saswata-dutta
Last active April 29, 2020 10:39
#!/usr/bin/env bash
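# Smoke-test access and permissions at minimal cost: a 1-node cluster with no steps,
# which terminates on its own once it has started (--auto-terminate).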
aws emr create-cluster \
--name "1-node dummy cluster" \
--ec2-attributes KeyName=???,SubnetId=subnet-??? \
--instance-type m4.large \
--release-label emr-5.23.0 \
--instance-count 1 \
--use-default-roles \
--applications Name=Spark \
--auto-terminate
# Submit a Spark jar from S3 as a step on a 3-node cluster
job_name="poc_emr_spark"
aws emr create-cluster \
--name "$job_name" \
--ec2-attributes KeyName=???,SubnetId=subnet-??? \
--auto-terminate \
--use-default-roles \
--release-label emr-5.23.0 \
--instance-type m4.large \
--instance-count 3 \
--log-uri "s3://???/emr/spark/$(date +%Y-%m-%d-%H-%M-%S)/$job_name/logs" \
--applications "Name=Spark" \
--steps "Name=Spark_step,Type=Spark,Args=[--deploy-mode,cluster,--master,yarn,--class,com.???,--conf,spark.driver.memory=4g,--conf,spark.executor.memory=4g,--conf,spark.memory.fraction=0.8,s3://???.jar]"
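# Parameterised variant: expects $ec2_attrs (a string for --ec2-attributes) and the
# steps array (one element per step) to be defined before this point.
aws emr create-cluster \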
--name "$job_name" \
--ec2-attributes "$ec2_attrs" \
--service-role EMR_DefaultRole \
--release-label emr-5.23.0 \
--instance-type m4.large \
--instance-count 2 \
--log-uri "s3://rivigo-data-lake/dev/emr/logs/$job_name/$(date +%Y-%m-%d-%H-%M-%S)/" \
--applications "Name=Spark" \
--configurations '[{"Classification":"spark","Properties":{"maximizeResourceAllocation":"true"}}]' \
--steps "${steps[@]}" \
--scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
--auto-terminate \
--region ap-southeast-1
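
The last command relies on `ec2_attrs` and `steps`, which the script never defines. Below is a minimal sketch of how they might be set before that command runs. Every concrete value (`my-key`, `subnet-00000000`, `com.example.Main`, `s3://my-bucket/jars/app.jar`) is a hypothetical placeholder, not something taken from the original. The `InstanceProfile=EMR_EC2_DefaultRole` entry is an assumption: that command passes `--service-role` instead of `--use-default-roles`, so the EC2 instance profile probably has to be supplied through `--ec2-attributes`.

```shell
#!/usr/bin/env bash
# Sketch only: all values below are hypothetical placeholders.

key_name="my-key"              # assumed EC2 key pair name
subnet_id="subnet-00000000"    # assumed subnet ID

# --ec2-attributes takes one comma-separated string of key=value pairs.
# InstanceProfile is an assumption (default EMR EC2 role).
ec2_attrs="KeyName=${key_name},SubnetId=${subnet_id},InstanceProfile=EMR_EC2_DefaultRole"

main_class="com.example.Main"            # assumed main class
jar_uri="s3://my-bucket/jars/app.jar"    # assumed jar location

# One array element per step; "${steps[@]}" then expands to one argument per step.
steps=(
  "Name=Spark_step,Type=Spark,ActionOnFailure=TERMINATE_CLUSTER,Args=[--deploy-mode,cluster,--class,${main_class},${jar_uri}]"
)

# Dry run: print what would be passed instead of calling aws.
printf '%s\n' "$ec2_attrs" "${steps[@]}"
```

Keeping `steps` as an array rather than one long string means further steps can be appended as extra elements without any quoting changes at the call site.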