- First download the gist to a file, then upload it to an S3 bucket of your choice.
- Using the AWS EMR Console, create a cluster and choose advanced options.
- In Step 1, make sure you check the Spark x.x.x checkbox if you want to use the sparklyr library in RStudio. You can change the Spark version by choosing a different EMR release version.
- In Step 3 you can configure your bootstrap actions. Choose Configure and add a Custom action.
- For the Name, you can fill in something like Install RStudio Server.
- For the Script location, point to where you uploaded the gist (e.g. s3://my-bucket/emr/bootstrap/install-rstudio-server.sh).
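
If you would rather script these console steps, the following is a minimal sketch using boto3. It is not part of the gist. The helper names (`build_cluster_request`, `launch`), the cluster name, the instance types and count, and the default EMR role names are all assumptions for illustration, so adjust them to your account. The release label is left as a parameter because it depends on which Spark version you want.

```python
def build_cluster_request(script_s3_uri, release_label, name="rstudio-cluster"):
    """Build keyword arguments for the EMR client's run_job_flow call.

    Hypothetical helper: mirrors the console steps (Spark checked,
    one custom bootstrap action pointing at the uploaded script).
    """
    if not script_s3_uri.startswith("s3://"):
        raise ValueError("script location must be an s3:// URI")
    return {
        "Name": name,
        "ReleaseLabel": release_label,  # the EMR release chosen in Step 1
        "Applications": [{"Name": "Spark"}],  # needed for sparklyr
        "BootstrapActions": [
            {
                "Name": "Install RStudio Server",
                "ScriptBootstrapAction": {"Path": script_s3_uri, "Args": []},
            }
        ],
        "Instances": {
            # Assumed sizes; pick whatever suits your workload.
            "MasterInstanceType": "m5.xlarge",
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumed default EMR roles; they must already exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(local_script, bucket, key, release_label):
    """Upload the bootstrap script to S3, then start the cluster."""
    import boto3  # imported here so the request builder works without boto3

    boto3.client("s3").upload_file(local_script, bucket, key)
    request = build_cluster_request(f"s3://{bucket}/{key}", release_label)
    return boto3.client("emr").run_job_flow(**request)["JobFlowId"]
```

Calling `launch("install-rstudio-server.sh", "my-bucket", "emr/bootstrap/install-rstudio-server.sh", release_label)` with the release label you would have picked in Step 1 should return the new cluster's ID, assuming your AWS credentials and region are already configured.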