@duluca
Last active December 9, 2023 11:36
Step-by-step Instructions to Setup an AWS ECS Cluster

Configuring AWS ECS to have access to AWS EFS

If you would like to persist data from your ECS containers, e.g. when hosting databases like MySQL or MongoDB with Docker, you need to mount the data directory of the database in the container to a volume that's not going to disappear when your container or, worse yet, the EC2 instance that hosts your containers is restarted or scaled up or down for any reason.

Don't know how to create your own AWS ECS Cluster? Go here!

New Cluster

Sadly, the EC2 provisioning process doesn't allow you to configure EFS during the initial config. After you create your cluster, follow the guide below.

New Task Definition for Web App

If you're using an Alpine-based Node server like duluca/minimal-node-web-server, follow this guide:

  1. Go to Amazon ECS
  2. Task Definitions -> Create new Task Definition
  3. Name: app-name-task, role: none, network: bridge
  4. Add container, name: app-name from before, image: URI from before, but append ":latest"
  5. Soft limit, 256 MB for Node.js
  6. Port mappings, Container port: 3000
  7. Log configuration: awslogs; app-name-logs, region, app-name-prod

New Task Definition for Database

If you're hosting a lightweight database like mongo or excellalabs/mongo:

  1. Go to Amazon ECS
  2. Task Definitions -> Create new Task Definition
  3. Name: mongodb-task, role: none, network: bridge
  4. Add container, name: mongodb-prod, image: mongo or excellalabs/mongo, append a version number like ":3.4.7"
  5. Soft limit, 1024 MB
  6. Port mappings, Container port: 27017
  7. Log configuration: awslogs; mongodb-prod-logs, region, mongodb-prod
  8. Add Env Variables, see excellalabs/mongo repo for details
MONGODB_ADMIN_PASS
MONGODB_APPLICATION_DATABASE
MONGODB_APPLICATION_PASS
MONGODB_APPLICATION_USER

It is not a security best practice to store such secrets in an unencrypted form. If you'd like to do it the right way, here's your homework: https://aws.amazon.com/blogs/compute/managing-secrets-for-amazon-ecs-applications-using-parameter-store-and-iam-roles-for-tasks/

  9. Then create a new service based on this task definition. Under Deployment Options, make sure that Minimum healthy percent is 0 and Maximum percent is 100. You don't ever want two separate Mongo instances mounted to the same data source.
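
If you prefer the AWS CLI over the console, the service from the last step could be created roughly like this. This is a minimal sketch, not the method this guide was written with: the function name is made up for illustration, and a desired count of 1 is an assumption based on the "never two Mongo instances" rule above.

```shell
# Create the MongoDB service so that ECS stops the old task before it starts
# a new one (minimum healthy 0%, maximum 100%).
# create_mongodb_service is a hypothetical helper name.
create_mongodb_service() {
  # $1 = cluster name, $2 = service name, $3 = task definition family
  aws ecs create-service \
    --cluster "$1" \
    --service-name "$2" \
    --task-definition "$3" \
    --desired-count 1 \
    --deployment-configuration "maximumPercent=100,minimumHealthyPercent=0"
}

# Usage (needs configured AWS credentials):
#   create_mongodb_service prod-ecs-cluster mongodb-prod mongodb-task
```

With these percentages a deployment briefly takes the database offline, which is the price of never having two instances write to the same data directory.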

Existing ECS Cluster with Existing Task Definition for Container

Create a new KMS encryption key

If you would like to encrypt your file system at-rest, then you must have a KMS key.

If not, you may skip this section, but it is strongly recommended that you encrypt your data - no matter how unimportant you think your data is at the moment.

  1. Head over to IAM -> Encryption Keys
  2. Create key
  3. Provide Alias and a description
  4. Tag with 'Environment': 'production'
  5. Carefully select 'Key Administrators'
  6. Uncheck 'Allow key administrators to delete this key.' to prevent accidental deletions
  7. Key Usage Permissions
  8. Select the 'Task Role' that was created when configuring your AWS ECS Cluster. If you don't have one, see the Create Task Role section in the guide linked above. You'll need to update existing task definitions, and update your service with the new task definition, for the changes to take effect.
  9. Finish
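
For reference, here is a rough CLI sketch of the key creation above. It only covers the key, its alias and the 'Environment' tag. Key administrators and key usage permissions live in the key policy and are not handled here. The function name and arguments are assumptions for illustration.

```shell
# Create a KMS key tagged Environment=production and give it an alias.
# Prints the new key ID. create_efs_kms_key is a hypothetical helper name.
create_efs_kms_key() {
  # $1 = alias name without the "alias/" prefix, $2 = description
  kms_key_id=$(aws kms create-key \
    --description "$2" \
    --tags TagKey=Environment,TagValue=production \
    --query KeyMetadata.KeyId --output text) || return 1
  aws kms create-alias \
    --alias-name "alias/$1" \
    --target-key-id "$kms_key_id" || return 1
  printf '%s\n' "$kms_key_id"
}

# Usage (needs configured AWS credentials):
#   create_efs_kms_key efs-prod "Key for EFS encryption at rest"
```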

Create a new EFS

  1. Launch EFS
  2. Create file system
  3. Select the VPC that your ECS cluster resides in
  4. Select the AZs that your container instances reside in
  5. Next
  6. Add a name
  7. Enable encryption (You WANT this -- see above)
  8. Create File System
  9. Back on the EFS main page, expand the EFS definition, if not already expanded
  10. Copy the DNS name
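
The same steps can be sketched with the AWS CLI. The DNS name you copy in the last step follows the pattern file-system-id.efs.region.amazonaws.com, so it can also be derived from the file system ID. All function names, the creation token and the IDs in the usage lines are placeholders.

```shell
# Create an encrypted file system and print its ID.
create_encrypted_efs() {
  # $1 = creation token (any unique string), $2 = KMS key ID or alias, $3 = name
  aws efs create-file-system \
    --creation-token "$1" \
    --encrypted \
    --kms-key-id "$2" \
    --tags "Key=Name,Value=$3" \
    --query FileSystemId --output text
}

# Add a mount target in one subnet; repeat once per AZ your instances live in.
create_efs_mount_target() {
  # $1 = file system ID, $2 = subnet ID, $3 = security group ID
  aws efs create-mount-target \
    --file-system-id "$1" --subnet-id "$2" --security-groups "$3"
}

# Build the DNS name of a file system from its ID and region.
efs_dns_name() {
  printf '%s.efs.%s.amazonaws.com\n' "$1" "$2"
}

# Usage:
#   fs_id=$(create_encrypted_efs prod-efs alias/efs-prod prod-efs)
#   efs_dns_name "$fs_id" us-east-1
```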

Update Your CloudFormation Template

  1. CloudFormation
  2. Select EC2ContainerService-cluster-name
  3. View/edit design template
  4. Modify the YML to add EfsUri amongst the input parameters
  EfsUri:
    Type: String
    Description: >
      DNS name of the EFS volume to mount on your EC2 instances. Mount point -> /efs
    Default: ''
  5. Find EcsInstanceLc and update its UserData property to look like:
UserData: !If
        - SetEndpointToECSAgent
        - Fn::Base64: !Sub |
           #!/bin/bash
           # Install efs-utils
           cloud-init-per once yum_update yum update -y
           cloud-init-per once install_nfs_utils yum install -y amazon-efs-utils
        
           # Create /efs folder
           cloud-init-per once mkdir_efs mkdir /efs
        
           # Mount /efs, ensuring a TLS connection
           cloud-init-per once mount_efs echo -e '${EfsUri}:/ /efs efs tls,_netdev 0 0' >> /etc/fstab
           mount -a
           echo ECS_CLUSTER=${EcsClusterName} >> /etc/ecs/ecs.config
           echo ECS_BACKEND_HOST=${EcsEndpoint} >> /etc/ecs/ecs.config
        - Fn::Base64: !Sub |
           #!/bin/bash
           # Install efs-utils
           cloud-init-per once yum_update yum update -y
           cloud-init-per once install_nfs_utils yum install -y amazon-efs-utils
        
           # Create /efs folder
           cloud-init-per once mkdir_efs mkdir /efs
        
           # Mount /efs, ensuring a TLS connection
           cloud-init-per once mount_efs echo -e '${EfsUri}:/ /efs efs tls,_netdev 0 0' >> /etc/fstab
           mount -a
           echo ECS_CLUSTER=${EcsClusterName} >> /etc/ecs/ecs.config
  6. Validate the template
  7. Save the template to S3 and copy the URL
  8. Select your CloudFormation stack again -> Update stack
  9. Paste in the S3 URL -> Next
  10. Now you'll see an EfsUri parameter, define it using the DNS name copied from the previous part
  11. On the review screen make sure it is only updating the Auto Scaling Group (ASG) and the Launch Configuration (LC)
  12. Let it update the stack
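
The UserData script above writes one line to /etc/fstab and relies on cloud-init-per to do so only once. If you want to try the same thing by hand on an instance, the sketch below builds the identical line and appends it only when it is missing. Both function names are made up for illustration.

```shell
# Build the fstab entry used in the UserData script: file system type "efs"
# with the "tls" option, so the mount goes through the amazon-efs-utils helper.
efs_fstab_line() {
  # $1 = EFS DNS name (the EfsUri parameter), $2 = mount point on the host
  printf '%s:/ %s efs tls,_netdev 0 0\n' "$1" "$2"
}

# Append a line to a file unless that exact line is already present.
append_line_once() {
  # $1 = file, $2 = line
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as root on the container instance):
#   append_line_once /etc/fstab "$(efs_fstab_line "$EFS_DNS_NAME" /efs)"
#   mount -a
```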

And Now, The Fun Part -- Updating Your ECS Instances

  1. ECS -> Cluster
  2. Switch to ECS Instances tab

There are two paths forward here. One is the sledgehammer, which will bring down your applications:

  3. Scale ECS instances to 0. Note: this is the part where your applications come down
  4. After all instances have been brought down, scale back up to 2 (or more)

Or perform a rolling update, which will keep your application alive:

  5. Click on the EC2 instance and on the EC2 dashboard, select Actions -> State -> Terminate
  6. Wait while the instance is terminated and reprovisioned
  7. Rinse and repeat for the next instance
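
Both paths can also be driven from the AWS CLI. The sketch below is one way to do it, under two assumptions: you know the name of the Auto Scaling group that CloudFormation created for your cluster, and you are fine with its minimum size being set to 0 (otherwise a desired capacity of 0 is rejected). The function names are hypothetical.

```shell
# Sledgehammer: set the number of container instances in the Auto Scaling group.
scale_cluster_instances() {
  # $1 = Auto Scaling group name, $2 = desired instance count
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$1" \
    --min-size 0 \
    --desired-capacity "$2"
}

# Rolling update: terminate one instance and wait until it is gone.
# The Auto Scaling group then launches a replacement with the new UserData.
replace_instance() {
  # $1 = EC2 instance ID
  aws ec2 terminate-instances --instance-ids "$1" &&
  aws ec2 wait instance-terminated --instance-ids "$1"
}

# Usage:
#   scale_cluster_instances my-asg-name 0   # applications go down here
#   scale_cluster_instances my-asg-name 2
```

For the rolling path, wait until the replacement instance shows up in the ECS Instances tab before terminating the next one.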

Update Task Definition to Mount to the EFS Volume

  1. ECS -> Task definitions
  2. Create new revision
  3. If you have not already added it, make sure the Role here matches the one for the KMS key
  4. Add volume
  5. Name: 'efs', Source Path: '/efs/your-dir' (If this doesn't work try '/mnt/efs/your-dir')
  6. Add
  7. Click on container name, under Storage and Logs
  8. Select mount point 'efs'
  9. Provide the internal container path, e.g. for MongoDB the default is '/data/db'
  10. Update
  11. Create
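
In JSON terms, the console steps above add a host volume and a mount point to the task definition. Here is a sketch of what the MongoDB task definition from earlier might look like with the volume in place. The task role and the environment variables are left out because their values are specific to your account, and the function name is an assumption.

```shell
# Print a task definition that bind-mounts a directory on the EFS mount
# into the container at MongoDB's default data path.
mongodb_task_definition_json() {
  # $1 = host directory on the EFS mount, $2 = AWS region of the log group
  cat <<EOF
{
  "family": "mongodb-task",
  "networkMode": "bridge",
  "volumes": [
    { "name": "efs", "host": { "sourcePath": "$1" } }
  ],
  "containerDefinitions": [
    {
      "name": "mongodb-prod",
      "image": "mongo:3.4.7",
      "memoryReservation": 1024,
      "portMappings": [ { "containerPort": 27017 } ],
      "mountPoints": [
        { "sourceVolume": "efs", "containerPath": "/data/db" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "mongodb-prod-logs",
          "awslogs-region": "$2",
          "awslogs-stream-prefix": "mongodb-prod"
        }
      }
    }
  ]
}
EOF
}

# Usage:
#   mongodb_task_definition_json /efs/your-dir us-east-1 > task.json
#   aws ecs register-task-definition --cli-input-json file://task.json
```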

Update ECS Service with the new Task Definition

  1. ECS -> Clusters
  2. Click on Service name
  3. Update
  4. Type in the new task definition name
  5. Update service
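
The CLI equivalent is short. A minimal sketch, with a made-up function name: passing only the task definition family makes ECS use the latest active revision, and the wait command blocks until the service settles.

```shell
# Point the service at a task definition and wait until it is stable again.
deploy_task_revision() {
  # $1 = cluster, $2 = service, $3 = task definition (family or family:revision)
  aws ecs update-service --cluster "$1" --service "$2" --task-definition "$3" &&
  aws ecs wait services-stable --cluster "$1" --services "$2"
}

# Usage:
#   deploy_task_revision prod-ecs-cluster mongodb-prod mongodb-task
```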

Your service should re-provision the existing containers and voila, you're done!

Last, But Not Least -- Test

Test what you have done.

Go ahead and save some data.

Then scale your EC2 instance count down to 0 (the sledgehammer), scale it back up again and see if the data is still accessible.

Setting up Your Own AWS ECS Cluster

This is a multi-step configuration -- easy mistakes are likely. Be patient! The pay-off will be worth it. Rudimentary knowledge and awareness of the AWS landscape is not necessarily required, but will make it easier to set things up.

Enable fantastic Blue-Green deployments with npm scripts for AWS ECS.

Some of the instructions reference package.json. These are meant for users of npm scripts for AWS ECS. If you are not using them, you may safely ignore these steps.

Creating Amazon ECS Infrastructure

Create a new IAM role

If you plan on having multiple clusters (which is likely to happen at some point), then you should give each cluster its own IAM role to prevent any future unintended or malicious access to AWS resources.

  1. IAM -> Roles
  2. Create new role
  3. Select Amazon EC2
  4. Select AmazonEC2ContainerServiceforEC2Role policy -> Next
  5. Name: prod-ecs-instanceRole
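
If you'd rather script this, the sketch below does roughly what the console does: it creates a role that EC2 is allowed to assume, attaches the managed policy named above, and wraps the role in an instance profile of the same name (the console creates that profile for you). The function names are made up for illustration.

```shell
# Trust policy that lets EC2 instances assume the role.
ec2_trust_policy_json() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role, attach the ECS container instance policy and
# create an instance profile with the same name.
create_ecs_instance_role() {
  # $1 = role name, e.g. prod-ecs-instanceRole
  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "$(ec2_trust_policy_json)" &&
  aws iam attach-role-policy --role-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role &&
  aws iam create-instance-profile --instance-profile-name "$1" &&
  aws iam add-role-to-instance-profile --instance-profile-name "$1" --role-name "$1"
}

# Usage:
#   create_ecs_instance_role prod-ecs-instanceRole
```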

Create Cluster

  1. Go to Amazon ECS
  2. Clusters -> Create Cluster
  3. Name: prod-ecs-cluster
  4. On-Demand Instance
  5. 2 m4.large instances across two AZs for highly available config
  6. Create new prod-vpc
  7. Create new prod-security-group
  8. Allow port 80 and 443 for HTTP and HTTPS inbound
  9. Allow port range 32768-61000 so that ECS can dynamically map container ports as it scales tasks and the load balancer can run health checks
  10. Container instance IAM role: select 'prod-ecs-instanceRole' that you just created, otherwise select 'ecsInstanceRole'
  11. Create

Verify Security Group Config

This is a big deal.

  1. Go to EC2 -> Network & Security -> Security Groups
  2. Verify these ports are open:

| Type | Protocol | Port Range | Source |
| --- | --- | --- | --- |
| HTTP (80) | TCP (6) | 80 | 0.0.0.0/0 |
| HTTP (80) | TCP (6) | 80 | ::/0 |
| Custom TCP Rule | TCP (6) | 32768-61000 | 0.0.0.0/0 |
| HTTPS (443) | TCP (6) | 443 | 0.0.0.0/0 |
| HTTPS (443) | TCP (6) | 443 | ::/0 |
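
Since this is a big deal, it may be worth checking from the command line too. The sketch below lists the inbound rules of a security group as "protocol from-port to-port" lines and checks whether a given range is covered by a single TCP rule. It ignores the source column, so it only tells you that a port is open to someone. Both function names are assumptions.

```shell
# Print the inbound rules of a security group, one "protocol from to" per line.
list_open_ports() {
  # $1 = security group ID
  aws ec2 describe-security-groups --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissions[].[IpProtocol,FromPort,ToPort]' \
    --output text
}

# Read "protocol from to" lines on stdin and succeed if one TCP rule
# covers the whole requested range.
port_range_is_open() {
  # $1 = first port, $2 = last port
  awk -v from="$1" -v to="$2" \
    '$1 == "tcp" && $2 <= from && $3 >= to { found = 1 } END { exit !found }'
}

# Usage (the group ID is a placeholder):
#   list_open_ports sg-0123456789abcdef0 | port_range_is_open 32768 61000 && echo ok
```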

Create Container Repository

  1. Go to Amazon ECS
  2. Repositories -> Create Repository
  3. Enter your app-name
  4. Copy repository URI, add to package.json "imageRepo": "000000000000.dkr.ecr.us-east-1.amazonaws.com/app-name"
  5. Create

Create Task Definition

  1. Go to Amazon ECS
  2. Task Definitions -> Create new Task Definition
  3. Name: app-name-task, role: none, network: bridge
  4. Add container, name: app-name from before, image: URI from before, but append ":latest"
  5. Soft limit, 256 MB for Node.js
  6. Port mappings, Container port: 3000
  7. Log configuration: awslogs; app-name-logs, region, app-name-prod

Create ELB

  1. Go to Amazon EC2
  2. Load Balancers -> Create Load Balancer
  3. Application Load Balancer
  4. Name: app-name-prod-elb
  5. Add listener: HTTPS, 443
  6. AZs, select prod-vpc, select all
  7. Tags -> Domain, app-name.yourdomain.com
  8. Next
  9. Choose or create SSL cert (star is recommended: add *.yourdomain.com and yourdomain.com separately on the cert)
  10. Select default ELB security policy
  11. Next
  12. Create prod-cluster specific security group only allowing port 80 and 443 inbound
  13. Next
  14. New target group, name: app-name
  15. Health checks: Keep the default "/" if serving a website on HTTP. If you are deploying an API and/or redirecting all HTTP calls to HTTPS, ensure your app defines a custom route that is not redirected to HTTPS. For example, on the HTTP server, GET "/healthCheck" returns a simple 200 response saying "I'm healthy". Verify that this route does not redirect to HTTPS; otherwise health checks on AWS will fail and lots of pain and suffering will occur.
  16. DO NOT REGISTER ANY TARGETS: ECS will do this for you, if you do so yourself, you will provision a semi-broken infrastructure
  17. Next: Review, then Create
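
To verify the health check route before AWS does, you can ask curl for the raw status code. curl does not follow redirects unless you pass -L, so a redirect to HTTPS shows up as a 301 or 302 rather than a 200. This is a minimal sketch: the function names are made up, and it assumes the target group expects the default success code of 200.

```shell
# Print only the HTTP status code returned by the health route.
health_check_status() {
  # $1 = plain-HTTP URL of the health route
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Succeed only if the route answers 200 directly, without redirecting.
is_health_check_ok() {
  status=$(health_check_status "$1")
  [ "$status" = "200" ]
}

# Usage (against a locally running container):
#   is_health_check_ok http://localhost:3000/healthCheck || echo "health check would fail"
```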

Create Service

  1. Go to Amazon ECS
  2. Clusters -> Select "prod-ecs-cluster"
  3. Task Definition: app-name-task from before
  4. Service name: app-name
  5. Number of tasks: 2, minimum healthy percent: 100, maximum percent: 200 for highly available blue/green deployment setup
  6. Configure ELB
  • Application Load Balancer
  • ecsServiceRole
  • Select app-name-prod-elb from before
  • Select app-name:0:3000 container from before
  • Add to ELB
  • Target Group Name: app-name from before
  • Save
  7. Create Service
  8. View Service
  9. Verify information
  10. Build image with npm run image:build
  11. Publish and release image with npm run aws:publish
  12. On the Service Events tab, keep an eye on health check errors
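
To see what the two percentages mean in practice, here is a small sketch of the arithmetic. ECS rounds the minimum up and the maximum down. The function name is made up for illustration.

```shell
# Print how many tasks must stay running and how many may run at most
# while a deployment is in progress.
deployment_bounds() {
  # $1 = desired task count, $2 = minimum healthy percent, $3 = maximum percent
  min_tasks=$(( ($1 * $2 + 99) / 100 ))   # rounded up
  max_tasks=$(( $1 * $3 / 100 ))          # rounded down
  printf '%s %s\n' "$min_tasks" "$max_tasks"
}

deployment_bounds 2 100 200   # prints "2 4"
deployment_bounds 1 0 100     # prints "0 1"
```

With 2 tasks at 100/200, ECS starts up to 2 new tasks next to the 2 old ones and only then drains the old ones, which is what makes the blue/green style deployment possible. At 0/100 with a single task, the old task is stopped before the new one starts.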

Update package.json

"awsRegion": "us-east-1",
"awsEcsCluster": "prod-ecs-cluster",
"awsService": "app-name"

Setup Logs

  1. CloudWatch -> Logs
  2. Create Log group
  3. app-name-logs  

Route 53 DNS Update

If you don't use Route 53, don't panic. Just create a CNAME record pointing to the ELB's DNS name and you're done.

  1. hosted zone
  2. select domain
  3. create record set
  4. alias 'yes'
  5. Select ELB App load balancer from the list
  6. Create
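
The alias record from these steps can also be expressed as a Route 53 change batch. A sketch under a few assumptions: the function name is made up, and you have looked up the load balancer's DNS name and its canonical hosted zone ID (both are shown in the load balancer's details).

```shell
# Print a change batch that creates or updates an alias A record
# pointing at the load balancer.
alias_change_batch_json() {
  # $1 = record name, $2 = load balancer DNS name,
  # $3 = hosted zone ID of the load balancer (not of your domain)
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$1",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "$3",
          "DNSName": "$2",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF
}

# Usage (zone ID and DNS name variables are placeholders you set yourself):
#   aws route53 change-resource-record-sets \
#     --hosted-zone-id "$YOUR_DOMAIN_ZONE_ID" \
#     --change-batch "$(alias_change_batch_json app-name.yourdomain.com "$ELB_DNS" "$ELB_ZONE_ID")"
```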

Phew!!

Now what?

Now you need to deploy an application on your newly-minted cloud infrastructure. Enable fantastic Blue-Green deployments with [npm scripts for AWS ECS](https://gist.github.com/duluca/2b67eb6c2c85f3d75be8c183ab15266e#file-npm-scripts-for-aws-ecs-md).

Then what?

Go to the ELB DNS address and see if your app works. If you used Route 53 to connect your domain with your ELB or through your own DNS provider, then go to the URL and see if things work.

I Would Like to Persist Data

If you'd like to persist data in your containers via Docker volume mounting, then configure EFS. See this guide.

Troubleshooting

  1. ELB DNS works, but URL doesn't? Your DNS configuration is wrong.
  2. ELB DNS doesn't work. Then check the health of your ECS Service, see step 3 below.
  3. Go to ECS -> Your Cluster -> click on Your Service and switch to the Events tab. If you don't see the message "service your-app has reached a steady state.", then your container is having trouble starting or AWS is failing to perform a health check.
  4. To see what's wrong with your container, go to the CloudWatch Logs you set up earlier and you'll be able to see the console logs of your application.
  5. Service is healthy, logs look fine. Things still don't work? Then re-check security group port rules, target group port rules and any AWS IAM role you may have set up that may be overriding some default behavior that hasn't been covered.
  6. Call someone who knows better :)
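
Step 3 can be checked from the command line as well. The sketch below fetches the most recent service events and looks for the steady state message. The function names are assumptions, and five events is an arbitrary cut-off.

```shell
# Print the messages of the five most recent events of a service.
recent_service_events() {
  # $1 = cluster, $2 = service
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].events[:5].message' --output text
}

# Succeed if one of the recent events reports a steady state.
reached_steady_state() {
  recent_service_events "$1" "$2" | grep -q 'has reached a steady state'
}

# Usage:
#   reached_steady_state prod-ecs-cluster app-name || recent_service_events prod-ecs-cluster app-name
```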
@duluca

duluca commented Jul 31, 2019

All these little nuances if you read mountains of documentation... I'm dying to know why it isn't secure by default from amazon!

Their consulting arm needs to make money somehow :) On a more serious note, I think it is easier/cheaper to develop an infinitely configurable/flexible tool vs something that makes sense by default and you can just use without much worry. I could see it easily increasing the QA and maintenance burden on AWS by a wide margin.

@disciplezero

lol. I can see applications with high volume of non-sensitive throughput not wanting to incur extra cpu load for TLS.

@duluca

duluca commented Jul 31, 2019

For sure! And it's probably the exception to how most people would use it - cheap & fast by default vs secure by default. It's their mindset and probably at AWS scale a costly one to argue about.

@disciplezero

disciplezero commented Aug 27, 2019 via email

@disciplezero

@duluca - There was an email reply that I don't see here. Missed a value in the mount command. Note the efs filesystem type there.

           cloud-init-per once mount_efs echo -e '${EfsUri}:/ /efs efs tls,_netdev 0 0' >> /etc/fstab

@duluca

duluca commented Aug 27, 2019

@disciplezero done, thank you!

@AdrienPoupa

On Alpine, make sure that you installed nfs-utils otherwise the folder will be mounted but shown as empty

@duluca

duluca commented Nov 14, 2019

@AdrienPoupa Did you also install amazon-efs-utils?

@AdrienPoupa

AdrienPoupa commented Nov 14, 2019

Yes, it was installed in the ECS host but not in the container. I needed to install nfs-utils in the container.
