title | date | category |
---|---|---|
Complete end-to-end guide for developing dockerized lambda in Typescript, Terraform and SAM CLI | 2021-03-13T09:00:00.009Z | development |

This is a full guide to locally developing and deploying a backend app with the recently released container image support for Lambda on AWS.
Needless to say, if you are a fan of Docker, you already know how amazing it is: what you test locally is what you get when you deploy, at least at the container level.
Since this feature is quite new, I fell into lots of rabbit holes, and I'm sure others will too, so I will walk through each one so that nobody else gets trapped. This guide starts from the real basics, like making a user and setting up Terraform, so feel free to skip to the section you need.
Reason to use Terraform and SAM CLI together
Well, it seems that Terraform supports building a Docker image and deploying it to ECR out of the box, but after lots of digging, I noticed that things get simpler if I just build the Docker image in a separate pipeline and deploy it with a few lines of shell script. So Terraform will be used to define resources only, excluding the build and deployment process. There's no problem with that.
And what about SAM CLI? Terraform cannot replace SAM CLI and vice versa. SAM CLI is useful for developing lambdas locally because it automatically configures endpoints for each lambda and removes most of the barriers to the initial setup. Since lambda functions are 'special' in that they only get 'booted up and called' when they are invoked (unlike EC2 or Fargate), just running `ts-node my-lambda.ts` would not make it work. Of course there are other solutions to this (such as `sls`), but in this guide I will just use SAM CLI. That said, SAM has enough rough edges that I would consider a better alternative if there is one. The reason follows right below.
A disclaimer for people looking for how to 'hot-reload' a Dockerfile for a TypeScript or JavaScript based lambda: it won't work smoothly as of now. The best bet is to use `nodemon` to watch a directory and trigger `sam build` on every change, and to launch `sam local start-api` in another shell. It works as expected, but the problem is that every `sam build` creates yet another Docker image, so useless dangling images pile up on your drive, and you need to delete them manually because SAM CLI does not support passing in an argument equivalent to `docker run --rm`. That's why I might want to try some other solution. More on this in the relevant issue on GitHub. Please let me know if any of you have had a good experience with `sls`, because I haven't used it much yet.
Ok. Now let's write some code.
Setup AWS for Terraform
First, make sure that the AWS CLI is installed and authorized. Installing the AWS CLI is out of scope here, so please follow the guide on AWS.
After successful installation, run:
https://gist.github.com/fbb8f9f76c0b62dd88511e06de8c21f6
You will be prompted to enter an Access Key ID and a Secret Access Key. There are different ways to handle this depending on your situation, but for the sake of simplicity we can make one user from the AWS console (you will probably only use it for 'Programmatic access') that has the following policies for applying Terraform code.
This one for setting S3 bucket as a backend:
https://gist.github.com/af74e227766aa3e086f6a5c683265249
And this one for locking the state:
https://gist.github.com/3bc1f4c573a6a0dbe560ed790ace6030
The next one is quite tricky: we will temporarily enable permissions for managing IAM, because we first need to make a role that we can assume (via `assumeRole`) whenever we plan and apply our IaC.
For now we can just go to the AWS console and make this policy:
https://gist.github.com/30cb616890b68fde49b42dfe45f20bf4
Make sure to narrow this policy down to the specific actions and resources you actually use once everything is done.
Now that you've made three distinct policies (or one combined policy, depending on your preference), attach them to the user that you've just created for running `aws configure`.
Setup Terraform
If you haven't already, install Terraform by following the instructions on the official website. Just download the binary and move it to your `bin` folder.
Now verify the Terraform version:
https://gist.github.com/8ef783f624f3c57f5380e6c23f33757a
And then make a `main.tf` file in your project directory (I personally put it into an `IaC` folder because there will be another folder for the 'real' `.ts` code for the backend):
main.tf
https://gist.github.com/c44a51ffd046960ff51e23908ca07574
Now, run `terraform init`:
https://gist.github.com/0fa379d1c4bd91915c294ba83695147d
Then, we will need to add the S3 backend and state locking. But before that, make a bucket on S3 and a table on DynamoDB, for hosting the IaC backend and locking the state respectively.
Now we will need to add more to the policy on DynamoDB we created because we want to create a table:
https://gist.github.com/414644ea59ed85e0023f89c644055fa6
Then you could write this code (by the way, it may be a good idea to put the IaC below in a different, general-purpose repository, because the current repository is meant to be used only for lambda-related resources. But for the sake of this article I will just write it here):
https://gist.github.com/d6f35cb57810f2343990b5be9e03ca10
And then, also add the S3 bucket for the backend (you will need to add the relevant IAM policies here too, but since we already know how to do that, I will skip the explanation):
https://gist.github.com/c764ec6bf619881240ab1c4bc385de9e
Now, run `terraform apply`, verify the changes, and enter `yes`. The DynamoDB table and S3 bucket should now have been created. Here's the code so far:
main.tf
https://gist.github.com/3348dd4adfde14c85671f5965e4b17e8
Now, add s3 backend and state lock:
https://gist.github.com/081d46a4583d202a63481b17dd1983d1
We are also going to use Docker provider, so add that too:
main.tf
https://gist.github.com/f0c1aae58a9bccc1aaa57dfcd20a656a
Now, because you've added a backend and another provider, you will need to run `terraform init` again, and then `terraform apply`.
Setting up lambda
Now we will need to develop lambda on the local machine. Install SAM CLI:
https://gist.github.com/2225b4049d0fa7dd9a515c95b2b099cf
Note that outdated versions do not support running Docker containers, so make sure that you are on the latest version.
https://gist.github.com/2f8443ff6ff4e94aa8a6643c1108ce53
Now, we won't run `sam init`, because it would make it difficult to structure the server as a monorepo. The reason we want a monorepo is that it makes it much easier to properly dockerize every single lambda and deploy it with only the dependencies that it requires. Instead we will use lerna to initialize the server folder.
https://gist.github.com/9d57d61cb3139b68acbceef3feb60bb5
https://gist.github.com/a57eb4d84c655fe96e3b24c7ba45f5ff
And as usual:
https://gist.github.com/4da6a67a79beb6a379f69402d2617821
Then it will give you this layout:
https://gist.github.com/42f98c154fad78c4a02c0154d019180e
Then, add your first function package. For the sake of this example, let's assume that we want to make a REST API, composed of many lambdas, each returning 'hello' as a response in different languages (which is totally useless in reality, but at least useful here). Our first lambda will be an English one.
https://gist.github.com/5736cdfa3e7fbd3fb4871a4e3e906c6d
Now the directory structure will look like this:
https://gist.github.com/e7c2a09508e4195bbbd1e5390fd5e023
Now, under `server`, we will need to add some utilities to build and invoke the function locally. Modify `server/package.json` as follows and, of course, run `npm i` again:
https://gist.github.com/bd57dcec4a6bf41032b6b460cc86b71f
To add some explanation of what we are trying to do: these `devDependencies` are going to be package-wide dependencies. They are not specific to any one of the functions that we are going to build; they help with general tooling. That's why we put them here.
Dependencies:
- `@types/node`: we will need this to give proper type definitions for built-in Node modules like `fs` or `path`.
- `concurrently`: just a script runner.
- `lerna`: you know it.
- `nodemon`: this will help us watch a directory and build the Docker image again.

Scripts:
- `start`, `watch`, `api`: we will need these to launch our lambda function locally and invoke it.
Now, you need to create `template.yml` for SAM CLI to consume and run what we want to run.
https://gist.github.com/16f2a64b2dcbf67cf0aa2f2baca002be
We won't be able to run `sam build` or `sam local start-api` yet, because we still need to set up the `Dockerfile` and the ECR repository.
So far we have added `template.yml` for running SAM CLI:
https://gist.github.com/c37344966dc040456fca874d0dbc7cb0
Now we will add a `Dockerfile` in `packages/hello/`.
https://gist.github.com/9a66944a48ef8791770d19e2fd1c2791
This will be the content for Dockerfile:
https://gist.github.com/a0708ef80b15f6c396a99810aacf0a7a
To go through it line by line:
- `amazon/aws-lambda-nodejs:14` is the official Amazon image for lambda. The current LTS of Node.js is 14, so we are using this.
- `AS builder` is related to multi-stage builds in Docker; it helps reduce the final Docker image size. Basically, in this `builder` stage we only build the output to be included in the final image, and any dependencies installed in this step won't be included in the final output image.
- `WORKDIR /usr/app`: inside the Docker image, set the working directory to `/usr/app`. There isn't any `app` folder in a normal Docker image, so this creates the `app` directory. We will put the compiled JS code there.
- `COPY package*.json tsconfig.json ./`: we need these files for compiling TypeScript into JavaScript files.
- `npm install`: installs dependencies.
- `npm run build`: compiles the TypeScript code into JS.
- `RUN ls -la # for debugging`: this is merely for debugging. While building, Docker outputs what's inside the directory at that point, so you can verify that you are doing what you intended to do.
- `FROM amazon/aws-lambda-nodejs:14`: this is the second build stage in Docker. All outputs from the previous stage are discarded in this stage unless explicitly specified to be included.
- `RUN npm install --only=prod`: installs only `dependencies`, not `devDependencies`.
- `COPY --from=builder /usr/app/lib /usr/app/lib`: explicitly refers to the previous `builder` stage to copy whatever was inside `/usr/app/lib` to the current `/usr/app/lib`. In this case, it copies all the compiled JavaScript code.
- `CMD [ "/usr/app/lib/index.handler" ]`: the command should be `path-to-lambda-handler-without-extension.handler`. That's just how it works.
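To make the `CMD` naming rule concrete, here is a minimal sketch in TypeScript. The helper name `toLambdaCmd` is hypothetical and not part of any AWS tooling; it only illustrates how the compiled file path and the exported function name combine into the string used in the Dockerfile.

```typescript
// Hypothetical helper (an assumption for illustration, not an AWS API):
// derives the CMD value from the compiled file path and the exported function name.
function toLambdaCmd(compiledFile: string, exportName: string = "handler"): string {
  // Drop the trailing file extension, e.g. ".js"
  const withoutExtension = compiledFile.replace(/\.[^./]+$/, "");
  // Lambda expects "<path-without-extension>.<exported function name>"
  return `${withoutExtension}.${exportName}`;
}

console.log(toLambdaCmd("/usr/app/lib/index.js")); // prints "/usr/app/lib/index.handler"
```

So if you rename the exported function or move the compiled file, the `CMD` line in the Dockerfile has to change accordingly.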
Now that we've added a Dockerfile, let's set up a basic environment for the lambda:
https://gist.github.com/2328b01be3ec1c2d2e6d955e111f9ba1
You will need to modify `tsconfig` to use modern JavaScript features. Most prominently, add the following, which will allow you to use the `Promise` API. I also recommend turning on other options, especially those related to strict type checking:
https://gist.github.com/570689747a0f9a185c3197a5df2b8f53
Modify `packages/hello/package.json` too. Note that any dependencies that need to be included in the final, compiled output code (JavaScript) must be added to `dependencies`, not `devDependencies`:
packages/hello/package.json
https://gist.github.com/e76c604d2068189f19d26284084113e3
Now, add a really simple lambda:
packages/hello/lib/index.ts
https://gist.github.com/d6d59dcc63564d757c4658f46d0b2c50
So far, we have created these: https://gist.github.com/41c610b04c74f88926c13469ca50224f
Now, create `nodemon.json` under `server/` to watch and build files:
nodemon.json
https://gist.github.com/e3465d7a00b06c5acb2177c3c7fffec2
After creating `nodemon.json`, you can start running `npm run watch` or `npm start`. This does two things: it rebuilds the Docker image whenever you make a change under the `packages/` directory, and it hosts a local endpoint for the lambda. It is similar to hot-reload, although it feels more like a hack; you won't need to cancel and rerun `sam local start-api` once you make a change. If it does not work, try again after creating the ECR repository first.
https://gist.github.com/7b142b79af42a3b2ed3ce8cc6bbca05f
Oh, and you can delete `__tests__` and `lib/hello.js` because we are not using them. Anyway, now we are more or less ready to build this function into a Docker image. Let's try it:
https://gist.github.com/9347874d3d93e6676b4e788e89b7d46a
Everything's cool, docker build succeeded. You can try running the image and test the request:
This is where SAM CLI should start to come in. But before that, we need to make an ECR repository with Terraform. Let's go back to Terraform for a while.
Back to terraform: assume role and ECR
Now, we need to create a role first, because we will rely on that role to get the permissions required to create whatever resources we want. This is called 'assuming a role', and the reason it's considered good practice is that you don't have to create multiple credentials (probably multiple users) for every task that requires permissions. Instead, you borrow the permissions for the period of time when you plan and apply changes to the resources.
So how do we do it? First, let's create `hello_role.tf`:
hello_role.tf
https://gist.github.com/9ba78bb7025951e4166a923381436172
For the sake of this article, we won't be diving deep into specific policies, so we will just allow almost all resources without specifying them in detail. For real-world usage, you will have to define exact statements giving just the right permissions.
What we are doing here, essentially, is allowing the `localtf` user to assume `hello_role`, which holds all the policies needed to run the hello server stack. This is called 'creating a trust relationship' (you will see this term if you do this process on the AWS console). This way, `localtf` doesn't always have to hold all the permissions it needs. It acquires them only when needed (i.e. when deploying).
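For readers who want to see what such a trust relationship looks like as data, here is a sketch of the trust policy document written as a TypeScript object. The function name `buildTrustPolicy` and the account ID `123456789012` are placeholders chosen for illustration; the actual policy in this article is defined in `hello_role.tf`, and its exact contents may differ.

```typescript
// Shape of an IAM trust policy that lets one IAM user assume a role.
interface TrustPolicy {
  Version: string;
  Statement: Array<{
    Effect: "Allow";
    Principal: { AWS: string };
    Action: string;
  }>;
}

// Hypothetical helper: accountId and userName are illustrative inputs.
function buildTrustPolicy(accountId: string, userName: string): TrustPolicy {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        // The principal is the user that is trusted to assume the role
        Principal: { AWS: `arn:aws:iam::${accountId}:user/${userName}` },
        // sts:AssumeRole is the action that "borrows" the role's permissions
        Action: "sts:AssumeRole",
      },
    ],
  };
}

console.log(JSON.stringify(buildTrustPolicy("123456789012", "localtf"), null, 2));
```

The policies attached to the role itself (what the role may do) are separate from this document, which only states who may assume it.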
Once you are done writing `hello_role.tf`, run `terraform apply` to make the changes.
Now, go back to `main.tf` and add:
https://gist.github.com/86352ca003e1bf874662650c0ce183d7
Once you add `assume_role`, you can create any resources you want using the permissions given by that role. Let's now make an ECR repository. Make `ecr.tf`:
ecr.tf
https://gist.github.com/418a60c7fce87754d4422295d7eae3e4
Run `apply` and Terraform will soon create the ECR repository.
Building and pushing the image to ECR
Now that we've made an ECR repository, we can go back to our server and write a little script to log in, build and push the image of the lambda we are writing.
It's pretty straightforward: get your AWS CLI ready, and authenticate so that you can use ECR from the Docker CLI:
login-docker.sh
https://gist.github.com/89434ed5c4044219a9741987e6f0fb63
Next, tag your Docker image twice: once as `latest` and once with a timestamp taken at build time. This way the `latest` tag always points to the most recently built image, while the timestamped image remains as a record that you can use later to roll back or for other purposes.
build-and-push-docker-image.sh
https://gist.github.com/676de137d8723c3c47e955bffbe1983d
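The tagging scheme can be sketched in TypeScript as follows. This is only an illustration of the idea: the helper name `buildImageTags`, the timestamp format, and the repository URI are assumptions, and the shell script above may format its timestamp differently.

```typescript
// Hypothetical helper: returns the two image references to push for one build.
function buildImageTags(repositoryUri: string, builtAt: Date): string[] {
  // Assumed format: UTC digits only, e.g. 2021-03-13T09:00:00.009Z -> 20210313090000
  const timestamp = builtAt.toISOString().replace(/\D/g, "").slice(0, 14);
  return [
    `${repositoryUri}:latest`, // always moves to the newest build
    `${repositoryUri}:${timestamp}`, // stays behind as a record for rollbacks
  ];
}

// The repository URI below is a placeholder, not a real repository.
console.log(
  buildImageTags(
    "123456789012.dkr.ecr.us-west-2.amazonaws.com/hello",
    new Date("2021-03-13T09:00:00.009Z")
  )
);
```

Because the timestamp is unique per build, older images are never overwritten, while `latest` is reassigned on every push.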
Now, you can test it out yourself as such:
https://gist.github.com/f1056ede743200b7d1a5686f83800ca8
After this, you will be able to see the images on AWS ECR:
As you can see, the images are tagged by timestamp, and the most recently built image is always tagged as `latest`. You are going to reference this tag in Terraform to apply the newly built Docker image to the lambda.
So far, we've made changes like so:
https://gist.github.com/1e57f3364ba4574e78ede0a39ebb6399
Creating Terraform resources for lambda and API gateway
Now the majority of the preparation is done, so we can move on to creating the actual lambda and API gateway.
lambda.tf
First, the most important part: you want to create lambda itself from Docker.
https://gist.github.com/1f9c8dedfb828d7316d2557348f6dd33
There is already a module made just for this, so use it to make the lambda. Once you are done, `apply` the changes.
The key here is that you are going to reference an image URI from ECR where the tag is the latest. If you have previously built and pushed a new docker image, the hash of the image is going to be different, thus causing a redeployment of the lambda function. Otherwise it has no idea if the docker image is new or not.
Again, after applying the change, you can see the lambda on the AWS console:
As you can see, compared to other lambdas, it has the package type 'Image', which means it comes not from a Zip but from a Docker image.
You should be able to see the image URI (including the hash) of the image at the bottom of the information about lambda. If you click on that image URI, you will be navigated to the latest image on ECR that you just built and pushed.
api_gateway.tf
Now you will be able to test the lambda on the AWS Lambda console, but what we want in the end is something like sending `GET /hello` to a certain domain and receiving a response. To be able to do that, we need to set up API Gateway.
For this example, we will set up a domain at `api.hello.com`.
Here's how:
https://gist.github.com/1318de076815425487d07cce62bdf65e
I will explain the code one by one.
`aws_api_gateway_rest_api` will create a REST API. It usually does not consist of a single endpoint; it usually contains multiple, like `api.hello.com/hello`, `api.hello.com/bonjour`, `api.hello.com/nihao`, and so on. On AWS, it is equivalent to a single row in the APIs tab:
`aws_api_gateway_resource`: to put it very simply, you can think of this as a single API endpoint that has not yet been deployed. In this case we create a single endpoint ending with `/hello`.
In the example below, we have created two different resources with `OPTIONS` and `POST` on the same path (hidden by the black overlay). We will talk about creating an OPTIONS resource to handle preflight requests later. For now, it suffices to know that creating a REST resource means creating a certain endpoint.
`aws_api_gateway_method`: now you want to create a REST method for that resource. Our `/hello` endpoint will require no auth (which is out of scope for this article) and will be a GET method.
`aws_lambda_permission`: by default, API Gateway has no permission to invoke the lambda function, so we grant it permission so that the function can be executed.
`aws_api_gateway_integration`: API Gateway supports transforming (filtering, preprocessing, ...) a request before it reaches the lambda, or a response from the lambda before it reaches the client. We are not doing anything special here for this example, but you may want to use it in the future. For more, read the relevant AWS docs.
`aws_api_gateway_stage`: API Gateway supports separating an API into different stages out of the box. You should use this to separate your API across production, staging and development environments. For now, we will only make a stage for the current Terraform workspace, which is assumed to be `dev` across all examples in this article. Once you apply your changes, you will be able to see this on your AWS console:
`aws_api_gateway_deployment`: this is equivalent to clicking 'Deploy API' on the AWS console.
Once resources are created in API Gateway, they must be deployed in order to be reachable from external clients. One slightly problematic thing is `redeployment`: even if you make a change to your REST API resources, it will not get deployed if the `redeployment` argument does not change. There are mainly two ways of getting around this:
- Use `timestamp()` to trigger redeployment on every single `apply`. With this approach, the lambda may be down for a few seconds during redeployment. But it is guaranteed to always deploy, so I would go with this one if my service does not handle many users.
- Use `md5(file("api_gateway.tf"))` to trigger redeployment whenever this file changes. But you always need to make sure that EVERYTHING related to the API Gateway deployment stays inside this file.
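To see why the second approach only redeploys on real changes, here is a small TypeScript sketch of the same idea using Node's built-in `crypto` module. The function names `redeploymentTrigger` and `needsRedeploy` are hypothetical; Terraform does this comparison for you by diffing the stored value against the newly computed one.

```typescript
import { createHash } from "crypto";

// Same idea as md5(file("api_gateway.tf")): a hex MD5 digest of the file's text.
function redeploymentTrigger(fileContents: string): string {
  return createHash("md5").update(fileContents, "utf8").digest("hex");
}

// Hypothetical helper: a redeployment is needed only when the digest differs.
function needsRedeploy(previousContents: string, currentContents: string): boolean {
  return redeploymentTrigger(previousContents) !== redeploymentTrigger(currentContents);
}

// Placeholder strings stand in for the contents of api_gateway.tf.
console.log(needsRedeploy("path_part = hello", "path_part = hello")); // prints false
console.log(needsRedeploy("path_part = hello", "path_part = bonjour")); // prints true
```

The trade-off is visible here: a change made in any other file leaves the digest untouched, which is why everything deployment-related has to live in the hashed file.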
Ok. So far we have set up lambda and basic API Gateway configurations. Right now, you can test your API on Postman like this: first, go to AWS API Gateway console, and find a specific endpoint that is deployed to a certain stage. There should be an 'Invoke URL' at the top of the page.
Now, open up Postman, and
- Insert your invoke URL.
- Click 'Authorization', choose the type 'AWS Signature', and enter `AccessKey` and `SecretKey`. These keys should come from the credentials of one of your users in the AWS IAM console. If you do not have a user dedicated to calling an API set up with lambda and API Gateway from your local environment, make one and get its keys.
- Insert your AWS Region.
- If your API requires any query parameters or a body, insert them.
- Click send, and it should work.
If you do not provide AWS credentials in your request header, it won't work, because so far your API can only be used by IAM users known to API Gateway; if you just send a request from your local computer without providing an access key and secret key, it won't know it's you. To make public APIs that are exposed to client applications work without credentials, you should now configure a custom domain.
So far we have made changes to make lambda and API Gateway resources. The list of files we should have by now is the following.
https://gist.github.com/70c491c215b065f7a2d9c60bc165515b
Next, we will see how to create a custom domain and relate that domain to the REST API we just made.
Creating Terraform resources for Custom domain
Now, the problem is that we have the API, but it's not callable from external client applications, which is what many projects need. So we want to register a domain first to represent our endpoints.
Before making changes for the custom domain, we need to set up another AWS provider, because we need to use the `us-east-1` region for an edge-optimized custom domain name (that's the only region that supports creating an ACM certificate for an edge-optimized custom domain name).
There are two choices available for an API endpoint: 1. edge; 2. regional. If your endpoint should be accessed by clients worldwide, use edge; if your endpoint is confined to one specific region of the world, use regional. If you don't know which to choose, it is totally safe to go with edge for now. First, add another AWS provider:
main.tf
https://gist.github.com/9b194abe5757e6401d8aad8f7d64100d
Then, you write the actual code to make a certificate and custom domain.
custom_domain.tf
https://gist.github.com/d36aa6fe1550e280e65ad32443be3560
Make sure you get the existing hosted zone ID (or any other zone ID that you intend to use) from AWS Route53 Console:
Use that ID to create Route53 Record for the custom domain name.
After you apply your change, you will be able to see ACM certificate being created on AWS Certificate Manager:
Just make sure that the status is 'Issued' and the validation status is 'Success'. You may need to wait several minutes for this to complete. Also, if your certificate is not showing up, make sure that you are in `us-east-1`, not anywhere else.
After you are done with this, you can go back to API Gateway and configure the custom domain. Now that you've registered a domain, you can see it right in the API Gateway console:
In the certificate dropdown, you should be able to see the domain that you have just created. Do not create the domain name there on the console. Instead, come back to Terraform and let's write the equivalent code.
custom_domain.tf
https://gist.github.com/0a4222792b00d2ef867ca90b61d4faac
You are just going to fill out the options that you just saw from the API Gateway console. Just fill out the relevant info about certificate, domain name, and endpoint config.
Now, this is important: you need to create another Route53 record to map your custom domain to CloudFront. Once you create your custom domain, AWS creates an 'API Gateway domain name', circled in red in the picture below:
You need to route the traffic for `api.hello.com` to this API Gateway domain name (an example of an API Gateway domain name would be `asdfasdfasdf.cloudfront.net`, as long as you are using `EDGE`). That's what we are doing with `aws_route53_record.custom_domain_to_cloudfront`. Otherwise, requests to your API will keep returning weird errors whose causes are really hard to guess. I found the AWS documentation really lacking on this part, so please take note: you need to create another Route53 record.
You can verify this by opening the Route53 console and looking for `api.hello.com`. It should appear as follows:
api_gateway.tf
After that, there isn't much left to do; just add a base path mapping resource in `api_gateway.tf`. Even if you do not have an additional trailing path on your endpoint, you must create a base path mapping. Otherwise your API won't be exposed to the public.
https://gist.github.com/c7aa2eae495a42240bb701dcafb21839
After applying this change, verify that your API mapping has been created:
Now you can go back to Postman and test your API by requesting `GET api.hello.com/hello`. What may be confusing here is that you are not adding any `path` in the base path mapping. If you add `hello` as a path, your API endpoint will be configured as `api.hello.com/hello/hello`, which is obviously not what we want. So do not add any path mapping if you have already configured your path in `aws_api_gateway_resource`. Anyway, requests and responses on the API endpoint should work as expected if everything has been set up correctly so far.
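The way the base path and the resource path combine can be sketched in a few lines of TypeScript. The helper `publicUrl` is hypothetical and only models the concatenation described above; it is not how API Gateway is implemented.

```typescript
// Hypothetical model: final URL = custom domain + base path mapping + resource path.
function publicUrl(domain: string, basePath: string, resourcePath: string): string {
  const segments = [basePath, resourcePath]
    .map((segment) => segment.replace(/^\/+|\/+$/g, "")) // trim leading/trailing slashes
    .filter((segment) => segment.length > 0); // an empty base path adds nothing
  return `https://${domain}/${segments.join("/")}`;
}

console.log(publicUrl("api.hello.com", "", "/hello")); // prints "https://api.hello.com/hello"
console.log(publicUrl("api.hello.com", "hello", "/hello")); // prints "https://api.hello.com/hello/hello"
```

The second call shows the duplicated path you end up with when the base path repeats what `aws_api_gateway_resource` already defines.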
Enabling OPTIONS (preflight request)
Now, our client application is, of course, not Postman. Clients that intend to send CORS requests, which is a very common case, will usually request `OPTIONS api.hello.com/hello` first and then `GET api.hello.com/hello` (read more about that in the MDN docs).
If you have not done anything to handle the OPTIONS request, it's very likely that you will get an error in your client application, like this (I got this picture from somewhere else for the purpose of demonstration):
So let's do it! There's already a handy module written by a great dev, so we will just use that:
api_gateway.tf
https://gist.github.com/33bafd721a6c0eddff02680be22171ca
Note that if you have any custom headers, you must define them in your config. Next, verify on the console that OPTIONS requests are now allowed:
Also, you will need to change your lambda's response headers too. Just adding `"Access-Control-Allow-Origin": "*"` will work:
packages/hello/lib/index.ts
https://gist.github.com/c98db1ce8fb9c419a2d46a5b376b5070
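As a rough, self-contained illustration of a response carrying that header, here is a minimal sketch. The `ProxyResult` interface and the `buildResponse` helper are assumptions made so the snippet compiles without extra type packages; the handler in the gist above may be structured differently.

```typescript
// Minimal structural type so the sketch compiles without @types/aws-lambda.
interface ProxyResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper that attaches the CORS header to every response.
function buildResponse(message: string): ProxyResult {
  return {
    statusCode: 200,
    headers: {
      "Content-Type": "application/json",
      // "*" allows any origin; consider narrowing it to your client's origin
      "Access-Control-Allow-Origin": "*",
    },
    body: JSON.stringify({ message }),
  };
}

// Lambda entry point, matching the "index.handler" naming used in the Dockerfile.
export const handler = async (): Promise<ProxyResult> => buildResponse("hello");
```

Keeping the header in one helper means every lambda in the monorepo can return it consistently.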
Now, apply the changes, and go back to your client app and retry the request. It should be working.
Summing up
If you have followed everything, you will have created these files:
https://gist.github.com/59c8326a9c6543e5851d88710f8d70cc
So far, we have looked at how to set up, develop and deploy a dockerized lambda application with TypeScript, Terraform and SAM CLI. There are tonnes of things to cover on lambda; maybe next time it will be about using resources inside a VPC from lambda. I hope you enjoyed this and found some valuable insights. Thank you.