@bleshik
Last active August 19, 2018 21:45
How To Make Multiregional Application (And Pay Zero)

Some time ago I wanted to create a small project to validate an idea. I wanted to bootstrap it quickly, run it, and see if there was any interest in the application; ideally, I didn't want to pay too much while validating it.

Now, the problem was that the service itself was aimed at companies whose staff may be distributed across the world, and I needed equal application performance for all users. Thus, even validating the idea required bootstrapping a distributed infrastructure.

Therefore, my task was to create a multiregional application while paying as little as possible.

Since my current cloud provider is AWS, "regions" in the text below means AWS regions: us-east-1, eu-west-1, etc.

Use CloudFormation For Everything

Frankly, making an application multiregional is just another reason to use CloudFormation. There are many more reasons; here is a list off the top of my head:

  • Multiple Regions. Such applications need the exact same infrastructure deployed in several regions. You may deploy the same CloudFormation template in each region.
  • Multiple Environments. You don't want to surprise your users by breaking everything with a small change in your infrastructure. You should definitely have a separate environment for testing. You may deploy the same CloudFormation template for each environment.
  • Store your infrastructure declaration next to your application code. CloudFormation templates become part of your codebase with all its advantages: revision history, deployment on a push to a specific branch, etc.
  • Infrastructure reproducibility. Bring your whole stack down with a few clicks, and bring it back up with a few more. It's quite helpful, e.g. for cost savings: you may terminate your test environments on weekends and bootstrap them again on weekdays.
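To illustrate the first two bullets, here is a minimal hypothetical template (not the project's actual one; resource and parameter names are assumptions). It takes the environment as a parameter and derives the bucket name from the region, so the very same file can be deployed to every region and environment:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Parameters": {
    "Environment": {
      "Type": "String",
      "AllowedValues": ["test", "prod"],
      "Description": "Which environment this stack belongs to"
    }
  },
  "Resources": {
    "DataBucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "BucketName": { "Fn::Sub": "app-data-${Environment}-${AWS::Region}" }
      }
    }
  }
}
```

Because the `AWS::Region` pseudo parameter is resolved at deploy time, nothing in the template itself is region-specific.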

Go Serverless

This one doesn't relate to the multiregional nature of the application; however, it was the first and most important step to make.

Let's be honest: the moment you launch your awesome project... no one even notices it. You won't have many users, that's for sure. It takes months to gain some.

Serverless basically means "you pay for the resources you actually use": no users, nothing to charge for. That's why serverless infrastructure helped me save a lot of money while validating the idea. The key services for serverless on AWS are Lambda, DynamoDB, S3, SQS and API Gateway (there are many more pay-per-use AWS services; for the full list see this link). Use these services as much as possible and applicable.

I've talked about making your backend and APIs serverless using AWS Lambda several times so I won't go into details on this one. However, you may want to check out my previous articles on this topic:

BTW, you may think the project was written in Java, but it was eventually rewritten from Java to NodeJS. If you're wondering why, you may check out my thoughts on this process in this article.

Deploying the RESTful API of the application in multiple regions was quite easy. The hardest part was the routing. Remember, the whole point behind the multiregional deployment is to make users hit the nearest servers and get the maximum performance.

Unfortunately, there is no good way to have a single domain name for all of your AWS API Gateway endpoints. AWS Route53 has geolocation and latency routing policies, but you can't use them together with AWS API Gateway. Luckily, I solved this by implementing the routing on the frontend. I might go into details on this in further articles.
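The article doesn't show its routing code, but the idea can be sketched as follows (endpoint URLs and the `/ping` path are assumptions, not the project's real names): the frontend probes every regional endpoint once at startup and then talks to whichever answered fastest.

```javascript
// Hypothetical map of region -> regional API Gateway endpoint.
const REGIONAL_ENDPOINTS = {
  'us-east-1': 'https://api-us.example.com',
  'eu-west-1': 'https://api-eu.example.com',
};

// Measure the round-trip time of one lightweight request.
// `fetchFn` is injectable so the logic can be tested without a network.
async function probe(url, fetchFn = fetch) {
  const start = Date.now();
  await fetchFn(url + '/ping', { method: 'HEAD' });
  return Date.now() - start;
}

// Probe all endpoints in parallel and return the lowest-latency one.
async function pickFastestEndpoint(endpoints, probeFn = probe) {
  const results = await Promise.all(
    Object.entries(endpoints).map(async ([region, url]) => ({
      region,
      url,
      latency: await probeFn(url),
    }))
  );
  results.sort((a, b) => a.latency - b.latency);
  return results[0];
}
```

A real client would also cache the chosen endpoint (e.g. in localStorage) so users don't pay the probing cost on every page load.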

S3 Replication

When a user uploads their userpic to an S3 bucket in one region, we want that exact file to appear in the buckets of all regions where the application is deployed.

Option 1. Built-in Cross-Region Replication

S3 has a built-in feature called Cross-Region Replication (CRR), which can achieve exactly what we want.

However, this option will not work when you have more than 2 regions, because a bucket can replicate its data to a single destination bucket only.

Could we work around this? Suppose we want to do that with 3 regions: region1, region2, and region3. Let's make region1 replicate its data to region2, region2 to region3, and region3 to region1:

Great! Except it won't work... In this scheme, region2 will not replicate the file to region3: the process stops right after region1 replicates the file to region2, even though region2's CRR is configured to replicate everything to region3. I guess it's a preventive measure against cyclic S3 replication.

Thus, this option works for the 2-region case only.

Option 2. S3 Replication Using Lambda

A slightly more complex, but more powerful option that works for any number of regions. In each region, we create a Lambda listening for changes in the corresponding S3 bucket. When a file is added or changed, the Lambda replicates it to all other regions.

I've created a minimalistic CloudFormation template for such a 3-region replication setup. Simply deploy the following template in each region the data should be replicated to:

https://console.aws.amazon.com/cloudformation/home?#/stacks/new?templateURL=https://s3-eu-west-1.amazonaws.com/bleshik/s3-replicator-CloudFormation.json

DynamoDB Replication

The scenario is the same: a user puts some data into a DynamoDB table, and we expect that data to be replicated into the same table in all our regions.

Option 1. Global Tables

This is the simplest option, and I'd say it's the recommended one.

You just create a global table, and AWS automatically replicates it to all target regions.

The only problem I can see is adding a region. According to AWS, the whole global table must be completely empty before you can add a new region. This is tricky, but the workaround is quite easy: create a new global table in all required regions and then copy the data from the old table to the new one.
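That copy step can be sketched as a small one-off script (a hypothetical illustration; table names and the injected `ddb` client are assumptions, and in practice `ddb` would be an AWS SDK DynamoDB DocumentClient):

```javascript
// Copy every item from the old (non-global) table into the new global
// table, paging through the source with Scan. Writing to any replica of
// the global table propagates the item to all regions.
async function copyTable(srcTable, destTable, ddb) {
  let startKey;
  let copied = 0;
  do {
    // Fetch one page of the source table.
    const page = await ddb.scan({
      TableName: srcTable,
      ExclusiveStartKey: startKey,
    });
    for (const item of page.Items) {
      await ddb.put({ TableName: destTable, Item: item });
      copied++;
    }
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return copied;
}
```

For large tables you'd want batched writes and provisioned-throughput awareness, but for a small validation project a plain item-by-item copy is enough.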

Option 2. DynamoDB Replication Using Lambda

Same as for S3: we create a Lambda function in each region; it listens for changes and replicates them to all the required regions.
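The core of such a replicator, fed by the table's DynamoDB stream, could look roughly like this (a hedged sketch, not the project's actual code; the table name is hypothetical and `clients` maps each peer region to an injected DynamoDB client):

```javascript
// Mirror each stream record to every peer region.
// A real implementation also needs loop prevention (e.g. tagging
// replicated writes) so mirrored puts don't bounce between regions,
// plus conflict resolution for concurrent writes.
async function replicateStream(event, clients) {
  let applied = 0;
  for (const record of event.Records) {
    for (const ddb of Object.values(clients)) {
      if (record.eventName === 'REMOVE') {
        await ddb.delete({
          TableName: 'app-table',
          Key: record.dynamodb.Keys,
        });
      } else {
        // INSERT and MODIFY both become a plain put of the new image.
        await ddb.put({
          TableName: 'app-table',
          Item: record.dynamodb.NewImage,
        });
      }
      applied++;
    }
  }
  return applied;
}
```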

Now, this option is more complex, but it has a great advantage over the other one: your DynamoDB tables are not required to be empty. When adding a new region you will still have to copy the data over, but the awesome thing is that you can do it "online", while the app is live, with no downtime and without losing any data.

That's very nice, but I think the only reason I came up with this option is that global tables didn't exist back then. Despite the advantage I mentioned, I'd prefer option 1 if I could, because it is MUCH easier to implement.

Anyway, this replicator is already implemented in the project, so it's worth showing. However, how exactly it works deserves a whole new article. Also, I'm still not sure where exactly this code should be published: npmjs.com? AWS Serverless Application Repository? Something else? As soon as I come up with a good place to put the DynamoDB Replicator, I will put the link here.

What's next?

There are tons of topics I would like to cover next: the DynamoDB Replicator, the frontend API endpoint routing, or maybe my random thoughts on working remotely.

Comments, likes, and shares are highly appreciated. Cheers! ❤️
