@alexcrichton
Last active November 11, 2019 10:38

AWS and crates.io

As a recap, the motivation for this change is that we're moving away from EasyDNS, and crates.io is the only remaining host whose DNS is still configured there. It is still there because the apex domain name (crates.io) needs to point to a Heroku domain name (crates.io.herokudns.com). DNS doesn't allow a CNAME record at the apex of a zone (the apex already carries SOA and NS records, and a CNAME can't coexist with other records), so in practice the apex can only point at addresses through A/AAAA records and we typically wouldn't be able to point crates.io at Heroku's domain name. EasyDNS, however, implements an "ANAME record", which isn't part of standard DNS but emulates what we want.

All of our other DNS now lives in Route 53, which does not support ANAME-style records in general. It does, however, support ANAME-ish records (AWS calls them aliases) for AWS-specific services like CloudFront. This means that to transfer DNS service to AWS we will need to put CloudFront (or some similar service) in front of crates.io.
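To make the alias idea concrete, here is a minimal sketch of the change batch one might submit to Route 53 to alias the apex to a CloudFront distribution. The distribution domain name is a placeholder, the helper name is hypothetical, and the hosted zone ID is the fixed value AWS documents for CloudFront alias targets; this is not the actual configuration used for crates.io.

```python
import json

# Fixed hosted zone ID that AWS documents for CloudFront alias targets.
CLOUDFRONT_ALIAS_ZONE_ID = "Z2FDTNDATAQYW2"


def apex_alias_change(apex_name, distribution_domain):
    """Build a Route 53 change batch that aliases the apex to CloudFront."""
    return {
        "Comment": "Point the apex at a CloudFront distribution",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": apex_name,
                    # An alias is still an A record, but it has no TTL or
                    # literal addresses; Route 53 resolves the target itself.
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": CLOUDFRONT_ALIAS_ZONE_ID,
                        "DNSName": distribution_domain,
                        "EvaluateTargetHealth": False,
                    },
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder distribution domain; the real one comes from the CloudFront console.
    batch = apex_alias_change("crates.io.", "d111111abcdef8.cloudfront.net.")
    print(json.dumps(batch, indent=2))
```

A dictionary of this shape is what the `ChangeBatch` argument of a Route 53 `ChangeResourceRecordSets` call expects, which is why the apex can be served without a CNAME.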

Rust's usage of CloudFront

Rust already uses CloudFront for a number of other services:

  • It primarily hosts static.rust-lang.org, which is how we distribute Rust/Cargo/rustup binaries to users. (In theory the CDN makes downloads faster.)
  • We also host all crates themselves on CloudFront via static.crates.io. The crates.io server redirects Cargo to download from this CDN.
  • Domains like doc.rust-lang.org go through CloudFront as well.

Almost all our existing usage of CloudFront is basically acting as a front to an otherwise static website. Crates.io would be the first dynamic content we put behind CloudFront.

CloudFront and crates.io

I would imagine that we would start out by just configuring CloudFront to sit in front of crates.io while being extremely conservative about caching. Unless responses actually have cache headers, nothing would be cached and CloudFront would always make a request to the origin server (Heroku).
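The conservative policy described above can be modeled roughly as follows. This is a simplified sketch, not CloudFront's actual algorithm: it only looks at `Cache-Control` (it ignores `Expires`), and the function name and the `default_ttl` parameter are hypothetical.

```python
import re


def cache_ttl_seconds(headers, default_ttl=0):
    """Return how long a response may be cached, in seconds (0 means not at all)."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    cache_control = normalized.get("cache-control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]

    # Explicit opt-outs from the origin always win.
    if any(d in ("no-store", "no-cache", "private") for d in directives):
        return 0

    # s-maxage targets shared caches such as a CDN, so prefer it over max-age.
    for name in ("s-maxage", "max-age"):
        for directive in directives:
            match = re.fullmatch(name + r"=(\d+)", directive)
            if match:
                return int(match.group(1))

    # No caching headers at all: fall back to the conservative default.
    return default_ttl


if __name__ == "__main__":
    print(cache_ttl_seconds({}))
    print(cache_ttl_seconds({"Cache-Control": "public, max-age=300"}))
```

With a default of 0, a response that carries no caching headers always results in a request to the origin, which matches the "start conservative" plan.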

Permissions and Access

One of the benefits of AWS is that it has pretty granular permissions. Sean already has an AWS account but we can make more for other members. I've tested this out a bit and it looks like:

  • Crates.io team members will be able to manage the S3 buckets (but only the crates.io-related S3 buckets)
  • Crates.io team members can manage crates.io DNS (but only crates.io DNS)
  • Crates.io team members can manage any CloudFront distribution, including crates.io ones.

That last one is sort of unfortunate but for whatever reason AWS doesn't provide granular access to CloudFront distributions.
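As an illustration of that scoping, an IAM policy document along these lines would express the three rules. This is a sketch only: the bucket names and hosted zone ID passed in are placeholders, the chosen actions are assumptions, and it is not the policy actually attached to anyone's account.

```python
import json


def crates_io_team_policy(bucket_names, hosted_zone_id):
    """Build an IAM policy document scoped the way the list above describes."""
    bucket_arns = []
    for name in bucket_names:
        bucket_arns.append(f"arn:aws:s3:::{name}")    # the bucket itself
        bucket_arns.append(f"arn:aws:s3:::{name}/*")  # the objects inside it
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CratesIoBuckets",
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": bucket_arns,
            },
            {
                "Sid": "CratesIoDns",
                "Effect": "Allow",
                "Action": [
                    "route53:ChangeResourceRecordSets",
                    "route53:ListResourceRecordSets",
                ],
                "Resource": f"arn:aws:route53:::hostedzone/{hosted_zone_id}",
            },
            {
                # Not scoped to a single distribution, as noted above.
                "Sid": "AllCloudFront",
                "Effect": "Allow",
                "Action": "cloudfront:*",
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket name and hosted zone ID.
    policy = crates_io_team_policy(["example-crates-io-bucket"], "ZEXAMPLE123")
    print(json.dumps(policy, indent=2))
```

The S3 and Route 53 statements name specific resources, while the CloudFront statement has to use a wildcard resource, which is the asymmetry described above.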

Testing things out

I've prepared a few distributions to test things out. For Sean (and others who have access) you can go to https://console.aws.amazon.com/cloudfront/home and take a look at these:

The configuration is pretty standard right now; the main differences from the default are:

  • If caching headers aren't present, responses aren't cached.
  • Query strings are forwarded by default
  • Accept/User-Agent headers are forwarded to the origin
  • All cookies are forwarded to the origin
  • For the /assets path, none of the above apply (no forwarded headers, no query strings, default cache time is 1 day)
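The settings listed above amount to two cache behaviors selected by request path. The sketch below models that selection in plain Python; the dictionary keys, the `/assets/*` pattern, and the `behavior_for` helper are illustrative assumptions rather than CloudFront's real configuration schema.

```python
from fnmatch import fnmatchcase

# Ordered list: the first matching pattern wins, and "*" is the catch-all default.
BEHAVIORS = [
    {
        "path_pattern": "/assets/*",
        "forward_query_string": False,
        "forward_headers": [],
        "forward_cookies": "none",
        "default_ttl": 86400,  # 1 day, in seconds
    },
    {
        "path_pattern": "*",
        "forward_query_string": True,
        "forward_headers": ["Accept", "User-Agent"],
        "forward_cookies": "all",
        "default_ttl": 0,  # nothing is cached unless caching headers are present
    },
]


def behavior_for(path):
    """Return the first behavior whose pattern matches the request path."""
    for behavior in BEHAVIORS:
        if fnmatchcase(path, behavior["path_pattern"]):
            return behavior
    raise LookupError(f"no behavior matches {path!r}")


if __name__ == "__main__":
    for path in ("/assets/app.js", "/api/v1/crates"):
        print(path, "->", behavior_for(path)["path_pattern"])
```

Keeping the catch-all behavior last means dynamic API requests reach the origin with their cookies and query strings intact, while static assets can be cached aggressively.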
@pietroalbini

Accept-Encoding since nginx will gzip large responses for us, however, we don't yet send a Vary header in the response for /assets so that would need to be special cased for now (contrary to what I said at the end of the first paragraph above) so that we don't accidentally send a compressed cached version to a client that doesn't support it. (For all other paths, we send Vary: Accept, Accept-Encoding, Cookie.)

Not sure it's worth the hassle; CloudFront itself also compresses responses automatically if Accept-Encoding is present.
