@squarism
Last active April 22, 2024 22:39
Amazon is Almost

AWS is full of almost things. This is a list of surprises across all services. Your job as an engineer is to think critically about trade-offs. Amazon is an abstraction on hardware. You reap blissful ignorance of the hardware but you pay for the abstraction. There is no free lunch. Hardware still exists. Hardware in itself is an abstraction.

With AWS you are renting a hardware abstraction to avoid having to host your own. They do not run the abstractions after this point. You do.

Hardware moves: Hardware you ran -> Hardware they run

People stay: Local infrastructure engineers (you) -> AWS abstraction engineers (you)

Engineers use / understand / live / die by abstractions. If you "cloud migrate", the hardware moves but the people stay while the abstraction changes.

The point of this list is to keep a log of AWS surprises, especially when something seems similar to a thing from the traditional infrastructure world. Something like this illustrates the feeling:

Expectation: S3 is a filesystem!

Reality: Ehhh ... S3 is more like a bucket and an HTTP API.

This is a list of negativity and I'm sorry about that.

S3

It's a filesystem in the cloud. But not quite. You can't treat subdirectories (buckets) like folders. You can't create a bucket within another bucket.

  • You can only have 100 buckets in your account without asking for more through a process.
  • Bucket names must be at least 3 characters long. You can't have an s3 bucket called s3.
  • Files are limited to 5GB on a single put. aws s3 cp gets around this by switching to multipart upload for large files.
  • Bucket names are essentially shared to the world and therefore unique like DNS even though your bucket might be private.
  • Wildcards aren't supported because it's not a shell, it's not a filesystem. But also, confusingly, the include and exclude flags aren't intuitive. aws s3 cp /tmp/foo/ s3://bucket/ --recursive --exclude "*" --include "*.jpg"
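The include and exclude behavior is easier to see in code. Below is a minimal sketch of the rule as I understand it from the AWS CLI docs: everything is included by default, and a filter that appears later on the command line overrides an earlier one. The `is_selected` function and the use of `fnmatch` are my own illustration, not the CLI's actual implementation.

```python
from fnmatch import fnmatch


def is_selected(path, filters):
    """Return True if `path` would be copied under the ordered filters.

    `filters` is a list of ("exclude" | "include", pattern) pairs in the
    order they appear on the command line. Hypothetical helper for
    illustration only.
    """
    selected = True  # with no filters, every file is included
    for kind, pattern in filters:
        if fnmatch(path, pattern):
            # A later matching filter overrides any earlier decision.
            selected = (kind == "include")
    return selected


# Same flags as the command above: --exclude "*" --include "*.jpg"
jpg_only = [("exclude", "*"), ("include", "*.jpg")]
# Same flags, opposite order: the trailing exclude wins for every file.
reversed_order = [("include", "*.jpg"), ("exclude", "*")]

for name in ["cat.jpg", "notes.txt"]:
    print(name, is_selected(name, jpg_only), is_selected(name, reversed_order))
```

So the order matters: exclude everything first, then include what you want. Flip the order and nothing gets copied.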

VPC

  • Routing is weird. It's partially abstracted so it's confusing if you know what you are doing and confusing if you don't know what you are doing.
  • Security groups don't follow patterns from other security systems. Like you can't have groups that inherit from other security groups.
  • The security editor is pretty wild even if you've been doing it for years. It took me way less time to understand Cisco access lists back in the day.
  • Internal hostnames are not stable. IP addresses are random at boot. Elastic IPs are limited to 5. Intra-LAN networking is mysterious, unlike a normal LAN. Can you DHCP inside a VPC? This is a solved problem in traditional networking. Is a VPC like a LAN? Almost.

EC2

  • They aren't really servers by default. Default ephemeral storage can mean your / is deleted if you power it off. DigitalOcean is less of a surprise for a server in the cloud. Yes, you can protect yourself. I mean principle of least surprise.
  • It's not a VM cluster. It's close. A lot of times you need to learn the invented name for the thing you know how to do in "the real world".
  • A NAT instance is a special micro instance that acts like a network appliance but it's a tiny server. You create it and then manage it somewhere else in the UI. There's a lot of this hunting around in the UI.

Redshift

  • It's a custom PostgreSQL fork, and it's behind on versions: it's based on 8.x.

Data Pipeline

  • It's an ETL tool except it creates duplicates. So we wrote our own and it worked in less calendar time.
  • The tooling around exporting a workflow and reimporting it simply did not work. Meaning (very sadly) a round trip out of their tool and back in failed.

SQS

  • Wiring up the permissions that let SNS deliver to SQS seems awful.
  • Acknowledgements happen sort of implicitly by deleting a message with a very special id (the receipt handle you get back when you receive it). This isn't awful and could be wrapped but it is weird. In other queues, it's easier and called something like acknowledge. You'd maybe just say: message.ack!. In SQS, you'd want to wrap a couple of actions in a method called acknowledge or something similar. It's nice that it has this feature, it's just a little too raw imo.
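Here is a rough sketch of the kind of wrapper I mean, in Python. `AckableMessage` and `receive` are names I made up. The `receive_message` and `delete_message` calls and the `Messages`, `Body` and `ReceiptHandle` keys follow the shape of the boto3 SQS client. `FakeSQS` is an in-memory stand-in so the example runs without AWS; with the real thing you would pass `boto3.client("sqs")` instead.

```python
class AckableMessage:
    """Wraps one raw SQS message dict so callers can just say message.ack()."""

    def __init__(self, client, queue_url, raw):
        self._client = client
        self._queue_url = queue_url
        self.body = raw["Body"]
        self._receipt_handle = raw["ReceiptHandle"]

    def ack(self):
        # In SQS, "acknowledge" means: delete the message by receipt handle.
        self._client.delete_message(
            QueueUrl=self._queue_url, ReceiptHandle=self._receipt_handle
        )


def receive(client, queue_url, max_messages=1):
    response = client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=max_messages
    )
    # "Messages" is absent when the queue had nothing to deliver.
    return [AckableMessage(client, queue_url, m) for m in response.get("Messages", [])]


class FakeSQS:
    """In-memory stand-in (an assumption for this sketch, not part of boto3)."""

    def __init__(self, bodies):
        self.messages = [
            {"MessageId": str(i), "ReceiptHandle": f"handle-{i}", "Body": body}
            for i, body in enumerate(bodies)
        ]

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1):
        batch = self.messages[:MaxNumberOfMessages]
        return {"Messages": batch} if batch else {}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.messages = [
            m for m in self.messages if m["ReceiptHandle"] != ReceiptHandle
        ]


client = FakeSQS(["hello"])
queue_url = "https://sqs.example.invalid/queue"  # hypothetical queue URL
for message in receive(client, queue_url):
    print(message.body)
    message.ack()
print(len(client.messages))  # 0: the acked message is gone
```

A real queue would also redeliver anything you never ack once its visibility timeout runs out, which the stand-in does not model.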

SNS

  • I don't know yet how I'd expose the interface to manage the rules. Seems like you'd want business events in here, so wouldn't this want to be changed by non-devs (a promise hardly realized historically)?

Lambda

  • Only a few languages are supported right now.
  • There's barely any local disk space: just an ephemeral /tmp (512 MB by default) that doesn't survive between cold starts. So if you are trying to go serverless (wut) then you don't have a durable local disk to work from. This limits the types of problems able to be solved. So it's not quite "throw away all the servers".
  • ES6 is handled by transpiling but I think the version of node they are using should support it natively?

IAM

  • Credentials are managed by attaching permissions (in the form of a poorly documented JSON blob) directly to the user, directly to a group, to a role, or to a policy.
  • The policy version string is a date. It's not v1 / v2 etc., it's 2012-10-17.
  • The editor is not great at scale. It's better to not use it. Use an external tool or something.
  • The maximum limit for attaching a managed policy to an IAM role or user is 20. The maximum character size limit for managed policies is 6,144. You (currently) have to rearchitect your security stuff to use access points and/or suffer with confusing complexity for seemingly arbitrary limits from Amazon. Likely, the scale from Amazon's side is causing some limit to leak to AWS users.
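To make the JSON blob and the size limit concrete, here is a small sketch. The bucket name is hypothetical and `policy_size` is my own helper. The version string used in current policy documents is 2012-10-17. As far as I know IAM ignores whitespace when measuring a policy against the 6,144 character limit, so the helper strips it; that also drops spaces inside string values, so treat the count as an approximation.

```python
import json

MANAGED_POLICY_CHAR_LIMIT = 6144  # the limit quoted above

policy = {
    "Version": "2012-10-17",  # policy language version: a date, not v1/v2
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            # Hypothetical bucket, for illustration only.
            "Resource": ["arn:aws:s3:::example-bucket/*"],
        }
    ],
}


def policy_size(document):
    """Count characters in the policy JSON, ignoring all whitespace."""
    text = json.dumps(document)
    return sum(1 for ch in text if not ch.isspace())


size = policy_size(policy)
print(size, size <= MANAGED_POLICY_CHAR_LIMIT)
```

Running a check like this before attaching a policy at least tells you when you are about to hit the wall, instead of finding out from an API error.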

ECR

ECR is a hosted docker registry, right? It's a docker registry just like the official docker registry, right?

I really expected ECR instances to be more of a normal docker registry. The tagging seems to be this: [random-dns-hostname].amazonaws.com/[image][:tag] where image would normally be an image name to me. But the docs show examples like this: [random-dns-hostname].amazonaws.com/[repo][:tag]. This isn't normal to me. If I create an ECR repo called repo then I want to push an image like this: [random-dns-hostname].amazonaws.com/repo/[my-image-name][:tag] but the CloudWatch logs say this repo doesn't exist and the docker push errors (almost like a network error).

Create a repo ahead of time

So it's terminology? But it's more than that. What if I want to create random images? I kind of don't want to pre-make the image name space because that's not how the original docker registry and the registry rewrite and other registries in this style ever worked.
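A tiny sketch of how I think ECR reads an image reference: everything between the hostname and the tag is the repository name, slashes included. So pushing repo/my-image appears to need a repository literally named repo/my-image to exist first, rather than a repository named repo. The `split_image_reference` helper, the account ID and the region are made up for illustration, and the parser ignores digest references and registry ports.

```python
def split_image_reference(reference):
    """Split 'registry/repository[:tag]' into (registry, repository, tag).

    Simplified on purpose: no digests (@sha256:...) and no registry ports.
    """
    registry, _, remainder = reference.partition("/")
    repository, sep, tag = remainder.rpartition(":")
    if not sep:  # no tag was given
        repository, tag = remainder, "latest"
    return registry, repository, tag


# Hypothetical account ID and region.
ref = "123456789012.dkr.ecr.us-east-1.amazonaws.com/repo/my-image:v1"
registry, repository, tag = split_image_reference(ref)
print(repository)  # repo/my-image
print(tag)         # v1
```

Under that reading, every distinct image name is its own repository that has to be pre-made, which is exactly the part that doesn't match how other registries behave.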

ACM

It's a Certificate Authority, right? When you click New Certificate:

Request a public SSL/TLS certificate from Amazon. By default, public certificates are trusted by browsers and operating systems.

The circle of trust chain is already installed and trusted by browsers and operating systems. That's great! That way, we wouldn't have to create a private CA and deal with installing the chain into the circle of trust onto all our systems or burden our users with weird prompts! It's just like other certificate authorities! And it comes with my AWS account! Great! 🤩

The API does not allow for the public certificate to be exported to a PEM. You can't do it. If you try to do it through the console there's no Export button. 🫥

So, it's not a public CA. It's an internal CA with tight integration (convenient) but with lock-in. And we're really just talking about an "Export .pem file" feature here. A thing we can do on a private CA. A feature that a PKI server has.
