@vkz
Last active June 6, 2023 20:17

Debugging AWS CloudFront issues live with SSH

git.ht relies heavily on CDN caching. We use AWS CloudFront, but it'll work similarly for Cloudflare or another CDN. Hardly a surprise, seeing how GitHoot is almost but not quite a blogging platform. There are a few good candidates for caching. We emphasize and nudge people towards RSS, because, let's face it, RSS (or Atom) delivers a cleaner, superior experience to your readers compared to whatever blog central du jour is in vogue. But that knowledge, quite common in the 1990s and 2000s, is rapidly going away. That aside, an RSS feed is nothing more than an XML document. Naturally, we'll want to cache it. These feeds are unlikely to change often - I mean, how often do you blog? Not often enough, if you ask me; go check out git.ht right now. Once we fetch your GitHub gists, we turn them into hoots - pages hosted under your subdomain that are nicely rendered for readability and offer a decent preview when you share on social media like Twitter. These will probably be relatively heavy, and Amazon traffic is expensive. After all, they have to pay road tax, maintain a fleet of pigeons to carry all those TCP packets, etc. Of course they'll charge us for the privilege. We have to cache to save money. We also don't want to be a nuisance and swamp GitHub servers - we're decent web citizens and considerate neighbours. If you're still with me, you're probably asking what any of this has to do with SSH. Getting to it.

Tailops and GitHoot

As I write this it appears this hoot will be another installment in what I've come to call the tailops series (coining it). If you want to read about our general setup that bridges AWS and on-prem resources with the help of Tailscale, check out Tailscaling GitHoot aka episode 1. However, as long as your deployment machine - the one that CloudFront likely designates as its distribution origin - can find and access your dev machine, you don't really need Tailscale. E.g. if your deployment is on-prem and both machines share a local network, or you have another VPN solution in place, or your dev machine happens to be on the same VPS in the cloud. You may not even need this much if you can re-configure sshd on your deployment machine and have the GatewayPorts option enabled. We'll touch on that towards the end.

CloudFront caching refresher

Basically, we have AWS CloudFront and a server somewhere that's serving our website. The latter is set up to be CloudFront's distribution origin. Whenever a request hits one of CF's edge servers, it'll check the status of the content in its internal cache. If the content's TTL (assuming our machine's last response set it with something like Cache-Control: max-age=15) hasn't expired, it'll return the contents; else it'll hit our server, probably with a couple of cache-related headers: If-None-Match, assuming we responded with an ETag last time, and maybe If-Modified-Since. Finally, we'll respond with a 304 or the changed content. Nothing unusual. If some or all of that sounds a bit unfamiliar but you'd like to learn more, I generally find AWS documentation quite good even when you're starting out. It was my first time dealing with CDNs and it worked for me without StackOverflow or Googling around. Head over to AWS CloudFront Request and Response Behavior for Custom Origins or read the entire manual. I recommend getting your hands dirty and doing a few deployments with an S3 origin or something - trust me - you'll learn quicker. This may've come up in earlier hoots, but I'll repeat it here. I value AWS as a gadget to experiment and learn with more than as cloud infrastructure I'm willing to host my projects on. I digress.

Setting up CDN caching

I'll assume that you correctly set up the CloudFront distribution bits, e.g. using the AWS console. It doesn't require much; it only needs to know where to send requests when it needs content to deliver and cache. The challenge then is having your server respond appropriately, i.e. correctly set the above-mentioned HTTP headers. Feel free to insert your favorite cache invalidation joke or computer science anecdote here. Get it wrong and there'll be some unpleasantness to dig yourself out of. I may get lucky and get it right the first time: write that code, deploy to the prod environment - voila, everything works as anticipated. I know from experience that I'm not particularly lucky, especially when I have multiple machines talking to each other. Going through the hack - deploy - test - repeat cycle gets old pretty quickly, and in general, being a spoilt Clojure brat who lives inside the REPL aka a smug lisp weenie, I prefer to debug everything live if not in production: write my HTTP handler, eval, perform a browser request, fix the handler, eval, repeat the request, and so on until satisfied. The need to deploy and have requests go through CloudFront sorta throws a wrench in this workflow. What are we to do? Endure the long and painful turnaround time? Not when we have SSH. You know what they say: learn a few flexible tools and keep 'em sharp - they'll serve you a lifetime. SSH is one.

Deployment vs dev

What's in a name? What's the difference between the host you deploy to and your dev machine? Ideally, there should be none. But that's not how people usually roll. We'll leave my software development philosophy for another hoot. For the purpose of configuring our caching correctly, there really is none. You make a request, the CDN edge will talk to the host and port you specified as the source of truth, and that's it. Even if you are not programming Common Lisp, Smalltalk or Clojure and need to go through the compile - restart-the-world cycle, the need to deploy and then go through CloudFront will slow you down, not to mention annoy the hell out of anyone sane. Either that or you're a professional web developer with a severe case of Stockholm syndrome. At the very least allow me to assume you don't tolerate long compile times. You don't, do you? Let's go with Golang - that should be ok. If you're a Golang programmer, the following may be a pleasant way to iterate on caching and should work for you just as well.

One SSH trick

Well, technically, two, but they are symmetric and one is unlikely to work out of the box. The trick in question is of course port forwarding, which SSH can do for you when you establish a connection. Grep for the -L and -R options in the ssh man page. In typical Unix fashion, expect to confuse the two any time you need 'em. As ever, Unix utils will try to save you a few keystrokes by having you spend 20 minutes grepping their respective manpages every time you use them. An SSH connection has two sides - ain't that something - remote and local. Naturally, the best way to tell one from the other is to put one left of a : delimiter and the other right (I'm just dripping with sarcasm tonight, aren't I). Flip of a coin. Every. Freaking. Time. Anyway. SSH lets you bind a port on one side, and every time someone attempts to connect to it, it'll divert traffic to a (possibly different) port on the other side. We expect the other side in question to be able to handle incoming requests. Have you guessed it yet? Say our deployed service listens on port 3030, bound on every interface or some external interface we care about on our deployment machine. This is where your CDN would have to send requests - the distribution origin in CloudFront's lingo. Well, doing it so directly would be too pedestrian, so chances are roughly 100% you'd have a load-balancing proxy in front of it. Something like AWS ALB - that would be the real origin, and it'll have a rule sending traffic to your server or a target group. That's a long-winded way of saying requests will be sent to your server's port 3030. Neither CloudFront, nor ALB, nor whatever else you have proxying requests, cares about the identity of your deployment. Prod? Test? Dev? Matters not at all. They have the IP (or HOSTNAME) and PORT of the target configured somewhere and that's what they'll use. Well then, why don't we forego the deployment step entirely.
We'll need to forward any connection to port 3030 on the deployment machine (aka remote, aka prod) to some port on our dev laptop (aka local). That is, we run our web service on our dev machine locally and we run nothing on our prod machine. That's because we need to keep its port 3030 (or whatever) unbound so that SSH is able to grab it and forward to our dev. Still with me? We're forwarding a remote port to a local service. That would be option -R, which the ssh man page expands as -R [bind_address:]port:host:hostport - remote bind address and port on the left, local host and port on the right. You'll have to check the ssh man page if you care about the shorthand notation and the syntax for binding multiple interfaces with * or 0.0.0.0 or whatever. To further avoid confusion, let's use different ports on remote and local: 3030 on the remote machine, and start our web service locally on port 3000. I predict you'll go through a few iterations, which will leave you puzzled. From your dev machine you'll probably try:

$ ssh -N -R 3030:localhost:3000 you@remote
# command succeeds but anything hitting remote on 3030 will timeout

$ ssh -N -R *:3030:localhost:3000 you@remote
# command succeeds but anything hitting remote on 3030 will timeout

$ ssh -N -R EXPLICIT_LAN_OR_TAIL_OR_EXT_IP:3030:localhost:3000 you@remote
# command succeeds but anything hitting remote on 3030 will timeout

This is where I save you time and what little hair you still have. Just check the ports and sockets that are bound with ss -tulwp or whatever your Linux uses. Grep for 3030 if you have many. Seeing it? Exactly. You'll see no external interface on the remote binding 3030. Only localhost. I mean, technically the above commands succeeded in binding an interface - the loopback interface. Was that what you expected or wanted? Didn't think so. And it didn't signal an error. Not even a warning. RTFM, people, because it reads:

Specifying a remote bind_address will only succeed if the
server's GatewayPorts option is enabled (see sshd_config(5)).

Would you have noticed this bit inside a bigger paragraph? Of course you would. I wouldn't. I didn't. I have a life and kids. Well, I have kids.

IIUC this is a matter of security - that's why it is off by default and you need to reconfigure sshd. That would be one way to do it, of course. Let's not. My deployment machine ran GNU Guix and I didn't feel like changing the definition of my system. If ssh -R is out, what do we do? We recall that there is a symmetric ssh -L that binds a local port and forwards it to a remote (service). Wait, what's the difference, you ask? Read the previous sentence or the ssh man page carefully. The host where you start your ssh journey makes all the difference. You can bind local ports at will; it's the remote ports that present the problem. We ran ssh -R from our local dev machine, but for our trick to work we'll have to run ssh -L on the remote, connecting to our local dev laptop. As long as the remote has a route to reach our local dev machine (same LAN or VPN or Tailnet), we'll just need to SSH ... twice. First we ssh you@remote to spawn a shell session, then in it ssh -L *:3030:localhost:3000 you@dev_laptop. I know how it sounds. I just dug a tunnel in your tunnel. With more tunnels underneath if you're doing it over Tailscale. Multiple layers of encryption that we don't need. But who cares. We no longer need to redeploy: iterate on your code locally and have WWW queries go through CloudFront, your proxy or load balancer, your prod machine and finally to your local dev. Just make sure you start with a short TTL or no max-age to avoid waiting.
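Spelled out as commands, the two-hop dance looks like this (the hostnames remote and dev_laptop are placeholders, and the ports match the earlier example - 3030 remote, 3000 local):

```shell
# Hop 1: from the dev laptop, open a shell on the deployment machine.
ssh you@remote

# Hop 2: from that remote shell, bind remote port 3030 on all interfaces
# and forward it back to the web service listening on the dev laptop's
# port 3000. Unlike -R, -L binds on the client side, so no GatewayPorts
# change in sshd_config is needed.
ssh -N -L *:3030:localhost:3000 you@dev_laptop
```

Once both hops are up, the request path is CloudFront to your proxy to remote:3030 and finally to dev_laptop:3000, so every change you eval locally is immediately live behind the CDN.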

SSH is the way

True in my opinion, but this isn't the only way we could've achieved the same thing. If your CloudFront distribution origin points at a load balancer, that is to say your actual server is behind a proxy, you could change where your proxy redirects traffic. Here's an example that'd work for GitHoot, which sits behind an AWS Application Load Balancer. We have an ALB target group that "targets" our main deployment host. We could've created another target group that delivers to our dev machine - yes, you'll need Tailscale or another VPN solution for this to work with on-prem resources, else you won't be able to target an IP address. See episode 1 for details. With this dev target group in hand, we'd need to edit the relevant rules on our ALB to send traffic to it. It's actually easier than it sounds, so not the worst solution, but any time I can avoid mucking with AWS configuration it's a win in my book.
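For the curious, the ALB route might look roughly like this with the AWS CLI - the names, the VPC ID, the ARNs and the Tailscale IP are all placeholders, and I'd double-check each command against the docs before running it:

```shell
# A target group of type "ip" pointing at the dev machine's port.
# ALB accepts IP targets in the 100.64.0.0/10 range, which is what
# Tailscale hands out.
aws elbv2 create-target-group \
    --name dev-tg --protocol HTTP --port 3000 \
    --vpc-id vpc-PLACEHOLDER --target-type ip

# Register the dev machine's Tailscale IP as the sole target.
aws elbv2 register-targets \
    --target-group-arn DEV_TG_ARN \
    --targets Id=100.64.0.5,Port=3000

# Point the relevant listener rule at the dev target group.
aws elbv2 modify-rule \
    --rule-arn RULE_ARN \
    --actions Type=forward,TargetGroupArn=DEV_TG_ARN
```

Three commands out, three commands back to restore prod - doable, but you can see why a single ssh -L is less fuss.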

SSH is your Swiss army knife. This wasn't the first time I employed something similar. If you target the JVM, be it with Clojure, Java or Scala or whatever's popular these days, you're probably at least vaguely aware of JMX, which you can use for managing and monitoring JVM processes, including remote ones. If you haven't heard of it, you have some learning to do. Ditto JDK Flight Recorder. I bet all the cool kids think OpenTelemetry is like the coolest thing after K8S and Node.js. Ah, ignorance is bliss. Boring and balding enterprise programmers running circles around everyone else - there's an earth-shattering image - best put it out of your mind. So yeah, like many good things in the Java ecosystem, its documentation is all over the place, so people cargo-cult and spread a lot of nonsense and noise - setting it up correctly can be a challenge. Setting it up locally, however, is mostly trivial. If not for continuous monitoring, then for management, debugging and profiling, the above ssh -L trick will work wonderfully. I may hoot about my Clojure with JMX setup later. No promises, though.
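A sketch of that JMX-over-SSH setup, for the impatient (app.jar and the hostname are placeholders; the -D flags are the standard JDK JMX options, with the RMI port pinned so everything stays on one tunnel-friendly port):

```shell
# On the remote host: start the JVM with JMX listening on loopback only.
# Auth and SSL are off here because the tunnel is the security boundary.
java -Dcom.sun.management.jmxremote.port=9010 \
     -Dcom.sun.management.jmxremote.rmi.port=9010 \
     -Dcom.sun.management.jmxremote.authenticate=false \
     -Dcom.sun.management.jmxremote.ssl=false \
     -Djava.rmi.server.hostname=localhost \
     -jar app.jar

# On the dev machine: forward local 9010 to the remote's loopback 9010,
# then point JConsole or VisualVM at localhost:9010.
ssh -N -L 9010:localhost:9010 you@remote
```

Pinning jmxremote.rmi.port to the same value as jmxremote.port matters: otherwise RMI picks a random second port and your single forwarded port won't be enough.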

Any comments?

As always this hoot is nothing more than a GitHub gist. Please, use its respective comments section if you have something to say. Till next time.

Always be blogging

One last plug for GitHoot. We make blogging as simple as creating a GitHub gist. Let me just say how important it is for a programmer to communicate and do it well. IMO blogging regularly remains unsurpassed when it comes to learning how to write and communicate concisely and well. Not to mention, you're quite unlikely to make a name for yourself just programming. Don't just brush it off - you'll want to be known - one day - when you decide to start a company or something. Take it from me. I wish I'd started many years ago. So, yeah, try GitHoot, subscribe and start writing already. Git hooting, y'all!

Coda

Did you know I'm running a (mostly) Clojure SWAT team in London at fullmeta.co.uk? We are always on the lookout for interesting contracts. Now that git.ht is up, which you should go ahead and check out right now, I'll be going back to CTOing and consulting. I am available for hire now. I may have another great Clojure engineer available mid to late June '23. Get in touch directly or find us on LinkedIn. We aren't cheap, but we're very good. We can also perform competently in modern Java, ES6/TS, Go, C#, etc.

Find me on Twitter
