GCPing is a simple single-page website where you can find out the relative latency between your browser and multiple GCP regions. GCP manages data centers around the world, but which one is closest to you, or more importantly, your customers? GCPing can answer that.

GCPing is a 20% project. For the four years since its inception, it was backed by a small f1-micro GCE VM in each region, each with a public IP address, plus a static HTML and JavaScript frontend served from GCS.

Over time, new regions would come online, or basic maintenance and upgrades would be necessary, but for the most part GCPing ran itself. A global load-balancer was added to demonstrate how the LB can direct traffic to the nearest regional backend. The server was containerized to run on Container-Optimized OS for simpler auto-updating VMs.

But as features were added, the script to deploy GCPing got larger and more complex. As a 20% project, there's often not a lot of time to fix issues or simplify things. With more and more regions, that meant more VMs to run, and while they're not expensive, it started to cost $250-300 per month to run all those VMs!

Over the last four years, computing at GCP has changed. In addition to GCE VMs, we now have Cloud Run, a serverless container-based platform. Deploying containers to Cloud Run is simple, and Cloud Run services can run in nearly every region GCP supports. [[[ and the rest are coming soon? ]]]

So in November 2020, I decided to rewrite GCPing as a serverless application on Cloud Run.

Porting the server container from COS to Cloud Run was simple; I didn't even really need to change my code (what little of it there was). The main change was rewriting the deployment process. The old process was a hairy bash script that invoked gcloud to set up resources. If some step failed, it could leave the site in an inconsistent state that I'd have to take time to debug.

Cloud Run's Terraform support meant I could replace hundreds of lines of bash with a couple dozen declarative config stanzas. Deploying the site is as easy as running terraform apply, which presents a diff between the current state of the world and the state I've declared locally, and a prompt to apply my changes in the correct order.

Another bonus was HTTPS -- Cloud Run services are always served over HTTPS, with a unique URL for each service. The GCE VM approach had every VM with a public IP address serving plaintext HTTP.

Easier Deployment

Cost Savings

HTTPS Everywhere!


Previous Architecture

  • static HTML+JS frontend served from GCS
  • setup.sh script:
    • ensured an f1-micro VM in each region with static IP
    • created global LB with static IP
    • generated config.js to list regions->IPs for JS frontend

Originally these were Ubuntu VMs that ran a Go binary; later, each was a COS VM deployed using gcloud compute instances create-with-container, running a container image built using ko.

Cost Breakdown

Problems

  • cost! $200-250/month
  • serving from GCS didn't support HTTPS unless I put it behind a load balancer ($$$)
  • serving from GCE didn't support HTTPS unless I put it behind a load balancer ($$$)
  • absolute lowest latency isn't really the goal (if you need that, use a CDN!)
    • relative latency answers "what's closest"
  • deployment script was very imperative, occasionally required hand-tuning
  • create-with-container is a gcloud feature, not reproducible in APIs or things like Terraform
  • running a full VM 24/7 presents a potential security issue

New Architecture

  • static HTML+JS frontend served by Cloud Run services
  • container image built using ko (no change)
  • Terraform config:

Intermediate Architecture

  • global LB for each regional Cloud Run service (https://asia-east1.gcping.com)
    • unnecessary cost for vanity, Cloud Run services already get a stable domain name w/ Google-managed SSL certs

Advantages

  • cost! ~$20/month, depending on usage
  • HTTPS everywhere!
  • traffic is low and very bursty -> don't pay while nobody's using it
  • declarative deployment w/ Terraform (create-or-update w/ diff + confirm)

Future Work

  • automate reconciliation w/ GitHub Actions ("GitOps")
  • possibly automate turning up new regions? At least notify so I can do it manually...
  • budget alerts in case of traffic spikes