GCPing is a simple single-page website where you can find out the relative latency between your browser and multiple GCP regions. GCP manages data centers around the world, but which one is closest to you or, more importantly, to your customers? GCPing can answer that.
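To make "relative latency" concrete, here is a minimal Python sketch of the idea: time a few requests per region, then rank regions by median latency. This is not GCPing's actual frontend code (that is JavaScript running in the browser); the function names and the sample numbers are made up for illustration.

```python
import statistics
import time
import urllib.request


def time_request(url: str, timeout: float = 5.0) -> float:
    """Return the wall-clock time in milliseconds for one GET request."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0


def rank_regions(samples: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Sort regions by median latency (ms), lowest first.

    Regions with no samples are skipped rather than treated as fast.
    """
    medians = {
        region: statistics.median(values)
        for region, values in samples.items()
        if values
    }
    return sorted(medians.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    example = {
        "us-east1": [31.0, 29.5, 40.2],
        "europe-west1": [102.3, 98.7, 110.0],
        "asia-east1": [188.1, 175.4, 181.9],
    }
    for region, median_ms in rank_regions(example):
        print(f"{region}: {median_ms:.1f} ms")
```

The median is used instead of the mean so that one slow request (a cold start, say) doesn't distort the ranking. The absolute numbers matter less than the ordering.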
GCPing is a 20% project, and for the four years since its inception it was backed by a small f1-micro GCE VM in each region, each with a public IP address, and a static HTML and JavaScript frontend served from GCS.
Over time, new regions would come online, or basic maintenance and upgrades would be necessary, but for the most part GCPing ran itself. A global load-balancer was added to demonstrate how the LB can direct traffic to the nearest regional backend. The server was containerized to run on Container-Optimized OS for simpler auto-updating VMs.
But as features were added, the script to deploy GCPing got larger and more complex. Because GCPing is a 20% project, there's often not a lot of time to fix issues or simplify things. More and more regions meant more VMs to run, and while they're not expensive individually, it started to cost $250-300 per month to run all those VMs!
Over the last four years, computing at GCP has changed. In addition to GCE VMs, we now have Cloud Run, a serverless container-based platform. Deploying containers to Cloud Run is simple, and Cloud Run services can run in nearly every region GCP supports.
So in November 2020, I decided to rewrite GCPing as a serverless application on Cloud Run.
Porting the server container from COS to Cloud Run was simple; I didn't even really need to change my code (what little of it there was). The main change was rewriting the deployment process. The old deployment process was a hairy bash script that invoked gcloud to set up resources. If some step failed, it could leave the site in an inconsistent state that I'd have to take time to debug.
Cloud Run's Terraform support meant I could replace hundreds of lines of bash with a couple dozen declarative config stanzas. Deploying the site is as easy as running terraform apply, which presents a diff between the current state of the world and the state I've declared locally, and a prompt to apply my changes in the correct order.
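The difference between the imperative script and the declarative config comes down to computing a diff before acting. The Python sketch below illustrates that idea only; it is not how Terraform is implemented, and the Plan type and the resource dictionaries are hypothetical stand-ins for real resource state.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """What would change if the desired state were applied."""
    create: dict[str, dict] = field(default_factory=dict)
    update: dict[str, tuple[dict, dict]] = field(default_factory=dict)
    delete: list[str] = field(default_factory=list)

    def is_empty(self) -> bool:
        return not (self.create or self.update or self.delete)


def plan(current: dict[str, dict], desired: dict[str, dict]) -> Plan:
    """Compare current and desired resources, keyed by resource name."""
    result = Plan()
    for name, config in desired.items():
        if name not in current:
            result.create[name] = config
        elif current[name] != config:
            # Keep both sides so the diff can be shown before confirming.
            result.update[name] = (current[name], config)
    result.delete = sorted(name for name in current if name not in desired)
    return result


if __name__ == "__main__":
    # Hypothetical state: one service needs a new image, one region is new.
    current = {"us-east1": {"image": "v1"}}
    desired = {"us-east1": {"image": "v2"}, "asia-east1": {"image": "v2"}}
    print(plan(current, desired))
```

Running the same plan twice against an already-converged state yields an empty plan, which is what makes a declarative apply safe to retry after a partial failure.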
Another bonus was HTTPS -- Cloud Run services are always served over HTTPS, with a unique URL for each service. The GCE VM approach had every VM with a public IP address serving plaintext HTTP.
- static HTML+JS frontend served from GCS
- setup.sh script:
  - ensured an f1-micro VM in each region with static IP
  - created global LB with static IP
  - generated config.js to list regions->IPs for JS frontend
- originally Ubuntu VMs that ran a Go binary, later a COS VM deployed using gcloud compute instances create-with-container and a container image built using ko
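The config.js generation step in the old setup script can be sketched as follows. The real file's format isn't shown here, so the variable name and the JSON-object layout are assumptions, and the IP addresses come from the reserved documentation range rather than real VMs.

```python
import json


def render_config_js(region_ips: dict[str, str], var_name: str = "regions") -> str:
    """Render a region -> IP mapping as a JavaScript variable assignment.

    The output format is hypothetical; keys are sorted so the generated
    file is stable from one run to the next.
    """
    body = json.dumps(region_ips, indent=2, sort_keys=True)
    return f"var {var_name} = {body};\n"


if __name__ == "__main__":
    # Placeholder addresses from the 203.0.113.0/24 documentation range.
    print(render_config_js({
        "us-east1": "203.0.113.10",
        "asia-east1": "203.0.113.20",
    }))
```

A generated file like this has to be regenerated and re-uploaded whenever an IP changes, which is one more step that an imperative script can leave half-finished.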
- cost! $200-250/month
- serving from GCS didn't support HTTPS unless I put it behind a load balancer ($$$)
- serving from GCE didn't support HTTPS unless I put it behind a load balancer ($$$)
- absolute lowest latency isn't really the goal -> use a CDN!
- relative latency answers "what's closest"
- deployment script was very imperative, occasionally required hand-tuning
- create-with-container is a gcloud feature, not reproducible in APIs or things like Terraform
- running a full VM 24/7 presents a potential security issue
- static HTML+JS frontend served by Cloud Run services
- container image built using ko (no change)
- Terraform config:
  - ensures the container is deployed to Cloud Run in all regions
  - ensures global LB across regional serverless network endpoint groups (NEGs)
- global LB w/ managed SSL cert for https://global.gcping.com and https://gcping.com
- global LB for each regional Cloud Run service (https://asia-east1.gcping.com)
  - unnecessary cost for vanity: Cloud Run services already get a stable domain name w/ Google-managed SSL certs
- cost! ~$20/month, depending on usage
- HTTPS everywhere!
- traffic is low and very bursty -> don't pay while nobody's using it
- declarative deployment w/ Terraform (create-or-update w/ diff + confirm)
- automate reconciliation w/ GitHub Actions ("GitOps")
- possibly automate turning up new regions? At least notify so I can do it manually...
- budget alerts in case of traffic spikes
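The "notify me about new regions" idea could be as small as a set difference. The Python sketch below assumes the two region lists are obtained elsewhere (for example from the platform's list of locations and from the deployed config); both function names are hypothetical.

```python
from typing import Iterable


def regions_to_turn_up(available: Iterable[str], deployed: Iterable[str]) -> list[str]:
    """Regions that exist on the platform but have no deployed service yet."""
    return sorted(set(available) - set(deployed))


def notification(missing: list[str]) -> str:
    """Build a short human-readable message; empty string means nothing to do."""
    if not missing:
        return ""
    return "New regions without a GCPing service: " + ", ".join(missing)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    missing = regions_to_turn_up(
        available=["us-east1", "asia-east1", "europe-west1"],
        deployed=["us-east1"],
    )
    print(notification(missing) or "All regions are deployed.")
```

A scheduled job could run this check and open an issue or send a message when the result is non-empty, leaving the actual turn-up as a reviewed manual change.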