
GKE zonal cluster migration

Apparently a GKE cluster costs $75 a month; this is the most expensive component of the cloud bill for my side project. There is a free tier ... but I recently learned it only applies to zonal clusters, not regional clusters. Much to my chagrin. The pricing docs are very unclear.

So I moved from a regional cluster to a zonal cluster. If you know kube at all you know this is not the kind of thing kube makes easy.
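
Creating the new zonal cluster itself is the easy part; I didn't write down the exact command, but it's roughly this (cluster name is a placeholder; the zone and machine type match what I mention later in this doc):

```sh
# new single-node zonal cluster -- the management fee for one zonal cluster
# is covered by the GKE free tier
gcloud container clusters create my-zonal-cluster \
  --zone us-central1-a \
  --machine-type e2-medium \
  --num-nodes 1
```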

Hard parts about this:

  • moving the storage over
  • recreating the NEGs

Scroll to the end for billing charts.

Moving storage

Although most of my data is in a managed GCP SQL DB, two key things are on kube 'persistent volumes' -- session data, which is stored in redis (to my endless regret), and a 10gb cache file that I have forgotten how to recreate (:upside_down_smiley:).

This post was my north star: https://acv.engineering/posts/migrating-a-kubernetes-pvc/. It's from the tech blog of an auto auction company and it saved my bacon.

I will summarize what it said:

  1. kubectl get -oyaml pv > old-pv.yaml && kubectl get -oyaml pvc > old-pvc.yaml in your old cluster
  2. apply those yamls in your new cluster (after running helm?)
  3. do some edits bc the IDs won't match
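
In rough command form (volume and claim names are placeholders, and the exact field surgery is per that post, not from memory):

```sh
# in the OLD cluster: make sure the underlying GCE disk outlives the cluster
kubectl patch pv MYVOLUMENAME -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# export the objects
kubectl get pv MYVOLUMENAME -o yaml > old-pv.yaml
kubectl get pvc MYCLAIMNAME -o yaml > old-pvc.yaml

# hand-edit the yamls: strip the cluster-specific bits (metadata.uid,
# resourceVersion, creationTimestamp, the PV's claimRef uid) so they
# apply cleanly against the new cluster

# switch kubectl context to the NEW cluster, then
kubectl apply -f old-pv.yaml -f old-pvc.yaml
```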

I ended up applying the yamls before helming. I think it created some extra volumes as well? But my most important data, the sessions and the 10g unreplaceable blah, seem to be intact, so I'm happy.

Storing session data in redis was a mistake; the DB would have been fine, and redis has had other problems. (At first the data kept getting wiped whenever my cluster rebooted, because redis was a bare pod rather than a ReplicaSet. Sigh, fun.)

For a while helm + kubectl were throwing errors like this one when I tried to install my stuff:

Error: INSTALLATION FAILED: cannot patch "MYVOLUMENAME" with kind PersistentVolumeClaim: PersistentVolumeClaim "MYVOLUMENAME" is invalid: spec.resources.requests.storage: Forbidden: field can not be less than previous value

Turns out my '2G' volumes had ended up as '2Gi' at some point, and since 2Gi is slightly bigger than 2G, helm's patch looked like a shrink. I had to change that in the YAML.
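
The fix was just matching the units in my chart, something like this (hypothetical PVC excerpt):

```yaml
# hypothetical excerpt from the chart's PVC -- the live claim said 2Gi,
# so the chart has to say 2Gi too (shrinking a PVC is forbidden)
spec:
  resources:
    requests:
      storage: 2Gi  # was "2G"
```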

In an earlier project, while trying to downsize my regional cluster from 2 zones to 1, I learned that in a regional cluster, PVs seem to be tied to the zone where they're created. Ugh.

-> ah but you can at least move the underlying GCP disks between zones if you dig around in the gcloud CLI: there's a gcloud compute disks move command, it's just not exposed in the web UI. Urghgh, this would have saved so much time.
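
For posterity, it's something like this (disk name and zones are placeholders):

```sh
# move a GCE persistent disk to another zone -- not exposed in the web UI
gcloud compute disks move MYDISKNAME \
  --zone us-central1-b \
  --destination-zone us-central1-a
```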

Recreating the NEGs

My load balancer points to my pods via kube Service objects, which expose named 'NEGs' (network endpoint groups). These are created automatically by the Service objects; afterwards you can inspect them with gcloud compute network-endpoint-groups list.
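
For context, the Service side presumably looks roughly like this; the cloud.google.com/neg annotation is what makes GKE create and sync the named NEG (service name, selector, and port here are placeholders, though port 8000 matches the error below):

```yaml
# sketch of one of my Services -- the annotation tells GKE to create a
# standalone NEG with this name and keep it synced to the pod endpoints
apiVersion: v1
kind: Service
metadata:
  name: MYNEGNAME
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"8000": {"name": "MYNEGNAME"}}}'
spec:
  selector:
    app: myapp
  ports:
    - port: 8000
      targetPort: 8000
```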

All fun and automated ... and completely broken in my particular edge case. Kube 'ingress' is way too cloud-specific imo. I have never had a completely smooth experience with it anywhere, and it's different every time I look.

I got a classy kube gibberish message like:

Warning SyncNetworkEndpointGroupFailed 48m (x12 over 55m) neg-controller Failed to sync NEG "MYNEGNAME" (will retry): neg name MYNEGNAME is already in use, found conflicting description: expected description of NEG object "us-central1-a"/"MYNEGNAME" to be {"cluster-uid":"some-uuid","namespace":"default","service-name":"MYNEGNAME","port":"8000"}, but got {"cluster-uid":"some-other-uuid","namespace":"default","service-name":"MYNEGNAME","port":"8000"}

The fix here was to:

  1. edit the NEGs? nope this is some weird internal state I can't access
  2. delete the NEGs so the services would recreate them ... but I can't delete them because they're ref'd by my backend services. (not sure if this is a GCP quirk or a terraform quirk)
  3. point all the backend services at a random NEG I was planning to delete, terraform apply
  4. delete the NEGs with gcloud CLI
  5. wait a while for the kube Service objects to recreate them
  6. restore my terraform config and terraform apply again.
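
Steps 4 and 5 in rough command form (NEG name and zone are placeholders):

```sh
# after terraform has detached the backend services from the stuck NEG:
gcloud compute network-endpoint-groups delete MYNEGNAME --zone us-central1-a

# wait for the neg-controller to recreate it with the new cluster-uid
gcloud compute network-endpoint-groups list
kubectl describe service MYNEGNAME   # events show whether the NEG sync succeeded
```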

a.k.a. this was not a zero-downtime migration (not that I was trying for one). I didn't understand how the NEGs were attached, and therefore didn't apply the first rule of clean migrations: add the new one under a different name, make sure both are working, then delete the old one.

Other things I did to save money

Committed use discounts or CUDs. I bought them for my e2-medium instance (the kube node pool) and for SQL. The e2 compute credits took 24 hours to kick in but I'm happy with them. The SQL credits kicked in immediately -- but they raised my price.

It turns out my tiny SQL instance is not CUD-eligible. CUDs are in theory nonrefundable but the billing person was nice and I had screenshots of their tool estimating I would save money. They canceled it.

Disk, esp fancy 'balanced performance' disk, is not free. I shrank my k8s boot disk from 100g to 50g.
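
As far as I know GKE won't resize a boot disk in place, so the smaller disk means a new node pool (or baking it into the new cluster). Roughly (pool and cluster names are placeholders):

```sh
# replacement pool with a smaller boot disk; drain and delete the old pool after
gcloud container node-pools create smaller-pool \
  --cluster my-zonal-cluster \
  --zone us-central1-a \
  --machine-type e2-medium \
  --disk-size 50 \
  --num-nodes 1
```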

All told, my bill went from ~$7-8/day last month, with 2 nodes in a regional cluster, to ~$3/day now with a zonal cluster, a single node, and the compute CUD. Billing screenshot below.

Billing screenshot

[screenshot: daily GCP cost over the last ~30 days, stacked by service]

The TLDR is that red (compute) went down by half when I removed one of two kube nodes, and blue (k8s) went away when I migrated from regional to zonal cluster.

Details:

  • 5/20-5/26: end of previous month. ~$8 per day. There's a 'compute free tier' which makes this number variable over the month, but call it $7-8.
  • 5/27: red (compute) goes down by half because I shrank my regional cluster from 2 nodes to 1
  • 5/28: the tiny purple line at the bottom shrinks because I deleted some DNS zones
  • 5/27-5/29: yellow (SQL) goes up because my SQL committed use discount did not actually work; on 6/3 they deleted it. Less visible, but compute goes from $1.85 to $1.59 per day because of the CUDs.
  • 5/30-6/1: red (compute) goes down because the compute free tier restarts on the first of the month
  • 6/15: blue (kube) disappears because I migrated from the regional to the zonal cluster. Regional and zonal clusters both list at $0.10/hour, but the free tier only covers zonal clusters, so mine is now effectively free. Red increases on 6/15 because two nodes were up at the same time for a bit.
  • 6/16: compute free tier ends for the month, so red (compute) gets larger
  • 6/16-6/18: green (networking) shrinks because I removed an unused NAT. Red (compute) shrinks because I shrank the kube boot disk from 100g to 50g and deleted an extra 20g disk that I created during the migration(s).

If you're playing with the billing console, I have been using the default report (last 30 days, group by service), as well as filtering to one service and grouping by SKU.

My bill in April (before any tweaks) was $230. This is ~ $7.50 * 30.

My bill in July, next month, should be $3 * 30 ~= $90. This is probably still too high; I am stuck on a larger box than I need because the kubernetes 'kube-system' workload takes up a lot of each node.
