Run kops get cluster -o yaml --full > cluster-full.yaml for reference and backup.
kops edit cluster:
- change networking to the following (8912 takes advantage of AWS jumbo frames, the 9001-byte instance MTU, while leaving headroom for Weave's encapsulation overhead):
spec:
  #[...]
  networking:
    weave:
      mtu: 8912
- change topology to match:
spec:
  #[...]
  topology:
    bastion:
      bastionPublicName: bastion.<cluster-name>
    dns:
      type: Public
    masters: private
    nodes: private
- save and quit (:wq in vim), then run kops update cluster.
This will also update a few other fields, like networkPluginName and configureCloudRoutes.
Run kops get cluster -o yaml --full > cluster-full-v1.yaml for reference.
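To see exactly which fields changed, diff this dump against the first one:
diff cluster-full.yaml cluster-full-v1.yaml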
kops edit cluster again: change the subnets' type from Public to Private. Add a utility subnet for each zone, with CIDRs inside networkCIDR, and make sure they don't overlap (a quick check follows the listing):
spec:
  #[...]
  subnets:
  - cidr: 172.20.32.0/19
    name: eu-west-1a
    type: Private
    zone: eu-west-1a
  - cidr: 172.20.64.0/19
    name: eu-west-1b
    type: Private
    zone: eu-west-1b
  - cidr: 172.20.96.0/19
    name: eu-west-1c
    type: Private
    zone: eu-west-1c
  - cidr: 172.20.0.0/22
    name: utility-eu-west-1a
    type: Utility
    zone: eu-west-1a
  - cidr: 172.20.4.0/22
    name: utility-eu-west-1b
    type: Utility
    zone: eu-west-1b
  - cidr: 172.20.8.0/22
    name: utility-eu-west-1c
    type: Utility
    zone: eu-west-1c
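If you'd rather not eyeball the CIDR math, a throwaway check like this works (a sketch; assumes python3 on your workstation):
python3 - <<'EOF'
from ipaddress import ip_network
from itertools import combinations

# the subnet CIDRs from the spec above
cidrs = ["172.20.32.0/19", "172.20.64.0/19", "172.20.96.0/19",
         "172.20.0.0/22", "172.20.4.0/22", "172.20.8.0/22"]
for a, b in combinations(map(ip_network, cidrs), 2):
    if a.overlaps(b):
        print("overlap:", a, b)
print("check done")
EOF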
Also set the api field to use a load balancer instead of DNS:
spec:
  #[...]
  api:
    loadBalancer:
      type: Public
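Once this is applied later with kops update cluster --yes, the api record should point at an ELB instead of a master IP; you can confirm with a plain DNS lookup (hostname is a placeholder, substitute your own):
dig +short api.<cluster-name>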
kops get ig
- for each ig: kops edit ig <ig-name> and remove the line with associatePublicIp: true (an example spec follows)
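For orientation, an instance group spec looks roughly like this (a sketch; the machine type and sizes are illustrative):
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  associatePublicIp: true   # <- remove this line
  machineType: t2.medium
  maxSize: 6
  minSize: 1
  role: Node
  subnets:
  - eu-west-1a
  - eu-west-1b
  - eu-west-1c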
kops update cluster
- check that the output looks sane.
- Run kops get cluster -o yaml --full > cluster-full-v2.yaml for reference.
- Have lunch, a coffee break or whatever works for you. If things break in the next step, you need to be able to sit with it for a while.
# kops update cluster --yes
I0614 12:09:48.619426 23371 executor.go:91] Tasks: 0 done / 124 total; 40 can run
I0614 12:09:49.546706 23371 executor.go:91] Tasks: 40 done / 124 total; 20 can run
I0614 12:09:50.282524 23371 executor.go:91] Tasks: 60 done / 124 total; 54 can run
I0614 12:09:53.494105 23371 executor.go:91] Tasks: 114 done / 124 total; 7 can run
I0614 12:09:54.100612 23371 executor.go:91] Tasks: 121 done / 124 total; 3 can run
I0614 12:09:54.166540 23371 natgateway.go:266] Waiting for NAT Gateway "nat-0b97d67ba07694ea1" to be available (this often takes about 5 minutes)
I0614 12:09:54.167893 23371 natgateway.go:266] Waiting for NAT Gateway "nat-03a9ed7ac5518deb3" to be available (this often takes about 5 minutes)
I0614 12:09:54.236704 23371 natgateway.go:266] Waiting for NAT Gateway "nat-09eb41a0c3c06d1aa" to be available (this often takes about 5 minutes)
I0614 12:12:10.804405 23371 executor.go:91] Tasks: 124 done / 124 total; 0 can run
I0614 12:12:10.807009 23371 dns.go:152] Pre-creating DNS records
I0614 12:12:12.482297 23371 update_cluster.go:229] Exporting kubecfg for cluster
Kops has set your kubectl context to <cluster-name>
When doing the rolling update you'll probably see this:
kops rolling-update cluster
Using cluster from kubectl context: <clustername>
Unable to reach the kubernetes API.
Use --cloudonly to do a rolling-update without confirming progress with the k8s API
error listing nodes in cluster: Get https://api.<clustername>/api/v1/nodes: dial tcp <api ip addr>:443: i/o timeout
The next step is to do a rolling update with --cloudonly, as we can't reach our API right now. You get no draining or other safe procedures this way, so any workloads that need to be scaled down gracefully must be handled before the migration starts, while the API is still reachable.
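For example, you could drain nodes yourself up front (a sketch; the node name is a placeholder, and the flags match kubectl of this era):
kubectl drain <node-name> --ignore-daemonsets --delete-local-data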
# kops rolling-update cluster --cloudonly --yes
Using cluster from kubectl context: <clustername>
NAME STATUS NEEDUPDATE READY MIN MAX
master-eu-west-1a NeedsUpdate 1 0 1 1
master-eu-west-1b NeedsUpdate 1 0 1 1
master-eu-west-1c NeedsUpdate 1 0 1 1
nodes NeedsUpdate 3 0 1 6
W0614 12:16:24.031505 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:16:24.031525 23427 rollingupdate_cluster.go:460] Stopping instance "i-0afdf347d38de97e3", in AWS ASG "master-eu-west-1c.masters.<clustername>".
W0614 12:21:24.268966 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
W0614 12:21:24.269025 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:21:24.269035 23427 rollingupdate_cluster.go:460] Stopping instance "i-07a4c3d75c3c959bd", in AWS ASG "master-eu-west-1a.masters.<clustername>".
W0614 12:26:24.780248 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
W0614 12:26:24.780341 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:26:24.780366 23427 rollingupdate_cluster.go:460] Stopping instance "i-0f404c6d5b08e0ae1", in AWS ASG "master-eu-west-1b.masters.<clustername>".
W0614 12:31:25.351674 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
W0614 12:31:25.351850 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:31:25.351883 23427 rollingupdate_cluster.go:460] Stopping instance "i-04110186afe01a445", in AWS ASG "nodes.<clustername>".
W0614 12:33:25.768280 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
W0614 12:33:25.768321 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:33:25.768332 23427 rollingupdate_cluster.go:460] Stopping instance "i-045f4589362b37ec0", in AWS ASG "nodes.<clustername>".
W0614 12:35:26.179288 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
W0614 12:35:26.179369 23427 rollingupdate_cluster.go:372] Not draining cluster nodes as 'cloudonly' flag is set.
I0614 12:35:26.179398 23427 rollingupdate_cluster.go:460] Stopping instance "i-0752c6bbfaf29dc8f", in AWS ASG "nodes.<clustername>".
W0614 12:37:26.694596 23427 rollingupdate_cluster.go:401] Not validating cluster as cloudonly flag is set.
I0614 12:37:26.697947 23427 rollingupdate_cluster.go:241] Rolling update completed!
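Once DNS has caught up and the API answers again, it's worth confirming the cluster is actually healthy (standard kops/kubectl commands):
kops validate cluster
kubectl get nodes -o wide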
- There are always things I forget, or that could have been done differently. I forgot to create the Bastion instance group. Run kops create ig --name=<cluster-name> bastions --role Bastion --subnet utility-eu-west-1a,utility-eu-west-1b,utility-eu-west-1c, save the spec (if it looks ok) and run kops update cluster --yes. You shouldn't need to do a rolling update at this point.
- And kube2iam needs its config changed from cbr0 (the interface for kubenet) to weave (the interface for weave); a sketch follows.
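In practice that means changing the --host-interface argument on the kube2iam DaemonSet (a sketch; the image tag and surrounding manifest fields are elided or illustrative):
      containers:
      - name: kube2iam
        image: jtblin/kube2iam
        args:
        # other args elided; the interface flag is the one that changes:
        - --host-interface=weave   # was --host-interface=cbr0 under kubenet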
Your mileage may vary - your cluster might have peculiarities this guide doesn't cover. It is also important to have DNS/Route53 set up properly so that kops/kubernetes can update it.
I think I was hit by weaveworks/weave#3011 / weaveworks/weave#2997.
Seeing lots of