KEP-1880 allows configuring the ServiceCIDRs assigned to a Kubernetes cluster via API objects.
This demo uses kind and is based on the existing PR kubernetes/kubernetes#116516.
A live demo was presented at the SIG-Network meeting on Oct 12th 2023: https://youtube.com/playlist?list=PL69nYSiGNLP2E8vmnqo5MwPOY25sDWIxb&si=vTNuT7EBFujoQOce
You can check out the PR locally and build your own image:
kind build node-image --image kindest:servicecidr
or use my prebuilt one. To create the cluster, just specify the image and a configuration that enables the alpha API (runtime-config) and the feature gate; a sketch of such a configuration is shown after the command below.
kind create cluster --image aojea/kindest:servicecidr --config kind-config.yaml -v9 --name servicecidr
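The kind configuration might look roughly like the following sketch. The feature gate and runtime-config entries are my assumptions based on the KEP (the MultiCIDRServiceAllocator gate and the networking.k8s.io/v1alpha1 API group), and the /28 service subnet matches the default ServiceCIDR shown below; adjust them to whatever the PR you built actually expects.

# kind-config.yaml (illustrative sketch, not the exact file from the demo)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  MultiCIDRServiceAllocator: true
runtimeConfig:
  "networking.k8s.io/v1alpha1": "true"
networking:
  serviceSubnet: "10.96.0.0/28"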
After creation you can observe that several objects are created at bootstrap:
- Default ServiceCIDR, named kubernetes, created from the apiserver flag values
kubectl get servicecidrs
NAME IPV4 IPV6 AGE
kubernetes 10.96.0.0/28 <none> 17m
- Default Kubernetes Service, named kubernetes, which takes the first IP from the default ServiceCIDR
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17m
The default Service has the ClusterIP 10.96.0.1, which must have a corresponding IPAddress object:
kubectl get ipaddress
NAME PARENTREF
10.96.0.1 services/default/kubernetes
10.96.0.10 services/kube-system/kube-dns
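Each IPAddress object records its owner in a parentRef; inspecting the one for the default Service shows something roughly like this (output trimmed and illustrative, field names as in the PR's v1alpha1 API):

kubectl get ipaddress 10.96.0.1 -o yaml
apiVersion: networking.k8s.io/v1alpha1
kind: IPAddress
metadata:
  name: 10.96.0.1
spec:
  parentRef:
    group: ""
    resource: services
    namespace: default
    name: kubernetes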
The relation between these objects is: Services get their ClusterIPs allocated from the ServiceCIDRs, and to guarantee ClusterIP uniqueness across the cluster, each ClusterIP has an associated IPAddress object.
ServiceCIDRs are protected with finalizers to avoid leaving Service ClusterIPs orphaned; the finalizer is only removed if there is another ServiceCIDR that contains the existing IPAddresses, or if there are no IPAddresses belonging to the subnet.
There are cases where the ServiceCIDR range gets exhausted. Previously, increasing the Service range was a disruptive operation that could also cause data loss. With this new feature, users just need to add a new ServiceCIDR.
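To see what would block a ServiceCIDR deletion, a rough check is to list the IPAddresses that fall inside its range, for example (a naive prefix match for illustration only, not real CIDR math; the parentRef field path follows the v1alpha1 API):

# list IPAddresses inside 10.96.0.0/28 together with the Service that owns them
kubectl get ipaddresses \
  -o custom-columns=IP:.metadata.name,PARENTREF:.spec.parentRef.name --no-headers \
  | grep '^10\.96\.0\.'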
Create Services until we exhaust the existing ServiceCIDR:
for i in $(seq 1 13); do kubectl create service clusterip "test-$i" --tcp 80 -o json | jq -r .spec.clusterIP; done
10.96.0.11
10.96.0.5
10.96.0.12
10.96.0.13
10.96.0.14
10.96.0.2
10.96.0.3
10.96.0.4
10.96.0.6
10.96.0.7
10.96.0.8
10.96.0.9
error: failed to create ClusterIP service: Internal error occurred: failed to allocate a serviceIP: range is full
We can see that the last Service fails to be created because the range is full, so we just create a new ServiceCIDR:
$ cat cidr.yaml
apiVersion: networking.k8s.io/v1alpha1
kind: ServiceCIDR
metadata:
  name: newcidr1
spec:
  ipv4: 192.96.0.0/24
$ kubectl apply -f cidr.yaml
servicecidr.networking.k8s.io/newcidr1 created
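Optionally, before creating new Services you can wait for the new ServiceCIDR to report the Ready condition (the same condition that appears later in the status output):

kubectl wait --for=condition=Ready servicecidr/newcidr1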
and we can see that we can now create new Services:
for i in $(seq 13 16); do kubectl create service clusterip "test-$i" --tcp 80 -o json | jq -r .spec.clusterIP; done
192.96.0.48
192.96.0.200
192.96.0.121
192.96.0.144
that get their IPs from the new ServiceCIDR:
kubectl get service/test-13
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-13 ClusterIP 192.96.0.83 <none> 80/TCP 25s
A ServiceCIDR cannot be deleted if there are still IPAddresses depending on it:
kubectl delete servicecidr newcidr1
servicecidr.networking.k8s.io "newcidr1" deleted
a finalizer keeps it from being removed:
kubectl get servicecidr newcidr1 -o yaml
apiVersion: networking.k8s.io/v1alpha1
kind: ServiceCIDR
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"networking.k8s.io/v1alpha1","kind":"ServiceCIDR","metadata":{"annotations":{},"name":"newcidr1"},"spec":{"ipv4":"192.96.0.0/24"}}
  creationTimestamp: "2023-10-12T15:11:07Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2023-10-12T15:12:45Z"
  finalizers:
  - networking.k8s.io/service-cidr-finalizer
  name: newcidr1
  resourceVersion: "1133"
  uid: 5ffd8afe-c78f-4e60-ae76-cec448a8af40
spec:
  ipv4: 192.96.0.0/24
status:
  conditions:
  - lastTransitionTime: "2023-10-12T15:12:45Z"
    message: There are still IPAddresses referencing the ServiceCIDR, please remove
      them or create a new ServiceCIDR
    reason: OrphanIPAddress
    status: "False"
    type: Ready
until all the referenced IPAddresses are deleted
for i in $(seq 13 16); do kubectl delete service "test-$i" ; done
service "test-13" deleted
service "test-14" deleted
service "test-15" deleted
service "test-16" deleted
so it can be completely removed:
kubectl get servicecidr newcidr1
Error from server (NotFound): servicecidrs.networking.k8s.io "newcidr1" not found
Another common use case is migrating the existing Service range to a new range; imagine we want to move our 10.96.0.0/28 to 192.168.7.0/24.
We can follow these steps (a sketch of the temporary ServiceCIDR manifest follows the list):
- Create a new ServiceCIDR with 192.168.7.0/24
- Delete the default ServiceCIDR so its Ready condition is set to False and no new IPAddresses will be allocated from it.
- At this point only the kubernetes.default Service should remain in the default ServiceCIDR subnet
- Recreate all the existing Services so they get IPs from the new ServiceCIDR (delete and create)
- At this point we can start a new apiserver with the flags matching the new ServiceCIDR range
- When the new apiserver is running and ready, we can shut down the old apiserver.
- Then delete the kubernetes.default Service; this will unblock the deletion of the default ServiceCIDR, which will be recreated by the new apiserver, and the new kubernetes.default Service will be created in the new range
- At this point we can delete the temporary ServiceCIDR, since it overlaps with the newly created default ServiceCIDR
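As a sketch of the first and fifth steps (the name migration-cidr is just an example), the temporary ServiceCIDR could look like the manifest below, and the new apiserver would be started with --service-cluster-ip-range=192.168.7.0/24:

apiVersion: networking.k8s.io/v1alpha1
kind: ServiceCIDR
metadata:
  name: migration-cidr
spec:
  ipv4: 192.168.7.0/24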
Migrating to a different IP family follows the same procedure, just using a subnet of the new IP family for the new ServiceCIDR.
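For example, to move to an IPv6 range the new ServiceCIDR would use the ipv6 field instead (the name and ULA prefix below are purely illustrative):

apiVersion: networking.k8s.io/v1alpha1
kind: ServiceCIDR
metadata:
  name: newcidr-v6
spec:
  ipv6: fd00:10:96::/112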