@bprashanth
Created October 1, 2015 17:37
```yaml
apiVersion: v1
kind: Service
metadata:
  name: echoheadersx
  labels:
    app: echoheaders
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30301
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: echoheaders
---
apiVersion: v1
kind: Service
metadata:
  name: echoheadersdefault
  labels:
    app: echoheaders
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30302
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: echoheaders
---
apiVersion: v1
kind: Service
metadata:
  name: echoheadersy
  labels:
    app: echoheaders
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30284
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: echoheaders
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: echoheaders
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: echoheaders
    spec:
      containers:
      - name: echoheaders
        image: bprashanth/echoserver:0.0
        ports:
        - containerPort: 8080
```

GCE Load Balancer Controller

A controller that orchestrates the life-cycle of GCE L7 Load Balancers based on Kubernetes Ingress.

Disclaimer:

  • This is a work in progress.
  • It relies on the experimental Ingress resource released with Kubernetes 1.1; as such, any guarantees that apply to experimental APIs in Kubernetes also apply here.
  • The loadbalancer controller pod is not aware of your GCE quota.

Overview

GCE does not have a single resource representing an L7 loadbalancer. To achieve L7 loadbalancing through Kubernetes, we employ a resource called Ingress. Each Ingress creates the following GCE resource graph:

Global Forwarding Rule -> TargetHttpProxy -> Url Map -> Backend Service -> Instance Group

The L7 controller manages the lifecycle of each component in the graph. If an edge is disconnected, it fixes it. Each Ingress translates to a new GCE L7, but the Instance Group and Backend Services are shared across L7s. This allows fanout, whereby you can acquire a single public IP from GCE and use it to route traffic to various backend services based on the URL path and hostname.

Implementation details

The controller manages cloud resources through a notion of pools. Each pool is the representation of the last known state of a logical cloud resource. Pools are periodically synced with the desired state, as reflected by the kubernetes api. When you create a new Ingress, the following happens:

  • Create BackendServices for each kubernetes backend in the Ingress, through the backend pool.
  • Add nodePorts for each BackendService to an Instance Group with all the instances in your cluster, through the instance pool.
  • Create a UrlMap, TargetHttpProxy, Global Forwarding Rule through the loadbalancer pool.
  • Update the loadbalancer's urlmap according to the Ingress.

Periodically, each pool checks that it has a valid connection to the next hop in the above resource graph. For example, the backend pool checks that each backend is connected to the instance group and that the node ports match, the instance pool checks that all the kubernetes nodes are part of the instance group, and so on. Since backends are a limited resource, they're shared (well, everything is limited by your quota, but this applies doubly to backend services). This means you can set up N Ingresses exposing M services through different paths and the controller will only create M backends. When all the Ingresses are deleted, the backend pool GCs the backends.

Creation

Before you can start creating Ingresses you will need a loadbalancer/Ingress controller. We can use the rc.yaml in this directory:

$ kubectl create -f rc.yaml
replicationcontroller "gcelb" created
$ kubectl get pods
NAME                READY     STATUS    RESTARTS   AGE
gcelb-xxa53         1/1       Running   0          12s

A couple of things to note about this controller:

  • It needs a service with a node port to use as the default backend. This is the backend that's used when an Ingress does not specify a default.
  • It has an intentionally long terminationGracePeriod; this is only required with the --quit-on-sigterm flag (see Deletion). A rough sketch of such an rc.yaml follows this list.
  • Don't start 2 instances of the controller in a single cluster; they will fight each other.
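The rc.yaml itself is not reproduced in this gist, but as a rough, hypothetical sketch it might look something like the manifest below. The image name and the default-backend flag are placeholders (not taken from this gist); only the long terminationGracePeriodSeconds and the --quit-on-sigterm and --resync-period flags are described here.

```yaml
# Hypothetical sketch of rc.yaml; the image and the default-backend flag
# are placeholders, not values from this gist.
apiVersion: v1
kind: ReplicationController
metadata:
  name: gcelb
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: gcelb
    spec:
      # Long grace period so the controller has time to tear down cloud
      # resources on SIGTERM when started with --quit-on-sigterm (see Deletion).
      terminationGracePeriodSeconds: 600
      containers:
      - name: gcelb
        image: <l7-lb-controller-image>              # placeholder
        args:
        - --default-backend-service=<ns>/<svc>       # placeholder NodePort service
        - --resync-period=30s
        - --quit-on-sigterm
```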

The loadbalancer controller will watch for Services, Nodes and Ingress. Nodes already exist (the nodes in your cluster). We need to create the other 2. You can do so using the ingress-app.yaml in this directory.

A couple of things to note about the Ingress:

  • It creates a Replication Controller for a simple echoserver application, with 1 replica.
  • It creates 3 services for the same application pod: echoheaders[x, y, default]
  • It creates an Ingress with 2 hostnames and 3 endpoints (foo.bar.com{/foo} and bar.baz.com{/foo, /bar}) that route to the given services; a sketch of this Ingress is shown after this list.
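The ingress-app.yaml file is not reproduced in this gist; reconstructed from the rules shown in the `kubectl get ing` output below, its Ingress section presumably looks roughly like this (extensions/v1beta1 was the Ingress API version at the time):

```yaml
# Sketch of the Ingress in ingress-app.yaml, reconstructed from the
# rules displayed by `kubectl get ing` below.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echomap
spec:
  # Default backend, used when no rule matches.
  backend:
    serviceName: echoheadersdefault
    servicePort: 80
  rules:
  - host: foo.bar.com
    http:
      paths:
      - path: /foo
        backend:
          serviceName: echoheadersx
          servicePort: 80
  - host: bar.baz.com
    http:
      paths:
      - path: /foo
        backend:
          serviceName: echoheadersx
          servicePort: 80
      - path: /bar
        backend:
          serviceName: echoheadersy
          servicePort: 80
```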
$ kubectl create -f ingress-app.yaml
$ kubectl get svc
NAME                 CLUSTER_IP     EXTERNAL_IP   PORT(S)   SELECTOR          AGE
echoheadersdefault   10.0.43.119    nodes         80/TCP    app=echoheaders   16m
echoheadersx         10.0.126.10    nodes         80/TCP    app=echoheaders   16m
echoheadersy         10.0.134.238   nodes         80/TCP    app=echoheaders   16m
kubernetes           10.0.0.1       <none>        443/TCP   <none>            21h

$ kubectl get ing
NAME      RULE          BACKEND                 ADDRESS
echomap   -             echoheadersdefault:80
          foo.bar.com
          /foo          echoheadersx:80
          bar.baz.com
          /bar          echoheadersy:80
          /foo          echoheadersx:80

You can tail the logs of the controller to observe its progress:

$ kubectl logs --follow gcelb-xxa53
I1005 22:11:26.731845       1 instances.go:48] Creating instance group k8-ig-foo
I1005 22:11:34.360689       1 controller.go:152] Created new loadbalancer controller
I1005 22:11:34.360737       1 controller.go:172] Starting loadbalancer controller
I1005 22:11:34.380757       1 controller.go:206] Syncing default/echomap
I1005 22:11:34.380763       1 loadbalancer.go:134] Syncing loadbalancers [default/echomap]
I1005 22:11:34.380810       1 loadbalancer.go:100] Creating l7 default-echomap
I1005 22:11:34.385161       1 utils.go:83] Syncing e2e-test-beeps-minion-ugv1
...

When it's done, it will update the status of the Ingress with the ip of the L7 it created:

$ kubectl get ing
NAME      RULE          BACKEND                 ADDRESS
echomap   -             echoheadersdefault:80   107.178.254.239
          foo.bar.com
          /foo          echoheadersx:80
          bar.baz.com
          /bar          echoheadersy:80
          /foo          echoheadersx:80

Go to your GCE console and confirm that the following resources have been created through the HTTPLoadbalancing panel:

  • A Global Forwarding Rule
  • An UrlMap
  • A TargetHTTPProxy
  • BackendServices (one for each kubernetes nodePort service)
  • An Instance Group (with ports corresponding to the BackendServices)

The HTTPLoadBalancing panel will also show you whether your backends have responded to the health checks; wait till they do. This can take a few minutes. If you see "Health status will display here once configuration is complete.", the L7 is still bootstrapping. Wait till you have "Healthy instances: X". Even though the GCE L7 is driven by our controller, which notices the kubernetes health checks of a pod, we still need to wait for the first GCE L7 health check to complete. Once your backends are up and healthy:

$ curl --resolve foo.bar.com:80:107.178.254.239 http://foo.bar.com/foo
CLIENT VALUES:
client_address=('10.240.29.196', 56401) (10.240.29.196)
command=GET
path=/echoheadersx
real path=/echoheadersx
query=
request_version=HTTP/1.1

SERVER VALUES:
server_version=BaseHTTP/0.6
sys_version=Python/3.4.3
protocol_version=HTTP/1.0

HEADERS RECEIVED:
Accept=*/*
Connection=Keep-Alive
Host=107.178.254.239
User-Agent=curl/7.35.0
Via=1.1 google
X-Forwarded-For=216.239.45.73, 107.178.254.239
X-Forwarded-Proto=http

You can also edit /etc/hosts instead of using --resolve.

Updates

Say you don't want a default backend and you'd like to allow all traffic hitting your loadbalancer at /foo to reach your echoheaders backend service, not just the traffic for foo.bar.com. You can modify the Ingress Spec:

spec:
  rules:
  - http:
      paths:
      - path: /foo
..
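
Spelled out as a full (hypothetical) manifest, the updated Ingress might look like this; whether you keep the bar.baz.com rules is up to you, the point is the rule with no host:

```yaml
# Sketch of the updated Ingress: no default backend, and a rule with no
# host so /foo matches regardless of the Host header.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echomap
spec:
  rules:
  - http:
      paths:
      - path: /foo
        backend:
          serviceName: echoheadersx
          servicePort: 80
```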

and replace the existing Ingress:

$ kubectl replace -f ingress-app.yaml
ingress "echomap" replaced

$ curl http://107.178.254.239/foo
CLIENT VALUES:
client_address=('10.240.143.179', 59546) (10.240.143.179)
command=GET
path=/foo
real path=/foo
...

$ curl http://107.178.254.239/
<pre>
INTRODUCTION
============
This is an nginx webserver for simple loadbalancer testing. It works well
for me but it might not have some of the features you want. If you would
...

A couple of things to note about this particular update:

  • An Ingress without a default backend inherits the backend of the Ingress controller.
  • An IngressRule without a host gets the wildcard. This is controller specific; some loadbalancer controllers do not respect anything but a DNS subdomain as the host. You cannot set the host to a regex.
  • You never want to delete then re-create an Ingress, as it will result in the controller tearing down and recreating the loadbalancer.

A note on resilience

The loadbalancer controller executes a control loop: it uses the kubernetes resources as a spec for the desired state and the GCE cloud resources as the observed state, and drives the observed state to the desired one. This means you can go to the GCE UI and disconnect links in the graph, and the controller will fix them for you. An easy link to break is the url map itself, but you can also disconnect a target proxy from the urlmap, or remove an instance from the instance group (note this is different from deleting the instance; the loadbalancer controller will not recreate it if you do so). Modify one of the url links in the map to point to another service and wait till the controller syncs (this happens as frequently as you tell it to, via the --resync-period flag, which defaults to 30s). Note that the GCE api itself won't allow you to delete resources that have dependencies, but you can break links in ways that black hole traffic, and that's exactly the kind of damage the controller will undo for you. The same goes for the kubernetes side of things: the api server will validate against obviously bad updates, but if you, say, relink an Ingress so it points to the wrong backends, the controller will blindly follow.

Deletion

Most production loadbalancers live as long as the nodes in the cluster and are torn down when the nodes are destroyed. That said, there are plenty of use cases for deleting an Ingress, deleting a loadbalancer controller, or just purging external loadbalancer resources altogether.

Deleting a loadbalancer controller pod will not affect the loadbalancers themselves; this way your backends won't suffer a loss of availability if the scheduler pre-empts your controller pod. Deleting a single loadbalancer is as easy as deleting an Ingress via kubectl:

$ kubectl delete ing echomap

GCE BackendServices are ref-counted and deleted by the controller as you delete Kubernetes Ingresses. If you want to delete everything in the cloud when the loadbalancer controller pod dies, start it with the --quit-on-sigterm flag. When a pod is killed, it's first sent a SIGTERM, followed by a grace period (set to 10 minutes for loadbalancer controllers), followed by a SIGKILL. The controller pod uses this time to delete cloud resources. If there is a failure during this stage, just recreate and kill the pod, or send a GET to its /quit endpoint.

Troubleshooting:

  • Check loadbalancer controller pod logs via kubectl
    A typical sign of trouble is repeated retries in the logs:
I1006 18:58:53.451869       1 loadbalancer.go:268] Forwarding rule k8-fw-default-echomap already exists
I1006 18:58:53.451955       1 backends.go:162] Syncing backends [30301 30284 30301]
I1006 18:58:53.451998       1 backends.go:134] Deleting backend k8-be-30302
E1006 18:58:57.029253       1 utils.go:71] Requeuing default/echomap, err googleapi: Error 400: The backendService resource 'projects/kubernetesdev/global/backendServices/k8-be-30302' is already being used by 'projects/kubernetesdev/global/urlMaps/k8-um-default-echomap'
I1006 18:58:57.029336       1 utils.go:83] Syncing default/echomap
  • Check the GCE console
  • Make sure you only have a single loadbalancer controller running
  • Check if you can access the backend service directly via nodeip:nodeport
  • Make sure the initial GCE health checks have passed
  • If you just want to purge cloud resources, recreate the pod with --quit-on-sigterm

Wishlist:
