Proof of concept on how to implement an intra-cluster router within Kubernetes.

Overview

Imagine you are an API Management company and your business depends on your ability to be involved in the request/response lifecycle for HTTP-based API traffic. Also imagine that you have a Kubernetes cluster that runs both your company's applications and some client applications. This means that to manage all necessary API traffic, you need to be involved in the request/response lifecycle for targets running within Kubernetes, both for requests originating outside the cluster and for some (if not all) requests originating within it. To continue this conversation, let's establish some terminology:

  • Inter-Cluster: An external request is made for an API that maps to a resource running within Kubernetes
  • Intra-Cluster: An internal request is made for an API that maps to a resource running within Kubernetes

The question at hand is this: how do you, as a Kubernetes cluster owner, get involved in the request/response lifecycle for all of the necessary API traffic described above?

Inter-Cluster Communication

When it comes to Inter-Cluster traffic handling, Kubernetes does not provide a complete ingress implementation itself. While Kubernetes does provide you with the primitive for storing the routing rules, called Ingress Resources, it does not ship with an Ingress Controller that listens at the edge of the cluster and routes traffic based on the routing rules. While this might seem like an oversight on Kubernetes' part, this is actually a good thing for a number of reasons:

  1. You can implement ingress (rule storage, Ingress Controller implementation, ...) however you deem fit. This means if you don't agree with Kubernetes' approach, you can design your own.
  2. If you own the Ingress Controller, you have the necessary touch point to be involved in the request/response lifecycle.

Note: There are a number of example Ingress Controller implementations located here: https://github.com/kubernetes/contrib/tree/master/ingress/controllers (These can be used as-is or as a guide for implementing your own Ingress Controller)
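For concreteness, here is a minimal sketch of an Ingress Resource storing such routing rules (the hostname, path and backend Service name are hypothetical):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
    # Route http://api.example.com/orders to the "orders" Service on port 80
    - host: api.example.com
      http:
        paths:
          - path: /orders
            backend:
              serviceName: orders
              servicePort: 80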

So if you own the Ingress Controller implementation, you can change the direct Client -> Ingress Controller -> Pod traffic flow to Client -> Ingress Controller -> API Management -> Pod. Problem solved.

Note: Replace "API Management" with whatever your business needs are.

Intra-Cluster Communication

When you are within a Kubernetes cluster, communication happens directly. You are provided with the IP address of a Pod/Service you depend on (or you retrieve it using the Kubernetes API), or you can use DNS (assuming you have deployed the optional, but strongly suggested, DNS cluster add-on). Unfortunately, Intra-Cluster communication has no central equivalent of the Ingress Controller. Pods can communicate directly with other Pods, and Pods can communicate directly with Services, but there is no Ingress Controller equivalent between them.

Note: When using Services, there is a middleman called kube-proxy involved, but it is purely for load balancing. Moreover, kube-proxy is not in the request/response lifecycle at all when Services are not used.
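For reference, here is a sketch of the typical direct approach that the proposal will change: a Service whose Label Selector points straight at the application's Pods, so traffic flows PodA -> PodB with nothing in between (all names and ports here are hypothetical):

apiVersion: v1
kind: Service
metadata:
  name: my-application-service
spec:
  ports:
    # Clients connect on port 80; traffic goes straight to the Pods on port 3000
    - port: 80
      targetPort: 3000
  selector:
    # Selects the application's Pods directly, leaving no routing hook in between
    name: my-application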

This means that, out of the box, there is no way to be involved in the request/response lifecycle for Intra-Cluster communication. But fret not: below is a proposal for how this can be done that not only solves the problem but also reuses the ingress concept, which keeps things simple. Let's get to the proposal.

Proposal

While working on an Ingress Controller implementation (https://github.com/30x/k8s-pods-ingress), we realized that processing Inter-Cluster requests is almost identical to processing Intra-Cluster requests: you have routing rules based on hostname/path combinations and a Controller that uses these rules to route to Pods/Services. The main difference is that while Inter-Cluster requests will be for some hostname, like www.github.com, Intra-Cluster requests would be for an IP, a Pod name or a Service name. These nuances do not change the fact that you could repurpose an Ingress Controller to handle both Inter-Cluster and Intra-Cluster communication routing. (This does not mean you must deploy one Controller that handles both types of traffic; it just means the same code base could be written to serve both the Inter-Cluster and Intra-Cluster use cases.)

Below are the details of the proposal. For each of the important pieces, there is likely more than one way to do this. The hope is that this will be a launching point to discuss the viability of supporting something like this natively within Kubernetes, and maybe even coming up with the best solution for doing this with Kubernetes as-is.

To simplify the problem being solved, we need to take the typical PodA -> PodB communication and turn it into PodA -> Intra-Cluster Router -> PodB.

Considerations

Before we go into the proposal details, we should get one thing straight: this proposal is being made with the hope that the application author is not impacted. Here are the design considerations:

  • There should be no Kubernetes-specific code in their application, at least not related to routing, unless that is a part of their application
  • This implementation should not dictate what Kubernetes constructs can/cannot be used
  • This implementation should not require the application author to do weird things (Example: Requiring the application to make requests like curl -v -H "Host: SERVICE_NAME" http://INGRESS/path instead of curl -v http://SERVICE_NAME/path)

Implementation

The Controller

Much like the Ingress Controller, we need some Controller that will process traffic based on hostname/path combinations and route based on the known routing rules. The specifics of how you deploy your Controller are not really important unless your Controller has some mode-specific configuration. Other than that, you can deploy your Controller as a Replication Controller, a Replica Set, a Pod, etc.

Note: For proper isolation, you might deploy your Controller so that it is not exposed to the outside world.

While the actual specifics of how you deploy your Controller do not matter, the Controller will dictate how and where the routing rules are stored. Once the Controller is deployed, we will create a Service for the router so that we can reference it later.

Routing Rules

So now that we have a Controller deployed that can handle traffic, we need to look at the options for storing the routing rules for Intra-Cluster traffic routing. As mentioned above, the Controller dictates how and where the routing rules are stored. But since there is no native Kubernetes object for storing these rules, we have a few options:

  • A custom Kubernetes object (see the sketch below)
  • Overload the Ingress Resource object, using labels to dictate Inter-Cluster vs. Intra-Cluster routing (not suggested)
  • Use annotations/labels for storing these rules (this is the approach used in the example later in this document)
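As a sketch of the first option: at the time of writing, a custom object could be registered using a ThirdPartyResource. The resource name and description below are hypothetical, and the actual schema of the rules would be dictated by your Controller:

apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  # The name encodes the kind (route-rule) plus a hypothetical API group suffix
  name: route-rule.example.com
description: "Routing rules consumed by the Intra-Cluster Router"
versions:
  - name: v1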

Wiring

Up to this point, nothing special has been suggested: while Kubernetes does not solve this problem domain natively, nothing proposed so far requires anything Kubernetes does not already provide. What is missing is the wiring, so that application authors can request some hostname/path and have that traffic automatically routed through our Intra-Cluster Router.

To achieve this, we need some way to make a Pod/Service name point to our Intra-Cluster Controller. One way to do this would be to use proxy Services. The idea is that you create a Service that points to the Intra-Cluster Router instead of to your Pods, using the Label Selector that identifies the Pods running the Intra-Cluster Router. Then, when your Service is resolved, the traffic goes to the Intra-Cluster Router, which in turn routes the traffic to the appropriate Pod(s) based on the routing rules. (A full example of such a proxy Service appears later in this document.)

This is not ideal, and it does not solve direct Pod <-> Pod communication. But since that is an anti-pattern of sorts, maybe that does not matter; for Services, this should work.

Conclusion

At the end of the day, what makes this work is a convention: instead of an application creating a Service that resolves to its Pods, the Service resolves to the Intra-Cluster Router Pods. It just so happens that something similar has come up recently, called "Service Aliases": kubernetes/kubernetes#13748


Example

Below are example Kubernetes deployment files for each of the moving pieces described above. This example will be built using the k8s-pods-ingress as the Intra-Cluster Router.

Intra-Cluster Router

Note: As mentioned above, you can deploy the Intra-Cluster Router using whatever Kubernetes construct fits your needs. To remove any ambiguity, below is an example that deploys both an Inter-Cluster router (Ingress Controller) and an Intra-Cluster router. This is important because it shows how you can describe both internal and external routing in the same deployment.

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: routers
  labels:
    name: routers
spec:
  template:
    metadata:
      labels:
        # This label must match the selectors of the router Services below
        name: routers
    spec:
      containers:
      - image: whitlockjc/k8s-pods-ingress:v0
        imagePullPolicy: Always
        name: inter-cluster-router
        ports:
          - containerPort: 80
            hostPort: 80
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          # Use the configuration to use the public/private paradigm (Inter-Cluster version)
          - name: API_KEY_SECRET_LOCATION
            value: routing:public-api-key
          - name: HOSTS_ANNOTATION
            value: publicHosts
          - name: PATHS_ANNOTATION
            value: publicPaths
      - image: whitlockjc/k8s-pods-ingress:v0
        imagePullPolicy: Always
        name: intra-cluster-router
        ports:
          - containerPort: 81
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          # Use the configuration to use the public/private paradigm (Intra-Cluster version)
          - name: API_KEY_SECRET_LOCATION
            value: routing:private-api-key
          - name: HOSTS_ANNOTATION
            value: privateHosts
          - name: PATHS_ANNOTATION
            value: privatePaths
          # Since we cannot have two containers listening on the same port, use a different port for the private router
          - name: PORT
            value: "81"

Intra-Cluster Service

apiVersion: v1
kind: Service
metadata:
  name: intra-cluster-router-service
  labels:
    name: intra-cluster-router-service
spec:
  ports:
    # Expose the Service on port 80, but target the Intra-Cluster router
    # container, which listens on port 81 (see the PORT env var in the DaemonSet)
    - port: 80
      targetPort: 81
  selector:
    name: routers

Your Application

apiVersion: v1
kind: ReplicationController
metadata:
  name: my-application
  labels:
    name: my-application
spec:
  replicas: 1
  selector:
    name: my-application
  template:
    metadata:
      labels:
        name: my-application
        routable: "true"
      annotations:
        # Expose this application to the Inter-Cluster router so that http://test.apigee.com/nodejs routes here
        publicHosts: "test.apigee.com"
        publicPaths: "3000:/nodejs"
        # Expose this application to the Intra-Cluster router so that http://my-application-service.my-namespace/nodejs routes here
        # (the host matches the name of the proxy Service defined below, since that is the Host header the router will see)
        privateHosts: "my-application-service.my-namespace"
        privatePaths: "3000:/nodejs"
    spec:
      containers:
      - name: nodejs-k8s-env
        image: whitlockjc/nodejs-k8s-env:v0
        env:
          - name: PORT
            value: "3000"
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        ports:
          - containerPort: 3000

Your Application Service (The Proxy Service)

apiVersion: v1
kind: Service
metadata:
  name: my-application-service
  labels:
    name: my-application-service
spec:
  ports:
    # Target the Intra-Cluster router container, which listens on port 81
    - port: 80
      targetPort: 81
  selector:
    # This is basically a proxy Service, so instead of using a label selector
    # that points to your Pods, use one that points to the Intra-Cluster
    # Router Pod(s).
    name: routers

Your Other Application

So now that you've got your application deployed in a way that allows for Intra-Cluster communication processing, let's deploy another application, called my-other-application, that consumes it.

apiVersion: v1
kind: ReplicationController
metadata:
  name: my-other-application
  labels:
    name: my-other-application
spec:
  replicas: 1
  selector:
    name: my-other-application
  template:
    metadata:
      labels:
        name: my-other-application
    spec:
      containers:
      - name: some-application
        image: whitlockjc/some-application

At this point, if you have deployed all of the files above, my-other-application should be able to make an HTTP request to my-application-service.my-namespace, which will be routed through the Intra-Cluster router and end up at the Pod corresponding to your application. Here is the flow: my-other-application -> my-application-service -> Intra-Cluster Router -> my-application

@whitlockjc (author) commented:

I now realize this approach is somewhat naive: it works fine within the confines of a single namespace, but once you want to cross the namespace boundary, you really should use a clusterIP for the intra-cluster router Service, and selector-less Services for the alias/proxy Services, with a manually created Endpoints object for each alias/proxy Service pointing to the clusterIP of the intra-cluster router.
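A minimal sketch of that cross-namespace variant, assuming the intra-cluster router Service was assigned a clusterIP of 10.0.0.42 (a placeholder value):

apiVersion: v1
kind: Service
metadata:
  name: my-application-service
spec:
  # No selector: Kubernetes will not manage Endpoints for this Service
  ports:
    - port: 80
---
apiVersion: v1
kind: Endpoints
metadata:
  # Must match the Service name so Kubernetes associates the two objects
  name: my-application-service
subsets:
  - addresses:
      # Placeholder for the clusterIP of intra-cluster-router-service
      - ip: 10.0.0.42
    ports:
      - port: 80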
