@keymon
Created February 5, 2021 11:19
Troubleshooting weird nodePort routing in KOPS

Context

We are trying to expose some pods through a nodeport to allow inter-k8s communication. However, the nodeport is not routed properly to the target pod when the request originates from one of the pods themselves.

We run KOPS v1.11.9

Test scenario

Using this test scenario:

---
apiVersion: v1
kind: Service
metadata:
  name: echo-server
  namespace: hector-test
spec:
  externalTrafficPolicy: Cluster
  ports:
  - name: http
    nodePort: 30180
    port: 30180
    protocol: TCP
    targetPort: 80
  selector:
    app: echo-server
  sessionAffinity: None
  type: NodePort
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server-local
  namespace: hector-test
spec:
  externalTrafficPolicy: Local
  ports:
  - name: http
    nodePort: 30181
    port: 30181
    protocol: TCP
    targetPort: 80
  selector:
    app: echo-server
  sessionAffinity: None
  type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: echo-server
  namespace: hector-test
spec:
  serviceName: echo-server
  podManagementPolicy: OrderedReady
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
      - image: ealen/echo-server
        imagePullPolicy: Always
        name: echo-server
        ports:
        - containerPort: 80

  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate

There are 2 nodeports:

  • 30180 uses externalTrafficPolicy: Cluster
  • 30181 uses externalTrafficPolicy: Local
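
Since externalTrafficPolicy: Local only forwards to pods running on the node that received the traffic, it helps to know which node each echo-server pod landed on before reading the results. A minimal sketch, assuming kubectl is pointed at the cluster under test; the function name show_placement is made up for illustration:

```shell
# show_placement NAMESPACE SELECTOR (hypothetical helper, not part of the gist)
# Lists the matching pods with the node each one is scheduled on,
# followed by the services in the namespace and their port mappings.
show_placement() {
  kubectl -n "$1" get pods -l "$2" -o wide
  kubectl -n "$1" get svc
}

# Usage (not run here):
#   show_placement hector-test app=echo-server
```

If both replicas sit on the same node, "hits only one pod" cannot be observed for the Local service, so the test is only meaningful when the pods are spread over different nodes.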

Observed behaviour

We run this script to see the responses:

# IP of the node to hit (placeholder value)
TARGET_NODE=1.2.3.4

# Port 30180 with Cluster: count how often each pod answers
for i in $(seq 100); do curl -qs "${TARGET_NODE}:30180" | jq -r .environment.HOSTNAME; done | sort | uniq -c

# Port 30181 with Local: count how often each pod answers
for i in $(seq 100); do curl -qs "${TARGET_NODE}:30181" | jq -r .environment.HOSTNAME; done | sort | uniq -c

  • From any pod inside KOPS that is NOT one of these echo-server pods (or from any other node):

    • 30180 Cluster => Roundrobins both pods
    • 30181 Local => Hits only one Pod
  • From one of the pods (on KOPS v1.11.9)

    • 30180 Cluster => Roundrobins both pods
    • 30181 Local => Roundrobins both pods !!!??! This is unexpected
  • On a more recent k8s, e.g. EKS v1.16.15, from one of the pods

    • 30180 Cluster => Roundrobins both pods
    • 30181 Local => Hits only one Pod
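
To compare these cases without eyeballing the counts, the tally step can be wrapped in a small function that also prints a verdict. This is only a sketch: the name summarize_backends and the verdict wording are assumptions for illustration, and it expects one pod hostname per line on stdin, as produced by the jq filter used in the script.

```shell
# summarize_backends (hypothetical helper, not part of the gist)
# Reads one pod hostname per line on stdin, prints the per-pod hit
# counts, then a final line saying how many distinct pods answered.
summarize_backends() {
  sort | uniq -c | awk '
    { print }
    END {
      if (NR == 0)      print "verdict: no responses"
      else if (NR == 1) print "verdict: single pod"
      else              print "verdict: " NR " pods (load balanced)"
    }'
}

# Usage (not run here):
#   for i in $(seq 100); do
#     curl -qs "${TARGET_NODE}:30181" | jq -r .environment.HOSTNAME
#   done | summarize_backends
```

With this, the expected result for port 30181 is "single pod" and the unexpected one is a verdict reporting 2 pods.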

Conclusion

  • A nodeport with externalTrafficPolicy: Cluster always round-robins across all the pods of the service
  • KOPS v1.11.9 has some bug, so externalTrafficPolicy: Local round-robins across both pods when hit from one of the pods associated with the service.