
@xvzf
Last active January 5, 2022 11:52

Which component are you using?:

cluster-autoscaler

What version of the component are you using?:

Component version: 1.21.2

What k8s version are you using (kubectl version)?:

1.21.2-eks

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:41:42Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-06eac09", GitCommit:"5f6d83fe4cb7febb5f4f4e39b3b2b64ebbbe3e97", GitTreeState:"clean", BuildDate:"2021-09-13T14:20:15Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?:

We're running Cluster API on EKS with a MachineDeployment and want to scale it using cluster-autoscaler. A minimal set of manifests is included in the reproduction section below.

We're seeing the strange behaviour that nodes are removed completely from the cluster (it scales back to 1 node) and then cluster-autoscaler kicks in again. We first suspected our GitOps reconciliation, but we can reproduce the issue on a fresh setup with everything else disabled.

We also see lots of

I0105 11:42:15.900121       1 request.go:597] Waited for 194.315819ms due to client-side throttling, not priority and fairness, request: GET:https://10.100.0.1:443/apis/cluster.x-k8s.io/v1beta1/namespaces/flux-system/machinedeployments/<redacted>/scale
...

in our logs, repeated 5-10 times.
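To quantify how often this happens, a quick check like the following can be run over the saved logs (a sketch: the inlined sample file just reproduces the log line quoted above, and the deployment name in the comment matches the manifests below):

```shell
# Count client-side throttling waits in a saved cluster-autoscaler log.
# The heredoc inlines the sample line quoted above; against a real cluster
# you would instead capture the log first, e.g.:
#   kubectl -n flux-system logs deploy/gh-demo-ca > /tmp/ca.log
cat <<'EOF' > /tmp/ca.log
I0105 11:42:15.900121       1 request.go:597] Waited for 194.315819ms due to client-side throttling, not priority and fairness, request: GET:https://10.100.0.1:443/apis/cluster.x-k8s.io/v1beta1/namespaces/flux-system/machinedeployments/redacted/scale
EOF
grep -c 'client-side throttling' /tmp/ca.log
```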

We continued investigating by enabling audit logging on the Kubernetes API server, and identified several get and update operations per second (2-6 calls per second!) issued by the cluster-autoscaler service account, which explains the client-side rate limiting.
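For reference, the per-verb tally over the audit log can be sketched roughly like this (a simplification: the two inlined events are made-up stand-ins shaped like real audit entries; in practice you would point the pipeline at the API server's actual audit log):

```shell
# Tally API verbs issued by the cluster-autoscaler service account.
# The two events below are fabricated stand-ins for real audit log entries.
cat <<'EOF' > /tmp/audit.jsonl
{"user":{"username":"system:serviceaccount:flux-system:gh-demo-ca"},"verb":"get","objectRef":{"resource":"machinedeployments","subresource":"scale"}}
{"user":{"username":"system:serviceaccount:flux-system:gh-demo-ca"},"verb":"update","objectRef":{"resource":"machinedeployments","subresource":"scale"}}
EOF
grep 'system:serviceaccount:flux-system:gh-demo-ca' /tmp/audit.jsonl \
  | grep -o '"verb":"[a-z]*"' \
  | sort | uniq -c
```

On the sample data this yields one "get" and one "update" event; on a real audit log the counts expose the call rate per verb.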

We tried tuning the reconciliation intervals and similar settings, but nothing seems to help and we're running out of ideas. Is this a known bug in cluster-autoscaler?

What did you expect to happen?:

Autoscaling works and nodes are not replaced every 5-15 minutes.

What happened instead?:

Heavy load on the API server despite a small cluster; nodes get replaced completely every 5-15 minutes.

How to reproduce it (as minimally and precisely as possible):

This is a minimal set of YAML manifests (requires Cluster API to be set up with the AWS provider) to replicate the issue.

Manifests used to replicate
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gh-demo-ca
  namespace: flux-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gh-demo-ca
rules:
- apiGroups:
  - cluster.x-k8s.io
  resources:
  - machinedeployments
  - machinedeployments/scale
  - machines
  - machinesets
  verbs:
  - get
  - list
  - update
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gh-demo-ca
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gh-demo-ca
subjects:
- kind: ServiceAccount
  name: gh-demo-ca
  namespace: flux-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gh-demo-ca
  namespace: flux-system
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: gh-demo-ca
  template:
    metadata:
      labels:
        name: gh-demo-ca
    spec:
      containers:
      - args:
        - --v=9
        - --stderrthreshold=info
        - --cloud-provider=clusterapi
        - --expander=least-waste
        - --kubeconfig=/kubeconf/value
        - --clusterapi-cloud-config-authoritative
        - --skip-nodes-with-local-storage=false
        - --leader-elect=true
        - --leader-elect-lease-duration=30s
        - --leader-elect-renew-deadline=20s
        - --leader-elect-retry-period=2s
        - --leader-elect-resource-lock=leases
        - --scan-interval=30s
        - --regional=true
        - --node-group-auto-discovery=clusterapi:namespace=flux-system,clusterName=gh-demo
        command:
        - /cluster-autoscaler
        image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.22.2
        imagePullPolicy: IfNotPresent
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 250m
            memory: 150Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /kubeconf
          name: gh-demo-kubeconfig
      serviceAccountName: gh-demo-ca
      volumes:
      - name: gh-demo-kubeconfig
        secret:
          defaultMode: 256
          secretName: gh-demo-kubeconfig
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
  name: gh-demo-md-0
  namespace: flux-system
spec:
  template:
    spec:
      iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
      instanceType: t3a.medium
      sshKeyName: containous
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: AWSManagedControlPlane
metadata:
  name: gh-demo-cp
  namespace: flux-system
spec:
  disableVPCCNI: false
  iamAuthenticatorConfig:
    mapUsers: [] # redacted
  region: us-central-1
  sshKeyName: containous
  version: v1.21.5
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: gh-demo
  namespace: flux-system
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: gh-demo-cp
  infrastructureRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: AWSManagedControlPlane
    name: gh-demo-cp
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: EKSConfigTemplate
metadata:
  name: gh-demo
  namespace: flux-system
spec:
  template:
    spec:
      containerRuntime: containerd
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  annotations:
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "3"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
  name: gh-demo-md-0
  namespace: flux-system
spec:
  clusterName: gh-demo
  selector:
    matchLabels: {}
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: EKSConfigTemplate
          name: gh-demo
      clusterName: gh-demo
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: gh-demo-md-0
      version: v1.21.5

Anything else we need to know?:
