@trejo08
Created January 27, 2021 14:17
k8s cluster autoscaler command line options for version v1.18.3
$ docker run k8s.gcr.io/autoscaling/cluster-autoscaler:v1.18.3 /cluster-autoscaler -h
Usage of /cluster-autoscaler:
pflag: help requested
--add-dir-header If true, adds the file directory to the header
--address string The address to expose prometheus metrics. (default ":8085")
--alsologtostderr log to standard error as well as files
--aws-use-static-instance-list Should CA fetch instance types in runtime or use a static list. AWS only
--balance-similar-node-groups Detect similar node groups and balance the number of nodes between them
--cloud-config string The path to the cloud provider configuration file. Empty string for no configuration file.
--cloud-provider string Cloud provider type. Available values: [aws,azure,gce,alicloud,baiducloud,magnum,digitalocean,clusterapi] (default "gce")
--cloud-provider-gce-l7lb-src-cidrs cidrs CIDRs opened in GCE firewall for L7 LB traffic proxy & health checks (default 130.211.0.0/22,35.191.0.0/16)
--cloud-provider-gce-lb-src-cidrs cidrs CIDRs opened in GCE firewall for L4 LB traffic proxy & health checks (default 130.211.0.0/22,209.85.152.0/22,209.85.204.0/22,35.191.0.0/16)
--cluster-name string Autoscaled cluster name, if available
--clusterapi-cloud-config-authoritative Treat the cloud-config flag authoritatively (do not fallback to using kubeconfig flag). ClusterAPI only
--cores-total string Minimum and maximum number of cores in cluster, in the format <min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. (default "0:320000")
--estimator string Type of resource estimator to be used in scale up. Available values: [binpacking] (default "binpacking")
--expander string Type of node group expander to be used in scale up. Available values: [random,most-pods,least-waste,price,priority] (default "random")
--expendable-pods-priority-cutoff int Pods with priority below cutoff will be expendable. They can be killed without any consideration during scale down and they don't cause scale up. Pods with null priority (PodPriority disabled) are non expendable. (default -10)
--gpu-total MultiStringFlag Minimum and maximum number of different GPUs in cluster, in the format <gpu_type>:<min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. Can be passed multiple times. CURRENTLY THIS FLAG ONLY WORKS ON GKE. (default [])
--ignore-daemonsets-utilization Should CA ignore DaemonSet pods when calculating resource utilization for scaling down
--ignore-mirror-pods-utilization Should CA ignore Mirror pods when calculating resource utilization for scaling down
--ignore-taint MultiStringFlag Specifies a taint to ignore in node templates when considering to scale a node group (default [])
--kubeconfig string Path to kubeconfig file with authorization and master location information.
--kubernetes string Kubernetes master location. Leave blank for default
--leader-elect Start a leader election client and gain leadership before executing the main loop. Enable this when running replicated components for high availability. (default true)
--leader-elect-lease-duration duration The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. (default 15s)
--leader-elect-renew-deadline duration The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. (default 10s)
--leader-elect-resource-lock endpoints The type of resource object that is used for locking during leader election. Supported options are `endpoints`, `configmaps`, and `leases`. (default "leases")
--leader-elect-resource-name string The name of resource object that is used for locking during leader election.
--leader-elect-resource-namespace string The namespace of resource object that is used for locking during leader election.
--leader-elect-retry-period duration The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. (default 2s)
--log-backtrace-at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log-dir string If non-empty, write log files in this directory
--log-file string If non-empty, use this log file
--log-file-max-size uint Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
--logtostderr log to standard error instead of files (default true)
--max-autoprovisioned-node-group-count int The maximum number of autoprovisioned groups in the cluster. (default 15)
--max-bulk-soft-taint-count int Maximum number of nodes that can be tainted/untainted PreferNoSchedule at the same time. Set to 0 to turn off such tainting. (default 10)
--max-bulk-soft-taint-time duration Maximum duration of tainting/untainting nodes as PreferNoSchedule at the same time. (default 3s)
--max-empty-bulk-delete int Maximum number of empty nodes that can be deleted at the same time. (default 10)
--max-failing-time duration Maximum time from last recorded successful autoscaler run before automatic restart (default 15m0s)
--max-graceful-termination-sec int Maximum number of seconds CA waits for pod termination when trying to scale down a node. (default 600)
--max-inactivity duration Maximum time from last recorded autoscaler activity before automatic restart (default 10m0s)
--max-node-provision-time duration Maximum time CA waits for node to be provisioned (default 15m0s)
--max-nodes-total int Maximum number of nodes in all node groups. Cluster autoscaler will not grow the cluster beyond this number.
--max-total-unready-percentage float Maximum percentage of unready nodes in the cluster. After this is exceeded, CA halts operations (default 45)
--memory-total string Minimum and maximum number of gigabytes of memory in cluster, in the format <min>:<max>. Cluster autoscaler will not scale the cluster beyond these numbers. (default "0:6400000")
--min-replica-count int Minimum number of replicas that a replica set or replication controller should have to allow their pods to be deleted in scale down
--namespace string Namespace in which cluster-autoscaler run. (default "kube-system")
--new-pod-scale-up-delay duration Pods less than this old will not be considered for scale-up. (default 0s)
--node-autoprovisioning-enabled Should CA autoprovision node groups when needed
--node-deletion-delay-timeout duration Maximum time CA waits for removing delay-deletion.cluster-autoscaler.kubernetes.io/ annotations before deleting the node. (default 2m0s)
--node-group-auto-discovery <name of discoverer>:[<key>[=<value>]] One or more definition(s) of node group auto-discovery. A definition is expressed <name of discoverer>:[<key>[=<value>]]. The `aws` and `gce` cloud providers are currently supported. AWS matches by ASG tags, e.g. `asg:tag=tagKey,anotherTagKey`. GCE matches by IG name prefix, and requires you to specify min and max nodes per IG, e.g. `mig:namePrefix=pfx,min=0,max=10` Can be used multiple times. (default [])
--nodes MultiStringFlag sets min,max size and other configuration data for a node group in a format accepted by cloud provider. Can be used multiple times. Format: <min>:<max>:<other...> (default [])
--ok-total-unready-count int Number of allowed unready nodes, irrespective of max-total-unready-percentage (default 3)
--profiling Is debug/pprof endpoint enabled
--regional Cluster is regional.
--scale-down-candidates-pool-min-count int Minimum number of nodes that are considered as additional non empty candidates for scale down when some candidates from previous iteration are no longer valid. When calculating the pool size for additional candidates we take max(#nodes * scale-down-candidates-pool-ratio, scale-down-candidates-pool-min-count). (default 50)
--scale-down-candidates-pool-ratio float A ratio of nodes that are considered as additional non empty candidates for scale down when some candidates from previous iteration are no longer valid. Lower value means better CA responsiveness but possible slower scale down latency. Higher value can affect CA performance with big clusters (hundreds of nodes). Set to 1.0 to turn this heuristic off - CA will take all nodes as additional candidates. (default 0.1)
--scale-down-delay-after-add duration How long after scale up that scale down evaluation resumes (default 10m0s)
--scale-down-delay-after-delete duration How long after node deletion that scale down evaluation resumes, defaults to scanInterval (default 0s)
--scale-down-delay-after-failure duration How long after scale down failure that scale down evaluation resumes (default 3m0s)
--scale-down-enabled Should CA scale down the cluster (default true)
--scale-down-gpu-utilization-threshold float Sum of gpu requests of all pods running on the node divided by node's allocatable resource, below which a node can be considered for scale down. Utilization calculation only cares about the gpu resource for accelerator nodes. CPU and memory utilization will be ignored. (default 0.5)
--scale-down-non-empty-candidates-count int Maximum number of non empty nodes considered in one iteration as candidates for scale down with drain. Lower value means better CA responsiveness but possible slower scale down latency. Higher value can affect CA performance with big clusters (hundreds of nodes). Set to a non positive value to turn this heuristic off - CA will not limit the number of nodes it considers. (default 30)
--scale-down-unneeded-time duration How long a node should be unneeded before it is eligible for scale down (default 10m0s)
--scale-down-unready-time duration How long an unready node should be unneeded before it is eligible for scale down (default 20m0s)
--scale-down-utilization-threshold float Sum of cpu or memory of all pods running on the node divided by node's corresponding allocatable resource, below which a node can be considered for scale down (default 0.5)
--scale-up-from-zero Should CA scale up when there are 0 ready nodes. (default true)
--scan-interval duration How often cluster is reevaluated for scale up or down (default 10s)
--skip-headers If true, avoid header prefixes in the log messages
--skip-log-headers If true, avoid headers when opening log files
--skip-nodes-with-local-storage If true, cluster autoscaler will never delete nodes with pods with local storage, e.g. EmptyDir or HostPath (default true)
--skip-nodes-with-system-pods If true, cluster autoscaler will never delete nodes with pods from kube-system (except for DaemonSet or mirror pods) (default true)
--stderrthreshold severity logs at or above this threshold go to stderr (default 2)
--unremovable-node-recheck-timeout duration The timeout before we check again a node that couldn't be removed before (default 5m0s)
-v, --v Level number for the log level verbosity
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
--write-status-configmap Should CA write status information to a configmap (default true)
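
As a follow-up to the flag listing above, here is a minimal sketch of how a few of these options combine in a single invocation. It assumes an AWS cluster; the ASG name "my-asg" and the 1:10 size range are placeholder values, not taken from this gist, and in practice the autoscaler usually runs as an in-cluster Deployment (or with --kubeconfig pointing at the cluster) rather than a bare docker run.

# Illustrative only: "my-asg" and the 1:10 size range are placeholders, not values from this gist.
$ docker run k8s.gcr.io/autoscaling/cluster-autoscaler:v1.18.3 /cluster-autoscaler \
    --cloud-provider=aws \
    --nodes=1:10:my-asg \
    --expander=least-waste \
    --balance-similar-node-groups \
    --scale-down-unneeded-time=10m \
    --skip-nodes-with-system-pods=false \
    --logtostderr --v=4

On AWS, --node-group-auto-discovery=asg:tag=<tagKey> (the format described under --node-group-auto-discovery above) can replace explicit --nodes entries by discovering ASGs from their tags.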