
@tommituura
Created April 3, 2020 11:16

Controlling resource usage in Kubernetes

NODE resources

For simplicity, we are looking at three types of resources that need to be controlled:

  • CPU cycles
  • RAM capacity
  • Disk capacity (PersistentVolumes)

There are four main concepts that work together to facilitate resource usage controls in Kubernetes:

  • ResourceQuota
  • LimitRange
  • Limits
  • Requests
  • (and default values for limits and requests)

This is my attempt at explaining how these things work together. Note: I am looking at this from the perspective of a future OpenShift cluster admin, which means that I am (going to be) tasked partly with protecting the different projects / namespaces from each other, too.

  • ResourceQuota states, at the namespace level, how much CPU, RAM or disk the pods/containers may use combined. If nothing else, setting this is a must. In a multi-tenant cluster like OpenShift, project/namespace admins should NOT have the ability to rewrite a project's ResourceQuota. But the story of resource usage limiting does not end there!
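As a sketch, a namespace-level ResourceQuota covering the three resource types above might look like this (the name, namespace and all numbers are made up for illustration):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota        # hypothetical name
  namespace: team-a         # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"           # sum of all container CPU requests in the namespace
    requests.memory: 8Gi        # sum of all container memory requests
    limits.cpu: "8"             # sum of all container CPU limits
    limits.memory: 16Gi         # sum of all container memory limits
    requests.storage: 100Gi     # total capacity requested by PersistentVolumeClaims
```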

Request and Limit values should be attached to every single pod, at the very least for RAM and CPU, in order to allow Kubernetes to keep the system running smoothly.

  • Request: This is the amount of CPU/RAM that Kubernetes guarantees each container in the pod may get. If Kubernetes sees that the cluster does not have the resources to honor this guarantee, the pod will not be scheduled. In other words, this is the lower bound of resources for the containers.
  • Limit: This is the amount of CPU/RAM that Kubernetes may give to each container in the pod. This value is useful in several ways: the pod may claim it as a self-restriction mechanism to protect the cluster from itself (bugs or surprise loads in server software), and the Kubernetes cluster gets usable information about how much resource the program inside the container may actually need in order to be useful, while still being able to restrict huge spikes.
    • A usable strategy might be to set absolutely minimal Request values for CPU / RAM, just enough for the pods to barely run, and use the Limit values to cap container-level resource usage. Maybe?
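The minimal-Request, capped-Limit strategy from the bullet above could be sketched like this in a pod spec (pod name, image and numbers are invented for the example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app             # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0  # hypothetical image
    resources:
      requests:
        cpu: 100m          # guaranteed minimum: a tenth of a core
        memory: 64Mi       # guaranteed minimum RAM
      limits:
        cpu: "1"           # hard cap: throttled above one core
        memory: 256Mi      # hard cap: OOM-killed above this
```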

Lastly, let's look at the LimitRange object, which cluster administrators use to streamline the management of resource usage limitations:

  • LimitRange is applied to a project / namespace and it, too, should not be rewritable by project/namespace admins in a proper multi-tenant cluster. With LimitRange, cluster administration can force minimum & maximum amounts for pod/container-level requests and limits, as well as set sensible defaults so that project/namespace admins and developers can just use the cluster.
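A sketch of such a LimitRange, with per-container min/max bounds and the defaults mentioned above (all names and numbers invented):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits     # hypothetical name
  namespace: team-a          # hypothetical namespace
spec:
  limits:
  - type: Container
    min:                     # no container may request less than this
      cpu: 50m
      memory: 32Mi
    max:                     # no container may have a limit above this
      cpu: "2"
      memory: 1Gi
    default:                 # applied as Limit when a container states none
      cpu: 500m
      memory: 256Mi
    defaultRequest:          # applied as Request when a container states none
      cpu: 100m
      memory: 128Mi
```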

It is up to cluster administration to make sure the ResourceQuota and LimitRange are coherent with each other. For example, do not set default limit and default request values that are farther apart than the allowed maxLimitRequestRatio: that mistake means developers will have to state at least one of those values explicitly, or Kubernetes will refuse to create their pods because the applied default values violate the max ratio. Likewise, setting a per-container maximum Limit higher than the namespace's ResourceQuota is somewhat nonsensical, even if doable as long as the Request falls below the ResourceQuota. And so on.
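To make the maxLimitRequestRatio pitfall concrete, here is a fragment of a misconfigured LimitRange item (values made up):

```yaml
# Inside a LimitRange item of type: Container
default:
  memory: 512Mi          # default limit
defaultRequest:
  memory: 128Mi          # default request
maxLimitRequestRatio:
  memory: "4"            # limit may be at most 4x the request... almost:
# 512Mi / 128Mi = 4, which is fine, but change the ratio to "2" and
# every pod relying on these defaults is rejected at admission, forcing
# developers to override at least one of the two values themselves.
```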

WHY IS RESOURCEQUOTA INSUFFICIENT?

If the cluster starts to experience CPU / memory shortage, Kubernetes needs a way to prioritise the pods. Pods that do not have requests / limits attached to them are, from the scheduler's point of view, unreasonable. "Unreasonable" meaning that there is no way for the scheduler to make any sensible decision about what to put where, and such pods land in the lowest (BestEffort) quality-of-service class. Therefore those pods will always be the first ones to get evicted.

With CPU, the kubelet can simply throttle the containers and the processes will run slower. With RAM this is not possible, for obvious reasons, so evictions (or OOM kills) happen instead.
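The eviction ordering follows the quality-of-service class Kubernetes derives from the requests and limits. A sketch of the two extremes (pod names and image are invented):

```yaml
# BestEffort: no requests or limits at all -> first to be evicted
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0  # hypothetical image
---
# Guaranteed: requests == limits for every container -> evicted last
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0  # hypothetical image
    resources:
      requests: {cpu: 500m, memory: 256Mi}
      limits:   {cpu: 500m, memory: 256Mi}
```

Pods in between (requests set, limits higher or absent) fall into the Burstable class.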

The full story and details about how Kubernetes selects pods for eviction in case of resource shortage will have to wait for a later time.

OBJECT-COUNT resources

Then there are limits on how many ConfigMaps, Secrets, Pods, and other Kubernetes objects a namespace can hold. There is no Kubernetes-side hard limit on how big a ConfigMap or Secret can be, but the etcd backend datastore only accepts objects up to roughly 1MB, and other parts of the plumbing in the apiserver may impose limits of their own. Word on the street seems to be that around 1MB is the current upper limit for a single ConfigMap/Secret/other object. Some more information here: https://stackoverflow.com/a/53015758/1889463
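Object counts are capped with the same ResourceQuota object used for CPU/RAM/disk, just with count-style fields. A sketch (names and counts invented):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: object-counts        # hypothetical name
  namespace: team-a          # hypothetical namespace
spec:
  hard:
    pods: "50"                     # max pods in the namespace
    configmaps: "20"               # max ConfigMaps
    secrets: "20"                  # max Secrets
    persistentvolumeclaims: "10"   # max PVCs
    services: "10"                 # max Services
```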

We are postponing / skipping this side of resource limiting for now.
