@CAFxX
Last active May 31, 2018 23:08
Multi-tenant CPU fairness

Idea

We want to:

  • ensure that CPU usage in one container can't significantly affect CPU availability in a different container, regardless of whether the other container is below or above its CPU shares
  • minimize false-positive CPU overload notifications
  • maximize utilization of unused CPU resources

To this end we would like to find a scheme that progressively gives more freedom in CPU resource consumption to processes that stay below their share of CPU resources (including letting them use more than their allocated shares), while restricting processes that exceed their share, especially if CPU-bound, to their allocation alone.

To avoid performance cliffs, it would likely be better to find a progressive/dynamic solution rather than a step-wise/fixed one.

Design

Period duration heuristic

The key idea is to measure whether a process is CPU bound or not by characterizing typical resource consumption behavior for different types of processes:

  • CPU-bound processes are likely to exhaust their CPU quanta and be preempted by the scheduler.
  • I/O- or event-bound processes are likely not to exhaust their quanta, blocking instead on I/O or on some lock; in general they are unlikely to be preempted. Note, however, that a process that appears CPU-bound may actually be spinning on a lock (e.g. a process with a memory leak and a concurrent collector) or retrying broken I/O calls.

Use avg_atom, nr_voluntary_switches, nr_involuntary_switches (or other entries) from /proc/$PID/sched, possibly augmented with knowledge of the number and type of syscalls made by the process.
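As a sketch, /proc/$PID/sched is a list of `key : value` lines, so the relevant counters can be extracted with a minimal parser like the one below (the parser itself is an illustration, not part of the proposal):

```python
def parse_sched(text):
    """Parse the 'key : value' lines of /proc/$PID/sched into a dict.

    Lines whose value is not numeric (e.g. the header line) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if not sep:
            continue
        try:
            stats[key.strip()] = float(value.strip())
        except ValueError:
            pass  # header or non-numeric entry
    return stats

# Typical usage against a live process:
#   with open('/proc/1234/sched') as f:
#       stats = parse_sched(f.read())
```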

  • Binary solution: define two classes (low_latency, bulk), each with an associated period (e.g. 10ms, 200ms); if Δnr_voluntary_switches > Δnr_involuntary_switches then assign to low_latency, otherwise assign to bulk
  • Average-runtime solution: measure avg_runtime, i.e. how long the process runs on average before switching (both voluntary and involuntary); set the period to k*(avg_runtime) where k=2
  • Percentile-runtime solution: measure runtime_percentile_x, i.e. how long the process runs before switching (both voluntary and involuntary) up to a certain percentile x% (e.g. 90%); set the period to runtime_percentile_x
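The first two solutions can be sketched as follows. The 10ms/200ms periods and k=2 come from the bullets above; the sampling interface (two counter snapshots taken some interval apart) and the function names are assumptions for illustration:

```python
# Hypothetical sketch of the binary and average-runtime heuristics.
# `prev` and `curr` are two samples of /proc/$PID/sched counters.

LOW_LATENCY_PERIOD = 0.010  # 10ms, for processes that mostly yield voluntarily
BULK_PERIOD = 0.200         # 200ms, for processes that mostly get preempted

def binary_period(prev, curr):
    """Binary solution: classify by comparing the deltas of voluntary
    vs. involuntary context switches over the sampling interval."""
    dvol = curr['nr_voluntary_switches'] - prev['nr_voluntary_switches']
    dinv = curr['nr_involuntary_switches'] - prev['nr_involuntary_switches']
    return LOW_LATENCY_PERIOD if dvol > dinv else BULK_PERIOD

def avg_runtime_period(delta_runtime, delta_switches, k=2):
    """Average-runtime solution: period = k * avg_runtime, where
    avg_runtime is the CPU time consumed per context switch
    (voluntary plus involuntary) over the sampling interval."""
    return k * delta_runtime / max(delta_switches, 1)
```

The percentile-runtime solution would additionally need a histogram of per-run runtimes, which the cumulative counters in /proc/$PID/sched alone do not provide.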

Period quota heuristic

The key idea is:

  • as actual average CPU usage by the container tends to 0, the period quota tends to num_cpu * period_duration
  • as actual average CPU usage by the container tends to num_cpu * period_duration, the period quota should tend to the minimum CPU time guaranteed by the container's CPU shares