@GuillaumeDerval
Last active August 29, 2015 14:09
Note for the new scheduler of INGInious

Notes about the new scheduler for INGInious

Why does INGInious need a (new) scheduler?

The assumption that the OS can handle the load of multiple tasks running concurrently does not hold (right now, INGInious launches the tasks it receives immediately, (nearly) without limits). The main problem is memory usage.

For some unknown reason, processes are never swapped out, or not swapped out enough, which leads to out-of-memory (OOM) errors in other containers.

Can we resolve the swap problem?

There is a currently unused option in Docker (MemorySwap) that limits the swap available to a container. But its default value seems to be unlimited, so setting it won't help.
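To make the option concrete, here is a minimal sketch of the memory fragment of a `HostConfig` as accepted by the Docker Remote API when a container is created. `Memory` and `MemorySwap` are the API's field names; the helper `memory_host_config` and its arguments are illustrative assumptions, not INGInious code. In the API, `MemorySwap` is the total of memory plus swap, so the swap share is the difference between the two values.

```python
MIB = 1024 * 1024


def memory_host_config(memory_mib, swap_mib):
    """Build the memory part of a Docker HostConfig (hypothetical helper).

    memory_mib: RAM limit of the container, in MiB.
    swap_mib:   swap the container may use on top of that, in MiB.
    """
    if memory_mib <= 0 or swap_mib < 0:
        raise ValueError("memory must be positive and swap non-negative")
    memory_bytes = memory_mib * MIB
    return {
        "Memory": memory_bytes,
        # MemorySwap is memory + swap, not the swap size alone.
        "MemorySwap": memory_bytes + swap_mib * MIB,
    }
```

For example, `memory_host_config(100, 50)` describes a container with 100 MiB of RAM and 50 MiB of swap.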

So this seems to be a failure of the OS to manage many processes that each use a lot of memory.

CPU problems

Once the scheduler is done, we will probably restrict the number of running tasks to the number of CPUs/cores. We cannot do that right now because of the "starvation" (in the sense described below) that would probably occur.
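A minimal sketch of such a limit, assuming tasks are identified by an opaque id and served in arrival order. The class name `CoreLimitedQueue` and its methods are hypothetical, chosen only for illustration.

```python
import os
from collections import deque


class CoreLimitedQueue:
    """Run at most `max_running` tasks at once (hypothetical sketch)."""

    def __init__(self, max_running=None):
        # Default to the number of cores reported by the OS.
        self.max_running = max_running or os.cpu_count() or 1
        self.waiting = deque()
        self.running = set()

    def submit(self, task_id):
        """Queue a task; return the ids of the tasks that may start now."""
        self.waiting.append(task_id)
        return self._fill()

    def finished(self, task_id):
        """Record that a task ended; return the ids that may start now."""
        self.running.discard(task_id)
        return self._fill()

    def _fill(self):
        started = []
        while self.waiting and len(self.running) < self.max_running:
            task_id = self.waiting.popleft()
            self.running.add(task_id)
            started.append(task_id)
        return started
```

With plain arrival order like this, a short task queued behind long ones has to wait for a free slot, which is why the limit alone is not enough.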

What about the pause (freeze) command in Docker?

The pause command in Docker puts a running container's process hierarchy into the Freezer control group. The system then excludes the container's processes from its own scheduler, which means that the container no longer uses any CPU.

That would be useful if the CPU was the problem, but it is not. However, when a process is in the Freezer cgroup, it should have a greater chance of being swapped out (though I can't find any confirmation of that anywhere). This needs to be tested.
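One way to run that test is to pause a container and then watch the `VmSwap` field that Linux reports in `/proc/<pid>/status` for each of its processes. The sketch below only reads and parses that field; the helper names are assumptions, and finding the PIDs that belong to a container is left out.

```python
def vm_swap_kib(status_text):
    """Return the VmSwap value in KiB from the text of /proc/<pid>/status.

    Returns None when the field is absent (for example for kernel threads).
    """
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:       128 kB".
            return int(line.split()[1])
    return None


def read_vm_swap_kib(pid):
    """Read VmSwap for a live process (Linux only)."""
    with open("/proc/{}/status".format(pid)) as status_file:
        return vm_swap_kib(status_file.read())
```

Sampling this value before and after pausing a container, while other containers put pressure on memory, would show whether frozen processes are actually swapped out more readily.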

What are the available actions for the scheduler?

The scheduler can

  • Start a task
  • Kill a task (to restart it later) (to be avoided?)
  • Check the status of a task (done or not)
  • Freeze a task
  • Unfreeze a task

So the scheduler needs to be non-preemptive, and it should be aware that once a task is launched, it may (if we rule out killing and restarting it) not stop before its timeout.
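The actions listed above can be read as transitions of a small per-task state machine. The following is a sketch under assumptions: the state names, the `finish` event (the task ending on its own or hitting its timeout) and the `apply` function are hypothetical. Checking the status of a task is a read and changes nothing, so it has no transition.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    FROZEN = "frozen"
    DONE = "done"


# (action, current state) -> next state. Anything else is refused.
TRANSITIONS = {
    ("start", State.WAITING): State.RUNNING,
    ("freeze", State.RUNNING): State.FROZEN,
    ("unfreeze", State.FROZEN): State.RUNNING,
    # Killing loses all progress: the task goes back to the queue.
    ("kill", State.RUNNING): State.WAITING,
    ("kill", State.FROZEN): State.WAITING,
    # Not a scheduler action: the task ended or reached its timeout.
    ("finish", State.RUNNING): State.DONE,
}


def apply(action, state):
    """Return the state that follows `action`, or raise ValueError."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError("cannot {} a task that is {}".format(action, state.value))
```

Note that no transition takes a started task out of memory while keeping its progress: freezing only stops its CPU use, which is what makes the scheduler non-preemptive in practice.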

What are the constraints that apply to the scheduler?

  • CPU constraints
  • Memory constraints
  • Must avoid starvation
  • Needs to minimize the total time to execute the tasks (with trade-offs)
  • Short tasks must have a short waiting time, while long tasks may have very long waiting times (this is another type of starvation). This is clearly the most important point to optimize.
  • As said earlier, the scheduler cannot be (completely) preemptive: once a task is started, it cannot be moved out of memory (but it can be frozen)
  • New tasks can be added at any time.
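One possible way to combine these constraints is to pick, among the tasks that fit in the free memory, the one with the shortest expected run time, and to give every task a credit for the time it has already waited (aging) so that long tasks are not starved by a stream of short ones. This is only a sketch: it assumes that each task comes with an expected duration and a memory requirement, which these notes do not establish, and the names `Pending`, `priority` and `pick_next` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    name: str
    memory_mib: int          # assumed to be known in advance
    expected_seconds: float  # assumed estimate, e.g. the task's timeout
    submitted_at: float      # time of submission, in seconds


def priority(task, now, aging=1.0):
    """Lower is better: expected run time minus a credit for time waited."""
    return task.expected_seconds - aging * (now - task.submitted_at)


def pick_next(pending, free_memory_mib, free_cores, now, aging=1.0):
    """Return the task to start now, or None if nothing can start."""
    if free_cores < 1:
        return None
    # Memory constraint: only consider tasks that fit right now.
    fitting = [t for t in pending if t.memory_mib <= free_memory_mib]
    if not fitting:
        return None
    return min(fitting, key=lambda t: priority(t, now, aging))
```

Because new tasks can arrive at any time, `pick_next` would be called again whenever a task is submitted or finishes. A task that needs a lot of memory can still be passed over repeatedly by this rule, so a real scheduler would need an extra safeguard for that case.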

Why not save the state of a container to disk?

The best solution available (CRIU, http://criu.org/Main_Page) is not really compatible with Docker (http://kimh.github.io/blog/en/criu/experiment-to-suspend-and-resume-docker-container-with-criu/, http://criu.org/Docker), so we can't use this kind of technique right now.

Are there open-source schedulers available to get some ideas from?

HTCondor seems to be able to move tasks from one computer to another, which means its scheduler can be (or is) preemptive. There is a lot of documentation about the schedulers used in operating systems, but most of them are preemptive.
