GuillaumeDerval/scheduler.rst

## scheduler.rst

      
Notes about the new scheduler for INGInious


Why does INGInious need a (new) scheduler?

There is a problem with the assumption that the OS will handle the load of multiple tasks running concurrently (right now, INGInious immediately launches the tasks it receives, (nearly) without limits), mainly because of memory usage.
For some unknown reason, processes are never (or not often enough) swapped out, leading to OOM in other containers.

Can we resolve the swap problem?

There is an unused option in Docker (MemorySwap) that allows fixing the amount of swap available to a container. But it seems that the default value is infinite, so this won't help.
So it seems to be a failure of the OS to manage lots of processes using lots of memory.

CPU problems

Once the scheduler is done, we will probably restrict the number of running tasks to the number of CPUs/cores.
We cannot do that right now because of the probable "starvation" (in the sense described below) that could occur.
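
As a rough illustration of such a cap, here is a minimal sketch that limits the number of concurrently running tasks to the number of cores. The names `MAX_RUNNING` and `run_with_cpu_slot` are made up for this example and are not part of INGInious; a task is assumed to be any zero-argument callable. The sketch does nothing about starvation, which is exactly why a cap alone is not enough.

```python
import os
import threading

# Hypothetical cap: one running task per CPU core (falls back to 1 if unknown).
MAX_RUNNING = os.cpu_count() or 1
_slots = threading.BoundedSemaphore(MAX_RUNNING)


def run_with_cpu_slot(task):
    """Run `task` (any zero-argument callable) once a CPU slot is free."""
    with _slots:  # blocks while MAX_RUNNING tasks are already running
        return task()
```

Callers that arrive while all slots are taken simply wait, in no guaranteed order, so a long task can hold up short ones.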

What about the pause (freeze) command in Docker?

The freeze command in Docker puts a running container hierarchy inside the freezer control group.
The system will then exclude the container's processes from its own scheduler, which means that no CPU time will be used by the container anymore.
That would be useful if the CPU was a problem, but that's not the case. But when a process is in the freezer cgroup, there should be (but I can't find any indication about that anywhere) a greater chance for this process to be swapped out. This needs to be tested.
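
One possible way to run that test is to compare how much of a process sits in swap before and after its container is frozen. The sketch below reads the `VmSwap` field from `/proc/<pid>/status` on Linux. The helper names are made up for this example, and the experiment around it (pause the container, put the host under memory pressure, read the value again) is an assumption about how the test could be set up, not something that has been done.

```python
import re


def swap_kib(status_text):
    """Return the VmSwap value (in kiB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_swap_kib(pid):
    """Read VmSwap for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return swap_kib(f.read())
```

If the value for a frozen container's processes grows noticeably faster than for running ones under the same memory pressure, that would support the hypothesis.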

What are the available actions for the scheduler?

The scheduler can

Start a task
Kill a task (for restarting it later) (to avoid?)
Verify the status of the task (task is done or not)
Freeze a task
Unfreeze a task

So the scheduler needs to be non-preemptive, and should be aware that once a task is launched, it may (if we don't consider the option of killing/restarting it) never stop before the timeout.
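
The actions listed above can be read as transitions in a small state machine. The sketch below is only one way to model them: the state names, the `finish` event (the task ending by itself or hitting its timeout) and the `apply` function are all invented for this example. Verifying the status of a task amounts to reading its current state.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    FROZEN = "frozen"
    DONE = "done"


# Allowed transitions for each action (all names are hypothetical).
TRANSITIONS = {
    "start": {State.WAITING: State.RUNNING},
    "freeze": {State.RUNNING: State.FROZEN},
    "unfreeze": {State.FROZEN: State.RUNNING},
    # Killing sends the task back to the queue so it can be restarted later.
    "kill": {State.RUNNING: State.WAITING, State.FROZEN: State.WAITING},
    # Not a scheduler action: the task ends by itself or hits its timeout.
    "finish": {State.RUNNING: State.DONE},
}


def apply(state, action):
    """Return the new state, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[action][state]
    except KeyError:
        raise ValueError(f"cannot {action} a task that is {state.value}") from None
```

Apart from `kill`, nothing moves a started task back to the waiting state, which is the non-preemptive property described above: a frozen task still occupies its memory.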

What are the constraints that apply to the scheduler?


CPU constraints
Memory constraints
Must avoid starvation
Need to minimize the total time to execute the tasks (with trade-offs)
Short tasks must have a short waiting time, while long tasks may have very long waiting times (this is another type of starvation). This is clearly the biggest point to optimize.
As said earlier, the scheduler cannot be (completely) preemptive: once a task is started, it cannot be moved out of memory (but it can be frozen).
New tasks can be added at any time.
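
To make these constraints more concrete, here is a minimal sketch of one possible admission policy: shortest expected task first, with an aging term so that long tasks are not starved forever, and with CPU and memory limits checked before a task is started. Everything in it is an assumption made for illustration: the `Pending` fields, the `AGING` weight, and the idea that an expected duration is known in advance (for example, derived from the task's timeout). It is not the policy INGInious uses.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    name: str
    expected_seconds: float  # hypothetical estimate, e.g. the task's timeout
    memory_mb: int           # hypothetical memory limit of the container
    waited_seconds: float = 0.0


AGING = 0.5  # hypothetical weight: how fast waiting offsets a long expected duration


def score(task):
    """Lower is better: short tasks first, but waiting steadily improves the score."""
    return task.expected_seconds - AGING * task.waited_seconds


def pick_next(pending, free_cpus, free_memory_mb):
    """Choose which pending tasks to start now, in start order."""
    chosen = []
    for task in sorted(pending, key=score):
        if free_cpus == 0:
            break
        if task.memory_mb <= free_memory_mb:
            chosen.append(task)
            free_cpus -= 1
            free_memory_mb -= task.memory_mb
    return chosen
```

Since new tasks can arrive at any time, `pick_next` would have to be called again whenever a task is added or a running task finishes. Note that a task that does not fit in the free memory is simply skipped here, so a memory-hungry task could still starve; a real policy would need to reserve resources for it at some point.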


Why not save the state of a container to disk?

The best solution available (CRIU, http://criu.org/Main_Page) is not really compatible with Docker (http://kimh.github.io/blog/en/criu/experiment-to-suspend-and-resume-docker-container-with-criu/, http://criu.org/Docker), so we can't use this kind of technique right now.

Are there open-source schedulers available to get some ideas?

HTCondor seems able to move tasks from one computer to another, which means its scheduler can be (or is) preemptive.
There is plenty of documentation about schedulers used in operating systems, but most of them are preemptive.