@nettings
Last active May 12, 2022 18:20

Scheduling notes

last update 2022-05-12 11:00 by JN

Current Shylight deployments on Intel NUC require the Liquorix Kernel on top of an Ubuntu 20.04 LTS system. However, it seems that the shylight application does not make use of realtime scheduling. To verify, execute the following while shylight is running:

ps -eLo rtprio,cls,pid,pri,nice,cmd | grep shylight

You will see that most threads operate under the standard SCHED_OTHER class ("TS"), while some use the SCHED_BATCH class ("B"), which the scheduler assumes to be CPU-intensive and therefore mildly disfavours in wakeup decisions.
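
The same per-thread information can also be read programmatically. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the helper name thread_policies is made up for illustration and is not part of shylight. It lists the scheduling class and rtprio of every thread of a process.

```python
import os

# Map the policy constants exposed by the os module to readable names.
_NAMES = ("SCHED_OTHER", "SCHED_BATCH", "SCHED_IDLE", "SCHED_FIFO", "SCHED_RR")
POLICY_NAMES = {getattr(os, n): n for n in _NAMES if hasattr(os, n)}


def thread_policies(pid):
    """Return {tid: (policy name, rtprio)} for every thread of process pid."""
    result = {}
    for entry in os.listdir(f"/proc/{pid}/task"):
        tid = int(entry)
        try:
            policy = os.sched_getscheduler(tid)
            rtprio = os.sched_getparam(tid).sched_priority
        except OSError:
            continue  # thread exited in the meantime or is not accessible
        # The kernel may OR the reset-on-fork flag into the policy value.
        policy &= ~getattr(os, "SCHED_RESET_ON_FORK", 0)
        result[tid] = (POLICY_NAMES.get(policy, f"unknown({policy})"), rtprio)
    return result


if __name__ == "__main__":
    # Demo on the current process; pass the shylight PID instead.
    for tid, (name, rtprio) in sorted(thread_policies(os.getpid()).items()):
        print(tid, name, rtprio)
```

Threads in the SCHED_OTHER, SCHED_BATCH and SCHED_IDLE classes always report an rtprio of 0.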

Since scheduling jitter has affected the reliability of the EtherCAT thread in the past (which has led to the recommendation to use the Liquorix kernel), it would be worthwhile to try SCHED_FIFO on the EtherCATMaster thread (or whatever is actually handling the low-level EtherCAT traffic in shylight).

Scheduling classes, niceness and priority

The Linux kernel has a number of different scheduling classes.

For standard and low priority tasks, these are SCHED_OTHER (the standard queue), SCHED_BATCH (for non-interactive but potentially CPU-intensive tasks), and SCHED_IDLE (for idle tasks that are not urgent but should use leftover CPU time). Tasks in the OTHER and BATCH queues will obey the nice value. For all other queues, it has no effect.

For timing-critical tasks, we have SCHED_FIFO (first-in, first-out queuing: a thread preempts every thread with a lower rt priority and runs until it blocks, gives up control explicitly, or is itself preempted by a higher-priority thread), SCHED_RR (round-robin: same as FIFO, except that there is a maximum time slice; when it is exhausted, the thread is moved to the end of the queue for its priority), and SCHED_DEADLINE (earliest-deadline-first scheduling based on a per-task runtime, deadline and period; see man 7 sched). Tasks in the FIFO and RR queues are ordered according to the rtprio value.
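
The range of valid rtprio values can be queried from the kernel. This is a small Python sketch (the function name rt_priority_range is hypothetical); on Linux, the realtime classes accept priorities from 1 to 99, while SCHED_OTHER only accepts 0.

```python
import os


def rt_priority_range(policy):
    """Return the (min, max) static priority the kernel accepts for policy."""
    return os.sched_get_priority_min(policy), os.sched_get_priority_max(policy)


if __name__ == "__main__":
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR"):
        print(name, rt_priority_range(getattr(os, name)))
```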

On Linux, each thread has its own scheduling class, niceness and rtprio. Individual threads and their scheduling parameters can be observed with the htop command line tool or the ps invocation mentioned above. ps uses shorthands for the scheduling classes:

| ps code | scheduling class | observes niceness? | observes rtprio? | notes |
|---------|------------------|--------------------|------------------|-------|
| TS      | SCHED_OTHER      | yes                | no               | the default |
| B       | SCHED_BATCH      | yes                | no               | non-interactive, cpu-intensive |
| IDL     | SCHED_IDLE       | no                 | no               | idle tasks |
| FF      | SCHED_FIFO       | no                 | yes              | realtime-critical tasks, can hang the system |
| RR      | SCHED_RR         | no                 | yes              | realtime-critical tasks, will be pre-empted |
| DLN     | SCHED_DEADLINE   | no                 | no               | see man 7 sched |
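
Because the scheduling class is a per-thread property, a thread can request SCHED_FIFO for itself. The sketch below uses Python only to illustrate the underlying sched_setscheduler() call; a real EtherCAT thread would do the equivalent from C or C++. The function names and the default rtprio of 40 are assumptions made for illustration, not values taken from shylight.

```python
import os


def try_make_realtime(rtprio=40):
    """Ask for SCHED_FIFO on the calling thread; return True on success."""
    try:
        # A pid of 0 means "the calling thread".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(rtprio))
    except OSError:
        # Typically PermissionError: no root, no CAP_SYS_NICE and no
        # sufficient rtprio resource limit.
        return False
    return True


def back_to_default():
    """Return the calling thread to the standard SCHED_OTHER class."""
    os.sched_setscheduler(0, os.SCHED_OTHER, os.sched_param(0))


if __name__ == "__main__":
    if try_make_realtime():
        print("running SCHED_FIFO")
        back_to_default()
    else:
        print("realtime scheduling not permitted")
```

Without the required privileges the request fails cleanly, so it is safe to try this as an ordinary user.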

Choosing a suitable rtprio

The kernel uses the same scheduling queues for its own device driver threads. When using ps, kernel threads can be identified by brackets around the "command" name.

When deciding the appropriate priority for a real-time userspace thread, a few things must be taken into account:

  1. Order the relative priorities of interacting userspace threads correctly.

Example: The JACK sound server main thread should have a slightly higher rtprio than client threads. Both should have higher priority than any other userspace rt threads (if any).

  2. Consider the priorities of relevant kernel threads.

With userspace RT scheduling, it is possible to starve even the kernel of CPU time! Most threaded interrupt handlers (kernel threads named "irq/...") run SCHED_FIFO at rtprio 50. In our case, that would be true also for the Ethernet driver's interrupt thread. If we go above that, we could theoretically starve our hardware handler. Not good. So let's start below 50.

  3. Consider whether interrupt handlers should be deprioritized or raised.

Let's say your hardware is so weird that whenever the user wiggles the mouse, you are missing an Ethernet packet deadline. You could then identify the mouse interrupt handler and lower its rtprio, or gently raise the Ethernet handler to, say, 55, to make it win over all other kernel hardware jobs.

Note: The absolute values of niceness and rtprio mean nothing in themselves; they are only used to create an ordered set.
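
To see which priorities the kernel's interrupt threads actually use on a given machine, they can be listed from /proc. This is a rough Python sketch under the assumption that threaded interrupt handlers show up as processes whose name starts with "irq/"; the function names and the fallback value of 49 are hypothetical. suggest_rtprio simply picks a value just below the lowest-priority interrupt thread, following the "start below 50" reasoning above.

```python
import os


def irq_thread_priorities():
    """Return a list of (pid, name, rtprio) for kernel IRQ threads."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm", errors="replace") as f:
                name = f.read().strip()
            if name.startswith("irq/"):
                prio = os.sched_getparam(int(entry)).sched_priority
                found.append((int(entry), name, prio))
        except OSError:
            continue  # process vanished or is not accessible
    return found


def suggest_rtprio(irq_prios, fallback=49):
    """Pick an rtprio just below the lowest realtime IRQ thread priority."""
    rt = [p for p in irq_prios if p > 0]
    if not rt:
        return fallback  # no IRQ threads visible, use the assumed default
    return max(1, min(rt) - 1)


if __name__ == "__main__":
    threads = irq_thread_priorities()
    for pid, name, prio in threads:
        print(pid, name, prio)
    print("suggested rtprio:", suggest_rtprio(p for _, _, p in threads))
```

The suggestion is only a starting point; the relative order of interacting userspace threads still has to be decided by hand.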

Notes on real-time safety

Any SCHED_FIFO thread must be strictly realtime safe, i.e. have absolutely deterministic run-time, lest it bring down the system. It must not

  • allocate memory except from a custom pre-allocated memory pool that is known to be rt-safe
  • make any system calls whose execution time is not bounded
  • do any high-level hardware I/O (that includes GUI stuff)
  • sleep
  • block

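Instead of blocking in the middle of its work, it should finish the job and then give up the CPU, for example by calling sched_yield(). Keep in mind that sched_yield() only hands the CPU to runnable threads of the same rtprio; lower-priority threads only get to run while the realtime thread waits for its next wakeup.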

A good rule of thumb for calling external library functions is: if you don't know the call tree all the way down to the bare metal, it's probably not realtime safe.

If any of the pages accessed by a realtime thread are swapped out to disk, you are in trouble, since reacting to a page fault and getting the page back (from spinning rust in the worst case) will miss deadlines.

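To prevent this, use the mlock() mechanism (mlock() for individual address ranges, mlockall() for the whole process) to pin memory in RAM and prevent it from ever being swapped out. Since this operation is also a potential denial-of-service vector, the user needs the appropriate privileges: either the CAP_IPC_LOCK capability or a sufficiently large memlock resource limit.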
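
As an illustration of the call itself, here is a Python sketch that invokes mlockall() through ctypes. The helper name lock_all_memory is made up, and the two flag values are assumptions: they are the ones from <sys/mman.h> on x86 and ARM Linux, and differ on some other architectures. A C or C++ application would call mlockall() directly.

```python
import ctypes
import os

# Assumed flag values (x86 and ARM Linux); check <sys/mman.h> elsewhere.
MCL_CURRENT = 1  # lock all pages that are currently mapped
MCL_FUTURE = 2   # also lock pages that get mapped later

# Passing None loads the symbols of the running process, including libc.
_libc = ctypes.CDLL(None, use_errno=True)


def lock_all_memory():
    """Pin current and future pages of this process into RAM.

    Returns (True, "") on success or (False, reason) on failure.
    """
    if _libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        return False, os.strerror(ctypes.get_errno())
    return True, ""


if __name__ == "__main__":
    ok, reason = lock_all_memory()
    print("memory locked" if ok else f"mlockall failed: {reason}")
```

The call fails if the process exceeds its memlock limit and lacks CAP_IPC_LOCK, which is the usual outcome for an unprivileged user.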

Scheduling and permissions

By default, only root may obtain SCHED_FIFO or SCHED_RR when creating a thread, since any such thread could effectively starve everyone else, or, in the case of SCHED_FIFO, lock up the machine hard with no way of recovery.

It is possible to grant realtime scheduling and memory locking privileges to selected users or groups using the resource limit (ulimit) mechanism, configured in /etc/security/limits.conf and applied by pam_limits at login. Keep in mind that you are giving away denial-of-service privileges! To allow all members of the audio group to obtain realtime scheduling up to rtprio 49 and to lock memory, add the following to said file:

@audio		-	rtprio		49
@audio		-	memlock		<kbytes>
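
After logging in again, a process can check whether the limits have taken effect. The following Python sketch reads the soft limits of the current process; the function name realtime_limits is hypothetical. Note that getrlimit() reports memlock in bytes, whereas limits.conf takes kilobytes, and that an unlimited value is reported as resource.RLIM_INFINITY.

```python
import resource


def realtime_limits():
    """Return the soft rtprio and memlock limits of this process."""
    rtprio_soft, _ = resource.getrlimit(resource.RLIMIT_RTPRIO)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {"rtprio": rtprio_soft, "memlock_bytes": memlock_soft}


if __name__ == "__main__":
    for key, value in realtime_limits().items():
        shown = "unlimited" if value == resource.RLIM_INFINITY else value
        print(key, shown)
```

An rtprio limit of 0 means the process may not request SCHED_FIFO or SCHED_RR at all.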

Kernel configuration and real-time patches

All realtime scheduling settings are available in the stock kernel, provided it has been configured with CONFIG_PREEMPT (the "Preemptible Kernel (Low-Latency Desktop)" option, called CONFIG_PREEMPT_DESKTOP in older real-time patch sets). That is the case for many distribution kernels. It allows kernel tasks to be pre-empted by userspace tasks where possible.

With the addition of the real-time preemptible kernel patches from https://cdn.kernel.org/pub/linux/kernel/projects/rt/ , additional preemption points are added to the kernel in long code paths. These preemption points are activated by setting CONFIG_PREEMPT_RT (which is not available in stock kernels).

Even without the full RT patches, using SCHED_FIFO has huge benefits in practice for standard audio tasks, and it might solve the EtherCAT issues we have been seeing.

Ubuntu does not run a CONFIG_PREEMPT_DESKTOP-enabled kernel by default, but Ubuntu Studio does.
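
A quick way to guess how a running kernel was configured is to look at the version string reported by uname -v, which usually contains "PREEMPT", "PREEMPT_DYNAMIC" or "PREEMPT_RT" for the respective builds. The Python sketch below is only a heuristic with made-up function and label names; it also assumes that RT-patched kernels expose /sys/kernel/realtime. Checking the kernel's actual config file is more reliable.

```python
import os


def preemption_model(version, realtime_flag_exists=False):
    """Guess the preemption model from a `uname -v` string (heuristic)."""
    if realtime_flag_exists or "PREEMPT_RT" in version:
        return "realtime"
    if "PREEMPT_DYNAMIC" in version:
        return "dynamic"  # actual model is selected at boot time
    if "PREEMPT" in version:
        return "preemptible"
    return "voluntary-or-none"


if __name__ == "__main__":
    print(preemption_model(os.uname().version,
                           os.path.exists("/sys/kernel/realtime")))
```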

@nettings

nettings commented May 12, 2022

A howto on manual IRQ rtprio tuning I wrote a long time ago for the FFADO project. Possibly outdated in detail, but the general concept is still valid:
http://subversion.ffado.org/wiki/IrqPriorities

@nettings

Rui's rtirq may also show some bitrot, but it still contains very useful clues and might even still work usefully as-is: https://github.com/rncbc/rtirq
