

Understanding low latency

Note: this is an old section of the old install guide that I'm keeping around as the contents are still somewhat valid.

What we are trying to optimize is the latency of the whole system. What is 'latency' in this context? Roughly speaking, it is the time that elapses between the moment a hardware device issues a hardware interrupt and the moment the process that deals with it actually runs.

Hardware interrupts

Let's assume we are talking about your sound card and that your favorite player is playing a soundfile. The soundcard has several internal buffers that have to be periodically filled by your program to keep playback free of interruptions. When one of those buffers is emptied the card issues a hardware interrupt. The interrupt is signalled through a pin in one of the sound card chips that ultimately links to a pin in the processor inside your computer. The interrupt is supposed to redirect the flow of instruction execution in the processor to an interrupt handler routine that is programmed to deal with whatever the interrupt is signalling (in this case, refill a free buffer in the sound card with new samples).

Here is the first roadblock that can add latency to the whole process. Interrupts have to be enabled before the interrupt line can actually affect the flow of program execution. But the Linux kernel needs to disable interrupts sometimes, when it is in internal critical sections of code that cannot be interrupted. While the kernel is good at keeping this time short, sometimes those internal sections of code that need to be protected from interrupts are long and can delay the interrupt for quite a while. One of the (many) potential culprits is an unoptimized EIDE hard disk, because by default the driver is set to very conservative settings that can keep the interrupts disabled for a long time. That makes it impossible to achieve low latencies. This is one of the reasons why we need to 'tune' EIDE disks.

So let's keep going. Assuming the interrupts are eventually enabled, the system will jump to the interrupt routine, which is normally very short. Interrupts are disabled while inside the interrupt handler (which can obviously reenable them if that is possible), so the driver designers want to keep the code that executes in the handler as short as possible. This is not the code that will be sending samples back to the sound card! One of the actions that this code will take is to wake up a process that will deal with the rest of the task to be done. This is where the second potential roadblock to short latencies occurs.

The scheduler and the low latency patch

The processor inside your computer is constantly switching between many tasks. Just do a "ps auxw" to see what is currently running in your computer. Each of those entries represents a separate 'task' that is sharing time slots of processor time. At any given time most of those programs are sleeping, waiting for an event that will wake them up. One of them is your playback program. Most of the time it is doing nothing, just waiting for another buffer to be available to be filled with samples. Actually your program (or maybe just a thread within it, if it is multi-threaded) is blocked at an alsa library call, which in turn is blocked in a write or read to the actual device in the sound driver, which in turn is sleeping, waiting for an interrupt from the soundcard to wake it up (I think this is how it works, wizards out there correct me if I'm missing something).
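To see which scheduling policy each of those tasks is using, something along these lines could complement the "ps auxw" listing. This is only a minimal sketch, assuming a Linux system with /proc mounted and a Python build that exposes the os.sched_* calls; the helper names policy_of and list_policies are made up for this illustration.

```python
import os

# Map the policy constants that this Python build exposes to readable names.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
                 "SCHED_BATCH", "SCHED_IDLE")
    if hasattr(os, name)
}


def policy_of(pid):
    """Return the scheduling policy name of a process (0 means this one)."""
    policy = os.sched_getscheduler(pid)
    return POLICY_NAMES.get(policy, "unknown ({})".format(policy))


def list_policies():
    """Yield (pid, policy name) for every process visible in /proc."""
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            yield int(entry), policy_of(int(entry))
        except OSError:
            # The process exited in the meantime, or we may not inspect it.
            continue
```

Calling policy_of(0) reports the policy of the calling process, and looping over list_policies() prints one line per task. On an ordinary desktop almost everything will show up as SCHED_OTHER.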

So the hardware interrupt handler will wake up the task that ultimately will lead back to unblocking your application so that it can supply the next bufferful of samples to the sound card. At the time this happens the processor is probably busy with some other task, and some time will elapse until your task transitions from being awake to actually running and doing useful work.

The kernel itself arbitrates this. Each time its scheduler is run it checks for tasks that are ready to run (have been 'awakened'), finds the one that has the highest priority and gives it the processor (this is not the whole explanation, see the sched_setscheduler man page for all the details). For this to happen the scheduler has to run. And it is not running all the time. Getting the scheduler to run often enough is the target of the low latency patch. Sometimes the kernel needs to do lengthy tasks that are not broken up with scheduler runs. If the scheduler does not run, your task does not get a chance at grabbing the processor. If that time is long enough, all buffers inside the soundcard empty and a dropout occurs. The wizards that write the low latency patch try to identify those critical sections in the kernel empirically, and insert scheduler calls to break them up safely into shorter pieces, so that other tasks get a chance to run. So having the low latency patch installed and enabled can help a lot. Obviously Linux is not a hard real time operating system and there is no way to guarantee that your task will be awakened in time, but in the real world it is good enough (a real time OS like QNX would be far more appropriate than Linux, Windows or MacOS for real time work).
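To get a feeling for how long "long enough" is, here is a rough back-of-the-envelope calculation. It is a simplified model and not a description of any particular driver: I am assuming the card interrupts each time one period (buffer) has been played, and that all the other periods are still full at that moment. The function names and the example numbers are only illustrative.

```python
def period_ms(frames_per_period, sample_rate):
    """Duration of one buffer period in milliseconds."""
    return 1000.0 * frames_per_period / sample_rate


def scheduling_slack_ms(frames_per_period, periods, sample_rate):
    """Rough time available after an interrupt before a dropout.

    Simplified model (an assumption of this sketch): one period has just
    been emptied and the remaining (periods - 1) are still full, so the
    task must get the processor before those have been played too.
    """
    return (periods - 1) * period_ms(frames_per_period, sample_rate)
```

With, say, 64 frames per period at 48000 Hz, one period lasts about 1.3 ms. With only two periods that is all the time the task has to get the processor and refill the empty one, while a third period roughly doubles that margin at the price of more latency.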

Scheduling policies

So now we have the interrupts disabled for the shortest possible time, and the scheduler is running often enough so that the Linux kernel itself does not introduce big latency hits every once in a while. But that is not enough. If your playback program is not running with high enough priority it could happen that the Linux scheduler gives the processor to some other task, and your playback program is stuck, awake but powerless, waiting for the next scheduler run to happen (and a chance to get the processor). Tasks have dynamic priorities assigned to them (see the nice and renice utilities) and you could make your task a high priority one and that would make things better. But even that is not enough. Priorities for this scheduling policy are dynamic and change over time. The more times the scheduler skips a task that is ready to run, the more the scheduler will increase its priority, so that eventually it will run when the priority has gotten high enough. All tasks can be interrupted at any time by another higher priority task, and even if your task has the highest dynamic priority, it will eventually lose to another process, most probably at the worst time (can you hear the click coming?). So what do we do now?
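The nice value that the nice and renice utilities manipulate can also be read and changed from inside a program. The following is a small sketch, assuming a Unix system where Python provides os.getpriority and os.setpriority; the function names and the target value of -10 are arbitrary choices for the example.

```python
import os


def current_niceness():
    """Return the nice value of this process (-20 most favoured, 19 least)."""
    return os.getpriority(os.PRIO_PROCESS, 0)


def try_set_niceness(target=-10):
    """Try to set the nice value of this process; return True on success.

    Asking for a lower (more favourable) nice value normally needs
    privileges, so a refusal is reported instead of raised.
    """
    try:
        os.setpriority(os.PRIO_PROCESS, 0, target)
    except PermissionError:
        return False
    return True
```

As the text says, this only nudges the dynamic priority. It improves the odds but does not keep another task from getting the processor at the wrong moment.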

The scheduler has three different ways of scheduling tasks, the so-called scheduling policies. The normal scheduling policy (SCHED_OTHER) works more or less in the way I have described so far. The scheduler selects the next highest priority task to run and gives it a go, but the scheduler can run again at any time (for example, it normally runs every 10 msecs no matter what, triggered by the timer tick) and your task can be interrupted and put temporarily back in the ready to run list. But there are two additional scheduling policies designed for real-time programs. Those are the First In, First Out (SCHED_FIFO) and Round Robin (SCHED_RR) scheduling policies. Very low latency audio applications definitely have to be run with one of those scheduling policies, otherwise it is impossible to attain reliable 'under load' low latencies. The audio task has to have the highest priority no matter what. But with that power comes a responsibility.
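As an illustration of what asking for one of the real-time policies looks like from a program, here is a minimal sketch that uses Python's wrappers around the sched_setscheduler call mentioned earlier. It assumes a Linux system; the helper names clamp_priority and request_fifo are invented for this example, and the request will normally be refused unless the process has the necessary privileges.

```python
import os


def clamp_priority(priority, lowest, highest):
    """Force a requested static priority into the range the policy allows."""
    return max(lowest, min(highest, priority))


def request_fifo(priority=None):
    """Try to switch this process to SCHED_FIFO; return True on success."""
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    if priority is None:
        priority = lowest  # be conservative unless told otherwise
    priority = clamp_priority(priority, lowest, highest)
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically PermissionError: real-time scheduling is a privilege.
        return False
    return True
```

A program would call request_fifo() once at startup and fall back to normal scheduling (with a warning to the user) if it returns False. Defaulting to the lowest real-time priority is a deliberately cautious choice for the sketch, since any SCHED_FIFO priority already beats every SCHED_OTHER task.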

A task with SCHED_FIFO policy has a static priority that will not be altered by the scheduler and is higher than all other normal SCHED_OTHER tasks. Furthermore, SCHED_FIFO tasks have to voluntarily yield the processor back to the scheduler, either through a blocking system call or by calling sched_yield. In other words, that task cannot be interrupted by any of the normal tasks that are running in the Linux environment (except, of course, by a task running with the same SCHED_FIFO policy and a higher static priority!). So, if your program has a bug, gets into an infinite loop and does NOT yield back to the scheduler, the whole computer will freeze. It will not crash in the absolute sense of the word. It is still running quite nicely, but your task is using all the processor time and not yielding back to the scheduler so that no other task gets a chance to run, not even the kernel (and its scheduler). Ever. You have to power-cycle the whole thing or press the reset button if you have one (I'm amazed at the optimism of the hardware designers that do not include reset buttons in their computers). Low latency real-time priority apps have to be very well designed. Some spawn an additional process that runs periodically with higher SCHED_FIFO priority than the main task, so that they can check on a stuck process and kill it, a watchdog approach that saves you from a complete freeze.
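The watchdog idea can be sketched roughly like this. It is only an outline under several assumptions of mine: the audio task regularly writes a time.monotonic() timestamp into some shared object (for example a multiprocessing.Value('d')), the watchdog runs in a separate process that has been given a higher SCHED_FIFO priority than the audio task, and the names is_stuck and watchdog as well as the timeout values are invented for the example.

```python
import os
import signal
import time


def is_stuck(last_beat, now, timeout):
    """True when the monitored task has not reported for longer than timeout."""
    return (now - last_beat) > timeout


def watchdog(pid, heartbeat, timeout=2.0, interval=0.5):
    """Kill process `pid` once its heartbeat goes stale.

    `heartbeat` is any object whose `.value` attribute holds the
    time.monotonic() timestamp last written by the monitored task.
    """
    while True:
        # Sleeping blocks the watchdog, which hands the processor back
        # to the lower priority audio task between checks.
        time.sleep(interval)
        if is_stuck(heartbeat.value, time.monotonic(), timeout):
            try:
                os.kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the task already exited on its own
            return
```

The important design point is that the watchdog spends almost all of its time asleep, so it costs next to nothing, but because of its higher static priority it is guaranteed to preempt a runaway audio task when its timer expires.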

Phew, that was long...

Summary

Summarizing, you need three things: tuned drivers that do not disable interrupts for long, low latency patches in the kernel so that the scheduler runs often enough, and an application that runs with the SCHED_FIFO scheduling policy so that it gets the best chance of grabbing the processor when it needs it.

When everything is in place things work incredibly well. The system can be running an audio task with no dropouts and a few milliseconds of latency while the computer is being loaded with disk accesses, screen refreshes and whatnot. The mouse gets jerky and windows update very slowly, but there is not a dropout to be heard.

I wonder if anybody got this far :-)




© Copyright 2001...2011 Fernando Lopez-Lezcano, CCRMA, Stanford University.
All rights reserved.