This year, Bootlin engineer Maxime Chevallier attended the Real-Time Summit, which took place after ELCE in Edinburgh. In a similar fashion to the Linux Media Summit, this was a 1-day conference dedicated to Real-Time topics in the Linux ecosystem, more specifically the Linux Kernel.
The future of Preempt-RT
Real-Time (or should we say, deterministic) behavior in the Linux kernel has been pursued for a long time, the most famous effort being the Preempt-RT patch. As Steven Rostedt announced during his talk at ELCE 2018, the Preempt-RT patch is close to being fully merged in mainline Linux, we can expect to see this happen in 2019.
Some of the maintainers of the Preempt-RT patch were present at the Real-Time summit, including Thomas Gleixner who lead the discussion throughout the day.
This was the occasion to discuss the remaining points to be addressed for Preempt-RT to make it into mainline Linux :
- Printk : As Steven Rostedt explained at ELCE 2017, printk is not very real-time friendly. The main issue was worked around, but John Ogness presented his current work of fully redesigning printk’s behaviour.
- Thomas Gleixner talked about the current state of softirq handling, which is also a critical point for determinism. They work by “stealing” some irq context time, falling back to ksoftirqd when necessary. This is particularly problematic for networking drivers that heavily rely on softirq.
- Peter Zijlstra exposed the different scheduler related issues that needs to be addressed, focusing on SCHED_DEADLINE.
Modeling and analyzing the kernel behavior
All the talks weren’t about the Preempt-RT match merging effort. Daniel Bristot de Oliveira presented his ongoing academic work on modeling the Linux task model. The idea here is to build a formal model that doesn’t take shortcuts or idealize the way tasks are handled in the kernel, so that this can be used as a basis for academic research on topics such as scheduling.
One of the main arguments is that there’s a gap in terms of language and methodology used between kernel developers and the academic world. Daniel explained how he managed to build a huge state-machine representing the task model, and how he uses it now to verify that tasks behave how they should by running trace events in the state machine.
This talk sparked a lot if interesting discussions, for example Peter Zijlstra suggested to compile the state machine into eBPF code and run it live in the kernel.
Julia Lawall was present in the room, and improvised a talk inspired by Daniel’s presentation. She presented DSAC, a static analysis tool dedicated to finding Sleeping in Atomic Context bugs. Julia is involved in the development and use of the coccinelle tool, and explained that it is quickly limited when trying to find that categories of bugs, where sleeping calls can be deeply nested in a call stack protected by spinlocks. Using LLVM, DSAC can analyze complex scenarios with multiple level of nesting and indirect calls to detect SAC bugs. After analyzing the v4.17 kernel sources for only a few hours, the tool was able to detect more than 1000 bugs, 220 of which were confirmed.
The overall technical level of the different talks was high, leading to passionate discussions and suggestions on every topic that was brought during the day.