This year, Bootlin missed the Embedded Linux Conference North America which took place late August in San Diego, US. It was the first time in many years that Bootlin was completely absent from an Embedded Linux Conference.
But the coming Embedded Linux Conference Europe is going to be different in that respect: Bootlin will once again have a strong presence at this event, which in 2019 takes place in Bootlin’s home country, France, from October 28 to October 30. And this year, ELCE is not only in France, but more precisely in Lyon, the city where one of the 3 Bootlin offices is located, so for some of our engineers it will be a very local conference!
Flash subsystems status update, from Miquèl Raynal and Richard Weinberger. Miquèl is a Bootlin engineer, maintainer of the NAND flash subsystem in Linux, and co-maintainer of the MTD subsystem. He will co-present with the other MTD co-maintainer Richard Weinberger an update on the MTD subsystem, its recent changes and future work.
Every year, the X.Org community organizes the X.Org Developers Conference, the main conference to discuss graphics support in Linux. Despite the name, the conference is no longer restricted to X.Org topics, but also covers Wayland, Mesa3D and many other topics.
The 2019 edition will take place on October 2-4 in Montréal, Canada, and the schedule of this event is already available.
Bootlin engineer Paul Kocialkowski will participate in this conference. Paul is Bootlin’s display and graphics expert: he is one of the developers of the Allwinner VPU support in Linux, has made several contributions to the Allwinner DRM driver, and has worked on automated testing of the Raspberry Pi graphics controller. Participating in this conference allows us to stay up-to-date with the latest developments in the Linux graphics community.
If you’re attending the conference, do not hesitate to get in touch with Paul!
Over the past few years, Kernel Recipes has become a well-known conference, with an interesting line-up of speakers and an audience limited to 100-150 attendees, which gives this event a particular atmosphere. Bootlin engineers have regularly participated and given several talks at Kernel Recipes or Embedded Recipes in previous editions (2013, 2016, 2017, 2018).
This year, Bootlin engineer Grégory Clement will participate in the 3 days of Kernel Recipes in Paris, on September 25-27. Do not hesitate to get in touch with Grégory during the event, to discuss Linux kernel development, embedded Linux, career or business opportunities with Bootlin.
The next edition of the Linux Plumbers conference will take place from September 9 to September 11 in Lisbon, Portugal. A number of engineers from Bootlin will participate in Linux Plumbers, to attend the Networking Summit track and many of the other micro-conferences organized as part of this event.
SiFive is a semiconductor company that produces chips based on the RISC-V architecture. On May 15th, they organized a Technical Symposium in Grenoble, and we took the opportunity to attend, as the agenda looked interesting.
It was especially nice having Krste Asanovic present many of the topics, wearing different hats (RISC-V Foundation Chairman of the Board and SiFive Co-Founder and Chief Architect). The RISC-V architecture and its history and use cases were presented. One of the main benefits of having a brand new ISA (instruction set architecture), Asanovic said, is that it doesn’t have to handle legacy instructions and compatibility. Moreover, RISC-V is a frozen ISA: the base instructions are frozen, and optional extensions which have been approved are also frozen. Finally, the ISA is open and anybody can implement a CPU core. During the presentation, the RISC-V ISA was (obviously) favorably compared to competing ISAs, mainly ARM.
Another interesting topic was the presentation of SiFive’s business model. They want anyone, including small companies, to be able to design an SoC fitting their particular product, instead of having to choose from a set of more general-purpose SoCs. This can be done by using an existing SiFive RISC-V core or by customizing one. SiFive then offers a library of IPs that can be added to the SoC, and third-party IPs are available through their Designshare program. They handle NDAs, contracts and licensing, and will collect non-recurring engineering costs and royalties once the SoC is mass produced, but not during the prototyping phase. They first provide virtualized chips and then sample chips. For the core, they also provide RTL that can run on FPGAs. For mass production, SiFive partnered with TSMC and their customers can benefit from their process (down to 7nm).
The most relevant topic for us was the software ecosystem. There is a clear will to get code upstream, and this is already the case for GCC, binutils, newlib, gdb, glibc and qemu. Clang/LLVM support is coming up. Regarding the Linux kernel port, it still requires some work: the core architecture support is there, but there are no device drivers or device tree support yet. There is however a fully working vendor tree. FreeBSD seems to be in the same state.
Most of the remaining time was focused on the design and customization tool available here.
SiFive also sponsored Linus Sebastian (from Linus Tech Tips) for a video.
To conclude, it was a very interesting day. At Bootlin, we are delighted to see architecture designers and silicon vendors actively pushing software support upstream and we are looking forward to working on RISC-V platforms.
Real-Time (or should we say, deterministic) behavior in the Linux kernel has been pursued for a long time, the most famous effort being the Preempt-RT patch. As Steven Rostedt announced during his talk at ELCE 2018, the Preempt-RT patch is close to being fully merged in mainline Linux, and we can expect to see this happen in 2019.
Some of the maintainers of the Preempt-RT patch were present at the Real-Time summit, including Thomas Gleixner who led the discussion throughout the day.
This was the occasion to discuss the remaining points to be addressed for Preempt-RT to make it into mainline Linux:
Printk: As Steven Rostedt explained at ELCE 2017, printk is not very real-time friendly. The main issue was worked around, but John Ogness presented his current work on fully redesigning printk’s behaviour.
Thomas Gleixner talked about the current state of softirq handling, which is also a critical point for determinism. They work by “stealing” some irq context time, falling back to ksoftirqd when necessary. This is particularly problematic for networking drivers that heavily rely on softirq.
Peter Zijlstra presented the different scheduler-related issues that need to be addressed, focusing on SCHED_DEADLINE.
Modeling and analyzing the kernel behavior
Not all the talks were about the Preempt-RT patch merging effort. Daniel Bristot de Oliveira presented his ongoing academic work on modeling the Linux task model. The idea here is to build a formal model that doesn’t take shortcuts or idealize the way tasks are handled in the kernel, so that this can be used as a basis for academic research on topics such as scheduling.
One of the main arguments is that there’s a gap in terms of language and methodology used between kernel developers and the academic world. Daniel explained how he managed to build a huge state-machine representing the task model, and how he uses it now to verify that tasks behave how they should by running trace events in the state machine.
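To illustrate the idea of replaying trace events through a state machine, here is a minimal Python sketch. The states, event names and transitions are invented for this example and are far simpler than Daniel's actual model; only the principle (an event that has no transition from the current state reveals unexpected behavior) is taken from the talk.

```python
# Toy task model. States and event names are hypothetical,
# they are NOT taken from Daniel's real state machine.
TRANSITIONS = {
    ("runnable", "sched_switch_in"): "running",
    ("running", "sched_switch_out"): "runnable",
    ("running", "block"): "sleeping",
    ("sleeping", "wakeup"): "runnable",
}


def check_trace(events, start="runnable"):
    """Replay events through the model.

    Returns (True, None) if every event is allowed, otherwise
    (False, index) where index is the first unexpected event.
    """
    state = start
    for index, event in enumerate(events):
        next_state = TRANSITIONS.get((state, event))
        if next_state is None:
            return False, index
        state = next_state
    return True, None
```

For example, `check_trace(["sched_switch_in", "wakeup"])` flags the second event, since in this toy model a running task cannot be woken up.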
This talk sparked a lot of interesting discussions. For example, Peter Zijlstra suggested compiling the state machine into eBPF code and running it live in the kernel.
Julia Lawall was present in the room, and improvised a talk inspired by Daniel’s presentation. She presented DSAC, a static analysis tool dedicated to finding Sleeping in Atomic Context bugs. Julia is involved in the development and use of the coccinelle tool, and explained that it quickly reaches its limits when trying to find that category of bugs, where sleeping calls can be deeply nested in a call stack protected by spinlocks. Using LLVM, DSAC can analyze complex scenarios with multiple levels of nesting and indirect calls to detect SAC bugs. After analyzing the v4.17 kernel sources for only a few hours, the tool was able to detect more than 1000 bugs, 220 of which were confirmed.
The overall technical level of the different talks was high, leading to passionate discussions and suggestions on every topic that was brought up during the day.
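The kind of deeply nested problem described above can be pictured as a reachability question on a call graph. The Python sketch below is only a toy illustration of that idea: it is not how DSAC is implemented (DSAC works on LLVM), and the function names and call graph are made up.

```python
# Hypothetical call graph (caller -> callees) for code that runs
# while a spinlock is held. All names are invented.
CALL_GRAPH = {
    "driver_irq_path": ["update_stats", "refill_buffers"],
    "refill_buffers": ["alloc_buffer"],
    "alloc_buffer": ["may_sleep_alloc"],
    "update_stats": [],
}
# Functions assumed to be able to sleep.
SLEEPING = {"may_sleep_alloc"}


def sleeping_paths(entry, graph, sleeping):
    """Return every call chain from entry that reaches a sleeping function."""
    found = []
    stack = [[entry]]
    while stack:
        path = stack.pop()
        current = path[-1]
        if current in sleeping:
            found.append(path)
            continue
        for callee in graph.get(current, []):
            if callee not in path:  # do not loop forever on recursion
                stack.append(path + [callee])
    return found
```

Here the sleeping call is three levels below the entry point, which is the sort of nesting that makes such bugs hard to spot by reading the code.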
The Linux Plumbers Conference (LPC) was held a few weeks ago in Vancouver, BC. As always there were several tracks where contributors gave a presentation of on-going or future work, and discussed it with the audience, on specific topics such as thermal, containers, real time, device tree and many more. For the first time at LPC a 2-day networking track took place. As we work on a diversity of networking projects at Bootlin we decided to attend.
XDP has been the hot topic in the network subsystem at conferences for the last couple of years, and this conference was no exception. We saw a handful of talks and discussions about the on-going work and support of XDP within the kernel. XDP provides a programmable network data path (using eBPF) in the Linux kernel to process bare metal packets at the lowest point in the network stack. Packets are processed directly in the drivers’ Rx queues, before any allocation happens (such as socket buffers). Facebook is one well known heavy user of this technology (every packet toward Facebook is processed by XDP) and its engineers gave feedback about how they use XDP and the issues they faced. Other projects and companies are currently evaluating and starting to use XDP as well: we also saw presentations about XDP/eBPF in Open vSwitch, DPDK or kTLS.
While XDP/eBPF was featured in most of the discussions, other interesting topics were brought up. Andrew Lunn gave a presentation about the current need to go beyond 1G copper PHYs for many Linux enabled embedded devices. This was very interesting for us as we used and worked on the technologies used within the Linux kernel to address this, such as Phylink and the SFP bus (we used those when enabling 10G interfaces in the Marvell MacchiatoBin board).
Another presentation caught our attention as the topic was related to what we do at Bootlin. Jesse Brandeburg from Intel talked about the networking hardware offloads and their APIs. He gave a brief history of the offloads supported by NICs and then showed some issues with the current APIs, where some use cases or behaviors are not clearly defined and sometimes overlap. This is a feeling we share as we experienced it while implementing some of those hardware networking offloads. Jesse’s idea was to open a discussion to come up with better solutions within the next few years, as NIC offloading continues to grow.
The Linux Plumbers Conference was very pleasant and well organized. We had the chance to attend the networking track, where lots of great cutting-edge topics were discussed, as well as other interesting tracks.
We’d like to thank the conference and track organizers: we had a great time! Videos, slides and papers are now available on the official website or on YouTube.
This year’s edition of the Linux Media Summit happened a month ago, in Edinburgh, right after the Embedded Linux Conference. Since we were already at ELCE, and since we’ve been more and more involved in the media community thanks to our work on the Allwinner CSI driver and more importantly the Cedrus driver, it was natural for us to attend.
The media summit is usually a meeting to discuss the hot topics, so the whole day was a mix and match of various status updates and discussions on the future needs and developments around the Video4Linux2 framework.
Most of the discussion was about how to improve the contributor’s experience and the maintenance process. The DRM subsystem was used as an example, since the number of patches is of the same order of magnitude, and a number of v4l2 contributors are also contributing to DRM drivers. Part of the improvement of both the maintenance and contribution experience will also come through some CI work, so there were a lot of discussions on how to improve the already existing tools (such as v4l2-compliance) but also on how to set up some automatic tooling to run those tests as early as possible.
A good part of the day was also spent on dealing with the current developments, such as the Request API we’ve used in the Cedrus driver, and how to integrate that API into popular multimedia frameworks like gstreamer or ffmpeg. It looks like our libva implementation was well received, so it will probably be made standard and hosted on linuxtv.org in the near future. Other developments discussed were fault tolerant v4l2, in order to deal with video pipelines where one or several components might not work anymore, and storing the v4l2 controls state in a persistent way.
It was overall a very productive day, and it’s always nice to meet people you interact with over mailing lists and IRC on a regular basis. If you want more information, you can read the extensive report.
Next week-end, a local free and open-source software conference called Capitole du Libre will take place in Toulouse, France, where Bootlin has one of its offices. Bootlin will participate in this event in several ways:
Bootlin engineer Maxime Chevallier will give a talk about Networking under Linux, in which he will give an introduction to the Linux kernel networking stack.
We encourage free software developers and users from the south west of France to join this event, which has been organized for several years, and provides a very nice selection of talks and tutorials. And of course, this conference is entirely free, and no registration is required.
The Embedded Linux Conference Europe edition 2018 took place a few weeks ago in Edinburgh, Scotland, and no fewer than 9 engineers from Bootlin attended the conference. While our previous blog post shared the videos and slides of our talks, tutorials and demos, in this blog post we would like to highlight a selection of talks that Bootlin engineers found interesting. We asked each of the 9 engineers who attended the event to pick one talk they liked, and make a small write-up about it. Of course, many other talks were interesting and what makes a talk interesting is very subjective!
Getting Your Patches in Mainline Linux: What Not To Do (and a Few Things You Could Try Instead), by Marc Zyngier
Talk selected by Maxime Ripard
Marc gave a talk on a subject that is often debated, and still confusing to newcomers: how to contribute. He first started by presenting the various actors involved in a contribution: a contributor, a maintainer and a reviewer. He also took the time to explain the various objectives that everyone has which is something that is often overlooked by the other parties and the conferences on this subject. He then went on to explain and document the good practices that can be used in order to contribute to most subsystems. This was overall a great overview, and we definitely recommend it to people willing to start contributing.
Real Time is Coming to Linux; What Does that Mean to You?, by Steven Rostedt
Talk selected by Michael Opdenacker
In this talk about PREEMPT_RT, the speaker, who’s a long time contributor to this feature, approached the subject from a new angle, taking for granted that PREEMPT_RT is in mainline Linux. That’s not quite right yet, but this is possible before the next Embedded Linux Conference, in August next year. One proof that this is on the verge of being true is that its authors no longer call it a patch set, but just PREEMPT_RT. Rostedt also added that Linux can now be called a Deterministic Operating System (aka DOS!).
So, Rostedt first explains what PREEMPT_RT is about and how it addresses the challenges of users who are determined to be deterministic (that’s my pun here, not Steven’s).
Doing this, Steven recalled the “priority inversion” issue, which is best known through the fact that it happened on Mars on the Pathfinder robot. A high priority and critical system process got starved by a lower priority one because an even lower priority process was holding the lock the high priority process was waiting for, causing some system services to be unavailable. This caused a watchdog to kick in and reboot the system endlessly. Such an issue is addressed by “priority inheritance”, allowing a lock-holding process to inherit the priority of the highest priority process waiting for the lock. Priority inheritance is now supported in kernel locks thanks to PREEMPT_RT.
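The starvation scenario above, and the effect of letting the lock holder inherit a waiter's priority, can be sketched with a toy scheduler decision in Python. This is only an illustration: the function, the task names and the convention that a higher number means a higher priority are assumptions made for this example, and have nothing to do with the kernel's actual implementation.

```python
def next_to_run(runnable, lock_holder, lock_waiters, inheritance):
    """Pick the task a strict priority scheduler would run next.

    runnable:     dict of task name -> base priority (higher = more urgent)
    lock_holder:  name of the task currently holding the lock
    lock_waiters: dict of blocked task name -> priority
    inheritance:  whether the holder inherits its waiters' priority
    """
    priorities = dict(runnable)
    if inheritance and lock_holder in priorities and lock_waiters:
        # The holder temporarily runs at the priority of its most
        # urgent waiter, so it can release the lock quickly.
        priorities[lock_holder] = max(
            priorities[lock_holder], *lock_waiters.values()
        )
    return max(priorities, key=priorities.get)


# "high" is blocked on a lock held by "low"; "medium" needs no lock.
runnable = {"low": 1, "medium": 5}
waiters = {"high": 10}
```

Without inheritance, `next_to_run(runnable, "low", waiters, False)` picks "medium", so "low" never gets to release the lock and "high" starves. With inheritance, "low" is picked first.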
By the way, I learned that there are now 5 preemption models in the kernel, instead of four originally with PREEMPT_RT. There is now a “Basic RT” option in which you have all the PREEMPT_RT features except the sleeping spinlocks, which is useful for debugging such features.
So now that PREEMPT_RT is almost in mainline, what should kernel developers do? The main thing is to stop adding non determinism to Linux. For example, Rostedt strongly advised against rw_locks and semaphores on multiple CPUs. That’s horrible for cache lines, as they do not scale. You should use RCU mechanisms instead.
As a kernel developer, you shouldn’t use preempt_disable() either, unless you know it is done for a very short amount of time. Similarly, if you find code that uses local_irq_save(), that’s most likely a bug. Instead, people should use spin_lock_irqsave() and spin_lock_irq(), which disable interrupts only when PREEMPT_RT is not enabled.
Rostedt ended his talk by answering a question about what will remain of the PREEMPT_RT patch set. Even when the most important parts of PREEMPT_RT are in mainline, some changesets are likely to remain for some time, just to address cases that don’t have a solution yet. 99.9% of the users will be able to do without it. That’s what a mainline solution means: no patches to apply.
Uh-oh, It’s I/O Ordering! by Will Deacon
Talk selected by Miquèl Raynal
Will gave his second talk at an ELCE about I/O ordering, 6 years after the first talk on that subject. For this purpose, he started with an introduction to the memory consistency models (in 5 minutes!) to show the audience how a very simple program, run on two CPUs, could produce very strange results due to store buffering. Because his claim was a bit hard to believe for such a simple program, he proved to us he was right by actually running it on his laptop. While this kind of tricky behavior applies to memory, the same odd situation may happen with I/Os! After a theoretical explanation, he gave a few examples (mostly taken from the mainline Linux kernel) of good and bad code sections and explained why. If you are a device driver writer, this talk should be of interest! The examples are real use cases that you might encounter someday (if not already), and knowing how to work around the most generic caveats with the right memory barrier or even a dummy read to enforce ordering is something you will want to master to avoid strange random bugs.
Sebastian started the talk by presenting what this subsystem is used for and its history, which he knows in great depth since he took over the maintainership of the power supply subsystem in the Linux kernel in 2014. While it’s not the subsystem with the hardest concepts to grasp, Sebastian explained that he aimed, with his talk, at providing an accessible approach to the subsystem for people who’re trying to get started in the Linux kernel or in this specific subsystem. Having contributed a few patches and drivers to this subsystem in my early days as a kernel developer, I can say that I wish I had seen his talk before, as it would have quickened my understanding of the power supply subsystem. Scrolling down the slides, he presented a very simple example of a dummy driver, Device Tree nodes and how to configure what’s exposed to sysfs. Sebastian also said a few words about Open-Circuit Voltage in batteries, which is interesting for getting more precise values of the battery capacity depending on its age and temperature, and the ongoing work on supporting this in the kernel. He concluded with the future plans for the subsystem, which are mainly related to batteries, their fuel gauges and chargers.
Arnd gave an update on the status of the effort to make 32-bit kernels handle the 32-bit time_t overflow which will happen in January 2038. He first explained why this is necessary: a huge number of 32-bit products are still being introduced on the market, some of them with a very long service life. Arnd said this work has been on-going since 2014, when John Stultz switched the internal timekeeping code to a 64-bit second counter. The device drivers then needed fixing. This was done by addressing them individually by changing:
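To give an idea of why store buffering is surprising, here is a small Python sketch based on the classic "store buffering" litmus test (which may or may not be the exact program Will ran). Two threads each write one variable and then read the other. The sketch enumerates every interleaving that respects program order, which is what one intuitively expects from two CPUs, and collects the possible results.

```python
from itertools import permutations

# Thread 0: x = 1; r0 = y        Thread 1: y = 1; r1 = x
THREADS = {
    0: [("store", "x"), ("load", "y")],
    1: [("store", "y"), ("load", "x")],
}


def sc_outcomes():
    """All (r0, r1) results when operations simply interleave in program order."""
    outcomes = set()
    # Each distinct ordering of [0, 0, 1, 1] says which thread runs next.
    for order in set(permutations([0, 0, 1, 1])):
        memory = {"x": 0, "y": 0}
        position = {0: 0, 1: 0}
        result = {}
        for thread in order:
            kind, var = THREADS[thread][position[thread]]
            position[thread] += 1
            if kind == "store":
                memory[var] = 1
            else:
                result[thread] = memory[var]
        outcomes.add((result[0], result[1]))
    return outcomes
```

Under this idealized model the result (0, 0) never appears. On real hardware with store buffers, each CPU's write can still be sitting in its own buffer when the other CPU reads, so both reads may return 0 unless a suitable barrier is used.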
time* to ktime_t
time* to jiffies
time_t to time64_t
timespec/timeval to timespec64
CLOCK_REALTIME to CLOCK_MONOTONIC
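As a reminder of what the overflow mentioned above means in practice, the following Python sketch computes the last date a signed 32-bit seconds counter can represent, and the date it wraps to one second later. The helper name `wrap_int32` is ours and only mimics two's complement wrap-around for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Interpret an integer as a signed 32-bit value (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


# Last second representable with a signed 32-bit time_t.
last_valid = EPOCH + timedelta(seconds=INT32_MAX)
# One second later, the counter wraps to the most negative value.
after_overflow = EPOCH + timedelta(seconds=wrap_int32(INT32_MAX + 1))

print(last_valid.isoformat())      # 2038-01-19T03:14:07+00:00
print(after_overflow.isoformat())  # 1901-12-13T20:45:52+00:00
```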
The driver userspace interface also needed to be changed. Some IOCTLs were easy to change because they are already using different numbers depending on the size of the argument they take. The other IOCTLs had to be redefined. It gets worse, Arnd said, explaining how the read, write and mmap callbacks are getting fixed.
While the VFS layer got fixed earlier this year, some filesystems are still work in progress and other ones are not fixable because they use a 32-bit time on disk. The only way is to move away from those.
Arnd then went over the biggest remaining part of the work, the system calls. The 32-bit compat syscalls mechanism is reused and a __kernel_timespec type has been introduced to handle time at the boundary. He then listed the affected system calls and their current status.
He ended by talking about userspace and the plan to handle the issue in glibc. Finally, he mentioned what distributions will have to do.
On this Rock I will Build my System – Why Open-Source Firmware Matters, by Lucas Stach
Talk selected by Grégory Clement
Lucas started by presenting what we used to have in the embedded world: a minimalist firmware which acts only as a bootloader, with no interaction with the kernel.
Then he showed why, with virtualization, there was a need to have CPU power management in a single place. This was defined by PSCI, whose purpose was to have the bare-metal and the virtualized kernel see the same interface. What should have been a simple and delimited interface then became more and more complex due to hardware constraints. Indeed, in many SoCs multiple devices or CPUs can share the same register. Besides, an interface such as the I2C used by a PMIC can also be shared. This led to moving the entire register inside the firmware or to having lock mechanisms between the kernel and the firmware. In conclusion, the kernel implementation became easier but at the expense of a complex firmware.
The sad news is that most firmwares are not copyleft, which can lead to closed source binaries, making debugging very difficult for the kernel. Even if the firmware remains open source, having the hardware management split in two parts makes debugging more complex. However, there is nothing we can do about it, because there are valid reasons to have a firmware. The only thing we should be vigilant about is the openness of the firmware source.
Handling Security Flaws in an Open Source Project, by Jeremy Allison
Talk selected by Antoine Ténart
Samba is a well known re-implementation of the SMB protocol and as such is used in several consumer devices, such as NAS. As open source software is used more and more in new products, correctly handling security flaws and their fixes is becoming an important topic.
Jeremy Allison, one of the core developers of Samba, gave a talk about how Samba is dealing with security issues and what questions other projects should ask themselves to handle those the right way. He talked about the process to put in place to take security seriously, how to respond to vulnerability reporters and to security issues, and how to notify downstream vendors so that products in the wild are patched before the CVE is made public.
Jeremy Allison also presented three examples of security flaws in Samba. He described how they were handled at the time, the difficulties the Samba developers encountered, and gave a postmortem.
Security is important and we found this talk to be a must-see for open source maintainers and developers, as it gave a good insight on how to properly handle security vulnerabilities in a project. One of the key points was how to coordinate the security responses to avoid having the users being at risk.
Improve Linux User-Space Core Libraries with Restartable Sequences, by Mathieu Desnoyers
Talk selected by Maxime Chevallier
Following-up on the good LWN coverage of the restartable sequences, Mathieu Desnoyers gave an interesting talk on the current userspace support, and some feedback regarding the shortcomings of the current implementation.
Restartable sequences make it possible to implement lockless per-CPU sections of code that will be automatically aborted (or restarted) whenever migration, preemption or signal delivery occurs before the final “commit” operation is done.
This is useful to read some performance counters from userspace with a minimal overhead since there’s no lock involved to protect the critical section.
Mathieu explained that these critical sections need to be written in assembly code, but thanks to the librseq and its set of macros, users shouldn’t have to worry about this.
Mathieu then presented some of the shortcomings of rseqs, one of them being that they can’t be debugged step by step (since a signal interrupts the sequence, causing it to abort). To solve these shortcomings, Mathieu gave a quick glimpse of a possible new system call, cpu_opv(), that would allow users to execute a limited sequence of instructions with preemption and migration disabled.
Power Debugging with JTAG, by Patrick Titiano & Alexandre Bailon, BayLibre
Talk selected by Thomas Petazzoni
In this talk, BayLibre engineers Patrick Titiano and Alexandre Bailon introduced libSoCCA (SoC Continuous Analyzer), a Python library that makes it possible to watch over JTAG what a SoC is doing.
This library allows remote access to the registers of a SoC through JTAG, and uses the SoC interconnect debug port rather than the CPU debug port. Non-intrusive observation of what the SoC is doing is thus possible, even when the CPU is idle or in a low-power state.
libSoCCA uses SVD (System View Description) files, which are XML files that describe all the registers of the SoC, their bitfields and possible values. This format is not specific to libSoCCA, since it is already used by Keil, and apparently some SoC vendors provide such SVD files for their SoCs. Unfortunately, not all vendors do this, and creating such SVD files from the SoC datasheet is a very long and boring process. In addition, the speakers pointed out that the SVD file format lacked an include directive, which would be very useful to share register definitions between SoC.
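To show what such a register description buys you, here is a minimal Python sketch that reads a heavily trimmed, made-up SVD-style fragment and decodes a raw register value into named bitfields. This is not libSoCCA's API: the `decode` function, the UART0 peripheral, its address and its fields are all hypothetical, and only the element names (`peripheral`, `baseAddress`, `register`, `addressOffset`, `field`, `bitOffset`, `bitWidth`) follow the SVD format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed SVD-style description of one register.
SVD = """
<device>
  <peripherals>
    <peripheral>
      <name>UART0</name>
      <baseAddress>0x40001000</baseAddress>
      <registers>
        <register>
          <name>CTRL</name>
          <addressOffset>0x8</addressOffset>
          <fields>
            <field><name>ENABLE</name><bitOffset>0</bitOffset><bitWidth>1</bitWidth></field>
            <field><name>PARITY</name><bitOffset>4</bitOffset><bitWidth>2</bitWidth></field>
          </fields>
        </register>
      </registers>
    </peripheral>
  </peripherals>
</device>
"""


def decode(svd_text, peripheral, register, raw_value):
    """Return (absolute address, {field name: value}) for a raw register value."""
    root = ET.fromstring(svd_text.strip())
    for periph in root.iter("peripheral"):
        if periph.findtext("name") != peripheral:
            continue
        base = int(periph.findtext("baseAddress"), 0)
        for reg in periph.iter("register"):
            if reg.findtext("name") != register:
                continue
            address = base + int(reg.findtext("addressOffset"), 0)
            fields = {}
            for field in reg.iter("field"):
                offset = int(field.findtext("bitOffset"), 0)
                width = int(field.findtext("bitWidth"), 0)
                # Shift the field down and mask off everything above it.
                fields[field.findtext("name")] = (raw_value >> offset) & ((1 << width) - 1)
            return address, fields
    raise KeyError(f"{peripheral}.{register} not found")
```

With this, a raw value read over JTAG can be displayed as named fields instead of a bare hexadecimal number: `decode(SVD, "UART0", "CTRL", 0x21)` gives the register address 0x40001008 with ENABLE set to 1 and PARITY set to 2.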
With the information provided by the SVD files and a connection to the target over JTAG that uses OpenOCD, libSoCCA is then used to implement a number of different tools:
PMUGraph, which shows power management statistics of the device. Compared to solutions such as perf or powertop, this solution has the advantage of being non-intrusive.
memtool, which provides a way of manipulating registers without having to manually fiddle with register offsets and bitfields. It could be summarized as a remote devmem that knows your SoC registers. This kind of feature can be found in proprietary JTAG tools, and was lacking in the open-source world.
clocktool (development not started yet), which shows the state of the SoC clocks remotely, a bit like clk_summary in debugfs, but which works even when the SoC is idle or in a low power state, which is precisely a moment where getting clock status may be useful for debugging.
Overall, we found libSoCCA very interesting as it opens up lots of possibilities. It would be useful to have a better file format than SVD to describe SoC registers though, and it would also be nice to have an on-target variant of memtool.