Back from the Embedded Linux Conference: selection of talks #1

As we wrote in a previous blog post, 11 engineers from Bootlin attended the Embedded Linux Conference in Seattle in April. After such an event, we have a tradition of sharing a selection of talks that we found useful. To do so, we ask each of our engineers who participated in the conference to pick one talk they would like to highlight, and to write a short summary of it along with their feedback. In this first installment of this series of blog posts, we’ll share our first 4 selected talks.

Real-Time Linux: What Is Next?, by Daniel Bristot de Oliveira, Sebastian Siewior, Kate Stewart

Talk chosen by Bootlin engineer Maxime Chevallier, who is also the author and trainer of our Real-time Linux with PREEMPT_RT training course.

With every release, the Preempt-RT patchset gets closer to being fully merged into the mainline tree. While there are still some issues left to be addressed, namely printk, the panel discussion was a good opportunity to discuss what’s next. What can we expect once Preempt-RT is merged, and what are the next steps going to be?

Over the years, Preempt-RT and its developer community have used this cozy corner of the Linux kernel to work on new, advanced and complex features that eventually made it into the mainline kernel. Things aren’t going to change much in that regard: efforts to improve scheduling or, lately, RT throttling are now being made in the mainline tree.

Another point that was brought up is the continuing need to educate users on how to use Preempt-RT. For people who need a real-time Linux-based operating system, using Preempt-RT is just the beginning of the journey: most of the work really is about knowing how the entire system behaves in real-life conditions, and avoiding latencies coming from the hardware, the kernel, or user applications. This won’t change much after Preempt-RT lands in mainline, and the need for education stays the same. Some proposals were made to improve the realtime Linux wiki, so that it can serve as a centralized place to share configuration examples and best practices.

Tools such as rtla have, however, made it much easier for users to gather and share information on the issues they encounter. This tool can be expected to gain more and more traction, and in turn help users get familiar with real-time debugging more easily.

We’re all very excited to see the hard work of the Preempt-RT developers make it into mainline Linux, but it was clear that this is going to be just another step in the long journey of the Real-Time Linux project.

Compound Interest – Dealing with Two Decades of Technical Debt in Embedded Linux, by Bartosz Golaszewski, Linaro

Talk chosen by Bootlin engineer Luca Ceresoli

In this talk, which mixed some humor with lots of serious considerations, Bartosz discussed the high cost of “technical debt”, i.e. the cost of suboptimal solutions used in software development, and of the workarounds and hacks accumulated over time to avoid the effort of fixing the initial design.

By implementing a suboptimal software component a developer borrows time, which will have to be paid back when using this suboptimal software component (this is “technical debt”, compared to “financial debt”). Later on one can add workarounds and hacks instead of fixing the original design flaw, thus accumulating “compound interest”. The problem is much worse when the suboptimal design touches programming interfaces (especially the user space APIs!).

The problem with APIs is that after they are added, people will inevitably start using them in new ways, some of which were not even considered initially. This makes future restructuring more and more complex, as none of the use cases should be broken.

In the Linux kernel this usually happens when a subsystem has a lot of users, so modifying its internal API is extremely costly. Another common reason is that people implementing a new subsystem tend to start by copying an existing subsystem, thus copying the same errors – and duplicating technical debt. Bartosz finds the lack of a good deprecation mechanism in the Linux kernel exacerbates this problem.

The GPIO subsystem, which Bartosz maintains, is a good example. It has a long history, predating device tree and the device driver model; it has lots of providers and many more users; and it started as a very naïve set of function prototypes that evolved over time. It provides a number-based sysfs interface that is very problematic, on top of which a new, better interface has been developed, called gpiod.

The large variety of GPIO users (some in atomic context) and GPIO providers (some of which may sleep) created a long-standing issue, as serialization of calls to the GPIO subsystem appeared close to unfeasible. However, Bartosz found a good solution using SRCU, which has been merged in v6.9. One part of the technical debt has been paid back, finally.

The debt that is the hardest to pay back is the one originating from user space APIs, which just cannot be removed for backward compatibility reasons. People have long been encouraged to use the new gpiod interface, but some still complain that it has no easy way to set a GPIO output persistently. Bartosz is working on a solution to this by implementing a D-Bus protocol to set GPIOs and a related daemon that is able to persistently keep GPIOs set. Bartosz will be very happy if people test it and report their findings! Once finished, this will leave no excuse not to switch to the gpiod user API.

Overall, we found this talk enlightening about how bad programming design creates technical debt, how difficult that debt is to pay back, and what to do to minimize it in the first place.

How (Not) to Get the Vendor Driver Merged Upstream, by Dmitry Baryshkov, Linaro

Talk chosen by Bootlin engineer Bastien Curutchet

In this talk Dmitry Baryshkov presented the fundamentals of a good contribution to the Linux kernel.

He started with the classic steps to follow when you want to contribute and mentioned the checks that have to be made before submitting a patch, such as:

  • ./scripts/checkpatch.pl to check code style compliance
  • make dtbs_check and make dt_binding_check when changes are made in Device Tree bindings
  • make W=1 to reveal additional compiler warnings, which should all be fixed
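The checklist above can be sketched as a small helper that assembles the commands in order. This is only an illustration, not something shown in the talk: the function names, the `touches_dt` flag and the patch file name are hypothetical, and the commands assume you are at the top of a kernel source tree, where the checkpatch script lives under `scripts/`.

```python
import subprocess


def pre_submission_commands(patch_files, touches_dt=False):
    """Return the checks as argv lists, in the order they would be run."""
    # Code style check on every patch of the series.
    commands = [["./scripts/checkpatch.pl", *patch_files]]
    if touches_dt:
        # Only relevant when Device Tree bindings or sources are modified.
        commands.append(["make", "dt_binding_check"])
        commands.append(["make", "dtbs_check"])
    # Build with extra warnings enabled.
    commands.append(["make", "W=1"])
    return commands


def run_checks(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print("$", " ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical patch name; dry run only, nothing is executed.
    run_checks(pre_submission_commands(["0001-example.patch"], touches_dt=True))
```

Keeping the dry run as the default makes it easy to review the list before actually running it against a tree.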

Then, Dmitry presented the main b4 commands that can be used to prepare and send your patches upstream. He also gave advice about the contents of a patch series: the kernel must compile and work after each patch is applied; it is better to split a big patch into small ones and a big patch series into smaller ones; the commit log must describe why the introduced change is needed. Krzysztof Kozlowski also insisted on this last point in another good talk: giving context to maintainers helps them during their review.

Finally, he talked about good and bad habits when contributing. His first piece of advice is ‘Don’t fear asking questions’: if there is something that you don’t understand in a reviewer’s comment or question, don’t ignore it, and ask for details.
The second thing to do is to leave people time to review. There is no need to send a new iteration right after the first comment you get. It’s better to leave others time to add more comments (or even debate some points) before sending your next iteration. The golden rule would be: never send more than one iteration per day, or even one per week for a big patchset. According to him, if you don’t get any answer at all for two weeks, it’s OK to resend your patch series or to ping the maintainer.

In conclusion, Dmitry pointed out that the first contribution is the hardest, and that things get easier and easier with subsequent contributions.

As a fairly new contributor to the Linux kernel, I really appreciated this talk because Dmitry went through every question I asked myself and my colleagues before and during my first contribution. I’d say this talk can be really useful for anyone who would like to start contributing to the Linux kernel project.

Maximizing SD Card Life, Performance, and Monitoring with KrillKounter, by Andrew Murray

Talk chosen by Bootlin engineer João Marcos Costa

The apparent unreliability of SD Cards is a common complaint from customers when they find that data has been corrupted, or that the throughput announced by the manufacturer is far from reality.

In this talk, Andrew Murray starts with an overview of what a typical SD Card contains: the card’s connectors, a microcontroller (the SD Controller), and a NAND memory chip. However, as he points out, NAND is fundamentally unreliable.

Such unreliability comes mainly from the degradation of the semiconductor that composes the Floating-gate MOSFET. This transistor is the base for both NAND and NOR Flash, each one with inherently different levels of granularity for writing operations. The talk focuses on NAND Flash, as it can be found everywhere: SD Cards, eMMC, SSD, USB sticks, etc.

NAND’s access limitations are a consequence of its write granularity: a whole block of pages needs to be erased before you can write into a single page. The presentation illustrates this with an example where the first page of a 3×3 block (i.e., with 9 pages) needs to be written. A first approach would be writing an entirely new block: the “new” first page would be written into this block, and the other eight pages would be copied from the old block. The drawback here is called write amplification: we end up with 9 write operations instead of only one.
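The 3×3 example can be reproduced with a toy model. This is a simplified sketch, not the controller logic of any real SD Card: the block is just a Python list of 9 pages, and the function name is hypothetical.

```python
def naive_page_update(block, page_index, new_data):
    """Update one page by rewriting the whole block into a fresh one.

    Returns the new block and the number of physical page writes performed.
    """
    new_block = []
    physical_writes = 0
    for index, old_data in enumerate(block):
        # The updated page gets the new data, the others are copied as-is.
        new_block.append(new_data if index == page_index else old_data)
        physical_writes += 1  # every page is programmed into the fresh block
    return new_block, physical_writes


old_block = [f"page{i}" for i in range(9)]  # the 3x3 block from the talk
new_block, writes = naive_page_update(old_block, 0, "updated")
write_amplification = writes / 1  # one logical page write was requested
print(writes, write_amplification)  # prints: 9 9.0
```

The ratio between physical writes and requested writes is the write amplification factor, here 9 for a single-page update.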

The lifetime of NAND Flash is measured in Program/Erase cycles (P/E cycles), and this value changes according to the memory cell architecture: SLC (Single-level cell, 1 bit per cell) has an approximate lifespan of 10000 P/E cycles, while MLC (Multi-level cell, 2 bits per cell) and TLC (Triple-level cell, 3 bits per cell) have approximate lifespans of 3000 and 1000 P/E cycles respectively.
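To get a feel for what these figures mean, here is a rough back-of-the-envelope estimate. It is a sketch under strong assumptions that do not come from the talk: ideal wear levelling (every block is erased equally often), a hypothetical 16 GB card, and a write amplification factor supplied by the caller.

```python
# Approximate P/E cycle figures quoted above.
PE_CYCLES = {"SLC": 10_000, "MLC": 3_000, "TLC": 1_000}


def estimated_host_writes_gb(capacity_gb, cell_type, write_amplification=1.0):
    """Rough upper bound on host data (in GB) writable before wear-out.

    Assumes ideal wear levelling, which real cards only approximate.
    """
    return capacity_gb * PE_CYCLES[cell_type] / write_amplification


for cell_type in PE_CYCLES:
    # Hypothetical 16 GB card with the worst-case factor of the 9-page example.
    print(cell_type, estimated_host_writes_gb(16, cell_type, write_amplification=9.0))
```

The estimate makes the effect of write amplification visible: the higher the factor, the less useful data can be written over the life of the card.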

One of the talk’s main points is an experiment exploring the lifetime of 4 regular 8Gb MLC consumer SD Cards by writing to them until they fail. Both sequential and random writes were used. For the sequential writes, the block size used was 512Kb. For the random writes, three different values were used: 4Kb, 128Kb and 512Kb. The results showed that sequential writes have a higher throughput and cause less degradation. For the random writes, the larger the block size, the higher the throughput and the lower the degradation levels. Overall, the conclusion is that large sequential accesses are usually better. This whole experiment was tracked with the Open Source daemon KrillKounter, which logs block layer statistics and makes it possible to determine the wear on an SD Card.

The talk was particularly instructive, as it starts from a practical overview of SD Cards and NAND Flash, then dives for a moment into electronics specifics to explain how the oxide degradation works, and finally presents an experiment that puts the theoretical SD Card lifespan values to the test. As a bonus, it provides us with KrillKounter, a tool to analyze the wear on SD Cards.

Bootlin engineer Louis Chauvet at Linux Display hackfest

From May 14 to May 16, Igalia is organizing the 2024 Display Next Hackfest, an event where talented developers will gather to explore the latest technologies and trends in the Linux Display Stack.

As explained on the event website:

It has an unconference format where participants propose topics for presenting, roadmapping, discussing and examining together. It aims to unblock bottlenecks, design solutions, raise pitfalls and accommodate the needs of each layer of the display stack. Participants should feel free to propose any topic which interests them. Some topics from the previous edition include: HDR and color management, frame timing and variable refresh rate (VRR), atomic flips, testing and CI, etc.

Bootlin engineer Louis Chauvet, who has started contributing to the Linux kernel VKMS driver, and is starting to work on IGT and the latest version of the Chamelium CI testing hardware, will participate in this hackfest, together with many developers from Igalia, Red Hat, Intel, Google, Raspberry Pi, AMD, ARM, Collabora and more. This will allow us to discuss current developments and topics, and meet the relevant developers of the Linux graphics/display community.

Our talks at Embedded Open Source Summit 2024

The Embedded Open Source Summit 2024 took place on Apr 16-18 in Seattle, with many talks on a wide range of embedded Linux topics. 11 engineers from Bootlin participated in this conference and four of us gave talks, for which we are happy to publish the slides and videos in this blog post.

Bootlin team at Embedded Open Source Summit 2024

Bootlin at Open Source Experience and SIDO in Paris, Dec 6-7

Next weekend, Paris will be hosting a combined event made up of Open Source Experience and SIDO, the first dedicated to open-source technologies, and the second to IoT, AI, digital infrastructure and cybersecurity.

Open Source Experience

Thomas Petazzoni, Bootlin CEO, will be representing Bootlin at these events, and will also be participating in the round table Embedded systems security: a technical and organizational approach on December 7, at 2:30 PM UTC+1. The abstract of the round table is:

Security is a major issue. Embedded systems are increasingly complex and connected, making them more vulnerable. The aim of this round table is to discuss best practices for guaranteeing security

Thomas will be speaking with Daniel Fages (Freelance), Eloi Bail (Savoir Faire Linux) and Jean-Charles Verdié (Canonical), and the round table will be moderated by Cédric Ravalec (Smile).

If you’re interested in discussing career, business or partnership opportunities with Bootlin, do not hesitate to contact Thomas Petazzoni ahead of the event to schedule a meeting.

Back from Netdev 0x17

At Bootlin, we focus on Embedded Linux development and support, and these embedded devices often have a network interface, be it an Ethernet port, a wireless chip or some other kind of communication channel that falls under the Linux Networking Stack’s framework.

So it’s always interesting to see what the rest of the community is working on, and to meet in real life the people we interact with on the netdev mailing list.

That’s why this year, Alexis Lothoré and Maxime Chevallier flew to Vancouver to participate in the Netdev Conference, a 5-day event organised by the Netdev Society, a small non-profit run by volunteers dedicated to holding this event.

Bootlin at Netdev

Most talks at Netdev do not directly cover topics we’re actively working on, but it’s always refreshing to see these new exciting technologies that could trickle their way down to the embedded world a few years from now. It is also always interesting to stay up to date on the challenges encountered by other parts of the networking industry, at scales very different from the ones we are used to.

We learned, for example, what CXL is about, what it brings, and the efforts being made to design new networking hardware around this technology to change the way we think about datacenter networking.

When we attended Netdev 0x13 in 2019, QUIC was one of the hot topics. This year, Homa was under the spotlight with talks on what it is, and how this new protocol could address some of TCP’s problems.

As in all previous editions, we learned about the progress made with TC and its future, new ways of bypassing the kernel stack, and BPF integration in the kernel, along with XDP, which continues to become more and more powerful.

Another hot topic in the kernel is the introduction of the Rust language, and the network subsystem is a pretty relevant target for the new features brought by the language. As a consequence, Rust subsystem maintainers Miguel Ojeda and Wedson Almeida Filho gave an overview of Rust’s benefits compared to traditional C code, and then showed a step-by-step implementation of a kernel-side TCP server module. While this example is not perfectly representative of the classic network-related drivers we usually write, it was a nice showcase of the current state of kernel API abstractions in Rust.

We also discovered the new use case that is now driving most of the datacenter networking efforts, which is, unsurprisingly, AI and Machine Learning. It turns out that if you want your ChatGPT to give up-to-date replies without having to wait too long, you need a powerful and well-organized datacenter for the training part, and network engineering plays a big part in keeping all those GPUs fed at a relevant pace.

This led to the devmem TCP effort, which started to feel a bit familiar to us as it uses dma-buf, which we also sometimes use in multimedia pipelines. The ML and AI topic was introduced to us by the wonderful keynote session given by Manya Ghobadi, who captivated the whole audience by explaining how AI and ML work, what AI workloads require in terms of network traffic scheduling and datacenter topology, and computing hardware that uses optical computing.

On the final day, we even had a visit from Jakub Kicinski (one of the co-maintainers of the Linux networking tree), who presented what he had been working on and gave us an update on the netdev development statistics (basically, his main point is that we do need to review more patches).

For the first time, there was a talk from Bootlin at Netdev, as Maxime presented one of the topics he’s been working on lately: Improving multi-PHY and multi-port interfaces support. Although it was one of the few talks focusing on the low-level aspects of the Ethernet stack, it triggered some discussions and interest from the community, which will help further improve the ongoing work.

The slides and videos of the event will be published at some point in the future, and we will be sure to let our readers know when they become available.

We’ll conclude this short feedback by thanking once again the Netdev Board members, organizers, speakers and the audience for this great event.

We’ll come back 🙂

Bootlin at Capitole du Libre, November 18-19, Toulouse, France

Capitole du Libre is THE open-source/free-software event that takes place each year in Toulouse, France. It turns out that half of Bootlin’s team is based precisely in Toulouse, and obviously we are big fans of open-source/free-software, so we have always supported, contributed to and participated in Capitole du Libre in one way or another. Bootlin’s CEO Thomas Petazzoni is actually one of the founders of the Capitole du Libre event, back in 2007-2008.

This year, Capitole du Libre will take place on November 18-19, as usual at ENSEEIHT, an engineering school located in the heart of Toulouse.

First, Bootlin is financially supporting the event as one of the Platine sponsors. Thanks to this, we will have a booth at the event, so if you want to meet us, coming to Capitole du Libre is a good idea.

Secondly, Bootlin is also contributing to the event by having 4 of its engineers give talks:

Attending Capitole du Libre is free, so we definitely recommend that all free-software/open-source users, developers and contributors join this great event, and we look forward to meeting the local open-source community at Capitole du Libre!

Yocto Project Summit 2023.11: 2 Bootlin talks

The Yocto Project regularly organizes an online conference called the Yocto Project Summit. The next edition, Yocto Project Summit 2023.11, will take place on November 28-30, from 12:00 to 18:00 UTC, and at just $40, attending is really affordable.

Yocto Project Summit

Bootlin is not only a big user of the Yocto Project, but also a significant contributor to the project, so we’re happy to announce that our two talk proposals for the Yocto Project Summit 2023.11 have been accepted. Bootlin engineers will therefore deliver the following talks:

If you are a user of the Yocto Project, or intend to become one, we can only recommend that you attend this event. And of course, if you need training on the Yocto Project, or engineering/support services, do not hesitate to contact us!

Bootlin at Netdev 0x17, THE Technical Conference on Linux Networking

Bootlin will be at the Netdev 0x17 conference, subtitled THE Technical Conference on Linux Networking. It is indeed one of the major events for developers working on the networking side of the Linux kernel to gather and discuss current and future topics. This year, the conference will take place from Oct 30 to Nov 3 in Vancouver, Canada.

Bootlin is involved in a number of Linux kernel networking developments: development and/or improvement of Linux kernel drivers for Ethernet MACs, Ethernet PHYs, WiFi chips, support for SFP, for Ethernet switches, for PTP offloading, for MACsec offloading, improvements to the 802.15.4 stack, and more. As such, it is very relevant for us to meet the Linux kernel networking community, present our work, and understand where things are heading in the networking stack.

Our engineers Maxime Chevallier and Alexis Lothoré will both attend the conference. In addition, Maxime will be presenting a talk titled Improving multi-phy and multi-port interfaces:

This talk will describe current use-cases where one MAC is connected to multiple PHYs (chained, or in parallel) and multiple front-facing ports, either through multiple PHYs or through a single multi-port PHY. There exists support for some of these scenarios already, but it is limited by the fact that the PHY device is hidden behind a net_device from userspace’s point of view. We therefore can’t configure an individual PHY when multiple PHYs are present on a link (through SFP transceivers for example), and selecting which front-facing port to use is also limited. This talk will describe ongoing work to support these complex topologies, the challenges faced and expected improvements.

We look forward to attending this event in a few weeks’ time!

Feedback from ELCE 2023: selection of talks #4

As we reported in a previous blog post, almost the entire Bootlin engineering team was at the Embedded Linux Conference Europe in Prague in June. In order to share with our readers more about what happened at this conference, we have asked all engineers at Bootlin to select one talk they found interesting and useful and share a short summary of it. We will share this feedback in a series of blog posts: first post, second post, third post, this one being the fourth and final post of the series.

Do the Time Warp – the Rocky Horror PTP Show: Verification of Network Time Synchronization in the Real World

Talk by Johannes Zink, chosen by Bootlin engineer Köry Maincent

As we are currently dealing with PTP at Bootlin and facing several weird behaviors, this talk resonated well with our current state of mind. Currently, most of our clock synchronization relies on NTP, but some specific use cases may need PTP to achieve high-precision clock synchronization between devices.

In this talk, Johannes first briefly describes the principles of PTP and its implementation, both in the Linux kernel, where PTP is handled by the MAC (often), by the PHY or in software, and in user space, with a description of the Linuxptp project. Then he goes straight to the issues he faced. For non-PTP users, the tests and oscilloscope measurements described by Johannes might be a bit hard to follow. He describes several possible issues and clock behaviors you can face, which might help a new PTP user avoid spending too much time debugging some tricky PTP behavior. One of the important things he points out is to “Always check your assumptions!”, which he wants to spread as a religious mantra. Learning from his common pitfalls and best practices is a good idea when getting your hands on the PTP mechanism.

And don’t forget “Always check your assumptions!”!

Slides: PDF
Video: Youtube

Setting up Yocto Layers and Builds with Official Tools – 2023 Edition

Talk by Alexander Kanavin, chosen by Bootlin engineer Jérémie Dautheribes

As a Yocto user, you may have already wondered, ‘Why aren’t there official tools for creating and managing BitBake-based projects in a reproducible manner?’ Perhaps you have already used tools like repo, Git submodules, kas, or even created your own scripts.
In this talk, Alexander Kanavin – one of the major contributors to the Yocto project – introduces the tools currently under development within OE-core/poky to address this situation.

Slides: ODP
Video: Youtube

WirePlumber 0.5 Propelling PipeWire for the Embedded

Talk by Ashok Sidipotu, chosen by Bootlin engineer Alexandre Belloni

Ashok started with a quick introduction to what PipeWire is. A nice block diagram explains what it looks like in action. Then the discussion switches to the session manager and why it is important.
WirePlumber is now the default session manager, replacing PipeWire media session. It manages the control path and dynamically creates PipeWire objects.
The main changes are:

  • config syntax is switching from Lua to SPA JSON, just like PipeWire. More info is available in this blog post
  • the event dispatcher has been created to handle PipeWire signals. This makes it possible to prioritize signals and to avoid race conditions. This feature has a nice example and a fairly complete blog post

This talk is a nice overview of what is happening in the PipeWire ecosystem, which is now quite mature. It is also great to see the improvements and that the embedded use case is not forgotten.

Slides: PDF
Video: Youtube

Feedback from ELCE 2023: selection of talks #3

As we reported in a previous blog post, almost the entire Bootlin engineering team was at the Embedded Linux Conference Europe in Prague in June. In order to share with our readers more about what happened at this conference, we have asked all engineers at Bootlin to select one talk they found interesting and useful and share a short summary of it. We will share this feedback in a series of blog posts: first post, second post, this one being the third of the series.

rtla timerlat: Debugging Real-time Linux Scheduling Latency

Talk by Daniel Bristot de Oliveira, chosen by Bootlin engineer Maxime Chevallier.

Talks related to real-time Linux debugging are pretty common at ELCE. I gave one myself in 2017 and I’ve been attending most of them since then. Besides a headache, what I could get from attending all these talks is that this topic is complex and time-consuming, and that there are a lot of different methodologies one can use to find the cause of these elusive problems.

Users who aren’t very familiar with the inner workings of the Linux kernel can ask for help on mailing lists, and the reply usually asks for a trace. This is where things get complicated: the Linux kernel tracer is very powerful, but can drown users in a flood of trace events from which it is difficult to extract the relevant data.

Hopefully, Daniel’s talk is going to make this kind of talk less common, as the tool he wrote and presented, rtla, makes it easy to gather important information about the cause of undesired latencies. By using cleverly placed tracepoints, in-kernel testing tools (timerlat and osnoise) and an automated trace analyzer, rtla can not only detect latencies as cyclictest would, it can also tell you what caused the latency. If it’s a blocking problem, rtla tells you which process is blocking your task. If it’s an interference, rtla will tell you which task or interrupt caused the latency, and can even detect if the hardware itself is the culprit.

For developers, this tool is also a perfect way to gather user feedback and bug reports that are small, precise and easily reproducible.

I therefore strongly recommend checking out Daniel’s talk and his dedicated blog article.

Slides: PDF
Video: Youtube

Zbus – the Lightweight and Flexible Zephyr Message Bus

Talk by Rodrigo Peixoto, chosen by Bootlin engineer Thomas Perrot

Zbus is a new message bus for Zephyr that allows threads to easily communicate with many others. This bus makes it possible to implement several bus topologies:

    • one-to-one
    • one-to-many
    • many-to-many

In addition, it can be used on very constrained systems.

In this talk, Rodrigo explained in detail how Zbus works, through a few examples. A thread can read or publish in bus channels, and when a message is published into a channel:

      • The listeners’ callbacks are executed
      • A notification is put into the subscribers’ queues
      • Then the subscribers are executed in priority order
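The three steps above can be illustrated with a toy model. This is a plain Python sketch of the publish flow as described here, not the Zephyr Zbus API: the class and function names are hypothetical, and the convention that a lower number means a higher priority is an assumption of this model.

```python
from collections import deque


class Channel:
    def __init__(self, name):
        self.name = name
        self.message = None
        self.listeners = []    # callbacks run during publish
        self.subscribers = []  # Subscriber objects notified through a queue


class Subscriber:
    def __init__(self, name, priority, handler):
        self.name = name
        self.priority = priority  # lower number = higher priority (assumption)
        self.handler = handler
        self.queue = deque()      # pending notifications


def publish(channel, message):
    channel.message = message
    for callback in channel.listeners:  # step 1: listeners' callbacks
        callback(channel)
    for sub in channel.subscribers:     # step 2: notify the subscribers
        sub.queue.append(channel)


def run_pending(subscribers):
    # Step 3: subscribers process their notifications in priority order.
    for sub in sorted(subscribers, key=lambda s: s.priority):
        while sub.queue:
            sub.handler(sub.name, sub.queue.popleft())


log = []
chan = Channel("sensor")
chan.listeners.append(lambda ch: log.append(("listener", ch.message)))
low = Subscriber("logger", 5, lambda name, ch: log.append((name, ch.message)))
high = Subscriber("control", 1, lambda name, ch: log.append((name, ch.message)))
chan.subscribers += [low, high]

publish(chan, 42)
run_pending([low, high])
print(log)  # prints: [('listener', 42), ('control', 42), ('logger', 42)]
```

In this model the listener runs immediately inside `publish`, while the subscribers only see the message later, the higher-priority one first.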

The bus is managed by a dispatcher named Virtual Distributed Event Dispatcher (VDED), which is robust to priority inversion.

We found Zbus to be a very interesting feature because, until now, there was no easy way to implement one-to-many and many-to-many topologies, or even one-to-one communications, without having to deal with priority inversion problems and to use FIFO, LIFO, pipe, etc.

Slides: PDF
Video: Youtube

Linux Power! (from the Perspective of a PMIC Vendor)

Talk by Matti Vaittinen, chosen by Bootlin engineer Kamel Bouhara.

PMICs (Power Management Integrated Circuits) are a key component of low-power embedded systems, as they often handle the complexity of controlling the various supply voltages required by SoCs. In his talk, Matti Vaittinen started by describing the various devices that can be embedded in a PMIC: watchdog, RTC and GPIOs are examples of such extra functionalities. He reminded us why such devices are best handled in the Linux MFD subsystem, to take advantage of existing code. However, the main subsystem used to implement support for a PMIC is the regulator subsystem, and the talk gives us a good understanding of how it works: the concept of provider/consumer, how to register multiple regulators for a PMIC and how to handle specific events. A focus is made on error detection and how over-current errors are reported in three categories:

      • PROTECTION: hardware-level errors reported when the protection limit is reached
      • ERROR: unrecoverable errors that don’t directly involve hardware shutdown
      • WARNING: the system is still recoverable but requires specific action to be taken

Some PMICs also provide IRQs to notify errors or events and the kernel provides a helper function to handle such notifications and map them to specific actions depending on their severity.

Overall, we found this talk interesting to better understand the features provided by PMICs, and how these features are supported by Linux.

Slides: PDF
Video: Youtube