The Kernel Recipes conference, to be held in Paris on September 26-28, has become over the years a very popular conference on kernel topics, with its atypical single-track format and limitation to 100 attendees. With this conference taking place in Paris, France, Bootlin engineers obviously never missed the chance to attend, and this year again, several of us will participate to the event.
The Embedded Linux Conference is one of the most important events in the embedded Linux industry, and Bootlin has been participating to this event non-stop since its creation in 2007. So it should be no surprise that we will once again be participating to the 2018 edition of this conference, which will take place on October 22-24 in Edinburgh, Scotland.
Of course, we are not only attending, but also giving a number of talks and tutorials:
Monday 22 October at 11:15, Maxime Ripard is giving a talk Supporting Hardware Codecs in a Linux system, in which he will explain how HW video decoders/encoders are supported in the Video4Linux kernel subsystem, and share his experience working on the Allwinner VPU support in Linux.
Monday 22 October at 12:05, Antoine Ténart and Maxime Chevallier are giving a common talk Networking: From the Ethernet MAC to the Link Partner, in which they will demystify a number of acronyms and technologies used in networking hardware, and detail how Ethernet MAC and PHY are represented and managed in Linux.
Thomas Petazzoni will be giving a tutorial as part of the Embedded Apprentice Linux Engineer track, titled Getting started with Buildroot. This tutorial has not been scheduled yet.
Michael Opdenacker will also be giving a tutorial as part of the Embedded Apprentice Linux Engineer track, titled Introduction to Linux kernel driver programming. This tutorial has not been scheduled yet.
While Bootlin CEO Michael Opdenacker was in the Embedded Linux Conference Europe Program Commitee for a number of years, he’s been replaced this year by Bootlin CTO Thomas Petazzoni. Bootlin was thus involved in the daunting but very interesting task of reviewing and selecting the talks to compose the program of this year’s event.
This is going to be a very busy week for us, and we are looking forward to attending the great talks proposed by all other speakers, and meeting the embedded Linux community once again!
A group of Linux kernel developers is organizing on September 11-14 the Alpine Linux Persistence and Storage Summit, a meeting of kernel developers to discuss the hot topics in Linux storage and file systems, such as persistent memory, NVMe, multi-pathing, raw or open channel flash and I/O scheduling.
Bootlin engineers Boris Brezillon, who is the co-maintainer of the MTD subsystem in the Linux kernel, and Miquèl Raynal, who is the co-maintainer of the NAND subsystem in the Linux kernel, will be attending this event. Through this participation, Bootlin is supporting the work done by its engineers acting as Linux kernel maintainers: they will have the chance to meet other kernel developers and discuss the current issues and future of storage-related subsystems. After the event, we will be reporting on our blog about the discussions that took place.
As discussed in our previous blog post, Bootlin had again a strong presence at the Embedded Linux Conference North-America, with 8 attendees, 5 talks, one BoF and two E-ALE tutorial sessions.
In this blog post, we would like to highlight a number of talks from the conference that we found interesting. Each Bootlin engineer who attended the conference has selected one talk, and gives his/her feedback about this talk.
Device Tree BoF – Frank Rowand
Talk selected by Michael Opdenacker
The Device Tree BoF (Birds of a Feather session, which means an informal session about a technical topic, allowing participants to openly share questions and information) has been part of Embedded Linux Conferences for at least 2 or 3 years. For me, it has always been a good source of updates about the topic.
Frank started by sharing details about the Device Tree Workshop held in October in Prague, a one day meet-up and workshop for Device Tree contributors (like Maxime Ripard and Thomas Petazzoni from Bootlin who were invited), to address issues and plan for the next months. Slides and notes can be found on elinux.org.
Frank then went on by mentioning utilities, such as:
scripts/dtc/dt_to_config. It is not very new in the mainline kernel, but useful to generate a kernel configuration suitable for the devices present on your platform, in case you didn’t know this tool exists.
There’s an upcoming patch adding options to dtc (--annotate --full) to keep track of the line numbers in the device tree sources. This helps to locate in which DT source file a given property value comes from. The patch was idle for some time but Julia Lawall volunteered to take care of it. Thanks to her updates, this feature should be accepted in mainline soon.
The device tree compiler in mainline has also been augmented with further build checks. You can now use them by adding W=1 to make dtb. Here is an example for the Beagle Bone Black dtb:
make W=1 am335x-boneblack.dtb
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/i2c@44e0b000/tda19988 has a reg or ranges property, but no unit name
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/i2c@44e0b000/tda19988/ports/port@0 has a unit name, but no reg property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/i2c@44e0b000/tda19988/ports/port@0/endpoint@0 has a unit name, but no reg property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/ethernet@4a100000/slave@4a100200 has a unit name, but no reg property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/ethernet@4a100000/slave@4a100300 has a unit name, but no reg property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (unit_address_vs_reg): Node /ocp/lcdc@4830e000/port/endpoint@0 has a unit name, but no reg property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (simple_bus_reg): Node /ocp/l4_wkup@44c00000/prcm@200000/clocks missing or empty reg/ranges property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (simple_bus_reg): Node /ocp/l4_wkup@44c00000/prcm@200000/clockdomains missing or empty reg/ranges property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (simple_bus_reg): Node /ocp/l4_wkup@44c00000/scm@210000/scm_conf@0/clocks missing or empty reg/ranges property
arch/arm/boot/dts/am335x-boneblack.dtb: Warning (simple_bus_reg): Node /ocp/l4_wkup@44c00000/scm@210000/clockdomains missing or empty reg/ranges property
As far as I am concerned, the most interesting news remained the one that since Linux 4.15, device tree overlays are now easier to code. You no longer have to define weird “fragment” elements. You can now directly write normal nodes and use labels. The syntax is now exactly the same as for regular device tree sources!
For more details, grab the slides and if you event want to follow the discussions that happened that day, watch the video.
Tutorial: Introduction to Reverse Engineering – Mike Anderson
Talk selected by Quentin Schulz
Mike presented in an unusual 2-hour-long slot the different techniques to reverse-engineer things and the different reasons why you’d do so. After a mandatory disclaimer that reverse engineering may be illegal in some regions of the world, he introduced the different tools that anyone aspiring to reverse engineer should possess: from the obvious logic analyzer, multimeter, screwdrivers to the surprising heat gun. He then gave the first and very important step of the reverse engineering process: gathering information about the product by looking for patents, the FCC registration, manufacturer as well as carefully opening its case to examine the different components (maybe with the help of a microscope).
Later, Mike gave the multiple ways to retrieve the firmwares from the product, from the soldering of a JTAG interface to the downloading from the official website. He then offered some tools that can be used to dive into binaries and start the guessing game, and he finished his talk with an example of a reverse engineering of a protocol which required a lot of guessing and social reverse engineering.
Mike’s talk was pleasant to attend because of the high-level presentation of how to do reverse engineering while giving a quick real-life example.
Graphics Performance Analysis with FrameRetrace: A Responsive UI for ApiTrace – Mark Janes, Intel
Talk selected by Boris Brezillon
I first heard of FrameRetrace when Eric Anholt asked us to add support for the VC4 GPU to this tool, and my experience with it had been rather frustrating in that I was mainly struggling to make it work on a not yet supported architecture instead of being a simple user. So, when I saw that Mark was giving a talk on FrameRetrace usefulness and how to use it, I figured I couldn’t miss it. Well, those who looked carefuly at the schedule know I couldn’t attend it because I was giving my talk at the same time, but I did see it at FOSDEM a month before, and I’m pretty sure not much has changed since then.
Mark first described the GPU debugging/perf-anlysis tools ecosystem, saying that each GPU vendor has its own proprietary tools which most of the time are only supported on Windows. FrameRetrace is an attempt at providing a tool that exposes similar features while being open-source, cross-platform, and easily extensible to new hardware. This project is actually based on an exisiting project called ApiTrace, which it uses to capture OpenGL traces. The new feature that is interesting in FrameRetrace is that you can select the frame you want to replay, get all the hardware perf counters exposed by the GPU for this specific rendering job in order to figure out what is going wrong and then play with the OpenGL code to see how you can make things better and replay the rendering job with your local modification to see if it actually solves the problem.
I must admit I was really impressed by the demo, and now I understand why Eric (and others in the community) would like to have their GPU properly supported in FrameRetrace. It really looks like the kind of tool you don’t know you need until you’ve tested it, but once you do, you can’t do without it.
The Salmon Diet: Up-Streaming Drivers as a Form of Optimization – Gilad Ben-Yossef
Talk selected by Miquèl Raynal
Gilad was hired about a year ago to become the maintainer of the ARM® TrustZone® CryptoCell® device driver. Until now this driver was out of tree until it has been decided to upstream it. Here starts Gilad’s story.
It appeared that the right way to make it upstream was to go through the staging tree and the whole process around it. It was the first time for him to do it that way, that is why he felt it was interesting to share his experience.
At the beginning of his talk, he recalls that the driver was actually working, people already relied on it. Plus, it was released under the GPL. While all of this could make you think it was clean enough, Gilad realized that people who wrote it actually did not think about upstreaming and almost every patch to clean that driver removed more lines than it added, shrinking step by step the driver until 30% of the lines were removed!
Of course, removing the existing hardware abstraction macros was something to do, as well as running and correcting the whole checkpatch.pl output, but there are plenty of other good habits that one can adapt to his own situation, explained and well illustrated all along this talk.
Measuring and Summarizing Latencies using the Trace Event Subsystem – Tom Zanussi
Talk selected by Maxime Chevallier
Having some experience dealing with RT topics on Linux, I was looking forward to seeing Tom’s talk about these tracing features.
He gave really good examples on how to use the existing ‘latency histogram’ traces to get a cyclictest-like metric of wakeup latencies by measuring the time between sched_waking and sched_switch, and explaining how this could be re-used for other measurements such as network latencies.
What he presented was more than just having a trace in a buffer when a function is called. The tracing subsystem allows the use of handlers to perform actions when an event occurs. As an example, he demonstrated how to use the onmax handler to accumulate the maximum wakeup latency observed, each time saving crucial pieces of information on the execution context.
He then went on to describe the next-level features that are being merged, namely function events by Steven Rostedt, and Inter events by Tom himself. They allow the user to use any of the kernel functions as tracepoints, and build complex events and traces to pinpoint really specific sequences.
I recommend to read this LWN article on inter-event tracing, and of course have a look at Tom’s talk and slides.
Steering Xenomai into the Real-Time Linux Future – Jan Kiszka
Talk selected by Thomas Petazzoni
In this talk, long-term Xenomai developer and Siemens engineer Jan Kiszka gave a very interesting status of the Xenomai project and its roadmap. He started by refreshing the audience about what Xenomai is: an RTOS-to-Linux portability framework. It comes in two flavors: a co-kernel extension for a patched Linux kernel, and as libraries running for native Linux (including PREEMPT_RT). He then went on to compare the respective advantages and drawbacks of the two flavors, citing accurate modeling of legacy RTOS behavior and strong separation of real-time vs. non real-time code as the key advantages of the co-kernel approach.
Jan then summarized the history of Xenomai, from the early days as a sub-project of RTAI to the current status of Xenomai 3.0, released in 2015 after more than 5 years of development. However, even though Xenomai is widely used in the industry, its development relies on just a few individuals. Siemens is a heavy user of Xenomai, and in 2017, they started a discussion: should they migrate away from Xenomai or invest into it. Siemens made the decision to invest in the project. The same year, Xenomai main developer Philippe Gerum published an e-mail RTnet, Analogy and the elephant in the room also calling for help to maintain some parts of Xenomai.
Following those discussions, some changes were decided in the Xenomai project: Philippe Gerum will step back from the project lead, and Jan Kiszka will take over his role.
Regarding the I-Pipe kernel patch (which allows to support the co-kernel approach), the Xenomai project will discontinue a number of architectures (nios2, SH, Blackfin, PPC64, ARM < v7) and will only maintain patches for the latest Linux kernel LTS, in order to reduce the maintenance effort.
Jan announced that Xenomai 3.0.7 is soon to be released, that Xenomai 3.1 will introduce ARM64 support, and that Xenomai 2 is unmaintained and therefore users should migrate to Xenomai 3. He also gave a status on the driver stacks, citing that RTnet needs more love, and that Analogy is orphaned and needs a new maintainer.
Towards the end of his talk, Jan then started discussing the more distant future of Xenomai. The future version of Xenomai has the goal of improving the integration of the co-kernel approach, to simplify maintenance and possibly provide a chance to be upstreamed in Linux. This new approach will be split in two elements, called Dovetail (interrupt routing, co-kernel hooks) and Steely (co-kernel implementation). He made it clear that this is currently not production-ready at all. He noted that this new implementation allows a significant reduction of the code base, about 50% smaller than the current implementation. The code is already available in two Git repositories: Dovetail and Steely.
All in all, Jan’s talk was a very interesting one, providing a good coverage of Xenomai’s status and future. The video of his talk is definitely worth watching, and the slides are also available.
An Introduction to Asymmetric Multiprocessing: When this Architecture can be a Game Changer and How to Survive It – Nicola La Gloria & Laura Nao
Talk selected by Mylène Josserand
In this talk, Nicola La Gloria and Laura Nao from Kynetics presented how to handle communication between a micro-controller running bare metal code and a CPU with a full OS (such as GNU/Linux or Android).
They showed the different approaches for communication (supervised or not supervised: i.e. CPU and MCU can communicate using an hypervisor or directly) and presented the Inter-Processing Communication.
After this introduction, Laura told us about their use-case, running on an NXP i.MX7, which comprises a Cortex-M4 micro-controller and a Cortex-A7 processor: the MCU retrieves data from a sensor, which will be displayed by the CPU.
She gave feedback on how they implemented this communication and the different mechanisms used (Message Unit, RPMsg, RDC, kexec/kdump, etc). The explanation of the different mechanisms was really interesting, and particularly relevant for those who had never heard about them.
They did a short tutorial and gave some tips that would definitely be appreciated by people who start this kind of project. And finally, they did a demonstration of all the work they have done.
So if you are interested in the subject or even only for your general knowledge, have a look at their talk (video and slides).
System-in-Package Technology: Making it Easier to Build Your Own Linux Computer – Erik Welsh & Jason Kridner
Talk selected by Alexandre Belloni
Eric Welsh started to talk about how software influences hardware design and why open source hardware matters. He then presented the System-in-Package technology and in particular the Octavo OSD3358. It includes a TI AAM335x SoC, DDR3 SDRAM, a PMIC and all the related power circuitry, components which are always required. This allows the hardware engineers to concentrate on the added value of the final product.
Great pictures and videos of the SiP internals and its manufacturing process were shown.
Jason then came on stage to present the OSD3358 integration on the PocketBeagle.
Eric finally explained how easy it is to assemble the OSD3358 on a PCB, even by hand with a video to prove it. He finally concluded by summing up the benefits of using a SiP: easy bring-up, lower cost of PCB, easy manufacturing and migration from SBC prototyping to custom PCB.
It was quite enlightening for software engineers as it showed the hardware internals with some great details.
Have a look at the video and slides.
In conclusion, this was again a really nice opportunity to share and acquire knowledge from other engineers deeply involved in the Open Source community, as well as meeting people that we sometimes know only by their name on a mailing list. Next year this event will happen in Monterey Bay, California (March 19 – 21, 2019). See you there!
Like every year for more than 10 years, Bootlin engineers will participate to the next Embedded Linux Conference, which takes place in Portland on March 12-14. Of course, it will be our first ELC with our new company name! In total, eight engineers from Bootlin will participate to the event. Maxime Chevallier, who joined Bootlin last Monday, will be attending the conference, his first one with a Bootlin hat (but Maxime has already been a speaker at the last Embedded Linux Conference Europe).
We will also be giving a number of talks, tutorials or moderating Bird of a Feather sessions:
Miquèl Raynal will give a talk titled Drive your NAND with Linux, sharing his experience rewriting the NAND controller driver for Marvell platforms, significantly improving the NAND core subsystem along the way, making it more flexible to support advanced NAND controllers.
We’re really happy to again meet the embedded Linux open-source community at this event! It is worth mentioning that following this event, Bootlin CTO Thomas Petazzoni will be in the Silicon Valley on March 15-16, available for business meetings: do not hesitate to contact us if you’re interested.
Beyond participating to the event, Maxime and Thomas also presented briefly on two topics:
Maxime Ripard brought up the topic of handling foreign DT bindings (see slides). Currently, the Device Tree bindings documentation is stored in the Linux kernel source tree, in Documentation/devicetree/bindings/. However, in theory, bindings are not operating-system specific, and indeed the same bindings are used in other projects: U-Boot, Barebox, FreeBSD, Zephyr, and probably more. Maxime raised the question of what these projects should do when they create new bindings or extend existing ones? Should they contribute a patch to Linux? Should we have a separate repository for DT bindings? A bit of discussion followed, but without getting to a real conclusion.
Thomas Petazzoni presented on the topic of avoiding duplication in Device Tree representations (see slides). Recent Marvell Armada processors have a hardware layout where a block containing multiple IPs is duplicated several times in the SoC. In the currently available Armada 8040 there are two copies of the CP110 hardware block, and the Linux kernel carries a separate description for each. While very similar, those descriptions have subtle differences that make it non-trivial to de-duplicate. However, future SoCs will not have just 2 copies of the same hardware block, 4 copies or potentially more. In such a situation, duplicating the Device Tree description is no longer reasonable. Thomas presented a solution based on the C pre-processor, and commented on other options, such as a script to generate DTs, or improvements in the DT compiler itself. A discussion around those options followed, and while tooling improvements were considered as being the long-term solution, in the short term the solution based on the C pre-processor was acceptable upstream.
For Bootlin, participating to such events is very important, as it allows to expose to kernel developers the issue we are facing in some of our projects, and to get direct feedback from the developers on how to move forward on those topics. We definitely intend to continue participating in similar events in the future, for topics of interest to Bootlin.
The Netdev 2.2 conference took place in Seoul, South Korea. As we work on a diversity of networking topics at Bootlin as part of our Linux kernel contributions, Bootlin engineers Alexandre Belloni and Antoine Ténart went to Seoul to attend lots of interesting sessions and to meet with the Linux networking community. Below, they report on what they learned from this conference, by highlighting two talks they particularly liked.
David S. Miller gave a keynote about reducing the size of core structures in the Linux kernel networking core. The idea behind his work is to use smaller structures which has many benefits in terms of performance as less cache misses will occur and less memory resources are needed. This is especially true in the networking core as small changes may have enormous impacts and improve performance a lot. Another argument from his maintainer hat perspective is the maintainability, where smaller structures usually means less complexity.
He presented five techniques he used to shrink the networking core data structures. The first one was to identify members of common base structures that are only used in sub-classes, as these members can easily be moved out and not impact all the data paths.
The second one makes use of what David calls “state compression”, aka. understanding the real width of the information stored in data structures and to pack flags together to save space. In his mind a boolean should take a single bit whereas in the kernel it requires way more space than that. While this is fine for many uses it makes sense to compress all these data in critical structures.
Then David S. Miller spoke about unused bits in pointers where in the kernel all pointers have 3 bits never used. He argued these bits are 3 boolean values that should be used to reduce core data structure sizes. This technique and the state compression one can be used by introducing helpers to safely access the data.
Another technique he used was to unionize members that aren’t used at the same time. This helps reducing even more the structure size by not having areas of memory never used during identified steps in the networking stack.
Finally he showed us the last technique he used, which was using lookup keys instead of pointers when the objects can be found cheaply based on their index. While this cannot be used for every object, it helped reducing some data structures.
While going through all these techniques he gave many examples to help understanding what can be saved and how it was effective. This was overall a great talk showing a critical aspect we do not always think of when writing drivers, which can lead to big performance improvements.
WireGuard: Next-generation Secure Kernel Network Tunnel — slides — video
Jason A. Donenfeld presented his new and shiny L3 network tunneling mechanism, in Linux. After two years of development this in-kernel formally proven cryptographic protocol is ready to be submitted upstream to get the first rounds of review.
The idea behind Wireguard is to provide, with a small code base, a simple interface to establish and maintain encrypted tunnels. Jason made a demo which was impressive by its simplicity when securely connecting two machines, while it can be a real pain when working with OpenVPN or IPsec. Under the hood this mechanism uses UDP packets on top of either IPv4 and IPv6 to transport encrypted packets using modern cryptographic principles. The authentication is similar to what SSH is using: static private/public key pairs. One particularly nice design choice is the fact that Wireguard is exposed as a stateless interface to the administrator whereas the protocol is stateful and timer based, which allow to put devices into sleep mode and not to care about it.
One of the difficulty to get Wireguard accepted upstream is its cryptographic needs, which do not match what can provide the kernel cryptographic framework. Jason knows this and plan to first send patches to rework the cryptographic framework so that his module nicely integrates with in-kernel APIs. First RFC patches for Wireguard should be sent at the end of 2017, or at the beginning of 2018.
We look forward to seeing Wireguard hit the mainline kernel, to allow everybody to establish secure tunnels in an easy way!
Netdev 2.2 was again an excellent experience for us. It was an (almost) single track format, running alongside the workshops, allowing to not miss any session. The technical content let us dive deeply in the inner working of the network stack and stay up-to-date with the current developments.
Thanks for organizing this and for the impressive job, we had an amazing time!
During the closing session of this conference, Bootlin CEO Michael Opdenacker has received from the hands of Tim Bird, on behalf of the ELCE committee, an award for its continuous participation to the Embedded Linux Conference Europe. Indeed, Michael has participated to all 11 editions of ELCE, with no interruption. He has been very active in promoting the event, especially through the video recording effort that Bootlin did in the early years of the conference, as well as through the numerous talks given by Bootlin.
Bootlin is proud to see its continuous commitment to knowledge sharing and community participation be recognized by this award!
As discussed in our previous blog post, Bootlin had a strong presence at the Embedded Linux Conference Europe, with 7 attendees, 4 talks, one BoF and one poster during the technical show case.
In this blog post, we would like to highlight a number of talks from the conference that we found interesting. Each Bootlin engineer who attended the conference has selected one talk, and gives his/her feedback about this talk.
uClibc Today: Still Makes Sense – Alexey Brodkin
Talk selected by Michael Opdenacker
Alexey Brodkin, an active contributor to the uClibc library, shared recent updates about this C library, trying to show people that the project was still active and making progress, after a few years during which it appeared to be stalled. Alexey works for Synopsys, the makers of the ARC architecture, which uClibc supports.
If you look at the repository for uClibc releases, you will see that since version 1.0.0 released in 2015, the project has made 26 releases according to a predictable schedule. The project also runs runtime regression tests on all its releases, which weren’t done before. The developers have also added support for 4 new architectures (arm64 in particular), and uClibc remains the default C library that Buildroot proposes.
Alexey highlighted that in spite of the competition from the musl library, which causes several projects to switch from uClibc to musl, uClibc still makes a lot of sense today. As a matter of fact, it supports more hardware architectures than glibc and musl do, as it’s the only one to support platforms without an MMU (such as noMMU ARM, Blackfin, m68k, Xtensa) and as the library size is still smaller that what you get with musl (though a static hello_world program is much smaller with musl if you have a close look at the library comparison tests he mentioned).
Alexey noted that the uClibc++ project is still alive too, and used in OpenWRT/Lede by default.
Identifying and Supporting “X-Compatible” hardware blocks – Chen-Yu Tsai
Talk selected by Quentin Schulz
An SoC is made of multiple IP blocks from different vendors. In some cases the source or model of the hardware blocks are neither documented nor marketed by the SoC vendor. However, since there are only very few vendors of a given IP block, stakes are high that your SoC vendor’s undocumented IP block is compatible with a known one.
With his experience in developing drivers for multiple IP blocks present in Allwinner SoCs and as a maintainer of those same SoCs, Chen-Yu first explained that SoC vendors often either embed some vendors’ licensed IP blocks in their SoCs and add the glue around it for platform- or SoC-specific hardware (clocks, resets and control signals), or they clone IP blocks with the same logic but some twists (missing, obfuscated or rearranged registers).
To identify the IP block, we can dig into the datasheet or the vendor BSP and compare those with well documented datasheets such as the one for NXP i.MX6, TI KeyStone II or the Zynq UltraScale+ MPSoC, or with mainline drivers. Asking the community is also a good idea as someone might have encountered an IP with the same behaviour before and can help us identify it quicker.
Some good identifiers for IPs could be register layouts or names along with DMA logic and descriptor format. For the unlucky ones that have been provided only a blob, they can look for the symbols in it and that may slightly help in the process.
He also mentioned that identifying an IP block is often the result of the developer’s experience in identifying IPs and other time just pure luck. Unfortunately, there are times when someone couldn’t identify the IP and wrote a complete driver just to be told by someone else in the community that this IP reminds him of that IP in which case the work done can be just thrown away. That’s where the community plays a strong role, to help us in our quest to identify an IP.
Chen-Yu then went on with the presentation of the different ways to handle the multiple variants of an IP blocks in drivers. He said that the core logic of all IP drivers is usually presented as a library and that the different variants have a driver with their own resources, extra setup and use this library. Also, a good practice is to use booleans to select features of IP blocks instead of using the ID of each variant.
For IPs whose registers have been altered, the way to go is usually to write a table for the register offsets, or use regmaps when bitfields are also modified. When the IP block differs a bit too much, custom callbacks should be used.
He ended his talk with his return from experience on multiple IP blocks (UART, USB OTG, GMAC, EMAC and HDMI) present in Allwinner SoCs and the differences developers had to handle when adding support for them.
printk(): The Most Useful Tool is Now Showing its Age – Steven Rostedt & Sergey Senozhatsky
Talks selected by Boris Brezillon. Boris also covered the related talk “printk: It’s Old, What Can We Do to Make It Young Again?” from the same speakers.
Maybe I should be ashamed of saying that but printk() is one of the basic tool I’m using to debug kernel code, and I must say it did the job so far, so when I saw these presentations talking about how to improve printk() I was a bit curious. What could be wrong in printk() implementation?
Before attending the talks, I never digged into printk()’s code, because it just worked for me, but what I thought was a simple piece of code appeared to be a complex infrastructure with locking scheme that makes you realize how hard it is when several CPUs are involved.
At its core, printk() is supposed to store logs into a circular buffer and push new entries to one or several consoles. In his first talk Steven Rostedt walked through the history of printk() and explained why it became more complex when support for multi CPU appeared. He also detailed why printk() is not re-entrant and the problem it causes when called from an NMI handler. He finally went through some fixes that made the situation a bit better and advertised the 2nd half of the talk driven by Sergey Senozhatsky.
Note that between these two presentations, the printk() rework has been discussed at Kernel Summit, so Sergey already had some feedback on his proposals. While Steven presentation focused mainly on the main printk() function, Sergey gave a bit more details on why printk() can deadlock, and one of the reasons why preventing deadlocks is so complicated is that printk() delegates the ‘print to console’ aspect to console drivers which have their own locking scheme. To address that, it is proposed to move away from the callback approach and let console drivers poll for new entries in the console buffer instead, which would remove part of the locking issues. The problem with this approach is that it brings even more uncertainty on when the logs are printed on the consoles, and one of the nice things about printk() in its current form is that you are likely to have the log printed on the output before printk() returns (which helps a lot when you debug things).
More robust I2C designs with a new fault-injection driver – Wolfram Sang
Talk selected by Miquèl Raynal
Although Wolfram had a lot of troubles starting its presentation lacking a proper HDMI adaptater, he gave an illuminating talk about how, as an I2C subsystem maintainer, he would like to strengthen the robustness of I2C drivers.
He first explained some basics of the I2C bus like START and STOP conditions and introduced us to a few errors he regularly spots in drivers. For instance, some badly written drivers used a START and STOP sequence while a “repeated START” was needed. This is very bad because another master on the bus could, in this little spare idle delay, decide to grab the medium and send its message. Then the message right after the repeated START would not have the expected effect. Of course plenty other errors can happen: stalled bus (SDA or SCL stuck low), lost arbitration, faulty bits… All these situations are usually due to incorrect sequences sent by the driver.
To avoid so much pain debugging obscure situations where this happens, he decided to use an extended I2C-gpio interface to access SDA and SCL from two external GPIOs and this way forces faulty situations by simply pinning high or low one line (or both) and see how the driver reacts. The I2C specification and framework provide everything to get out of a faulty situation, it is just a matter of using it (sending a STOP condition, clocking 9 times, operate a reset, etc).
Wolfram is aware of his quite conservative approach but he is really scared about breaking users by using random quirks so he tried with this talk to explain his point of view and the solutions he wants to promote.
Two questions that you might have a hard time hearing were also interesting. The first person asked if he ever considered using a “default faulty chip” designed to do by itself this kind of fault injection and see how the host reacts and behaves. Wolfram said buying hardware is too much effort for debugging, so he was more motivated to get something very easy and straightforward to use. Someone else asked if he thought about multiple clients situation, but from Wolfram’s point of view, all clients are in the same state whether the bus is busy or free and should not badly behave if we clock 9 times.
Having worked recently on a number of display related drivers, it was quite natural to go see what I was probably going to work on in a quite close future.
Hans started by talking about HDMI in general, and the various signals that could go through it. He then went on with what was basically a war story about all the mistakes, gotchas and misconceptions that he encountered while working on a video-conference box for Cisco. He covered the hardware itself, but also more low-level oriented aspects, such as the clocks frequencies needed to operate properly, the various signals you could look at for debugging or the issues that might come with the associated encoding and / or color spaces, especially when you want to support as many displays as you can. He also pointed out the flaws in the specifications that might lead to implementation inconsistencies. He concluded with the flaws of various HDMI adapters, the issues that might arise using them on various OSes, and how to work around them when doable.
Johan started his talk at ELCE by exposing the problem with how serial ports (UARTs) are currently handled in the Linux kernel. Serial ports are handled by the TTY layer, which allows user-space applications to send and receive data with what is connected on the other side of the UART. However, the kernel doesn’t provide a good mechanism to model the device that is connected at the other side of the UART, such as a Bluetooth chip. Due to this, people have resorted to either writing user-space drivers for such devices (which falls short when those devices need additional resources such as regulators, GPIOs, etc.) or to developing specific TTY line-discipline in the kernel. The latter also doesn’t work very well because a line discipline needs to be explicitly attached to a UART to operate, which requires a user-space program such as hciattach used in Bluetooth applications.
In order to address this problem, Johan picked up the work initially started by Rob Herring (Linaro), which consisted in writing a serial device bus (serdev in short), which consists in turning UART into a proper bus, with bus controllers (UART controllers) and devices connected to this bus, very much like other busses in the Linux kernel (I2C, SPI, etc.). serdev was initially merged in Linux 4.11, with some improvements being merged in follow-up versions. Johan then described in details how serdev works. First, there is a TTY port controller, which instead of registering the traditional TTY character device will register the serdev controller and its slave devices. Thanks to this, one can described in its Device Tree child nodes of the UART controller node, which will be probed as serdev slaves. There is then a serdev API to allow the implementation of serdev slave drivers, that can send and receive data of the UART. Already a few drivers are using this API: hci_serdev, hci_bcm, hci_ll, hci_nokia (Bluetooth) and qca_uart (Ethernet).
We found this talk very interesting, as it clearly explained what is the use case for serdev and how it works, and it should become a very useful subsystem for many embedded applications that use UART-connected devices.
The purpose of this talk was to show how to shrink Gstreamer to make it fit in an embedded Linux device. First, Olivier Crête introduced what GStreamer is, it was very high level but well done. Then after presenting the issue, he showed step by step how he managed to reduce the footprint of a GStreamer application to fit in his device.
In a first part it was a focus on features specific to GStreamer such as how to generate only the needed plugins. Then most of the tricks showed could be used for any C or C++ application. The talk was pretty short so there was no useless or boring part Moreover, the speaker himself was good and dynamic.
To conclude it was a very pleasant talk teaching step by step how to reduce the footprint of an application being GStreamer or not.
With its special one track format, an attendance limited to 100 people, excellent choice of talks and nice social events, Kernel Recipes remains a very good conference that we really enjoyed. Embedded Recipes, which was in its first edition this year, followed the same principle, with the same success. We’re looking forward to attending next year editions, and hopefully contributing a few talks as well. See you all at Embedded and Kernel Recipes in 2018!