Wrapping up the Allwinner VPU crowdfunded Linux driver work

Back in early 2018, Bootlin started a crowd-funding campaign to fund the development of an upstream Linux kernel driver for the VPU found in Allwinner processors. Thanks to the support from over 400 contributors, companies and individuals, we have been able to bring support for hardware-accelerated video decoding in the mainline Linux kernel for Allwinner platforms.

From April 2018 to end of 2019, Paul Kocialkowski and Maxime Ripard at Bootlin worked hard on developing the driver and getting it accepted upstream, as well as developing the corresponding user-space components. We regularly published the progress of our work on this blog.

As of the end of 2019, we can say that all the goals defined in the Kickstarter have been completed:

  • We have an upstream Linux kernel for the Allwinner VPU, in drivers/staging/media/sunxi/cedrus, which supports MPEG2 decoding (since Linux 4.19), H264 decoding (since Linux 5.2) and H265 decoding (will be in the upcoming Linux 5.5)
  • We have a user-space VA-API implementation called libva-v4l2-request, and which allows to use any Linux kernel video codec based on the request API.
  • We have enabled the Linux kernel driver on all platforms we listed in our Kickstarter campaign: A13/A10S/A20/A33/H3 (since Linux 4.19), A64/H5 (since Linux 4.20), A10 (since Linux 5.0) and H6 (since Linux 5.1, contributed by Jernej Skrabec)

This means that the effort that was funded by the Kickstarter campaign is now over, and from now on, we are operating in maintenance mode regarding the cedrus driver: we are currently not actively working on developing new features for the driver anymore.

The Cedrus Linux driver in action

Of course, there are plenty of additional features that can be added to the driver: support for H264 encoding, support for high-profile H264 decoding, support for other video codecs. Bootlin is obviously available to develop those additional features for customers, do not hesitate to contact us if you are interested.

Overall, we found this experience of funding upstream Linux kernel development through crowd-funding very interesting and we’re happy to have been successful at delivering what was promised in our campaign. Looking at the bigger picture, the Linux userspace API for video decoding with stateless hardware codecs in V4L2 has been maturing for a while and is getting closer and closer to being finalized and declared a stable kernel API: this project has been key in the introduction of this API, as cedrus was the first driver merged to require and use it. Additional drivers are appearing for other stateless decoding engines, such as the Hantro G1 (found in Rockchip, i.MX and Microchip platforms) or the rkvdec engine. We are of course also interested in working on support for these VPUs, as we have gained significant familiarity with all things related to hardware video decoding during the cedrus adventure.

Linux 5.4 released, Bootlin contributions inside

LinuxThis time around, we’re quite late to the party, but Linux 5.4 was indeed released a number of weeks ago, and once again, Bootlin contributed a number of patches to this Linux kernel release. As usual, the most useful source of information to learn about the major features brought by Linux 5.4 are the LWN articles (part 1, part 2) and the KernelNewbies Wiki.

With a total of 143 patches contributed to this release, Bootlin is the 17th contributing company by number of commits acccording to the Linux Kernel Patch Statistic.

Here are the highlights of our contributions:

  • Antoine Ténart contributed support for IEEE 1588 Precision Time Protocol (PTP) to the Microsemi Ocelot Ethernet switch driver, which Bootlin developed and upstreamed in 2018 (see our blog post)
  • In the MTD subsystem, a number of contributions to the spi-nor support, written originally by Boris Brezillon, made their way upstream.
  • In the support of Microchip (formerly Atmel) platforms, Kamel Bouhara, who joined Bootlin in September 2019, sees his first kernel contribution merged as a Bootlin engineer: dropping useless support for platform_data from the Atmel PWM driver.
  • In the support of Allwinner platforms
    • Maxime Ripard contributed a brand new driver for the Allwinner A10 camera interface driver, a driver that we started at Bootlin for the CHIP platform back in the days, and that we finished more recently.
    • Maxime Ripard contributed a significant number of improvements to the sun4i-i2s audio interface driver, especially TDM support, which was developed as part of a customer project at Bootlin.
    • Maxime Ripard also contributed numerous enhancements to Allwinner platform Device Tree files, especially in the area of using YAML schemas.
  • In the support for Marvell platforms
    • Grégory Clement added cpufreq support to the Marvell Armada 7K/8K platform by extending some of its clock drivers.
    • Miquèl Raynal contributed improvements to the Marvell CP110 COMPHY driver, which is used to control SERDES lanes on the Marvell Armada 7K/8K platforms, and added the description of the SERDES lanes used by various IP blocks in those processors.
  • Alexandre Belloni, as the RTC subsystem maintainer, did a number of fixes and improvements in several RTC drivers (mainly pcf2123 and pcf8563)
  • For the LPC3250 platform, for which Bootlin delivered a modern BSP to a customer last year, Alexandre Belloni fixed an issue in the lpc_eth network driver, which was preventing the system from booting if the network had been initialized by the bootloader.

In addition to being contributors, some Bootlin engineers are also maintainers of various parts of the Linux kernel, and as such review and merge code from other contributors:

  • As the RTC subsystem maintainer and Microchip platform co-maintainer, Alexandre Belloni merged 47 patches from other contributors
  • As the MTD subsystem co-maintainer, Miquèl Raynal merged 33 patches from other contributors
  • As the Marvell platform co-maintainer, Grégory Clement merged 11 patches from other contributors

Here are the details of all our contributions to Linux 5.4:

Security contributions to OpenWrt: dm-verity and SELinux

OpenWrtWhile Bootlin is largely known for its expertise with the Buildroot and Yocto/OpenEmbedded build systems, we do also work with other build systems for customer projects. Specifically, in 2019, we have worked for one of our customers on extending OpenWrt to add support for two security features: dm-verity and SELinux, which we have contributed to upstream OpenWrt. In this blog post, we provide some details about those features, and how they are integrated in OpenWrt.

dm-verity support

dm-verity is a device mapper target that allows to create a block device on top of an existing block device, with a transparent integrity checking in-between. Provided a tree of per-block hashes that is generated offline, dm-verity will verify at run-time that all the data read from the underlying block device matches the hashes that are provided. This allows to guarantee that the data has not been modified, as a root hash must be passed from a trusted source when setting up the dm-verity block device at boot time. If any bit in the storage has been modified, the verification of the hashes all the way up to the root hash will fail, and the I/O operation on the block of data being read from storage will be rejected. Therefore, dm-verity is typically used as part of a secure boot strategy, which allows the root hash to be passed by the bootloader to the kernel, where the bootloader and kernel themselves are verified by other means. Also, due to the nature of the integrity verification, dm-verity provides a read-only block device, and will therefore only work with read-only filesystems.

dm-verity was also presented in our Secure Boot from A to Z talk the Embedded Linux Conference 2018, from slide 28.

dm-verity in "Secure Boot from A to Z"

We implemented an integration of this mechanism in OpenWrt, and contributed a first version back in March, and we just sent a second version in November.

In essence, our integration consists in:

  1. Packaging the different tools that are needed to generate at build time the tree of hashes corresponding to a given filesystem image. The important package here is cryptsetup (patch 05/12), but it requires packaging a few dependencies: libjson-c (patch 04/12), popt (patch 03/12), lvm2 (patch 02/12) and libaio (patch 01/12)
  2. Extending the mechanism of OpenWrt to generate FIT images for the kernel so that it can include a U-Boot script (patch 06/12 and patch 08/12). Indeed, we have chosen to embed the root hash information inside the FIT image, as FIT image can be signed and verified by the bootloader before booting, ensuring that the root hash is part of the trusted information.
  3. Extending the squashfs filesystem image generation logic so that a dm-verity-capable image can optionally be generated (patch 09/12). If this is the case, then the squashfs image itself is concatenated with the tree of hashes, and a U-Boot script containing the details of the dm-verity image is generated. This includes the important root hash information.
  4. Backporting to the 4.14 and 4.19 Linux kernels currently supported by OpenWrt the DM_INIT mechanism that is in upstream Linux since 5.1, and which allows to setup a device mapper target at boot time using the dm-mod.create= kernel argument (patch 10/12). This allows to have the root filesystem on a device mapper block device, without the need for an initramfs to setup the device mapper target.
  5. Showing with the example of the Marvell Armada XP GP platform how to enable this mechanism on a specific hardware platform already supported by OpenWrt (patch 11/12 and patch 12/12).

For more details, you can read the cover letter of the patch series.

SELinux support

SELinux logoSELinux is a Linux security module that implements Mandatory Access Control and that is generally pretty infamously known in the Linux user community for being difficult to use and configure. However, it is widely used in security-sensitive systems, including embedded systems and as such, makes sense to see supported in OpenWrt. For example, SELinux is already supported in the Yocto/OpenEmbedded ecosystem through the meta-selinux layer, and in the Buildroot project since 2014, contributed by Collins Aerospace.

In short, the basic principle of SELinux is that important objects in the system (files, processes, etc.) are associated to a security context. Then, a policy defines which operations are allowed, depending on the security context of who is doing the operation and on what the operation takes place. This policy is compiled into a binary policy, which is loaded into the kernel early at boot time, and then enforced by the kernel during the system life. Of course, around this, SELinux provides a wide range of tools and libraries to manipulate the policy, build the policy, debug policy violations, and more.

The SELinux support in OpenWrt comes in two parts: a number of additional packages for various libraries and applications, and some integration work in OpenWrt. We will cover both in the next sections. It is worth mentioning that our work does not provide a SELinux policy specifically modified or adjusted for OpenWrt: we simply use the SELinux reference policy, which users will have to tune to their needs.

SELinux-related packages

Getting SELinux to work required a number of new packages to be added in OpenWrt. Those packages were contributed to the community-maintained package feed at https://github.com/openwrt/packages/. They were initially submitted through the mailing list, and then submitted as a pull request.

In short, it contains:

  • libsepol, the binary policy manipulation library.
  • libselinux, the runtime SELinux library that provides interfaces to ELinux-aware applications.
  • audit, which contains the user space utilities for storing and searching the audit records generated by the audit subsystem of the Linux kernel.
  • libcap-ng, which allows to use the POSIX capabilities, and is needed by policycoreutils.
  • policycoreutils, which is the set of core SELinux utilities such as sestatus, secon, setfiles, load_policy and more.
  • libsemanage, which is the policy management library.
  • checkpolicy, which is the SELinux policy compiler.
  • refpolicy, which is the SELinux reference policy.
  • selinux-python, a number of SELinux tools written in Python, especially audit2allow for policy debugging.

SELinux integration

Our second patch series, for OpenWrt itself, allows to build a SELinux-enabled system thanks to the following changes:

  • Allow to build Busybox with SELinux support, so that all the Busybox applets that support SELinux specific options such as -Z can be built with libselinux (patch 1/7)
  • Add support in OpenWrt’s init application, called procd, for loading the SELinux policy at boot time (patch 2/7). This patch has been submitted separately for integration into the procd project.
  • Add support for building a new host tool called fakeroot (patch 3/7).
  • Add support for building squashfs images with extended attributes generated by SELinux setfiles tool (patch 4/7). This is why fakeroot is needed: writing those extended attributes that store SELinux security contexts require root access, so we run the entire process within a fakeroot environment. This also requires building the squashfs tools with extended attributes support (patch 7/7).
  • Add new options to enable in the Linux kernel support for SELinux and SquashFS with extended attributes (patch 5/7 and patch 6/7).

Conclusion

Integrating those two security features in OpenWrt required numerous changes in the build system, and the corresponding patches are still under review by the OpenWrt community. We hope to see these features merged in 2020.

Back from ELCE 2019: our selection of talks

The Embedded Linux Conference Europe edition 2019 took place a few weeks ago in Lyon, France, and no less than 7 engineers from Bootlin attended the conference. We would like to highlight a selection of talks that Bootlin engineers found interesting. We asked each of the 7 engineers who attended the event to pick one talk they liked, and make a small write-up about it. Of course, many other talks were interesting and what makes a talk interesting is very subjective!

Introduction to HyperBus Memory Devices, by Vignesh Raghavendra

Talk selected by Gregory Clement

Vignesh started his talk by presenting the HyperBus from the hardware point of view. It is a bus using 8 data lines, using either a single or a differential clock as well as a bi-directional data strobe. These last two features clearly indicate that the designers of this bus seek high bandwidth. Two types of memory are available: HyperRAM and HyperFlash and the talk focused on the second one.

The read throughput of the HyperFlash can reach 400MB/s, it is compatible with SPI flash and is an alternative to the octal SPI NOR flashes. Then Vignesh presented the transactions done on the bus for the flash, which is very similar to what is done on SPI bus. He also compares the traditional parallel CFI flashes to the HyperFlash. And finally he describes the 2 types of controllers: Dedicated HyperBus Controllers and Multi IO Serial controllers.

In the last part of the talk, Vignesh presented the recently add kernel features, and future improvements.

This talk was a good introduction to this new bus, covering the hardware parts as well as software support in the Linux kernel.




[PDF]

Open Source Graphics 101: Getting Started, by Boris Brezillon

Talk selected by Paul Kocialkowski

During this talk, Boris provides a comprehensive and accessible overview of the graphics stack that supports GPUs in systems based on the Linux kernel. He also provides insight about the inner workings and architecture of GPUs. Although Boris defines himself as a not (yet) experienced Graphics developer, his talk contains all the elements needed to get a clear first idea on the topic as he covers hardware, kernel and userspace aspects.

To begin with, he explains how the GPU pipeline is split into multiple stages that are needed one after the other to generate a final image from a set of 3D models. The first stage is geometry and involves operations on the vertices that compose the 3D models, followed by rasterization where the view of the 3D scene is materialized on a 2D viewport, producing the end image. He also presents the concept of shaders: they are small dedicated programs that run on the GPU to make each stage configurable and flexible in order to produce the exact wanted result.

He then provides details about how GPU architectures implement massive parallelization to provide efficient results and also details some of the pitfalls that can occur with this approach. After that, Boris presents how the main CPU interacts with the GPU, introducing the concept of a command stream to submit jobs to the GPU.

With all these concepts laid out comes the time for him to present how the software stack is organized to support GPUs. After a general overview, specific aspects are presented. This includes graphics APIs such as OpenGL and Vulkan (and how they follow distinct paradigms) but also covers topics related to Mesa internals such as intermediate representations or windowing system integration. Kernel aspects are not forgotten either and a rationale regarding the (unusual) kernel/userspace separation in place for GPUs is also provided to clarify prominent design choices.

This talk succeeds at providing an introduction to GPUs and 3D rendering that can be understood without specific graphics knowledge while also giving a good idea of how the supporting software stack is organized. It is highly recommended for anyone interested in learning more on these topics!




[PDF]

Learning the Linux Kernel Configuration Space: Results and Challenges, Matthieu Acher

Talk selected by Michael Opdenacker

TuxML (Linux and Machine Learning) is an open-source research project aiming at exploring the Linux kernel configuration space through machine learning. With more than 15,000 configuration parameters, the Linux kernel now has up to 106000 possible configurations. Compare this figure to 1080, the approximate number of atoms in the universe.

As it is not possible to test all such configurations (all the more as each takes about 10 minutes to compile), the goal of the project is to predict “interesting” configurations, that could expose distinct bugs.

Starting from random configurations (from make randconfig), they use statistical learning to eventually pinpoint sets of parameters causing build failures, and avoid testing configurations that are expected to fail. This way, TUXML can be more efficient in exploring the configuration space and find bugs.

That’s typically where researchers can help us engineers and Linux kernel contributors. You need a solid theoretical background in machine learning to process data efficiently.

On Linux 4.13, the research team has managed to explore more than 15,000 different configurations through more than 95,000 hours of computing, eventually to find (and fix) 16 bugs in Linux. Some of these bugs may not come as a surprise for experienced kernel developers, but some others could expose unexpected issues that a human user may not find spontaneously.

They are also trying to use their data to predict the impact of configuration parameters on kernel size, but it turns out that size is hard to predict. At least, they managed to find “influential” options, some expected ones, and some less expected ones, deserving further investigation.

This project looks definitely useful for improving the test coverage of the Linux kernel, by working smarter than trying to compile purely random configurations. Your help is needed for testing, investigating and fixing kernel bugs, and giving your feedback.




[PDF]

Going beyond printk messages, Sergio Prado

Talk selected by Miquèl Raynal

Sergio gave a talk about debugging with an interesting approach: he started by acknowledging that today, printk is very widely used to do serious debugging but that in some cases it would be much more efficient to use other tools. Indeed there are plenty of open source tools available out there so why don’t we use them?

He presented a table indicating, from his point of view, which tool would best fit a given problem and then enumerated a few tips and commands that everybody can use to understand what went wrong in their kernel, for instance after a panic.

Is addr2line the best way to avoid printk messages right after a panic? or maybe the Linux script faddr2line? or even GDB? Maybe you don’t have access to the panic trace yet, in this case you could be interested in looking at pstore or kdump?

Or maybe an issue will more efficiently be hunted with tracing, in this case Sergio shown several static and dynamic options: using tracepoints, kprobes, ftrace, and proposed many others.

Lock-ups and memory leaks are also covered in the slides (see below), but not in the video because unfortunately the 35 minutes slot allowed was not enough for Sergio to detail all these interesting debugging methods. We wish he had more time to give all his feedback around these underused -while powerful- tools!




[PDF]

Supporting Video (de)serializers in Linux: Challenges and Works in Progress

Talk selected by Thomas Petazzoni

Luca Ceresoli’s talk at this ELCE is a good example of an interesting talk, as it combines an introduction to new hardware, what is the status of the support for this hardware in Linux, and what are the challenges to overcome to complete the integration of the hardware support in the kernel, with some open discussion.

Luca’s talk was about the support for video serializers and deserializers, with a focus on camera support. Cameras are usually connected to a system-on-chip using a parallel interface, a MIPI CSI interface or some LVDS interface. However, these interfaces only work for very short distances between the system-on-chip and the camera, and may not work well in electromagnetically noisy environments. For such situations, there are some technical solutions that consists in serializing the camera stream in a fast robust link (typically a coax cable) and then deserialize it before it is captured by the SoC through a standard camera interface. This fast robust link of course transports the stream data itself, but can also transport control information (GPIO status, I2C bus to talk to the remote camera sensor).

There are two main technologies today implementing this: the GMSL technology from Maxim and the FDP-LinkIII technology from Texas Instruments. Luca’s focus is on the latter technology, since that’s what he has been working on for the past months.

After this hardware introduction, Luca gave a status of the different patch series that have been posted by various contributors (himself included) on the Linux kernel mailing lists: some preliminary support for GMSL has been posted by Kieran Bingham, and some preliminary support for FDP-LinkIII has been posted by Luca.

Luca then presented the ideal implementation to support these interfaces, but then quickly dived into the troubles and tribulations: there is no support for stream multiplexing in Video4Linux currently, there is no support for parts of a V4L pipeline going faulty, and there is no support for hotplugging in V4L.

Then, there are some challenges with how to handle the remote I2C bus offered by those serializers/deserializers. Since camera sensors often have the same I2C address, the serializers/deserializers often have some sort of “solution” to this: an I2C switch in the GMSL (de)serializers, and a translation table for I2C addresses in FDP-LinkIII (de)serializers. Luca discussed how these are currently supported in Linux.

At the end of the talk, quite a bit of discussion took place, both about the V4L issues and the I2C issues raised in Luca’s talk. Overall, it was a useful talk if you’re interested in this specific topic.




[PDF]

V4L2: A Status Update

Talk selected by Maxime Chevallier

As someone not very familiar with the V4L2 Framework, I was pleasantly surprised that Hans’ talk was done in such a way that both experienced and beginner developers were able to follow.

Hans started with a quick introduction to the various concepts of video encoding and decoding that needs to be understood to follow the highly technical explanations of the current status of stateless and stateful codecs support.

Besides describing the technical challenge of implementing such support, Hans gave a good overview of the challenges that are faced by the community, focusing on the necessity of having good testing tools, such as the new vicodec driver.

He described the complexity of implementing support for Stateless decoders (where the hardware decoder doesn’t keep track of the state, this has to be done in software), and explained that the new Request API is a good step towards achieving such support, with 2 decoders already supported in the staging area.

Hans then explained the userspace APIs that are to be used when dealing with Stateless decoders, starting some interesting discussions along the way.

All in all, such a talk is a good example of how we can use events such as ELCE to both give good technical insight on existing frameworks, but also to trigger discussions about the ongoing and future work amongst the active developers and maintainers that are brought together by the event.




[PDF]

One Build to Rule Them All: Building FreeRTOS & Linux Using Yocto

Talk selected by Alexandre Belloni

In this talk, Alejandro Hernandez explains how to build FreeRTOS using OpenEmbedded.

He starts by explaining the use case for building both FreeRTOS and a Linux system using the same build system, in this case OpenEmbedded.

He then shows the meta-freertos layer he developped to get OpenEmbedded to build FreeRTOS. The toolchain he used is fairly classic with GCC, binutils, gdb. The main difference is that newlib is used as the C library.
meta-freertos then depends on previous work that has been done, integrating a newlib and libgloss recipes in oe-core. The core of meta-freertos is then a class, freertos-app.bbclass, allowing to abstract many details allowing to build a FreeRTOS application and image for the target. A poky-freertos distribution configuration is also provided.

Alejandro then demoes multiple FreeRTOS applications.

Finally, he goes over multiconfig, the multiple configuration build dependencies, allowing OpenEmbedded to build an image using a configuration but depending on tasks using a different configuration. In other words, this allows to build a Linux system image after building a FreeRTOS application so it can be included in the image. This is very useful in the case Linux is running on the application processor and needs to load FreeRTOS on a smaller processor.




[PDF]

Authenticate and Encrypted Storage on Embedded Linux

Talk selected by Kamel Bouhara

Jan’s talk is introducing us to the current authentication and encryption methods that go on top of the Linux storage stack.

He started reminding us some basic crypto terminologies and then depicted all the existing technologies by the storage they fit into.

For block device storage dm-verity is a good choice to verify integrity of read-only filesystems and the verification is done on each node of a hash tree. For a file or application based verification fsverity is a more relevant tool as it allows on-demand verification.

On raw NAND devices, the integrity should be checked using the UBI filesystem associated to an HMAC or image signature authentication with a root key. This solution can be completed with fscrypt to encrypt specific data on the filesytem.

For the encryption stage, Jan mentioned the ecryptfs project, which is not maintained anymore and could be well replaced by fscrypt which allows to hold several keys in the same filesystem in a multi-user environment. It is therefore a good alternative to dm-crypt which is a block-based based encryption solution used on large block devices and it is not protected against replay attacks using old blocks.

For a TPM based authentication, he recommendeds using the kernel integrity subsystem called IMA/EVM, which is a layout on top of other filesystem, the project Keylime is good example for this: https://sched.co/TLCY.

Jan shared some good practices on how to manage the Master key storage like not using a password based key, if possible use hardware capabilities like ARM TrustZone and OPTEE and use of a verified boot and key wrapping for the master key.




[PDF]

Back from ELCE 2019: our talks videos, slides, and more!

With 8 engineers participating to the Embedded Linux Conference Europe, almost the entire Bootlin engineering team took part to the conference. As usual, we not only attended the event, but also contributed by giving a total of 5 talks and 2 tutorials, for which we’re happy to share below the videos and slides. Also, as part of this conference, Bootlin CTO Thomas Petazzoni received an award for his contribution to the conference.

Buildroot, what’s new ?

Talk given by Thomas Petazzoni, slides in PDF and slides source code.

Timing boot time reduction techniques

Talk given by Michael Opdenacker, slides in PDF, slides source code.

Integrating hardware-accelerated video decoding with the display stack

Talk given by Paul Kocialkowski, slides in PDF, slides source code.

RTC subsystem, recent changes and where it is heading

Talk given by Alexandre Belloni, slides in PDF, slides source code.

Flash subsystems status update

Talk given by Miquèl Raynal (from Bootlin) and Richard Weinberger (from sigma star gmbh), slides in PDF, slides source code.

Offloading network traffic classification to hardware

Talk given by Maxime Chevallier, slides in PDF, slides source code.

Introduction to Linux Kernel driver programming

Tutorial given by Michael Opdenacker, slides in PDF, slides source document. The video is not yet available, but should be published in the future.

Introduction to the Buildroot embedded Linux build system

Tutorial given by Thomas Petazzoni, slides in PDF, slides source code. The video is not yet available, but should be published in the future.

Award to Thomas Petazzoni

During the traditional closing game of the conference, we were really happy to have Bootlin’s CTO Thomas Petazzoni called on stage, to receive from the hands of Tim Bird, an award for his continuous 11 year participation to the conference, with 24 presentations given, one keynote and for the past two years, participation to the conference program committee. We are honored and proud by this recognition of Thomas contribution to the conference.

Thomas Petazzoni receives ELCE conference award

Thomas Petazzoni receives ELCE conference award

Thomas Petazzoni receives ELCE conference award