Allwinner VPU support in mainline Linux status update (week 11)

After the initial submission of the Sunxi-Cedrus driver last week, I spent most of this week looking into the sun4i DRM (Direct Rendering Manager) driver. The driver is in charge of handling the display pipeline on Allwinner SoCs. Tight integration of the VPU and the display pipeline is required in order to achieve decent video playback performance. That is because the output format of the VPU is a 32×32 tiled format based on NV12, a YUV420 semi-planar format, with one plane for the Y component (luminance) and one plane for the interleaved UV components (chrominance). While NV12 is a standard format for video output, the tiling is rather specific to the VPU, so the frames have to be untiled before they can be used. This operation, when done in software, is rather slow. Moreover, software-based compositing of the decoded frames is also a bottleneck that impacts the overall performance.
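To make the cost of software untiling concrete, here is a minimal sketch of untiling a single plane in Python. The exact byte layout of the VPU's tiled format is not described in this post, so the layout used here is an assumption made for illustration only: tiles are stored left to right and top to bottom, and each 32×32 tile holds its samples in row-major order. The function name `untile_plane` is hypothetical.

```python
TILE = 32  # tile edge in samples, matching the 32x32 tiling mentioned above


def untile_plane(tiled: bytes, width: int, height: int, tile: int = TILE) -> bytes:
    """Convert a tiled plane (one byte per sample) to linear row-major order.

    Assumed layout (not confirmed by the post): tiles are stored left to
    right, top to bottom, and each tile contains tile * tile bytes in
    row-major order. Width and height must be multiples of the tile size.
    """
    if width % tile or height % tile:
        raise ValueError("dimensions must be multiples of the tile size")
    if len(tiled) != width * height:
        raise ValueError("buffer size does not match the plane dimensions")

    tiles_per_row = width // tile
    linear = bytearray(width * height)
    for ty in range(height // tile):
        for tx in range(tiles_per_row):
            # Offset of the first byte of this tile in the tiled buffer.
            tile_base = (ty * tiles_per_row + tx) * tile * tile
            for row in range(tile):
                src = tile_base + row * tile
                dst = (ty * tile + row) * width + tx * tile
                # Copy one 32-byte tile row to its place in the linear image.
                linear[dst:dst + tile] = tiled[src:src + tile]
    return bytes(linear)
```

Under the same assumption, an NV12 frame would need this pass twice, once for the Y plane and once for the interleaved UV plane. Every byte of every frame is touched by the CPU, which is why offloading the operation to the display engine is attractive.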

In order to circumvent these issues, we will be using the display engine itself to untile the VPU output frames and show the untiled frames directly in a dedicated hardware plane, which is then composed with the primary plane. This requires several features, especially support for the display engine’s frontend, which has the required components to untile and decode the frames. Partial support for the frontend was recently contributed by Maxime Ripard and is on its way to landing in the mainline Linux kernel, providing a base for my VPU-related work. Maxime’s patches allow scaling hardware planes (among other things), a feature that will be very useful for scaling videos to the screen size in hardware rather than software (which is another major bottleneck for performance).

Support for untiling the VPU frames is approaching completion (luminance is correctly decoded while chrominance is not yet correctly handled).

Decoding the MB32 tiled format with sun4i-drm

Once the frames are properly shown on screen, it’ll be time to make sure that dmabuf works as expected, which will allow us to send buffers from the VPU to the display engine without any copy, thus improving performance.

We should be making good progress on this topic over the upcoming week and start contributing patches to the sun4i DRM driver, so stay tuned for our next status update!

Linux 4.15 released, Bootlin contributions

Drawing from Mylène Josserand,
based on a picture from Samuel Blanc under CC-BY-SA

After a busier than usual month of February, with the renaming of our company from Free Electrons to Bootlin, our participation in FOSDEM and the welcoming of Maxime Chevallier, the latest addition to our engineering team, our article on the latest release of the Linux kernel arrives a bit late, more than a month after Linux 4.15 was released by Linus Torvalds.

As usual, LWN.net provided interesting coverage of this release cycle’s merge window, highlighting the most important changes: The first half of the 4.15 merge window and The rest of the 4.15 merge window. Due to the now well-known Spectre and Meltdown vulnerabilities and the resulting effort to mitigate them, 4.15 required a -rc9, which last happened back in 2011 with 3.1, as Torvalds noted.

According to Linux Kernel Patch statistics, Free Electrons (now Bootlin) contributed 150 patches to this release, making it the 16th contributing company by number of commits.

The main highlights of our contributions are:

  • In the RTC subsystem, Alexandre Belloni made a number of improvements to various drivers, mainly making them use the nvmem subsystem where appropriate, and use the recently introduced rtc_register_device() API.
  • In the MTD subsystem, both Boris Brezillon and Miquèl Raynal made a number of contributions, mainly fixes.
  • For Marvell platforms
    • Antoine Ténart contributed a few fixes to the inside-secure crypto accelerator driver, used on Marvell Armada 3700 and Armada 7K/8K
    • Antoine Ténart also contributed fixes and improvements to the mvpp2 network driver, used for the Ethernet controller on the Marvell Armada 7K/8K. His improvements include preparation work to support Receive Side Scaling (RSS).
    • Antoine Ténart enabled more networking ports and features in some Armada 7K/8K boards, especially SFP ports on Armada 7040 DB and Armada 8040 DB.
    • Boris Brezillon contributed a few fixes to the Marvell CESA crypto accelerator driver, used on the older Orion, Kirkwood, Armada 370/XP/38x processors. He migrated the driver to use the skcipher interface of the Linux kernel crypto framework.
    • Grégory Clement enabled NAND support on Armada 7K, and contributed a number of fixes around MMC support for some Marvell boards.
    • Thomas Petazzoni contributed a few minor Device Tree enhancements for Marvell platforms: fixing MPP muxing on an older Kirkwood platform, enabling more PCIe ports on Armada 8040 DB, etc.
    • Miquèl Raynal contributed support for more advanced statistics in the mvpp2 network driver.
    • Miquèl Raynal added support for the extended UART for the Marvell Armada 3720 processor, both in the UART driver and in the Device Tree.
  • For the RaspberryPi platform, Boris Brezillon contributed a few fixes to the vc4 display driver, and added support for the new DRM_IOCTL_VC4_GEM_MADVISE ioctl, which userspace applications can use to mark inactive buffers as purgeable, so that the kernel can purge them when allocations start to fail.
  • For Allwinner platforms
    • Mylène Josserand contributed a fix for the Allwinner A83 clock driver, fixing I2C bus clocks.
    • Quentin Schulz contributed a few fixes to the sun4i-gpadc-iio.c driver, which is used for the ADCs on several Allwinner processors.
    • Maxime Ripard made a number of fixes to the sun8i-codec driver, fixing clock issues, left/right channels inversion, etc.
    • Maxime Ripard made a number of improvements to the sun4i DRM display driver.
    • Maxime Ripard improved the support for the A83 processor (described the UART1 controller, the MMC1 controller, added support for display clocks) and added the Device Tree for a new A83 device.
    • Maxime Ripard also did a number of cleanups and misc improvements in a significant number of Device Tree files for Allwinner platforms.
  • Thomas Petazzoni made a few fixes to the sh_eth network driver, used on several Renesas SuperH platforms, as part of a recent project Bootlin did on SuperH 4.

Bootlin engineers are not only contributors, but also maintainers of various subsystems in the Linux kernel, which means they are involved in the process of reviewing, discussing and merging patches contributed to those subsystems:

  • Maxime Ripard, as the Allwinner platform co-maintainer, merged 108 patches from other contributors
  • Boris Brezillon, as the MTD/NAND maintainer, merged 34 patches from other contributors
  • Alexandre Belloni, as the RTC maintainer and Atmel platform co-maintainer, merged 50 patches from other contributors
  • Grégory Clement, as the Marvell EBU co-maintainer, merged 24 patches from other contributors

Here is the commit by commit detail of our contributions to 4.15:

Allwinner VPU support in mainline Linux status update (week 10)

Just over a week ago, I started my internship focused on adding upstream Linux kernel support for the Allwinner VPU at Bootlin’s Toulouse office. The team has been super-friendly and very helpful in getting me settled and I’m definitely happy about moving to Toulouse for the occasion!

This first week of work was focused on studying and rebasing the work done by Florent Revest a year and a half ago. As a main development target, I went for an A33-based board, the SinA33 from Sinlinx. Florent’s patches for the sunxi-cedrus driver were rebased against the latest release candidate version of Linus’ tree, v4.16-rc4.

VPU decoding with Cedrus on the Sinlinx A33

The driver was then adapted to use the latest version of the V4L2 request API, a crucial piece of plumbing needed to provide coherency between setting specific controls for the media stream and the input/output buffers that these controls are related to. A few bugs needed fixing along the way, in order to avoid memory corruptions (use-after-free) and to properly schedule the VPU to run when a request is submitted. With these fixes the driver was ready, so it was sent for review on the linux-media mailing list. On the userspace side, the cedrus-specific libva was also updated to use the latest version of the request API.

The next step in the pipeline is to use a common buffer for the VPU’s decoded frame and the display controller’s plane, using dmabuf. This should bring a significant performance improvement and eventually allow for hardware-based scaling when decoding videos through the standard DRM/KMS interfaces. However, this requires adding support for the specific format used by the VPU (a multiplanar NV12 format with 32×32 tiles) into the display controller code.

Bootlin contributes a new interface to the Linux NAND subsystem

MTD stack

Over the last months, Bootlin engineers Boris Brezillon and Miquèl Raynal have been working on rewriting the NAND controller driver used on a large number of Marvell SoCs. This NAND controller driver had grown very complicated, and Miquèl’s adventure in this rework led him to contribute a new interface to the NAND framework, in order to simplify implementing NAND controller drivers for complex NAND controllers. In this blog post, Miquèl summarizes the original issue, and how it is solved by the ->exec_op() interface he has contributed.


The NAND framework is the layer between the generic MTD layer and the NAND controller drivers. Its purpose is to handle MTD requests and transform them into understandable NAND operations the controller will have to send to the NAND chip.

For general information about NANDs, the reader is invited to read the ONFI specification (Open NAND Flash Interface) which defines the most common NAND operations.

Interacting with a NAND chip

Raw NANDs (so-called “parallel NANDs”) are slave devices waiting for instructions from the controller. An operation is a sequence of instructions usually referred to as “command” (CMD), “address” (ADDR) and “data” (DATA_IN/DATA_OUT) cycles, and sometimes wait periods (WAITRDY). Some everyday operations any NAND enthusiast should know by heart are, for instance:

NAND operation example
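As a textual complement to the diagram, the sketch below models a few such operations as plain lists of instructions. The opcodes (FFh for reset, 90h for read ID, 00h/30h for read page) are the ones defined by ONFI. Everything else is an assumption made for illustration: the `Instr` type, the helper names, and the choice of two column cycles plus three row cycles, which varies between chips.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """One instruction of a NAND operation (hypothetical representation)."""
    kind: str                      # "CMD", "ADDR", "DATA_IN", "DATA_OUT" or "WAITRDY"
    payload: Tuple[int, ...] = ()  # opcode for CMD, address bytes for ADDR
    length: int = 0                # number of bytes for DATA_IN / DATA_OUT


def reset_op() -> List[Instr]:
    # RESET: a single command cycle, then wait for the chip to be ready.
    return [Instr("CMD", (0xFF,)), Instr("WAITRDY")]


def read_id_op(length: int) -> List[Instr]:
    # READ ID: command, one address cycle, then the ID bytes are read back.
    return [Instr("CMD", (0x90,)), Instr("ADDR", (0x00,)), Instr("DATA_IN", length=length)]


def read_page_op(column: int, page: int, length: int) -> List[Instr]:
    # READ PAGE: 00h, address cycles, 30h, wait, then data in.
    # Assumes 2 column cycles and 3 row cycles, least significant byte first.
    addr = (
        column & 0xFF, (column >> 8) & 0xFF,
        page & 0xFF, (page >> 8) & 0xFF, (page >> 16) & 0xFF,
    )
    return [
        Instr("CMD", (0x00,)),
        Instr("ADDR", addr),
        Instr("CMD", (0x30,)),
        Instr("WAITRDY"),
        Instr("DATA_IN", length=length),
    ]
```

Note that the length of the data transfer is part of the description of the operation, which matters for the limitations discussed in the next section.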

How it was handled in the Linux kernel

Today, a majority of NAND controller drivers implement the ->cmd_ctrl() hook. It was meant to be a very small function, designed to just send command and address cycles independently, usually embedding some very controller-specific logic. This hook was supposed to be called by a higher-level function from the NAND core, ->cmdfunc(). In addition to calling ->cmd_ctrl() to send command and address cycles, the core would also call the ->read|write_byte|word|buf() hooks to actually move data between the NAND controller and memory (the DATA parts in the diagram above).

This approach worked very well with simple NAND controllers, which are just able to send command and address cycles one at a time to the NAND chip, without any extra intelligence. However, NAND controllers have become more and more complex and can now handle higher-level operations, usually to provide higher performance. For example, a NAND controller may provide an operation that performs all of the command and address cycles of a read-page operation in one go. Some controllers even support only those higher-level operations, and are not able to simply do the basic operation of sending one command cycle or one data cycle. To handle such controllers, their drivers were overloading the ->cmdfunc() hook directly, circumventing the generic NAND core implementation of ->cmdfunc(). This is a first drawback: it is no longer possible to easily add logic to the NAND core to support new NAND operations, because some drivers overload the ->cmdfunc() logic. Worse, ->cmdfunc() doesn’t provide some information such as the length of the data transfer, which some controllers actually need in order to run the desired operation. NAND controller drivers started to have complicated state machines just to work around the NAND framework limitations.

NAND stack before exec_op

Some driver-specific implementations of this hook started diverging from the original one, giving maintainers a lot of pain to maintain the whole subsystem, specifically when they needed to introduce additional vendor-specific operations support. These implementations were not only diverse but also incomplete, sometimes buggy and most importantly, developers had to guess the data that would probably be moved by the core after that, which is clearly a symptom that the framework was not fitting the user needs anymore.

The ->exec_op() era

The NAND subsystem maintainers decided to switch to a new approach, based on a new hook called ->exec_op(), implemented by NAND controller drivers and called by the generic NAND core. The idea behind that name is to give every controller driver a generic interface that can easily be extended and that exposes the overall NAND operation to be performed. This way, the driver can optimize depending on the controller capabilities without needing a complex state machine, as was the case with ->cmdfunc().

All major NAND generic raw operations like reset, reading the NAND ID, selecting a set of timings, reading/writing data and so on found their place into small internal functions named nand_[operation]_op().

From the NAND controller driver point of view, an array of instructions is received for each operation. The driver then needs to parse these instructions, decide whether it can handle the overall operation, split the operation if needed, and execute what is requested.

Using the ->exec_op() interface is as simple as declaring an array describing the controller capabilities, each entry of this array having a callback function that knows the overall operation and actually handles all the logic. The NAND core was enhanced with a proper parser that driver authors may use to handle the callback selection logic.
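The callback selection idea can be illustrated with a small model in Python. This is not the kernel parser or its API: the `Elem` and `Pattern` types, the handler names and the matching rules are hypothetical, and the model leaves out features such as splitting an operation into several smaller ones. It only shows the principle of describing controller capabilities as patterns and dispatching a whole operation to the first pattern that matches.

```python
from typing import Callable, NamedTuple, Optional, Sequence, Tuple

Instr = Tuple[str, object]  # (kind, argument), a hypothetical representation


class Elem(NamedTuple):
    kind: str               # "CMD", "ADDR", "DATA_IN", "DATA_OUT" or "WAITRDY"
    optional: bool = False  # the instruction may be absent from the operation


class Pattern(NamedTuple):
    handler: Callable[[Sequence[Instr]], str]
    elems: Tuple[Elem, ...]


def matches(pattern: Pattern, op: Sequence[Instr]) -> bool:
    """Return True if the whole operation fits the pattern, in order."""
    i = 0
    for elem in pattern.elems:
        if i < len(op) and op[i][0] == elem.kind:
            i += 1            # instruction consumed by this pattern element
        elif not elem.optional:
            return False      # a mandatory element is missing
    return i == len(op)       # no instruction may be left over


def find_handler(patterns: Sequence[Pattern], op: Sequence[Instr]) -> Optional[Callable]:
    """Pick the callback of the first pattern that matches the operation."""
    for pattern in patterns:
        if matches(pattern, op):
            return pattern.handler
    return None


def exec_op(patterns: Sequence[Pattern], op: Sequence[Instr]) -> str:
    handler = find_handler(patterns, op)
    if handler is None:
        raise NotImplementedError("operation not supported by this controller")
    return handler(op)


# Capabilities of an imaginary controller: a one-shot page read and simple commands.
def do_read_page(op: Sequence[Instr]) -> str:
    return "read page"


def do_simple_cmd(op: Sequence[Instr]) -> str:
    return "simple command"


PATTERNS = (
    Pattern(do_read_page, (Elem("CMD"), Elem("ADDR"), Elem("CMD"),
                           Elem("WAITRDY"), Elem("DATA_IN", optional=True))),
    Pattern(do_simple_cmd, (Elem("CMD"), Elem("WAITRDY", optional=True))),
)
```

In this model each handler receives the complete operation, including any data length, so it has no need to guess what the core will do next.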

NAND stack with exec_op

For a more complete overview, one can check the slides and the video of Miquèl’s presentation at FOSDEM about NAND flash memories and the introduction of ->exec_op() in the Linux kernel.

Current status

The ->exec_op() interface in the NAND core has been accepted and merged upstream, and will be part of Linux 4.16. The first driver converted to this new interface was obviously the NAND controller driver used on Marvell platforms, pxa3xx_nand. It has been rewritten as marvell_nand, and will also be part of Linux 4.16. Even though the new driver is longer (by lines of code) than the previous one, it supports additional features (such as raw read and write operations), allows the NAND core to pass custom commands to the NAND chip, and has logic that is a lot less complicated.

Miquèl has also worked on converting the fsmc_nand driver to ->exec_op(), but this work hasn’t been merged yet. In the community, Stefan Agner has taken on the task to convert the vf610_nfc driver to this new approach.

Bootlin is proud to have contributed such enhancements to the Linux kernel, and hopes to see other developers contribute to this subsystem in the near future, by migrating their favorite NAND controller driver to ->exec_op()!

Bootlin at the Embedded Linux Conference 2018

As we have every year for more than 10 years, Bootlin engineers will participate in the next Embedded Linux Conference, which takes place in Portland on March 12-14. Of course, it will be our first ELC with our new company name! In total, eight engineers from Bootlin will participate in the event. Maxime Chevallier, who joined Bootlin last Monday, will be attending the conference, his first one with a Bootlin hat (though Maxime was already a speaker at the last Embedded Linux Conference Europe).

Embedded Linux Conference 2018

We will also be giving a number of talks and tutorials, and moderating Birds of a Feather sessions:

We’re really happy to meet the embedded Linux open-source community again at this event! It is worth mentioning that following this event, Bootlin CTO Thomas Petazzoni will be in Silicon Valley on March 15-16, available for business meetings: do not hesitate to contact us if you’re interested.

Crowdfunding campaign for upstream Linux kernel driver for Allwinner VPU

Back in 2012, Bootlin (formerly Free Electrons) engineer Maxime Ripard pioneered the support for Allwinner processors in the official Linux kernel. Today, thanks to the contributions of numerous developers around the world and our involvement, there is very good support for a large number of Allwinner processors in the Linux kernel, to the point where actual Allwinner-based products are shipping with the mainline kernel.

Despite this major effort, there is one area that has remained unsupported in the mainline kernel: the video decoding and encoding engine, which provides hardware acceleration for decoding and encoding popular codecs such as MPEG2, MPEG4 or H264. Last summer, we successfully implemented a prototype, supporting MPEG2 decoding and partially MPEG4 decoding.

Today, we are launching a crowdfunding campaign to fund the remainder of the development: finishing MPEG4 decoding support, implementing H264 decoding, optimizing the rendering of video frames in cooperation with the display driver, and upstreaming the driver. We also have additional goals of supporting H265, encoding support, and additional Allwinner SoCs.

In the vendor-provided kernel, this video decoding/encoding unit is supported by a kernel driver that uses a non-standard user-space API, in conjunction with a binary-only userspace blob. Fortunately, a number of people have done an enormous reverse engineering effort, which we have leveraged for our existing prototype, and which we intend to use to continue the development of this upstream driver. Both Maxime Ripard and our intern Paul Kocialkowski will be working on this crowdfunded project.

This is our first crowdfunding campaign to fund upstream Linux kernel development, and we are interested in seeing how much interest there is in such a financing model. Help us make this a success by spreading the word!

Free Electrons becomes Bootlin

Bootlin logo

Free Electrons is changing to a new name, in the context of a trademark dispute.

Reasons for changing

On July 25, 2017, the company FREE SAS, a French telecom operator, known as the owner of the website, filed a complaint before the District Court of Paris against Free Electrons and its founder Michael Opdenacker for infringing upon 3 trademarks which include the word “free” and on FREE SAS’s rights on its domain name and its company name.

In this complaint, FREE SAS asked, among others, the French judges to order Free Electrons and its founder Michael Opdenacker to pay the total sum of 107,000 euros on various grounds, to order Free Electrons to change name, to delete the domain name “” within 15 days and to cease all use of the sign “FREE ELECTRONS” but also of the term “free” alone or with any other terms in any field in which FREE SAS is active or for any goods and services covered by its prior trademarks.

Michael Opdenacker and Free Electrons’ management consider that these claims are unfounded, as both companies have coexisted peacefully since 2005.

The services we offer are different, we target a different audience (professionals instead of individuals), and most of our communication efforts are in English, to reach an international audience. Therefore Michael Opdenacker and Free Electrons’ management believe that there is no risk of confusion between Free Electrons and FREE SAS.

However, FREE SAS has filed in excess of 100 oppositions and District Court actions against trademarks or names containing “free”. In view of the resources needed to fight this case, Free Electrons has decided to change its name without waiting for the decision of the District Court.

This will allow us to stay focused on our projects rather than exhausting ourselves fighting a long legal battle.

The new name

Amongst all the new names we considered, “Bootlin” came out as our favorite option. It can’t express all our values but it corresponds to what we’ve been working on since the beginning and hope to continue to do for many years: booting Linux on new hardware.

Of course, “booting” here shouldn’t be limited to getting a first shell prompt on new hardware. It means doing whatever is needed to run Linux by taking the best advantage of software and hardware capabilities.

Same team, same passion

Nothing else changes in the company. We are the same engineers, the same Linux kernel contributors and maintainers (now 6 of us have their names in the Linux MAINTAINERS file), with the same technical skills and appetite for new technical challenges.

More than ever, we remain united by the passion we all share in the company since the beginning: working with hardware and low-level software, working together with the free software community, and sharing the experience with others so that they can at least get the best of what the community offers and hopefully one day become active contributors too. “Get the best of the community” is effectively one of our slogans.

Practical details

The only thing we’re changing is the name (“Bootlin” instead of “Free Electrons”), the domain name ( instead of and the logo. The two penguins, our mascots which have been the key identification of Free Electrons for many years will stay the same. Except for the domain name change, all URLs should stay the same, and all e-mail addresses too.

For the moment, we’ve just migrated the mail and main web servers. The other services will be updated progressively.

For practical reasons, the name of the company running Bootlin will remain “Free Electrons” for a few more months. Until then, there won’t be any impact on the way we interact with our customers. We will let our ongoing customers know when the legal name changes.

What about links to resources, made by community websites but also in mailing lists archives and in public forums? Of course, we redirected the old URLs to the new ones, and will continue to do so as long as we can. However, depending on the outcome of the legal procedure, we may not be able to keep the domain forever. Therefore, we would be grateful if you could update all your links to our site whenever feasible, to avoid the risk of broken links in the future.

Free Electrons at FOSDEM and Buildroot Developers meeting

The FOSDEM conference will take place next week-end in Brussels, Belgium. As the biggest open-source conference event in Europe, featuring a number of talks related to embedded systems and generally low-level development, Free Electrons never misses this event!

Fosdem 2018 logo

This year, Free Electrons engineer Miquèl Raynal will be giving a talk Drive your NAND within Linux – Forget the word “nightmare”, sharing details on the enhancements he has contributed to the Linux kernel MTD subsystem, and which are scheduled to be merged in the 4.16 Linux kernel release.

In addition to Miquèl’s talk, a number of other Free Electrons engineers will be attending the event: Mylène Josserand, Quentin Schulz, Antoine Ténart, Boris Brezillon and Thomas Petazzoni.

Finally, Free Electrons is also sponsoring the participation of Thomas Petazzoni in the Buildroot Developers Meeting, which is a 2-day event dedicated to the development of the Buildroot embedded Linux build system. With 14 attendees, this event will have the largest number of participants it has ever had. We take this opportunity to thank Google and Mind, who are sponsoring the event by providing the meeting room, lunch and social event for the attendees.

Back from ELCE: participation to the Device Tree workshop

After publishing our slides and videos from the Embedded Linux Conference Europe (ELCE), reporting on talks selected by Free Electrons engineers, and mentioning the award given to Michael Opdenacker, here comes the last blog post giving feedback from our participation in the 2017 edition of this conference.

On Thursday after ELCE, Free Electrons engineers Maxime Ripard and Thomas Petazzoni participated in the Device Tree workshop, a day-long meeting to discuss the status and future of Device Tree support, especially in the context of the Linux kernel.

Device Tree Workshop group photo 2017

Beyond participating in the event, Maxime and Thomas also presented briefly on two topics:

  • Maxime Ripard brought up the topic of handling foreign DT bindings (see slides). Currently, the Device Tree bindings documentation is stored in the Linux kernel source tree, in Documentation/devicetree/bindings/. However, in theory, bindings are not operating-system specific, and indeed the same bindings are used in other projects: U-Boot, Barebox, FreeBSD, Zephyr, and probably more. Maxime raised the question of what these projects should do when they create new bindings or extend existing ones? Should they contribute a patch to Linux? Should we have a separate repository for DT bindings? A bit of discussion followed, but without getting to a real conclusion.
  • Thomas Petazzoni presented on the topic of avoiding duplication in Device Tree representations (see slides). Recent Marvell Armada processors have a hardware layout where a block containing multiple IPs is duplicated several times in the SoC. In the currently available Armada 8040 there are two copies of the CP110 hardware block, and the Linux kernel carries a separate description for each. While very similar, those descriptions have subtle differences that make it non-trivial to de-duplicate. However, future SoCs will have not just 2 copies of the same hardware block, but 4 copies or potentially more. In such a situation, duplicating the Device Tree description is no longer reasonable. Thomas presented a solution based on the C pre-processor, and commented on other options, such as a script to generate DTs, or improvements in the DT compiler itself. A discussion around those options followed, and while tooling improvements were considered to be the long-term solution, in the short term the solution based on the C pre-processor was deemed acceptable upstream.
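To illustrate the de-duplication problem, here is a minimal sketch of the "script to generate DTs" option mentioned among the alternatives. It is not the C pre-processor solution that was presented, and it does not use real bindings: the node name, the compatible string, the labels and the addresses are all hypothetical placeholders. The point is only that a single template can describe the block once and be instantiated for each copy.

```python
from typing import Iterable, NamedTuple


class Instance(NamedTuple):
    label: str  # hypothetical label for one copy of the hardware block
    base: int   # hypothetical base address of that copy


# A single description of the block; only the per-instance values vary.
TEMPLATE = """\
{label}: example-block@{base:x} {{
    compatible = "vendor,example-block";
    reg = <0x{base:x} 0x{size:x}>;
}};
"""


def render(instances: Iterable[Instance], size: int = 0x1000) -> str:
    """Instantiate the template once per copy of the hardware block."""
    return "\n".join(
        TEMPLATE.format(label=inst.label, base=inst.base, size=size)
        for inst in instances
    )


if __name__ == "__main__":
    # Two copies today; adding more is one extra line, not one extra file.
    print(render([Instance("cp0", 0xF2000000), Instance("cp1", 0xF4000000)]))
```

The subtle per-instance differences mentioned above are what make this harder in practice: each of them would have to become an extra field of `Instance` or a conditional in the template.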

For Free Electrons, participating in such events is very important, as it allows us to expose to kernel developers the issues we are facing in some of our projects, and to get direct feedback from the developers on how to move forward on those topics. We definitely intend to continue participating in similar events in the future, for topics of interest to Free Electrons.

MIPI I3C specification published, and new iteration of Linux I3C subsystem

Back in August 2017, Free Electrons contributed to the Linux kernel a patch series adding support for the new MIPI I3C bus, a bus that aims at replacing busses like I2C and SPI, by offering better performance, lower power consumption, and new features like discovery, in-band interrupts and hot join.

At the time of our submission, the I3C specification was closed, but a few days ago, the MIPI Alliance announced that the I3C specification was now publicly available. This is of course very good news as it will allow a much easier and wider adoption of I3C, and it was a somewhat unexpected move since the MIPI Alliance had traditionally kept its specifications only for its members. Hopefully the I3C experience will encourage the MIPI Alliance to follow the same direction for existing or future protocols.

With this announcement from the MIPI Alliance, it was time for us to submit an updated version of our I3C support for the Linux kernel, which Free Electrons engineer Boris Brezillon did on Thursday: [PATCH v2 0/7] Add the I3C subsystem. Compared to the previous version submitted in August, this new version has interesting improvements:

  • A generic infrastructure to support IBIs (in-band interrupts) was added
  • Helpers to support hot-join were added to the core I3C subsystem
  • The Cadence I3C controller driver was improved to support IBIs and hot-join
  • And of course, many of the comments received on the first iteration have been addressed

With the specification now public, we hope to receive useful comments and feedback from the Linux kernel community to improve, and hopefully in the near future, merge the support for the MIPI I3C bus.