Software architecture of Bootlin’slab

As stated in a previous blog post, we officially launched our lab on 2016, April 25th and it is contributing to KernelCI since then. In a series of blog post, we’d like to present in details how our lab is working.

We previously introduced the lab and its integration in KernelCI, and presented its hardware infrastructure. Now is time to explain how it actually works on the software side.

Continuous integration in Linux kernel

Because of Linux’s well-known ability to run on numerous platforms and the obvious impossibility for developers to test changes on all these platforms, continuous integration has a big role to play in Linux kernel development and maintenance.

More generally, continuous integration is made up of three different steps:

  • building the software which in our case is the Linux kernel,
  • testing the software,
  • reporting the tests results;
KernelCI complete process
KernelCI complete process

KernelCI checks hourly if one of the Git repositories it tracks have been updated. If it’s the case then it builds, from the last commit, the kernel for ARM, ARM64 and x86 platforms in many configurations. Then it stores all these builds in a publicly available storage.

Once the kernel images have been built, KernelCI itself is not in charge of testing it on hardware. Instead, it delegates this work to various labs, maintained by individuals or organizations. In the following section, we will discuss the software architecture needed to create such a lab, and receive testing requests from KernelCI.

Core software component: LAVA

At this moment, LAVA is the only supported software by KernelCI but note that KernelCI offers an API, so if LAVA does not meet your needs, go ahead and make your own!

What is LAVA?

LAVA is a self-hosted software, organized in a server-dispatcher model, for controlling boards, to automate boot, bootloader and user-space testing. The server receives jobs specifying what to test, how and on which boards to run those tests, and transmits those jobs to the dispatcher linked to the specified board. The dispatcher applies all modifications on the kernel image needed to make it boot on the said board and then fully interacts with it through the serial.

Since LAVA has to fully and autonomously control boards, it needs to:

  • interact with the board through serial connection,
  • control the power supply to reset the board in case of a frozen kernel,
  • know the commands needed to boot the kernel from the bootloader,
  • serve files (kernel, DTB, rootfs) to the board.

The first three requirements are fulfilled by LAVA thanks to per-board configuration files. The latter is done by the LAVA dispatcher in charge of the board, which downloads files specified in the job and copies them to a directory accessible by the board through TFTP.

LAVA organizes the lab in devices and device types. All identical devices are from the same device type and share the same device type configuration file. It contains the set of bootloader instructions to boot the kernel (e.g.: how and where to load files) and the bootloader configuration (e.g.: can it boot zImages or only uImages). A device configuration file stores the commands run by a dispatcher to interact with the device: how to connect to serial, how to power it on and off. LAVA interacts with devices via external tools: it has support for conmux or telnet to communicate via serial and power commands can be executed by custom scripts (pdudaemon for example).

Control power supply

Some labs use expensive Switched PDUs to control the power supply of each board but, as discussed in our previous blog post we went for several Devantech ETH008 Ethernet-controlled relay boards instead.

Linaro, the organization behind LAVA, has also developed a software for controlling power supplies of each board, called pdudaemon. We added support for most Devantech relay boards to pdudaemon.

Connect to serial

As advised in LAVA’s installation guide, we went with telnet and ser2net to connect the serial port of our boards. Ser2net basically opens a Linux device and allows to interact with it through a TCP socket on a defined port. A LAVA dispatcher will then launch a telnet client to connect to a board’s serial port. Because of the well-known fact that Linux devices name might change between reboots, we had to use udev rules in order to guarantee the serial we connect to is the one we want to connect to.

Actual testing

Now that LAVA knows how to handle devices, it has to run jobs on those devices. LAVA jobs contain which images to boot (kernel, DTB, rootfs), what kind of tests to run when in user space and where to find them. A job is strongly linked to a device type since it contains the kernel and DTB specifically built for this device type.

Those jobs are submitted to the different labs by the KernelCI project. To do so, KernelCI uses a tool called lava-ci. Amongst other things, this tool contains a big table of the supported platforms, associating the Device Tree name with the corresponding hardware platform name. This way, when a new kernel gets built by KernelCI, and produces a number of Device Tree Blobs (.dtb files), lava-ci knows what are the corresponding hardware platforms to run the kernel on. It submits the jobs to all the labs, which will then only run the tests for which they have the necessary hardware platform. We have contributed a number of patches to lava-ci, adding support for the new platforms we had in our lab.

LAVA overall architecture

Reporting test results

After KernelCI has built the kernel, sent jobs to contributing labs and LAVA has run the jobs, KernelCI will then get the tests results from the labs, aggregate them on its website and notify maintainers of errors via a mailing list.

Challenges encountered

As in any project, we stumbled on some difficulties. The biggest problems we had to take care of were board-specific problems.

Some boards like the Marvell RD-370 need a rising edge on a pin to boot, meaning we cannot avoid pressing the reset button between each boot. To work out this problem, we had to customize the hardware (swap resistors) to bypass this limitation.

Some other boards lose their serial connection. Some lose it when resetting their power but recover it after a few seconds, problem we found acceptable to solve by infinitely reconnecting to the serial. However, we still have a problem with a few boards which randomly close their serial connection without any reason. After that, we are able to connect to the serial connection again but it does not send any character. The only way to get it to work again is to physically re-plug the cable used by the serial connection. Unfortunately, we did not find yet a way to solve this bug.

The Linux kernel of our server refused to bind more than 13 USB devices when it was time to create a second drawer of boards. After some research, we found out the culprit was the xHCI driver. In modern computers, it is possible to disable xHCI support in the BIOS but this option was not present in our server’s BIOS. The solution was to rebuild and install a kernel for the server without the xHCI driver compiled. From that day, the number of USB devices is limited to 127 as in the USB specification.

Conclusion

We have now 35 boards in our lab, with some being the only ones represented in KernelCI. We encourage anyone, hobbyists or companies, to contribute to the effort of bringing continuous integration of the Linux kernel by building your own lab and adding as many boards as you can.

Interested in becoming a lab? Follow the guide!

Hardware infrastructure of Bootlin’slab

As stated in a previous blog post, we officially launched our lab on 2016, April 25th and it is contributing to KernelCI since then. In a series of blog post, we’d like to present in details how our lab is working, starting with this first blog post that details the hardware infrastructure of our lab.

Introduction

In a lab built for continuous integration, everything has to be fully automated from the serial connections to power supplies and network connections.

To gather as much information as we can get to establish the specifications of the lab, our engineers filled a spreadsheet with all boards they wanted to have in the lab and their specificities in terms of connectors used the serial port communication and power supply. We reached around 50 boards to put into our lab. Among those boards, we could distinguish two different types:

  • boards which are powered by an ATX power supply,
  • boards which are powered by different power adapters, providing either 5V or 12V.

Another design criteria was that we wanted to easily allow our engineers to take a board out of the lab or to add one. The easier the process is, the better the lab is.

Home made cabinet

Bootlin' 8 drawers labTo meet the size constraints of Bootlin office, we had to make the lab fit in a 100cm wide, 75cm deep and 200cm high space. In order to achieve this, we decided to build the lab as a large home made cabinet, with a number of drawers to easily access, change or replace the boards hosted in the lab. As some of our boards provide PCIe connectors, we needed to provide enough height for each drawer, and after doing a few measurements, decided that a 25cm height for our drawers would be fine. With a total height of 200cm, this gives a maximum of 8 drawers.

In addition, it turns out that most of our boards powered by ATX power supplies are rather large in size, while the ones powered by regular power adapters are usually much smaller. In order to simplify the overall design, we decided that all large boards would be grouped together on a given set of drawers, and all small boards would be grouped together on another set of drawers: i.e we would not mix large and small boards in the same drawer. With the 100cm x 75cm size limitation, this meant a drawer for small boards could host up to 8 boards, while a drawer for large boards could host up to 4 boards. From the spreadsheet containing all the boards supposed to be in the lab, we eventually decided there would be 3 large drawers for up to 12 large boards and 5 small drawers for up to 40 small or medium-sized boards.

Furthermore, since the lab will host a server and a lot of boards and power supplies, potentially producing a lot of heat, we have to keep the lab as open as it can be while making sure it is strong enough to hold the drawers. We ended up building our own cabinet, made of wood bought from the local hardware store.

We also want the server to be part of the lab. We already have a small piece of wood to strengthen the lab between the fourth and sixth drawers we could use to fix the server. We decided to give a mini-PC (NUC-like) a try, because, after all, it’s only communicating with the serial of each board and serving files to them. Thus, everything related to the server is fixed and wired behind the lab.

Make the lab autonomous

What continuous integration for the Linux kernel typically needs are control of:

  1. the power for each board
  2. serial port connection
  3. a way to send files to test, typically the kernel image and associated files

In Bootlin lab, these different tasks are handled by a dedicated server, itself hosted in the lab.

Serial port control

Serial connections are mostly handled via USB on the server side but there are many different connectors on the target side (in our lab, we have 6 different connectors: DE9, microUSB, miniUSB, 2.54″ male pins, 2.54″ female pins and USB-B). Therefore, our server has to have a physical connection with each of the 50 boards present in the lab. The need for USB hubs is then obvious.

Since we want as few cables connecting the server and the drawers as possible, we decided to have one USB hub per drawer, be it a large drawer or a small drawer. In a small drawer, up to 8 boards can be present, meaning the hub needs at least 8 USB ports. In a large drawer, up to 4 serial connections can be needed so smaller and more common USB hubs can do the work. Since the serial connection may draw some current on the USB port, we wanted all of our USB hubs to be powered with a dedicated power supply.

All USB hubs are then connected to a main USB hub which in turn is connected to our server.

Power supply control

Our server needs to control each board’s power to be able to automatically power on or off a board. It will power on the board when it needs to test a new kernel on it and power it off at the end of the test or when the kernel has frozen or could not boot at all.

In terms of power supplies, we initially investigated using Ethernet-controlled multi-sockets (also called Switched PDU), such as this device. Unfortunately, these devices are quite expensive, and also often don’t provide the most appropriate connector to plug the cheap 5V/12V power adapters used by most boards.

So, instead, and following a suggestion from Kevin Hilman (one of KernelCI’s founder and maintainer), we decided to use regular ATX power supplies. They have the advantage of being inexpensive, and providing enough power for multiple boards and all their peripherals, potentially including hard drives or other power-hungry peripherals. ATX power supplies also have a pin, called PS_ON#, which when tied to the ground, powers up the ATX power supply. This easily allows to turn an ATX power supply on or off.

In conjunction with the ATX power supplies, we have a selected Ethernet-controlled relay board, the Devantech ETH008, which contains 8 relays that can be remote controlled over the network.

This gives us the following architecture:

  • For the drawers with large boards powered by ATX directly, we have one ATX power supply per board. The PS_ON pin from the ATX power supply is cut and rewired to the Ethernet controlled relay. Thanks to the relay, we control if PS_ON is tied to the ground or not. If it’s tied to the ground, then the board boots, when it’s untied from the ground, the board is powered off.
  • For the drawers with small boards, we have a single ATX power supply per drawer. The 12V and 5V rails from the ATX power supply are then dispatched through the 8-relay board, then connected to the appropriate boards, through DC barrel or mini-USB/micro-USB cables, depending on the board. The PS_ON is always tied to the ground, so those ATX power supplies are constantly on.

In addition, we have added a bit of over-voltage protection, by adding transient-voltage-suppression diodes for each voltage output in each drawer. These diodes will absorb all the voltage when it exceeds the maximum authorized value and explode, and are connected in parallel in the circuit to protect.

Network connectivity

As part of the continuous integration process, most of our boards will have to fetch the Linux kernel to test (and potentially other related files) over the network through TFTP. So we need all boards to be connected to the server running the continuous integration software.

Since a single 52 port switch is both fairly expensive, and not very convenient in terms of wiring in our situation, we instead opted for adding 8-port Gigabit switches to each drawer, all of them being connected via a central 16-port Gigabit switch located at the back of the home made cabinet. This central switch not only connects the per-drawer switches, but also the server running the continuous integration software, and the wider Internet.

In-drawer architecture: large boards

A drawer designed for large boards, powered by an ATX power supply contains the following components:

  • Up to four boards
  • Four ATX power-supplies, with their PS_ON# connected to an 8-port relay controller. Only 4 of the 8 ports are used on the relay.
  • One 8-port Ethernet-controlled relay board.
  • One 4-port USB hub, connecting to the serial ports of the four boards.
  • One 8-port Ethernet switch, with 4 ports used to connect to the boards, one port used to connect to the relay board, and one port used for the upstream link.
  • One power strip to power the different components.
Large drawer example scheme
Large drawer example scheme
Large drawer in the lab
Large drawer in the lab

In drawer architecture: small boards

A drawer designed for small boards contains the following components:

  • Up to eight boards
  • One ATX power-supply, with its 5V and 12V rails going through the 8-port relay controller. All ports in the relay are used when 8 boards are present.
  • One 8-port Ethernet-controlled relay board.
  • One 10-port USB hub, connecting to the serial ports of the eight boards.
  • Two 8-port Ethernet switches, connecting the 8 boards, the relay board and an upstream link.
  • One power strip to power the different components.
Small drawer example scheme
Small drawer example scheme
Small drawer in the lab
Small drawer in the lab

Server

At the back of the home made cabinet, a mini PC runs the continuous integration software, that we will discuss in a future blog post. This mini PC is connected to:

  • A main 16-port Gigabit switch, itself connected to all the Gigabit switches in the different drawers
  • A main USB hub, itself connected to all the USB hubs in the different drawers

As expected, this allows the server to control the power of the different boards, access their serial port, and provide network connectivity.

Detailed component list

If you’re interested by the specific components we’ve used for our lab, here is the complete list, with the relevant links:

Conclusion

Hopefully, sharing these details about the hardware architecture of our board farm will help others to create a similar automated testing infrastructure. We are of course welcoming feedback on this hardware architecture!

Stay tuned for our next blog post about the software architecture of our board farm.

Linux 4.8 released, Bootlin contributions

Adelie PenguinLinux 4.8 has been released on Sunday by Linus Torvalds, with numerous new features and improvements that have been described in details on LWN: part 1, part 2 and part 3. KernelNewbies also has an updated page on the 4.8 release. We contributed a total of 153 patches to this release. LWN also published some statistics about this development cycle.

Our most significant contributions:

  • Boris Brezillon improved the Rockchip PWM driver to avoid glitches basing that work on his previous improvement to the PWM subsystem already merged in the kernel. He also fixed a few issues and shortcomings in the pwm regulator driver. This is finishing his work on the Rockchip based Chromebook platforms where a PWM is used for a regulator.
  • While working on the driver for the sii902x HDMI transceiver, Boris Brezillon did a cleanup of many DRM drivers. Those drivers were open coding the encoder selection. This is now done in the core DRM subsystem.
  • On the support of Atmel platforms
    • Alexandre Belloni cleaned up the existing board device trees, removing unused clock definitions and starting to remove warnings when compiling with the Device Tree Compiler (dtc).
  • On the support of Allwinner platforms
    • Maxime Ripard contributed a brand new infrastructure, named sunxi-ng, to manage the clocks of the Allwinner platforms, fixing shortcomings of the Device Tree representation used by the existing implementation. He moved the support of the Allwinner H3 clocks to this new infrastructure.
    • Maxime also developed a driver for the Allwinner A10 Digital Audio controller, bringing audio support to this platform.
    • Boris Brezillon improved the Allwinner NAND controller driver to support DMA assisted operations, which brings a very nice speed-up to throughput on platforms using NAND flashes as the storage, which is the case of Nextthing’s C.H.I.P.
    • Quentin Schulz added support for the Allwinner R16 EVB (Parrot) board.
  • On the support of Marvell platforms
    • Grégory Clément added multiple clock definitions for the Armada 37xx series of SoCs.
    • He also corrected a few issues with the I/O coherency on some Marvell SoCs
    • Romain Perier worked on the Marvell CESA cryptography driver, bringing significant performance improvements, especially for dmcrypt usage. This driver is used on numerous Marvell platforms: Orion, Kirkwood, Armada 370, XP, 375 and 38x.
    • Thomas Petazzoni submitted a driver for the Aardvark PCI host controller present in the Armada 3700, enabling PCI support for this platform.
    • Thomas also added a driver for the new XOR engine found in the Armada 7K and Armada 8K families

Here are in details, the different contributions we made to this release:

Linux 4.7 statistics: Bootlin engineer #2 contributor

LWN.net has published yesterday an article containing statistics for the 4.7 development cycle. This article is available for LWN.net subscribers only during the coming week, and will then be available for everyone, free of charge.

It turns out that Boris Brezillon, Bootlin engineer, is the second most active contributor to the 4.7 kernel in number of commits! The top three contributors in number of commits are: H Hartley Sweeten (208 commits), Boris Brezillon (132 commits) and Al Viro (127 commits).

LWN.net 4.7 kernel statistics

In addition to being present in the most active developers by number of commits, Boris Brezillon is also in the #11 most active contributor in terms of changed lines. As we discussed in our previous blog post, most contributions from Boris were targeted at the PWM subsystem on one side (atomic update support) and the NAND subsystem on the other side.

Another Bootlin engineer shows up in the per-developer statistics: Maxime Ripard is the #17 most active contributor by lines changed. Indeed, Maxime contributed a brand new DRM/KMS driver for the Allwinner display controller.

As a company, Bootlin is ranked for the 4.7 kernel as the #12 most active company by number of commits, and #10 by number of changed lines. We are glad to continue being such a contributor to the Linux kernel development, as we have been for the last four years. If you want your hardware to be supported in the official Linux kernel, contact us!

Linux 4.7 released, Bootlin contributions

Adelie PenguinLinux 4.7 has been released on Sunday by Linus Torvalds, with numerous new features and improvements that have been described in details on LWN: part 1, part 2 and part 3. KernelNewbies also has an updated page on the 4.7 release. We contributed a total of 222 patches to this release.

Our most significant contributions:

  • Boris Brezillon has contributed a core improvement to the PWM subsystem: a mechanism that allows to update the properties of a PWM in an atomic fashion. This is needed when a PWM has been initialized by the bootloader, and the kernel needs to take over without changing the properties of the PWM. See the main patch for more details. What prompted the creation of this patch series is a problem on Rockchip based Chromebook platforms where a PWM is used for a regulator, and the PWM properties need to be preserved across the bootloader to kernel transition. In addition to the changes of the core infrastructure, Boris contributed numerous patches to fix existing PWM users.
  • In the MTD subsystem, Boris Brezillon continued his cleanup efforts
    • Use the common Device Tree parsing code provided by nand_scan_ident() in more drivers, rather than driver-specific code.
    • Move drivers to expose their ECC/OOB layout information using the mtd_ooblayout_ops structure, and use the corresponding helper functions where appropriate. This change will allow a more flexible description of the ECC and OOB layout.
    • Document the Device Tree binding that should now be used for all NAND controllers / NAND chip, with a clear separation between the NAND controller and the NAND chip. See this commit for more details.
  • In the RTC subsystem, Mylène Josserand contributed numerous improvements to the rv3029 and m41t80 drivers, including the addition of the support for the RV3049 (the SPI variant of RV3029). See also our previous blog post on the support of Microcrystal’s RTCs/.
  • On the support of Atmel platforms
    • Boris Brezillon contributed a number of fixes and improvements to the atmel-hlcdc driver, the DRM/KMS driver for Atmel platforms
  • On the support of Allwinner platforms
    • Maxime Ripard contributed a brand new DRM/KMS driver to support the display controller found on several Allwinner platforms, with a specific focus on Allwinner A10. This new driver allows to have proper graphics support in the Nextthing Co. C.H.I.P platform, including composite output and RGB output for LCD panels. To this effect, in addition to the driver itself, numerous clock patches and Device Tree patches were made.
    • Boris Brezillon contributed a large number of improvements to the NAND controller driver used on Allwinner platforms, including performance improvements.
    • Quentin Schulz made his first kernel contribution by sending a patch fixing the error handling in a PHY USB driver used by Allwinner platforms.
  • On the support of Marvell platforms
    • Grégory Clement made some contributions to the mv_xor driver to make it 64-bits ready, as the same XOR engine is used on Armada 3700, a Cortex-A53 based SoC. Grégory then enabled the use of the XOR engines on this platform by updating the corresponding Device Tree.
    • Romain Perier did some minor updates related to the Marvell cryptographic engine support. Many more updates will be present in the upcoming 4.8, including significant performance improvements.
    • Thomas Petazzoni contributed some various fixes (cryptographic engine usage on some Armada 38x boards, HW I/O coherency related fixes).
    • Thomas also improved the support for Armada 7K and 8K, with the description of more hardware blocks, and updates to drivers.

Here are in details, the different contributions we made to this release:

Bootlin contributes to KernelCI.org

The Linux kernel is well-known for its ability to run on thousands of different hardware platforms. However, it is obviously impossible for the kernel developers to test their changes on all those platforms to check that no regressions are introduced. To address this problem, the KernelCI.org project was started: it tests the latest versions of the Linux kernel from various branches on a large number of hardware plaforms and provides a centralized interface to browse the results.

KernelCI.org project
KernelCI.org project

From a physical point of view, KernelCI.org relies on labs containing a number of hardware platforms that can be remotely controlled. Those labs are provided by various organizations or individuals. When a commit in one of the Linux kernel Git branches monitored by KernelCI is detected, numerous kernel configurations are built, tests are sent to all labs and results are collected on the KernelCI.org website. This allows kernel developers and maintainers to detect and fix bugs and regressions before they reach users. As of May, 10th 2016, KernelCI stats show a pool of 185 different boards and around 1900 daily boots.

Bootlin is a significant contributor to the Linux kernel, especially in the area of ARM hardware platform support. Several of our engineers are maintainers or co-maintainers of ARM platforms (Grégory Clement for Marvell EBU, Maxime Ripard for Allwinner, Alexandre Belloni for Atmel and Antoine Ténart for Annapurna Labs). Therefore, we have a specific interest in participating to an initiative like KernelCI, to make sure that the platforms that we maintain continue to work well, and a number of the platforms we care about were not tested by the KernelCI project.

Over the last few months, we have been building our boards lab in our offices, and we have joined the KernelCI project since April 25th. Our lab currently consists of 15 boards:

  • Atmel SAMA5D2 Xplained
  • Atmel SAMA5D3 Xplained
  • Atmel AT91SAM9X25EK
  • Atmel AT91SAM9X35EK
  • Atmel AT91SAMA5D36EK
  • Atmel AT91SAM9M10G45EK
  • Atmel AT91SAM9261EK
  • BeagleBone Black
  • Beagleboard-xM
  • Marvell Armada XP based Plathome Openblocks AX3
  • Marvell Armada 38x Solidrun ClearFog,
  • Marvell Armada 38x DB-88F6820-GP
  • Allwinner A13 Nextthing Co. C.H.I.P
  • Allwinner A33 Sinlinx SinA33
  • Freescale i.MX6 Boundary Devices Nitrogen6x

We will very soon be adding 4 more boards:

  • Atmel SAMA5D4 Xplained
  • Atmel SAMA5D34EK
  • Marvell Armada 7K 7040-DB (ARM64)
  • Marvell Armada 39x DB

Bootlin board farm

Three of the boards we have were already tested thanks to other KernelCI labs, but the other sixteen boards were not tested at all. In total, we plan to have about 50 boards in our lab, mainly for the ARM platforms that we maintain in the official Linux kernel. The results of all boots we performed are visible on the KernelCI site. We are proud to be part of this unique effort to perform automated testing and validation of the Linux kernel!

In the coming weeks, we will publish additional articles to present the software and physical architecture of our lab and the program we developed to remotely control boards that are in our lab, so stay tuned!

Linux 4.6 released, with Bootlin contributions

Adelie PenguinThe 4.6 version of the Linux kernel was released last Sunday by Linus Torvalds. As usual, LWN.net had a very nice coverage of this development cycle merge window, highlighting the most significant changes and improvements: part 1, part 2 and part 3. KernelNewbies is now active again, and has a very detailed page about this release.

On a total of 13517 non-merge commits, Bootlin contributed for this release a total of 107 non-merge commits, a number slightly lower than our past contributions for previous kernel releases. That being said, there are still a few interesting contributions in those 107 patches. We are particular happy to see patches from all our eight engineers in this release, including from Mylène Josserand and Romain Perier, who just joined us mid-March! We also already have 194 patches lined-up for the next 4.7 release.

Here are the highlights of our contributions to the 4.6 release:

  • Atmel ARM processors support
    • Alexandre Belloni and Boris Brezillon contributed a number of patches to improve and cleanup the support for the PMC (Power Management and Clocks) hardware block. As expected, this involved patching both clock drivers and power management code for the Atmel platforms.
  • Annapurna Labs Alpine platforms support
    • As a newly appointed maintainer of the Annapurna Labs ARM/ARM64 Alpine platforms, Antoine Ténart contributed the base support for the ARM64 Alpine v2 platform: base platform support and Device Tree, and an interrupt controller driver to support MSI-X
  • Marvell ARM processors support
    • Grégory Clement added initial support for the Armada 3700, a new Cortex-A53 based ARM64 SoC from Marvell, as well as a first development board using this SoC. So far, the supported features are: UART, USB and SATA (as well as of course timers and interrupts).
    • Thomas Petazzoni added initial support for the Armada 7K/8K, a new Cortex-A72 based ARM64 SoC from Marvell, as well as a first development board using this SoC. So far, UART, I2C, SPI are supported. However, due to the lack of clock drivers, this initial support can’t be booted yet, the clock drivers and additional support is on its way to 4.7.
    • Thomas Petazzoni contributed an interrupt controller driver for the ODMI interrupt controller found in the Armada 7K/8K SoC.
    • Grégory Clement and Thomas Petazzoni did a few improvements to the support of Armada 38x. Thomas added support for the NAND flash used on Armada 370 DB and Armada XP DB.
    • Boris Brezillon contributed a number of fixes to the Marvell CESA driver, which is used to control the cryptographic engine found in most Marvell EBU processors.
    • Thomas Petazzoni contributed improvements to the irq-armada-370-xp interrupt controller driver, to use the new generic MSI infrastructure.
  • Allwinner ARM processors support
    • Maxime Ripard contributed a few improvements to Allwinner clock drivers, and a few other fixes.
  • MTD and NAND flash subsystem
    • As a maintainer of the NAND subsystem, Boris Brezillon did a number of contributions in this area. Most notably, he added support for the randomizer feature to the Allwinner NAND driver as well as related core NAND subsystem changes. This change is needed to support MLC NANDs on Allwinner platforms. He also contributed several patches to continue clean up and improve the NAND subsystem.
    • Thomas Petazzoni fixed an issue in the pxa3xx_nand driver used on Marvell EBU platforms that prevented using some of the ECC configurations (such as 8 bits BCH ECC on 4 KB pages). He also contributed minor improvements to the generic NAND code.
  • Networking subsystem
    • Grégory Clement contributed an extension to the core networking subsystem that allows to take advantage of hardware capable of doing HW-controlled buffer management. This new extension is used by the mvneta network driver, useful for several Marvell EBU platforms. We expect to extend this mechanism further in the future, in order to take advantage of additional hardware capabilities.
  • RTC subsystem
    • As a maintainer of the RTC subsystem, Alexandre Belloni did a number of fixes and improvements in various RTC drivers.
    • Mylène Josserand contributed a few improvements to the abx80x RTC driver.
  • Altera NIOSII support
    • Romain Perier contributed two patches to fix issues in the kernel running on the Altera NIOSII architecture. The first one, covered in a previous blog post, fixed the NIOSII-specific memset() implementation. The other patch fixes a problem in the generic futex code.

In addition, several our of engineers are maintainers of various platforms or subsystems, so they do a lot of work reviewing and merging the contributions from other kernel developers. This effort can be measured by looking at the number of patches on which they Signed-off-by, but for which they are not the author. Here are the number of patches that our engineered Signed-off-by, but for which they were not the author:

  • Alexandre Belloni, as the RTC subsystem maintainer and the Atmel ARM co-maintainer: 91 patches
  • Maxime Ripard, as the Allwinner ARM co-maintainer: 65 patches
  • Grégory Clement, as the Marvell EBU ARM co-maintainer: 45 patches
  • Thomas Petazzoni, simply resubmitting patches from others: 2 patches

Here is the detailed list of our contributions to the 4.6 kernel release:

How we found that the Linux nios2 memset() implementation had a bug!

NIOS II processorNiosII is a 32-bit RISC embedded processor architecture designed by Altera, for its family of FPGAs: Cyclone III, Cyclone IV, etc. Being a soft-core architecture, by using Altera’s Quartus Prime design software, you can adjust the CPU configuration to your needs and instantiate it into the FPGA. You can customize various parameters like the instruction or the data cache size, enable/disable the MMU, enable/disable an FPU, and so on. And for us embedded Linux engineers, a very interesting aspect is that both the Linux kernel and the U-Boot bootloader, in their official versions, support the NIOS II architecture.

Recently, one of our customers designed a custom NIOS II platform, and we are working on porting the mainline U-Boot bootloader and the mainline Linux kernel to this platform. The U-Boot porting went fine, and quickly allowed us to load and start a Linux kernel. However, the Linux kernel was crashing very early with:

[    0.000000] Linux version 4.5.0-00007-g1717be9-dirty (rperier@archy) (gcc version 4.9.2 (Altera 15.1 Build 185) ) #74 PREEMPT Fri Apr 22 17:43:22 CEST 2016
[    0.000000] bootconsole [early0] enabled
[    0.000000] early_console initialized at 0xe3080000
[    0.000000] BUG: failure at mm/bootmem.c:307/__free()!
[    0.000000] Kernel panic - not syncing: BUG!

This BUG() comes from the __free() function in mm/bootmem.c. The bootmem allocator is a simple page-based allocator used very early in the Linux kernel initialization for the very first allocations, even before the regular buddy page allocator and other allocators such as kmalloc are available. We were slightly surprised to hit a BUG in a generic part of the kernel, and immediately suspected some platform-specific issue, like an invalid load address for our kernel, or invalid link address, or other ideas like this. But we quickly came to the conclusion that everything was looking good on that side, and so we went on to actually understand what this BUG was all about.

The NIOS II memory initialization code in arch/nios2/kernel/setup.c does the following:

bootmap_size = init_bootmem_node(NODE_DATA(0),
                                 min_low_pfn, PFN_DOWN(PHYS_OFFSET),
                                 max_low_pfn);
[...]
free_bootmem(memory_start, memory_end - memory_start);

The first call init_bootmem_node() initializes the bootmem allocator, which primarily consists in allocating a bitmap, with one bit per page. The entire bootmem bitmap is set to 0xff via a memset() during this initialization:

static unsigned long __init init_bootmem_core(bootmem_data_t *bdata,
        unsigned long mapstart, unsigned long start, unsigned long end)
{
        [...]
        mapsize = bootmap_bytes(end - start);
        memset(bdata->node_bootmem_map, 0xff, mapsize);
        [...]
}

After doing the bootmem initialization, the NIOS II architecture code calls free_bootmem() to mark all the memory pages as available, except the ones that contain the kernel itself. To achieve this, the __free() function (which is the one triggering the BUG) clears the bits corresponding to the page to be marked as free. When clearing those bits, the function checks that the bit was previously set, and if it’s not the case, fires the BUG:

static void __init __free(bootmem_data_t *bdata,
                        unsigned long sidx, unsigned long eidx)
{
        [...]
        for (idx = sidx; idx < eidx; idx++)
                if (!test_and_clear_bit(idx, bdata->node_bootmem_map))
                        BUG();
}

So to summarize, we were in a situation where a bitmap is memset to 0xff, but almost immediately afterwards, a function that clears some bits finds that some of the bits are already cleared. Sounds odd, doesn’t it?

We started by double checking that the address of the bitmap was the same between the initialization function and the __free function, verifying that the code was not overwriting the bitmap, and other obvious issues. But everything looked alright. So we simply dumped the bitmap after it was initialized by memset to 0xff, and to our great surprise, we found that the bitmap was in fact initialized with the pattern 0xff00ff00 and not 0xffffffff. This obviously explained why we were hitting this BUG(): simply because the buffer was not properly initialized. At first, we really couldn’t believe this: how it is possible that something as essential as memset() in Linux was not doing its job properly?

On the NIOS II platform, memset() has an architecture-specific implementation, available in arch/nios2/lib/memset.c. For buffers smaller than 8 bytes, this memset implementation uses a simple naive loop, iterating byte by byte. For larger buffers, it uses a more optimized implementation, using inline assembly. This implementation copies data per blocks of 4-bytes rather than 1 byte to speed-up the memset.

We quickly tested a workaround that consisted in using the naive implementation for all buffer sizes, and it solved the problem: we had a booting kernel, all the way to the point where it mounts a root filesystem! So clearly, it’s the optimized implementation in assembly that had a bug.

After some investigation, we found out that the bug was in the very first instructions of the assembly code. The following piece of assembly is supposed to create a 4-byte value that repeats 4 times the 1-byte pattern passed as an argument to memset:

/* fill8 %3, %5 (c & 0xff) */
"       slli    %4, %5, 8\n"
"       or      %4, %4, %5\n"
"       slli    %3, %4, 16\n"
"       or      %3, %3, %4\n"

This code takes as input in %5 the one-byte pattern, and is supposed to return in %3 the 4-byte pattern. It goes through the following logic:

  • Stores in %4 the initial pattern shifted left by 8 bits. Provided an initial pattern of 0xff, %4 should now contain 0xff00
  • Does a logical or between %4 and %5, which leads to %4 containing 0xffff
  • Stores in %3 the 2-byte pattern shifted left by 16 bits. %3 should now contain 0xffff0000.
  • Does a logical or between code>%3 and %4, i.e between 0xffff0000 and 0xffff, which gives the expected 4-byte pattern 0xffffffff

When you look at the source code, it looks perfectly fine, so our source code review didn’t spot the problem. However, when looking at the actual compiled code disassembled, we got:

34:	280a923a 	slli	r5,r5,8
38:	294ab03a 	or	r5,r5,r5
3c:	2808943a 	slli	r4,r5,16
40:	2148b03a 	or	r4,r4,r5

Here r5 gets used for both %4 and %5. Due to this, the final pattern stored in r4 is 0xff00ff00 instead of the expected 0xffffffff.

Now, if we take a look at the output operands, %4 is defined with the "=r" constraint, i.e an output operand. How to prevent the compiler from re-using the corresponding register for another operand? As explained in this document, "=r" does not prevent gcc from using the same register for an output operand (%4) and input operand (%5). By adding the constrainst & (in addition to "=r"), we tell the compiler that the register associated with the given operand is an output-only register, and so, cannot be used with an input operand.

With this change, we get the following assembly output:

34:	2810923a 	slli	r8,r5,8
38:	4150b03a 	or	r8,r8,r5
3c:	400e943a 	slli	r7,r8,16
40:	3a0eb03a 	or	r7,r7,r8

Which is much better, and correctly produces the 0xffffffff pattern when 0xff is provided as the initial 1-byte pattern to memset.

In the end, the final patch only adds one character to adjust the inline assembly constraint and gets the proper behavior from gcc:

diff --git a/arch/nios2/lib/memset.c b/arch/nios2/lib/memset.c
index c2cfcb1..2fcefe7 100644
--- a/arch/nios2/lib/memset.c
+++ b/arch/nios2/lib/memset.c
@@ -68,7 +68,7 @@ void *memset(void *s, int c, size_t count)
 		  "=r" (charcnt),	/* %1  Output */
 		  "=r" (dwordcnt),	/* %2  Output */
 		  "=r" (fill8reg),	/* %3  Output */
-		  "=r" (wrkrega)	/* %4  Output */
+		  "=&r" (wrkrega)	/* %4  Output only */
 		: "r" (c),		/* %5  Input */
 		  "0" (s),		/* %0  Input/Output */
 		  "1" (count)		/* %1  Input/Output */

This patch was sent upstream to the NIOS II kernel maintainers:
[PATCH v2] nios2: memset: use the right constraint modifier for the %4 output operand, and has already been applied by the NIOS II maintainer.

We were quite surprised to find a bug in some common code for the NIOS II architecture: we were assuming it would have already been tested on enough platforms and with enough compilers/situations to not have such issues. But all in all, it was a fun debugging experience!

It is worth mentioning that in addition to this bug, we found another bug affecting NIOS II platforms, in the asm-generic implementation of the futex_atomic_cmpxchg_inatomic() function, which was causing some preemption imbalance warnings during the futex subsystem initialization. We also sent a patch for this problem, which has also been applied already.

Bootlin engineer Boris Brezillon becomes Linux NAND subsystem maintainer

Bootlin engineer Boris Brezillon has been involved in the support for NAND flashes in the Linux kernel for quite some time. He is the author of the NAND driver for the Allwinner ARM processors, did several improvements to the NAND GPMI controller driver, has initiated a significant rework of the NAND subsystem, and is working on supporting MLC NANDs. Boris is also very active on the linux-mtd mailing list by reviewing patches from others, and making suggestions.

Hynix NAND flash

For those reasons, Boris was recently appointed by the MTD maintainer Brian Norris as a new maintainer of the NAND subsystem. NAND is considered a sub-subsystem of the MTD subsystem, and as such, Boris will be sending pull requests to Brian, who in turn is sending pull requests to Linus Torvalds. See this commit for the addition of Boris as a NAND maintainer in the MAINTAINERS file. Boris will therefore be in charge of reviewing and merging all the patches touching drivers/mtd/nand/, which consist mainly of NAND drivers. Boris has created a nand/next branch on Github, where he has already merged a number of patches that will be pushed to Brian Norris during the 4.7 merge window.

We are happy to see one of our engineers taking another position as a maintainer in the kernel community. Maxime Ripard was already a co-maintainer of the Allwinner ARM platform support, Alexandre Belloni a co-maintainer of the RTC subsystem and Atmel ARM platform support, Grégory Clement a co-maintainer of the Marvell EBU platform support, and Antoine Ténart a co-maintainer of the Annapurna Labs platform support.

Bootlin contributions to Linux 4.5

Adelie PenguinLinus Torvalds just released Linux 4.5, for which the major new features have been described by LWN.net in three articles: part 1, part 2 and part 3. On a total of 12080 commits, Bootlin contributed 121 patches, almost exactly 1% of the total. Due to its large number of contribution by patch number, Bootlin engineer Boris Brezillon appears in the statistics of top-contributors for the 4.5 kernel in the LWN.net statistics article.

This time around, our important contributions were:

  • Addition of a driver for the Microcrystal rv1805 RTC, by Alexandre Belloni.
  • A huge number of patches touching all NAND controller drivers and the MTD subsystem, from Boris Brezillon. They are the first step of a more general rework of how NAND controllers and NAND chips are handled in the Linux kernel. As Boris explains in the cover letter, his series aims at clarifying the relationship between the mtd and nand_chip structures and hiding NAND framework internals to NAND. […]. This allows removal of some of the boilerplate code done in all NAND controller drivers, but most importantly, it unifies a bit the way NAND chip structures are instantiated.
  • On the support for the Marvell ARM processors:
    • In the mvneta networking driver (used on Armada 370, XP, 38x and soon on Armada 3700): addition of naive RSS support with per-CPU queues, configure XPS support, numerous fixes for potential race conditions.
    • Fix in the Marvell CESA driver
    • Misc improvements to the mv_xor driver for the Marvell XOR engines.
    • After four years of development the 32-bits Marvell EBU platform support is now pretty mature and the majority of patches for this platform now are improvements of existing drivers or bug fixes rather than new hardware support. Of course, the support for the 64-bits Marvell EBU platform has just started, and will require a significant number of patches and contributions to be fully supported upstream, which is an on-going effort.
  • On the support for the Atmel ARM processors:
    • Addition of the support for the L+G VInCo platform.
    • Improvement to the macb network driver to reset the PHY using a GPIO.
    • Fix Ethernet PHY issues on Atmel SAMA5D4
  • On the support for Allwinner ARM processors:
    • Implement audio capture in the sun4i audio driver.
    • Add the support for a special pin controller available on Allwinner A80.

The complete list of our contributions: