At the end of 2016, the MIPI consortium has finalized the first version of its I3C specification, a new communication bus that aims at replacing older busses like I2C or SPI. According to the specification, I3C gets closer to SPI data rate while requiring less pins and adding interesting mechanisms like in-band interrupts, hotplug capability or automatic discovery of devices connected on the bus. In addition, I3C provides backward compatibility with I2C: I3C and legacy I2C devices can be connected on a common bus controlled by an I3C master.
For more details about I3C, we suggest reading the MIPI I3C Whitepaper, as unfortunately MIPI has not publicly released the specifications for this protocol.
For the last few months, Bootlin engineer Boris Brezillon has been working with Cadence to develop a Linux kernel subsystem to support this new bus, as well as Cadence’s I3C master controller IP. We have now posted the first version of our patch series to the Linux kernel mailing list for review, and we already received a large number of very useful comments from the kernel community.
Bootlin is proud to be pioneering the support for this new bus in the Linux kernel, and hopes to see other developers contribute to this subsystem in the near future!
Linus Torvalds has released the 4.12 Linux kernel a week ago, in what is the second biggest kernel release ever by number of commits. As usual, LWN had a very nice coverage of the major new features and improvements: first part, second part and third part.
LWN has also published statistics about the Linux 4.12 development cycles, showing:
Bootlin as the #14 contributing company by number of commits, with 221 commits, between Broadcom (230 commits) and NXP (212 commits)
Bootlin as the #14 contributing company number of changed lines, with 16636 lines changed, just two lines less than Mellanox
Bootlin engineer and MTD NAND maintainer Boris Brezillon as the #17 most active contributor by number of lines changed.
Our most important contributions to this kernel release have been:
On Atmel AT91 and SAMA5 platforms:
Alexandre Belloni has continued to upstream the support for the SAMA5D2 backup mode, which is a very deep suspend to RAM state, offering very nice power savings. Alexandre touched the core code in arch/arm/mach-at91 as well as pinctrl and irqchip drivers
Boris Brezillon has converted the Atmel PWM driver to the atomic API of the PWM subsystem, implemented suspend/resume and did a number of fixes in the Atmel display controller driver, and also removed the no longer used AT91 Parallel ATA driver.
Quentin Schulz improved the suspend/resume hooks in the atmel-spi driver to support the SAMA5D2 backup mode.
On Allwinner platforms:
Mylène Josserand has made a number of improvements to the sun8i-codec audio driver that she contributed a few releases ago.
Maxime Ripard added devfreq support to dynamically change the frequency of the GPU on the Allwinner A33 SoC.
Quentin Schulz added battery charging and ADC support to the X-Powers AXP20x and AXP22x PMICs, found on Allwinner platforms.
Quentin Schulz added a new IIO driver to support the ADCs found on numerous Allwinner SoCs.
Quentin Schulz added support for the Allwinner A33 built-in thermal sensor, and used it to implement thermal throttling on this platform.
On Marvell platforms:
Antoine Ténart contributed Device Tree changes to describe the cryptographic engines found in the Marvell Armada 7K and 8K SoCs. For now only the Device Tree description has been merged, the driver itself will arrive in Linux 4.13.
Grégory Clement has contributed a pinctrl and GPIO driver for the Marvell Armada 3720 SoC (Cortex-A53 based)
Grégory Clement has improved the Device Tree description of the Marvell Armada 3720 and Marvell Armada 7K/8K SoCs and corresponding evaluation boards: SDHCI and RTC are now enabled on Armada 7K/8K, USB2, USB3 and RTC are now enabled on Armada 3720.
Thomas Petazzoni made a significant number of changes to the mvpp2 network driver, finally adding support for the PPv2.2 version of this Ethernet controller. This allowed to enable network support on the Marvell Armada 7K/8K SoCs.
Thomas Petazzoni contributed a number of fixes to the mv_xor_v2dmaengine driver, used for the XOR engines on the Marvell Armada 7K/8K SoCs.
Thomas Petazzoni cleaned-up the MSI support in the Marvell pci-mvebu and pcie-aardvark PCI host controller drivers, which allowed to remove a no-longer used MSI kernel API.
On the ST SPEAr600 platform:
Thomas Petazzoni added support for the ADC available on this platform, by adding its Device Tree description and fixing a clock driver bug
Thomas did a number of small improvements to the Device Tree description of the SoC and its evaluation board
Thomas cleaned up the fsmc_nand driver, which is used for the NAND controller driver on this platform, removing lots of unused code
In the MTD NAND subsystem:
Boris Brezillon implemented a mechanism to allow vendor-specific initialization and detection steps to be added, on a per-NAND chip basis. As part of this effort, he has split into multiple files the vendor-specific initialization sequences for Macronix, AMD/Spansion, Micron, Toshiba, Hynix and Samsung NANDs. This work will allow in the future to more easily exploit the vendor-specific features of different NAND chips.
Other contributions:
Maxime Ripard added a display panel driver for the ST7789V LCD controller
In addition, several Bootlin engineers are also maintainers of various kernel subsystems. During this release cycle, they reviewed and merged a number of patches from kernel contributors:
Maxime Ripard, as the Allwinner co-maintainer, merged 94 patches
Boris Brezillon, as the NAND maintainer and MTD co-maintainer, merged 64 patches
Alexandre Belloni, as the RTC maintainer and Atmel co-maintainer, merged 38 patches
Grégory Clement, as the Marvell EBU co-maintainer, merged 32 patches
The details of all our contributions for this release:
Since April 2016, we have our own automated testing infrastructure to validate the Linux kernel on a large number of hardware platforms. We use this infrastructure to contribute to the KernelCI project, which tests every day the Linux kernel. However, the tests being done by KernelCI are really basic: it’s mostly booting a basic Linux system and checking that it reaches a shell prompt.
However, LAVA, the software component at the core of this testing infrastructure, can do a lot more than just basic tests.
The need for custom tests
With some of our engineers being Linux maintainers and given all the platforms we need to maintain for our customers, being able to automatically test specific features beyond a simple boot test was a very interesting goal.
In addition, manually testing a kernel change on a large number of hardware platforms can be really tedious. Being able to quickly send test jobs that will use an image you built on your machine can be a great advantage when you have some new code in development that affects more than one board.
We identified two main use cases for custom tests:
Automatic tests to detect regression, as does KernelCI, but with more advanced tests, including platform specific tests.
Manual tests executed by engineers to validate that the changes they are developing do not break existing features, on all platforms.
An appropriate root filesystem, that contains the various userspace programs needed to execute the tests (benchmarking tools, validation tools, etc.)
A test suite, which contains various scripts executing the tests
A custom test tool that glues together the different components
The custom test tool knows all the hardware platforms available and which tests and kernel configurations apply to which hardware platforms. It identifies the appropriate kernel image, Device Tree, root filesystem image and test suite and submits a job to LAVA for execution. LAVA will download the necessary artifacts and run the job on the appropriate device.
Building custom rootfs
When it comes to test specific drivers, dedicated testing, validation or benchmarking tools are sometimes needed. For example, for storage device testing, bonnie++ can be used, while iperf is nice for networking testing. As the default root filesystem used by KernelCI is really minimalist, we need to build our owns, one for each architecture we want to test.
Buildroot is a simple yet efficient tool to generate root filesystems, it is also used by KernelCI to build their minimalist root filesystems. We chose to use it and made custom configuration files to match our needs.
We ended up with custom rootfs built for ARMv4, ARMv5, ARMv7, and ARMv8, that embed for now Bonnie++, iperf, ping (not the Busybox implementation) and other tiny tools that aren’t included in the default Buildroot configuration.
Our Buildroot fork that includes our custom configurations is available as the buildroot-ci Github project (branch ci).
The custom test tool
The custom test tool is the tool that binds the different elements of the overall architecture together.
One of the main features of the tool is to send jobs. Jobs are text files used by LAVA to know what to do with which device. As they are described in LAVA as YAML files (in the version 2 of the API), it is easy to use templates to generate them based on a single model. Some information is quite static such as the device tree name for a given board or the rootfs version to use, but other details change for every job such as the kernel to use or which test to run.
We made a tool able to get the latest kernel images from KernelCI to quickly send jobs without having a to compile a custom kernel image. If the need is to test a custom image that is built locally, the tool is also able to send files to the LAVA server through SSH, to provide a custom kernel image.
The entry point of the tool is ctt.py, which allows to create new jobs, providing a lot of options to define the various aspects of the job (kernel, Device Tree, root filesystem, test, etc.).
This tool is written in Python, and lives in the custom_tests_tool Github project.
The test suite
The test suite is a set of shell scripts that perform tests returning 0 or 1 depending on the result. This test suite is included inside the root filesystem by LAVA as part of a preparation step for each job.
We currently have a small set of tests:
boot test, which simply returns 0. Such a test will be successful as soon as the boot succeeds.
crypto test, to do some minimal testing of cryptographic engines
usb test, to test USB functionality using mass storage devices
simple network test, that just validates network connectivity using ping
All those tests only require the target hardware platform itself. However, for more elaborate network tests, we needed to get two devices to interact with each other: the target hardware platform and a reference PC platform. For this, we use the LAVA MultiNode API. It allows to have a test that spans multiple devices, which we use to perform multiple iperf sessions to benchmark the bandwidth. This test has therefore one part running on the target device (network-board) and one part running on the reference PC platform (network-laptop).
Our current test suite is available as the test_suite Github project. It is obviously limited to just a few tests for now, we hope to extend the tests in the near future.
First use case: daily tests
As previously stated, it’s important for us to know about regressions introduced in the upstream kernel. Therefore, we have set up a simple daily cron job that:
Sends custom jobs to all boards to validate the latest mainline Linux kernel and latest linux-nextli>
Aggregates results from the past 24 hours and sends emails to subscribed addresses
Updates a dashboard that displays results in a very simple page
Second use case: manual tests
The custom test tool ctt.py has a simple command line interface. It’s easy for someone to set it up and send custom jobs. For example:
ctt.py -b beaglebone-black -m network
will start the network test on the BeagleBone Black, using the latest mainline Linux kernel built by KernelCI. On the other hand:
will run the mmc test on the Marvell Armada 7040 and Armada 8040 development boards, using the locally built kernel image and Device Tree.
The result of the job is sent over e-mail when the test has completed.
Conclusion
Thanks to this custom test tool, we now have an infrastructure that leverages our existing lab and LAVA instance to execute more advanced tests. Our goal is now to increase the coverage, by adding more tests, and run them on more devices. Of course, we welcome feedback and contributions!
With 137 patches contributed, Bootlin is the 18th contributing company according to the Kernel Patch Statistics. Bootlin engineer Maxime Ripard appears in the list of top contributors by changed lines in the LWN statistics.
Our most important contributions to this release have been:
Support for Atmel platforms
Alexandre Belloni improved suspend/resume support for the Atmel watchdog driver, I2C controller driver and UART controller driver. This is part of a larger effort to upstream support for the backup mode of the Atmel SAMA5D2 SoC.
Alexandre Belloni also improved the at91-poweroff driver to properly shutdown LPDDR memories.
Boris Brezillon contributed a fix for the Atmel HLCDC display controller driver, as well as fixes for the atmel-ebi driver.
Support for Allwinner platforms
Boris Brezillon contributed a number of improvements to the sunxi-nand driver.
Mylène Josserand contributed a new driver for the digital audio codec on the Allwinner sun8i SoC, as well a the corresponding Device Tree changes and related fixes. Thanks to this driver, Mylène enabled audio support on the R16 Parrot and A33 Sinlinx boards.
Maxime Ripard contributed numerous improvements to the sunxi-mmc MMC controller driver, to support higher data rates, especially for the Allwinner A64.
Maxime Ripard contributed official Device Tree bindings for the ARM Mali GPU, which allows the GPU to be described in the Device Tree of the upstream kernel, even if the ARM kernel driver for the Mali will never be merged upstream.
Maxime Ripard contributed a number of fixes for the rtc-sun6i driver.
Maxime Ripard enabled display support on the A33 Sinlinx board, by contributing a panel driver and the necessary Device Tree changes.
Maxime Ripard continued his clean-up effort, by converting the GR8 and sun5i clock drivers to the sunxi-ng clock infrastructure, and converting the sun5i pinctrl driver to the new model.
Quentin Schulz added a power supply driver for the AXP20X and AXP22X PMICs used on numerous Allwinner platforms, as well as numerous Device Tree changes to enable it on the R16 Parrot and A33 Sinlinx boards.
Support for Marvell platforms
Grégory Clement added support for the RTC found in the Marvell Armada 7K and 8K SoCs.
Grégory Clement added support for the Marvell 88E6141 and 88E6341 Ethernet switches, which are used in the Armada 3700 based EspressoBin development board.
Romain Perier enabled the I2C controller, SPI controller and Ethernet switch on the EspressoBin, by contributing Device Tree changes.
Thomas Petazzoni contributed a number of fixes to the OMAP hwrng driver, which turns out to also be used on the Marvell 7K/8K platforms for their HW random number generator.
Thomas Petazzoni contributed a number of patches for the mvpp2 Ethernet controller driver, preparing the future addition of PPv2.2 support to the driver. The mvpp2 driver currently only supports PPv2.1, the Ethernet controller used on the Marvell Armada 375, and we are working on extending it to support PPv2.2, the Ethernet controller used on the Marvell Armada 7K/8K. PPv2.2 support is scheduled to be merged in 4.12.
Support for RaspberryPi platforms
Boris Brezillon contributed Device Tree changes to enable the VEC (Video Encoder) on all bcm283x platforms. Boris had previously contributed the driver for the VEC.
In addition to our direct contributions, a number of Bootlin engineers are also maintainers of various subsystems in the Linux kernel. As part of this maintenance role:
Maxime Ripard, co-maintainer of the Allwinner ARM platform, reviewed and merged 85 patches from contributors
Alexandre Belloni, maintainer of the RTC subsystem and co-maintainer of the Atmel ARM platform, reviewed and merged 60 patches from contributors
Grégory Clement, co-maintainer of the Marvell ARM platform, reviewed and merged 42 patches from contributors
Boris Brezillon, maintainer of the MTD NAND subsystem, reviewed and merged 8 patches from contributors
Here is the detailed list of contributions, commit per commit:
For a few months, Bootlin has been helping the Raspberry Pi Foundation upstream to the Linux kernel a number of display related features for the Raspberry Pi platform.
The main goal behind this upstreaming process is to get rid of the closed-source firmware that is used on non-upstream kernels every time you need to enable/access a specific hardware feature, and replace it by something that is both open-source and compliant with upstream Linux standards.
Eric Anholt has been working hard to upstream display related features. His biggest contribution has certainly been the open-source driver for the VC4 GPU, but he also worked on the display controller side, and we were contracted to help him with this task.
Our first objective was to add support for SDTV (composite) output, which appeared to be much easier than we imagined. As some of you might already know, the display controller of the Raspberry Pi already has a driver in the DRM subsystem. Our job was to add support for the SDTV encoder (also called VEC, for Video EnCoder). The driver has been submitted just before the 4.10 merge window and surprisingly made it into 4.10 (see also the patches). Eric Anholt explained on his blog:
The Raspberry Pi Foundation recently started contracting with Bootlin to give me some support on the display side of the stack. Last week I got to review and release their first big piece of work: Boris Brezillon’s code for SDTV support. I had suggested that we use this as the first project because it should have been small and self contained. It ended up that we had some clock bugs Boris had to fix, and a bug in my core VC4 CRTC code, but he got a working patch series together shockingly quickly. He did one respin for a couple more fixes once I had tested it, and it’s now out on the list waiting for devicetree maintainer review. If nothing goes wrong, we should have composite out support in 4.11 (we’re probably a week late for 4.10).
Our second objective was to help Eric with HDMI audio support. The code has been submitted on the mailing list 2 weeks ago and will hopefully be queued for 4.12. This time on, we didn’t write much code, since Eric already did the bulk of the work. What we did though is debugging the implementation to make it work. Eric also explained on his blog:
Probably the biggest news of the last two weeks is that Boris’s native HDMI audio driver is now on the mailing list for review. I’m hoping that we can get this merged for 4.12 (4.10 is about to be released, so we’re too late for 4.11). We’ve tested stereo audio so far, no compresesd audio (though I think it should Just Work), and >2 channel audio should be relatively small amounts of work from here. The next step on HDMI audio is to write the alsalib configuration snippets necessary to hide the weird details of HDMI audio (stereo IEC958 frames required) so that sound playback works normally for all existing userspace, which Boris should have a bit of time to work on still.
On our side, it has been a great experience to work on such topics with Eric, and you should expect more contributions from Bootlin for the Raspberry Pi platform in the next months, so stay tuned!
Linus Torvalds has released the 4.9 Linux kernel yesterday, as was expected. With 16214 non-merge commits, this is by far the busiest kernel development cycle ever, but in large part due to the merging of thousands of commits to add support for Greybus. LWN has very well summarized what’s new in this kernel release: 4.9 Merge window part 1, 4.9 Merge window part 2, The end of the 4.9 merge window.
As usual, we take this opportunity to look at the contributions Bootlin made to this kernel release. In total, we contributed 116 non-merge commits. Our most significant contributions this time have been:
Contribution of an input ADC resistor ladder driver, written by Alexandre Belloni. As explained in the commit log: common way of multiplexing buttons on a single input in cheap devices is to use a resistor ladder on an ADC. This driver supports that configuration by polling an ADC channel provided by IIO.
On Atmel platforms, improvements to clock handling, bug fix in the Atmel HLCDC display controller driver.
On Marvell EBU platforms
Addition of clock drivers for the Marvell Armada 3700 (Cortex-A53 based), by Grégory Clement
Several bug fixes and improvements to the Marvell CESA driver, for the crypto engine founds in most Marvell EBU processors. By Romain Perier and Thomas Petazzoni
Support for the PIC interrupt controller, used on the Marvell Armada 7K/8K SoCs, currently used for the PMU (Performance Monitoring Unit). By Thomas Petazzoni.
Enabling of Armada 8K devices, with support for the slave CP110 and the first Armada 8040 development board. By Thomas Petazzoni.
On Allwinner platforms
Addition of GPIO support to the AXP209 driver, which is used to control the PMIC used on most Allwinner designs. Done by Maxime Ripard.
Initial support for the Nextthing GR8 SoC. By Mylène Josserand and Maxime Ripard (pinctrl driver and Device Tree)
The improved sunxi-ng clock code, introduced in Linux 4.8, is now used for Allwinner A23 and A33. Done by Maxime Ripard.
Add support for the Allwinner A33 display controller, by re-using and extending the existing sun4i DRM/KMS driver. Done by Maxime Ripard.
Addition of bridge support in the sun4i DRM/KMS driver, as well as the code for a RGB to VGA bridge, used by the C.H.I.P VGA expansion board. By Maxime Ripard.
Numerous cleanups and improvements commits in the UBI subsystem, in preparation for merging the support for Multi-Level Cells NAND, from Boris Brezillon.
Improvements in the MTD subsystem, by Boris Brezillon:
Addition of mtd_pairing_scheme, a mechanism which allows to express the pairing of NAND pages in Multi-Level Cells NANDs.
Improvements in the selection of NAND timings.
In addition, a number of Bootlin engineers are also maintainers in the Linux kernel, so they review and merge patches from other developers, and send pull requests to other maintainers to get those patches integrated. This lead to the following activity:
Maxime Ripard, as the Allwinner co-maintainer, merged 78 patches from other developers.
Grégory Clement, as the Marvell EBU co-maintainer, merged 43 patches from other developers.
Alexandre Belloni, as the RTC maintainer and Atmel co-maintainer, merged 26 patches from other developers.
Boris Brezillon, as the MTD NAND maintainer, merged 24 patches from other developers.
The complete list of our contributions to this kernel release:
As stated in a previous blog post, we officially launched our lab on 2016, April 25th and it is contributing to KernelCI since then. In a series of blog post, we’d like to present in details how our lab is working.
Because of Linux’s well-known ability to run on numerous platforms and the obvious impossibility for developers to test changes on all these platforms, continuous integration has a big role to play in Linux kernel development and maintenance.
More generally, continuous integration is made up of three different steps:
building the software which in our case is the Linux kernel,
testing the software,
reporting the tests results;
KernelCI checks hourly if one of the Git repositories it tracks have been updated. If it’s the case then it builds, from the last commit, the kernel for ARM, ARM64 and x86 platforms in many configurations. Then it stores all these builds in a publicly available storage.
Once the kernel images have been built, KernelCI itself is not in charge of testing it on hardware. Instead, it delegates this work to various labs, maintained by individuals or organizations. In the following section, we will discuss the software architecture needed to create such a lab, and receive testing requests from KernelCI.
Core software component: LAVA
At this moment, LAVA is the only supported software by KernelCI but note that KernelCI offers an API, so if LAVA does not meet your needs, go ahead and make your own!
What is LAVA?
LAVA is a self-hosted software, organized in a server-dispatcher model, for controlling boards, to automate boot, bootloader and user-space testing. The server receives jobs specifying what to test, how and on which boards to run those tests, and transmits those jobs to the dispatcher linked to the specified board. The dispatcher applies all modifications on the kernel image needed to make it boot on the said board and then fully interacts with it through the serial.
Since LAVA has to fully and autonomously control boards, it needs to:
interact with the board through serial connection,
control the power supply to reset the board in case of a frozen kernel,
know the commands needed to boot the kernel from the bootloader,
serve files (kernel, DTB, rootfs) to the board.
The first three requirements are fulfilled by LAVA thanks to per-board configuration files. The latter is done by the LAVA dispatcher in charge of the board, which downloads files specified in the job and copies them to a directory accessible by the board through TFTP.
LAVA organizes the lab in devices and device types. All identical devices are from the same device type and share the same device type configuration file. It contains the set of bootloader instructions to boot the kernel (e.g.: how and where to load files) and the bootloader configuration (e.g.: can it boot zImages or only uImages). A device configuration file stores the commands run by a dispatcher to interact with the device: how to connect to serial, how to power it on and off. LAVA interacts with devices via external tools: it has support for conmux or telnet to communicate via serial and power commands can be executed by custom scripts (pdudaemon for example).
Control power supply
Some labs use expensive Switched PDUs to control the power supply of each board but, as discussed in our previous blog post we went for several Devantech ETH008 Ethernet-controlled relay boards instead.
Linaro, the organization behind LAVA, has also developed a software for controlling power supplies of each board, called pdudaemon. We added support for most Devantech relay boards to pdudaemon.
Connect to serial
As advised in LAVA’s installation guide, we went with telnet and ser2net to connect the serial port of our boards. Ser2net basically opens a Linux device and allows to interact with it through a TCP socket on a defined port. A LAVA dispatcher will then launch a telnet client to connect to a board’s serial port. Because of the well-known fact that Linux devices name might change between reboots, we had to use udev rules in order to guarantee the serial we connect to is the one we want to connect to.
Actual testing
Now that LAVA knows how to handle devices, it has to run jobs on those devices. LAVA jobs contain which images to boot (kernel, DTB, rootfs), what kind of tests to run when in user space and where to find them. A job is strongly linked to a device type since it contains the kernel and DTB specifically built for this device type.
Those jobs are submitted to the different labs by the KernelCI project. To do so, KernelCI uses a tool called lava-ci. Amongst other things, this tool contains a big table of the supported platforms, associating the Device Tree name with the corresponding hardware platform name. This way, when a new kernel gets built by KernelCI, and produces a number of Device Tree Blobs (.dtb files), lava-ci knows what are the corresponding hardware platforms to run the kernel on. It submits the jobs to all the labs, which will then only run the tests for which they have the necessary hardware platform. We have contributed a number of patches to lava-ci, adding support for the new platforms we had in our lab.
Reporting test results
After KernelCI has built the kernel, sent jobs to contributing labs and LAVA has run the jobs, KernelCI will then get the tests results from the labs, aggregate them on its website and notify maintainers of errors via a mailing list.
Challenges encountered
As in any project, we stumbled on some difficulties. The biggest problems we had to take care of were board-specific problems.
Some boards like the Marvell RD-370 need a rising edge on a pin to boot, meaning we cannot avoid pressing the reset button between each boot. To work out this problem, we had to customize the hardware (swap resistors) to bypass this limitation.
Some other boards lose their serial connection. Some lose it when resetting their power but recover it after a few seconds, problem we found acceptable to solve by infinitely reconnecting to the serial. However, we still have a problem with a few boards which randomly close their serial connection without any reason. After that, we are able to connect to the serial connection again but it does not send any character. The only way to get it to work again is to physically re-plug the cable used by the serial connection. Unfortunately, we did not find yet a way to solve this bug.
The Linux kernel of our server refused to bind more than 13 USB devices when it was time to create a second drawer of boards. After some research, we found out the culprit was the xHCI driver. In modern computers, it is possible to disable xHCI support in the BIOS but this option was not present in our server’s BIOS. The solution was to rebuild and install a kernel for the server without the xHCI driver compiled. From that day, the number of USB devices is limited to 127 as in the USB specification.
Conclusion
We have now 35 boards in our lab, with some being the only ones represented in KernelCI. We encourage anyone, hobbyists or companies, to contribute to the effort of bringing continuous integration of the Linux kernel by building your own lab and adding as many boards as you can.
As stated in a previous blog post, we officially launched our lab on 2016, April 25th and it is contributing to KernelCI since then. In a series of blog post, we’d like to present in details how our lab is working, starting with this first blog post that details the hardware infrastructure of our lab.
Introduction
In a lab built for continuous integration, everything has to be fully automated from the serial connections to power supplies and network connections.
To gather as much information as we can get to establish the specifications of the lab, our engineers filled a spreadsheet with all boards they wanted to have in the lab and their specificities in terms of connectors used the serial port communication and power supply. We reached around 50 boards to put into our lab. Among those boards, we could distinguish two different types:
boards which are powered by an ATX power supply,
boards which are powered by different power adapters, providing either 5V or 12V.
Another design criteria was that we wanted to easily allow our engineers to take a board out of the lab or to add one. The easier the process is, the better the lab is.
Home made cabinet
To meet the size constraints of Bootlin office, we had to make the lab fit in a 100cm wide, 75cm deep and 200cm high space. In order to achieve this, we decided to build the lab as a large home made cabinet, with a number of drawers to easily access, change or replace the boards hosted in the lab. As some of our boards provide PCIe connectors, we needed to provide enough height for each drawer, and after doing a few measurements, decided that a 25cm height for our drawers would be fine. With a total height of 200cm, this gives a maximum of 8 drawers.
In addition, it turns out that most of our boards powered by ATX power supplies are rather large in size, while the ones powered by regular power adapters are usually much smaller. In order to simplify the overall design, we decided that all large boards would be grouped together on a given set of drawers, and all small boards would be grouped together on another set of drawers: i.e we would not mix large and small boards in the same drawer. With the 100cm x 75cm size limitation, this meant a drawer for small boards could host up to 8 boards, while a drawer for large boards could host up to 4 boards. From the spreadsheet containing all the boards supposed to be in the lab, we eventually decided there would be 3 large drawers for up to 12 large boards and 5 small drawers for up to 40 small or medium-sized boards.
Furthermore, since the lab will host a server and a lot of boards and power supplies, potentially producing a lot of heat, we have to keep the lab as open as it can be while making sure it is strong enough to hold the drawers. We ended up building our own cabinet, made of wood bought from the local hardware store.
We also want the server to be part of the lab. We already have a small piece of wood to strengthen the lab between the fourth and sixth drawers we could use to fix the server. We decided to give a mini-PC (NUC-like) a try, because, after all, it’s only communicating with the serial of each board and serving files to them. Thus, everything related to the server is fixed and wired behind the lab.
Make the lab autonomous
What continuous integration for the Linux kernel typically needs are control of:
the power for each board
serial port connection
a way to send files to test, typically the kernel image and associated files
In Bootlin lab, these different tasks are handled by a dedicated server, itself hosted in the lab.
Serial port control
Serial connections are mostly handled via USB on the server side but there are many different connectors on the target side (in our lab, we have 6 different connectors: DE9, microUSB, miniUSB, 2.54″ male pins, 2.54″ female pins and USB-B). Therefore, our server has to have a physical connection with each of the 50 boards present in the lab. The need for USB hubs is then obvious.
Since we want as few cables connecting the server and the drawers as possible, we decided to have one USB hub per drawer, be it a large drawer or a small drawer. In a small drawer, up to 8 boards can be present, meaning the hub needs at least 8 USB ports. In a large drawer, up to 4 serial connections can be needed so smaller and more common USB hubs can do the work. Since the serial connection may draw some current on the USB port, we wanted all of our USB hubs to be powered with a dedicated power supply.
All USB hubs are then connected to a main USB hub which in turn is connected to our server.
Power supply control
Our server needs to control each board’s power to be able to automatically power on or off a board. It will power on the board when it needs to test a new kernel on it and power it off at the end of the test or when the kernel has frozen or could not boot at all.
In terms of power supplies, we initially investigated using Ethernet-controlled multi-sockets (also called Switched PDU), such as this device. Unfortunately, these devices are quite expensive, and also often don’t provide the most appropriate connector to plug the cheap 5V/12V power adapters used by most boards.
So, instead, and following a suggestion from Kevin Hilman (one of KernelCI’s founder and maintainer), we decided to use regular ATX power supplies. They have the advantage of being inexpensive, and providing enough power for multiple boards and all their peripherals, potentially including hard drives or other power-hungry peripherals. ATX power supplies also have a pin, called PS_ON#, which when tied to the ground, powers up the ATX power supply. This easily allows to turn an ATX power supply on or off.
In conjunction with the ATX power supplies, we have a selected Ethernet-controlled relay board, the Devantech ETH008, which contains 8 relays that can be remote controlled over the network.
This gives us the following architecture:
For the drawers with large boards powered by ATX directly, we have one ATX power supply per board. The PS_ON pin from the ATX power supply is cut and rewired to the Ethernet controlled relay. Thanks to the relay, we control if PS_ON is tied to the ground or not. If it’s tied to the ground, then the board boots, when it’s untied from the ground, the board is powered off.
For the drawers with small boards, we have a single ATX power supply per drawer. The 12V and 5V rails from the ATX power supply are then dispatched through the 8-relay board, then connected to the appropriate boards, through DC barrel or mini-USB/micro-USB cables, depending on the board. The PS_ON is always tied to the ground, so those ATX power supplies are constantly on.
In addition, we have added a bit of over-voltage protection, by adding transient-voltage-suppression diodes for each voltage output in each drawer. These diodes will absorb all the voltage when it exceeds the maximum authorized value and explode, and are connected in parallel in the circuit to protect.
Network connectivity
As part of the continuous integration process, most of our boards will have to fetch the Linux kernel to test (and potentially other related files) over the network through TFTP. So we need all boards to be connected to the server running the continuous integration software.
Since a single 52 port switch is both fairly expensive, and not very convenient in terms of wiring in our situation, we instead opted for adding 8-port Gigabit switches to each drawer, all of them being connected via a central 16-port Gigabit switch located at the back of the home made cabinet. This central switch not only connects the per-drawer switches, but also the server running the continuous integration software, and the wider Internet.
In-drawer architecture: large boards
A drawer designed for large boards, powered by an ATX power supply contains the following components:
Up to four boards
Four ATX power-supplies, with their PS_ON# connected to an 8-port relay controller. Only 4 of the 8 ports are used on the relay.
One 8-port Ethernet-controlled relay board.
One 4-port USB hub, connecting to the serial ports of the four boards.
One 8-port Ethernet switch, with 4 ports used to connect to the boards, one port used to connect to the relay board, and one port used for the upstream link.
One power strip to power the different components.
In drawer architecture: small boards
A drawer designed for small boards contains the following components:
Up to eight boards
One ATX power-supply, with its 5V and 12V rails going through the 8-port relay controller. All ports in the relay are used when 8 boards are present.
One 8-port Ethernet-controlled relay board.
One 10-port USB hub, connecting to the serial ports of the eight boards.
Two 8-port Ethernet switches, connecting the 8 boards, the relay board and an upstream link.
One power strip to power the different components.
Server
At the back of the home made cabinet, a mini PC runs the continuous integration software, that we will discuss in a future blog post. This mini PC is connected to:
A main 16-port Gigabit switch, itself connected to all the Gigabit switches in the different drawers
A main USB hub, itself connected to all the USB hubs in the different drawers
As expected, this allows the server to control the power of the different boards, access their serial port, and provide network connectivity.
Detailed component list
If you’re interested by the specific components we’ve used for our lab, here is the complete list, with the relevant links:
4-pins Molex to SATA connector, used as an extension cord for splitting outputs from ATX power supply instead of cutting the ATX power supply’s wires for small drawers,
20-pins Molex extension cord, used as an extension cord for splitting outputs from ATX power supply instead of cutting the ATX power supply’s wires for big drawers,
18 AWG cable, to split outputs of the ATX power supplies,
And also: Ethernet cables, wood panels, drawer slides, transient voltage suppression diodes, serial cables and power strips.
Conclusion
Hopefully, sharing these details about the hardware architecture of our board farm will help others to create a similar automated testing infrastructure. We are of course welcoming feedback on this hardware architecture!
Stay tuned for our next blog post about the software architecture of our board farm.
Boris Brezillon improved the Rockchip PWM driver to avoid glitches basing that work on his previous improvement to the PWM subsystem already merged in the kernel. He also fixed a few issues and shortcomings in the pwm regulator driver. This is finishing his work on the Rockchip based Chromebook platforms where a PWM is used for a regulator.
While working on the driver for the sii902x HDMI transceiver, Boris Brezillon did a cleanup of many DRM drivers. Those drivers were open coding the encoder selection. This is now done in the core DRM subsystem.
On the support of Atmel platforms
Alexandre Belloni cleaned up the existing board device trees, removing unused clock definitions and starting to remove warnings when compiling with the Device Tree Compiler (dtc).
On the support of Allwinner platforms
Maxime Ripard contributed a brand new infrastructure, named sunxi-ng, to manage the clocks of the Allwinner platforms, fixing shortcomings of the Device Tree representation used by the existing implementation. He moved the support of the Allwinner H3 clocks to this new infrastructure.
Maxime also developed a driver for the Allwinner A10 Digital Audio controller, bringing audio support to this platform.
Boris Brezillon improved the Allwinner NAND controller driver to support DMA assisted operations, which brings a very nice speed-up to throughput on platforms using NAND flashes as the storage, which is the case of Nextthing’s C.H.I.P.
Quentin Schulz added support for the Allwinner R16 EVB (Parrot) board.
On the support of Marvell platforms
Grégory Clément added multiple clock definitions for the Armada 37xx series of SoCs.
He also corrected a few issues with the I/O coherency on some Marvell SoCs
Romain Perier worked on the Marvell CESA cryptography driver, bringing significant performance improvements, especially for dmcrypt usage. This driver is used on numerous Marvell platforms: Orion, Kirkwood, Armada 370, XP, 375 and 38x.
Thomas Petazzoni submitted a driver for the Aardvark PCI host controller present in the Armada 3700, enabling PCI support for this platform.
Thomas also added a driver for the new XOR engine found in the Armada 7K and Armada 8K families
Here are in details, the different contributions we made to this release:
LWN.net has published yesterday an article containing statistics for the 4.7 development cycle. This article is available for LWN.net subscribers only during the coming week, and will then be available for everyone, free of charge.
It turns out that Boris Brezillon, Bootlin engineer, is the second most active contributor to the 4.7 kernel in number of commits! The top three contributors in number of commits are: H Hartley Sweeten (208 commits), Boris Brezillon (132 commits) and Al Viro (127 commits).
In addition to being present in the most active developers by number of commits, Boris Brezillon is also in the #11 most active contributor in terms of changed lines. As we discussed in our previous blog post, most contributions from Boris were targeted at the PWM subsystem on one side (atomic update support) and the NAND subsystem on the other side.
Another Bootlin engineer shows up in the per-developer statistics: Maxime Ripard is the #17 most active contributor by lines changed. Indeed, Maxime contributed a brand new DRM/KMS driver for the Allwinner display controller.
As a company, Bootlin is ranked for the 4.7 kernel as the #12 most active company by number of commits, and #10 by number of changed lines. We are glad to continue being such a contributor to the Linux kernel development, as we have been for the last four years. If you want your hardware to be supported in the official Linux kernel, contact us!