Software architecture of Bootlin’s lab

As stated in a previous blog post, we officially launched our lab on April 25th, 2016, and it has been contributing to KernelCI ever since. In a series of blog posts, we’d like to present in detail how our lab works.

We previously introduced the lab and its integration in KernelCI, and presented its hardware infrastructure. Now it is time to explain how it actually works on the software side.

Continuous integration in Linux kernel

Because of Linux’s well-known ability to run on numerous platforms and the obvious impossibility for developers to test changes on all these platforms, continuous integration has a big role to play in Linux kernel development and maintenance.

More generally, continuous integration is made up of three different steps:

  • building the software, which in our case is the Linux kernel,
  • testing the software,
  • reporting the test results.
KernelCI complete process

KernelCI checks hourly whether any of the Git repositories it tracks has been updated. If so, it builds the kernel from the latest commit for ARM, ARM64 and x86 platforms in many configurations. It then stores all these builds in publicly available storage.

Once the kernel images have been built, KernelCI itself is not in charge of testing them on hardware. Instead, it delegates this work to various labs, maintained by individuals or organizations. In the following section, we will discuss the software architecture needed to create such a lab and receive testing requests from KernelCI.

Core software component: LAVA

At the moment, LAVA is the only software supported by KernelCI, but note that KernelCI offers an API, so if LAVA does not meet your needs, go ahead and make your own!

What is LAVA?

LAVA is self-hosted software, organized in a server-dispatcher model, for controlling boards in order to automate boot, bootloader and user-space testing. The server receives jobs specifying what to test, how, and on which boards to run those tests, and transmits those jobs to the dispatcher linked to the specified board. The dispatcher applies all the modifications needed to make the kernel image boot on that board, and then interacts with the board entirely through its serial connection.

Since LAVA has to fully and autonomously control boards, it needs to:

  • interact with the board through serial connection,
  • control the power supply to reset the board in case of a frozen kernel,
  • know the commands needed to boot the kernel from the bootloader,
  • serve files (kernel, DTB, rootfs) to the board.

The first three requirements are fulfilled by LAVA thanks to per-board configuration files. The last one is handled by the LAVA dispatcher in charge of the board, which downloads the files specified in the job and copies them to a directory accessible by the board through TFTP.

LAVA organizes the lab in devices and device types. All identical devices belong to the same device type and share the same device type configuration file. This file contains the set of bootloader instructions to boot the kernel (e.g. how and where to load files) and the bootloader configuration (e.g. whether it can boot zImages or only uImages). A device configuration file stores the commands run by a dispatcher to interact with the device: how to connect to its serial port, and how to power it on and off. LAVA interacts with devices via external tools: it supports conmux or telnet to communicate over serial, and power commands can be executed by custom scripts (pdudaemon for example).
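The split between device types and devices can be pictured with a small data model. This is only an illustrative sketch in Python: the class names, fields, board names and commands are hypothetical, and they do not reflect LAVA's actual configuration file syntax.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeviceType:
    """Settings shared by all identical boards (hypothetical fields)."""
    name: str
    boot_commands: List[str]   # bootloader instructions to load and boot
    supports_zimage: bool      # False: the bootloader only boots uImages


@dataclass
class Device:
    """Settings specific to one physical board (hypothetical fields)."""
    hostname: str
    device_type: DeviceType
    connection_command: str    # how the dispatcher reaches the serial port
    power_on_command: str
    power_off_command: str


def kernel_image_name(device: Device) -> str:
    """Pick the kernel image format the board's bootloader can boot."""
    return "zImage" if device.device_type.supports_zimage else "uImage"


# Made-up example: one device type, one board of that type.
example_type = DeviceType(
    name="example-board",
    boot_commands=["<load kernel over TFTP>", "<load DTB over TFTP>", "<boot>"],
    supports_zimage=True,
)
board_01 = Device(
    hostname="example-board-01",
    device_type=example_type,
    connection_command="telnet localhost 2001",
    power_on_command="<custom power-on script>",
    power_off_command="<custom power-off script>",
)
print(kernel_image_name(board_01))
```

A second board of the same model would reuse `example_type` and only differ in its connection and power commands.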

Control power supply

Some labs use expensive switched PDUs to control the power supply of each board but, as discussed in our previous blog post, we went for several Devantech ETH008 Ethernet-controlled relay boards instead.

Linaro, the organization behind LAVA, has also developed a tool for controlling the power supply of each board, called pdudaemon. We added support for most Devantech relay boards to pdudaemon.

Connect to serial

As advised in LAVA’s installation guide, we went with telnet and ser2net to connect to the serial ports of our boards. Ser2net basically opens a Linux device and allows interacting with it through a TCP socket on a defined port. A LAVA dispatcher then launches a telnet client to connect to a board’s serial port. Because Linux device names may change between reboots, we had to use udev rules to guarantee that the serial port we connect to is the one we intend to connect to.
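To illustrate the idea of stable names, here is a small Python sketch that generates a udev rule creating a fixed symlink for a USB serial adapter, plus the telnet command a dispatcher could run. Matching on the adapter's USB serial number, the symlink name, the serial number and the TCP port are all assumptions made for the example; the post does not say which attributes our rules actually match.

```python
def udev_rule(usb_serial: str, symlink: str) -> str:
    """Return a udev rule giving a USB serial adapter a stable /dev name.

    Matching on the USB serial number is an assumption for this sketch.
    """
    return (
        f'SUBSYSTEM=="tty", ATTRS{{serial}}=="{usb_serial}", '
        f'SYMLINK+="{symlink}"'
    )


def telnet_command(host: str, tcp_port: int) -> str:
    """Command used to reach the TCP port exposing the serial port."""
    return f"telnet {host} {tcp_port}"


# Hypothetical adapter serial number, symlink name and TCP port.
print(udev_rule("A1B2C3D4", "example-board-01"))
print(telnet_command("localhost", 2001))
```

With such a rule in place, the serial-to-TCP configuration can refer to `/dev/example-board-01` instead of a `/dev/ttyUSBx` name that may change after a reboot.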

Actual testing

Now that LAVA knows how to handle devices, it has to run jobs on those devices. LAVA jobs specify which images to boot (kernel, DTB, rootfs), what kind of tests to run once in user space, and where to find them. A job is strongly tied to a device type since it contains the kernel and DTB specifically built for this device type.

Those jobs are submitted to the different labs by the KernelCI project. To do so, KernelCI uses a tool called lava-ci. Among other things, this tool contains a big table of the supported platforms, associating each Device Tree name with the corresponding hardware platform name. This way, when a new kernel is built by KernelCI and produces a number of Device Tree Blobs (.dtb files), lava-ci knows which hardware platforms to run the kernel on. It submits the jobs to all the labs, which then only run the tests for which they have the necessary hardware platform. We have contributed a number of patches to lava-ci, adding support for the new platforms we had in our lab.
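The table lookup described above can be sketched as follows. This is not lava-ci's code: the table entries, the function name and the job dictionary layout are hypothetical, and the sketch looks at things from the point of view of a single lab that only keeps the jobs it has hardware for.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical excerpt of a DTB-to-platform table; entries are made up.
DTB_TO_DEVICE_TYPE: Dict[str, str] = {
    "example-soc-board-a.dtb": "example-board-a",
    "example-soc-board-b.dtb": "example-board-b",
}


def jobs_for_build(built_dtbs: Iterable[str],
                   lab_device_types: Set[str]) -> List[dict]:
    """List the jobs a lab can run for the DTBs produced by one build."""
    jobs = []
    for dtb in sorted(built_dtbs):
        device_type = DTB_TO_DEVICE_TYPE.get(dtb)
        if device_type is None:
            continue  # DTB with no known hardware platform
        if device_type not in lab_device_types:
            continue  # this lab does not own that platform
        jobs.append({"device_type": device_type, "dtb": dtb})
    return jobs


built = {"example-soc-board-a.dtb", "example-soc-board-b.dtb", "unknown.dtb"}
print(jobs_for_build(built, lab_device_types={"example-board-a"}))
```

Adding support for a new platform then amounts to adding one entry to the table, which matches the kind of patches mentioned above.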

LAVA overall architecture

Reporting test results

Once KernelCI has built the kernel and sent jobs to the contributing labs, and LAVA has run those jobs, KernelCI collects the test results from the labs, aggregates them on its website and notifies maintainers of errors via a mailing list.

Challenges encountered

As in any project, we stumbled on some difficulties. The biggest problems we had to take care of were board-specific problems.

Some boards, like the Marvell RD-370, need a rising edge on a pin to boot, which would normally require pressing the reset button before each boot. To work around this problem, we had to customize the hardware (swapping resistors) to bypass this limitation.

Some other boards lose their serial connection. Some lose it when their power is reset but recover it after a few seconds, a problem we found acceptable to solve by retrying the serial connection indefinitely. However, we still have a problem with a few boards that randomly close their serial connection for no apparent reason. After that, we are able to open the serial connection again, but no characters are received. The only way to get it working again is to physically re-plug the cable used for the serial connection. Unfortunately, we have not yet found a way to solve this bug.
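The "keep reconnecting" approach for boards that briefly lose their serial connection can be sketched in Python as below. This is a minimal illustration under assumptions, not the mechanism LAVA itself uses: it assumes the serial port is exposed as a TCP socket, and the function name, delay and timeout values are arbitrary.

```python
import socket
import time
from typing import Optional


def connect_with_retry(host: str, port: int, delay: float = 1.0,
                       max_attempts: Optional[int] = None) -> socket.socket:
    """Keep trying to open the TCP socket that exposes a serial port.

    max_attempts=None retries forever; a finite value is handy for tests.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return socket.create_connection((host, port), timeout=5)
        except OSError:
            # Connection refused or timed out: give up only if a limit
            # was set and has been reached, otherwise wait and retry.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            time.sleep(delay)
```

Note that this only helps with the first category of boards: when the connection can be reopened but stays silent, retrying changes nothing.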

The Linux kernel of our server refused to bind more than 13 USB devices when the time came to add a second drawer of boards. After some research, we found out the culprit was the xHCI driver. On modern computers, it is usually possible to disable xHCI support in the BIOS, but this option was not present in our server’s BIOS. The solution was to rebuild and install a kernel for the server without the xHCI driver compiled in. Since then, the only limit on the number of USB devices has been the 127 allowed by the USB specification.
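The 127 figure comes from USB's 7-bit device addresses, with address 0 reserved for devices that have not been configured yet. The short Python sketch below does the arithmetic and a rough device count; the counting function is an assumption for illustration, and it treats each hub as one device even though large hubs are often built from several cascaded hub chips that each take an address.

```python
USB_ADDRESS_BITS = 7
# Address 0 is reserved for unconfigured devices, hence 2**7 - 1 = 127.
MAX_USB_DEVICES = 2 ** USB_ADDRESS_BITS - 1


def usb_devices_needed(drawer_hubs: int, serial_adapters: int,
                       main_hubs: int = 1) -> int:
    """Rough count of USB addresses used: hubs count as devices too."""
    return main_hubs + drawer_hubs + serial_adapters


# Hypothetical full lab: one hub per drawer, one serial adapter per board.
needed = usb_devices_needed(drawer_hubs=8, serial_adapters=52)
print(needed, "of", MAX_USB_DEVICES)
```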

Conclusion

We now have 35 boards in our lab, some of which are the only ones of their kind represented in KernelCI. We encourage anyone, hobbyists or companies, to contribute to the effort of bringing continuous integration to the Linux kernel by building their own lab and adding as many boards as they can.

Interested in becoming a lab? Follow the guide!

Buildroot 2016.11 released, Bootlin contributions

The 2016.11 release of Buildroot was published on November 30th. The release announcement, by Buildroot maintainer Peter Korsgaard, gives numerous details about the new features and updates brought by this release. This new release provides support for using multiple BR2_EXTERNAL directories, brings some important updates to the toolchain support, adds default configurations for 9 new hardware platforms, and adds 38 new packages.

Out of a total of 1423 commits made for this release, Bootlin contributed 253 commits:

$ git shortlog -sn --author=free-electrons 2016.08..2016.11
   142  Gustavo Zacarias
   104  Thomas Petazzoni
     7  Romain Perier
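As a quick sanity check of these figures, the shortlog output can be parsed and summed with a few lines of Python. The parsing helper is an illustrative sketch; the only inputs are the numbers shown above and the total of 1423 commits.

```python
SHORTLOG = """\
   142  Gustavo Zacarias
   104  Thomas Petazzoni
     7  Romain Perier
"""


def parse_shortlog(text: str) -> dict:
    """Map each author to a commit count from `git shortlog -sn` output."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first run of whitespace (spaces or a tab).
        count, author = line.strip().split(None, 1)
        counts[author] = int(count)
    return counts


counts = parse_shortlog(SHORTLOG)
total = sum(counts.values())          # 142 + 104 + 7 = 253
share = 100 * total / 1423            # percentage of the release's commits
print(f"{total} commits, {share:.1f}% of the release")
```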

Here are the most important contributions we made:

  • Romain Perier contributed a package for the AMD Catalyst proprietary driver. Such drivers are usually not trivial to integrate, so having a ready-to-use package in Buildroot will make things much easier for Buildroot users who use hardware with an AMD/ATI graphics controller. This package provides both the X.org driver and the OpenGL implementation. This work was sponsored by one of Bootlin’s customers.
  • Gustavo Zacarias mainly contributed a large set of patches that make a small update to numerous packages, ensuring the proper environment variables are passed. This change prepares for top-level parallel build in Buildroot. This work was sponsored by another Bootlin customer.
  • Thomas Petazzoni contributed in various areas:
    • Added a DEVELOPERS file to the tree, to reference which developers are interested in which architectures and packages. Not only does it allow the developers to be Cc’ed when patches are sent on the mailing list (like the Linux kernel’s get_maintainer.pl script does), but it is also used by the Buildroot autobuilder infrastructure: if a package fails to build, the corresponding developer is notified by e-mail.
    • Misc updates to the toolchain support: switch to gcc 5.x by default, addition of gcc patches needed to fix various issues, etc.
    • Numerous fixes for build issues detected by the Buildroot autobuilders.

In addition to contributing 104 commits, Thomas Petazzoni also merged 1095 patches from other developers during this cycle, in order to help Buildroot maintainer Peter Korsgaard.

Finally, Bootlin also sponsored the Buildroot project, by funding the meeting location for the previous Buildroot Developers meeting, which took place in October in Berlin, after the Embedded Linux Conference. See the Buildroot sponsors page, and also the report from this meeting. The next Buildroot meeting will take place after the FOSDEM conference in Brussels.

Bootlin at Linux.conf.au, January 2017

Linux.conf.au, which takes place every year in January in Australia or New Zealand, is a major event of the Linux community. Bootlin already participated in this event three years ago, and will participate again in this year’s edition, which will take place from January 16 to January 20, 2017 in Hobart, Tasmania.

Linux Conf Australia 2017

This time, Bootlin CTO Thomas Petazzoni will give a talk titled A tour of the ARM architecture and its Linux support, in which he will explain to LCA attendees what the ARM architecture is, how its Linux support works, what the numerous variants of ARM processors and boards mean, what the Device Tree is, what the ARM-specific bootloaders are, and more.

Linux.conf.au also features a number of other kernel-related talks, such as the Kernel Report from Jonathan Corbet and Linux Kernel memory ordering: help arrives at last from Paul E. McKenney. The list of talks is very impressive, and the event also features a number of miniconfs, including one on the Linux kernel.

If some of our readers located in Australia, New Zealand or neighboring countries plan on attending the conference, do not hesitate to drop us a mail so that we can meet during the event!

Hardware infrastructure of Bootlin’s lab

As stated in a previous blog post, we officially launched our lab on April 25th, 2016, and it has been contributing to KernelCI ever since. In a series of blog posts, we’d like to present in detail how our lab works, starting with this first blog post, which details the hardware infrastructure of our lab.

Introduction

In a lab built for continuous integration, everything has to be fully automated from the serial connections to power supplies and network connections.

To gather as much information as possible to establish the specifications of the lab, our engineers filled in a spreadsheet with all the boards they wanted to have in the lab and their specificities in terms of the connectors used for serial port communication and power supply. We ended up with around 50 boards to put into our lab. Among those boards, we could distinguish two different types:

  • boards which are powered by an ATX power supply,
  • boards which are powered by different power adapters, providing either 5V or 12V.

Another design criterion was that we wanted to make it easy for our engineers to take a board out of the lab or to add one. The easier the process is, the better the lab is.

Home made cabinet

To meet the size constraints of Bootlin’s office, we had to make the lab fit in a 100cm wide, 75cm deep and 200cm high space. In order to achieve this, we decided to build the lab as a large home made cabinet, with a number of drawers to easily access, change or replace the boards hosted in the lab. As some of our boards provide PCIe connectors, we needed to provide enough height for each drawer, and after doing a few measurements, decided that a 25cm height for our drawers would be fine. With a total height of 200cm, this gives a maximum of 8 drawers.

In addition, it turns out that most of our boards powered by ATX power supplies are rather large in size, while the ones powered by regular power adapters are usually much smaller. In order to simplify the overall design, we decided that all large boards would be grouped together on a given set of drawers, and all small boards would be grouped together on another set of drawers, i.e. we would not mix large and small boards in the same drawer. With the 100cm x 75cm size limitation, this meant a drawer for small boards could host up to 8 boards, while a drawer for large boards could host up to 4 boards. From the spreadsheet containing all the boards supposed to be in the lab, we eventually decided there would be 3 large drawers for up to 12 large boards and 5 small drawers for up to 40 small or medium-sized boards.
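The sizing arithmetic from the last two paragraphs can be written down as a short Python sketch. The constant and function names are ours; the numbers are the ones given in the text.

```python
CABINET_HEIGHT_CM = 200
DRAWER_HEIGHT_CM = 25
BOARDS_PER_LARGE_DRAWER = 4
BOARDS_PER_SMALL_DRAWER = 8

# 200 cm of height divided into 25 cm drawers.
MAX_DRAWERS = CABINET_HEIGHT_CM // DRAWER_HEIGHT_CM


def lab_capacity(large_drawers: int, small_drawers: int) -> int:
    """Maximum number of boards for a given mix of drawers."""
    if large_drawers + small_drawers > MAX_DRAWERS:
        raise ValueError("the cabinet cannot hold that many drawers")
    return (large_drawers * BOARDS_PER_LARGE_DRAWER
            + small_drawers * BOARDS_PER_SMALL_DRAWER)


print(MAX_DRAWERS, lab_capacity(large_drawers=3, small_drawers=5))
```

With 3 large and 5 small drawers, this gives 12 + 40 = 52 board slots, slightly more than the roughly 50 boards listed in the spreadsheet.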

Furthermore, since the lab will host a server and a lot of boards and power supplies, potentially producing a lot of heat, we have to keep the lab as open as it can be while making sure it is strong enough to hold the drawers. We ended up building our own cabinet, made of wood bought from the local hardware store.

We also wanted the server to be part of the lab. We already had a small piece of wood strengthening the lab between the fourth and sixth drawers, which we could use to fix the server. We decided to give a mini-PC (NUC-like) a try because, after all, it only communicates with the serial port of each board and serves files to them. Thus, everything related to the server is fixed and wired behind the lab.

Make the lab autonomous

What continuous integration for the Linux kernel typically needs is:

  1. control of the power for each board
  2. control of the serial port connection
  3. a way to send files to test, typically the kernel image and associated files

In Bootlin’s lab, these different tasks are handled by a dedicated server, itself hosted in the lab.

Serial port control

Serial connections are mostly handled via USB on the server side but there are many different connectors on the target side (in our lab, we have 6 different connectors: DE9, microUSB, miniUSB, 2.54 mm male pins, 2.54 mm female pins and USB-B). Therefore, our server has to have a physical connection with each of the 50 boards present in the lab. The need for USB hubs is then obvious.

Since we want as few cables connecting the server and the drawers as possible, we decided to have one USB hub per drawer, be it a large drawer or a small drawer. In a small drawer, up to 8 boards can be present, meaning the hub needs at least 8 USB ports. In a large drawer, up to 4 serial connections can be needed so smaller and more common USB hubs can do the work. Since the serial connection may draw some current on the USB port, we wanted all of our USB hubs to be powered with a dedicated power supply.

All USB hubs are then connected to a main USB hub which in turn is connected to our server.

Power supply control

Our server needs to control each board’s power to be able to automatically power on or off a board. It will power on the board when it needs to test a new kernel on it and power it off at the end of the test or when the kernel has frozen or could not boot at all.

In terms of power supplies, we initially investigated using Ethernet-controlled multi-sockets (also called Switched PDU), such as this device. Unfortunately, these devices are quite expensive, and also often don’t provide the most appropriate connector to plug the cheap 5V/12V power adapters used by most boards.

So, instead, and following a suggestion from Kevin Hilman (one of KernelCI’s founders and maintainers), we decided to use regular ATX power supplies. They have the advantage of being inexpensive, and providing enough power for multiple boards and all their peripherals, potentially including hard drives or other power-hungry peripherals. ATX power supplies also have a pin, called PS_ON#, which, when tied to the ground, powers up the ATX power supply. This makes it easy to turn an ATX power supply on or off.

In conjunction with the ATX power supplies, we have selected an Ethernet-controlled relay board, the Devantech ETH008, which contains 8 relays that can be remotely controlled over the network.

This gives us the following architecture:

  • For the drawers with large boards powered directly by ATX, we have one ATX power supply per board. The PS_ON# wire of the ATX power supply is cut and rewired to the Ethernet-controlled relay. Thanks to the relay, we control whether PS_ON# is tied to the ground or not: when it is tied to the ground, the board is powered on and boots; when it is untied from the ground, the board is powered off.
  • For the drawers with small boards, we have a single ATX power supply per drawer. The 12V and 5V rails from the ATX power supply are then dispatched through the 8-relay board, then connected to the appropriate boards, through DC barrel or mini-USB/micro-USB cables, depending on the board. The PS_ON is always tied to the ground, so those ATX power supplies are constantly on.
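The two wiring schemes above can be modelled with a few lines of Python. This is a toy model rather than a driver for the relay board: the class and function names are hypothetical, a closed relay is assumed to mean "contact conducting", and the off time used in the power cycle is an arbitrary value.

```python
import time
from dataclasses import dataclass


@dataclass
class Relay:
    """One relay output; closed means the contact conducts (assumption)."""
    closed: bool = False


def large_board_is_on(relay: Relay) -> bool:
    """Large drawer: the relay ties PS_ON# to ground, which turns on the
    board's own ATX power supply."""
    ps_on_grounded = relay.closed
    return ps_on_grounded


def small_board_is_on(relay: Relay, atx_supply_on: bool = True) -> bool:
    """Small drawer: the shared ATX supply is always on (PS_ON# permanently
    grounded) and the relay switches the 5V or 12V rail feeding the board."""
    return atx_supply_on and relay.closed


def power_cycle(relay: Relay, off_seconds: float = 2.0) -> None:
    """Cut the power, wait, then restore it (e.g. after a frozen kernel)."""
    relay.closed = False
    time.sleep(off_seconds)
    relay.closed = True
```

In both cases the server only ever toggles a relay; what differs is whether that relay controls a whole power supply or a single rail.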

In addition, we have added a bit of over-voltage protection, by adding transient-voltage-suppression diodes on each voltage output in each drawer. These diodes are connected in parallel with the circuit to protect: when the voltage exceeds the maximum authorized value, they absorb the excess, and explode in doing so.

Network connectivity

As part of the continuous integration process, most of our boards will have to fetch the Linux kernel to test (and potentially other related files) over the network through TFTP. So we need all boards to be connected to the server running the continuous integration software.

Since a single 52-port switch is both fairly expensive and not very convenient in terms of wiring in our situation, we instead opted for adding 8-port Gigabit switches to each drawer, all of them being connected via a central 16-port Gigabit switch located at the back of the home made cabinet. This central switch not only connects the per-drawer switches, but also the server running the continuous integration software, and the wider Internet.

In-drawer architecture: large boards

A drawer designed for large boards, powered by ATX power supplies, contains the following components:

  • Up to four boards
  • Four ATX power-supplies, with their PS_ON# connected to an 8-port relay controller. Only 4 of the 8 ports are used on the relay.
  • One 8-port Ethernet-controlled relay board.
  • One 4-port USB hub, connecting to the serial ports of the four boards.
  • One 8-port Ethernet switch, with 4 ports used to connect to the boards, one port used to connect to the relay board, and one port used for the upstream link.
  • One power strip to power the different components.
Large drawer example scheme
Large drawer in the lab

In-drawer architecture: small boards

A drawer designed for small boards contains the following components:

  • Up to eight boards
  • One ATX power-supply, with its 5V and 12V rails going through the 8-port relay controller. All ports in the relay are used when 8 boards are present.
  • One 8-port Ethernet-controlled relay board.
  • One 10-port USB hub, connecting to the serial ports of the eight boards.
  • Two 8-port Ethernet switches, connecting the 8 boards, the relay board and an upstream link.
  • One power strip to power the different components.
Small drawer example scheme
Small drawer in the lab
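The number of Ethernet switches per drawer follows from a simple port budget, sketched below in Python. The function names are ours, and the way two switches are linked together inside a small drawer (chained, using one port on each) is an assumption; the component lists above only state how many switches each drawer contains.

```python
def drawer_ports(boards: int, relay_boards: int = 1, uplinks: int = 1) -> int:
    """Ethernet ports a drawer needs: boards, relay board and uplink."""
    return boards + relay_boards + uplinks


def switches_needed(ports_required: int, ports_per_switch: int = 8) -> int:
    """Smallest number of chained switches offering enough free ports.

    Chaining n switches is assumed to consume 2 * (n - 1) ports for the
    links between them.
    """
    n = 1
    while n * ports_per_switch - 2 * (n - 1) < ports_required:
        n += 1
    return n


large = drawer_ports(boards=4)   # 4 boards + relay board + uplink
small = drawer_ports(boards=8)   # 8 boards + relay board + uplink
print(large, switches_needed(large))
print(small, switches_needed(small))
```

A large drawer needs 6 ports and fits on a single 8-port switch, while a small drawer needs 10 ports, hence the second switch.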

Server

At the back of the home made cabinet, a mini PC runs the continuous integration software, which we will discuss in a future blog post. This mini PC is connected to:

  • A main 16-port Gigabit switch, itself connected to all the Gigabit switches in the different drawers
  • A main USB hub, itself connected to all the USB hubs in the different drawers

As expected, this allows the server to control the power of the different boards, access their serial port, and provide network connectivity.

Detailed component list

If you’re interested in the specific components we’ve used for our lab, here is the complete list, with the relevant links:

Conclusion

Hopefully, sharing these details about the hardware architecture of our board farm will help others to create a similar automated testing infrastructure. We are of course welcoming feedback on this hardware architecture!

Stay tuned for our next blog post about the software architecture of our board farm.

Slides and videos from the Embedded Linux Conference Europe 2016

Last month, the entire Bootlin engineering team attended the Embedded Linux Conference Europe in Berlin. The slides and videos of the talks have been posted, including the ones from the seven talks given by Bootlin engineers:

  • Alexandre Belloni presented on ASoC: Supporting Audio on an Embedded Board, slides and video.
  • Boris Brezillon presented on Modernizing the NAND framework, the big picture, slides and video.
  • Boris Brezillon, together with Richard Weinberger from sigma star, presented on Running UBI/UBIFS on MLC NAND, slides and video.
  • Grégory Clement presented on Your newer ARM64 SoC Linux check list, slides and video.
  • Thomas Petazzoni presented on Anatomy of cross-compilation toolchains, slides and video.
  • Maxime Ripard presented on Supporting the camera interface on the C.H.I.P, slides and video.
  • Quentin Schulz and Antoine Ténart presented on Building a board farm: continuous integration and remote control, slides and video.

Support for Device Tree overlays in U-Boot and libfdt

We have been working for almost two years now on the C.H.I.P platform from Nextthing Co. One of the characteristics of this platform is that it provides expansion headers, which allow connecting expansion boards, also called DIPs in the CHIP community.

In a manner similar to what is done for the BeagleBone capes, it quickly became clear that we should be using Device Tree overlays to describe the hardware available on those expansion boards. Thanks to the feedback from the Beagleboard community (especially David Anders, Pantelis Antoniou and Matt Porter), we designed a very nice mechanism for run-time detection of the DIPs connected to the platform, based on an EEPROM available in each DIP and connected through the 1-wire bus. This EEPROM allows the system running on the CHIP to detect which DIPs are connected to the system at boot time. Our engineer Antoine Ténart worked on a prototype Linux driver to detect the connected DIPs and load the associated Device Tree overlay. Antoine’s work was even presented at the Embedded Linux Conference, in April 2016: one can see the slides and video of Antoine’s talk.

However, it turned out that this Linux driver had a few limitations. Because the driver relies on Device Tree overlays stored as files in the root filesystem, such overlays can only be loaded fairly late in the boot process. This didn’t work well for storage devices, or for DRM, which doesn’t allow hotplugging some components. Therefore, this solution wasn’t suitable for the display-related DIPs provided for the CHIP: the VGA and HDMI DIPs.

The answer to that was to apply those Device Tree overlays earlier, in the bootloader, so that Linux wouldn’t have to deal with them. Since we’re using U-Boot on the CHIP, we made a first implementation that we submitted back in April. The review process took its course, and it was eventually merged and appeared in U-Boot 2016.09.
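Conceptually, applying an overlay means merging extra nodes and properties into the base Device Tree at given target nodes. The Python sketch below is a toy model of that idea using nested dictionaries. It is not the libfdt or U-Boot API: the function names, the fragment layout and the example tree are hypothetical, and real overlays also involve phandle resolution, which is ignored here.

```python
import copy


def merge_node(base: dict, overlay: dict) -> None:
    """Merge overlay content into a node: dicts are subnodes, the rest
    are properties, and overlay properties replace existing ones."""
    for key, value in overlay.items():
        if isinstance(value, dict):
            merge_node(base.setdefault(key, {}), value)
        else:
            base[key] = value


def apply_overlay(tree: dict, fragments: list) -> dict:
    """Return a copy of the tree with every fragment merged at its target."""
    result = copy.deepcopy(tree)
    for fragment in fragments:
        node = result
        for name in fragment["target_path"].strip("/").split("/"):
            if name:
                node = node[name]  # KeyError if the target does not exist
        merge_node(node, fragment["content"])
    return result


# Made-up base tree and overlay describing an expansion board on an I2C bus.
base_tree = {"soc": {"i2c": {"status": "disabled"}}}
dip_overlay = [{
    "target_path": "/soc/i2c",
    "content": {"status": "okay", "eeprom": {"reg": 0x50}},
}]
print(apply_overlay(base_tree, dip_overlay))
```

Doing this merge in the bootloader means the kernel receives a single, already complete Device Tree, so drivers that cannot cope with late additions see the hardware from the start.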

List of relevant commits in U-Boot:

However, the U-Boot community also requested that the changes be merged in the upstream libfdt, which is hosted as part of dtc, the device tree compiler.

Following this suggestion, Bootlin engineer Maxime Ripard has been working on merging those changes in the upstream libfdt. He sent a number of iterations, which received very good feedback from dtc maintainer David Gibson. This work finally concluded in early October, when David merged the seventh iteration of those patches in the dtc repository. It should therefore hopefully be part of the next dtc/libfdt release.

List of relevant commits in the Device Tree compiler:

Since libfdt is used by a number of other projects (like Barebox, or even Linux itself), all of them will gain the ability to apply device tree overlays when they upgrade their version. People from the BeagleBone and the Raspberry Pi communities have already expressed interest in using this work, so hopefully, this will turn into something that will be available on all the major ARM platforms.

A Kickstarter for a low cost Marvell ARM64 board

At the beginning of October, a Kickstarter campaign was launched to fund the development of a low-cost board based on one of the latest Marvell ARM 64-bit SoCs: the Armada 3700. While costing under $50, the board would allow using most of the Armada 3700 features:

  • Gigabit Ethernet
  • SATA
  • USB 3.0
  • miniPCIe

ESPRESSObin interfaces

The Kickstarter campaign was started by Globalscale Technologies, which has already produced numerous Marvell boards in the past: the Armada 370 based Mirabox, the Kirkwood based SheevaPlug, DreamPlug and more.

We pushed the initial support of this SoC to the mainline Linux kernel 6 months ago, and it landed in Linux 4.6. There are still a number of hardware features that are not yet supported in the mainline kernel, but we are actively working on them. As an example, support for the PCIe controller was merged in Linux 4.8, released last Sunday. According to the Kickstarter page, the first boards would be delivered in January 2017, and by this time we hope to have managed to push more support for this SoC to the mainline Linux kernel.

We have been working on the mainline support of the Marvell SoCs for 4 years and we are glad to see at last the first board under $50 using this SoC. We hope it will help expand the open source community around this SoC family and bring more contributions to the Marvell EBU SoCs.

Linux 4.8 released, Bootlin contributions

Linux 4.8 was released on Sunday by Linus Torvalds, with numerous new features and improvements that have been described in detail on LWN: part 1, part 2 and part 3. KernelNewbies also has an updated page on the 4.8 release. We contributed a total of 153 patches to this release. LWN also published some statistics about this development cycle.

Our most significant contributions:

  • Boris Brezillon improved the Rockchip PWM driver to avoid glitches, basing that work on his previous improvement to the PWM subsystem already merged in the kernel. He also fixed a few issues and shortcomings in the pwm regulator driver. This completes his work on the Rockchip based Chromebook platforms, where a PWM is used for a regulator.
  • While working on the driver for the sii902x HDMI transceiver, Boris Brezillon did a cleanup of many DRM drivers. Those drivers were open coding the encoder selection. This is now done in the core DRM subsystem.
  • On the support of Atmel platforms
    • Alexandre Belloni cleaned up the existing board device trees, removing unused clock definitions and starting to remove warnings when compiling with the Device Tree Compiler (dtc).
  • On the support of Allwinner platforms
    • Maxime Ripard contributed a brand new infrastructure, named sunxi-ng, to manage the clocks of the Allwinner platforms, fixing shortcomings of the Device Tree representation used by the existing implementation. He moved the support of the Allwinner H3 clocks to this new infrastructure.
    • Maxime also developed a driver for the Allwinner A10 Digital Audio controller, bringing audio support to this platform.
    • Boris Brezillon improved the Allwinner NAND controller driver to support DMA assisted operations, which brings a very nice speed-up to throughput on platforms using NAND flashes as the storage, which is the case of Nextthing’s C.H.I.P.
    • Quentin Schulz added support for the Allwinner R16 EVB (Parrot) board.
  • On the support of Marvell platforms
    • Grégory Clément added multiple clock definitions for the Armada 37xx series of SoCs.
    • He also corrected a few issues with the I/O coherency on some Marvell SoCs
    • Romain Perier worked on the Marvell CESA cryptography driver, bringing significant performance improvements, especially for dmcrypt usage. This driver is used on numerous Marvell platforms: Orion, Kirkwood, Armada 370, XP, 375 and 38x.
    • Thomas Petazzoni submitted a driver for the Aardvark PCI host controller present in the Armada 3700, enabling PCI support for this platform.
    • Thomas also added a driver for the new XOR engine found in the Armada 7K and Armada 8K families

Here are, in detail, the different contributions we made to this release:

Bootlin at the X.org Developer Conference 2016

Every year around September, the X.org Foundation hosts the X.org Developer Conference, which, despite what its name suggests, is not limited to X.org developers, but gathers all the Linux graphics stack developers, including X.org, Mesa, Wayland, and other graphics stacks like ChromeOS, Android or Tizen.

This year’s edition was held last week at the University of Haaga-Helia, in Helsinki. At Bootlin, we’ve been doing more and more development on the graphics stack recently through the work we do on Atmel platforms and NextThing Co’s C.H.I.P., so it made sense to attend.

XDC 2016 conference

There were a lot of very interesting talks during those three days, as can be seen in the conference schedule, but we especially liked a few of them:

DRM HWComposer – Slides, Video

The opening talk was made by two Google engineers from the ChromeOS team, Sean Paul and Zach Reizner. They talked about the work they did on the drm_hwcomposer they wrote for the Pixel C, on Android.

The hwcomposer is one of the HAL in Android that interfaces between Surface Flinger, the display manager, and the underlying display driver. It aims at providing hardware composition features, so that Android can leverage the capacities of the display engine to perform compositions (through planes and sprites), without having to use the CPU or the GPU to do this work.

The drm_hwcomposer started out as yet another hwcomposer library implementation for the tegra-drm driver in Linux. While they were implementing it, it turned into an implementation generic enough to be useful for all the DRM drivers out there, and they even introduced some particularly nice features, such as splitting the final screen content into several planes based on the actual displayed content rather than on windows, as is usually done.

Their work also helped to point out a few flaws in the hwcomposer API, that will eventually be fixed in a new revision of that API.

ARC++ – Slides, Video

The next talk was once again from a ChromeOS engineer, David Reveman, who came to show his work on ARC++, the ChromeOS component that makes it possible to run Android applications. Unsurprisingly, he mostly talked about the display side.

In order to achieve that, he had to implement a hwcomposer that just acts as a proxy between SurfaceFlinger and Wayland, which is used on the ChromeOS side. The GL rendering is still direct though, and each Android application talks directly to the GPU, as usual. Only the composition is forwarded to the ChromeOS side.

To minimize that composition work, ARC++ tries, whenever possible, to back each application with an overlay so that the composition happens directly in hardware.

This also led to some interesting challenges, especially since some of the assumptions of the two systems contradict each other. For example, any application can be resized in ChromeOS, while resizing is not really a thing in Android, where all applications run full screen.
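The overlay strategy described above can be pictured with a small sketch. This is not ARC++'s actual algorithm, which the talk summary does not detail: the `Surface` type, the `assign_planes` function and the "topmost surfaces first" policy are all assumptions made for illustration. It only shows the general idea of backing as many application surfaces as possible with a hardware overlay and falling back to GPU composition for the rest.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    """Hypothetical description of one application surface."""
    app: str
    z_order: int  # higher means closer to the viewer


def assign_planes(surfaces, overlay_count):
    """Split surfaces between hardware overlays and GPU composition.

    Assumed policy (not taken from ARC++): the topmost surfaces get an
    overlay first; whatever does not fit is composited by the GPU.
    Returns a tuple (overlays, fallback).
    """
    if overlay_count < 0:
        raise ValueError("overlay_count must be non-negative")
    ordered = sorted(surfaces, key=lambda s: s.z_order, reverse=True)
    return ordered[:overlay_count], ordered[overlay_count:]


surfaces = [Surface("mail", 0), Surface("game", 2), Surface("browser", 1)]
overlays, fallback = assign_planes(surfaces, overlay_count=2)
print("overlays:", [s.app for s in overlays])
print("GPU composition:", [s.app for s in fallback])
```

With only two overlays available in this made-up setup, one surface has to be composited by the GPU, which is the case the hardware path tries to avoid.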

HDR Displays in Linux – Slides, Video

The next talk we found interesting was by Andy Ritger from NVIDIA, who explained how HDR displays are supposed to be handled in Linux.

He started by explaining what HDR is exactly. While HDR is just about having a wider range of luminance than on a regular display, you often also get a wider gamut with HDR-capable displays. This means that those screens can display a wider range of colors, with a better range and precision in their intensity. And while applications have been able to generate HDR content for more than 10 years, the rest of the display stack wasn’t really ready, meaning that you had to convert the HDR colors to colors that your monitor was able to display, using a technique called tone mapping.
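As an illustration of what tone mapping does, here is a minimal sketch using the simple Reinhard operator, L / (1 + L). The talk summary does not say which operator was discussed, so this choice and the function name are our own assumptions. The point is only that an unbounded HDR luminance gets compressed into the limited range a regular display can show.

```python
def reinhard_tone_map(luminance: float) -> float:
    """Compress an HDR luminance in [0, inf) into the range [0, 1).

    Uses the basic Reinhard operator L / (1 + L): dark values are almost
    unchanged, while very bright values are squeezed towards 1.
    """
    if luminance < 0:
        raise ValueError("luminance must be non-negative")
    return luminance / (1.0 + luminance)


# Illustrative scene-referred luminance values, in arbitrary units.
for value in [0.0, 0.5, 1.0, 4.0, 100.0]:
    print(f"{value:7.1f} -> {reinhard_tone_map(value):.3f}")
```

However bright the input, the output never reaches 1, which is why the detail in very bright areas is lost on a non-HDR display.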

He then explained that the standard, non-HDR colorspace, sRGB, is not a linear colorspace. This means that by doubling the encoded luminance of a color, you will not get a color twice as bright on your display. This is by design, because the human eye is much more sensitive to shades of colors when they are dark than when they are bright. In other words, the darker the color, the more precision you want.

However, the luminance “resolution” of an HDR display is so good that you don’t actually need that anymore, and you can use a linear colorspace, which in our case is scRGB.

But blindly drawing in scRGB in all your applications is obviously not a good solution either. You have to make sure that your screen supports it (which is exposed through its EDID), but also that you actually tell your screen to switch to it (through the infoframes). And that requires some support in the kernel drivers.
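To make the non-linearity of sRGB more concrete, here is a small sketch based on the standard sRGB decoding formula. The function name is ours and the example values are arbitrary; nothing here comes from the talk itself. It converts two encoded values, one the double of the other, to linear light and compares them.

```python
def srgb_to_linear(encoded: float) -> float:
    """Convert an sRGB-encoded component in [0, 1] to linear light.

    This is the standard piecewise sRGB decoding function: a short linear
    segment near black, then a power curve with exponent 2.4.
    """
    if not 0.0 <= encoded <= 1.0:
        raise ValueError("encoded value must be within [0, 1]")
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


dark = srgb_to_linear(0.25)
bright = srgb_to_linear(0.5)  # twice the encoded value
print(f"linear(0.25) = {dark:.4f}")
print(f"linear(0.50) = {bright:.4f}")
print(f"ratio        = {bright / dark:.2f}")
```

The ratio printed at the end is well above 2: doubling the encoded value much more than doubles the light emitted. In a linear colorspace such as scRGB, that ratio would be exactly 2.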

The Anatomy of a Vulkan Driver – Slides, Video

This talk by Jason Ekstrand was a kind of war story about Intel’s bring-up of a Vulkan implementation for their GPUs.

He started by saying that it was actually not such a long project, especially considering that they wrote it from scratch: it took roughly 3 full-time engineers 8 months to come up with a fully compliant, open source stack.

He then explained why Vulkan was needed. While OpenGL did amazingly well to cope with hardware evolutions, it was designed over 20 years ago, and some of its core characteristics are no longer really relevant and are holding application developers back. For example, he mentioned that at its core, OpenGL is based on a singleton state machine, which obviously doesn’t scale well on today’s SMP systems. He also mentioned that it is too abstracted, when people just want a lower-level API, and that you might want to render things off screen without X or any context.

Vulkan fixes this by effectively removing the state machine, which allows it to scale, and by pushing things like error checking and synchronization directly to the applications. This makes the implementation much simpler and less layered, which also simplifies development and debugging.

He then went on to discuss how code could be shared between the two implementations: implementing OpenGL on top of Vulkan (which was discarded), having some kind of lighter intermediate language in Mesa to replace Gallium, or just putting the common bits in a library used by both the OpenGL and Vulkan libraries.

Motivating preemptive GPU scheduling for real-time systems – Slides, Video

The last talk that we want to mention is the talk on preemptive scheduling by Roy Spliet, from the University of Cambridge.

More and more industries, especially the automotive industry, offload some computations to the GPU, for example to implement computer vision. In a car, this is used for autonomous driving, to make the car recognize signs or stay in its lane. Obviously, these kinds of computations are supposed to be handled in a real-time system, since you probably don’t want the shiny user interface for the heating to make your car crash into the car in front of it because its rendering was taking too long.

He started by explaining what real time means and what the usual metrics are, which should come as no surprise to people used to “CPU-based” real-time systems: latency, deadline, execution time, and so on.
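For readers less familiar with these metrics, here is a minimal sketch of how they can be computed from job timestamps. The `JobTiming` type, the `summarize` function and all the numbers are invented for illustration; they are not measurements or code from the talk.

```python
from dataclasses import dataclass


@dataclass
class JobTiming:
    """Timestamps of one job, in milliseconds (illustrative values only)."""
    release: float   # when the job became ready to run
    start: float     # when it actually started executing
    finish: float    # when it completed
    deadline: float  # absolute time by which it had to complete

    @property
    def latency(self) -> float:
        return self.start - self.release

    @property
    def execution_time(self) -> float:
        return self.finish - self.start

    @property
    def missed_deadline(self) -> bool:
        return self.finish > self.deadline


def summarize(jobs):
    """Return the worst-case latency, worst-case execution time and miss count."""
    return {
        "worst_latency": max(job.latency for job in jobs),
        "worst_execution_time": max(job.execution_time for job in jobs),
        "deadline_misses": sum(job.missed_deadline for job in jobs),
    }


jobs = [
    JobTiming(release=0.0, start=1.0, finish=5.0, deadline=10.0),
    JobTiming(release=10.0, start=10.5, finish=14.0, deadline=20.0),
    JobTiming(release=20.0, start=23.0, finish=31.0, deadline=30.0),
]
print(summarize(jobs))
```

In a real-time system, the worst case matters more than the average: in this made-up trace, a single late job is enough to count as a failure.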

He then showed a bunch of benchmarks he used to test his preemptive scheduler, with a workload that basically consisted of running OpenArena alongside some computations, on various nouveau-based platforms (both desktop-grade GPUs and embedded SoCs).

This led to some expected conclusions, such as the fact that a preemptive scheduler does add some overhead but is on average worth it, and to some more interesting ones. For example, he observed some worst-case latencies that were quite rare (0.3%) but were actually caused by interference from the display engine filling up its empty FIFOs and creating contention on the memory bus.

Conclusion

Overall, this has been a great experience. The organisation was flawless, and the one-track-only format makes it easy to meet both the speakers and the attendees. The content was also highly technical, as you might expect, which taught us a lot and led us to think about some interesting developments we could do on our various projects in the future, such as NextThing Co’s C.H.I.P.

Yocto project and OpenEmbedded training updated to Krogoth


Continuing our efforts to keep our training materials up-to-date, we just refreshed our Yocto project and OpenEmbedded training course to the latest Yocto project release, Krogoth (2.1.1). In addition to adapting our training labs to the Krogoth release, we improved our training materials to cover more aspects and new features.

The most important changes are:

  • A new chapter about devtool, the new utility from the Yocto project that improves the developers’ workflow when integrating a package into the build system or making patches to existing packages.
  • Improved distro layers slides, with configuration samples and advice on how to use these layers.
  • A new part about quilt, to easily patch already supported packages.
  • An in-depth explanation of how file inclusions are handled by BitBake.
  • An improved description of tasks, with new slides on how to write them in Python.

The updated training materials are available on our training page: agenda (PDF), slides (PDF) and labs (PDF).

Join our Yocto specialist Alexandre Belloni for the first public session of this improved training in Lyon (France) on October 19-21, 2016. We are also available to deliver this training worldwide at your site: contact us!