Device Tree 101 webinar slides and videos

As we announced back in January, we have offered in partnership with ST on February 9 a free webinar titled Device Tree 101, which gives a detailed introduction to the Device Tree, an important mechanism used in the embedded Linux ecosystem to describe hardware platforms. We were happy to see the interest around this topic and webinar.

Bootlin has always shared freely and openly all its technical contents, including our training materials, and this webinar is no exception. We are therefore sharing the slides and video recording of both sessions of this webinar:

Thanks to everyone who participated and thanks to ST for the support in organizing this event! Do not hesitate to share and/or like our video, and to suggest us other topics that would be useful to cover in future webinars!

Free “Device Tree 101” webinar, on February 9, 2021

In partnership with ST, we are organizing on February 9, 2021, a free webinar entitled “Device Tree 101”.

The Device Tree has been adopted for the ARM 32-bit Linux kernel support almost a decade ago, and since then, its usage has expanded to many other CPU architectures in Linux, as well as bootloaders such as U-Boot or Barebox. Even though Device Tree is no longer a new mechanism, developers coming into the embedded Linux world often struggle to understand what Device Trees are, what is their syntax, how they interact with the Linux kernel device drivers, what Device Tree bindings are, and more. This webinar will offer a deep dive into the Device Tree, to jump start new developers in using this description language that is now ubiquitous in the vast majority of embedded Linux projects. This webinar will be illustrated with numerous examples applicable to the STM32MP1 MPU platforms, which make extensive usage of the Device Tree.

This webinar will take place on February 9, 2021, and is proposed at two different times during the day: at 10 AM CET (UTC+1) and 5 PM CET (UTC+1). The duration of the webinar is 1 hours and 30 minutes. Registration is free at https://www.eventbrite.com/e/135964923747. The webinar itself will be hosted as a Youtube Live stream, which will allow participants to ask questions in the chat during the webinar.

Device Tree 101

The trainer for this webinar is Thomas Petazzoni, Bootlin’s CTO. Thomas is the author of the popular « Device Tree for Dummies » talk given in 2014 and which helped numerous embedded Linux developers get started with the Device Tree. Thomas has contributed over 900 patches to the official Linux kernel, mainly around ARM hardware platform support. He is also the co-maintainer of the Buildroot open-source project.

Bootlin toolchains integration in Buildroot

Since 2017, Bootlin is freely providing ready-to-use pre-built cross-compilation toolchains at https://toolchains.bootlin.com/. We are now providing over 150 toolchains, for a wide range of CPU architectures, covering the glibc, uClibc-ng and musl C libraries, with up-to-date gcc, binutils, gdb and C library support.

We recently contributed an improvement to Buildroot that allows those toolchains to very easily be used in Buildroot configurations: the Bootlin toolchains are now all known by Buildroot as existing external toolchains, next to toolchains from other vendors such as ARM, Synopsys and others.

If you are building a Buildroot system for a CPU architecture variant that has a matching toolchain available from bootlin.toolchains.com, then Bootlin toolchains will naturally show up in the Toolchain sub-menu, when the selected Toolchain type is External toolchain. For example, if the selected CPU architecture is ARM little endian Cortex-A9, with VFP you will see:

Bootlin toolchain selection

Once Bootlin toolchains is selected, a new sub-option Bootlin toolchain variant appears, which allows to choose between the different toolchains applicable to the selected CPU architecture:

Bootlin toolchain choice

This hopefully should make Bootlin toolchains easier to use for Buildroot users.

Internally, this support for Bootlin toolchains in Buildroot is generated and updated using the support/scripts/gen-bootlin-toolchains script. In addition to making the toolchains available to the user, it allows generates some Buildroot test cases for each toolchain, so that each of those configuration is tested by Buildroot continuous integration, see support/testing/tests/toolchain/test_external_bootlin.py.

Linux 5.10, Bootlin contributions

Linux 5.10 was released a few weeks ago, and while 5.11-rc2 is already out, it’s still time to look at what Bootlin contributed to the 5.10 kernel. As usual, for a broad overview of the major changes in 5.10, we recommend reading the LWN articles: 5.10 merge window part 1, the rest of the 5.10 merge window, or the 5.10 KernelNewbies page.

Overall, Bootlin contributed 78 patches to this kernel release, in the following areas:

  • Alexandre Belloni did a number of improvements in the support of Microchip ARM platforms: device tree updates, code cleanups, etc.
  • Alexandre Belloni added a new rv3032 RTC driver and did some improvements to the r9701 RTC driver.
  • Miquèl Raynal implemented a significant rework of how ECC engines are handled in the MTD subsystems, so that ECC engines can be used not just for parallel NANDs but also for SPI NANDs. See also the talk that Miquèl gave at the Embedded Linux Conference Europe on this topic: slides and video.
  • Miquèl Raynal contributed a few improvements to the tlv320aic32x4 audio codec driver.
  • Paul Kocialkowski made some small improvements, one in the OV5640 camera sensor driver, and one in the Rockchip DRM driver.
  • Thomas Petazzoni implemented a performance improvement in the max310x driver, used for SPI-connected UART controllers.

In addition to these code contributions, we also contribute by having several of our engineers be maintainers of a few subsystems of the Linux kernel. As part of this:

  • Miquèl Raynal reviewed and merged 47 patches touching the MTD subsystem he co-maintains with other kernel developers.
  • Alexandre Belloni reviewed and merged 42 patches touching either the Microchip ARM or MIPS platforms, or the RTC subsystem.
  • Grégory Clement reviewed and merged 2 patches touching the Marvell ARM/ARM64 platform support.

Here is the complete list of our commits to 5.10.

Bootlin welcomes Thomas Perrot in its team

Welcome on board!Since December 1st, 2020, we’re happy to have in our team an additional engineer, Thomas Perrot, who joined our office in Toulouse, France.

Thomas brings 6+ years of experience working on embedded Linux systems, during which he worked at Intel on Android platforms, and then at Sigfox on the base stations for Sigfox’s radio network. Thomas has experience working with Linux on x86-64, ARM and ARM64 platforms, with a wide range of skills: bootloader development, Linux kernel and driver development, Yocto integration, OTA updates. Thomas was also deeply involved in the strong security aspects of Sigfox base stations, with secure boot and measured boot, TPM, integrity measurement, etc. At Sigfox, Thomas was involved in all steps of the product life-cycle, from the design phase all the way to the in-field deployment, update and maintenance. Last but not least, Thomas is a Linux technologist and a free software enthusiast, who hacks some open source hardware projects on his free time. See also Thomas page on our website and his LinkedIn profile.

Thomas Perrot is joining our growing team of engineers in Toulouse, which already included Paul Kocialkowski, Miquèl Raynal, Köry Maincent, Maxime Chevallier and Thomas Petazzoni.

On-going Bootlin contributions to the Video4Linux subsystem: camera, camera sensors, video encoding

Over the past years, we have been more and more involved in projects that have significant multimedia requirements. As part of this trend, 2020 has lead us to work on a number of contributions to the Video4Linux subsystem of the Linux kernel, with new drivers for camera interfaces, camera sensors, video decoders, and even HW-accelerated video encoding. In this blog post, we propose to summarize our contributions and their status on the following topics:

  • Rockchip PX30, RK1808, RK3128 and RK3288 camera interface driver
  • Allwinner A31, V3s/V3/S3 and A83T MIPI CSI-2 support for the camera interface driver
  • Omnivision OV8865 camera sensor driver
  • Omnivision OV5648 camera sensor driver
  • TW9900 PAL/NTSC video decoder driver
  • Rockchip HW-accelerated H264 video encoding

Rockchip camera interface

Rockchip camera interfaceThe Rockchip ARM processors are known to have very good support in the upstream Linux kernel. However, one area where the support was lacking is in the support of the camera interface used by those SoCs. And it turns out that Bootlin engineer Maxime Chevallier has worked precisely on this topic throughout 2020: the development and upstreaming of the rkvip driver, a Video4Linux driver for the Rockchip camera interface. While the work was done and tested on a Rockchip PX30 platform, the same camera interface is used on RK1808, RK3128 and RK3288.

Several iterations of the driver have been posted on the linux-media mailing list, with the latest iteration, version 5, posted on December 29, 2020:

Maxime Chevallier (3):
  media: dt-bindings: media: Document Rockchip VIP bindings
  media: rockchip: Introduce driver for Rockhip's camera interface
  arm64: dts: rockchip: Add the camera interface description of the PX30

 .../bindings/media/rockchip-vip.yaml          |  101 ++
 arch/arm64/boot/dts/rockchip/px30.dtsi        |   12 +
 drivers/media/platform/Kconfig                |   15 +
 drivers/media/platform/Makefile               |    1 +
 drivers/media/platform/rockchip/vip/Makefile  |    3 +
 drivers/media/platform/rockchip/vip/capture.c | 1146 +++++++++++++++++
 drivers/media/platform/rockchip/vip/dev.c     |  331 +++++
 drivers/media/platform/rockchip/vip/dev.h     |  203 +++
 drivers/media/platform/rockchip/vip/regs.h    |  260 ++++
 9 files changed, 2072 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/rockchip-vip.yaml
 create mode 100644 drivers/media/platform/rockchip/vip/Makefile
 create mode 100644 drivers/media/platform/rockchip/vip/capture.c
 create mode 100644 drivers/media/platform/rockchip/vip/dev.c
 create mode 100644 drivers/media/platform/rockchip/vip/dev.h
 create mode 100644 drivers/media/platform/rockchip/vip/regs.h

We’re hoping to get this driver merged soon, as we have now addressed the feedback that was received through the 5 iterations the patch series as gone through. It should be noted that for now it only supports the parallel BT656 interface as this is what we needed for our current project, we are definitely able to extend it to support MIPI CSI2 as well if you’re interested!

It should be noted that as a result of this work, Maxime Chevallier also prepared and delivered a talk From a video sensor to your display which was given at the Embedded Linux Conference Europe 2020. See the slides and video.

Allwinner MIPI CSI2 camera interface

Allwinner MIPI CSI2As part of an internship in 2020 and then a customer project, Bootlin intern Kévin L’Hôpital and Bootlin engineer Paul Kocialkowski worked on extending the Allwinnera camera interface support with support for MIPI CSI2 cameras. In fact, this addition was done to two Allwinner camera interface drivers: the sun6i driver which is used on Allwinner A31 and V3s/V3/S3, and the sun8i-a83t, which is used on the Allwinner A83T.

Through a fairly long 15 patches patch series, support for MIPI CSI2 is added to both camera interface controllers. We have tested both with Omnivision sensors, which are described below.

The series is currently in its third iteration, which was posted by Paul Kocialkowski on December 11, 2020 on the linux-media mailing list:


Paul Kocialkowski (15):
  docs: phy: Add a part about PHY mode and submode
  phy: Distinguish between Rx and Tx for MIPI D-PHY with submodes
  phy: allwinner: phy-sun6i-mipi-dphy: Support D-PHY Rx mode for MIPI
    CSI-2
  media: sun6i-csi: Use common V4L2 format info for storage bpp
  media: sun6i-csi: Only configure the interface data width for parallel
  dt-bindings: media: sun6i-a31-csi: Add MIPI CSI-2 input port
  media: sun6i-csi: Add support for MIPI CSI-2 bridge input
  dt-bindings: media: Add A31 MIPI CSI-2 bindings documentation
  media: sunxi: Add support for the A31 MIPI CSI-2 controller
  ARM: dts: sun8i: v3s: Add nodes for MIPI CSI-2 support
  MAINTAINERS: Add entry for the Allwinner A31 MIPI CSI-2 bridge
  dt-bindings: media: Add A83T MIPI CSI-2 bindings documentation
  media: sunxi: Add support for the A83T MIPI CSI-2 controller
  ARM: dts: sun8i: a83t: Add MIPI CSI-2 controller node
  MAINTAINERS: Add entry for the Allwinner A83T MIPI CSI-2 bridge

 .../media/allwinner,sun6i-a31-csi.yaml        |  88 ++-
 .../media/allwinner,sun6i-a31-mipi-csi2.yaml  | 149 ++++
 .../media/allwinner,sun8i-a83t-mipi-csi2.yaml | 147 ++++
 Documentation/driver-api/phy/phy.rst          |  18 +
 MAINTAINERS                                   |  16 +
 arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts  |   2 +-
 arch/arm/boot/dts/sun8i-a83t.dtsi             |  26 +
 arch/arm/boot/dts/sun8i-v3s.dtsi              |  67 ++
 drivers/media/platform/sunxi/Kconfig          |   2 +
 drivers/media/platform/sunxi/Makefile         |   2 +
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      | 165 +++--
 .../platform/sunxi/sun6i-csi/sun6i_csi.h      |  58 +-
 .../platform/sunxi/sun6i-csi/sun6i_video.c    |  53 +-
 .../platform/sunxi/sun6i-csi/sun6i_video.h    |   7 +-
 .../platform/sunxi/sun6i-mipi-csi2/Kconfig    |  12 +
 .../platform/sunxi/sun6i-mipi-csi2/Makefile   |   4 +
 .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c   | 590 ++++++++++++++++
 .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h   | 117 ++++
 .../sunxi/sun8i-a83t-mipi-csi2/Kconfig        |  11 +
 .../sunxi/sun8i-a83t-mipi-csi2/Makefile       |   4 +
 .../sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.c    |  92 +++
 .../sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.h    |  39 ++
 .../sun8i_a83t_mipi_csi2.c                    | 657 ++++++++++++++++++
 .../sun8i_a83t_mipi_csi2.h                    | 197 ++++++
 drivers/phy/allwinner/phy-sun6i-mipi-dphy.c   | 164 ++++-
 drivers/staging/media/rkisp1/rkisp1-isp.c     |   3 +-
 include/linux/phy/phy-mipi-dphy.h             |  13 +
 27 files changed, 2581 insertions(+), 122 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml
 create mode 100644 Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/Kconfig
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/Makefile
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.c
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.h
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_mipi_csi2.c
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_mipi_csi2.h

Here as well, the patch series has gone through a number of iterations, with significant reshaping to take into account the comments and feedback of other kernel developers and maintainers, so we hope to be near the point where it can be merged.

Omnivision OV8865 camera sensor driver

OV8865 block diagramAs part of his internship at Bootlin in 2020, Kévin L’Hôpital implemented a driver for the OV8865 camera sensor, connected over MIPI CSI2 to an Allwinner A83T platform. This OV8865 was then taken by Bootlin engineer Paul Kocialkowski, who did additional rework and polishing.

We are currently at the 4th iteration of this driver, which has been posted on December 11, 2020, and it has now been accepted and submitted to the V4L maintainer in a pull request.


Kévin L'hôpital (1):
  ARM: dts: sun8i: a83t: bananapi-m3: Enable MIPI CSI-2 with OV8865

Paul Kocialkowski (2):
  dt-bindings: media: i2c: Add OV8865 bindings documentation
  media: i2c: Add support for the OV8865 image sensor

 .../bindings/media/i2c/ovti,ov8865.yaml       |  124 +
 arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts  |  102 +
 drivers/media/i2c/Kconfig                     |   13 +
 drivers/media/i2c/Makefile                    |    1 +
 drivers/media/i2c/ov8865.c                    | 2981 +++++++++++++++++
 5 files changed, 3221 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml
 create mode 100644 drivers/media/i2c/ov8865.c

Omnivision OV5648 camera sensor driver

OV5648 block diagramIn addition to the work done by Bootlin intern Kévin L’Hôpital on OV8865 with Allwinner A83T, Paul Kocialkowski worked on OV5648 with Allwinner V3s, also connected over MIPI CSI2. This work results in a driver for the OV5648 camera sensor, which Paul has submitted to the linux-media mailing list.

This driver is now in is 5th iteration, posted on December 11, 2020, and it has now been accepted and submitted to the V4L maintainer in a pull request.


Paul Kocialkowski (2):
  dt-bindings: media: i2c: Add OV5648 bindings documentation
  media: i2c: Add support for the OV5648 image sensor

 .../bindings/media/i2c/ovti,ov5648.yaml       |  115 +
 drivers/media/i2c/Kconfig                     |   13 +
 drivers/media/i2c/Makefile                    |    1 +
 drivers/media/i2c/ov5648.c                    | 2638 +++++++++++++++++
 4 files changed, 2767 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/ovti,ov5648.yaml
 create mode 100644 drivers/media/i2c/ov5648.c

TW9900 PAL/NTSC video decoder driver

TW9900In addition to working on the Rockchip camera interface driver, Maxime Chevallier has also worked on a driver for the TW9900 PAL/NTSC video decoder. This chip from Renesas, takes as input an analog PAL or NTSC signal, digitizes it and outputs it on a parallel BT656 interface, which in our case was connected to a Rockchip PX30 platform.

Maxime posted the third iteration of the patch series adding this driver on December 22, 2020 on the linux-media mailing list.

Maxime Chevallier (3):
  dt-bindings: vendor-prefixes: Add techwell vendor prefix
  media: dt-bindings: media: i2c: Add bindings for TW9900
  media: i2c: Introduce a driver for the Techwell TW9900 decoder

 .../devicetree/bindings/media/i2c/tw9900.yaml |  60 ++
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 MAINTAINERS                                   |   6 +
 drivers/media/i2c/Kconfig                     |  11 +
 drivers/media/i2c/Makefile                    |   1 +
 drivers/media/i2c/tw9900.c                    | 617 ++++++++++++++++++
 6 files changed, 697 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/tw9900.yaml
 create mode 100644 drivers/media/i2c/tw9900.c

Rockchip HW-accelerated H264 video encoding

In 2018 and thanks to success of the crowd-funding campaign we ran back then, Bootlin engineer Paul Kocialkowski pioneered support for stateless video decoders in the Linux kernel, with a first driver supporting MPEG2, H264 and H265 HW-accelerated video decoding on Allwinner platforms.

Rockchip video encoderIn 2020, Paul was tasked to work on HW-accelerated H264 video encoding for Rockchip platforms, which also use a stateless video encoder. Of course, Paul took the same approach of going towards an upstream-acceptable solution rather than relying on out-of-tree and vendor-specific solutions provided by Rockchip.

Paul has been able to implement a working solution for one of our customers, and while the result is not yet in a shape where it can be submitted upstream, Paul has presented its result at the Embedded Linux Conference Europe 2020: the slides and video. The kernel code is available at https://github.com/bootlin/linux/tree/hantro/h264-encoding while the user-space code is available at https://github.com/bootlin/v4l2-hantro-h264-encoder.

As explained in Paul’s talk, this is not fully ready for upstream, as lots of discussions are needed on the user-space APIs, especially around the topic of rate control.

If you are interested in having this work fully available in the upstream Linux kernel, please contact us. We are looking for additional funding and support to push this completely upstream.

Conclusion

As can be seen from the numerous topics covered in this blog post, Bootlin has significant experience with the Video4Linux subsystem, and is able to both implement support for new hardware, extend the Video4Linux subsystem if needed, and contribute drivers and changes to the official Linux kernel.

Large Page Support for NAS systems on 32 bit ARM

The need for large page support on 32 bit ARM

Storage space has become more and more affordable to a point that it is now possible to have multiple hard drives of dozens of terabytes in a single consumer-grade device. With a few 10 TiB hard drives and thanks to RAID technology, storage capacities that exceed 16 or 32 TiB can easily be reached and at a relatively low cost.

However, a number of consumer NAS systems used in the field today are still based on 32 bit ARM processors. The problem is that, with Linux on a 32 bit system, it’s only possible to address up to 16 TiB of storage space. This is still true even with the ext4 filesystem, even though it uses 64 bit pointers.

We were lucky to have a customer contracting us to update older Large Page Support patches to a recent version of the Linux kernel. This set of patches are one way of overcoming this 16 TiB limitation for ARM 32-bit systems. Since updating this patch series was a non trivial task, we are happy to share the results of our efforts with the community, both through this blog post and through a patch series we posted to the Linux ARM kernel mailing list: ARM: Add support for large kernel page (from 8K to 64K).

How Large Page Support works

The 16 TiB limitation comes from the use of page->index which is a pgoff_t offset type corresponding to unsigned long. This limits us to a 32-bit page offsets, so with 4 KiB physical pages, we end up with a maximum of 16 TiB. A way to address this limitation is to use larger physical pages. We can reach 32 TiB with 8 KiB pages, 64 TiB with 16 KiB pages and up to 256 TiB with 64 KiB pages.

Before going further, the ARM32 Page Tables article from Linus Walleij is a good reference to understand how the Linux kernel deals with ARM32 page tables. In our case, we are only going to cover the non LPAE case. As explained there, the way the Linux kernel sees the page tables actually doesn’t match reality. First, the kernel deals with 4 levels of page tables while on hardware there are only 2 levels. In addition, while the ARM32 hardware stores only 256 PTEs in Page Tables, taking up only 1 KB, Linux optimizes things by storing in each 4 KB page two sets of 256 PTEs, and two sets of shadow PTEs that are used to store additional metadata needed by Linux about each page (such as the dirty and accessed/young bits). So, there is already some magic between what is presented to the Linux virtual memory management subsystem, and what is really programmed into the hardware page tables. To support large pages, the idea is to go further in this direction by emulating larger physical pages.

Our series (and especially patch 5: ARM: Add large kernel page support) proposes to pretend to have larger hardware pages. The ARM 32-bit architecture only supports 4 KiB or 64 KiB page sizes, but we would like to support intermediate values of 8 KiB, 16 KiB and 32 KiB as well. So what we do to support 8 KiB pages is that we tell Linux the hardware has 8 KiB pages, but in fact we simply use two consecutive 4 KiB pages at the hardware level that we manipulate and configure simultaneously. To support 16 KiB pages, we use 4 consecutive 4 KiB pages, for 32 KiB pages, we use 8 consecutive pages, etc. So really, we “emulate” having larger page sizes by grouping 2, 4 or 8 pages together. Adding this feature only required a few changes in the code, mainly dealing with ranges of pages every time we were dealing with a single page. Actually, most of the code in the series is about making it possible to modify the hard coded value of the hardware page size and fixing the assumptions associated to such a fixed value.

In addition to this emulated mechanism that we provide for 8 KiB, 16 KiB, 32 KiB and 64 KiB pages, we also added support for using real hardware 64 KiB pages as part of this patch series.

Overall the number of changes is very limited (271 lines added, 13 lines removed), and allows to use much larger storage devices. Here is the diffstat of the full patch series:

 arch/arm/include/asm/elf.h                  |  2 +-
 arch/arm/include/asm/fixmap.h               |  3 +-
 arch/arm/include/asm/page.h                 | 12 ++++
 arch/arm/include/asm/pgtable-2level-hwdef.h |  8 +++
 arch/arm/include/asm/pgtable-2level.h       |  6 +-
 arch/arm/include/asm/pgtable.h              |  4 ++
 arch/arm/include/asm/shmparam.h             |  4 ++
 arch/arm/include/asm/tlbflush.h             | 21 +++++-
 arch/arm/kernel/entry-common.S              | 13 ++++
 arch/arm/kernel/traps.c                     | 10 +++
 arch/arm/mm/Kconfig                         | 72 +++++++++++++++++++++
 arch/arm/mm/fault.c                         | 19 ++++++
 arch/arm/mm/mmu.c                           | 22 ++++++-
 arch/arm/mm/pgd.c                           |  2 +
 arch/arm/mm/proc-v7-2level.S                | 72 ++++++++++++++++++++-
 arch/arm/mm/tlb-v7.S                        | 14 +++-
 16 files changed, 271 insertions(+), 13 deletions(-)

This patch series is running in production now on some NAS devices from a very popular NAS brand.

Limitations and alternatives

The submission of our patch series is recent but this feature has actually been running for years on many NAS systems in the field. Our new series is based on the original patchset, with the purpose of submitting it to the mainline kernel community. However, there is little chance that it will ever be merged into the mainline kernel.

The main drawback of this approach are large pages themselves: as each file in the page cache uses at least one page, the memory wasted increases as the size of the pages increases. For this reason, Linus Torvalds was against similar series proposed in the past.

To show how much memory is wasted, Arnd Bergmann ran some numbers to measure the page cache overhead for a typical set of files (Linux 5.7 kernel sources) for 5 different page sizes:

Page size (KiB) 4 8 16 32 64
page cache usage (MiB) 1,023.26 1,209.54 1,628.39 2,557.31 4,550.88
factor over 4K pages 1.00x 1.18x 1.59x 2.50x 4.45x

We can see that while a factor of 1.18 is acceptable for 8 KiB pages, a 4.45 multiplier looks excessive with 64 KiB pages.

Actually, to make it possible to address large volumes on 32 bit ARM, another solution was pointed out during the review of our series. Instead of using larger pages which have an impact on the entire system, an alternative is to modify the way the filesystem addresses the memory by using 64 bits pgoff_t offsets. This has already been implemented in vendor kernels running in some NAS systems, but this has never been submitted to mainline developers.

Videos and slides from Bootlin talks at Embedded Linux Conference Europe 2020

The Embedded Linux Conference Europe took place online last week. While we definitely missed the experience of an in-person event, we strongly participated to this conference with no less than 7 talks on various topics showing Bootlin expertise in different fields: Linux kernel development in networking, multimedia and storage, but also build systems and tooling. We’re happy to be publishing now the slides and videos of our talks.

From the camera sensor to the user: the journey of a video frame, Maxime Chevallier

Download the slides: PDF, source.

OpenEmbedded and Yocto Project best practices, Alexandre Belloni

Download the slides: PDF, source.

Supporting hardware-accelerated video encoding with mainline Linux, Paul Kocialkowski

Download the slides: PDF, source.

Building embedded Debian/Ubuntu systems with ELBE, Köry Maincent

Download the slides: PDF, source.

Understand ECC support for NAND flash devices in Linux, Miquèl Raynal

Download the slides: PDF, source.

Using Visual Studio Code for Embedded Linux Development, Michael Opdenacker

Download the slides: PDF, source.

Precision Time Protocol (PTP) and packet timestamping in Linux, Antoine Ténart

Download the slides: PDF, source.

Enabling new hardware on Raspberry Pi with Device Tree Overlays

We recently had the chance to work on a customer project that involved the RaspberryPi Compute Module 3, with custom peripherals attached: a Microchip WILC1000 WiFi chip connected on SDIO, and a SGTL5000 audio codec connected over I2S/I2C. We take this opportunity to share some insights on how to introduce new hardware support on RaspberryPi platforms, by taking advantage of the Raspberry Pi specific Device Tree overlay mechanism.

Building and loading an upstream RPi Linux

Our customer is using the RaspberryPi OS Lite, which is a Debian-based system, coming with its pre-compiled Linux kernel image. However, for this project, we obviously need to customize the Linux kernel configuration, so we had to build our own kernel.

While Bootlin normally has a preference for using the upstream Linux kernel, for this project, we decided to keep using the Linux kernel fork provided by Raspberry Pi on Github, which is used by Raspberry Pi OS Lite. So we start our work by grabbing the Linux kernel source code from a the rpi-5.4.y provided by the Raspberry Pi Linux kernel repository:

$ git clone https://github.com/raspberrypi/linux -b rpi-5.4.y

After installing an appropriate ARM 32-bit cross-compilation toolchain, we’re ready to configure and build the kernel:

$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcm2709_defconfig
$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j4 zImage

Thanks to the USB mass storage mode of the Raspberry Pi it is quite straightforward to update the kernel as it allows to mount the board eMMC partitions on the host system, and update whichever files we want. The tool that actually sets the RaspberryPi in this mode is called rpiboot.

After executing it, we deploy the zImage into the RaspberryPi /media/${USER}/boot/ partition and update the file /media/${USER}/boot/config.txt to set the firmware directive kernel=zImage.

That’s all it takes to replace the kernel image provided by the RaspberryPi OS Lite system by our own kernel, which means we are now ready to integrate our new features.

Device Tree Overlays

A specificity of the Raspberry Pi is that the boot flow starts from the GPU core and not the ARM core like it does on most embedded processors. The GPU load a first bootloader from a ROM that will load a second bootloader (bootcode.bin) from eMMC/SD card that is in charge of executing a firmware (start.elf). This GPU firmware finally reads and parses a file stored in the boot partition (config.txt), which is used to set various boot parameters such as the image to boot on.

This config.txt file also allows to indicate which Device Tree file should be used as the hardware description, as well as Device Tree Overlays that should be applied on top of the Device Tree files. Device Tree Overlays are a bit like patches for the Device Tree: they allow to extend the base Device Tree with new properties and nodes. They are typically used to describe the hardware attached to the RaspberryPi through expansion boards.

The Raspberry Pi kernel tree contains a number of such Device Tree Overlays in the arch/arm/boot/dts/overlays folder. Each of those overlays, stored in .dts file gets compiled into a .dtbo files. Those .dtbo can be loaded and applied to the main Device Tree by adding the following statement to the config.txt file:

dtoverlay=overlay-name,overlay-arguments

We will fully take advantage of this mechanism to introduce two new Device Tree Overlays that will be parsed and dynamically merged into the main Device Tree.

WILC1000 WiFi chip Device Tree overlay

The ATWILC1000 we want to support is a Microchip WiFi module. As of the Linux 5.4 kernel we are using for this project, the driver for this WiFi chip was still in the staging area of the Linux kernel, in drivers/staging/wilc1000, but it more recently graduated out of staging. We obviously started by enabling this driver in the kernel configuration, through its CONFIG_WILC1000 and CONFIG_WILC1000_SDIO options.

To describe the WILC1000 module in terms of Device Tree, we need to create a arch/arm/boot/dts/overlays/wilc1000-overlays.dts file, with the following contents:

/dts-v1/;
/plugin/;

/ {
    compatible = "brcm,bcm2835";

    fragment@0 {
        target = <&mmc>;
        wifi_ovl: __overlay__ {
            pinctrl-0 = <&sdio_ovl_pins &wilc_sdio_pins>;
            pinctrl-names = "default";
            non-removable;
            bus-width = <4>;
            status = "okay";
            #address-cells = <1>;
            #size-cells = <0>;

            wilc_sdio: wilc_sdio@0 {
                compatible = "microchip,wilc1000", "microchip,wilc3000";
                irq-gpios = <&gpio 1 0>;
                status = "okay";
                reg = <0>;
                bus-width = <4>;
            };
        };
    };

    fragment@1 {
        target = <&gpio>;
        __overlay__ {
            sdio_ovl_pins: sdio_ovl_pins {
                brcm,pins = <22 23 24 25 26 27>;
                brcm,function = <7>; /* ALT3 = SD1 */
                brcm,pull = <0 2 2 2 2 2>;
            };

            wilc_sdio_pins: wilc_sdio_pins {
                brcm,pins = <1>; /* CHIP-IRQ */
                brcm,function = <0>;
                brcm,pull = <2>;
            };
        };
    };
};

To be compiled, this overlay needs to be referenced in arch/arm/boot/dts/overlays/Makefile.

This overlay contains two fragments: fragment@0 and fragment@1. Each fragment will extend a Device Tree node of the main Device Tree, which is specified by the target property of each fragment: fragment@0 will extend the node referenced by the &mmc phandle, and fragment@1 will extend the node referenced by the &gpio phandle. The mmc and gpio labels are defined in the main Device Tree describing the system-on-chip, arch/arm/boot/dts/bcm283x.dtsi. Practically speaking, the first fragment enables the MMC controller, its pin-muxing, and describes the WiFi chip as a sub-node of the MMC controller. The second fragment describes the pin-muxing configurations used for the MMC controller.

Above, each fragment target, &mmc and &gpio is matching an existing device node in the underlying tree arch/arm/boot/dts/bcm283x.dtsi. Note that our hardware is designed such as the reset pin of the wilc is automatically de-assert when powering the board so we don’t define it here.

Now that we have our overlay we enable it through the firmware config.txt :

dtoverlay=sdio,poll_once=off
dtoverlay=wilc1000
gpio=38=op,dh

In addition to our own wilc1000 overlay, we are also loading the sdio overlay, with the poll_once=off argument to make sure we are polling our wilc device even after the boot is complete when the chip enable gpio is asserted through the firmware directive gpio=38=op,dh.

After copying the wilc1000.dtbo on our board we can verify that both overlays are getting merged in the main device-tree using the vcdbg command:

  pi@raspberrypi:~$ sudo vcdbg log msg
   002351.555: brfs: File read: /mfs/sd/config.txt
   002354.107: brfs: File read: 4322 bytes
   ...
   005603.860: dtdebug: Opened overlay file 'overlays/wilc1000.dtbo'
   005605.500: brfs: File read: /mfs/sd/overlays/wilc1000.dtbo
   005619.506: Loaded overlay 'sdio'
   005624.063: dtdebug: fragment 3 disabled
   005624.160: dtdebug: fragment 4 disabled
   005624.256: dtdebug: fragment 5 disabled
   005633.283: dtdebug: merge_fragment(/soc/mmcnr@7e300000,/fragment@0/__overlay__)
   005633.308: dtdebug:   +prop(status)
   005633.757: dtdebug: merge_fragment() end
   005642.521: dtdebug: merge_fragment(/soc/mmc@7e300000,/fragment@1/__overlay__)
   005642.549: dtdebug:   +prop(pinctrl-0)
   005643.003: dtdebug:   +prop(pinctrl-names)
   005643.443: dtdebug:   +prop(non-removable)
   005644.022: dtdebug:   +prop(bus-width)
   005644.477: dtdebug:   +prop(status)
   005644.935: dtdebug: merge_fragment() end
   005646.614: dtdebug: merge_fragment(/soc/gpio@7e200000,/fragment@2/__overlay__)
   005650.499: dtdebug: merge_fragment(/soc/gpio@7e200000/sdio_ovl_pins,/fragment@22
   /__overlay__/sdio_ovl_pins)
   ...

Of course there are lots of other useful tools that we used to debug, such as raspi-gpio and dtoverlay:

 pi@raspberrypi:~$ sudo raspi-gpio get
   BANK0 (GPIO 0 to 27):
   GPIO 0: level=1 fsel=0 func=INPUT
   GPIO 1: level=1 fsel=0 func=INPUT
   GPIO 2: level=1 fsel=0 func=INPUT
   ...
   GPIO 22: level=0 fsel=7 alt=3 func=SD1_CLK
   GPIO 23: level=1 fsel=7 alt=3 func=SD1_CMD
   GPIO 24: level=1 fsel=7 alt=3 func=SD1_DAT0
   GPIO 25: level=1 fsel=7 alt=3 func=SD1_DAT1
   GPIO 26: level=1 fsel=7 alt=3 func=SD1_DAT2
   GPIO 27: level=1 fsel=7 alt=3 func=SD1_DAT3

Here we can actually see that the correct pinmuxing configuration is set for the SDIO pins.

Testing WiFi on RaspberryPi OS

During our test we found that the wilc1000 seems to only support one mode at time which means only STA or soft-AP. The first mode is quite straightforward to test using the raspi-config tool, we just had have to supply our SSID name and password.

To know which modes are supported and if they are supported simultaneously use iw list:

   pi@raspberrypi:~$ iw list
   Wiphy phy0
           max # scan SSIDs: 10
           max scan IEs length: 1000 bytes
           max # sched scan SSIDs: 0
           max # match sets: 0
           max # scan plans: 1
           max scan plan interval: -1
           max scan plan iterations: 0
           Retry short limit: 7
           Retry long limit: 4
           Coverage class: 0 (up to 0m)
           Supported Ciphers:
                   * WEP40 (00-0f-ac:1)
                   * WEP104 (00-0f-ac:5)
                   * TKIP (00-0f-ac:2)
                   * CCMP-128 (00-0f-ac:4)
                   * CMAC (00-0f-ac:6)
           Available Antennas: TX 0x3 RX 0x3
           Supported interface modes:
                    * managed
                    * AP
                    * monitor
                    * P2P-client
                    * P2P-GO
            ...
        software interface modes (can always be added):
        interface combinations are not supported
        Device supports scan flush.
        Supported extended features:
 

To use the wilc1000 in soft-AP mode, one need to install additional packages in the RaspberryPi OS, such as hostapd and dnsmasq.

SGTL5000 audio codec Device Tree Overlay

After integrating support for the WILC1000 WiFi chip, we can now look at how we got the SGTL5000 audio codec to work. We wrote an overlay arch/arm/boot/dts/overlays/sgtl5000-overlay.dts with the following contents, and also edited arch/arm/boot/dts/overlays/Makefile to ensure it is built. Our sgtl5000-overlay.dts contains:

/dts-v1/;
/plugin/;

/{
    compatible = "brcm,bcm2835";

    fragment@0 {
        target-path = "/";
        __overlay__ {
            sgtl5000_mclk: sgtl5000_mclk {
                compatible = "fixed-clock";
                #clock-cells = <0>;
                clock-frequency = <12288000>;
                clock-output-names = "sgtl5000-mclk";
            };
        };
    };

    fragment@1 {
        target = <&soc>;
        __overlay__ {
            reg_1v8: reg_1v8@0 {
                compatible = "regulator-fixed";
                regulator-name = "1V8";
                regulator-min-microvolt = <1800000>;
                regulator-max-microvolt = <1800000>;
                regulator-always-on;
            };
        };
    };

    fragment@2 {
        target = <&i2c1>;
        __overlay__ {
            status = "okay";

            sgtl5000_codec: sgtl5000@0a {
                #sound-dai-cells = <0>;
                compatible = "fsl,sgtl5000";
                reg = <0x0a>;
                clocks = <&sgtl5000_mclk>;
                micbias-resistor-k-ohms = <2>;
                micbias-voltage-m-volts = <3000>;
                VDDA-supply = <&vdd_3v3_reg>;
                VDDIO-supply = <&vdd_3v3_reg>;
                VDDD-supply = <&reg_1v8>;
                status = "okay";
            };
        };
    };

    fragment@3 {
        target = <&i2s>;
        __overlay__ {
            status = "okay";
        };
    };

    fragment@4 {
        target = <&sound>;
        sound_overlay: __overlay__ {
           compatible = "simple-audio-card";
           simple-audio-card,format = "i2s";
           simple-audio-card,name = "sgtl5000";
           simple-audio-card,bitclock-master = <&dailink0_master>;
           simple-audio-card,frame-master = <&dailink0_master>;
           simple-audio-card,widgets =
               "Microphone", "Microphone Jack",
               "Speaker", "External Speaker";
           simple-audio-card,routing =
               "MIC_IN", "Mic Bias",
               "External Speaker", "LINE_OUT";
           status = "okay";
           simple-audio-card,cpu {
               sound-dai = <&i2s>;
           };
           dailink0_master: simple-audio-card,codec {
               sound-dai = <&sgtl5000_codec>;
           };
        };
    };
};

This is a much more complicated overlay, with a total of 5 fragments, which we will describe in detail:

  • fragment@0: this fragment adds a new Device Tree node that describes a clock that runs at a fixed frequency. Indeed, the sys_mclk clock of our codec is provided by an external oscillator running at 12.288 Mhz: this sgtl5000_mclk describes this external oscillator.
  • fragment@1: this fragment defines an external power supply used as the VDDD-supply of the sgtl5000 codec. Indeed, this codec is simply power by an always-on power supply at 1.8V. Don’t forget to enable both kernel options CONFIG_REGULATOR=y and CONFIG_REGULATOR_FIXED_VOLTAGE=y otherwise the driver will just silently fail to probe at boot.
  • fragment@2: this fragment describes the audio codec itself. The audio codec is described as a child of the I2C controller it is connected to: indeed the SGTL5000 uses I2C as its control interface.
  • fragment@3: this fragment simply enables the I2S interface. The CONFIG_SND_BCM2835_SOC_I2S=y kernel option must be enabled to have the corresponding driver.
  • fragment@4: this fragment describes the actual sound card. It uses the generic simple-audio-card driver, describes the two sides of the audio link: the CPU interface in the &i2s and the codec interface in the &sgtl_codec node, and describes the audio widgets and routes. See the simple-audio-card Device Tree binding

With this overlay in place, we need to enable it in the config.txt file, as well as the I2C1 overlay with a correct pin-muxing configuration:

dtparam=i2s=on
dtoverlay=sgtl5000
dtoverlay=i2c1,pins_2_3
# AUDIO_SD
gpio=40=op,dh

Once the system has booted, a new audio card is visible, and all the classic ALSA user-space utilities can be used: amixer to control the volume, aplay and arecord to play/record audio.

Conclusion

In this blog post, we have shown how Device Tree Overlays can easily be used to extend the description of the hardware, and enable the use of additional hardware devices on a Raspberry Pi system. Do not hesitate to contact us for support on Raspberry Pi platforms!

Linux 5.9 released: Bootlin contributions

Linux 5.9 was released last Sunday. See our usual resources for a good coverage of the highlights of this new release: KernelNewbies page, LWN.net article on the first part of the merge window, LWN.net article on the second part of the merge window.

On our side, we contributed a total of 69 commits to Linux 5.9, which unusually low and makes Bootlin the 31st contributing company by number of commits according to Linux Kernel Patch Statistic. The highlights of our contributions are:

  • Miquèl Raynal reworked part of the rawnand subsystem to allow drivers for non-ONFI compliant NANDs to select a more efficient data interface.
  • On the support of Atmel/Microchip platforms
    • Alexandre Belloni added proper support for the sama5d2 in the TCB (timer counter block) clocksource driver. While doing so, he upstreamd most of the remaining PREEMPT-RT patches for this driver.
    • Kamel Bouhara submitted a new driver for the TCB, this time in the counter subsystem, allowing to count external events, see our post on the topic
  • Antoine Ténart worked on PTP/IEEE 1588 support for the Microchip/Microsemi Ethernet PHYs. This required a change in the core to support a quad PHY properly. The series also included previous work from Quentin Schulz.
  • Paul Kocialkowski added support for the Rockchip PX30 SoC in the V4L2 M2M RGA driver.
  • Maxime Chretien, one of our trainee this year sent a fix for qconf.

Also, several Bootlin engineers are maintainers of various areas of the Linux kernel:

  • Miquèl Raynal, as the NAND maintainer and MTD co-maintainer, reviewed and merged 57 patches from other contributors
  • Alexandre Belloni, as the RTC maintainer and Microchip platform support co-maintainer, reviewed and merged 54 patches from other contributors
  • Grégory Clement, as the Marvell EBU platform support co-maintainer, reviewed and merged 13 patches from other contributors

Here is the complete list of our contributions: