Bootlin welcomes Thomas Perrot in its team

Since December 1st, 2020, we’re happy to have in our team an additional engineer, Thomas Perrot, who joined our office in Toulouse, France.

Thomas brings 6+ years of experience working on embedded Linux systems, during which he worked at Intel on Android platforms, and then at Sigfox on the base stations for Sigfox’s radio network. Thomas has experience working with Linux on x86-64, ARM and ARM64 platforms, with a wide range of skills: bootloader development, Linux kernel and driver development, Yocto integration, OTA updates. Thomas was also deeply involved in the strong security aspects of Sigfox base stations, with secure boot and measured boot, TPM, integrity measurement, etc. At Sigfox, Thomas was involved in all steps of the product life-cycle, from the design phase all the way to the in-field deployment, update and maintenance. Last but not least, Thomas is a Linux technologist and a free software enthusiast, who hacks some open source hardware projects on his free time. See also Thomas page on our website and his LinkedIn profile.

Thomas Perrot is joining our growing team of engineers in Toulouse, which already included Paul Kocialkowski, Miquèl Raynal, Köry Maincent, Maxime Chevallier and Thomas Petazzoni.

On-going Bootlin contributions to the Video4Linux subsystem: camera, camera sensors, video encoding

Over the past years, we have been more and more involved in projects that have significant multimedia requirements. As part of this trend, 2020 has lead us to work on a number of contributions to the Video4Linux subsystem of the Linux kernel, with new drivers for camera interfaces, camera sensors, video decoders, and even HW-accelerated video encoding. In this blog post, we propose to summarize our contributions and their status on the following topics:

Rockchip PX30, RK1808, RK3128 and RK3288 camera interface driver
Allwinner A31, V3s/V3/S3 and A83T MIPI CSI-2 support for the camera interface driver
Omnivision OV8865 camera sensor driver
Omnivision OV5648 camera sensor driver
TW9900 PAL/NTSC video decoder driver
Rockchip HW-accelerated H264 video encoding

Rockchip camera interface

The Rockchip ARM processors are known to have very good support in the upstream Linux kernel. However, one area where the support was lacking is in the support of the camera interface used by those SoCs. And it turns out that Bootlin engineer Maxime Chevallier has worked precisely on this topic throughout 2020: the development and upstreaming of the rkvip driver, a Video4Linux driver for the Rockchip camera interface. While the work was done and tested on a Rockchip PX30 platform, the same camera interface is used on RK1808, RK3128 and RK3288.

Several iterations of the driver have been posted on the linux-media mailing list, with the latest iteration, version 5, posted on December 29, 2020:

Maxime Chevallier (3):
  media: dt-bindings: media: Document Rockchip VIP bindings
  media: rockchip: Introduce driver for Rockhip's camera interface
  arm64: dts: rockchip: Add the camera interface description of the PX30

 .../bindings/media/rockchip-vip.yaml          |  101 ++
 arch/arm64/boot/dts/rockchip/px30.dtsi        |   12 +
 drivers/media/platform/Kconfig                |   15 +
 drivers/media/platform/Makefile               |    1 +
 drivers/media/platform/rockchip/vip/Makefile  |    3 +
 drivers/media/platform/rockchip/vip/capture.c | 1146 +++++++++++++++++
 drivers/media/platform/rockchip/vip/dev.c     |  331 +++++
 drivers/media/platform/rockchip/vip/dev.h     |  203 +++
 drivers/media/platform/rockchip/vip/regs.h    |  260 ++++
 9 files changed, 2072 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/rockchip-vip.yaml
 create mode 100644 drivers/media/platform/rockchip/vip/Makefile
 create mode 100644 drivers/media/platform/rockchip/vip/capture.c
 create mode 100644 drivers/media/platform/rockchip/vip/dev.c
 create mode 100644 drivers/media/platform/rockchip/vip/dev.h
 create mode 100644 drivers/media/platform/rockchip/vip/regs.h

We’re hoping to get this driver merged soon, as we have now addressed the feedback that was received through the 5 iterations the patch series as gone through. It should be noted that for now it only supports the parallel BT656 interface as this is what we needed for our current project, we are definitely able to extend it to support MIPI CSI2 as well if you’re interested!

It should be noted that as a result of this work, Maxime Chevallier also prepared and delivered a talk From a video sensor to your display which was given at the Embedded Linux Conference Europe 2020. See the slides and video.

Allwinner MIPI CSI2 camera interface

As part of an internship in 2020 and then a customer project, Bootlin intern Kévin L’Hôpital and Bootlin engineer Paul Kocialkowski worked on extending the Allwinnera camera interface support with support for MIPI CSI2 cameras. In fact, this addition was done to two Allwinner camera interface drivers: the sun6i driver which is used on Allwinner A31 and V3s/V3/S3, and the sun8i-a83t, which is used on the Allwinner A83T.

Through a fairly long 15 patches patch series, support for MIPI CSI2 is added to both camera interface controllers. We have tested both with Omnivision sensors, which are described below.

The series is currently in its third iteration, which was posted by Paul Kocialkowski on December 11, 2020 on the linux-media mailing list:


Paul Kocialkowski (15):
  docs: phy: Add a part about PHY mode and submode
  phy: Distinguish between Rx and Tx for MIPI D-PHY with submodes
  phy: allwinner: phy-sun6i-mipi-dphy: Support D-PHY Rx mode for MIPI
    CSI-2
  media: sun6i-csi: Use common V4L2 format info for storage bpp
  media: sun6i-csi: Only configure the interface data width for parallel
  dt-bindings: media: sun6i-a31-csi: Add MIPI CSI-2 input port
  media: sun6i-csi: Add support for MIPI CSI-2 bridge input
  dt-bindings: media: Add A31 MIPI CSI-2 bindings documentation
  media: sunxi: Add support for the A31 MIPI CSI-2 controller
  ARM: dts: sun8i: v3s: Add nodes for MIPI CSI-2 support
  MAINTAINERS: Add entry for the Allwinner A31 MIPI CSI-2 bridge
  dt-bindings: media: Add A83T MIPI CSI-2 bindings documentation
  media: sunxi: Add support for the A83T MIPI CSI-2 controller
  ARM: dts: sun8i: a83t: Add MIPI CSI-2 controller node
  MAINTAINERS: Add entry for the Allwinner A83T MIPI CSI-2 bridge

 .../media/allwinner,sun6i-a31-csi.yaml        |  88 ++-
 .../media/allwinner,sun6i-a31-mipi-csi2.yaml  | 149 ++++
 .../media/allwinner,sun8i-a83t-mipi-csi2.yaml | 147 ++++
 Documentation/driver-api/phy/phy.rst          |  18 +
 MAINTAINERS                                   |  16 +
 arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts  |   2 +-
 arch/arm/boot/dts/sun8i-a83t.dtsi             |  26 +
 arch/arm/boot/dts/sun8i-v3s.dtsi              |  67 ++
 drivers/media/platform/sunxi/Kconfig          |   2 +
 drivers/media/platform/sunxi/Makefile         |   2 +
 .../platform/sunxi/sun6i-csi/sun6i_csi.c      | 165 +++--
 .../platform/sunxi/sun6i-csi/sun6i_csi.h      |  58 +-
 .../platform/sunxi/sun6i-csi/sun6i_video.c    |  53 +-
 .../platform/sunxi/sun6i-csi/sun6i_video.h    |   7 +-
 .../platform/sunxi/sun6i-mipi-csi2/Kconfig    |  12 +
 .../platform/sunxi/sun6i-mipi-csi2/Makefile   |   4 +
 .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c   | 590 ++++++++++++++++
 .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h   | 117 ++++
 .../sunxi/sun8i-a83t-mipi-csi2/Kconfig        |  11 +
 .../sunxi/sun8i-a83t-mipi-csi2/Makefile       |   4 +
 .../sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.c    |  92 +++
 .../sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.h    |  39 ++
 .../sun8i_a83t_mipi_csi2.c                    | 657 ++++++++++++++++++
 .../sun8i_a83t_mipi_csi2.h                    | 197 ++++++
 drivers/phy/allwinner/phy-sun6i-mipi-dphy.c   | 164 ++++-
 drivers/staging/media/rkisp1/rkisp1-isp.c     |   3 +-
 include/linux/phy/phy-mipi-dphy.h             |  13 +
 27 files changed, 2581 insertions(+), 122 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/media/allwinner,sun6i-a31-mipi-csi2.yaml
 create mode 100644 Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c
 create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/Kconfig
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/Makefile
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.c
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_dphy.h
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_mipi_csi2.c
 create mode 100644 drivers/media/platform/sunxi/sun8i-a83t-mipi-csi2/sun8i_a83t_mipi_csi2.h

Here as well, the patch series has gone through a number of iterations, with significant reshaping to take into account the comments and feedback of other kernel developers and maintainers, so we hope to be near the point where it can be merged.

Omnivision OV8865 camera sensor driver

As part of his internship at Bootlin in 2020, Kévin L’Hôpital implemented a driver for the OV8865 camera sensor, connected over MIPI CSI2 to an Allwinner A83T platform. This OV8865 was then taken by Bootlin engineer Paul Kocialkowski, who did additional rework and polishing.

We are currently at the 4th iteration of this driver, which has been posted on December 11, 2020, and it has now been accepted and submitted to the V4L maintainer in a pull request.


Kévin L'hôpital (1):
  ARM: dts: sun8i: a83t: bananapi-m3: Enable MIPI CSI-2 with OV8865

Paul Kocialkowski (2):
  dt-bindings: media: i2c: Add OV8865 bindings documentation
  media: i2c: Add support for the OV8865 image sensor

 .../bindings/media/i2c/ovti,ov8865.yaml       |  124 +
 arch/arm/boot/dts/sun8i-a83t-bananapi-m3.dts  |  102 +
 drivers/media/i2c/Kconfig                     |   13 +
 drivers/media/i2c/Makefile                    |    1 +
 drivers/media/i2c/ov8865.c                    | 2981 +++++++++++++++++
 5 files changed, 3221 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml
 create mode 100644 drivers/media/i2c/ov8865.c

Omnivision OV5648 camera sensor driver

In addition to the work done by Bootlin intern Kévin L’Hôpital on OV8865 with Allwinner A83T, Paul Kocialkowski worked on OV5648 with Allwinner V3s, also connected over MIPI CSI2. This work results in a driver for the OV5648 camera sensor, which Paul has submitted to the linux-media mailing list.

This driver is now in is 5th iteration, posted on December 11, 2020, and it has now been accepted and submitted to the V4L maintainer in a pull request.


Paul Kocialkowski (2):
  dt-bindings: media: i2c: Add OV5648 bindings documentation
  media: i2c: Add support for the OV5648 image sensor

 .../bindings/media/i2c/ovti,ov5648.yaml       |  115 +
 drivers/media/i2c/Kconfig                     |   13 +
 drivers/media/i2c/Makefile                    |    1 +
 drivers/media/i2c/ov5648.c                    | 2638 +++++++++++++++++
 4 files changed, 2767 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/ovti,ov5648.yaml
 create mode 100644 drivers/media/i2c/ov5648.c

TW9900 PAL/NTSC video decoder driver

In addition to working on the Rockchip camera interface driver, Maxime Chevallier has also worked on a driver for the TW9900 PAL/NTSC video decoder. This chip from Renesas, takes as input an analog PAL or NTSC signal, digitizes it and outputs it on a parallel BT656 interface, which in our case was connected to a Rockchip PX30 platform.

Maxime posted the third iteration of the patch series adding this driver on December 22, 2020 on the linux-media mailing list.

Maxime Chevallier (3):
  dt-bindings: vendor-prefixes: Add techwell vendor prefix
  media: dt-bindings: media: i2c: Add bindings for TW9900
  media: i2c: Introduce a driver for the Techwell TW9900 decoder

 .../devicetree/bindings/media/i2c/tw9900.yaml |  60 ++
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 MAINTAINERS                                   |   6 +
 drivers/media/i2c/Kconfig                     |  11 +
 drivers/media/i2c/Makefile                    |   1 +
 drivers/media/i2c/tw9900.c                    | 617 ++++++++++++++++++
 6 files changed, 697 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/i2c/tw9900.yaml
 create mode 100644 drivers/media/i2c/tw9900.c

Rockchip HW-accelerated H264 video encoding

In 2018 and thanks to success of the crowd-funding campaign we ran back then, Bootlin engineer Paul Kocialkowski pioneered support for stateless video decoders in the Linux kernel, with a first driver supporting MPEG2, H264 and H265 HW-accelerated video decoding on Allwinner platforms.

In 2020, Paul was tasked to work on HW-accelerated H264 video encoding for Rockchip platforms, which also use a stateless video encoder. Of course, Paul took the same approach of going towards an upstream-acceptable solution rather than relying on out-of-tree and vendor-specific solutions provided by Rockchip.

Paul has been able to implement a working solution for one of our customers, and while the result is not yet in a shape where it can be submitted upstream, Paul has presented its result at the Embedded Linux Conference Europe 2020: the slides and video. The kernel code is available at https://github.com/bootlin/linux/tree/hantro/h264-encoding while the user-space code is available at https://github.com/bootlin/v4l2-hantro-h264-encoder.

As explained in Paul’s talk, this is not fully ready for upstream, as lots of discussions are needed on the user-space APIs, especially around the topic of rate control.

If you are interested in having this work fully available in the upstream Linux kernel, please contact us. We are looking for additional funding and support to push this completely upstream.

Conclusion

As can be seen from the numerous topics covered in this blog post, Bootlin has significant experience with the Video4Linux subsystem, and is able to both implement support for new hardware, extend the Video4Linux subsystem if needed, and contribute drivers and changes to the official Linux kernel.

Large Page Support for NAS systems on 32 bit ARM

The need for large page support on 32 bit ARM

Storage space has become more and more affordable to a point that it is now possible to have multiple hard drives of dozens of terabytes in a single consumer-grade device. With a few 10 TiB hard drives and thanks to RAID technology, storage capacities that exceed 16 or 32 TiB can easily be reached and at a relatively low cost.

However, a number of consumer NAS systems used in the field today are still based on 32 bit ARM processors. The problem is that, with Linux on a 32 bit system, it’s only possible to address up to 16 TiB of storage space. This is still true even with the ext4 filesystem, even though it uses 64 bit pointers.

We were lucky to have a customer contracting us to update older Large Page Support patches to a recent version of the Linux kernel. This set of patches are one way of overcoming this 16 TiB limitation for ARM 32-bit systems. Since updating this patch series was a non trivial task, we are happy to share the results of our efforts with the community, both through this blog post and through a patch series we posted to the Linux ARM kernel mailing list: ARM: Add support for large kernel page (from 8K to 64K).

How Large Page Support works

The 16 TiB limitation comes from the use of page->index which is a pgoff_t offset type corresponding to unsigned long. This limits us to a 32-bit page offsets, so with 4 KiB physical pages, we end up with a maximum of 16 TiB. A way to address this limitation is to use larger physical pages. We can reach 32 TiB with 8 KiB pages, 64 TiB with 16 KiB pages and up to 256 TiB with 64 KiB pages.

Before going further, the ARM32 Page Tables article from Linus Walleij is a good reference to understand how the Linux kernel deals with ARM32 page tables. In our case, we are only going to cover the non LPAE case. As explained there, the way the Linux kernel sees the page tables actually doesn’t match reality. First, the kernel deals with 4 levels of page tables while on hardware there are only 2 levels. In addition, while the ARM32 hardware stores only 256 PTEs in Page Tables, taking up only 1 KB, Linux optimizes things by storing in each 4 KB page two sets of 256 PTEs, and two sets of shadow PTEs that are used to store additional metadata needed by Linux about each page (such as the dirty and accessed/young bits). So, there is already some magic between what is presented to the Linux virtual memory management subsystem, and what is really programmed into the hardware page tables. To support large pages, the idea is to go further in this direction by emulating larger physical pages.

Our series (and especially patch 5: ARM: Add large kernel page support) proposes to pretend to have larger hardware pages. The ARM 32-bit architecture only supports 4 KiB or 64 KiB page sizes, but we would like to support intermediate values of 8 KiB, 16 KiB and 32 KiB as well. So what we do to support 8 KiB pages is that we tell Linux the hardware has 8 KiB pages, but in fact we simply use two consecutive 4 KiB pages at the hardware level that we manipulate and configure simultaneously. To support 16 KiB pages, we use 4 consecutive 4 KiB pages, for 32 KiB pages, we use 8 consecutive pages, etc. So really, we “emulate” having larger page sizes by grouping 2, 4 or 8 pages together. Adding this feature only required a few changes in the code, mainly dealing with ranges of pages every time we were dealing with a single page. Actually, most of the code in the series is about making it possible to modify the hard coded value of the hardware page size and fixing the assumptions associated to such a fixed value.

In addition to this emulated mechanism that we provide for 8 KiB, 16 KiB, 32 KiB and 64 KiB pages, we also added support for using real hardware 64 KiB pages as part of this patch series.

Overall the number of changes is very limited (271 lines added, 13 lines removed), and allows to use much larger storage devices. Here is the diffstat of the full patch series:

 arch/arm/include/asm/elf.h                  |  2 +-
 arch/arm/include/asm/fixmap.h               |  3 +-
 arch/arm/include/asm/page.h                 | 12 ++++
 arch/arm/include/asm/pgtable-2level-hwdef.h |  8 +++
 arch/arm/include/asm/pgtable-2level.h       |  6 +-
 arch/arm/include/asm/pgtable.h              |  4 ++
 arch/arm/include/asm/shmparam.h             |  4 ++
 arch/arm/include/asm/tlbflush.h             | 21 +++++-
 arch/arm/kernel/entry-common.S              | 13 ++++
 arch/arm/kernel/traps.c                     | 10 +++
 arch/arm/mm/Kconfig                         | 72 +++++++++++++++++++++
 arch/arm/mm/fault.c                         | 19 ++++++
 arch/arm/mm/mmu.c                           | 22 ++++++-
 arch/arm/mm/pgd.c                           |  2 +
 arch/arm/mm/proc-v7-2level.S                | 72 ++++++++++++++++++++-
 arch/arm/mm/tlb-v7.S                        | 14 +++-
 16 files changed, 271 insertions(+), 13 deletions(-)

This patch series is running in production now on some NAS devices from a very popular NAS brand.

Limitations and alternatives

The submission of our patch series is recent but this feature has actually been running for years on many NAS systems in the field. Our new series is based on the original patchset, with the purpose of submitting it to the mainline kernel community. However, there is little chance that it will ever be merged into the mainline kernel.

The main drawback of this approach are large pages themselves: as each file in the page cache uses at least one page, the memory wasted increases as the size of the pages increases. For this reason, Linus Torvalds was against similar series proposed in the past.

To show how much memory is wasted, Arnd Bergmann ran some numbers to measure the page cache overhead for a typical set of files (Linux 5.7 kernel sources) for 5 different page sizes:

Page size (KiB)	4	8	16	32	64
page cache usage (MiB)	1,023.26	1,209.54	1,628.39	2,557.31	4,550.88
factor over 4K pages	1.00x	1.18x	1.59x	2.50x	4.45x

We can see that while a factor of 1.18 is acceptable for 8 KiB pages, a 4.45 multiplier looks excessive with 64 KiB pages.

Actually, to make it possible to address large volumes on 32 bit ARM, another solution was pointed out during the review of our series. Instead of using larger pages which have an impact on the entire system, an alternative is to modify the way the filesystem addresses the memory by using 64 bits pgoff_t offsets. This has already been implemented in vendor kernels running in some NAS systems, but this has never been submitted to mainline developers.

Videos and slides from Bootlin talks at Embedded Linux Conference Europe 2020

The Embedded Linux Conference Europe took place online last week. While we definitely missed the experience of an in-person event, we strongly participated to this conference with no less than 7 talks on various topics showing Bootlin expertise in different fields: Linux kernel development in networking, multimedia and storage, but also build systems and tooling. We’re happy to be publishing now the slides and videos of our talks.

From the camera sensor to the user: the journey of a video frame, Maxime Chevallier