Enabling new hardware on Raspberry Pi with Device Tree Overlays

We recently had the chance to work on a customer project that involved the RaspberryPi Compute Module 3, with custom peripherals attached: a Microchip WILC1000 WiFi chip connected on SDIO, and a SGTL5000 audio codec connected over I2S/I2C. We take this opportunity to share some insights on how to introduce new hardware support on RaspberryPi platforms, by taking advantage of the Raspberry Pi specific Device Tree overlay mechanism.

Building and loading an upstream RPi Linux

Our customer is using the RaspberryPi OS Lite, which is a Debian-based system, coming with its pre-compiled Linux kernel image. However, for this project, we obviously need to customize the Linux kernel configuration, so we had to build our own kernel.

While Bootlin normally has a preference for using the upstream Linux kernel, for this project, we decided to keep using the Linux kernel fork provided by Raspberry Pi on Github, which is used by Raspberry Pi OS Lite. So we start our work by grabbing the Linux kernel source code from a the rpi-5.4.y provided by the Raspberry Pi Linux kernel repository:

$ git clone https://github.com/raspberrypi/linux -b rpi-5.4.y

After installing an appropriate ARM 32-bit cross-compilation toolchain, we’re ready to configure and build the kernel:

$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcm2709_defconfig
$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j4 zImage

Thanks to the USB mass storage mode of the Raspberry Pi it is quite straightforward to update the kernel as it allows to mount the board eMMC partitions on the host system, and update whichever files we want. The tool that actually sets the RaspberryPi in this mode is called rpiboot.

After executing it, we deploy the zImage into the RaspberryPi /media/${USER}/boot/ partition and update the file /media/${USER}/boot/config.txt to set the firmware directive kernel=zImage.

That’s all it takes to replace the kernel image provided by the RaspberryPi OS Lite system by our own kernel, which means we are now ready to integrate our new features.

Device Tree Overlays

A specificity of the Raspberry Pi is that the boot flow starts from the GPU core and not the ARM core like it does on most embedded processors. The GPU load a first bootloader from a ROM that will load a second bootloader (bootcode.bin) from eMMC/SD card that is in charge of executing a firmware (start.elf). This GPU firmware finally reads and parses a file stored in the boot partition (config.txt), which is used to set various boot parameters such as the image to boot on.

This config.txt file also allows to indicate which Device Tree file should be used as the hardware description, as well as Device Tree Overlays that should be applied on top of the Device Tree files. Device Tree Overlays are a bit like patches for the Device Tree: they allow to extend the base Device Tree with new properties and nodes. They are typically used to describe the hardware attached to the RaspberryPi through expansion boards.

The Raspberry Pi kernel tree contains a number of such Device Tree Overlays in the arch/arm/boot/dts/overlays folder. Each of those overlays, stored in .dts file gets compiled into a .dtbo files. Those .dtbo can be loaded and applied to the main Device Tree by adding the following statement to the config.txt file:

dtoverlay=overlay-name,overlay-arguments

We will fully take advantage of this mechanism to introduce two new Device Tree Overlays that will be parsed and dynamically merged into the main Device Tree.

WILC1000 WiFi chip Device Tree overlay

The ATWILC1000 we want to support is a Microchip WiFi module. As of the Linux 5.4 kernel we are using for this project, the driver for this WiFi chip was still in the staging area of the Linux kernel, in drivers/staging/wilc1000, but it more recently graduated out of staging. We obviously started by enabling this driver in the kernel configuration, through its CONFIG_WILC1000 and CONFIG_WILC1000_SDIO options.

To describe the WILC1000 module in terms of Device Tree, we need to create a arch/arm/boot/dts/overlays/wilc1000-overlays.dts file, with the following contents:

/dts-v1/;
/plugin/;

/ {
    compatible = "brcm,bcm2835";

    fragment@0 {
        target = <&mmc>;
        wifi_ovl: __overlay__ {
            pinctrl-0 = <&sdio_ovl_pins &wilc_sdio_pins>;
            pinctrl-names = "default";
            non-removable;
            bus-width = <4>;
            status = "okay";
            #address-cells = <1>;
            #size-cells = <0>;

            wilc_sdio: wilc_sdio@0 {
                compatible = "microchip,wilc1000", "microchip,wilc3000";
                irq-gpios = <&gpio 1 0>;
                status = "okay";
                reg = <0>;
                bus-width = <4>;
            };
        };
    };

    fragment@1 {
        target = <&gpio>;
        __overlay__ {
            sdio_ovl_pins: sdio_ovl_pins {
                brcm,pins = <22 23 24 25 26 27>;
                brcm,function = <7>; /* ALT3 = SD1 */
                brcm,pull = <0 2 2 2 2 2>;
            };

            wilc_sdio_pins: wilc_sdio_pins {
                brcm,pins = <1>; /* CHIP-IRQ */
                brcm,function = <0>;
                brcm,pull = <2>;
            };
        };
    };
};

To be compiled, this overlay needs to be referenced in arch/arm/boot/dts/overlays/Makefile.

This overlay contains two fragments: fragment@0 and fragment@1. Each fragment will extend a Device Tree node of the main Device Tree, which is specified by the target property of each fragment: fragment@0 will extend the node referenced by the &mmc phandle, and fragment@1 will extend the node referenced by the &gpio phandle. The mmc and gpio labels are defined in the main Device Tree describing the system-on-chip, arch/arm/boot/dts/bcm283x.dtsi. Practically speaking, the first fragment enables the MMC controller, its pin-muxing, and describes the WiFi chip as a sub-node of the MMC controller. The second fragment describes the pin-muxing configurations used for the MMC controller.

Above, each fragment target, &mmc and &gpio is matching an existing device node in the underlying tree arch/arm/boot/dts/bcm283x.dtsi. Note that our hardware is designed such as the reset pin of the wilc is automatically de-assert when powering the board so we don’t define it here.

Now that we have our overlay we enable it through the firmware config.txt :

dtoverlay=sdio,poll_once=off
dtoverlay=wilc1000
gpio=38=op,dh

In addition to our own wilc1000 overlay, we are also loading the sdio overlay, with the poll_once=off argument to make sure we are polling our wilc device even after the boot is complete when the chip enable gpio is asserted through the firmware directive gpio=38=op,dh.

After copying the wilc1000.dtbo on our board we can verify that both overlays are getting merged in the main device-tree using the vcdbg command:

  pi@raspberrypi:~$ sudo vcdbg log msg
   002351.555: brfs: File read: /mfs/sd/config.txt
   002354.107: brfs: File read: 4322 bytes
   ...
   005603.860: dtdebug: Opened overlay file 'overlays/wilc1000.dtbo'
   005605.500: brfs: File read: /mfs/sd/overlays/wilc1000.dtbo
   005619.506: Loaded overlay 'sdio'
   005624.063: dtdebug: fragment 3 disabled
   005624.160: dtdebug: fragment 4 disabled
   005624.256: dtdebug: fragment 5 disabled
   005633.283: dtdebug: merge_fragment(/soc/mmcnr@7e300000,/fragment@0/__overlay__)
   005633.308: dtdebug:   +prop(status)
   005633.757: dtdebug: merge_fragment() end
   005642.521: dtdebug: merge_fragment(/soc/mmc@7e300000,/fragment@1/__overlay__)
   005642.549: dtdebug:   +prop(pinctrl-0)
   005643.003: dtdebug:   +prop(pinctrl-names)
   005643.443: dtdebug:   +prop(non-removable)
   005644.022: dtdebug:   +prop(bus-width)
   005644.477: dtdebug:   +prop(status)
   005644.935: dtdebug: merge_fragment() end
   005646.614: dtdebug: merge_fragment(/soc/gpio@7e200000,/fragment@2/__overlay__)
   005650.499: dtdebug: merge_fragment(/soc/gpio@7e200000/sdio_ovl_pins,/fragment@22
   /__overlay__/sdio_ovl_pins)
   ...

Of course there are lots of other useful tools that we used to debug, such as raspi-gpio and dtoverlay:

 pi@raspberrypi:~$ sudo raspi-gpio get
   BANK0 (GPIO 0 to 27):
   GPIO 0: level=1 fsel=0 func=INPUT
   GPIO 1: level=1 fsel=0 func=INPUT
   GPIO 2: level=1 fsel=0 func=INPUT
   ...
   GPIO 22: level=0 fsel=7 alt=3 func=SD1_CLK
   GPIO 23: level=1 fsel=7 alt=3 func=SD1_CMD
   GPIO 24: level=1 fsel=7 alt=3 func=SD1_DAT0
   GPIO 25: level=1 fsel=7 alt=3 func=SD1_DAT1
   GPIO 26: level=1 fsel=7 alt=3 func=SD1_DAT2
   GPIO 27: level=1 fsel=7 alt=3 func=SD1_DAT3

Here we can actually see that the correct pinmuxing configuration is set for the SDIO pins.

Testing WiFi on RaspberryPi OS

During our test we found that the wilc1000 seems to only support one mode at time which means only STA or soft-AP. The first mode is quite straightforward to test using the raspi-config tool, we just had have to supply our SSID name and password.

To know which modes are supported and if they are supported simultaneously use iw list:

   pi@raspberrypi:~$ iw list
   Wiphy phy0
           max # scan SSIDs: 10
           max scan IEs length: 1000 bytes
           max # sched scan SSIDs: 0
           max # match sets: 0
           max # scan plans: 1
           max scan plan interval: -1
           max scan plan iterations: 0
           Retry short limit: 7
           Retry long limit: 4
           Coverage class: 0 (up to 0m)
           Supported Ciphers:
                   * WEP40 (00-0f-ac:1)
                   * WEP104 (00-0f-ac:5)
                   * TKIP (00-0f-ac:2)
                   * CCMP-128 (00-0f-ac:4)
                   * CMAC (00-0f-ac:6)
           Available Antennas: TX 0x3 RX 0x3
           Supported interface modes:
                    * managed
                    * AP
                    * monitor
                    * P2P-client
                    * P2P-GO
            ...
        software interface modes (can always be added):
        interface combinations are not supported
        Device supports scan flush.
        Supported extended features:
 

To use the wilc1000 in soft-AP mode, one need to install additional packages in the RaspberryPi OS, such as hostapd and dnsmasq.

SGTL5000 audio codec Device Tree Overlay

After integrating support for the WILC1000 WiFi chip, we can now look at how we got the SGTL5000 audio codec to work. We wrote an overlay arch/arm/boot/dts/overlays/sgtl5000-overlay.dts with the following contents, and also edited arch/arm/boot/dts/overlays/Makefile to ensure it is built. Our sgtl5000-overlay.dts contains:

/dts-v1/;
/plugin/;

/{
    compatible = "brcm,bcm2835";

    fragment@0 {
        target-path = "/";
        __overlay__ {
            sgtl5000_mclk: sgtl5000_mclk {
                compatible = "fixed-clock";
                #clock-cells = <0>;
                clock-frequency = <12288000>;
                clock-output-names = "sgtl5000-mclk";
            };
        };
    };

    fragment@1 {
        target = <&soc>;
        __overlay__ {
            reg_1v8: reg_1v8@0 {
                compatible = "regulator-fixed";
                regulator-name = "1V8";
                regulator-min-microvolt = <1800000>;
                regulator-max-microvolt = <1800000>;
                regulator-always-on;
            };
        };
    };

    fragment@2 {
        target = <&i2c1>;
        __overlay__ {
            status = "okay";

            sgtl5000_codec: sgtl5000@0a {
                #sound-dai-cells = <0>;
                compatible = "fsl,sgtl5000";
                reg = <0x0a>;
                clocks = <&sgtl5000_mclk>;
                micbias-resistor-k-ohms = <2>;
                micbias-voltage-m-volts = <3000>;
                VDDA-supply = <&vdd_3v3_reg>;
                VDDIO-supply = <&vdd_3v3_reg>;
                VDDD-supply = <&reg_1v8>;
                status = "okay";
            };
        };
    };

    fragment@3 {
        target = <&i2s>;
        __overlay__ {
            status = "okay";
        };
    };

    fragment@4 {
        target = <&sound>;
        sound_overlay: __overlay__ {
           compatible = "simple-audio-card";
           simple-audio-card,format = "i2s";
           simple-audio-card,name = "sgtl5000";
           simple-audio-card,bitclock-master = <&dailink0_master>;
           simple-audio-card,frame-master = <&dailink0_master>;
           simple-audio-card,widgets =
               "Microphone", "Microphone Jack",
               "Speaker", "External Speaker";
           simple-audio-card,routing =
               "MIC_IN", "Mic Bias",
               "External Speaker", "LINE_OUT";
           status = "okay";
           simple-audio-card,cpu {
               sound-dai = <&i2s>;
           };
           dailink0_master: simple-audio-card,codec {
               sound-dai = <&sgtl5000_codec>;
           };
        };
    };
};

This is a much more complicated overlay, with a total of 5 fragments, which we will describe in detail:

  • fragment@0: this fragment adds a new Device Tree node that describes a clock that runs at a fixed frequency. Indeed, the sys_mclk clock of our codec is provided by an external oscillator running at 12.288 Mhz: this sgtl5000_mclk describes this external oscillator.
  • fragment@1: this fragment defines an external power supply used as the VDDD-supply of the sgtl5000 codec. Indeed, this codec is simply power by an always-on power supply at 1.8V. Don’t forget to enable both kernel options CONFIG_REGULATOR=y and CONFIG_REGULATOR_FIXED_VOLTAGE=y otherwise the driver will just silently fail to probe at boot.
  • fragment@2: this fragment describes the audio codec itself. The audio codec is described as a child of the I2C controller it is connected to: indeed the SGTL5000 uses I2C as its control interface.
  • fragment@3: this fragment simply enables the I2S interface. The CONFIG_SND_BCM2835_SOC_I2S=y kernel option must be enabled to have the corresponding driver.
  • fragment@4: this fragment describes the actual sound card. It uses the generic simple-audio-card driver, describes the two sides of the audio link: the CPU interface in the &i2s and the codec interface in the &sgtl_codec node, and describes the audio widgets and routes. See the simple-audio-card Device Tree binding

With this overlay in place, we need to enable it in the config.txt file, as well as the I2C1 overlay with a correct pin-muxing configuration:

dtparam=i2s=on
dtoverlay=sgtl5000
dtoverlay=i2c1,pins_2_3
# AUDIO_SD
gpio=40=op,dh

Once the system has booted, a new audio card is visible, and all the classic ALSA user-space utilities can be used: amixer to control the volume, aplay and arecord to play/record audio.

Conclusion

In this blog post, we have shown how Device Tree Overlays can easily be used to extend the description of the hardware, and enable the use of additional hardware devices on a Raspberry Pi system. Do not hesitate to contact us for support on Raspberry Pi platforms!

More Improvements to Raspberry Pi Display Testing

Raspberry Pi Display Support and IGT

We have been working with Raspberry Pi for quite some time, especially on areas related to the display side. Our work is part of a larger ongoing effort to move away from using the VC4 firmware for display operations and use native Linux drivers instead, which interact with the hardware directly. This transition is a long process, which requires bringing the native drivers to a point where they are efficient and reliable enough to cover most use cases of Raspberry Pi users.

Continuous Integration (CI) plays an important role in that process, since it allows detecting regressions early in the development cycle. This is why we have been tasked with improving testing in IGT GPU Tools, the test suite for the DRM subsystem of the kernel (which handles display). We already presented the work we conducted for testing various pixel formats with IGT on the Raspberry Pi’s VC4 last year. Since then, we have continued the work on IGT and brought it even further.

Improving YUV and Adding Tiled Pixel Formats Support

We continued the work on pixel formats by generalizing support for YUV buffers and reworking the format conversion helpers to support most of the common YUV formats instead of a reduced number of them. This lead to numerous commits that were merged in IGT:

In the meantime, we have also added support for testing specific tiling modes for display buffers. Tiling modes indicate that the pixel data is laid out in a different fashion than the usual line-after-line linear raster order. It provides more efficient data access to the hardware and yields better performance. They are used by the GPU (T tiling) or the VPU (SAND tiling). This required introducing a few changes to IGT as well as adding helpers for converting to the tiling modes, which was done in the following commits:

DRM Planes Support

The display engine hardware used on the Raspberry Pi allows displaying multiple framebuffers on-screen, in addition to the primary one (where the user interface lives). This feature is especially useful to display video streams directly, without having to perform the composition step with the CPU or GPU. The display engine offers features such as colorspace conversion (for converting YUV to RGB) and scaling, which are usually quite intensive tasks. In the Linux kernel’s DRM subsystem, this ability of the display engine hardware is exposed through DRM planes.

Displaying multiple DRM planes

We worked on adding support for testing DRM planes with the Chamelium board, with a fuzzing test that selects randomized attributes for the planes. Our work lead to the introduction of a new test in IGT:

Dealing with Imperfect Outputs

With the Chamelium, there are two major ways of finding out whether the captured display is correct or not:

  • Comparing the captured frame’s CRC with a CRC calculated from the reference frame;
  • Comparing the pixels in the captured and reference frames.

While the first method is the fastest one (because the captured frame’s CRC is calculated by the Chamelium board directly), it can only work if the framebuffer and the reference are guaranteed to be pixel-perfect. Since HDMI is a digital interface, this is generally the case. But as soon as scaling or colorspace conversion is involved, the algorithms used by the hardware do not result in the exact same pixels as performing the operation on the reference with the CPU.

Because of this issue, we had to come up with a specific checking method that excludes areas where there are such differences. Since our display pattern resembles a colorful checkerboard with solid-filled areas, most of the differences are only noticeable at the edges of each color block. As a result, we introduced a checking method that excludes the checkerboard edges from the comparison.

Detecting the edges (in blue) of a multi-plane pattern

This method turned out to provide good results and very few incorrect results after some tweaking. It was contributed to IGT with commit:

Underrun Detection

We also worked on implementing display pipeline underrun detection in the kernel’s VC4 DRM driver. Underruns occur when too much pixel data is provided (e.g. because of too many DRM planes enabled) and the hardware can’t keep up. In addition, a bandwidth filter was also added to reject configurations that would likely lead to an underrun. This lead to a few commits that were already merged upstream:

We prepared tests in IGT to ensure that the underruns are correctly reported, that the bandwidth protection does its job and that both are consistent. This test was submitted for review with patch:

Bootlin and Raspberry Pi Linux kernel upstreaming

Raspberry Pi logoFor a few months, Bootlin has been helping the Raspberry Pi Foundation upstream to the Linux kernel a number of display related features for the Raspberry Pi platform.

The main goal behind this upstreaming process is to get rid of the closed-source firmware that is used on non-upstream kernels every time you need to enable/access a specific hardware feature, and replace it by something that is both open-source and compliant with upstream Linux standards.

Eric Anholt has been working hard to upstream display related features. His biggest contribution has certainly been the open-source driver for the VC4 GPU, but he also worked on the display controller side, and we were contracted to help him with this task.

Our first objective was to add support for SDTV (composite) output, which appeared to be much easier than we imagined. As some of you might already know, the display controller of the Raspberry Pi already has a driver in the DRM subsystem. Our job was to add support for the SDTV encoder (also called VEC, for Video EnCoder). The driver has been submitted just before the 4.10 merge window and surprisingly made it into 4.10 (see also the patches). Eric Anholt explained on his blog:

The Raspberry Pi Foundation recently started contracting with Bootlin to give me some support on the display side of the stack. Last week I got to review and release their first big piece of work: Boris Brezillon’s code for SDTV support. I had suggested that we use this as the first project because it should have been small and self contained. It ended up that we had some clock bugs Boris had to fix, and a bug in my core VC4 CRTC code, but he got a working patch series together shockingly quickly. He did one respin for a couple more fixes once I had tested it, and it’s now out on the list waiting for devicetree maintainer review. If nothing goes wrong, we should have composite out support in 4.11 (we’re probably a week late for 4.10).

Our second objective was to help Eric with HDMI audio support. The code has been submitted on the mailing list 2 weeks ago and will hopefully be queued for 4.12. This time on, we didn’t write much code, since Eric already did the bulk of the work. What we did though is debugging the implementation to make it work. Eric also explained on his blog:

Probably the biggest news of the last two weeks is that Boris’s native HDMI audio driver is now on the mailing list for review. I’m hoping that we can get this merged for 4.12 (4.10 is about to be released, so we’re too late for 4.11). We’ve tested stereo audio so far, no compresesd audio (though I think it should Just Work), and >2 channel audio should be relatively small amounts of work from here. The next step on HDMI audio is to write the alsalib configuration snippets necessary to hide the weird details of HDMI audio (stereo IEC958 frames required) so that sound playback works normally for all existing userspace, which Boris should have a bit of time to work on still.

On our side, it has been a great experience to work on such topics with Eric, and you should expect more contributions from Bootlin for the Raspberry Pi platform in the next months, so stay tuned!