Linux 6.5 released, Bootlin contributions

Linux 6.5 was released yesterday, with as usual over 10,000 commits from a large number of contributors. We recommend reading LWN.net articles on the merge window (part 1, part 2), but also the CNX Software page that focuses on embedded-related improvements.

Bootlin contributed 76 commits to this kernel release, putting us as the #26 contributing company. This time around, our main contributions have been:

  • The large stack of patches from Luca Ceresoli on the NVidia Tegra camera interface driver finally landed: they add support for the Tegra20 parallel camera interface to the existing driver, which required a lot of changes to the driver that was so far only support Tegra210 CSI. This work allows one of our customers, who was stuck on an old vendor NVidia kernel to an upstream Linux kernel.
  • Hervé Codina contributed a driver for the Renesas X9250 potentiometer, in the IIO subsystem. This will be followed in Linux 6.6 by a glue driver that allows to expose an IIO device as an auxiliary device in the ALSA subsystem, allowing this potentiometer to be used in audio applications
  • Alexis Lothoré contributed support for the Marvell MV88E6361 Ethernet switch into the existing mv88e6xxx DSA driver
  • Maxime Chevallier contributed a new regmap-based MDIO driver, which required some changes in the regmap code. This allows the Altera TSE driver to use the existing Lynx PCS driver, and drop the custom Altera TSE PCS driver. Finally, the stmmac Ethernet driver is modified to be able to use the Lynx PCS driver as well. Quite an adventure to finally get proper PCS support with stmmac
  • Miquèl Raynal contributed improvments in the 802.15.4 stack, especially related to scanning support.
  • Miquèl Raynal contributed fixes to the sja1000 CAN driver (to avoid overrun stalls on Renesas processors), to the SPI subsystem (to avoid false timeouts for long transfers), to the DMA engine driver for Xilinx XDMA IP, and a few more.
  • Miquèl Raynal also continued his effort of improving the Device Tree bindings for MTD NAND controllers
  • Luca Ceresoli added sound card support to the MSC SM2-MB-EP1 carrier board, which runs a i.MX8MP SoM, and he also fixed the timings for one of the panels supported by the simple-panel driver

Here are the details of all our changes that went into Linux 6.5:

Linux 6.4 released, Bootlin contributions inside

Linux 6.4 was released on June 25, just before the start of the Embedded Open Source Summit in Prague. As usual, lots of changes in Linux 6.4, and we recommend reading LWN coverage of the merge window (part 1, part 2). Sadly, the usual KernelNewbies page hasn’t received a lot of attention, contributions are probably welcome to revive this useful resource.

With 59 commits from Bootlin engineers, Bootlin is ranked as the #28 contributing company by number of commits for this 6.4 release, according to contribution statistics. Our main contributions have been:

  • Alexis Lothoré and Clément Léger contributed a few fixes to the Renesas RZ/N1 A5PSW Ethernet switch driver
  • Hervé Codina contributed a number of new drivers needed to support complex audio setups on some relatively old Freescale PowerPC 32-bit platforms: a driver for the Time Slot Assigner (TSA), a driver for the QUICC Multichannel Controller (QMC), and an ALSA driver that provides audio support over QMC. We have more contributions coming in this area, most notably to support HDLC network traffic over QMC.
  • Kamel Bouhara added support for the TI TAS5733 audio codec in the existing tas571x driver
  • Luca Ceresoli improved the fsl-ldb driver, used on NXP i.MX8MP and i.MX93 for the built-in DPI-to-LVDS encoder. Luca’s improvement allows to use LVDS channel 1 only, while the driver initially supported using either LVDS channel 0, or LVDS channel 0 and 1 combined.
  • Maxime Chevallier contributed an improvement to the regmap code, which allows upshifting register addresses before performing operations
  • Maxime Chevallier also contributed some small fixes to the phylink code related to previous work on QUSGMII support
  • Miquèl Raynal contributed the support for Real-While-Write in the MTD SPI-NOR subsystem. This allows to perform read operations while erase/program operations are on-going, which helps to reduce read latencies. This of course only works on SPI NOR chips that support this feature.
  • Miquèl Raynal contributed several improvements to the NVMEM subsystem. First, a brand new NVMEM driver capable of parsing the ONIE TLV information, as defined by the ONIE spec used on network equipment. Second, he contributed changes that allow NVMEM layout drivers to be compiled as kernel modules rather than being built-in

And the full details of our contributions:

Boot time: choose your kernel loading address carefully

When the compressed and uncompressed kernel images overlap

At least on ARM32, there seems to be many working addresses where the compressed kernel can be loaded in RAM. For example, one can load the compressed kernel at offset 0x1000000 (16 MB) from the start of RAM, and the Device Tree Blog (DTB) at offset 0x2000000 (32 MB). Whatever this loading address, the kernel is then decompressed at offset 0x8000 from the start of RAM, as explained this the famous How the ARM32 Linux kernel decompresses article from Linus Walleij.

There is a potential issue with the loading address of the compressed kernel, as explained in the article too. If the compressed kernel is loaded too close to the beginning of RAM, where the kernel must be decompressed, there will be an overlap between the two. The decompressed kernel will overwrite the compressed one, potentially breaking the decompression process.

Overlapping compressed and decompressed kernel

As you see in the above diagram, when this happens, the bootstrap code in the compressed kernel will first copy the compressed image to a location that’s far enough to guarantee that the decompressed kernel won’t overlap it. However, this extra step in the boot process has a cost.

Measuring boot time impact

In the context of updating our materials for our upcoming Embedded Linux Boot Time Optimization course in June, we measured this additional time on the STM32MP157A-DK1 Discovery Kit from STMicroelectronics, with a dual-core ARM Cortex-A7 CPU running at 650 MHz.

Initially, in our Embedded Linux System Development course, we were booting the DK1 board as follows:

ext4load mmc 0:4 0xc0000000 zImage; ext4load mmc 0:4 0xc4000000 dtb; bootz 0xc0000000 - 0xc4000000

0xc0000000 is exactly the beginning of RAM! We are therefore in the overlap situation.

We used grabserial from Tim Bird to measure the time between Starting kernel in U-Boot and when the compressed kernel starts executing (Booting Linux on physical CPU 0x0):

...
[4.451996 0.000124] Starting kernel ...
[0.001838 0.001838] 
[2.439980 2.438142] [    0.000000] Booting Linux on physical CPU 0x0
...

On a series of 5 identical tests, we obtained an average time of 2,440 ms, with a standard deviation of 0.4 ms.

Then, we measured the optimum case, in which the compressed kernel is loaded far enough from the beginning of RAM so that no overlap is possible:

No overlap between compressed and decompressed kernel

Here we chose to load the kernel at 0xc2000000:

ext4load mmc 0:4 0xc2000000 zImage; ext4load mmc 0:4 0xc4000000 dtb; bootz 0xc2000000 - 0xc4000000

On a series of 5 identical tests, we obtained an average time of 2,333 ms, with a standard deviation of 0.7 ms.

The new average is 107 ms smaller, which you are likely to consider as a worthy reduction, if you have experience with boot time reduction projects.

What to remember

In your embedded projects, if you are using a compressed kernel, make sure it is loaded far enough from the beginning of RAM, leaving enough space for the decompressed kernel to fit in between. Otherwise, your system will still be able to boot, but depending on the speed of your CPU and storage, it will be slower, from a few tens to a few hundreds of milliseconds.

We checked the How to optimize the boot time page on the STM32 wiki, and it recommends optimum loading addresses: 0xc2000000 for the kernel and 0xc4000000 for the device tree. This way, the upper limit for the decompressed kernel is 32 MB, which is more than enough.

If you are directly using an uncompressed kernel, which is more rare, you should also make sure that it is loaded at an optimum location, so that there is no need to move it before starting it.

Linux 6.2 released, Bootlin contributions inside

Linux 6.2 was released a few days ago, and as usual we point our readers to the LWN coverage of the merge window (part 1 and part 2), or the traditional KernelNewbies page or alternatively the embedded focused CNX Software coverage.

At Bootlin, we contributed a total of 122 patches to this release, making Bootlin the 21st contributing company by number of commits according to statistics. Also Bootlin engineer Paul Kocialkowski appears in the top developers by changed lines in the Linux 6.2 statistics.

Continue reading “Linux 6.2 released, Bootlin contributions inside”

Test a Linux kernel USB Device Controller driver with testusb

At Bootlin, we recently developed from scratch a new Linux driver for the USB Device Controller found in the Renesas RZ/N1 processor. This driver is already accepted upstream, is currently visible in linux-next and should hopefully be part of the upcoming Linux 6.3 release.

As part of developing this driver, we of course had to… test it! To test a USB Device Controller driver, the obvious idea that comes to mind is to use the available USB gadget drivers in the Linux kernel, to expose a USB mass-storage device, a USB network device, etc. However, these existing USB gadget drivers are not necessarily the best option for this kind of testing: they perform some more or less complex transfers and it can be difficult to find the root cause of an error using these gadget drivers.

Fortunately, a tool exists precisely to perform testing of USB transfers: this tool is called testusb, and it can be found directly in the Linux kernel source code in tools/usb/testusb.c. The tool is quite old and not very well known, but it proved to be very useful for our testing, so in this blog post we are sharing some details on how to use it.

Continue reading “Test a Linux kernel USB Device Controller driver with testusb”

Linux 6.1 released, Bootlin contributions

Linux 6.1 has been released yesterday, a week later than expected. Head over to LWN (part 1, part 2) or KernelNewbies for an overview of the major features merged in this release.

For this release, Bootlin contributed a total of 38 patches, with the following highlights:

  • Maxime Chevallier added initial support for the QUSGMII PHY mode, together with supporting code in the lan966x MAC driver and lan966x PHY driver.
  • Maxime Chevallier added a new PCS driver for the Altera PSE
  • Maxime Chevallier converted the Altera TSE MAC driver to phylink
  • Paul Kocialkowski contributed many improvements to the Allwinner sun6i camera interface driver, which are preparation commits to introduce support for interacting with the Allwinner ISP

Continue reading “Linux 6.1 released, Bootlin contributions”

Device Tree phandle: the C code point of view

Introduction

In this blog post, we’ll discuss the phandle properties used in Device Tree. These properties are used to describe a relationship between components described in the Device Tree. Many blog posts describe this property from the Device Tree source point of view (you can for example have a look at https://elinux.org/Device_Tree_Mysteries#Phandle for details related to Device Tree source). In this blog post, we want to take a different approach, and discuss how to handle this type of property from the Linux kernel C code point of view.

Continue reading “Device Tree phandle: the C code point of view”

Linux 6.0 released, Bootlin contributions

Linux 6.0 has been released two weeks ago, and Linux 6.1-rc1 is already out of the door, but we didn’t get the chance to look at the contributions made by Bootlin to the Linux 6.0 release. Before we do that, let’s provide our usual must-read articles on Linux 6.0: the Linux 6.0 merge window part 1 and Linux 6.0 merge window part 2 LWN.net articles and the KernelNewbies.org article.

On Bootlin side, our significant contributions to this release have been:

  • Clément Léger contributed a new driver for the Ethernet switch found in the Renesas RZ/N1 processor, as well as a PCS driver for the MII converter of the same processor. Obviously, this came with the related Device Tree bindings and Device Tree changes, but also with a few small changes in the DSA subsystem.
  • Hervé Codina enabled support for the PCIe controller found in the same Renesas RZ/N1 processor, which in fact does not allow to use PCIe devices, but USB devices: this PCIe controller is only used to connect to an internal USB controller in the chip, which therefore allows to use USB devices.
  • Köry Maincent extended the existing mpc4922 DAC IIO driver to also support the mpc4921 variant, which has only one output channel instead of two.
  • Luca Ceresoli contributed several improvements to the I2C subsystem documentation.
  • Paul Kocialkowski contributed a new DRM driver for the logiCVC-ML display controller IP
  • Paul Kocialkowski contributed two new V4L drivers for the MIPI CSI-2 camera interfaces available in the Allwinner A31 family of processors (sun6i) and the Allwinner A83T family of processors (sun8i).

Here is the full details of our contributions, commit by commit:

A journey in the RTC subsystem

As part of a team effort to improve the upstream Linux kernel support for the Renesas RZ/N1 ARM processor, we had to write from scratch a new RTC driver for this SoC. The RTC subsystem API is rather straightforward but, as most kernel subsystems, the documentation about it is rather sparse. So what are the steps to write a basic RTC driver? Here are some pointers.

The registration

The core expects drivers to allocate, initialize and then register a struct rtc_device with the device managed helpers: devm_rtc_allocate_device() and devm_rtc_register_device(). Between these two function calls, one will be required to provide at least a set of struct rtc_class_ops which contains the various callbacks used to access the device from the core, as well as setting a few information about the device.

The kind of information expected is the support for various features (rtcdev->features bitmap) as well as the maximum continuous time range supported by your RTC. If you do not know the actual date after which your device stops being reliable, you can use the rtc-range test tool from rtc-tools, available at https://git.kernel.org/pub/scm/linux/kernel/git/abelloni/rtc-tools.git (also available as a Buildroot package). It will check the consistency of your driver against a number of common known-to-be-failing situations.

Time handling

The most basic operations to provide are ->read_time() and ->set_time(). Both functions should play with a struct rtc_time which describes time and date with members for the year, month, day of the month, hours (in 24-hour mode), minutes and seconds. The week day member is ignored by userspace and is not expected to be set properly, unless it is actively used by the RTC, for example to set alarms. There are then three popular ways of storing time in the RTC world:

  1. either using the binary values of each of these fields
  2. or using a Binary Coded Decimal (BCD) version of these fields
  3. or, finally, by storing a timestamp in seconds since the epoch

In BCD, each decimal digit is encoded using four bits, eg. the number 12 could either be coded by 0x0C in hexadecimal, or 0x12 in BCD, which is easier to read with a human eye.

The three representations are absolutely equivalent and you are free to convert the time from one system to another when needed:

  • #1 <-> #2 conversions are done with bcd2bin() and bin2bcd() (from linux/bcd.h)
  • #1 <-> #3 conversions are done with rtc_time64_to_tm() and rtc_tm_to_time64() (from linux/rtc.h)

While debugging, it is likely that you will end up dumping these time structures. Note that struct rtc_time is aligned on struct tm, this means that the year field is the number of years since 1900 and the month field is the number of months since January, in the range 0 to 11. Anyway, dumping these fields manually is a loss of time, it is advised instead to use the dedicated RTC printk specifiers which will handle the conversion for you: %ptR for a struct rtc_time, %ptT for a time64_t.

Of course, when reading the actual time from multiple registers on the device and filling those fields, be aware that you should handle possible wrapping situations. Either the device has an internal latching mechanism for that (eg. the front-end of the registers that you must read are all frozen upon a specific action) or you need to verify this manually by, for instance, monitoring the seconds register and try another read if it changed between the beginning and the end of the retrieval.

If your device continuous time range ended before 2000 you may want to shift the default hardware range further by providing the start-year device tree property. The core will then shift the Epoch further for you.

Finally, once done, you can verify your implementation by playing with the rtc test tool (also from rtc-tools).

Supporting alarms

One common RTC feature is the ability to trigger alarms at specific times. Of course it’s even better if your RTC can wake-up the system.

If the device or the way it is integrated doesn’t support alarms, this should be advertised at registration time by clearing the relevant bit (RTC_FEATURE_ALARM, RTC_FEATURE_UPDATE_INTERRUPT). In the other situations, it is relevant to indicate whether the RTC has a second, 2-seconds or minute resolution by setting the appropriate flag (RTC_FEATURE_ALARM_RES_2S, RTC_FEATURE_ALARM_RES_MINUTE). Mind when testing that querying an alarm time below this resolution will return a -ETIME error.

When implementing the ->read_alarm(), ->set_alarm() and ->alarm_irq_enable() hooks, be aware that the update and periodic alarms are now implemented in the core, using HR timers rather than with the RTC so you should focus on the regular alarm. The read/set hooks naturally allow to read and change the alarm settings. A struct rtc_wkalrm *alrm is passed as parameter, alrm->time is the struct rtc_time and alrm->enabled the state of the alarm (which must be set in ->set_alarm()). The third hook is an asynchronous way to enable/disable the alarm IRQ.

The interrupt handler for the alarm is required to call rtc_update_irq() to signal the core that an alarm happened, providing the RTC device, the number of alarms reported (usually one), and the RTC_IRQF flag OR’ed with the relevant alarm flag (likely, RTC_AF for the main alarm).

Oscillator offset compensation

RTC counters rely on very precise clock sources to deliver accurate times. To handle the situation where the source is not matching the expected precision, which is the case with most cheap oscillators on the market, some RTCs have a mechanism allowing to compensate for the frequency variation by incrementing or skipping the RTC counters at a regular interval in order to get closer to the reality.

The RTC subsystem offers a set of callbacks, ->read_offset() and a ->set_offset(), where a signed offset is passed in ppb (parts per billion).

As an example, if an oscillator is below its targeted frequency of 32768 Hz and is measured to run at 32767.7 Hz, we need to offset the counter by 1 - (32767.7/32768) = 9155 ppb. If the RTC is capable of offsetting the main counter once every 20s it means that every 20s, this counter (which gets decremented at the frequency of the oscillator to produce the “seconds”) will start at a different value than 32768. Adding 1 to this counter every 20s would basically mean earning 1 / (32768 * 20) = 1526 ppb. Our target being 9155 ppb, we must offset the counter by 9155 / 1526 = 6 every 20s to get a compensated rate of 32767.7 + (6 / 20) = 32768 Hz.

Upstreaming status of the RZ/N1 RTC driver

The RZ/N1 RTC driver has all the features listed above and made its way into the v5.18 Linux kernel release. Hopefully this little reference sheet will encourage others to finalize and send new RTC drivers upstream!

Bootlin at Linux Plumbers conference 2022

Next week, almost the entire Bootlin team will be at the Embedded Linux Conference Europe in Dublin, see our previous blog post on this topic. We will give four talks at this event, on a variety of Linux kernel and embedded Linux topics.

During the same week, also in Dublin albeit in a different location, will take place the Linux Plumbers conference. Bootlin engineer Miquèl Raynal will give a talk at Linux Plumbers, as part of the IoTs a 4-Letter Word micro-conference. Miquèl’s talk will discuss Linux IEEE 802.15.4 MLME improvements, as Miquèl has been working for several months on bringing improvements to the 802.15.4 stack in the Linux kernel.