Systemd, read-only rootfs and overlay file system over /etc

Systemd is a popular init system, used to bootstrap user space and manage user processes. It now replaces several Linux utilities with its own components like log management, networking, time management, etc. There is even a bootloader component now. Systemd is obviously ubiquitous nowadays for desktop/server Linux distributions, and is also commonly used on embedded devices to benefit from features such as parallel startup of services, monitoring of services, and more.

In a recent project that uses Buildroot as its build system, we have used systemd with the storage consisting of a read-only root filesystem (SquashFS) and an overlay file system (OverlayFS) mounted on /etc. While doing this, we faced two issues with the use of OverlayFS on /etc:

  • /etc/machine-id file management. This file is created during the first boot by systemd, and if the root filesystem is read-only, it will bind mount it to /run and wait to have read-write access to create it (see more details). In that case, the machine-id file is re-generated at each boot (because /run is a tmpfs, which means that the machine identification changes at each boot, which is not necessarily desirable. On the other hand, we don’t want to machine-id file to be part of the SquashFS filesystem because the SquashFS filesystem is identical on all devices, while the /etc/machine-id file is unique per device. So ideally, we would like this machine-id file to be stored in our OverlayFS, generated during the first boot. The issue is that reading the machine-id file is done very early by systemd, before we get the chance to mount the OverlayFS.
  • We wanted to be able to add or modify systemd services using the OverlayFS. Systemd parses the service files at early init and executes them according to their order and dependencies. The service mounting the filesystems from /etc/fstab and any other services is started after such parsing, which is too late. We could think of running daemon-reload from a custom service once mounting was complete, but this is not really a stable solution, as
    Lennart Poettering commanted on in a short e-mail thread about this issue.

The solution suggested by Lennart, and elsewhere on the wider Internet is to mount the OverlayFS from an initramfs, which allows to have it setup before systemd even starts. As we use Buildroot and using an initramfs adds complexity by requiring a separate configuration to manage multiple images. This was overkill in our case, just for setting up the overlay. The solution we eventually chose was to create an init_overlay.sh script which is started as init before systemd, by adding init=/sbin/init_overlay.sh to the kernel command line:

#!/bin/sh
mount -t proc -o nosuid,nodev,noexec none /proc
mount -t sysfs -o nosuid,nodev,noexec none /sys
mount /dev/mmcblk0p2 /mnt/data
mount -t overlay overlay -o lowerdir=/etc,upperdir=/mnt/data/etc,workdir=/mnt/data/.etc-work /etc
exec /sbin/init

Hopefully, this will be useful to others. Of course, we’re also curious to hear if others faced the same issue, and discover how they solved this. Let us know in the comments.

Bootlin contributes SquashFS support to U-Boot

SquashFS is a very popular read-only compressed root filesystem, widely used in embedded systems. It has been supported in the Linux kernel for many years, but so far the U-Boot bootloader did not have support for SquashFS, so it was not possible to load a kernel image or a Device Tree Blob from a SquashFS filesystem in U-Boot.

Between February 2020 and August 2020, João Marcos Costa from the ENSICAEN engineering school, has worked at Bootlin as an intern. João’s internship goal was specifically to implement and contribute to U-Boot the support for the SquashFS filesystem. We are happy to announce that João’s effort has now completed, as the support for SquashFS is now in upstream U-Boot. It can be found in fs/squashfs/ in the U-Boot source code.

More specifically, João’s contributions have been:

In addition to those contributions already merged, João has also submitted for inclusion the support for LZO and ZSTD decompression support.

Practically speaking, this SquashFS support works very much like the support for other filesystems. At build time, you need to enable the CONFIG_FS_SQUASHFS option for the SquashFS driver itself, and CONFIG_CMD_SQUASHFS for the SquashFS U-Boot commands. Once enabled, in U-Boot, you get:

=> sqfsls     
sqfsls - List files in directory. Default: root (/).
 
Usage:
sqfsls  [] [directory]
    - list files from 'dev' on 'interface' in 'directory'
 
=> sqfsload 
sqfsload - load binary file from a SquashFS filesystem
 
Usage:
sqfsload  [ [ [ [bytes [pos]]]]]
    - Load binary file 'filename' from 'dev' on 'interface'
      to address 'addr' from SquashFS filesystem.
      'pos' gives the file position to start loading from.
      If 'pos' is omitted, 0 is used. 'pos' requires 'bytes'.
      'bytes' gives the size to load. If 'bytes' is 0 or omitted,
      the load stops on end of file.
      If either 'pos' or 'bytes' are not aligned to
      ARCH_DMA_MINALIGN then a misaligned buffer warning will
      be printed and performance will suffer for the load.

sqfsls is obviously used to list files, here the list of files from a typical Linux root filesystem:

=> sqfsls mmc 0:1
            bin/
            boot/
            dev/
            etc/
            lib/
    <SYM>   lib32
    <SYM>   linuxrc
            media/
            mnt/
            opt/
            proc/
            root/
            run/
            sbin/
            sys/
            tmp/
            usr/
            var/
 
2 file(s), 16 dir(s)

And then you can use sqfsload to load files, which we illustrate here by loading a Linux kernel image and Device Tree blob, and booting this kernel:

=> sqfsload mmc 0:1 $kernel_addr_r /boot/zImage
6160384 bytes read in 433 ms (13.6 MiB/s)
=> sqfsload mmc 0:1 0x81000000 /boot/am335x-boneblack.dtb
40817 bytes read in 11 ms (3.5 MiB/s)
=> setenv bootargs console=ttyO0,115200n8
=> bootz $kernel_addr_r - 0x81000000
## Flattened Device Tree blob at 81000000
   Booting using the fdt blob at 0x81000000
   Loading Device Tree to 8fff3000, end 8fffff70 ... OK
 
Starting kernel ...
 
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.19.79 (joaomcosta@joaomcosta-Latitude-E7470) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05)) #1 SMP Fri May 29 18:26:39 CEST 2020
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d

Of course, the SquashFS driver is still fresh, and there is a chance that more extensive and widespread testing will uncover a few bugs or limitations, which we’re sure the broader U-Boot community will help address. Overall, we’re really happy to have contributed this new functionality to U-Boot, it will be useful for our projects, and we hope it will be useful to many others in the embedded Linux community!