How to boot an uncompressed Linux kernel on ARM

This is a quick post to share my experience booting uncompressed Linux kernel images. I was benchmarking kernel compression options, and using no compression at all was one of them.

It is sometimes useful to boot a kernel image with no compression. Though the kernel image is bigger, and takes more time to copy from storage to RAM, the kernel image no longer has to be decompressed to RAM. This is useful for systems with a very slow CPU, or very little RAM to store both the compressed and uncompressed images during the boot phase. The typical case is booting CPUs emulated by FPGA, during processor development, before the final silicon is out. For example, I saw a Cortex A15 chip boot at 11 MHz during Linaro Connect Q2.11 in Budapest. At this clock frequency, booting a kernel image with no compression saves several minutes of boot time, reducing development and test time. Note that with such hardware emulators, copying the kernel image to RAM is cheap, as it is done by the emulator from a file given by the user, before starting to emulate the system.

Building a kernel image with no compression on ARM is easy, but only once you know where the uncompressed image is and what to do! For people who have never done that before, I’m sharing quick instructions here.

To generate your uncompressed kernel image, all you have to do is run the usual make command. The file that you need is arch/arm/boot/Image.

Depending on the bootloader that you use, this could be sufficient. However, if you use U-Boot, you still need to put this image in a uImage container, to let U-Boot know about details such as how big the image is, what its entry point is, whether it is compressed or not… The problem is that you can no longer run make uImage to produce this container. That’s because Linux on ARM has no configuration option to keep the kernel uncompressed, so the uImage file would contain a compressed kernel.

Therefore, you have to create the uImage by invoking the mkimage command manually. To do this without having to guess the right mkimage parameters, I recommend running make V=1 uImage once:

$ make V=1 uImage
...
  Kernel: arch/arm/boot/zImage is ready
  /bin/bash /home/mike/linux/scripts/mkuboot.sh -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/zImage arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 13:54:00 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    3351272 Bytes = 3272.73 kB = 3.20 MB
Load Address: 80008000
Entry Point:  80008000
  Image arch/arm/boot/uImage is ready

Don’t be surprised if the above message says that the kernel is uncompressed (corresponding to -C none). The zImage file decompresses itself, so U-Boot has to treat it as plain data. If we told U-Boot that the image was compressed, it would try to uncompress it to RAM itself before starting the kernel image.

Now you know which mkimage command you need to run. Just invoke this command on the Image file instead of zImage (you can directly replace mkuboot.sh with mkimage):

$ mkimage -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/Image arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 14:02:27 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    6958068 Bytes = 6794.99 kB = 6.64 MB
Load Address: 80008000
Entry Point:  80008000

Now, you can use your uImage file as usual.

mkenvimage: a tool to generate a U-Boot environment binary image

Many embedded devices these days use the U-Boot bootloader. This bootloader stores its configuration in an area of the flash called the environment, which can be manipulated from within U-Boot using the printenv, setenv and saveenv commands, or from Linux using the fw_printenv and fw_setenv userspace utilities provided with the U-Boot source code.

This environment is typically stored in a specific flash location, defined in the board configuration header in U-Boot. The environment is basically stored as a sequence of null-terminated strings, with a little header containing a checksum at the beginning.
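
To make this layout concrete, here is a minimal C sketch that checks a single-copy environment image and counts its variables. It rests on a few assumptions of mine: the checksum is a standard CRC-32 stored in little-endian byte order in the first four bytes, it is computed over the rest of the area, and the list of strings ends with an empty string. The function names crc32_calc and env_count_vars are hypothetical, not taken from U-Boot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Assumed layout: [4-byte CRC, little-endian]["key=value\0"...]["\0"][padding]
 * Returns the number of variables, or -1 if the image looks invalid.
 */
int env_count_vars(const uint8_t *img, size_t size)
{
    if (size < 6)
        return -1;

    uint32_t stored = (uint32_t)img[0] | (uint32_t)img[1] << 8 |
                      (uint32_t)img[2] << 16 | (uint32_t)img[3] << 24;
    if (stored != crc32_calc(img + 4, size - 4))
        return -1;

    int count = 0;
    size_t pos = 4;

    /* Walk the strings until the empty string that ends the list. */
    while (pos < size && img[pos] != '\0') {
        const uint8_t *end = memchr(img + pos, '\0', size - pos);
        if (end == NULL)
            return -1; /* unterminated string */
        count++;
        pos = (size_t)(end - img) + 1;
    }
    return pos < size ? count : -1;
}
```

A host-side tool could call env_count_vars() on a dump of the environment area before reflashing it, as a cheap sanity check.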

While this environment can easily be manipulated from U-Boot or from Linux using the above mentioned commands, it is sometimes desirable to be able to generate a binary image of an environment that can be directly flashed next to the bootloader, kernel and root filesystem into the device’s flash memory. For example, on AT91 devices, the SAM-BA utility provided by Atmel is capable of completely reflashing an AT91 based system connected through the serial port or the USB device port. Or, in the factory, initial flashing of devices typically takes place either through specific CPU monitors, or through a JTAG interface. For all of these cases, having a binary environment image is desirable.

David Wagner, who was an intern with us at Bootlin from April to September 2011, has written a utility called mkenvimage which does just this: it generates a valid binary environment image from a text file describing the key=value pairs of the environment. This utility has been merged into the U-Boot Git repository (see the commit) and will therefore be part of the next U-Boot release.

With mkenvimage you can write a text file uboot-env.txt describing the environment, like:

bootargs=console=ttyS0,115200
bootcmd=tftp 22000000 uImage; bootm
[...]

Then use mkenvimage as follows:

./tools/mkenvimage -s 0x4200 -o uboot-env.bin uboot-env.txt

The -s option specifies the size of the image to create. It must match the size of the flash area reserved for the U-Boot environment. Another option worth keeping in mind is -r, which must be used when there are two copies of the environment stored in the flash, thanks to the CONFIG_ENV_ADDR_REDUND and CONFIG_ENV_SIZE_REDUND options. Unfortunately, U-Boot has chosen to have a different environment layout in those two cases, so you must tell mkenvimage whether you’re using a redundant environment or a single environment.
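
To illustrate the difference between the two layouts, here is a rough C sketch of what building such an image involves. It is not the mkenvimage source. The details are assumptions on my side: a 4-byte header (CRC only) for a single environment, a 5-byte header (CRC plus one flag byte set to 1) for a redundant one, a little-endian CRC-32 computed over the data area only, and 0xFF padding to match erased flash. The names env_crc32 and env_build_image are hypothetical, and the sketch does not try to handle comments, blank lines or any other input convenience.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit. */
uint32_t env_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Build an environment image of exactly `size` bytes from newline-separated
 * "key=value" text. Returns 0 on success, -1 if the text does not fit.
 */
int env_build_image(uint8_t *img, size_t size, const char *text, int redundant)
{
    size_t hdr = redundant ? 5 : 4; /* assumed header sizes */
    size_t len = strlen(text);

    /* Room is needed for the text plus up to two terminating NULs. */
    if (size < hdr || len + 2 > size - hdr)
        return -1;

    memset(img, 0xFF, size); /* assumed padding byte */

    uint8_t *data = img + hdr;
    for (size_t i = 0; i < len; i++)
        data[i] = (text[i] == '\n') ? '\0' : (uint8_t)text[i];

    /* Terminate the last string if needed, then add the empty string. */
    size_t pos = len;
    if (len > 0 && data[len - 1] != '\0')
        data[pos++] = '\0';
    data[pos] = '\0';

    if (redundant)
        img[4] = 1; /* assumed "active copy" flag */

    uint32_t crc = env_crc32(data, size - hdr);
    img[0] = (uint8_t)crc;
    img[1] = (uint8_t)(crc >> 8);
    img[2] = (uint8_t)(crc >> 16);
    img[3] = (uint8_t)(crc >> 24);
    return 0;
}
```

With a 0x4200-byte buffer, env_build_image(buf, 0x4200, text, 0) would correspond to the single-environment case above, and passing 1 as the last argument to the -r case. Note how the same text ends up one byte further into the image, which is why the two layouts are not interchangeable.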

This utility has proven to be really useful, as it makes it possible to automatically reflash a device with an environment known to work. It also makes it very easy to generate a different environment image per device, for example to contain the device MAC address and/or the device serial number.

Barebox 2011.03 released, with contributions from Bootlin

Barebox is a bootloader started about two years ago for embedded systems of various architectures. It plays the same role as U-Boot, which is the best known project in this area, but has several advantages over U-Boot. First, it has a much better configuration and compilation system, based on the one used by the Linux kernel: instead of the rusty include/configs/myboard.h configuration headers in U-Boot, Barebox provides a nice menuconfig/xconfig/defconfig based configuration system that everyone is familiar with. Second, Barebox has a source code organization very similar to that of the Linux kernel and has replicated the device/driver model of the kernel. This gives a nice separation between device drivers and their instantiation, and source code that looks familiar to anyone who already does kernel development.

Of course, as Barebox is newer than U-Boot, the number of architectures and platforms is more limited, but it is growing rapidly. It already supports ARM, PPC, Blackfin, x86 and a testing sandbox architecture. On ARM, the supported platforms are AT91, EP93xx, iMX, Nomadik, OMAP, S3C24xx and Versatile. On PPC, a single mpc5xxx platform is supported. Patches to add support for the NIOS architecture have also been posted recently (NIOS is a soft-core architecture from Altera).

As a young but fast-growing project, Barebox has chosen a quick development cycle: new releases are made each month, and Barebox 2011.03 was released a few days ago. It has many ARM and generic improvements, but is also the first release with contributions from Bootlin:

Gregory CLEMENT (3):
      BMP: Add support for 32bpp video frame buffer
      ARM STM/i.MX: Add possibility to choose the bit per pixel for STM video driver
      fb i.MX23/28: Add the reset control of LCD

My colleague Gregory Clement has contributed several improvements to framebuffer support on the i.MX platform. Those improvements were made in the context of a customer project, for which Barebox was used to show a nice logo on the screen immediately after the device starts up, while the system continues to boot in the background. Initially, the user had to wait 20+ seconds to see a logo on the screen showing that the system was booting. With our Barebox based solution, a logo is now visible on the screen less than 2 seconds after the power button is pushed.

Buildroot 2010.08 released!

On the last day of August, just in time, the 2010.08 version of Buildroot was released. For the record, Buildroot is an easy-to-use embedded Linux build system: it can build your toolchain, your root filesystem with all its components (Busybox, libraries, applications, etc.), your kernel and your bootloaders, or any combination of these components.

Amongst the interesting changes in this version:

  • Complete rewrite of the bootloader build code. It contained a lot of legacy, unused and unclear stuff, and is now much easier to use and extend. We’ve removed support for Yaboot and added support for the new Barebox bootloader, and all the code to support AT91Bootstrap, AT91DataFlashBoot, U-Boot, Grub and Grub 2 has been rewritten.
  • Complete rewrite of the Linux kernel build code. It was also complicated to use, with a horribly complicated kernel version selection mechanism. The new code is much easier to configure and use.
  • The configuration file .config is now located in the out-of-tree directory when the O= option is used. So typically, for an out-of-tree build (which is very convenient when using the same Buildroot source tree for different projects/tests), you could do: mkdir ~/myoutput ; make O=~/myoutput menuconfig ; make O=~/myoutput
  • Support for building NPTL toolchains with uClibc, using the latest uClibc snapshots.
  • Support for the gconfig Gtk-based configurator, in addition to the already available menuconfig and xconfig
  • A particular effort has been put on fixing many of the bugs in our Bugzilla, improving robustness thanks to automated random builds, and converting even more packages to the generic and autotools infrastructure
  • Various things have also been deprecated: support for the CRIS, IA64, Sparc64 and Alpha architectures, support for Gtk over DirectFB (which is at the moment not supported upstream), Java support (no maintainer has volunteered to maintain this in Buildroot)
  • Many components have been bumped to newer versions
  • The shared configuration cache, which sped up the configuration of different packages, has been disabled by default, since it was causing a lot of problems with certain package configurations

I’ve again contributed a significant portion of this release, being the author of the bootloader build code cleanup and the Linux kernel build code rewrite, leading an effort to reduce the number of outstanding bugs in our Bugzilla, and doing many other little things. The contributors for this release are shown below:

   175  Peter Korsgaard
   168  Thomas Petazzoni
    38  Gustavo Zacarias
    18  cmchao
     8  Luca Ceresoli
     7  Paul Jones
     6  Lionel Landwerlin
     6  Malte Starostik
     5  Yann E. MORIN
     3  Julien Boibessot
     3  Khem Raj
     2  Dmytro Milinevskyy
     2  Francois Perrad
     2  Nick Leverton
     2  Peter Huewe
     2  Stanislav Bogatyrev
     1  Baruch Siach
     1  Bjørn Forsman
     1  Daniel Hobi
     1  Darcy Watkins
     1  Darius Augulis
     1  H Hartley Sweeten
     1  Karl Krach
     1  Kelvin Cheung
     1  Ossy
     1  Sagaert Johan
     1  Simon Pasch
     1  Slava Zanko
     1  Thiago A. Correa
     1  Will Wagner
     1  Yegor Yefremov

For the next release, there are already a few things in the pipeline:

  • Cleanup of all the board support code in Buildroot, in order to cleanly add support for more boards like BeagleBoard, Qemu boards, Calao boards, etc. We’ll use the new minimal defconfig mechanism used by the kernel. I’ve already started working on this
  • Cleanup of the package download process, to support Git and SVN download. The code has already been written by Maxime Petazzoni, reviewed on the list, so I expect it to be included fairly soon
  • Rewrite of libtool handling code, to remove some of our ugly libtool hacks. The code is currently being worked on by Lionel Landwerlin
  • Support for compiling toolchain using Crosstool-NG as a backend. The code is currently being finalized by Yann E. Morin, the author of Crosstool-NG
  • Further work on package uninstallation, clean partial rebuild. Some work has been started by Lionel Landwerlin, but it needs some discussion
  • Continue the conversion of packages to the generic and autotools infrastructures
  • I also have a ton of other things on my TODO list: rework gdb/gdbserver support with external toolchains, rework the configuration of IPv6/RPC/locale/etc. with external toolchains, set up a Wiki-based Buildroot website with tutorials and better documentation, clean up the toolchain build process, reduce the number of “enhancement” bugs waiting in our Bugzilla, etc.

As Peter Korsgaard, Buildroot maintainer, said in the 2010.08 announcement: “The next release is going to be 2010.11. Expect the first release candidate in late October and the final release at the end of November.”

It is worth noting that we will be having a Buildroot Developer Day, on Friday 29th October, right after Embedded Linux Conference Europe. At least Peter Korsgaard, Lionel Landwerlin, Yann E. Morin and myself should be there.

Faster boot: starting Linux directly from AT91bootstrap

Reducing start-up time looks like one of the most discussed topics nowadays, for both embedded and desktop systems. Typically, the boot process consists of three steps:

  • First-stage bootloader
  • Second-stage bootloader
  • Linux kernel

The first-stage bootloader is often a tiny piece of code whose sole purpose is to bring the hardware into a state where it is able to execute more elaborate programs. On our testing board (CALAO TNY-A9260), it’s a piece of code that the CPU stores in internal SRAM, and its size is limited to 4 KiB, which is a very small amount of space indeed. The second-stage bootloader often provides more advanced features, like downloading the kernel from the network, looking at the contents of the memory, and so on. On our board, this second-stage bootloader is the famous U-Boot.

One way of achieving a faster boot is to simply bypass the second-stage bootloader, and directly boot Linux from the first-stage bootloader. The first-stage bootloader here is AT91bootstrap, which is an open-source bootloader developed by Atmel for their AT91 ARM-based SoCs. While this approach is somewhat static, it’s suitable for production use when the needs are simple (like simply loading a kernel from NAND flash and booting it), and it effectively reduces the boot time by not loading U-Boot at all. On our testing board, that saves about 2 s.

As we have the source, it’s rather easy to modify AT91bootstrap to suit our needs. To make things easier, we’ll boot using an existing U-Boot uImage. The only requirement is that it should be an uncompressed uImage, like the one automatically generated by make uImage when building the kernel (there’s not much point in using compressed uImage files on ARM anyway, as it is possible to build self-extracting compressed kernels on this platform).

Looking at the (shortened) main.c, the code that actually boots the kernel looks like this:

int main(void)
{
    /* ================== 1st step: Hardware Initialization ================= */
    /* Performs the hardware initialization */
    hw_init();

    /* Load from Nandflash in RAM */
    load_nandflash(IMG_ADDRESS, IMG_SIZE, JUMP_ADDR);

    /* Jump to the Image Address */
    return JUMP_ADDR;
}

In the original source code, load_nandflash loads the second-stage bootloader, and the bootstrap then jumps directly to JUMP_ADDR. This value can be found in U-Boot as TEXT_BASE, in the board-specific file config.mk; it is the base address from which the program will be executed. Now, if we want to load the kernel directly instead of a second-stage bootloader, we need to know a handful of values:

  • the kernel image address (we will reuse IMG_ADDRESS here, but one could
    imagine reading the actual image address from a fixed location in NAND)
  • the kernel size
  • the kernel load address
  • the kernel entry point

The last three values can be extracted from the uImage header. We will not hard-code the kernel size as was previously the case (using IMG_SIZE), as this would impose a maximum size on the image and force us to copy more data than necessary. All those values are stored as 32-bit big-endian integers in the header. Looking at the struct image_header declaration from image.h in the uboot-mkimage sources, we can see that the header structure is like this:

typedef struct image_header {
    uint32_t    ih_magic;           /* Image Header Magic Number */
    uint32_t    ih_hcrc;            /* Image Header CRC Checksum */
    uint32_t    ih_time;            /* Image Creation Timestamp  */
    uint32_t    ih_size;            /* Image Data Size           */
    uint32_t    ih_load;            /* Data Load Address         */
    uint32_t    ih_ep;              /* Entry Point Address       */
    uint32_t    ih_dcrc;            /* Image Data CRC Checksum   */
    uint8_t     ih_os;              /* Operating System          */
    uint8_t     ih_arch;            /* CPU architecture          */
    uint8_t     ih_type;            /* Image Type                */
    uint8_t     ih_comp;            /* Compression Type          */
    uint8_t     ih_name[IH_NMLEN];  /* Image Name                */
} image_header_t;

It’s quite easy to determine where the values we’re looking for actually are in the uImage header.

  • ih_size is the fourth member, hence we can find it at offset 12
  • ih_load and ih_ep are right after ih_size, and therefore can be found at offset 16 and 20.
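
If you would rather not count bytes by hand, the compiler can confirm these offsets. The following is a small sketch that redeclares the structure on the host, assuming IH_NMLEN is 32 as in U-Boot’s image.h. The helper uimage_header_bytes_needed is a name I made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define IH_NMLEN 32 /* image name length, assumed to match U-Boot's image.h */

typedef struct image_header {
    uint32_t ih_magic;          /* Image Header Magic Number */
    uint32_t ih_hcrc;           /* Image Header CRC Checksum */
    uint32_t ih_time;           /* Image Creation Timestamp  */
    uint32_t ih_size;           /* Image Data Size           */
    uint32_t ih_load;           /* Data Load Address         */
    uint32_t ih_ep;             /* Entry Point Address       */
    uint32_t ih_dcrc;           /* Image Data CRC Checksum   */
    uint8_t  ih_os;             /* Operating System          */
    uint8_t  ih_arch;           /* CPU architecture          */
    uint8_t  ih_type;           /* Image Type                */
    uint8_t  ih_comp;           /* Compression Type          */
    uint8_t  ih_name[IH_NMLEN]; /* Image Name                */
} image_header_t;

/* Compile-time checks of the offsets used in the text. */
_Static_assert(offsetof(image_header_t, ih_size) == 12, "ih_size at offset 12");
_Static_assert(offsetof(image_header_t, ih_load) == 16, "ih_load at offset 16");
_Static_assert(offsetof(image_header_t, ih_ep) == 20, "ih_ep at offset 20");
_Static_assert(sizeof(image_header_t) == 0x40, "header is 64 bytes long");

/* Number of leading header bytes needed to read size, load address and entry point. */
size_t uimage_header_bytes_needed(void)
{
    return offsetof(image_header_t, ih_ep) + sizeof(uint32_t);
}
```

The helper returns 24, so the three fields all sit within the first 24 bytes of the header, and the whole header is 0x40 bytes long.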

A first call to load_nandflash is necessary to get those values. As the data we need is contained within the first 32 bytes, that’s all we need to load at first. However, some space is required in memory to actually store the data. The first-stage bootloader is running in internal SRAM, so we can pick any location we want in SDRAM. For the sake of simplicity, we’ll choose PHYS_SDRAM_BASE here, which we define as the base address of the on-board SDRAM in the CPU address space. Then, a second call will be necessary to load the entire kernel image at the right load address.

Then all we need to do is:

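/* Read a 32-bit big-endian value from a byte pointer */
#define be32_to_cpu(a) ((unsigned long)(a)[0] << 24 | (unsigned long)(a)[1] << 16 | \
                        (unsigned long)(a)[2] << 8 | (unsigned long)(a)[3])
#define PHYS_SDRAM_BASE 0x20000000

int main(void)
{
    unsigned char *tmp;
    unsigned long jump_addr;
    unsigned long load_addr;
    unsigned long size;

    hw_init();

    /* Load the first 32 bytes of the uImage header into SDRAM */
    load_nandflash(IMG_ADDRESS, 0x20, PHYS_SDRAM_BASE);

    /* Setup tmp so that we can read the kernel size */
    tmp = (unsigned char *)(PHYS_SDRAM_BASE + 12);
    size = be32_to_cpu(tmp);

    /* Now, load address */
    tmp += 4;
    load_addr = be32_to_cpu(tmp);

    /* And finally, entry point */
    tmp += 4;
    jump_addr = be32_to_cpu(tmp);

    /*
     * Load the actual kernel. The 0x40-byte header is copied too, just
     * below load_addr, so the kernel itself starts at load_addr.
     */
    load_nandflash(IMG_ADDRESS, size + 0x40, load_addr - 0x40);

    return jump_addr;
}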

Note that the second call to load_nandflash could in theory be replaced by:

load_nandflash(IMG_ADDRESS + 0x40, size, load_addr);

However, this will not work. What happens is that load_nandflash starts reading at an address aligned on a page boundary, so even when passing IMG_ADDRESS+0x40 as the first argument, reading will start at IMG_ADDRESS, leading to a failure (writes have to be aligned on a page boundary, so it is safe to assume that IMG_ADDRESS is actually correctly aligned).
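
The rounding effect is easy to model. The sketch below is not the AT91bootstrap code: it only mimics the behaviour described above, with a NAND page size of 2048 bytes that I picked as an assumption, and two helper names of my own.

```c
#include <stdint.h>

#define NAND_PAGE_SIZE 2048u /* assumed page size; depends on the NAND chip */

/* Address of the first byte of the page containing addr. */
uint32_t nand_page_start(uint32_t addr)
{
    return addr & ~(NAND_PAGE_SIZE - 1u);
}

/* Offset of addr within its page. */
uint32_t nand_page_offset(uint32_t addr)
{
    return addr & (NAND_PAGE_SIZE - 1u);
}
```

With a page-aligned image address such as 0x20000 (an example value), nand_page_start(0x20000 + 0x40) gives back 0x20000 and nand_page_offset(0x20000 + 0x40) gives 0x40: the 0x40 bytes of header are read again no matter what.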

The above piece of code will silently fail if anything goes wrong, and does no checking at all – indeed, the binary size is very limited and we can’t afford to put more code than what is strictly necessary to boot the kernel.
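
Since the bootstrap itself cannot afford any checks, one option is to do them on the host, before the image is flashed. The following C sketch is one way to do it, under assumptions that you should verify against your U-Boot version: the magic number is 0x27051956, the header CRC is a standard CRC-32 computed over the 64-byte header with the ih_hcrc field zeroed, and a compression type of 0 means no compression. The function names be32_at, crc32_buf and uimage_check_header are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UIMAGE_MAGIC    0x27051956u /* assumed legacy uImage magic number */
#define UIMAGE_HDR_SIZE 64

/* Read a 32-bit big-endian value from a byte buffer. */
uint32_t be32_at(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/* Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit. */
uint32_t crc32_buf(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Returns 0 if the header looks usable by the modified bootstrap,
 * -1 if too short, -2 on bad magic, -3 on bad header CRC,
 * -4 if the image is marked as compressed.
 */
int uimage_check_header(const uint8_t *hdr, size_t len)
{
    uint8_t copy[UIMAGE_HDR_SIZE];

    if (len < UIMAGE_HDR_SIZE)
        return -1;
    if (be32_at(hdr) != UIMAGE_MAGIC)
        return -2;

    /* The CRC is computed with the ih_hcrc field (offset 4) set to zero. */
    memcpy(copy, hdr, UIMAGE_HDR_SIZE);
    memset(copy + 4, 0, 4);
    if (be32_at(hdr + 4) != crc32_buf(copy, UIMAGE_HDR_SIZE))
        return -3;

    if (hdr[31] != 0) /* ih_comp: 0 is assumed to mean "none" */
        return -4;
    return 0;
}
```

Running such a check in the script that flashes the board costs nothing at boot time, and catches the most likely mistake, which is flashing something that is not an uncompressed uImage.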