Slides and videos of Bootlin talks at Live Embedded Event #2

The second edition of Live Embedded Event took place on June 3rd, exactly 6 months after the first edition. Even though there were a few issues with the online platform, it was once again great to learn new things about embedded, and share some of the work we’ve been doing at Bootlin on various topics. For the next edition, we plan to switch to a different online platform, hopefully providing a better experience.

But in the mean time, all videos of the event have been posted on the Youtube Channel of the event. The talks from Bootlin have been posted on Bootlin’s Youtube Channel.

Indeed, in addition to being part of the organization committee, Bootlin prepared and delivered 5 talks as part of Live Embedded Event, covering different topics we have worked on in the recent months for our customers.

Understanding U-Boot Falcon Mode and adding support for new boards, Michael Opdenacker

Slides [PDF]

Introduction to RAUC, Kamel Bouhara

Slides [PDF]

Security vulnerability tracking tools in Buildroot, Thomas Petazzoni

Slides [PDF]

Secure boot in embedded Linux systems, Thomas Perrot

Slides [PDF]

Device Tree overlays and U-boot extension board management, Köry Maincent

Slides [PDF]

New training course: embedded Linux boot time optimization

For many embedded products, the issue of how much time it takes from power-on to the application being fully usable by the end-user is an important challenge. Bootlin has been providing its expertise and experience in this area to its customers for many years through numerous boot time optimization projects, and we have shared this knowledge through a number of talks at several conferences over the past years.

We are now happy to announce that we have a new training course Embedded Linux boot time optimization, open for public registration. This training course was already given to selected Bootlin customers and is now available for everyone.

Embedded Linux boot time optimization

The training course will be lead by Michael Opdenacker, Bootlin’s founder, and author of several publications on the topic of boot time optimization. The course is organized over 4 sessions of 4 hours, with a significant fraction of time spent on practical demonstrations showing on a real-life example the techniques to measure and reduce the boot time of an embedded Linux system.

As usual with Bootlin, the training materials are fully available: Agenda, Slides and Practical lab instructions.

Boot time optimization slide

Our first course open for public registration will take place from April 6th to April 9th, 2021, from 14:00 to 18:00 UTC+2 (Paris time) on each day. The session cost is 519 EUR if you take advantage of the early bird price available until March 9th. Otherwise, the regular rate is 619 EUR. You can register now for this course on Eventbrite.

Also, if you’re interested in organizing a dedicated session for your company, do not hesitate to contact us.

New training materials: boot time reduction workshop

We are happy to release new training materials that we have developed in 2013 with funding from Atmel Corporation.

The materials correspond to a 1-day embedded Linux boot time reduction workshop. In addition to boot time reduction theory, consolidating some of our experience from our embedded Linux boot time reduction projects, the workshop allows participants to practice with the most common techniques. This is done on SAMA5D3x Evaluation Kits from Atmel.

The system to optimize is a video demo from Atmel. We reduce the time to start a GStreamer based video player. During the practical labs, you will practice with techniques to:

  • Measure the various steps of the boot process
  • Analyze time spent starting system services, using bootchartd
  • Simplify your init scripts
  • Trace application startup with strace
  • Find kernel functions taking the most time during the boot process
  • Reduce kernel size and boot time
  • Replace U-Boot by the Barebox bootloader, and save a lot of time
    thanks to the activation of the data cache.

Creative commonsAs usual, our training materials are available under the terms of the Creative Commons Attribution-ShareAlike 3.0 license. This essentially means that you are free to download, distribute and even modify them, provided you mention us as the original authors and that you share these documents under the same conditions.

Special thanks to Atmel for allowing us to share these new materials under this license!

Here are the documents at last:

The first public session of this workshop will be announced in the next weeks.
Don’t hesitate to contact us if you are interested in organizing a session on your site.

Starting Linux directly from AT91bootstrap3

Here is an update for our previous article on booting linux directly from AT91bootstrap. On newer ATMEL platforms, you will have to use AT91bootstrap 3. It now has a convenient way to be configured to boot directly to Linux.

You can check it out from github:

git clone git://github.com/linux4sam/at91bootstrap.git

That version of AT91bootstrap is using the same configuration mechanism as the Linux kernel. You will find default configurations, named in the form:
<board_name><storage>_<boot_strategy>_defconfig

  • board_name can be: at91sam9260ek, at91sam9261ek, at91sam9263ek, at91sam9g10ek, at91sam9g20ek, at91sam9m10g45ek, at91sam9n12ek, at91sam9rlek, at91sam9x5ek, at91sam9xeek or at91sama5d3xek
  • storage can be:
    • df for DataFlash
    • nf for NAND flash
    • sd for SD card
  • our main interest will be in boot_strategy which can be:
    • uboot: start u-boot or any other bootloader
    • linux: boot Linux directly, passing a kernel command line
    • linux_dt: boot Linux directly, using a Device Tree
    • android: boot Linux directly, in an Android configuration

Let’s take for example the latest evaluation boards from ATMEL, the SAMA5D3x-EK. If you are booting from NAND flash:

make at91sama5d3xeknf_linux_dt_defconfig
make

You’ll end up with a file named at91sama5d3xek-nandflashboot-linux-dt-3.5.4.bin in the binaries/ folder. This is your first stage bootloader. It has the same storage layout as used in the u-boot strategy so you can flash it and it will work.

As a last note, I’ll had that less is not always faster. On our benchmarks, booting the SAMA5D31-EK using AT91bootstrap, then Barebox was faster than just using AT91bootstrap. The main reason is that barebox is actually enabling the caches and decompresses the kernel(see below, the kernel is also enaling the caches before decompressing itself) before booting.

Linux on ARM: xz kernel decompression benchmarks

I recently managed to find time to clean up and submit my patches for xz kernel compression support on ARM, which I started working on back in November, during my flight to Linaro Connect. However, it was too late as Russell King, the ARM Linux maintainer, alreadyaccepted a similar patch, about 3 weeks before my submission. The lesson I learned was that checking a git tree is not always sufficient. I should have checked the mailing list archives too.

The good news is that xz kernel compression support should be available in Linux 3.4 in a few months from now. xz is a compression format based on the LZMA2 compression algorithm. It can be considered as the successor of lzma, and achieves even better compression ratios!

Before submitting my patches, I ran a few benchmarks on my own implementation. As the decompressing code is the same, the results should be the same as if I had used the patches that are going upstream.

Benchmark methodology

For both boards I tested, I used the same pre 3.3 Linux kernel from Linus Torvalds’ mainline git tree. I also used the U-boot bootloader in both cases.

I used the very useful grabserial script from Tim Bird. This utility reads messages coming out of the serial line, and adds timestamps to each line it receives. This allow to measure time from the earliest power on stages, and doesn’t slow down the target system by adding instrumentation to it.

Our benchmarks just measure the time for the bootloader to copy the kernel to RAM, and then the time taken by the kernel to uncompress itself.

  • Loading time is measured between “reading uImage” and “OK” (right before “Starting kernel”) in the bootloader messages.
  • Compression time measured between “Uncompressing Linux” and “done”:
    ~/bin/grabserial -v -d /dev/ttyUSB0 -e 15 -t -m "Uncompressing Linux" -i "done," > booting-lzo.log

Benchmarks on OMAP4 Panda

The Panda board has a fast dual Cortex A9 CPU (OMAP 4430) running at 1 GHz. The standard way to boot this board is from an MMC/SD card. Unfortunately, the MMC/SD interface of the board is rather slow.

In this case, we have a fast CPU, but with rather slow storage. Therefore, the time taken to copy the kernel from storage to RAM is expected to have a significant impact on boot time.

This case typically represents todays multimedia and mobile devices such as phones, media players and tablets.

Compression Size Loading time Uncompressing time Total time
gzip 3355768 2.213376 0.501500 2.714876
lzma 2488144 1.647410 1.399552 3.046962
xz 2366192 1.566978 1.299516 2.866494
lzo 3697840 2.471497 0.160596 2.632093
None 6965644 4.626749 0 4.626749

Results on Calao Systems USB-A9263 (AT91)

The USB-A9263 board from Calao Systems has a cheaper and much slower AT91SAM9263 CPU running at 200 MHz.

Here we are booting from NAND flash, which is the fastest way to boot a kernel on this board. Note that we are using the nboot command from U-boot, which guarantees that we just copy the number of bytes specified in the uImage header.

In this case, we have a slow CPU with slow storage. Therefore, we expect both the kernel size and the decompression algorithm to have a major impact on boot time.

This case is a typical example of industrial systems (AT91SAM9263 is still very popular in such applications, as we can see from customer requests), booting from NAND storage operating with a 200 to 400 MHz CPU.

Compression Size Loading time Uncompressing time Total time
gzip 2386936 5.843289 0.935495 6.778784
lzma 1794344 4.465542 6.513644 10.979186
xz 1725360 4.308605 4.816191 9.124796
lzo 2608624 6.351539 0.447336 6.798875
None 4647908 11.080560 0 11.080560

Lessons learned

Here’s what we learned from these benchmarks:

  • lzo is still the best solution for minimum boot time. Remember, lzo kernel compression was merged by Bootlin.
  • xz is always better than lzma, both in terms of image size. Therefore, there’s no reason to stick to lzma compression if you used it.
  • Because of their heavy CPU usage, lzma and xz remain pretty bad in terms of boot time, on most types of storage devices. On systems with a fast CPU, and very slow storage though, xz should be the best solution
  • On systems with a fast CPU, like the Panda board, boot time with xz is actually pretty close to lzo, and therefore can be a very interesting compromise between kernel size and boot time.
  • Using a kernel image without compression is rarely a worthy solution, except in systems with a very slow CPU. This is the case of CPUs emulated on an FPGA (typically during chip development, before silicon is available). In this particular case, copying to memory is directly done by the emulator, and we just need CPU cycles to start the kernel.