ELCE 2009 videos

Videos from the Embedded Linux Conference Europe, Grenoble, October 2009

ELCE 2009Just a few weeks before the next edition of the Embedded Linux Conference in San Francisco, here are the videos from the previous edition in Europe a few months ago.

These videos were shot by Satoru Ueda and Tim Bird (Sony), Ruud Derwig (NXP) and by Thomas Petazzoni and Michael Opdenacker (Bootlin). As usual, they are released under the terms of the Creative Commons Attribution – ShareAlike Licence version 3.0.

Ruud DerwigIf you have never been to an Embedded Linux Conference yet, these videos should show you how useful this conference is for embedded Linux system developers. This is the place where you can discover new development tools and technologies that will change your working life, benefit from the experience from your peers, get the opportunity to talk to the fantastic people who implement the Free and Open Source software that makes your system run, and win cool penguin goodies. So, don’t miss next next edition in San Francisco.

ELC 2010 program announced

Japantown, San FranciscoThe program of talks and BOFs of the 2010 edition of the Embedded Linux Conference has been published a few days ago, an opportunity to look at the most important and interesting conference for embedded Linux developers. For the record, ELC 2010 will take place from April, 12th to April, 14th in San Francisco, CA, USA, in the same place as the 2009 edition.

A nice set of talks

  • A set of real-time related talks: Real-Time Linux Failure, by Frank Rowand (works for Sony, well known for his preempt-rt related talks at various ELC conferences), Effective Use of RT-Preempt, by Kevin Dankwardt, Using Interrupt Threads to Prioritize Interrupts, by Mike Anderson (also well known for his very interactive talks, he will also be giving his traditional Using JTAG to debug Linux device drivers tutorial), Measuring Responsiveness of Linux Kernel on Embedded System, by YungJoon Jung and DongHyouk Lim.
  • A talk by Grant Likely about Flattened Device Tree ARM support update, an effort to convert the ARM architecture to the same organization used in PowerPC, with a device tree file describing the hardware details instead of platform_device definitions in plain C. An important change for anyone doing ARM kernel development.
  • Several power-management related talks: Runtime Power Management: Overview and Platform Implementation, by Kevin Hillman (who works for Deep Root Systems and has done a huge amount of work in the OMAP power management area). Runtime Power Management is probably the most important change done recently to the power management infrastructure of the Linux kernel, so this talk is certainly worth a look, all the more as Kevin is a very good speaker. On power manegement, there will also be other talks : DVFS for the Embedded Linux, by Yong Bon Koo and Youngbin Seo, Wake-ups effect on idle power for Intel’s Moorestown MID and smartphone platform, by German Monroy (Intel), Workload based aggressive Power Management on the Intel Moorestown MID and future Intel MID/Smartphone Platforms, by Sujith Thomas (from Intel).
  • Japan Town, San FranciscoThe usual tracing-related talks, with Using the LTTng tracer for system-wide performance analysis and debugging by Mathieu Desnoyers and Ftrace – embedded edition, by Steven Rostedt. A talk on debugging Linux toolchain overview with advanced debugging and tracing features, by Dominique Toupin.
  • Talks about platforms: a keynote by Greg Kroah Hartmann on Android: a case study of an embedded Linux project (during which Greg will probably explain why the Android kernel modifications are not mainlined), Experiences in Android Porting, Lessons learned, tips and tricks, by Mark Gross and Understanding and Developing Applications for the Maemo Platform, by Leandro Melo de Sales, even though the recent merge of Maemo and Moblin to create MeeGo is likely to change some technical aspects of application development for this platform.
  • The question of multi-core now also seems to be present in embedded conferences: Strategies for Migrating Uniprocessor Code to Multi-Core, by Mike Anderson, Embedded Multi-core with Adeos, Dan Malek, Lock-free algorithm for Multi-core architecture, Hiromasa Kanda. Multi-core Scheduling optimizations for soft real-time multi-threaded applications – A cooperation aware approach, Lucas Martins De Marchi.
  • Some security talks, with Mike Anderson (again !) talking about Creating a Secure Router Using SELinux and Jake Edge about Understanding threat models for embedded devices
  • Some more-or-less multimedia-oriented talks: Supporting SoC video subsystems in video4linux, by Hans Verkuil, An Introduction to the Qt Development Framework, by Jeremy Katz, GeeXboX Enna: embedded Media Center, by Benjamin Zores, Case Study – Embedded Linux in a digital television STB, by Melanie Rhianna Lewis
  • In the other talks, I’ve noted the Small Business Owners BOF by Grant Likely, Evaluation of Data Reliability on Linux File Systems by Yoshitake Kobayashi, Porting the Linux Kernel to x86 MID platforms, by Jacob Pan, Linux without a bootloader? by Greg Ungerer, Kexec – Ready for Embedded Linux by Magnus Damn, Custom hardware modeling for FPGAs and Embedded Linux Platforms with QEMU, by John Williams, Edgar Iglesias.

Both Michael Opdenacker and I will be there at ELC. We hope to meet you during this conference!

Linux 2.6.33 features for embedded systems

Interesting features for embedded Linux system developers

Penguin workerLinux 2.6.33 was out on Feb. 24, 2010, and to incite you to try this new kernel in your embedded Linux products, here are features you could be interested in.

The first news is the availability of the LZO algorithm for kernel and initramfs compression. Linux 2.6.30 already introduced LZMA and BZIP2 compression options, which could significantly reduce the size of the kernel and initramfs images, but at the cost of much increased decompression time. LZO compression is a nice alternative. Though its compression rate is not as good as that of ZLIB (10 to 15% larger files), decompression time is much faster than with other algorithms. See our benchmarks. We reduced boot time by 200 ms on our at91 arm system, and the savings could even increase with bigger kernels.

This feature was implemented by my colleague Albin Tonnerre. It is currently available on x86 and arm (commit, commit, commit, commit), and according to Russell King, the arm maintainer, it should become the default compression option on this platform. This compressor can also be used on mips, thanks to Wu Zhangjin (commit).

For systems lacking RAM resources, a new useful feature is Compcache, which allows to swap application memory to a compressed cache in RAM. In practise, this technique increases the amount of RAM that applications can use. This could allow your embedded system or your netbook to run applications or environments it couldn’t execute before. This technique can also be a worthy alternative to on-disk swap in servers or desktops which do need a swap partition, as access performance is much improved. See this LWN.net article for details.

This new kernel also carries lots of improvements on embedded platforms, especially on the popular TI OMAP platform. In particular, we noticed early support to the IGEPv2 board, a very attractive platform based on the TI OMAP 3530 processor, much better than the Beagle Board for a very similar price. We have started to use it in customer projects, and we hope to contribute to its full support in the mainline kernel.

Another interesting feature of Linux 2.6.33 is the improvements in the capabilities of the perf tool. In particular, perf probe allows to insert Kprobes probes through the command line. Instead of SystemTap, which relied on kernel modules, perf probe now relies on a sysfs interface to pass probes to the kernel. This means that you no longer need a compiler and kernel headers to produce your probes. This made it difficult to port SystemTap to embedded platforms. The arm architecture doesn’t have performance counters in the mainline kernel yet (other architectures do), but patches are available. This carries the promise to be able to use probe tools like SystemTap at last on embedded architectures, all the more if SystemTap gets ported to this new infrastructure.

Other noticeable improvements in this release are the ability to mount ext3 and ext2 filesystems with just an ext4 driver, a lightweight RCU implementation, as well as the ability to change the default blinking cursor that is shown at boot time.

Unfortunately, each kernel release doesn’t only carry good news. Android patches got dropped from this release, because of a lack of interest from Google to maintain them. These are sad news and a threat for Android users who may end up without the ability to use newer kernel features and releases. Let’s hope that Google will once more realize the value of converging with the mainline Linux community. I hope that key contributors that this company employs (Andrew Morton in particular) will help to solve this issue.

As usual, this was just a selection. You will probably find many other interesting features on the Linux Changes page for Linux 2.6.33.

Buildroot 2010.02 released, contributions from Bootlin!

Buildroot logoBuildroot is a embedded Linux system build system. It automates the process of downloading, configuring, compiling and installating all the components of an embedded Linux system, from Busybox to more complicated software stacks using Gtk, Qt, X.org, Gstreamer, etc. Buildroot is easy to use and extend, making it a nice choice for small to medium-sized embedded Linux systems.

As promised by the fixed-release schedule, a 2010.02 release has been published on Friday, with numerous improvements over the previous version 2009.11, many of which are part of the general cleanup process that the project is doing since the beginning of 2009. These improvements are detailed in the project CHANGES file.

Thomas Petazzoni, from Bootlin, implemented several of these improvements :

  • Creation of a package infrastructure for non-autotools packages. Buildroot had for a long time an infrastructure to factorize the code needed to build packages based on the autotools build systems. But all other packages were using hand-made Makefiles, which were hard to write and generated a lot of code duplication. Therefore, we have introduced an infrastructure that makes adding new packages much easier, and which allows us to cleanup the existing codebase significantly by factorizing a lot of common code. The autotools infrastructure has also been reworked on top of the generic infrastructure to avoid code duplication as well. At the same time, we have significantly improved the documentation on how to add new packages. This infrastructure is a building block that will allow us to easily add more features to all packages in Buildroot (such as package generation).
  • Removal of the external toolchain source mechanism, which was merged with the normal toolchain building procedure. This special casing was implemented to allow the compilation of AVR32 toolchains, but such an additional complexity wasn’t needed. Now, Buildroot continues to build AVR32 toolchains as it used to do, but the code is much cleaner. Another illustration of our large cleanup effort.
  • Many, many, many fixes to different packages, many of them to ensure that we do not depend on development packages being installed on the host. This is very important to ensure that our build procedure is as independent as possible from the development machine configuration.

From the list of contributors, ordered by the number of patches, Thomas Petazzoni of Bootlin has been the first Buildroot contributor for this last release :

$ git shortlog -e -n -s 2009.11..2010.02  (removed e-mail addresses)
139 Thomas Petazzoni
124 Peter Korsgaard
26  Lionel Landwerlin
23  Gustavo Zacarias
7   Julien Boibessot
4   Nigel Kukard
4   Sven Neumann
2   Anders Darander
2   Chris Packham
2   Daniel Mack
2   H Hartley Sweeten
2   Richard van Paasen
2   Will Wagner
2   William Wagner
2   Yann E. MORIN
1   Cameron Hutchison
1   Clark Rawlins
1   Francisco Gonzalez
1   Francisco Gonzalez Morell
1   Hans-Christian Egtvedt
1   Lionel Landwerlin
1   Ormund Williams
1   Rob Alley
1   Sagaert Johan
1   grante

For the next release, we will work on additional cleanup of Buildroot and particularly the target/ directory, which contains the code to build the Linux kernel, different bootloaders, and to generate the final root filesystem image in various formats. Improving support for external toolchains is also on our TODO list : supporting multilib toolchains such as the CodeSoucery toolchain, and fixing a long-standing issue with libtool.

Don’t hesitate to try Buildroot, and to report your successes and failures on the mailing-list, in our bug tracker, or on our IRC channel, #uclibc on Freenode.

Embedded Linux practical labs with the Beagle Board

Note: the materials for training with the Beagle Board are no longer available, and would be significantly out of date anyway. We advise you to check our Embedded Linux System Development and Linux Kernel and Driver Development training courses for up-to-date instructions that work on cheaper boards, which are still available on the market today. And if you still have an old Beagle board, it will be an interesting exercise to adapt our current labs to run them on such hardware.

We were asked to customize our embedded Linux training session with specific labs on OMAP 3530 hardware. After a successful delivery on the customer site, using Beagle boards, here are our training materials, released as usual under the terms of the Creative Commons Attribution-ShareAlike 3.0 license:

If you are the happy owner of such a board (both attractive and cheap), or are interested in getting one, you can get valuable embedded Linux experience by reading our lecture materials and by taking our practical labs.

Here’s what you would practise with if you decide to take our labs:

  • Build a cross-compiling toolchain with crosstool-NG
  • Compile U-boot and the X-loader and install it on MMC and flash storage.
  • Manipulate Linux kernel sources and apply source patches
  • Configure, compile and boot a Linux kernel for an emulated PC target
  • Configure, cross-compile and boot a Linux kernel on your Beagle Board
  • Build a tiny filesystem from scratch, based on BusyBox, and with a web server interface. Practice with NFS booting.
  • Put your filesystem on MMC storage, replacing NFS. Practice with SquashFS.
  • Put your filesystem on internal NAND flash storage. Practice with JFFS2 too.
  • Manually cross-compile libraries (zlib, libpng, libjpeg, FreeType and DirectFB) and a DirectFB examples, getting familiar with the tricks required to cross-compile components provided by the community.
  • Build the same kind of graphical system automatically with Buildroot.
  • Compile your own application against existing libraries. Debug a target application with strace, ltrace and gdbserver running on the target.
  • Do experiments with the rt-preempt patches. Measure scheduling latency improvements.
  • Implement hotplugging with mdev, BusyBox’s lightweight alternative to udev.

Note that the labs were tested with Rev. C boards, but are also supposed to work fine with Rev. B ones. You may also be able to reuse some of our instructions with other boards with a TI OMAP3 processor.

Of course, if you like the materials, you can also ask your company to order such a training session from us. We will be delighted to come to your place and spend time with you and your colleagues.

LZO kernel compression

As Michael stated in his review of the interesting features in Linux 2.6.30, new compression options have been recently added to the kernel. We therefore decided to have a look at those compression methods, from a compression ratio and decompression speed point of view.

This comparison will be based on “self-extractible kernels”, that is, kernel images containing bootstrap code allowing them to extract a compressed image. As underlined in the previous article, this approach is not used on all architectures. Blackfin, notably, chose a different path and compresses the whole kernel image, without including bootstrap code. While this has the clear advantage of making compression much simpler with respect to kernel code, it forces decompression out to the bootloader code.

Each of those methods has its advantages. Indeed, the Blackfin approach relies on the bootloader to provide the necessary functions, so that may be a problem to do things like bypassing u-boot to reduce the boot time. On the other hand, implementing it only once in the bootloader (as architecture-independent code) makes it unnecessary to write the low-level bootstrap code for each architecture in the kernel, which is surely interesting on virtually all architectures, the notable exceptions being x86 and ARM.

Gzip (also known as Zlib or inflate) has been the traditional (and, as a matter of fact, only) method used to compress kernels. Consequently, we’ll use it as the reference in the following tests. Our test environment is as follows:

ARM9 AT91SAM9263 CPU, 200MHz, using the mainline arch/arm/configs/usb-a9263.config.

This comparison includes figures for LZO, a new kernel image compression method that I have contributed to the Linux sources, and which hopefully will make its way into the mainline kernel. LZO support in the kernel is only new for kernel decompression, as it is already used by JFFS2 and UBIFS. LZO is a stream-oriented algorithm, and although its compression ratio is lower than that of gzip, decompression is lightning-fast, as we will soon find out.

So, here are the figures, average on 20 boots with each compression method:

Uncompressed 3.24Mo 200%
LZO 1.76Mo 0.552s 70% 109%
Gzip 1.62Mo 0.775s 100% 100%
LZMA 1.22Mo 5.011s 646% 75%

Bzip2 has not been tested here: the low-level bootstrap file, head.S, only allocates 64Kb for use by malloc() on ARM. Some quick tests showed that the kernel would not extract with less than 3.5Mib of malloc() space. That would require to modify head.S so that malloc can use more memory, which we will not do here. However, given that enough memory is usable on the system, one could well use bzip2. All the other algorithms performed the extraction using the standard 64Kib malloc space.

From the results, we can clearly see that LZMA is nearly unsuitable for our system, and should be considered only if the space constraints for storing the kernel are so tight that we can’t afford to use more space that was is strictly necessary.

LZO looks like a good candidate when it comes to speeding up the boot process, at the expense of some (almost neglectable) extra space. Gzip is close to LZO when it comes to size, although extraction is not as fast. That means that unless you’re hitting corner cases, like only having enough space for a Gzip compressed image but not for one made with LZO, choosing the latter is probably a safe bet.

Besides, the LZO-compressed kernel size is about 54% the size of the uncompressed kernel. As the kernel load time varies linearly with its size, load time for an uncompressed kernel doubles. While 0.55s are won because there’s no need to run a decompression algorithm, you spend twice as much time loading the kernel. This time is not negligible at all compared to the decompression time. Indeed, loading the uncompressed image takes roughly 0.8s. That means that at the cost of slowing down the boot process by 0.15s (compared to an uncompressed kernel), one gets a kernel image which is roughly twice as small. Rather nice, isn’t it?

Crosstool-NG 1.5.0 released, with new features!

Crosstool-NG, the successor of Crosstool, is a tool that automates the complicated process of building a cross-compilation toolchain. It provides a nice menuconfig interface to fine tune the configuration of the toolchain, before creating the toolchain automatically by retrieving, extracting, patching, configuring, compiling and installing the different components in the right order, with the right arguments and configuration.

Yann E. Morin, the maintainer of Crosstool-NG has just announced the release of Crosstool-NG 1.5.0, with the following new features:

  • Support for gcc 4.4
  • Experimental support for canadian-cross toolchains. Canadian-cross toolchains are toolchains that are built on machine A, to run on machine B and generate code for machine C, while usual cross-compilation toolchain are built on machine A, to run on machine A and generate code for machine B
  • Experimental support for AVR32, with support for operation as MMU-less, thanks to the newlib C library

In addition to these important features, bugs have been fixed and improvements have been made. Yann also switched the development of Crosstool-NG from Subversion to Mercurial, in order to ease community participation in the improvement of Crosstool-NG.

By the way, if you are interested in Crosstool-NG, don’t miss Yann E. Morin’s talk at the Embedded Linux Conference Europe 2009. See the program for details.

CALAO boards supported in mainline U-Boot

I’m happy to announce that a couple days ago, support for the CALAO SBC35-A9G20, TNY-A9260 and TNY-A9G20 boards made its way into the U-Boot git repository. Sadly, it’s not possible to boot from an MMC/SD card with the SBC35 yet, but it’s something I’m currently working on.

Support for all these cards will be available in the next U-Boot release, due in November.

Faster boot: starting Linux directly from AT91bootstrap

Reducing start-up time looks like one of the most discussed topics nowadays, for both embedded and desktop systems. Typically, the boot process consists of three steps: AT91SAM9263 CPU

  • First-stage bootloader
  • Second-stage bootloader
  • Linux kernel

The first-stage bootloader is often a tiny piece of code whose sole purpose is to bring the hardware in a state where it is able to execute more elaborate programs. On our testing board (CALAO TNY-A9260), it’s a piece of code the CPU stores in internal SRAM and its size is limited to 4Kib, which is a very small amount of space indeed. The second-stage bootloader often provides more advanced features, like downloading the kernel from the network, looking at the contents of the memory, and so on. On our board, this second-stage bootloader is the famous U-Boot.

One way of achieving a faster boot is to simply bypass the second-stage bootloader, and directly boot Linux from the first-stage bootloader. This first-stage bootloader here is AT91bootstrap, which is an open-source bootloader developed by Atmel for their AT91 ARM-based SoCs. While this approach is somewhat static, it’s suitable for production use when the needs are simple (like simply loading a kernel from NAND flash and booting it), and allows to effectively reduce the boot time by not loading U-Boot at all. On our testing board, that saves about 2s.

As we have the source, it’s rather easy to modify AT91bootstrap to suit our needs. To make things easier, we’ll boot using an existing U-Boot uImage. The only requirement is that it should be an uncompressed uImage, like the one automatically generated by make uImage when building the kernel (there’s not much point using such compressed uImage files on ARM anyway, as it is possible to build self-extractible compressed kernels on this platform).

Looking at the (shortened) main.c, the code that actually boots the kernel looks like this:

int main(void)
{
/* ================== 1st step: Hardware Initialization ================= */
/* Performs the hardware initialization */
hw_init();

/* Load from Nandflash in RAM */
load_nandflash(IMG_ADDRESS, IMG_SIZE, JUMP_ADDR);

/* Jump to the Image Address */
return JUMP_ADDR;
}

In the original source code, load_nandflash actually loads the second-stage bootloader, and then jumps directly to JUMP_ADDR (this value can be found in U-Boot as TEXT_BASE, in the board-specific file config.mk. This is the base address from which the program will be executed). Now, if we want to load the kernel directly instead of a second-level bootloader, we need to know a handful of values:

  • the kernel image address (we will reuse IMG_ADDRESS here, but one could
    imagine reading the actual image address from a fixed location in NAND)
  • the kernel size
  • the kernel load address
  • the kernel entry point

The last three values can be extracted from the uImage header. We will not hard-code the kernel size as it was previously the case (using IMG_SIZE), as this would lead to set a maximum size for the image and would force us to copy more data than necessary. All those values are stored as 32 bits bigendian in the header. Looking at the struct image_header declaration from image.h in the uboot-mkimage sources, we can see that the header structure is like this:

typedef struct image_header {
uint32_t    ih_magic;    /* Image Header Magic Number    */
uint32_t    ih_hcrc;    /* Image Header CRC Checksum    */
uint32_t    ih_time;    /* Image Creation Timestamp    */
uint32_t    ih_size;    /* Image Data Size        */
uint32_t    ih_load;    /* Data     Load  Address        */
uint32_t    ih_ep;        /* Entry Point Address        */
uint32_t    ih_dcrc;    /* Image Data CRC Checksum    */
uint8_t        ih_os;        /* Operating System        */
uint8_t        ih_arch;    /* CPU architecture        */
uint8_t        ih_type;    /* Image Type            */
uint8_t        ih_comp;    /* Compression Type        */
uint8_t        ih_name[IH_NMLEN];    /* Image Name        */
} image_header_t;

It’s quite easy to determine where the values we’re looking for actually are in the uImage header.

  • ih_size is the fourth member, hence we can find it at offset 12
  • ih_load and ih_ep are right after ih_size, and therefore can be found at offset 16 and 20.

A first call to load_nandflash is necessary to get those values. As the data we need are contained within the first 32 bytes, that’s all we need to load at first. However, some space is required in memory to actually store the data. The first-stage bootloader is running in internal SRAM, so we can pick any location we want in SDRAM. For the sake of simplicity, we’ll choose PHYS_SDRAM_BASEhere, which we define to the base address of the on-board SDRAM in the CPU address space. Then, a second call will be necessary to load the entire kernel image at the right load address.

Then all we need to do is:

#define be32_to_cpu(a) ((a)[0] << 24 | (a)[1] << 16 | (a)[2] << 8 | (a)[3])
#define PHYS_SDRAM_BASE 0x20000000

int main(void)
{
unsigned char *tmp;
unsigned long jump_addr;
unsigned long load_addr;
unsigned long size;

hw_init();

load_nandflash(IMG_ADDRESS, 0x20, PHYS_SDRAM_BASE);

/* Setup tmp so that we can read the kernel size */
tmp = PHYS_SDRAM_BASE + 12;
size = be32_to_cpu(tmp);

/* Now, load address */
tmp += 4;
load_addr = be32_to_cpu(tmp);

/* And finally, entry point */
tmp += 4;
jump_addr = be32_to_cpu(tmp);

/* Load the actual kernel */
load_nandflash(IMG_ADDRESS, size, load_addr - 0x40);

return jump_addr;
}

Note that the second call to load_nandflash could in theory be replaced by:

load_nandflash(IMG_ADDRESS + 0x40, size + 0x40, load_addr);

However, this will not work. What happens is that load_nandflash starts reading at an address aligned on a page boundary, so even when passing IMG_ADDRESS+0x40 as a first argument, reading will start at IMG_ADDRESS, leading to a failure (writes have to aligned on a page boundary, so it is safe to assume that IMG_ADDRESS is actually correctly aligned).

The above piece of code will silently fail if anything goes wrong, and does no checking at all – indeed, the binary size is very limited and we can’t afford to put more code than what is strictly necessary to boot the kernel.

First Buildroot Developer Day after ELCE, on October 17th in Grenoble, France

Buildroot logoThe first Buildroot Developer Day will take place on Saturday, October 17th in Grenoble, France, just the day after Embedded Linux Conference Europe. This Developer Day aims at allowing Buildroot developers to meet and exchange ideas on the project and its future.

As the number of places is limited, interested candidates are invited to send an e-mail to Peter Korsgaard (jacmet at uclibc dot org) and Thomas Petazzoni (thomas dot petazzoni at free-electrons dot com).

Peter Korsgaard and I are organizing this Developer Day thanks to the sponsorship of Calao Systems (offering the location for the meeting) and Bootlin (offering free lunch to the participants).

See also the announcement of the mailing list and on Buildroot website.