How to boot an uncompressed Linux kernel on ARM

This is a quick post to share my experience booting uncompressed Linux kernel images, during the benchmarks of kernel compression options, and no compression at all was one of these options.

It is sometimes useful to boot a kernel image with no compression. Though the kernel image is bigger, and takes more time to copy from storage to RAM, the kernel image no longer has to be decompressed to RAM. This is useful for systems with a very slow CPU, or very little RAM to store both the compressed and uncompressed images during the boot phase. The typical case is booting CPUs emulated by FPGA, during processor development, before the final silicon is out. For example, I saw a Cortex A15 chip boot at 11 MHz during Linaro Connect Q2.11 in Budapest. At this clock frequency, booting a kernel image with no compression saves several minutes of boot time, reducing development and test time. Note that with such hardware emulators, copying the kernel image to RAM is cheap, as it is done by the emulator from a file given by the user, before starting to emulate the system.

Building a kernel image with no compression on ARM is easy, but only once you know where the uncompressed image is and what to do! For people who have never done that before, I’m sharing quick instructions here.

To generate your uncompressed kernel image, all you have to do is run the usual make command. The file that you need is arch/arm/boot/Image.

Depending on the bootloader that you use, this could be sufficient. However, if you use U-boot, you still need to put this image in a uImage container, to let U-boot know about details such as how big the image is, what its entry point is, whether it is compressed or not… The problem is you can’t run make uImage any more to produce this container. That’s because Linux on ARM has no configuration option to keep the kernel uncompressed, and the uImage file would contain a compressed kernel.

Therefore, you have to create the uImage by invoking the mkimage command manually. To do this without having to guess the right mkimage parameters, I recommend to run make V=1 uImage once:

$ make V=1 uImage
...
  Kernel: arch/arm/boot/zImage is ready
  /bin/bash /home/mike/linux/scripts/mkuboot.sh -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/zImage arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 13:54:00 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    3351272 Bytes = 3272.73 kB = 3.20 MB
Load Address: 80008000
Entry Point:  80008000
  Image arch/arm/boot/uImage is ready

Don’t be surprised if the above message says that the kernel is uncompressed (corresponding to -C none). If we told U-boot that the image is already compressed, it would take care of uncompressing it to RAM before starting the kernel image.

Now, you know what mkimage command you need to run. Just invoke this command on the Image file instead of zImage (you can directly replace mkuboot.sh by mkimage):

$ mkimage -A arm -O linux -T kernel -C none -a 0x80008000 -e 0x80008000 -n 'Linux-3.3.0-rc6-00164-g4f262ac' -d arch/arm/boot/Image arch/arm/boot/uImage
Image Name:   Linux-3.3.0-rc6-00164-g4f262ac
Created:      Thu Mar  8 14:02:27 2012
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    6958068 Bytes = 6794.99 kB = 6.64 MB
Load Address: 80008000
Entry Point:  80008000

Now, you can use your uImage file as usual.

Linux on ARM: xz kernel decompression benchmarks

I recently managed to find time to clean up and submit my patches for xz kernel compression support on ARM, which I started working on back in November, during my flight to Linaro Connect. However, it was too late as Russell King, the ARM Linux maintainer, alreadyaccepted a similar patch, about 3 weeks before my submission. The lesson I learned was that checking a git tree is not always sufficient. I should have checked the mailing list archives too.

The good news is that xz kernel compression support should be available in Linux 3.4 in a few months from now. xz is a compression format based on the LZMA2 compression algorithm. It can be considered as the successor of lzma, and achieves even better compression ratios!

Before submitting my patches, I ran a few benchmarks on my own implementation. As the decompressing code is the same, the results should be the same as if I had used the patches that are going upstream.

Benchmark methodology

For both boards I tested, I used the same pre 3.3 Linux kernel from Linus Torvalds’ mainline git tree. I also used the U-boot bootloader in both cases.

I used the very useful grabserial script from Tim Bird. This utility reads messages coming out of the serial line, and adds timestamps to each line it receives. This allow to measure time from the earliest power on stages, and doesn’t slow down the target system by adding instrumentation to it.

Our benchmarks just measure the time for the bootloader to copy the kernel to RAM, and then the time taken by the kernel to uncompress itself.

  • Loading time is measured between “reading uImage” and “OK” (right before “Starting kernel”) in the bootloader messages.
  • Compression time measured between “Uncompressing Linux” and “done”:
    ~/bin/grabserial -v -d /dev/ttyUSB0 -e 15 -t -m "Uncompressing Linux" -i "done," > booting-lzo.log

Benchmarks on OMAP4 Panda

The Panda board has a fast dual Cortex A9 CPU (OMAP 4430) running at 1 GHz. The standard way to boot this board is from an MMC/SD card. Unfortunately, the MMC/SD interface of the board is rather slow.

In this case, we have a fast CPU, but with rather slow storage. Therefore, the time taken to copy the kernel from storage to RAM is expected to have a significant impact on boot time.

This case typically represents todays multimedia and mobile devices such as phones, media players and tablets.

Compression Size Loading time Uncompressing time Total time
gzip 3355768 2.213376 0.501500 2.714876
lzma 2488144 1.647410 1.399552 3.046962
xz 2366192 1.566978 1.299516 2.866494
lzo 3697840 2.471497 0.160596 2.632093
None 6965644 4.626749 0 4.626749

Results on Calao Systems USB-A9263 (AT91)

The USB-A9263 board from Calao Systems has a cheaper and much slower AT91SAM9263 CPU running at 200 MHz.

Here we are booting from NAND flash, which is the fastest way to boot a kernel on this board. Note that we are using the nboot command from U-boot, which guarantees that we just copy the number of bytes specified in the uImage header.

In this case, we have a slow CPU with slow storage. Therefore, we expect both the kernel size and the decompression algorithm to have a major impact on boot time.

This case is a typical example of industrial systems (AT91SAM9263 is still very popular in such applications, as we can see from customer requests), booting from NAND storage operating with a 200 to 400 MHz CPU.

Compression Size Loading time Uncompressing time Total time
gzip 2386936 5.843289 0.935495 6.778784
lzma 1794344 4.465542 6.513644 10.979186
xz 1725360 4.308605 4.816191 9.124796
lzo 2608624 6.351539 0.447336 6.798875
None 4647908 11.080560 0 11.080560

Lessons learned

Here’s what we learned from these benchmarks:

  • lzo is still the best solution for minimum boot time. Remember, lzo kernel compression was merged by Bootlin.
  • xz is always better than lzma, both in terms of image size. Therefore, there’s no reason to stick to lzma compression if you used it.
  • Because of their heavy CPU usage, lzma and xz remain pretty bad in terms of boot time, on most types of storage devices. On systems with a fast CPU, and very slow storage though, xz should be the best solution
  • On systems with a fast CPU, like the Panda board, boot time with xz is actually pretty close to lzo, and therefore can be a very interesting compromise between kernel size and boot time.
  • Using a kernel image without compression is rarely a worthy solution, except in systems with a very slow CPU. This is the case of CPUs emulated on an FPGA (typically during chip development, before silicon is available). In this particular case, copying to memory is directly done by the emulator, and we just need CPU cycles to start the kernel.

Embedded Linux Conference day 3

Finally, the last day of the 2012 edition of the Embedded Linux Conference has arrived. Including the Android Builders Summit, it was a very busy week with five full days of presentations, a very intensive learning session, but also highly motivating and refreshing. Here is, with a little bit of delay, the report of this last day.

Thanks to the kind help of Benjamin Zores (from Alcatel/Lucent, the GeeXboX and OpenBricks projects) who kindly accepted to record the Userland Tools and Techniques For Linux Board Bring-Up and Systems Integration, both Grégory and myself could attend the talk from Greg Ungerer titled M68K: Life in the Old Architecture. Greg started with a very nice and clear explanation of the history of the 68k architecture from a hardware perspective, and detailed its evolution into the Coldfire architecture. The history is quite complicated: the first 68k processors had no MMU, and then MMU was added starting at the 68030 family. However, when Freescale started with Coldfire, which uses a subset of the 68k instruction set, they removed the MMU, until Coldfire V4e, on which an MMU is available. Originally, the Linux port in arch/m68k only supported the classic 68k with MMU, and support for non-MMU Coldfires was added in uClinux. Later, support for non-MMU Coldfires was added into the mainline kernel in arch/m68knommu, with unfortunately a lot of duplication from arch/m68k. The two directories have been merged again some time ago: the merge had already been done in a mechanic fashion (merging identical files, renaming different files that had similar names), and a huge cleanup effort has taken place since then. The cleanup effort is not completely done yet, but it’s getting close, according to Greg Ungerer. At the end of the session, there has been a question on how m68k/coldfire developers typically generate their userspace, and Greg said he uses something similar to Buildroot, which in fact is uClinux-dist. I jumped in, and said that we would definitely like to have Coldfire support, especially since the activity on uClinux-dist isn’t very strong. I also asked what were the remaining differences between the uClinux kernel and the mainline kernel, and according to Greg, there is almost no difference now except maybe support for a few boards. Greg only uses the mainline Linux kernel now for his m68k and Coldfire developments.

The next conference I attended was the talk from Gary Bisson (Adeneo Embedded) titled Useful USB Gadgets on Linux. I rescued the speaker by lending my laptop because his laptop had no VGA output. Fortunately, the speaker was French, so he could adapt quickly to our bizarre azerty keyboard layout. Gary gave quite a bit of context on what USB is, and explained the USB terminology such as interfaces, end-points, configurations, etc. He then quickly described the Linux USB Gadget stack and gadgetfs for the implementation of USB gadget drivers in userspace. He then presented the existing USB gadget drivers in the kernel, mainly the zero gadget driver (for testing purposes), the mass storage gadget driver, the serial gadget driver and the Ethernet gadget driver. At the end of the presentation, he made a demonstration on a BeagleBoard-XM with the gadget multi driver, which allows to expose multiple gadget interfaces at the same time. So he showed that he could expose the Ethernet interface, the Mass Storage interface and the Serial interface, and demonstrated their usage from the host machine. Overall the talk was good, but I was personally expecting a more in-depth look at USB Gadget driver development, and not only usage: I have already been using gadget drivers for some time now, and I was more interested in having details on developing custom gadget drivers rather than simply on using the existing ones.

Bootlin engineering team. From left to right: Grégory Clément, Maxime Ripard and Thomas Petazzoni
Bootlin engineering team (missing: Michael Opdenacker). From left to right: Grégory Clément, Maxime Ripard and Thomas Petazzoni

After a quick break, Grégory and I attended the Getting the First Open Source GSM Stack in Linux talk by Marcin Mielczarczyk from Tieto. It was an absolutely excellent talk. Marcin described the work he and one of his colleague did to reverse engineer a cheap Chinese phone and port U-Boot and Linux on it. Marcin started by giving details about the landscape of those cheap Chinese phones, and it was quite funny: there are brands like Nokla, Sany Eracsson or SciPhone that create phones that are similar in shape and design to phones from the original brands, but with completely different hardware, and usually completely different software. Marcin said that the great thing about those phones is that they are really cheap (which is nice when you need to do some hardware modifications on them for reverse engineering purposes), can easily be bought from auction sites like eBay, and usually do not use any sort of encryption or signature mechanism to prevent the execution of a different operating system or bootloader. The motivation of Marcin in getting Linux to run on such a phone was to ultimately be able to run the complete OsmocomBB software GSM stack inside the phone. OsmocomBB is a free software implementation of a GSM communication stack, lead by Harald Welte. For the moment, the OsmocomBB project uses phones based on the Calypso based-band processor, and only use the phone for the layer 1 (physical layer) of the communication, while the above layers (layer 2 and 3) are implemented in a PC that communicates with the phone over a serial port. Marcin would like to integrate everything inside the phone itself, in order to make the free software GSM stack completely autonomous and fully usable directly on the phone. Marcin decided to pick the SciPhoneDreamG2, a phone that uses the Mediatek 6235 processor, which has the great advantage of being an ARM9 processor, allowing to run a full-blown Linux, and having a datasheet available on the Web. The original operating system of the phone is Nucleus, on top of which the Chinese brand has added an interface that completely mimics Android but is not Android at all. Marcin described the work he did to understand where the UART port and JTAG port was connected (for this work, he mentioned the usage of the JTAG finder project, a software one can run on a micro-controller and that automatically finds which pins are the JTAG pins of a processor). Once he had access to a serial console and the JTAG, he dumped the memory, and started understanding how the boot process was working, and how the existing boot loader was initializing the DRAM. This work was completely done by disassembling the code, which required quite some effort, according to Marcin. Once this was done, he said that porting U-Boot only required creating a basic UART driver and a timer driver, and porting a basic Linux only required a similar UART driver and timer driver, but also an interrupt driver. Marcin and his colleague then went one in developing the other drivers, such as SD, USB, GPIOs and more, and they detailed some of the issues they faced and the time required for these different tasks. In the end, the project is not yet finished, since OsmocomBB does not run on the phone yet, but this is the next goal for Marcin and his colleague. In the end, it was a very interesting goal, detailing in an informative and amusing way an absolutely excellent reverse-engineering effort conducted by Marcin. I would strongly recommend watching the video of this talk.

Pin Control Subsystem Overview Linus Walleij
Pin Control Subsystem Overview Linus Walleij

The last afternoon of ELC started with a talk from Linus Walleij from Linaro, Pin Control Subsystem Overview. Linus Walleij started by describing with lots of details how I/O pins are implemented from a hardware perspective. He first described a basic I/O pin, on which the software can just control the level. On top of this, he explained the hardware logic used to generate interrupts and wake-up events from I/O pins. And finally, he added that those I/O pins are nowadays commonly multiplexed since the SoC do not have enough pins to expose all their possible features, so a given pin can be used either for one function (say, one pin of a I2C bus) or another function (say, one pin of a parallel LCD interface) or as a general purpose I/O. Since this multiplexing is controlled by software, the code for the various ARM sub-architectures in the Linux kernel have each implemented their own little framework and API to solve that problem, and it’s up to each board file to set their I/O multiplexing settings. Unfortunately, since each ARM sub-architecture has its own implementation, there is no coherent API, and there is code duplication. Linus Walleij’s pin mux subsystem intends to solve that. It has already been merged in mainline, in the drivers/pinctrl directory, and a few ARM sub-architectures have started using it, with more to come in the near future, said Linus. Basically, the pinmux subsystem allows to describe which pins are available on the SoC, how they are grouped together in functions, and how drivers can select which function should be activated at an I/O multiplexing level. Of course, the pinmux subsystem detects conflicting usage of I/O, for example if two different drivers want to use the same pin with a different function. Linus also clarified how drivers for I/O pins block should be implemented in the kernel now. If what you have is a simple GPIO expander, then the driver for it should lie in drivers/gpio and it should use the gpio_chip structure. If this simple GPIO expander is also capable of generating interrupts, then the driver should still be in drivers/gpio, but in addition to the gpio_chip structure, it should also register an irq_chip structure. And finally, if instead this I/O pin controller supports multiplexing, then the driver for it should be implemented in drivers/pinctrl, and it should register into the GPIO subsystem (through the gpio_chip structure), into the IRQ subsystem (through the irq_chip structure) and into the pinmux subsystem (through the pinctrl_desc and other related structures). All in all, Linus’s presentation was a great talk, but I wished he would have put more details on the actual API and data structures: his description of the data structures through UML diagrams were a bit hard to follow.

For the last session of the day, I initially planned to attend Pintu Kummar’s talk on Controlling Linux Memory Fragmentation and Higher Order Allocation Failure: Analysis, Observations and Results, but this session was unfortunately canceled. Therefore, I joined my colleague Maxime Ripard and attended Lucas de Marchi talk about Managing Kernel Modules With kmod. Basically, about a year ago, Lennart Poettering, developer of the systemd new userspace init implementation for Linux, listed a set of topics that he wanted to see improved in Linux to make the initialization sequence perform better. Amongst them was the development of a userspace library to manage kernel modules (query information, insert and remove modules). The problem is that until now, the only way to load and remove modules was to call the modprobe, insmod or rmmod programs, which for each module load operation, required a costly sequence of fork/exec. Since udev tries to load up to 200-300 modules at startup (sometimes just to discover that the module is already loaded), this takes a significant amount of time. So Lucas de Marchi, who works at ProFUSION, decided to step up, and did the implementation of kmod. kmod is composed of a C library which implements the core logic of the module information query, module loading and module removal operation, supporting all the fine details that modprobe was supporting (such as dependency handling, module aliases and the configuration files in /etc/modprobe.d/ with options for modules, blacklisted modules). kmod also contains replacement programs for the insmod, lsmod, rmmod and modprobe programs, directly inside a single kmod binary, with symlinks pointing to it for the various commands. kmod is now a full replacement for the old module-init-tools, which has been marked as obsolete by his former maintainer, Jon Masters (who has joined the kmod project). Desktop distributions have started to pick up kmod (Arch Linux, Fedora, and Debian in experimental), as well as embedded Linux build systems. Lucas mentioned that Buildroot had the latest version of kmod, while OpenEmbedded had a slighly older version, and that he didn’t know about other build systems. In the end, this kmod project does not bring a lot of new features or innovations, but is a well-appreciated initiative to make module management better in Linux. What’s very impressive in the time frame in which the project was done: in about a year, the project got started, the development was done, and it is now a full replacement of the old solution, which has been marked deprecated. Great job!

Managing Kernel Modules With kmod, Lucas De Marchi
Managing Kernel Modules With kmod, Lucas De Marchi

Finally, as every ELC, the conference was closed with a game involving all the attendees, and allowing to win nice prizes such as development boards, USB scopes, audio/video portable players (PMPs), and more. The game started with a set of geek questions (such as “Will the Linux kernel in version 3.3 have more or less than 15 millions lines of code ?”, or “Is the distance from the Earth to the Moon smaller or higher than 150.000 miles ?”), and then a rock/paper/scissors game, and finally a raffle. This closing game is always a nice way of ending ELC.

This year’s edition of the Android Builders Summit and the Embedded Linux Conference have been great, with lots of interesting technical talks, and lots of side discussions with various developers. Many thanks to the conference organizers and speakers!

Embedded Linux Conference Europe 2012
Embedded Linux Conference Europe 2012

We hope that those five blog posts reporting some details about those conferences have been interesting to those who didn’t have the chance to attend, and we are definitely looking forward the next edition of the Embedded Linux Conference Europe, which will take place in Barcelona from November 5th to November 7th. Note that the call for papers has already been published. It’s time to think about what you’re doing in the embedded Linux world, and to propose a corresponding talk!

Embedded Linux Conference day 2

Day 2 of the Embedded Linux Conference started with a keynote titled The Internet of Things, given by Mike Anderson. With such a title, one could have feared some kind of very fuzzy-marketing-style kind of keynote, but with Mike Anderson as speaker, it clearly couldn’t be the case. Mike is well-known at ELC and ELCE for all its highly technical presentation on kernel debugging, JTAG, OpenOCD and more. This keynote was not really related to embedded Linux directly, but about all the potential applications that modern technologies such as RFID, nano-robots, wireless communications have. As Mike pointed out, there are lots of potential opportunities to optimize energy usage, make our lives easier, but there are also lots of dangers (surveillance, manipulation of information, reduction of private life, etc.).

The Internet of Things, Mike Anderson
The Internet of Things, Mike Anderson

Right after Mike’s keynote, it was the time for me to give the presentation Buildroot: A Nice, Simple, and Efficient Embedded Linux Build System. As a presenter, I am obviously not objective, but I think the presentation went well. I filled the entire time slot, leaving the time for about five questions at the end. Around 60-70 people were in the room, quite a good number considering the fact that there was a talk from the excellent Steven Rostedt in another room at the same time. I will put the slides of this presentation on line very soon, which was a general presentation of Buildroot, trying to emphasize all the cleanups and quality improvements we have done since the last three years, and also trying to highlight the fact that Buildroot is really easy to understand, it is not a magic black box contrary to some other embedded Linux build systems. That’s the reason why I gave some details about how our package infrastructure works internally, to show that it is really simple. There were several questions about why we do not support binary packages, and of course I replied that it was a design decision in order to remain simple. At the end of the presentation, a guy from Mentor Graphics came to tell me that saying no was an excellent thing and that too many projects fail to say no to new features, and therefore they get more and more complicated.

At the same time as my Buildroot’s talk, Steven Rostedt from RedHat was presenting Automated Testing with ktest.pl (Embedded Edition) and Grégory attended this conference. Grégory reports: “As indicated in the title it is the “embedded” version of a former conference. I don’t know if Steven is really new in the embedded field or if he just pretends to, but the result is that for a newcomer in embedded Linux, this talk is really well detailed. He shows how to setup the board step by step, showing the problems you usually have. But the real topic is the ktest.pl script and how to use it. After two hours of presentation I was totally convinced by the usefulness of this script. It will help a lot to automate the tasks we usually do by hand such as git bisect, check that the stack of patches we have don’t break anything, check that we don’t have any regression at runtime or just at build. All these tasks can be done with ktest.pl and in a very simple way!”

Automated Testing with ktest.pl (Embedded Edition), Steven Rostedt
Automated Testing with ktest.pl (Embedded Edition), Steven Rostedt

Then, I went to Tim Bird’s talk about Embedded-Appropriate Crash Handling in Linux. The initial problem that Tim wanted to solve is how to get and store information about applications that have crashed on devices in the-field. The major issue is that to debug and understand the crash you theoretically need to keep a lot of information, but in practice you cannot do this due to space constraints. Typically, a way of doing post-mortem analysis of a crashed application is to use the core file that the kernel generates after the crash, and use it with gdb. Unfortunately, a core file is typically very large. Tim looked at the crash report mechanism of Android, and discovered that it was directly registering a handler for the SIGSEGV signal (and other related signals indicating an application crash) into the dynamic library loader in Bionic. This signal handler communicates with a daemon called debuggerd over a socket, and this daemon then uses ptrace to get details about the state of the application at the moment of the crash (register values, stack contents, etc.). Tim didn’t want to require modifications at the application level or at the dynamic library loader, so instead he used the core pattern mechanism provided by the Linux kernel: by writing to some file in /proc, you can tell the kernel to start a userspace program when an application crashes, and the kernel dumps the core file contents as the standard input of this new process. Based on debuggerd, Tim implemented such a program that also uses ptrace and /proc to get details about the crashed application. Tim also discussed the various ways of getting a backtrace: using the frame pointer (but this is often not available, as many people use the -fomit-frame-pointer compiler option), using the unwind tables, using a best-guess method (you just go through the stack, and everything that looks like a valid function address is assumed to be part of the call stack, so this method shows some false positive) or using some kind of ARM emulation (but I don’t recall the name of this solution at the moment). All in all, Tim’s talk was great, a good report of its experiment and good technical information about this topic.

Everybody at Bootlin wanted to attend to the “ARM Subarchitecture Status” presentation given by Arnd Bergmann, but we couldn’t since we were responsible for recording videos of all talks. This time, it’s Grégory who had the privilege of attending what looked like the most interesting talk of the slot. In fact as we follow the ARM Linux community in a close way through the mailing lists or the LWN.net website, nothing was really new for Grégory in Arnd’s presentation. Nevertheless it was good to take the time to have a status. The interesting part for Grégory was to see how Arnd works with all the git trees coming from SoC vendors or from community and how he merges them together and merges the conflicts. It is more manual than we imagined and honestly is certainly a very hard job to do.

ARM Subarchitecture status, Arnd Bergmann
ARM Subarchitecture status, Arnd Bergmann

Later in the day, I went to David Anders talk about Board Bringup: LCD and Display Interfaces and it was really a great talk. David explained very well the hardware signals between the LCD controller that you have in your SoC and the LCD panel you’re using, and how those signals affect the timing configuration that you have to set in your kernel code. He clearly explained things like pixel clock, vertical and horizontal sync, but also more complex things like the front porch and the back porch. He then went on to describe LVDS, which in fact is a serial protocol that uses two wires per-color in a differential mode to transmit the picture contents, and also talked about EDID, which is basically an I2C bus that can be used to read from the display device what display modes are available and what their timings are. He also described some of the test methods he used, from a logic analyzer up to a program called fb-test. David’s talk was really great because it provided the kind of hardware details that a low-level software engineer needs to understand, and David explained them in a way that can be understood by a software engineer. Following the talk, I met David and asked some more questions and he was very nice to answer them, in a very clear way. David slides are available at http://elinux.org/Elc-lcd, and you can also check out other things that David is working on at TinCanTools, such as the very nice Flyswatter JTAG debugger for ARM.

At the end of day, Grégory attended the Real-Time discussion session, Maxime attended the Yocto Project discussion session and I attended the Common Clock Framework discussion session. This last discussion session was about work done to consolidate the multiple implementations of the clock APIs that exist in the kernel: at the moment, each ARM sub-architecture re-implements its own clock framework and the goal is to have a common clock framework in drivers/clk/ that can be shared by all ARM sub-architectures but also potentially by other architectures as well. The discussion lead by Mike Turquette from Texas Instruments/Linaro showed that a great deal of work has already been done, but a lot of questions remained opened. Each ARM sub-architecture has different constraints, and finding the right solution that solves the constraints of everybody isn’t easy.

And finally, there was the usual Technical Showcase, with demonstrations of the Pandaboard, but also the newer BeagleBone platform which looks really exciting. David Anders was demonstrating his LCD bring-up setup, another person was demonstrating an open-source GSM access point based on USRP, etc. Lots of interesting things to see, lots of nice people to discuss with.

Embedded Linux Conference day 1

The first day of the Embedded Linux Conference started on Wednesday here at Redwood City, California.

The day started with the usual Kernel Report from Jonathan Corbet. It was, as usual with Corbet’s talk, a very interesting summary of what happened in the kernel through the last year, with highlights of the major new features per release, thoughts about issues like the kernel.org security problem and subsequent outage, etc.

The Kernel Report, Jonathan Corbet
The Kernel Report, Jonathan Corbet

After this talk, Grégory went to the Saving the Power Consumption of the Unused Memory talk, given by Loïc Pallardy, who works for ST Ericson in France. The purpose of the talk was to detail the kernel modifications they made to support the fact of powering down portions of the memory that are unused. In fact, DDR memories these days are capable of powering off some their areas, which allows to save power. Of course, when an area of the memory is powered off, its contents are lost, so the kernel needs to ensure that nothing valuable remained on this area of memory. Their kernel modifications allow to describe how the memory is organised (which address ranges are available and can be powered down independently) and introduce some kernel memory allocator changes to reference count those banks of memory. Of course, the next problem is that physical memory is usually highly fragmented, so they detailed how they re-used some of the existing kernel mechanisms to group unmovable pages on one side and movable pages on the other side and that allow to defragment the movable pages. This topic has been worked on since quite a long time in the kernel, as can be found in this LWN article from 2006.

Saving the Power Consumption of the Unused Memory, Loïc Pallardy
Saving the Power Consumption of the Unused Memory, Loïc Pallardy

On my side, I attended the What Android and Embedded Linux Can Learn From Each Other talk. The speaker detailed many of the Android kernel additions and how they could, theoretically, be re-used in non-Android embedded Linux systems. Things like re-using the Binder inter-process communication mechanism, or simple things like the RAM-based Logger mechanism. Unfortunately, none of the speaker’s suggestions were backed by any sort of real experimentation, so those suggestions were mostly speculations. For example, he suggested the possibility of re-using the Android graphics stack on a non-Android system, but most likely this is a very difficult task to achieve and not necessarily worth the effort. At the end of the talk, the speaker suggested that the embedded Linux community and the Android community should talk more to each other, but looking at how Google is driving Android development, it is difficult to see this happening in the near future.

Then, the talk from Hisao Munakata about Close Encounters of the Upstream Resource was an interesting and good summary of the tensions that exist within embedded companies between the product teams (who have deadlines and need the product to work, and don’t want to worry about upstreaming things) and the community teams (who are in relation with the community and try to upstream modifications). He had really nice slides to show the multiple issues that a company faces when it produces major modifications to open-source components such as the Linux kernel, without any effort to upstream them. But he also said that things are improving, and that with Android using fairly recent kernel versions, the embedded Linux system makers are now much closer to mainline versions, which helps in getting changes merged in the official Linux kernel. He advocated that embedded Linux developers should be proficient with git, because it allows to easily track the modifications, find out whether bugs have been fixed in later versions of the Linux kernel, etc. He also quickly presented LTSI, an initiative that offers long-term support around the Linux kernel. He presented it as the way of solving the fragmentation between the vendor BSPs kernel versions, the Android versions, and all other kernel versions that are floating around. However, how those versions will get merged into the official Linux kernel was not really clear.

In the afternoon, Grégory went to the talk Comparing Power Saving Techniques For Multicore ARM Platforms, presented by Vincent Guittot was an other talk presented by a French guy from ST Ericsson. As the one Grégory saw in the morning about power management of memories, this one was also very instructive, well documented and the speaker seemed to really know his topic. He worked the right way on Linux: only very minimal changes inside the kernel, tried to reuse the existing components, provided a git tree available and proposed some improvements on the mailing lists: good job!

Grégory also attended the traditional talk from Tim Bird entitled Status of Embedded Linux. Very pleasant talk (as usual with Tim Bird). It was a very good overview of the state of embedded Linux. If you want to start working on embedded Linux this talk is a must see. Moreover Tim mentioned the valuable work done by Bootlin by recording and sharing the conferences for many years!

The Status of Embedded Linux, Tim Bird
The Status of Embedded Linux, Tim Bird

Later in the day, I attended the talk Passing Time With SPI Framebuffer Driver given by Matt Porter, who now works for Texas Instruments. His talk was feedback from real-life experience developing a driver for a SPI framebuffer controller. Initially, the problem was that a customer had started developing a driver, but that driver violated all the Linux development rules: no usage of the GPIO APIs, no usage of the SPI infrastructure, no usage of the device model, everything was done through a basic character driver directly manipulating the hardware registers. This is something that we also see quite sometimes at Bootlin in the kernel code of some customers: this happens when the code has been written by developers who have only started reading the Linux Devices Driver book, but didn’t go far enough in the Linux code to understand the device model and the principle of code re-usability. So clearly, Matt’s experience resonated with our own experience. So, Matt went on to describe how the driver worked, modifications needed at the board configuration level, the driver itself, its integration in the device model. He also clearly detailed how a SPI framebuffer can work. On a normal framebuffer integrated into the SoC, the framebuffer memory is directly mapped into the application address space so that the application can directly draw pixels on the screen. However, when the framebuffer controller is over SPI, it is clearly not possible to map the framebuffer memory into the application address space. But fortunately, the kernel has a dedicated mechanism for such case: FB deferred I/O. What gets mapped into the application address space is normal kernel memory, but the kernel detects thanks to page faults when a portion of this memory has been changed, and calls the framebuffer driver so that the driver has an opportunity to push these changes over SPI to the framebuffer controller. Of course, this mechanism run at a configurable frequency. The device that was used by Matt Porter was a 1.8 screen available from Ada Fruit, this might also been a good device to use in our future kernel courses, to let participants exercise with driver development.

At the end of the day, I attended the Experiences With Device Tree Support Development For ARM-Based SOC’s by Thomas P. Abraham, from Samsung Electronics, but also from Linaro. It was clearly an excellent presentation about the device tree and how it works. It showed, with lots of code examples, how to compile the device tree source into a device tree blob, how to configure and use U-Boot to get this device tree blob loaded and passed to the kernel, how the board files in the kernel are changed to use the device tree, how device drivers are modified, how the platform data mechanism is changed with the device tree, and more. Definitely a must-see for anyone doing ARM development these days.

My colleague Maxime went to the talk from Paul McKenney about Making RCU Safe For Battery-Powered Devices. Maxime reported that it was an excellent introduction to RCU: Paul introduced very progressively the various issues, so it was possible even for an RCU-newbie to follow that talk. Definitely a presentation I will watch thanks to the video recording!

In the evening, there was the traditional social event of the conference. It took place at the Hiller Aviation Museum, they have lots of strange aircrafts or helicopters, such as a piece of the supersonic Boeing prototype plane, or other bizarre flying devices such as this flying platform.

First day at the Android Builders Summit

Yesterday was the first day at the Android Builders Summit 2012, here in Redwood City, near San Francisco, California. My colleagues Grégory Clément and Maxime Ripard as well as myself are fortunate to attend this conference, and the contents of the first day were really interesting.

Amongst others:

  • A talk from Karim Yaghmour, well-known for having worked on the original version of the Linux Trace Toolkit, on the Adeos patch, as well as for its activity around Android. He delivered a 30 minutes talk about Leveraging Linux’s history in Android, which covered the differences in architecture between a standard embedded Linux system and Android, as well as how to nicely integrate BusyBox or tools like the Linux Trace Toolkit into Android. The presentation was really impressive: in just 30 minutes, Karim covered a huge number of slides, and made several live demonstrations. It is also worth noting that Karim, following the direction that Bootlin has drawn 7 years ago, has decided to release his Android training materials under a Creative Commons BY-SA license.
  • A panel with multiple kernel developers and people involved in Android on how to integrate the specific Android kernel patches into the mainline kernel. Not many new things learned here: the issue with the Android patches is that they add a lot of new userspace-to-kernel APIs, and such code is much much harder to get in mainline than conventional driver or platform code, since such APIs need to be maintained forever. Interestingly, Zach Pfeffer from Linaro pointed out the fact that the major problems with Android integration these days are not due to the kernel patches, but rather to the horrible binary blobs and related drivers that are needed for 3D acceleration ARM SoCs (Systems On a Chip).
    Panel « Android and the Linux Kernel Mainline: Where Are We? »
    Panel « Android and the Linux Kernel Mainline: Where Are We? »
  • A talk from Marko Gargenta on how to customize Android. He explained how to expose a specific Linux kernel driver functionality to Android applications, through a native C library, the JNI mechanism and an Android service, with complete details in terms of source code and build system integration. This presentation, just like last year’s presentation from Marko, was absolutely excellent. A lot of content, very dynamic presentation, a lot of things learned.
  • A talk on how ADB (Android Debugger) works. The contents were really good as well here, with lots of details about the ADB architecture, some tips and tricks about its usage, and more. Unfortunately, the speaker was really not familiar with English, and most of its presentation was spent reading the slides. This is a bit unfortunate because the technical contents was really, really excellent. The slides are available at http://www.slideshare.net/tetsu.koba/adbandroid-debug-bridge-how-it-works.
  • Using Android in safety-critical medical devices. This talk was not about technical issues, but rather about the reason for using Android in medical devices (get those devices connected together and collect some data to learn more about medical practices, their efficiency and cost) and also the legal requirements to get such devices validated by the Food and Drugs Administration in the US. A lot of useful arguments on how to convince managers that Android and Linux in general are usable in safety-critical medical devices.
  • A talk about Over-The-Air updates in Android, which I didn’t attend, but my colleague Maxime Ripard and other attendees gave an excellent feedback about it. It detailed an advanced system for safely upgrading an Android system, using binary diffs and other techniques.
    Customizing Android by Marko Gargenta, Marakana
    Customizing Android by Marko Gargenta, Marakana
  • The talk about Integrating Projects Using Their Own Build System Into the Android Build System had a really promising title and abstract, but unfortunately, the contents were disappointing. The speaker took 25 minutes just to explain how to build BusyBox (outside of any Android context) and then another 20 minutes to explain how to integrate it in the Android build system, on unreadable slides.
  • The talk about Android Device Porting Walkthrough was really great. Benjamin Zores exhausted its time slot with a 1h15 talk instead of the 50 minutes slot allocated, but fortunately, it was the last talk of the day in this session. During this talk, Benjamin gave a huge amount of information and many details about various issues encountered in the process of adapting Android for an Alcatel business VoIP phone (the ones you see in business desks). Issues like filesystem layout, input subsystem configuration, touchscreen configuration, graphics and much much more were covered. Be sure to check out Benjamin slides at http://www.slideshare.net/gxben/android-device-porting-walkthrough.
  • Finally, the day ended with a lightning talk session moderated by Karim Yaghmour. Lightning talks are really nice, because in less than 5 minutes, you quickly hear about a project or an idea. When the speaker is not good or the topic uninteresting, you know that after 5 minutes, you’ll hear someone else speaking about a different topic. The lightning talk on the integration of GStreamer in Android was really interesting, as was the lightning talk from Karim about its CyborgStack initiative, which creates an upstream Android source to integrate all the Android modifications that will never be mainlined by Google. See Karim slides at http://www.cyborgstack.org/sites/default/files/cyborgstack-120213.pdf for details.

And now, it’s time for breakfast, before the conferences of the second day of this Android Builders Summit.