Update on flash filesystems

Reviewing new possibilities for flash filesystems – My slides at ELCE 2008

With the release of Linux 2.6.27, including the new UBIFS filesystem for MTD storage, embedded Linux system developers now have multiple choices for their flash storage devices. As far as it is concerned, JFFS2 has also been improved and now has support for LZO compression, which makes uncompressing faster. So, how to choose between JFFS2, YAFFS2, and UBIFS?

To help our customers and the community make the right decision, I measured how these filesystems compare in terms of mount time, access time, read and write speed, as well as CPU usage in several corner cases and with different flash chip sizes.

I showed the results during the Embedded Linux Conference Europe event. Besides sharing lessons learned from these experiments, my presentation also introduced each filesystem and its implementation. I also gave advice for flash based block storage (such as Compact Flash and Solid State disks), to reduce the number of writes and avoid damaging flash blocks.

As usual, Bootlin slides are available under the Creative Commons BY-SA license: flash-filesystems.pdf (PDF), flash-filesystems.odp (Open Document Format).

The main finding is that UBIFS outperforms both JFFS2 and YAFFS2 in almost all corner cases. As shown by the benchmarks, it has consistently good mount time, and read/write performance. If your products are using a recent kernel, and are still based on JFFS2, you should definitely try UBIFS and get significant performance benefits, in particular for boot time, as mounting a JFFS2 root filesystem can take several seconds!

The advent of UBIFS also questions the relevance of YAFFS2. YAFFS2 used to be a good alternative to JFFS2, but unlike UBIFS, it doesn’t support compression. Then, why choose YAFFS2, when a apparently superior alternative is available?

The only case in which JFFS2 can still make sense if when you have very small partitions, sizing just a few megabytes. In this case, the overhead from UBI, the erase-block management layer below UBIFS, is no longer negligible. You will be able to pack much less data than with JFFS2. In this case, you can still improve JFFS2’s performance by using some of its new features (more details in the presentation).

SquashFS is also another great alternative, as shown by my benchmarks. It’s true it is a block filesystem, but since it is read-only, and there is no problem to use it on a write-once mtdblock device. You should really consider it for the read-only parts in your system, though it is advisable to use it on top of UBI, to make its blocks participate to wear-leveling and bad block management. Again, you will find more details in my presentation.

The presentation also mentions LogFS, which is also a promising filesystem for flash storage. Unfortunately, LogFS is not available yet for recent kernels. Stay tuned and I will benchmark it as soon this situation changes.

Embedded Linux From Scratch

An update to this presentation was given in 2019

This presentation shows how easy it can be to build an embedded system from the ground up, rather than trimming an existing general purpose GNU/Linux distribution. It is mainly targeted at beginners in embedded systems, but it also gives useful tricks that more experienced people may not know about.

This document was used in our training sessions. It is available under the Creative Commons BY-SA license (see details and other documents).

It is available under several formats:

Back to our technical presentations

Embedded Linux optimizations

This presentation is a collection of ideas and resources for optimizing the Linux kernel and applications for speed, size, RAM, power and cost. Most of them are gathered and supported by the CE Linux Forum projects. Interested embedded system developers are invited to contribute benchmarks, testing, code and more ideas to these projects.

This document is used in our training sessions. It is available under the Creative Commons BY-SA license (see details and other documents).

It is available under several formats:

Thanks to people who helped, sent corrections or suggestions: Tim Bird, Robert P.J. Day

Ogg/Theora video mini howto

How to make your own Ogg/Theora videos

Here is how we created the free conference videos we are sharing with you.

Our goal is to show you that it is very easy and pretty cheap to create Ogg/Theora videos using only Free Software tools. It would be great if more people shared what they experience, in particular when they attend interesting presentations!


Creative commons

Copyright 2006-2008, Bootlin.
This mini-howto is released under the terms of the Creative Commons Attribution-ShareAlike 2.5 license.


A mini-DV camcorder.

Such a device costs approximately 500 US dollars / euros. Mini-dv tapes cost about 5 US dollars / euros.

Note that other devices may be used, such as DVD or harddisk camcorders.

Harddisk camcorders are not a very good solution, because video is stored with a high compression rate (MPEG-4 format). You will not get the best results if you encode from MPEG-4 to Theora, because you will be using low bitrate input compressed with another codec.

A DVD camcorder is fine (MPEG-2 compression), because the input quality would be much better. The best is still DV input, which has very little compression, and allows to get the best of the Theora codec.

Camcorder accessories

A tripod is a must-have. Without one, your image will not be very stable (even with image stabilization), and above all, you will be exhausted after one hour.

An external microphone is nice to have, but not mandatory at all. You still
get pretty good quality audio with the built-in one. So, if you are satisfied
by the audio that you get, you do not have to buy such a microphone. However,
the best solution for top quality audio is to connect you audio input to the
room audio system (if any, and if the speaker is using a microphone).

Computer connectivity

You need a GNU/Linux computer with FireWire input (aka IEEE 1394 or iLink).
If you have a notebook with a PCMCIA adaptor, you best option is to get a
FireWire PCMCIA card which doesn’t need any special driver. This should mean
that it is compliant with the 1394 OHCI standard, which is fully supported by
Linux. Note that recent distributions (at least Fedora Core) automatically
load the right drivers when such a card is plugged in.

It may be possible to use USB connectivity too to get the video from the
camcorder. We just do not know how yet. Any resources are welcome!


The DV files are huge (roughly 15 GB per hour). As intermediate processing
steps are used in our flow, intermediate files of similar size will be
created. Hence, you will need at least 30 GB of free space to process 1 video.
Anyway, it’s much better to have 100 GB or more to store and process several
videos in a row. For notebook owners, external hard drives (typically
high-speed USB 2.0) are your friends.

An external microphone is nice to have, but not mandatory at all. You still get pretty good quality audio with the built-in one. So, if you are satisfied by the audio that you get, you do not have to buy such a microphone. However, the best solution for top quality audio is to connect you audio input to the room audio system (if any, and if the speaker is using a microphone).

Shooting the video

Before or right after filming, make sure that you ask the speaker(s) for permission to publish the video! Make sure you mention the license that you are going to use.

Video capture

Connect the camera to the computer with the FireWire cable.

If you are using a PCMCIA FireWire adaptor, all the modules should have been loaded automatically at module load time.

If you have a legacy FireWire input, you may have to do a few things by hand (logged as root):

modprobe dv1394
chmod a+rw /dev/dv1394/0

You will now use dvgrab
to get the video through the FireWire link and save it to a file. This tool is shipped by most distributions.

dvgrab --size 0 --format raw <output-file-prefix>

Note: --size 0 means that the output file is not split into many smaller ones, when they exceed a given size.

Now that you’re done, let’s assume that you created a video.dv file.

Video trimming

When you read reused tapes, it’s hard to avoid video frames from the previous recordings at the begining or at the end. Before compressing, you first have to trim out the unwanted frames.

This is pretty easy to do with the kino tool, available in all recent distros.

Make sure you export the trimmed video in DV format, to avoid losing quality.

We will soon post kino usage screenshots on this page, to get you started faster with kino.

Quick Ogg/Theora generation

That’s very easy to do. Get the latest version of the ffmpeg2theora package.

ffmpeg2theora -o video.ogv video.dv

You’re done!

You can use the -v and -a parameters to control video and audio quality. The defaults (5 and 2) should be fine for average quality requirements. With -v 7, we already get very good video quality, but the output file size is roughly double. As far as audio quality is concerned, keep in mind the source quality. Unless your audio input is high quality (audio in directly connected to the conference room sound system), there is no need for high bitrate audio compression (-a setting greater than 4).


If the output video quality is poor, it could be because your video needs deinterlacing. In particular, this happens when you record your video in long play mode. Interlaced video is very easy to identify: you just need to find a sequence with motion (camcorder or character motion). Pause the video and interlaced lines will show up.

So, if you source video is interlaced, use the --deinterlace parameter of ffmpeg2theora:

ffmpeg2theora --deinterlace -o video.ogv video.dv

Denoising the video

Look carefully at the generated Ogg/Theora video. Do you see MPEG-like squares moving on surfaces which shouldn’t change at all (walls, sky, board, etc.)?

If this happens, this means that your original video contained noise. This is very frequent with digital camcorders, in particular in low light conditions (when you amplify a weak signal, noise gets more significant). Such noise, though it is not obvious on the source video, can get amplified in the compression process.

Hence, it’s best to remove noise before compressing, so that pixes in still surfaces do not change at all in the source video. Follow the below instructions and compare the output Ogg/Theora video size. You will find that the output file is smaller that what you got by just running ffmpeg2theora on the raw DV video.

Fortunately ffmpeg2theora now supports denoising filters: we contracted Jan Gerber, its developer, to add such filters to his tool. First make sure you have at least version 0.20 (otherwise, download the latest version).

The implementation is based on ffmpeg / mplayer‘s postproc library. Available filter settings are detailed by ffmpeg2theora --pp help, or can be found by looking for tmpnoise in mplayer’s manual page. Filter settings are not easy to choose, however. For your convenience, here are the settings we chose after multiple experiments: --pp de,tn:256:512:1024. At least with our videos, they produce good quality output without significant side effects.

Ogg/Theora video with metatags

It’s possible and useful to add metainformation (title, author, location, license) to the ogv video files.

This can be done thanks to ffmpeg2theora parameters:

ffmpeg2theora -a 3 -v 7 --pp de,tn:256:512:1024 \
--artist "Michael Opdenacker" --title "Fosdem 2006" \
--date "February 2006" --location "ULB, Brussels, Belgium" \
--organization "Bootlin (https://bootlin.com)" \
--copyright "Copyright 2006, Michael Opdenacker" \
--license "Creative Commons Attribution-ShareAlike 2.5" \
-o video.ogv video.dv

If you need to mass encode several videos in a script, it is now possible to add the metatags by hand after encoding. This can be done with the TagTheora tool.

Going further

Run ffmpeg2theora --help for details about more possibilities like live encoding and streaming.


  • To Diego Rondini, for letting us know about TagTheora

RAID + Xen on Ubuntu Edgy

Using Xen on Ubuntu 6.10 (Edgy), using RAID storage.


Creative commons

Copyright 2006, Bootlin.
This mini-howto is released under the terms of the Creative Commons Attribution-ShareAlike 2.5 license.


Thanks to:

  • Sébastien Chaumat, for making me feel like using Xen.
  • Eric-Olivier Lamey, for sending feedback and making useful suggestions,


In this document, we share our experience using Xen on Ubuntu 6.10 (Edgy), using RAID storage.

Ubuntu 6.10 was used because it was the first Ubuntu version with Xen support. In earlier Ubuntu versions (in particular 6.06 LTS), you have to install Xen from sources, and do manual C library tweaks (for TLS support issues). The advantage of packages is that you can easily know about and deploy security updates!

Another reason for using version 6.10 is that it uses the Linux kernel version supported by the latest Xen version (3.0.3 when we installed it). With Ubuntu 6.06, we would have needed to upgrade the kernel version, or to use an earlier Xen version.

Kernel configuration files are provided for the Via C7 based Dedibox servers available in France. Of course, these instructions should be useful for anyone trying to use Xen, whatever the server hardware, and even if RAID storage is not used.

We wanted to share our experience because we spent a significant amount of time looking for correct kernel configuration settings, bootloader settings (in particular for RAID), as well as Xen network and tuning settings. We hope that this document will save some of your time, in particular if you have a Dedibox server!

Note that this HOWTO may not given enough details for unexperienced system administrators, who are unlikely to fiddle with Xen and RAID anyway.

Ubuntu 6.10 installation

For Dedibox users, we chose the below partition settings in the Dedibox installation interface:

  • 1st partition: /boot, 256 MB, RAID1, ext3
  • 2nd partition: /, 4096 MB, RAID1, ext3
  • 3rd partition: /xen, 146225 MB, RAID1, ext3
  • 4th partition: Linux swap, 2048 MB (2 separate partitions on sda and sdb)

Note that when using Xen, the swap space will only be used by Domain0, which is not supposed to run any services, except a ssh server. The 2048 MB maximum size is definitely much more than needed. 512 MB may be more than enough. We kept 2048 MB in case we decide to stop using Xen and run all our services on a single, real server.

By default, the Dedibox or Edgy install only took the first partition into account. However, Linux fully supports several swap partitions (max 2GB per partition). If these partitions are on different disks as in our case, this is even better for performance, as Linux can access those 2 partitions in parallel.

To make the second swap partition work, we added it to /etc/fstab file, ran mkswap /dev/sdb4 and then rebooted to check that everything was correctly set up.

For Dedibox users, note that the server doesn’t seem to boot if you do not choose a separate boot partition.

Adding packages

Uncomment all universe lines in /etc/apt/sources.list.

Install the packages we are going to need in the next sections:

apt-get update
apt-get install xen-hypervisor-3.0-i386 xen-source-2.6.17 xen-tools xen-utils-3.0 libc6-xen
apt-get install build-essential libncurses5-dev ccache

Kernel compiling

Of course, you may choose to use a generic Linux kernel provided by Ubuntu (such as xen-image-xen0-2.6.17-6-server-xen0). Follow the below instructions if you want to tune your kernel according to your exact hardware.

cd /usr/src
tar jxf xen-source-2.6.17.tar.bz2
cd xen-source

To speed up recompiling, you can add ccache support in the kernel makefile. Change the lines defining CC and CROSS_COMPILE:

HOSTCC          = ccache gcc
CC              = ccache $(CROSS_COMPILE)gcc

Dedibox users can use our custom configuration file. We derived it from the Ubuntu Xen kernel and used the settings used in the official Dedibox kernel configurations. You may still check that you have all the features you need, as we removed the features we do not use at the moment (such as NFS, ReiserFS, FAT…).

Compile your kernel:

make install
make modules_install

Bootloader configuration

The Grub bootloader configuration file needed to be updated to be able to load the Xen hypervisor kernel. Here’s what we added to our /boot/grub/menu.lst file before the ## ## End Default Options ## line:

title XEN/2.6.17-bootlin
root (hd0,0)
kernel /xen-3.0-i386.gz
module /vmlinuz- root=/dev/md1 md=1,/dev/sda2,/dev/sdb2 ro quiet splash

You can see that Grub loads files from the first raw partition (not using RAID), while Linux directly boots from the RAID device. In this case, files paths are taken from the /boot partition. You will have to adjust file patches in case /boot is part of the / partition.

Note that there is no clear documentation on the minimum of memory needed for dom0, the privileged Xen domain, from which you are going to control the standard Xen domains. We found that many sites use 128 MB, but other ones seem to be working fine with 64 MB. Just try by yourself if physical RAM is scarse!

Testing dom0

You are now ready to test dom0!

Just reboot and hope that your new kernel boots well. In case you administrate a remote server and this doesn’t work, debugging is tricky (believe us!), because you have no access to the system console. If this happens to you, we advise you to start from a working configuration, like ours or the default Ubuntu kernel, and apply your changes little by little.

Once you access a working shell, you can run top and check that you are no longer running on your regular server. In particular, you will only see the amount of RAM that you attributed to dom0 in the Grub configuration file. You can also run uname -r to check that you are running your new kernel, and xm info to get more information about the Xen hypervizor running on your machine.

Configuring regular Xen domains

To configure networking between dom0 and regular domains, we decided to use regular routing and NAT. Xen uses bridging by default, but we are less familiar with this method.

Set this in the /etc/xen/xend-config.sxp file, by making sure that bridge settings are commented out, and by uncommenting the below 2 lines:

(network-script network-nat)
(vif-script     vif-nat)

We also commented out domain migration settings, as we are not using them (yet).

Creating a new domU domain

Create a 1 GB (for example) sparse file for dom1 system files and format it:

dd if=/dev/zero of=/xen/dom1.img bs=1024k seek=1024 count=0
mkfs.ext3 -F dom1.img

Sparse files are particular files containing holes filled with zeros. No space is used on real storage until the empty blocks are written to. In a few words, they just use the size of their contents.

Create a swap partition image file:

dd if=/dev/zero of=dom1-swap.img bs=128M count=1
mkswap dom1-swap.img

We also created a special 8 MB filesystem for data files for dom1 (not belonging to the operating system):

dd if=/dev/zero of=/xen/dom1-data.img bs=1024k seek=8192 count=0
mkfs.ext3 -F dom1.img

Populate the root filesystem for dom1:

mkdir /mnt/dom1
mount -o loop dom1.img /mnt/dom1
debootstrap edgy /mnt/dom1

Copy the kernel modules too:

rsync -a /lib/modules/ /mnt/dom1/lib/modules/

Copy the /etc/apt/sources.list and /etc/resolv.conf files too.

Configuring domU

Declare the new domain in a /etc/xen/dom1.cfg file:

kernel = "/boot/vmlinuz-"
memory = 96
name = "dom1"
vcpus = 1
disk = [ 'file:/xen/dom1.img,ioemu:hda1,w','file:/xen/dom1-data.img,ioemu:hda2,w','file:/xen/dom1-swap.img,ioemu:hda3,w' ]
root = "/dev/hda1 ro"

vif = [ 'ip=' ]
dhcp = "off"
hostname = "dom1.bootlin.com"
ip = ""
netmask = ""
gateway = ""

Of course, replace dom1 by a meaningful name!

Here, you can see that we assign 10.0.0.x to domx domains.

You can also see that we are using the same kernel as the one we use for dom0. This is not required at all, and we will soon propose a slightly lighter domU kernel without things which are not needed in unprivileged domains (no RAID, no netfilter…).

Booting and configuring domU

Start the new virtual machine and access a console:

xm create /etc/xen/dom1.cfg
xm console dom1

We gave you both instructions, as the second one is useful to access the console of an already running domain. However, you can create a domain and access its console in a single command, using the -c option:

xm create -c /etc/xen/dom1.cfg

You are connected to your new virtual system, with minimum Ubuntu server packages. There are still a few things to adjust though:

Set the root password, otherwise there is no password!


Fill up the /etc/fstab file as follows:

# /etc/fstab: static file system information.
/dev/hda1       /       ext3    defaults,errors=remount-ro   0  1
/dev/hda2       /data   ext3    defaults        0       2
/dev/hda3       none    swap    sw      0       0

Fill up the /etc/network/interfaces file as follows:

# Used by ifup(8) and ifdown(8). See the interfaces(5) manpage or
# /usr/share/doc/ifupdown/examples for more information.

# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static

Edit the /etc/hosts file as follows:       localhost localhost.localdomain       dom1

Fill up the /etc/hostname file as follows:


Exit the dom1 console by typing exit followed by Ctrl ], and reboot dom1:

xm reboot dom1

Back to the dom1 console, check networking with dom0 and with the outside world:

ping bootlin.com

Install extra packages:

apt-get install libc6-xen deborphan psutils wget rsync openssh-client

Remove packages we will not need in the Ubuntu distribution:

apt-get remove --purge wireless-tools wpasupplicant pcmciautils libusb-0.1-4 alsa-base alsa-utils dhcp3-common dmidecode linux-sound-base x11-common eject libconsole aptitude groff-base

Keep your current configuration as a reference starting point for other domU domains you will create:

cp dom1.img domu.img.ref

Now we are going to create NAT (Network Address Translation) rules to forward incoming Internet packets to the right server on your virtual local network. Write your own rules in /etc/network/if-up.d/iptables in dom0 (make sure you make this file executable!). Here’s an example for an http server and a BitTorrent seed server.


### Port Forwarding ###
iptables -A PREROUTING -t nat -p tcp -i eth0 --dport 80 -j DNAT --to
iptables -A PREROUTING -t nat -p tcp -i eth0 --dport 6881:6889 -j DNAT --to

Now, make sure your domain is started when your server is started, by adding a link in /etc/xen/auto/:

cd /etc/xen/auto/
ln -s ../dom1.cfg .

Ubuntu Edgy fixes

tty fixes

If you are using only xm console to connect to dom1, and do not plan to use ssh, you will notice that there are 5 getty processes running all the time waiting for a terminal connection on /dev/tty2 to /dev/tty6, while you just use /dev/tty0.

This does not only waste some CPU cycles and a few MB of RAM, but these getty processes keep failing and get respawned by the upstart init process. This causes a log of writes to the /var/log/daemon.log file, which could eventually fill up your root filesystem.

Here’s a quick fix for this:

rm /etc/event.d/tty2
rm /etc/event.d/tty3
rm /etc/event.d/tty4
rm /etc/event.d/tty5
rm /etc/event.d/tty6

Note that you may need to run this again each time the initscripts package is updated.

Configuring CPU sharing

With Xen, it’s not only possible to set the amount of physical RAM used by each domain. You can also set kind of CPU priorities for more critical domains needing fast response times.

Of course, you could also set the vcpus setting to values greater than one, but this has several drawbacks:

  • You would need an SMP enabled kernel.
  • This method also allows discrete settings, and you would need quite big numbers of vcpus to have a 45% / 65% cpu share, for example.
  • No easy way to add more CPU power to a domain without rebooting it (you may try Linux CPU hotplugging capa bilities, though).

Fortunately, the "xm sched-credit" command lets you change the weight and cap settings of each domain. See the Credit based CPU scheduler page for details about this command.

What’s good is that you can change those settings on the fly, according to actual server loads, without having to restart the corresponding domain. Unfortunately, Xen 3.0.3 doesn’t let you configure initial weight and cap settings at domain creation time through the domain configuration file (such a feature should be available in Xen 3.0.4, as a patch has been committed in the development version). Until such a feature is available, here’s an example implementation:

Create a /etc/init.d/xen-sched-credits file with execute permissions:

# Sets initial weight and cap for xen domains
# In xen 3.0.4, this should be possible
# to set these values in the domain configuration files

# dom1 domain
xm sched-credit -d dom1 -w 64
xm sched-credit -d dom1 -c 25

# dom2 domain
xm sched-credit -d dom2 -w 256
xm sched-credit -d dom2 -c 50

Add this file to init runlevel 2 in dom0:

cd /etc/rc2.d/
ln -s ../init.d/xen-sched-credits S99xen-sched-credits

Useful tips

Making changes in a domain

What’s good with Xen as opposed to a real server is that you don’t need the domain to be running to make changes, even to upgrade packages!

mkdir /mnt/dom1
mount -o loop /xen/dom1.img /mnt/dom1
chroot /mnt/dom1

Further tasks

Congratulations! You are now ready to create more domains, and fine tune Xen according to your exact requirements.

Have fun! We hope that this HOWTO saved some of your time, anticipated some of your questions, and hopefully made you feel like sharing what you learn on your turn too.

Useful links

On-line resources that we used to configure Xen:

LXR websites

Important update: we are proposing a more modern engine to index big projects like the Linux kernel. Check out Elixir.

LXR (Linux Cross Reference) is a great source indexing tool, particularly useful for big projects like the Linux kernel.

Here are Internet sites that make it easy to explore the Linux sources thanks to the LXR engine:

Please let us know if you know about other LXR sites!

You can also browse Linus Torvalds’ GIT tree.


Checks LibreOffice / OpenOffice.org documents for bad Links

cOOol is a simple Python script that looks for broken hyperlinks in LibreOffice / OpenOffice.org documents.

  • cOOol only supports documents in the OpenDocument format.
  • cOOol is fast: it doesn’t start LibreOffice / OpenOffice.org and runs link checks in parallel threads.
  • cOOol supports most kinds of hyperlinks, including links within the documents.
  • cOOol is easy to use. Just download the script and run it!
  • cOOol is free. It is released under the terms of the GNU General Public License.
cOOol logo

Here is why an automatic link checker for your documents is useful:

  • External references can be a very valuable part of your documents. Broken links reduce their usefulness as well as the impression they make. They also give the feeling that your documents are outdated and older than they are.
  • Web sites evolve frequently. Having an automated way of detecting obsolete links is essential to keeping your documents up to date.
  • You may be much more familiar with your target websites than your readers. They may not be able to find a new location by themselves. You’d better be aware of the change and do this for them!
  • When you rename a page (for example), LibreOffice and OpenOffice.org don’t update all the references to it.


Usage: coool [options] [OpenOffice.org document files]

Checks OpenOffice.org documents for broken Links

  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         display progress information
  -d, --debug           display debug information
  -t MAX_THREADS, --max-threads=MAX_THREADS
                        set the maximum number of parallel threads to create
  -r MAX_REQUESTS_PER_HOST, --max-requests-per-host=MAX_REQUESTS_PER_HOST
                        set the maximum number of parallel requests per host
  -x EXCLUDE_HOSTS, --exclude-hosts=EXCLUDE_HOSTS
                        ignore urls which host name belongs to the given list

When a broken link is found, open the document in OpenOffice.org and use the search facility to look for the link text.

Configuration file

Rather than configuring cOOol from the command line, it is possible to
define the same settings in a ~/.cooolrc file.


# Configuration file for cOOol

verbose = True
exclude_hosts = "lxr.bootlin.com www.example.com"
max_threads = 200

You can see that configuration file settings have the same name as long
options, except that dash (-) characters are replaced by
underscores (_).

Usage through a proxy

cOOol can be used through a proxy. The Python classes it uses rely on standard Unix environment variables for proxy definition, as in the below bash example:

export http_proxy="proxy.server.com:8080"
export ftp_proxy="proxy.server.com:8080"


You first need to install the configparse module.


cOOol can be found in our odf-tools git tree.


cOOol screenshot


cOOol parses the xml components of each document file, looking for hyperlinks.

It would have been cleaner and safer to use the OpenOffice.org API to explore the documents. However, there are also benefits in a standalone Python implementation:

  • No need to start OpenOffice.org and load documents in memory. This saves a lot of time and RAM!
  • No need to have an OpenOffice.org install. Nice if you need to implement a validation server using cOOol.
  • Last but not least, no need to understand OpenOffice.org’s API and the internal structure of documents! By the way, that’s what makes exchange formats like XML attractive. However, we would be delighted if somebody could come up with a simpler and safer implementation based on the API, that could be run within OpenOffice.org user interface!

Testing cOOol

We are using 2 documents to make sure that cOOol finds all the kinds of broken links it is supposed to support:

Limitations and possible improvements

  • cOOol doesn’t check for e-mail links. It could at least check that the corresponding domain is valid.
  • cOOol doesn’t give you page numbers for broken links. You have to open the document and use the search facilities to locate each link.
  • cOOol still crashes on some documents with Unicode strings (for example with Chinese text).
  • cOOol has trouble with link text containing quotes, as in what’s new. The text it outputs is truncated.


Compacts directories by replacing duplicate files by symbolic links

clink is a simple Python script that replaces duplicate files in Unix filesystems by symbolic links.

  • clink saves space. It works particularly well with automatically generated directory structures, such as compiling toolchains.
  • clink uses relative links, making it possible to move processed directory structures
  • clink is fast. It reads each file only once and its runtime is mainly the time taken to read files.
  • clink is light. It consumes very little RAM. No problem to run it on huge filesystems!
  • clink is easy to use. Just download the script and run it!
  • clink is free. It is released under the terms of the GNU General Public License.
clink logo


usage: clink [options] [files or directories]

Compacts folders by replacing identical files by symbolic links

  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -d, --dry-run  just reports identical files, doesn't make any change.


clink screenshot


Here is the OpenPGP key used to generate the signatures.

How it works

clink reads all the files one by one, and computes their SHA (20 bytes) and MD5 (16 bytes) checksums. The trick to easily find identical files is a dictionary of files lists indexed by their SHA checksum.

All the files with the same SHA checksum are not immediately considered as identical. Their MD5 checksums and sizes are also compared then. There is an extremely low probability that files meeting all these 3 criteria at once are different. You are much more likely to face file corruption because of a hardware failure on your computer!

Hard links to the same contents are treated as regular files. Keeping one instance and replacing the others by symbolic links is harmless. Files implemented by symbolic links also have the advantage of not having their contents duplicated in tar archives.

Limitations and possible improvements

  • File permissions: clink just keeps one copy of duplicate files. The permissions of this file may be less strict than those of other duplicates. If permissions matter, enforce them by yourself after running clink.
  • Directory structure: even when entire directories are identical, clink just creates links between files. This is not fully optimal in this case, but it keeps clink simple.

Similar tools or alternatives

  • dupmerge2: replaces identical files by hardlinks.
  • finddup: finds identical files.

Demo: tiny qemu arm system with a DirectFB interface

A tiny embedded Linux system running on the qemu arm emulator, with a DirectFB interface, everything in 2.1 MB (including the kernel)!


This demo embedded Linux system has the following features:

  • Very easy to run demo, just 1 file to download and 1 command line to type!
  • Runs on qemu (easy to get for most GNU/Linux distributions), emulating an ARM Versatile PB board.
  • Available through a single file (kernel and root filesystem), sizing only 2.1 MB!
  • DirectFB graphical user interface.
  • Demonstrates the capabilities of qemu, the Linux kernel, BusyBox, DirectFB, and
    shows the benefits of system size and boot time reduction techniques as advertised and supported by the CE Linux Forum.
  • License: GNU GPL for root filesystem scripts. Each software component has its own license.

How to run the demo

  • Make sure the qemu emulator is installed on your GNU/Linux distribution. The demo works with qemu 0.8.2 and beyond, but it may also work with earlier versions.
  • Download the vmlinuz-qemu-arm-2.6.20
  • Run the below command:
    qemu-system-arm -M versatilepb -m 16 -kernel vmlinuz-qemu-arm-2.6.20 -append "clocksource=pit quiet rw"
  • When you reach the console prompt, you can try regular Unix commands but also the graphical demo:

FAQ / Troubleshooting

  • Q: I get Could not initialize SDL - exiting when I try to run qemu.

    That’s a qemu issue (qemu used the SDL library). Check that you can start graphical applications from your terminal (try xeyes or xterm for example). You may also need to check that you have name servers listed in /etc/resolv.conf. Anyway, you will find solutions for this issue on the Internet.


console screenshot df_andi program screenshot
df_dok program screenshot df_dok2 program screenshot
df_neo program screenshot df_input program screenshot

How to rebuild this demo

All the files needed to rebuild this demo are available here:

  • You can rebuild or upgrade the (Vanilla) kernel by using the given kernel configuration file.
  • The configuration file expects to find an initramfs source directory in ../rootfs, which
    you can create by extracting the contents of the rootfs.tar.7z archive.
  • Of course, you can make changes to this root filesystem!

Tools and optimization techniques used in this demo

Software and development tools

  • The demo was built using Scratchbox, a fantastic development tool that makes cross-compiling transparent!
  • The demo includes BusyBox 1.4.1, an toolbox implementing most UNIX commands in a few hundreds of KB. In our case, BusyBox includes the most common commands (like a vi implementation), and only sizes 192 KB!
  • The root filesystem is shipped within the Linux kernel image, using the initramfs technique, which makes the kernel simpler and saves a dramatic amount of RAM (compared to an init ramdisk).
  • The demo is interfaced by DirectFB example programs (version 0.9.25, with DirectFB 1.0.0-rc4), which demonstrate the amazing capabilities of this library, created to meet the needs of embedded systems.

Size optimization techniques

The below optimization techniques were used to reduce the filesystem size from 74 MB to 3.3 MB (before compression in the Linux kernel image):

  • Removing development files: C headers and manual pages copied when installing tools and libraries, .a library files, gdbserver, strace, /usr/lib/libfakeroot, /usr/local/lib/pkgconfig
  • Files not used by the demo programs: libstdc++, and any library or resource file.
  • Stripping and even super stripping (see sstrip) executables and libraries.
  • Reducing the kernel size using CONFIG_EMBEDDED switches, mainly from the
    Linux Tiny project.

Techniques to reduce boot time

We used the below techniques to reduce boot time:

  • Disabled console output (quiet boot option, printk support was disabled anyway), which saves time scrolling the framebuffer console.
  • Use the Preset Loops per Jiffy technique to disable delay loop calculation, by feeding the kernel with a value measured in an earlier boot (lpj setting, which you may update according to the speed of your own workstation).

All these optimization techniques and other ones we haven’t tried yet are described either on the elinux.org Wiki or in our embedded Linux optimizations presentation.

Future work

We plan to implement a generic tool which would apply some of these techniques in an automatic way, to shrink an existing GNU/Linux or embedded Linux root filesystem without any loss in functionality. More in the next weeks or months!

OLS 2008 videos

30 videos from the Linux Symposium in Ottawa

We are pleased to release 29 videos that we took at the Linux Symposium in Ottawa, Canada, in July 2008:

  • Keynote: The Kernel: 10 Years in Review, by Matthew Wilcox (Intel)
    video (57 minutes, 175M)
  • Talk: Tux on the Air: State of Linux Wireless Networking, by John W. Linville (Red Hat)
    paper, video (52 minutes, 168M)
  • Talk: Suspend to RAM in Linux: State of the Union, by Len Brown and Rafael Wysocki (Intel)
    paper, video (52 minutes, 163M)
  • Talk: Real Time vs Real Fast: How To Choose?, by Paul E. McKenney (IBM)
    paper, video (45 minutes, 166M)
  • Tutorial: ftrace: latency tracer, by Steven Rostedt (Red Hat) video (98 minutes, 772M)
  • BOF: Embedded Linux, by Tim R. Bird (Sony)
    video (42 minutes, 200M)
  • BOF: Embedded Microcontroller Linux, by Michael Durrant (Arcturus Networks)
    video (42 minutes, 243M)
  • Talk: Energy-aware task and interrupt management, by Vaidyanathan Srinivasan (IBM)
    paper, video (52 minutes, 182M)
  • Talk: Application Testing Under Realtime Linux, by Luis Claudio R. Gonçalves (Red Hat)
    paper, slides, video (54 minutes, 297M)
  • Talk: Application Framework for Your Mobile Device, by Shreyas Srinivasan (Geodesic Information Systems)
    paper, video (25 minutes, 146M)
  • Keynote: The Making of OpenMoko Neo, by Werner Almesberger (OpenMoko)
    video (94 minutes, 463M)
  • BOF: U-Boot by Wolfgang Denk (Denx)
    video (54 minutes, 362M)
  • BOF: Linux Compiler, by Rob Landley (Impact Linux)
    video (100 minutes, 765M)
  • Tutorial: Practical Guide to Using Git, by James Bottomley (Hansen Partnership)
    video (61 minutes, 357M)
  • Talk: Advanced XIP File System, by Jared Hulbert (Numonyx)
    paper, video (49 minutes, 160M)
  • Talk: SELinux for Consumer Electronic Devices, by Yuichi Nakamura (Hitachi)
    paper, video (31 minutes, 113M)
  • Talk: Around the Linux File System World in 45 Minutes, by Steve French (IBM)
    paper, slides, video (49 minutes, 298M)
  • BOF: Linux The Easy Way with LTIB, by Stuart Hughes (Freescale)
    slides, video (25 minutes, 144M)
  • Keynote: The Joy of Synchronicity: Coordinating the Releases of Upstream and Distributions, by Mark Shuttleworth (Canonical)
    slides, video (76 minutes, 458M)
  • Talk: Smack in Embedded Computing, by Casey Schauffer
    paper, video (59 minutes, 211M)
  • Talk: Bazillions of Pages: The Future of Memory Management, by Christoph H. Lameter (SGI)
    paper, video (49 minutes, 258M)
  • Tutorial: Writing application fault handlers, by Gilad Ben-Yossef (Codefidence)
    video (49 minutes, 275M)
  • Talk: Linux, Open Source and System Bringup Tools, by Tim Hockin (Google)
    paper, video (51 minutes, 229M)
  • Talk: DCCP Reached Mobiles, by Leandro Melo Sales (Federal University of Campina Grande)
    paper, video (42 minutes, 193M)
  • Talk: Building a robust Linux kernel, by Subrata Modak (IBM)
    paper, slides, video (51 minutes, 249M)
  • CELF BOF presentation: Best of recent CELF Conferences, by Tim Bird (Sony)
    slides, video (10 minutes, 88M)
  • CELF BOF presentation: Developing Embedded Linux with Target Control, by Tim Bird (Sony)
    slides, video (17 minutes, 145M)
  • CELF BOF presentation: Embedded Building Tools – An Audience Survey, by Michael Opdenacker (Bootlin)
    slides, video (17 minutes, 127M)
  • CELF BOF presentation: GCC Tips and Tricks Highlights, by Gene Sally
    video (14 minutes, 62M)

See also all the papers, and a report from the CELF BOF.

We could only shoot the presentations we attended. You can see that our main interests are embedded systems and the Linux kernel wink smiley.