The fourth edition of the Embedded Linux Conference, organized every year by the CE Linux Forum in Silicon Valley, took place from April 15th to 17th, 2008. This year, for the first time, the conference was held at the Computer History Museum, which turned out to be a very nice venue for such a conference. The museum also has various exhibits about computer history, such as Visible Storage, an exhibit featuring many samples from the museum collection, ranging from the first computers to the first Google cluster, by way of Cray supercomputers.
The conference's program was very promising: three keynotes from famous speakers (Henry Kingman, Andrew Morton and Tim Bird) and fifty sessions (talks, tutorials or birds-of-a-feather sessions) covering a wide range of subjects of interest to any embedded Linux developer: power management, debugging techniques, system size reduction, flash filesystems, embedded distributions, realtime, graphics and video, security, etc.
This report has been written by Thomas Petazzoni, from Free Electrons. It only covers the talks he could actually attend: there were three simultaneous tracks at the Embedded Linux Conference, and sometimes very interesting talks were happening at the same time, frustrating attendees who would have liked to be in several places at once. For those people, and for everyone who could not attend the conference, Free Electrons also provides video recordings of 19 talks given during ELC. The links to the videos are given below in the report. The following report makes extensive use of the contents of the slides used by the speakers during their talks.
Link to the video (44 minutes, 139 megabytes) and the slides.
The first day of the conference was opened by a keynote from Henry Kingman entitled Tux in Lights. Henry Kingman is famous for being the editor behind the well-known Linux Devices website, and he was in charge of opening this year's Embedded Linux Conference. He started his talk with an introduction about the importance of such meetings: he emphasized the fact that many free software developers work together all year long without having the chance to meet in person. In that respect, conferences such as ELC are important moments to see each other, he said.
Kingman then continued his presentation with slides containing the results of the latest Linux Devices survey concerning the use of embedded Linux, which Jake Edge already reported on in his Linux Weekly News article ELC: Trends in embedded Linux.
Link to the video (50 minutes, 251 megabytes) and the slides.
In this talk, Frank Rowand presented what has been involved in setting up the real-time version of the Linux kernel (linux-rt) on a MIPS platform, using the TX4937 processor. He started by reminding the audience that real-time doesn't mean fast response time, but determinism, and that deadlines could be seconds, milliseconds or microseconds, for example.
Then, he summed up what can affect IRQ latency in the Linux kernel: disabled interrupts, execution of top halves, softirqs, scheduler execution, and finally the context switch. An important aspect of Linux RT is tuning this IRQ latency to make it 1) deterministic and 2) low. So, code disabling interrupts in the kernel should be avoided as much as possible, and Frank's talk focused on finding and fixing issues in such pieces of code.
The roadmap of his adventure was basically:
He then started to talk about the latency tracer, which has recently been submitted for mainline inclusion by Ingo Molnar. Currently only available in the -rt tree, this tracer has recently been improved in several areas in 2.6.24-rt2: cleaned up code, a user/kernel interface based on debugfs instead of /proc, simultaneous tracing of IRQs-off and preempt-off latencies, and simultaneous histogram and trace. He however used the previous version, 2.6.24-rt1, for the experiments reported in his talk.
His first experiments with the tracer led to the discovery of several issues:
- Large maximum latencies were reported in /proc/latency_hist/interrupt_off_latency/CPU0. Using /proc/latency_trace, he discovered the culprit: r4k_wait_irqoff(), a MIPS-specific function called when the CPU is idle. That function was disabling interrupts before going into idle using the wait MIPS instruction. The quick fix was to use the nowait kernel option, to disable the use of CPU-idle-specific instructions. Of course, one must be aware of the consequences of using such an option from a power management perspective. The real fix would be to stop latency tracing in cpu_idle(), as is done on x86. Even with that fix, he still had some large maximum latencies.
- A fix in include/linux/jiffies.h removed the remaining maximum reported latencies, bringing him down to 6.7 milliseconds of maximum latency.
- In kernel/latency_trace.c, he had to check for switches between raw and non-raw clock sources, and at each switch, delete the timestamps in the other mode from the current event.
He then showed some nice and pretty graphs (visible in the video), showing the improvements made by each fix. Once the very ugly latencies were fixed, the next things to do were to fix whatever disables preemption for the longest time and whatever disables interrupts for the longest time. In his talk, he focused on the second part: interrupt-disabled time.
He presented in more detail the main tool used for that debugging: the latency tracer. He described the contents of a latency trace output, which might be kernel-hacker-readable, but is not necessarily human-readable at first sight. He highlighted the fact that the function trace that one can get with the latency tracer is not a list of all functions executed: trace points are only inserted at "interesting" locations in various subsystems. Thus, one has to interpolate what's happening between the locations provided by the trace, he said. He also mentioned the usefulness of the data fields available on each line of the trace: they are not documented in any way and are specific to each trace point, but turn out to be very useful in understanding what's happening. They contain information such as the time for timer-related functions or the PID and priority for scheduling-related functions.
The first problem he found, with latencies of 164 microseconds, occurred when handling the timer interrupt, in hrtimer_interrupt(). Several calls to try_to_wake_up() were made, causing a long time with interrupts disabled (between handle_int(), the low-level interrupt handling function on MIPS that disables interrupts, and schedule(), which re-enables interrupts). In fact, the timer code was waking up the tasks whose timers had expired, which is an O(n) algorithm that depends on the number of timers in the system. He has no fix yet, except the workaround of not using too many timers at the same time.
The second problem he found is the fact that the interrupt top-half handling followed by preempt_schedule_irq() is a long path executing with interrupts disabled. A possible workaround is to remove or rate-limit non-realtime-related interrupts, which in his case were caused by the network card, due to having the root filesystem mounted over NFS. What he tried, as a quick and dirty hack, was to re-enable and immediately disable interrupts again in resume_kernel, the return-from-interrupt function. It is a bad hack as it allows nested interrupts to occur, which could cause the stack to overflow. However, he found that it improved the latencies, and presented results confirming that.
As a moral, he said: do not lose sight of the most important metric, meeting the real-time application deadline, while trying to tune the components that cause latency. He mentioned LatencyTOP as a promising tool, but also recommended using the experts' knowledge, available through the web and mailing lists. He mentioned a few recent topics of discussion on linux-rt-users, to show the type of discussion occurring on this mailing list.
To conclude the talk, he showed and discussed real-time measurement results obtained by Alexander Bauer (and presented at the 9th Real Time Linux Workshop), as well as his own.
In the end, this talk happened to be highly technical, but very interesting for those who wanted to discover how the latency tracer can be used, and the kind of problems one can face when setting up and using such an instrumentation tool.
Link to the video (49 minutes, 146 megabytes).
Matt Mackall, who founded the Linux Tiny project in 2003, is the author of SLOB, a more space-efficient alternative to SLAB, the kernel's memory allocator, and of other significant improvements towards reducing the code size of the Linux kernel. He naturally gave an update on the size of the kernel, and announced a new version of his bloat-tracking tool, Bloatwatch.
To start with, Matt Mackall explained why all that attention is paid to size. He said that it of course matters for the embedded people, because memory and storage are expensive relative to the price of an embedded device, so a smaller kernel means a cheaper device, and more room for applications. But Matt also said that the rest of the world now cares about code size, because even if memory and storage are cheap, the speed ratio between CPU cache and memory keeps increasing, which means that smaller code fits better in the cache, allowing performance improvements. Matt Mackall is certainly correct with that statement, but the issue is that this kind of code size reduction is focused on hot paths, not on the overall code size.
According to Mackall, the reasons for kernel growth are many: new features, improved correctness, robustness, generality and diagnostics. He then gave an absolutely impressive report on the amount of change that occurred during the last year. In April 2007, Linux 2.6.21 was the stable version; it had 21,615 files and 8.24 million lines of code. In April 2008, at the time of the conference, Linux 2.6.25-rc8 was the latest available version (probably very close to the final 2.6.25), and it has 23,811 files and 9.21 million lines of code. 37,033 changesets were committed to the kernel by around 2,400 different contributors, changing 18,165 files (almost all files in the kernel have been touched!), adding 2.24 million lines and removing 1.25 million lines. Matt concluded: « a lot has happened ».
He then mentioned a few notable changes in 2007, concerning the subjects he cares about: SLUB, another alternative to SLAB, is now the default allocator; SLUB and SLOB have seen their efficiency improved; greater attention is paid to cache footprint issues; increased usage of automated testing; pagemap and PSS to monitor userspace (work that has been merged in 2.6.25 and that allows a precise understanding of userspace memory consumption); and the revival of the Linux-Tiny project, now maintained by Michael Opdenacker.
Mackall then entered the core of the subject: kernel code size. With all the architectures, drivers and configuration options, it's difficult to measure the kernel code size increase (or decrease), so Matt proposed a simple metric: measure the size of an allnoconfig configuration for the x86 architecture. The allnoconfig kernel Makefile target allows one to create a minimalistic configuration, with no network, no filesystems, no drivers, only the core kernel features. Matt then showed a graph of the kernel size in that configuration, from 2.6.13 (released two and a half years ago) to now. And he said, « we can see a pretty steady and obvious increase », which one can obviously see on the graph. Most of the growth is due to code increase; the data part of the kernel hasn't grown in recent years.
The graph showed an increase of 28% in the kernel size over the last two and a half years. Over the last year, between 2.6.21 and 2.6.25-rc8, the kernel size of the same allnoconfig has increased from 1.06 megabytes to 1.21 megabytes, a 14% increase. He said that he made some experiments on more realistic kernel configurations, and ignoring variations in configuration options over the kernels, the kernel size increase was pretty much the same, so he thinks the allnoconfig metric is good enough.
He then gave some nice numbers about the size increase: the kernel currently grows at a rate of 400 bytes per day, or 4 bytes per change (one or two instructions). The average function size is around 140 bytes, so he concluded that we would need to take three functions out of the kernel every day to keep the core from growing!
To keep the kernel small, his biggest advice is to review the code before it goes in. Insist on having new functionalities under configuration options, because, as he said: « I don't need process namespaces on my phone ». More generally, he said that the kernel community currently lacks code reviewers. He proposed to continue working on inlining and code duplication elimination: code inlining used to be popular in the kernel community, but it is no longer useful with modern architectures. The biggest issue is that a lot of functions are defined in header files, and are then included in thousands of C files, so that they are instantiated in every object file. Finally, Matt thinks that there is a need to automate the size measurement to find the worst offenders in existing code... which made a perfect transition to the next topic of his talk: Bloatwatch 2.0.
Two years ago, at the same conference, he presented Bloatwatch 1.0. The new version has been rewritten from scratch, with many improvements:
One can get Bloatwatch from its Mercurial repository, using
hg clone http://selenic.com/repo/bloatwatch
or grab the tarball, at http://selenic.com/repo/bloatwatch/archive/tip.tar.gz.
Matt then went on to give a demo of Bloatwatch. On one hand, Bloatwatch is a set of scripts that compile a kernel according to a configuration and fill a database with the results. On the other hand, it is a Web application that allows one to navigate through the results, generate nice and fancy graphs, and compare sizes between kernel versions, for the total kernel, or for any subsystem, object file or even function.
He said that building the whole database for allnoconfig for several years of stable kernels takes a few hours on a normal laptop, and doing the same with defconfig takes about a day. This means that rebuilding the database for a given configuration is something anyone can do pretty easily.
In a few seconds, he demonstrated how to find the specific source of a bloat. He pointed at the sysctl_check.c file, which appeared during the last year, and which weighs 25 kilobytes of code. And thanks to the link to the revision control system of the kernel, he was able to find the description of the original patches in a few seconds, which gave an insight into the purpose of the change. In fact, it turned out that all that stuff does binary checking of sysctl arguments, something you probably don't need on your phone, he said. So it's probably a good candidate for a configuration option.
Bloatwatch appears to be a great tool for measuring the kernel size increase, and for analyzing the source of that increase. Now, some effort should probably be made to communicate this information to the kernel developer community in one way or another.
Link to the video (35 minutes, 71 megabytes) and the slides.
Liam Girdwood works for a company called Wolfson Microelectronics and discussed the creation of a kernel API for voltage and current regulator control. Before going into the kernel framework itself, he started by providing an introduction to regulator-based systems, assuming that not everyone is necessarily familiar with such systems, which indeed was true.
The power consumption of a semiconductor device has two components: a static one and a dynamic one. The static one is smaller than the dynamic one when the device is active, but is the bigger source of power consumption when the device is inactive. The dynamic one corresponds to the activity of the device: signals switching, analog circuits changing state, etc. The power consumption grows linearly with the frequency, and with the square of the voltage. See the Wikipedia page on power optimization for more information. Liam concluded that general introduction by saying that regulators can be used to save both static and dynamic power.
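As a reminder of the physics involved (the standard CMOS approximation, not a formula from Liam's slides): the dynamic component is usually modelled as P_dyn ≈ α · C · V² · f, where α is the switching activity, C the switched capacitance, V the supply voltage and f the clock frequency. Halving the voltage thus divides the dynamic power by four, which is why software control of regulators pays off.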
Then, he went on to present the global picture of a regulator. It is a piece of hardware that takes input power (from a battery, line, USB or another regulator), and that outputs power (to a device or another regulator). This piece of hardware is controlled by software, so that we can control what the output power will be. It is possible to instruct the regulator to generate a 1.8V output when the input source is 5V, or to limit the current to 20mA, for example. The whole purpose of the regulator framework is to provide a generic software framework for controlling this kind of device.
After that, he introduced the abstraction of power domains. A power domain is a set of devices and regulators that get their input power from a regulator, a switch or another power domain, so that power domains can be chained together. Power constraints can also be applied to power domains to protect the hardware.
Then, in order to get into more concrete examples, he started describing the system architecture of one of their Internet Tablets. It has the usual components: CPU, memory, NOR flash, audio codec, touchscreen, LCD controller, USB, WiFi and other peripherals. Then, after showing this block diagram, he presented the same block diagram with all the regulators. Each device can be controlled by one or several power regulators. The whole purpose of the regulator framework is to control all these regulators, and so he went on with a discussion of the framework itself.
The general goal of the regulator framework is to « provide a standard kernel interface to control voltage and current regulators ». It should allow systems to dynamically control regulator power output in order to save power, with the ultimate goal of prolonging battery life, of course. The kernel framework to control all that is divided into four interfaces:
The consumers are the clients of the regulators, i.e. the drivers controlling a device that gets its current from a regulator. The consumers are constrained by the power domain in which they live: they cannot request more than the limits that have been set for their power domain. Two types of consumer are defined: static ones (that just want to enable or disable their power source), and dynamic ones (that want to change the voltage or current limit).
The consumer API is very similar to the clock API, he said. Basically, a device driver starts accessing a regulator using:
regulator = regulator_get(dev, "Vcc");
where dev is the device and "Vcc" a string identifying the particular regulator we would like to control. It returns a reference to a regulator, which should at some point be released, using:
regulator_put(regulator);
Then, the API to enable or disable a regulator is as simple as:
int regulator_enable(regulator);
int regulator_disable(regulator);
int regulator_force_disable(regulator);
regulator_enable() will actually turn the regulator on, and keep track of a reference count, so that the regulator will actually be disabled only after the corresponding number of calls to regulator_disable(). regulator_force_disable(), as its name says, allows disabling a regulator even if the reference count is non-zero. A status API is also available, in the form of the int regulator_is_enabled(regulator) function.
Then, the voltage API looks like:
int regulator_set_voltage(regulator, int min_uV, int max_uV);
After checking the constraints, the referenced regulator will provide power at a voltage inside the boundaries requested by the consumer, between min_uV (minimal voltage in micro-volts) and max_uV. The regulator will actually choose the minimum value that it can provide within the range requested by the consumer. The voltage actually chosen by the regulator can be fetched using int regulator_get_voltage(regulator).
The current limit API is similar:
int regulator_set_current_limit(regulator, int min_uA, int max_uA);
int regulator_get_current_limit(regulator);
Regulators are not 100% efficient, their efficiency varies depending on the load, and they often provide several modes to increase their efficiency. He gave the example of a regulator with two modes: a normal mode, pretty inefficient for low current values, but covering the full range of current values, and an idle mode, more efficient for low current values, but which cannot provide more current than a given limit (smaller than the one in normal mode). So, for example, with a consumer drawing 10 mA, the efficiency would be 70% in normal mode, consuming 13 mA, and 90% in idle mode, consuming 11 mA, thus saving 2 mA. So, there is an API to set the optimum mode for a given current value:
regulator_set_mode();
regulator_get_mode();
regulator_set_optimum_mode();
Regulators can also notify software of events, such as failure or over-temperature:
regulator_register_notifier();
regulator_unregister_notifier();
This is all the API one can use in device drivers to handle the relationship with regulators.
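To make the consumer side more concrete, here is a minimal sketch of what a driver using this API could look like. It is not code from the talk: the "Vcc" supply name and the voltage range are invented, and the header path is an assumption (it may differ in the proposed patch set).
#include <linux/err.h>
#include <linux/regulator/consumer.h> /* header path assumed */
static struct regulator *my_regulator;
static int my_probe(struct device *dev)
{
        int ret;
        /* look up the regulator wired to our "Vcc" supply pin */
        my_regulator = regulator_get(dev, "Vcc");
        if (IS_ERR(my_regulator))
                return PTR_ERR(my_regulator);
        /* accept any voltage between 1.7V and 1.9V (in micro-volts) */
        ret = regulator_set_voltage(my_regulator, 1700000, 1900000);
        if (!ret)
                ret = regulator_enable(my_regulator); /* reference-counted */
        if (ret)
                regulator_put(my_regulator);
        return ret;
}
static void my_remove(struct device *dev)
{
        regulator_disable(my_regulator); /* drops our enable reference */
        regulator_put(my_regulator);
}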
Then, he switched to the topic of writing a regulator driver. The API is very similar to other kernel APIs. Regulators must first be registered with the framework before consumers can use them:
struct regulator_dev *regulator_register(struct regulator_desc *desc, void *data);
void regulator_unregister(struct regulator_dev *rdev);
Events can be propagated to consumers thanks to the notifier call chain mechanism. Every consumer that registered a callback using regulator_register_notifier() will be notified when the following function is called by a regulator driver:
int regulator_notifier_call_chain(struct regulator_dev *rdev, unsigned long event, void *data);
The regulator_desc structure must give some information about the regulator (name, type, IRQ, etc.), but most importantly, must contain a pointer to a regulator_ops structure. It is pretty much a 1:1 mapping of the consumer interface:
struct regulator_ops {
/* get/set regulator voltage */
int (*set_voltage)(struct regulator_cdev *, int uV);
int (*get_voltage)(struct regulator_cdev *);
/* get/set regulator current */
int (*set_current)(struct regulator_cdev *, int uA);
int (*get_current)(struct regulator_cdev *);
/* enable/disable regulator */
int (*enable)(struct regulator_cdev *);
int (*disable)(struct regulator_cdev *);
int (*is_enabled)(struct regulator_cdev *);
/* get/set regulator operating mode (defined in regulator.h) */
int (*set_mode)(struct regulator_cdev *, unsigned int mode);
unsigned int (*get_mode)(struct regulator_cdev *);
/* get most efficient regulator operating mode for load */
unsigned int (*get_optimum_mode)(struct regulator_cdev *, int input_uV,
int output_uV, int load_uA);
};
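A regulator driver then fills a regulator_desc, points it at its operations and registers it. The sketch below is hypothetical: the talk only listed name, type and IRQ as regulator_desc fields, so the exact field names (notably the ops pointer) are assumptions.
/* hypothetical LDO driver implementing a subset of the operations */
static int my_ldo_enable(struct regulator_cdev *rcdev)
{
        /* write the PMIC register bit that turns the LDO on */
        return 0;
}
static int my_ldo_set_voltage(struct regulator_cdev *rcdev, int uV)
{
        /* program the PMIC voltage selection register for uV */
        return 0;
}
static struct regulator_ops my_ldo_ops = {
        .enable      = my_ldo_enable,
        .set_voltage = my_ldo_set_voltage,
};
static struct regulator_desc my_ldo_desc = {
        .name = "LDO1",
        .ops  = &my_ldo_ops, /* field name assumed */
};
static struct regulator_dev *my_ldo_rdev;
static int __init my_pmic_init(void)
{
        my_ldo_rdev = regulator_register(&my_ldo_desc, NULL);
        return IS_ERR(my_ldo_rdev) ? PTR_ERR(my_ldo_rdev) : 0;
}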
After this short description of the regulator driver interface, he described the machine driver interface. It is basically used to glue the regulator drivers to their consumers for a specific machine configuration. It describes the power domains: « regulator 1 supplies consumers x, y and z »; power domain suppliers: « regulator 1 is supplied by default (Line/Battery/USB) » or « regulator 1 is supplied by regulator 2 »; and power domain constraints: « regulator 1 output must be between 1.6V and 1.8V ».
To give a concrete example, he said, let's take a NAND flash whose power is supplied by the LDO1 regulator. To attach the regulator to the "Vcc" supply pin of the NAND, we use the following call:
regulator_set_device_supply("LDO1", dev, "Vcc");
It will associate the regulator named LDO1 (as given in the regulator_desc structure) with the Vcc input of a given device. That device driver is then able to use regulator_get() to get a reference to its regulator and then control it.
Then, the machine driver can specify constraints on power domains, using the regulation_constraints structure that can be associated with a given regulator using regulator_set_platform_constraints().
Finally, the machine driver is also responsible for mapping regulators to regulators, when one regulator is supplied by another regulator. This is done using the regulator_set_supply() function, which takes the names of two regulators as arguments: the supplier regulator and the consumer regulator. Of course, it is up to the machine-specific code to glue everything up properly.
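Putting the machine driver pieces together, hypothetical board setup code could look like the following sketch. The regulator names, the NAND platform device and the regulation_constraints field names are all invented for illustration, and the call signatures simply follow the descriptions above.
#include <linux/platform_device.h>
extern struct platform_device board_nand; /* the board's NAND device */
static struct regulation_constraints ldo1_constraints = {
        .min_uV = 1600000, /* « output must be between 1.6V and 1.8V » */
        .max_uV = 1800000, /* field names assumed */
};
static int __init board_regulator_init(void)
{
        /* LDO1 feeds the "Vcc" supply pin of the NAND device */
        regulator_set_device_supply("LDO1", &board_nand.dev, "Vcc");
        /* LDO1 is itself supplied by DCDC1 (supplier, then consumer) */
        regulator_set_supply("DCDC1", "LDO1");
        /* apply the power domain constraints */
        regulator_set_platform_constraints("LDO1", &ldo1_constraints);
        return 0;
}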
Then, he described the sysfs interface, which exports regulator and consumer information to userspace. It is currently read-only, and Liam doesn't see at the moment a good reason to switch it to read-write. One can access information such as voltage, current limit, state, operating mode and constraints, which could be used to provide more power usage information to PowerTOP, for example.
After this API description, he gave some real-world examples. First, cpufreq, which allows scaling the CPU frequency to meet processing demands. He said that the voltage can also be scaled with the frequency: increased with frequency to increase performance and stability, or decreased with frequency to save power. And that can be done with the regulator_set_voltage() API. In cpuidle, one can imagine changing the operating mode of the regulator that supplies current to the CPU, in order to switch to a more efficient mode.
He then gave the example of LCD backlights, which usually consume a lot of power. When it's possible to reduce the brightness, power reduction is also possible, and that can be achieved using the regulator_set_current_limit() API, particularly for backlights using white LEDs, whose brightness can be changed by changing the current.
In the audio world as well, improvements can be made. Audio hardware consumes analog power even when there is no audio activity: power can be saved by switching off the regulators supplying the audio hardware. One can also think of switching off components that are not used, and he gave the example of the FM tuner when you're listening to MP3s, or the speaker amplifier that can be turned off when headphones are used. He mentioned the relevant API: regulator_enable() and regulator_disable(). The same goes for NAND and NOR flashes, which consume more power during I/O than when they are idle, so it is possible to switch the operating mode of the regulators to take advantage of a more efficient mode for low current values. He pointed out the fact that flashes have power consumption information in their datasheets, which could be used by the flash driver to properly call regulator_set_optimum_mode() to set the best possible mode for the current power consumption.
The status of this work is that the code is working on several machines. It supports several pieces of hardware: Freescale MC13783, Wolfson WM8350 and WM8400. They are working with the -mm kernel, providing patches to Andrew Morton, and they already posted the code on the Linux Kernel Mailing List. They also have a webpage for their project, and the code is available in a git repository.
Link to the video (53 minutes, 263 megabytes) and the slides.
This talk by Klaas van Gend, Senior Solutions Architect at Montavista Europe, was subtitled Common pitfalls, tips and tricks. His talk was a presentation of the real-time version of the Linux kernel, clarifications of various misconceptions about real-time, and advice.
He started by presenting both faces of Klaas: Klaas-the-Geek, who started programming at 13, first encountered Linux in 1993 and has been a software engineer since 1998, and Klaas-the-Sales-Guy, who joined Montavista as an FAE in 2004 and is in charge of the UK, Benelux and Israel territory.
Originally, Linux was designed to be fair, like the other Unixes: the CPU has to be shared properly between all processes, with fair scheduling. However, in the case of real-time systems, you usually don't care about fairness. So a lot has to be done to bring real-time capabilities to the Linux kernel, and that work is happening, and has been happening for a long time, in the -rt version of the Linux kernel, maintained as a separate patch. His slide also mentioned some progress made in the mainline kernel: originally, only userspace code was preemptible, then Robert Love added preemption to the kernel, and Ingo Molnar added voluntary preemption. The O(1) scheduler, which decides which task should run next in constant time, was also mentioned.
He then went on with a definition of real-time: « Ok, we have a deadline and if we don't make the answer within the deadline... Sorry, we don't care anymore ». As an example, he said: « if the airbag doesn't blow in time or is only half-way blown, too bad: you're dead ». In contrast, he said, if after a mouse click the system only reacts after half a second, that's annoying, but it works. And his words were strengthened by a nice slide showing that the acceptability of the response time only slowly decreases for a consumer/user interface, but falls off completely for a classic real-time system.
The main assumption in Real Time Linux is that the highest priority task should go first, « always », he said. This means that everything should be preemptible and that nothing should keep higher priority things from executing. He said that lots of things had to be changed in the kernel to meet this assumption, and one of the first was spinlocks.
The original Linux UP spinlock basically disables interrupts: nothing else can interrupt during the critical section, which is not real-time-friendly at all. And the original SMP spinlock basically busy-waits for another CPU to release the lock, which is not necessarily performance-friendly. In order to move towards real-time, something had to be done with spinlocks: introduce sleeping spinlocks, so that instead of busy-waiting, threads waiting for the lock go to sleep, and no interrupts are disabled. Spinlocks are turned into mutexes.
Another problem is the priority inversion problem, a fairly classical problem in the synchronization and scheduling literature, which can lead to the situation where a high priority process cannot run because it is blocked by a low priority process. Say we have three processes: A, B and C. A has the highest priority, B a medium priority and C a low priority. C holds a lock Q. After some time, task A needs that lock Q, but it is still held by C, so A cannot run. And because B is runnable and its priority is higher than the priority of C, B will run, and run, and run, and the lock will never be released, or only when B is done executing its code. The solution to this problem is known as priority inheritance. In our case, the priority inheritance mechanism would raise the priority of C to the priority of A when A needs the lock held by C, so that C can run instead of B, release the lock, and allow A to get it. Work on priority inheritance was done inside the linux-rt tree, but was finally merged into mainline in 2.6.18.
The next problem discussed by Klaas comes from the named semaphores mechanism. These are semaphores that appear on the filesystem, so that they can be used by several unrelated processes (processes with no parent-child relationship and not living in the same address space). The problem with named semaphores is that when a process holding the semaphore dies, the semaphore is not automatically released, and any other process trying to get the semaphore will be stuck... until a system reboot. The solution to that is called robust mutexes, which are automatically released when a process dies. They were merged in 2.6.17, and covered by Linux Weekly News.
Then, Klaas quickly covered the topic of priority queues. Traditionally, the Linux kernel handled mutex queues in FIFO order: the first waiting process gets the mutex when it is released by another process. However, on a real-time system, you want the mutex to be assigned to the highest priority waiting process. This is not fair for the other processes, but as explained by the speaker at the beginning of his talk, real-time and fairness are not necessarily compatible. The solution to this problem is called priority queues: processes are ordered by priority in the waiting lists. This is implemented by the rtmutex code in the Linux kernel (see kernel/rtmutex.c), which is used by the futex facility available in userspace (see futex(7) for more details), which is in turn used by the glibc to implement mutexes. The rtmutex relies on the plist library in the kernel (see lib/plist.c).
Klaas van Gend then discussed the issues with the standard IRQ handling mechanism in Linux. In the regular kernel, IRQs and tasklets are handled in priority over any task in the system, even the highest-priority ones. This means that the execution of a high priority task can be delayed for an unbounded time because of any IRQ coming from the hardware, even interrupts we don't care about. The solution to this problem, only available in the linux-rt tree as of today, is called threaded interrupts. The idea is to move the interrupt handlers to threads, so that they become entities known to the scheduler. Once known to the scheduler, these entities can be scheduled (i.e. delayed) and can be assigned priorities. To illustrate the need for such a feature, Klaas gave the example of a customer who builds a big printer. On this printer, the high-priority task is to push information to the engine, otherwise the user will get white bands on the paper. And this process should not be disturbed by any other process, such as the one getting new printing jobs. He wanted to highlight the fact that threaded interrupts are actually in use and are useful.
Klaas then concluded: « essentially, those are the basic mechanisms in use to make Linux realtime. Does it help? Yes it does. ». And he switched to the Results section of his presentation. He started with measurements of the interrupt latency, comparing the results of different preemption modes (none, desktop and RT) on an IXP425 platform using a 2.6.18 kernel. With preempt none, the minimum latency is 4 microseconds, the average 6 and the maximum 9797 microseconds. With preempt desktop, the minimum latency is 5, the average 10 and the maximum 2679. With preempt RT, the minimum is 6, the average 7 and the maximum 349 microseconds. With a higher-end processor (Freescale 8349 mITX), the results are better: a maximum latency of 3968 microseconds with preempt none, 1604 with preempt desktop and 53 microseconds with preempt RT. And he said that with an Intel Core 2 Duo, they managed to lower the maximum latency to 30 microseconds. For more results, he suggested requesting the Real Time whitepaper from Montavista.
After this short results section, the speaker switched to the final part of his talk, entitled Common mistakes and myths. The first myth is that people confuse fast with deterministic. He cited quotes such as « I need real time because my system needs to be fast » or « I want to have the best performance Linux can do ». But he said « NO! »: real time does not mean highest throughput, it means more predictable. He even said that efficiency and responsiveness are inversely related. For example, the real-time preemption code adds some overhead (spinlocks are replaced by mutexes, but mutexes are much more heavy-weight than spinlocks; priority inheritance increases task switching and worst case execution time, etc.). He cited benchmarks that measured a 20% decrease in the network throughput of a -RT kernel when compared to a regular kernel.
He then went on with a list of mistakes:
- When switching to -rt, all kernel files need to be recompiled, because of the complex internal changes involved by the switch to -rt. However, the userspace ABI doesn't change, so you don't have to recompile the glibc or the userspace applications. But if you use third-party modules, you'll have to recompile them. Another drawback of third-party binary kernel modules!
- Forgetting that priority inheritance and robustness are not enabled by default on POSIX mutexes: they have to be requested explicitly when initializing the mutex:
pthread_mutex_t mutex;
pthread_mutexattr_t mutex_attr;
pthread_mutexattr_init(&mutex_attr);
pthread_mutexattr_setprotocol(&mutex_attr, PTHREAD_PRIO_INHERIT);
pthread_mutexattr_setrobust_np(&mutex_attr, PTHREAD_MUTEX_ROBUST_NP);
pthread_mutex_init(&mutex, &mutex_attr);
- Calling sched_yield() to give other processes a chance to run: sched_yield() will simply yield the CPU to the highest-priority runnable process: you!
He then gave some advice on how to design the system. One should not give the highest priority, or even realtime priority, to all the processes in the system, otherwise you are no longer real-time. And the realtime tasks should be carefully designed to run in a fairly limited time, so that the rest of the system can still execute. If you have a collection of realtime processes, their execution times must of course match the timing requirements that you have. He suggested having only one or two high priority tasks in the system, otherwise things start to be very complicated to design.
One myth he wanted to fight is the myth that real time is hard. He said that it is not as hard as many tend to say and think it is. The second myth he wanted to fight is the rumor that real-time is only pushed by the embedded community. It is also strongly pushed by the audio community, and can be useful for games as well.
He then went back to a mistake made by a customer. Even after switching to a real-time enabled 2.6 kernel, that customer still missed some bytes received on the serial ports of his Geode x86-like board. It turned out that this was caused by calls to the BIOS used for VGA buffer scrolling and VGA resolution switching. These calls can disable interrupts for an unbounded amount of time, not controlled by the Linux kernel. He also mentioned the problem of printk() on a serial port. printk() on a serial line can block waiting for the transmission buffer of the UART to be empty after transmitting the bytes to the other end. And this can take an unbounded amount of time. So he suggested disabling printk() completely when one has real-time issues.
After this suggestion, he switched to another topic: the relation between RT and SMP concerns when doing driver development. He said that both RT and SMP have similar requirements: in RT, any process can be preempted at any time, which is very similar to multi-processor issues, where the same code can run simultaneously on different cores. All requirements for SMP-safety also apply to RT, and RT and SMP share the same advanced locking, he said. He also mentioned that the deadlock detection code introduced by the -rt people already led to the fixing of many SMP bugs in the kernel.
Then he discussed the problem of swapping in the context of a real-time system. What happens if your real-time task's code or data gets swapped to disk because of memory pressure in your system? The latencies would be horrible. The solution he mentioned to this problem is the use of the mlockall() system call:
mlockall(MCL_CURRENT | MCL_FUTURE);
But he warned that this should only be done for small processes, because all memory pages of the process will be locked into memory: code, data and libraries.
And to complete his talk, he highlighted the fact that the Linux Real Time kernel comes with no warranty. Even though it has been thoroughly tested over the years by the kernel community and companies such as Montavista, the Linux kernel has several million lines of code, and nobody can prove that it will work correctly in all situations. One has to verify that it works well for one's particular use cases.
To conclude, he said that Linux used to be fair, which is not good for real-time. Montavista has worked on RT behaviour since 1999, but true real time only appeared in Linux in 2004, with interrupt latencies below 50 microseconds on certain platforms. However, the real-time patch is still being merged into the mainline kernel, and real-time system design has its challenges... just like programming in COBOL, he said. He ended with a famous quote from Linus Torvalds: « Controlling a laser with Linux is crazy, but everyone in this room is crazy in his own way. So if you want Linux to control an industrial welding laser, I have no problem with your using PREEMPT_RT ». And Klaas made the transition to the questions session with a funny Windows Blue Screen of Death.
During the question session, there were questions about the interaction between memory allocation and real time, questions about predictions on the merge of the remaining -rt features into the mainline kernel (with some insights from Matt Mackall on that topic), and a question about the interaction between real-time and I/O scheduling, another topic on which Matt Mackall gave some interesting insights.
In the end, that presentation wasn't about anything really new, but it gave a well-presented overview of the features needed in the Linux kernel to answer the needs of real-time users, and a good summary of the first pitfalls one could fall into while doing real-time programming.
Link to the video (57 minutes, 401 megabytes) and the slides.
Mark Gross, who works at the Open Source Technology Center of
Intel, gave a talk about power management quality of service
(PM_QOS), a new kernel infrastructure that has been
merged in 2.6.25 (see the commit
and the interface
documentation).
The starting point for Mark Gross' work is the current architecture of power management, where the power management policy implementation is abstracted away from the drivers (which know the hardware best) into a centralized policy manager, creating a dual point of maintenance of device power/performance knowledge: some in the driver, some in the policy manager. In his opinion, it « removes all hope of good abstractions or stable and useful PM API's ».
That's the reason why PM QoS was created. The goal is to provide a coordination mechanism between the hardware providing a power-managed resource and the users with performance needs. It's implemented as a new kernel infrastructure to facilitate the communication of latency and throughput needs among devices, system and users. Automatic power management is then enabled at the driver level, with coordinated device throttling given the QoS expectations on that device.
He then presented areas where PM QoS would be useful in the
kernel. First in the cpu-idle infrastructure, to take
DMA latency requirements into account when switching to deeper
C-states. He also cited issues with the ipw2100 driver or sound
drivers when C-state latencies are large.
PM QoS first implements, in pm_qos_params.c, a list of parameters, which are currently just: cpu_dma_latency, network latency and network throughput. These are exported both to the kernel and to userspace. PM QoS maintains a list of pm_qos requests for each parameter, along with an aggregated performance requirement, and maintains a notification tree for each parameter. Inside the kernel, it provides an API to register for notifications of performance requests and target changes. To userspace, it provides an interface for requesting QoS.
When an element is added or changed inside the list of pm_qos requests for a given parameter, the corresponding aggregate value is recomputed. If it changed, all drivers registered for notification on that parameter are notified.
From the userspace point of view, PM QoS appears as a set of character device files, one for each PM QoS parameter. When an application opens one of these files, a PM QoS request with a default value is registered. The application can later change the value by writing to the device file. Closing the device file will remove the request in the kernel, so that if the application crashes, the cleanup is done automatically by the kernel. Mark then showed a simple Python program using that interface:
#!/usr/bin/python
import struct, time
DEV_NODE = "/dev/network_latency"
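# opening the device file registers a request with a default value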
pmqos_dev = open(DEV_NODE, 'w')
latency = 2000
data = struct.pack('=i', latency)
pmqos_dev.write(data)
pmqos_dev.flush()
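# keep the file open: closing it would remove our request from the kernel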
while(1):
time.sleep(1.0)
Mark Gross then described the in-kernel API. A driver can poll the current value for a parameter using:
int pm_qos_requirement(int qos);
but of course, most drivers will probably be more interested in the parameter notification mechanism. They can subscribe (and unsubscribe) to a notification chain using:
int pm_qos_add_notifier(int qos, struct notifier_block *notifier);
int pm_qos_remove_notifier(...);
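As an illustration, a driver interested in network latency could register a callback along these lines (a sketch: the parameter constant follows the 2.6.25 naming, but the driver logic is made up):
#include <linux/notifier.h>
#include <linux/pm_qos_params.h>
static int my_latency_notify(struct notifier_block *nb,
                             unsigned long new_target, void *data)
{
        /* new_target is the recomputed aggregate requirement */
        my_device_adjust_power(new_target); /* hypothetical helper */
        return NOTIFY_OK;
}
static struct notifier_block my_latency_nb = {
        .notifier_call = my_latency_notify,
};
static int __init my_driver_init(void)
{
        return pm_qos_add_notifier(PM_QOS_NETWORK_LATENCY, &my_latency_nb);
}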
To create new PM QoS parameters, one will have to modify the
pm_qos_init() code in
kernel/pm_qos_params.c.
After describing the consumer side of the API, he described the producer side of the API, which allows instructing other device drivers to respect certain latency or throughput requirements (just like the userspace API presented previously). This API is a set of three functions: pm_qos_add_requirement(int qos, char *name, s32 value) to add a requirement to a parameter list, pm_qos_update_requirement(int qos, char *name, s32 value) to update it, and pm_qos_remove_requirement(int qos, char *name) to remove it.
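Usage of this producer API is straightforward; for example (a sketch with invented names and values):
/* ask for no more than 2000 microseconds of network latency */
static int my_stream_open(void)
{
        return pm_qos_add_requirement(PM_QOS_NETWORK_LATENCY,
                                      "my-streaming-driver", 2000);
}
static void my_stream_close(void)
{
        /* drop the requirement when it no longer applies */
        pm_qos_remove_requirement(PM_QOS_NETWORK_LATENCY,
                                  "my-streaming-driver");
}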
At the end of the presentation, he gave the example of using PM QoS within the iwl4965 wireless adapter driver, a work in progress with one of the iwl4965 developers. The chipset has six high-level power configurations affecting the powering of the antenna, how quickly it puts the radio to sleep and for how long between AP beacons, so it looks like a good application of the PM QoS network latency parameter, he said.
Currently, the power management of this device is device-specific, through sysfs. Thanks to PM QoS, the driver could simply register itself for pm_qos notifications of changes to network latency requirements, and switch to the corresponding power management level when needed. All other network device drivers could do the same, so that sane user-mode policy managers could be written without knowing the exact power management details of each and every network adapter. Mark Gross then described some details of the implementation of PM QoS inside the iwl4965 driver.
Mark sees a lot of possibilities with such a coherent userspace interface. Networked shooter games could set the network latency to zero to disable power management. A Web browser could set it to two seconds, an instant-messaging client to 0.5 seconds, a user-mode policy manager could adjust it when the laptop goes to battery power or switches back to AC power, etc.
In the end, the talk was fairly short, but very interesting and completely on-topic: a developer invents a new API to solve a problem, and tries to make it known, to allow other developers to use this API in their drivers or applications, and to get feedback from the community. This is exactly what happened during the long question and answer session that followed the talk (discussion of the current API, its usage, etc.).
Link to the video (45 minutes, 220 megabytes) and the slides.
In this talk, Matt Porter, who works for Embedded Alley, wanted to explain how one can leverage Free and Open Source Software in the development of a new product. Everyone knows that GNU toolchains exist, that we have the Linux kernel and standard basic root filesystems. But then, what else is there, wondered Matt Porter?
In order to make his talk more concrete, he proposed to discuss a case study, following these steps: define the application requirements, break down the requirements by software component, identify the software components fully or partially available as FOSS, and finally integrate and extend the FOSS components with value-added software to meet the application requirements.
His case study was the development of a Digital Photo Frame (DPF), one of those small devices that display pictures, play music, are wirelessly connected and look nice and shiny on the dining room table. The requirements for such a device are clear and concise, he said, making it a good example for his presentation.
His hardware platform is an ARM SoC (with DSP, PCM audio playback, LCD controller, MMC/SD controller, NAND controller), an 800x600 LCD screen, a couple of navigation buttons, an MMC/SD slot, NAND flash and speakers. The user requirements for the DPF device were:
Based on these requirements, he established a list of software components that are needed:
He quickly covered the obvious components: U-Boot for the firmware, Linux as the kernel, leveraging the SD/MMC, framebuffer, input and ALSA subsystems of the kernel for the I/O drivers, Busybox as the base userspace framework and OpenEmbedded as the build system.
For media event handling, he used udev, which receives events from the kernel when the SD card is inserted or removed, creates device nodes according to a set of rules, and then sends the event to the HAL daemon. HAL, which stands for Hardware Abstraction Layer, is a daemon handling hardware interaction: it knows how to handle the hardware, and can send events over D-Bus to notify other applications, such as the main DPF application. D-Bus, which was used in their product, is an IPC framework implementing a system-wide bus on which applications can communicate with each other. In their case, the communication between HAL and their application takes place over D-Bus: the application subscribes to HAL events for the SD card and is notified when something happens.
The next subject was JPEG picture handling. For JPEG decoding, they used the libjpeg library, and for resizing and rotation, they used jpegtran. Dithering was not supported in libjpeg or jpegtran, and instead of writing their own code, they borrowed some code from the FIM image viewer (FIM stands for Fbi IMproved, a framebuffer-based image viewer).
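For readers unfamiliar with it, the libjpeg decoding loop looks roughly like this (a generic sketch of the standard libjpeg API, not code from their product):
#include <stdio.h>
#include <stdlib.h>
#include <jpeglib.h>
/* decode a JPEG file into a freshly allocated packed RGB buffer */
unsigned char *decode_jpeg(FILE *infile, int *width, int *height)
{
        struct jpeg_decompress_struct cinfo;
        struct jpeg_error_mgr jerr;
        unsigned char *image, *row;
        int stride;
        cinfo.err = jpeg_std_error(&jerr);
        jpeg_create_decompress(&cinfo);
        jpeg_stdio_src(&cinfo, infile);
        jpeg_read_header(&cinfo, TRUE);
        jpeg_start_decompress(&cinfo);
        *width = cinfo.output_width;
        *height = cinfo.output_height;
        stride = cinfo.output_width * cinfo.output_components;
        image = malloc(stride * cinfo.output_height);
        while (cinfo.output_scanline < cinfo.output_height) {
                /* read one scanline at a time into the right row */
                row = image + cinfo.output_scanline * stride;
                jpeg_read_scanlines(&cinfo, &row, 1);
        }
        jpeg_finish_decompress(&cinfo);
        jpeg_destroy_decompress(&cinfo);
        return image;
}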
To support MP3 playback, they used libmad, which runs on ARM and supports MP3 audio decoding. They also used libid3 to handle the ID3 tags and be able to display them on the screen, and libm3u to handle media playlists.
Then, he covered a more specific and technical subject: using DSP acceleration. Using the DSP available in the hardware to accelerate JPEG and MP3 processing looks like an interesting option. First, one needs a DSP bridge, and he mentioned openomap.org as a good starting point for that topic. He also mentioned using libelf to process ELF DSP binaries, which allows for pre-runtime patching of symbols and cross calls from the DSP to the ARM. He said that general purpose libraries such as libjpeg, jpegtran, FIM and libmad can be ported to run portions of their code on a DSP.
For the 3D graphics interface, they decided to use Vincent, an OpenGL ES 1.1 compliant implementation. Nokia ported the code to Linux/X11, and it has been easily modified to run on top of the Linux framebuffer. It can also be extended in various ways: to support a hardware accelerated cursor, floating/fixed point conversions, GPU acceleration, etc.
He said that a complete GUI can be implemented in low-level OpenGL ES. Font rendering can be done using the freetype library, and it enables a 3D desktop look for the interface. It also makes 3D photo transitions possible: photos are loaded as textures, and transitions are then managed as polygon animation and camera view management. He also mentioned the fact that higher-level libraries such as Clutter can be used on top of OpenGL ES to provide higher-level interface building tools.
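To give an idea of how the photos-as-textures approach works, here is a generic OpenGL ES 1.1 sketch (not code from the talk; note that ES 1.1 implementations typically require power-of-two texture dimensions):
#include <GLES/gl.h>
/* upload a decoded RGB photo as a texture; the slideshow then draws
 * textured quads and animates their geometry and the camera view to
 * produce transitions */
GLuint photo_to_texture(const void *rgb, int width, int height)
{
        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0,
                     GL_RGB, GL_UNSIGNED_BYTE, rgb);
        return tex;
}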
Finally, he described the main DPF application, which integrates all the FOSS components: it manages media events, uses the JPEG library to decode and render photos, handles Linux input events and drives the OpenGL ES based GUI, manages the user-selected configuration, and displays the photo slideshow using the selected transitions.
To conclude, he said that « good research is the key to maximizing FOSS use ». He however warned that many components will require extensions and/or optimization, but that smart use of FOSS where possible will save time and money and speed the product to market.
At the end of the first day, some companies and projects were invited to demonstrate some of their work in the hall next to the main conference room. Your editor found some of these demonstrations particularly interesting.
A person from Fujitsu was demonstrating Google Android on real hardware. They ported Android from the QEMU environment provided in Google's SDK to real environments: the Freescale i.MX31 PDK, a development board, and the Sophia Systems Sandgate3-P, a device which looks like a mix of a phone and a remote control.
Engineers from Lineo Solutions were demonstrating their work on memory management, and the handling of out-of-memory situations. They explored in-kernel memory mechanisms and userspace notification mechanisms through a signal. The latter sounded particularly interesting, as it allows notifying applications of memory pressure inside the kernel. An application could then free some memory used for temporary caches, for example, in order to help the system recover from the bad situation.
Richard Woodruff, from Texas Instruments, was demonstrating the power management improvements they made to the Linux kernel in order to decrease the power consumption of their OMAP3 platform. They have been able to get very impressive results.
One Hitachi engineer was demonstrating the use of SELinux in Android. SELinux was used to create two operating modes in Android: a private mode and a business mode. In private mode, only personal applications and data are available. In business mode, only business applications and data are available. And the isolation between these two worlds is enforced by SELinux.
Another Hitachi engineer was demonstrating the use of SystemTap in an embedded system. SystemTap was not designed with cross-compilation and host/target separation in mind. So they improved SystemTap to make it more easily usable in embedded situations: the kernel module generated by SystemTap can be cross-compiled, then loaded onto a remote target, and the results can be gathered on the host. These improvements will soon be published.
York Sun, from Freescale Semiconductor, was demonstrating a new CPU with interesting framebuffer capabilities. The framebuffer controller is able to overlay several layers in hardware, which is very useful for things such as navigation systems. York Sun gave more details about Linux support for such a framebuffer controller in his talk entitled Adding framebuffer support for Freescale SoCs.
Link to the video (55 minutes, 240 megabytes) and the slides.
The second day started with a keynote given by a famous special guest: Andrew Morton. After an introduction by conference organizer Tim Bird, Andrew started his talk, entitled The relationship between kernel.org development and the use of Linux for embedded applications.
His talk was already the subject of several reports, at least one on LWN, by Jake Edge.
Andrew Morton's talk was not technical at all; it rather discussed how embedded companies could participate more in mainline kernel development, what their interest in doing so would be, and how this can be mutually beneficial to both the companies and the kernel community.
Link to the video (32 minutes, 140 megabytes). Thanks to Jean Pihet, of Montavista, for recording the talk.
Making a full and complete report of your editor's talk wouldn't be very interesting, so let's let other people do that. Just to sum up, the talk discussed the following topics:
Link to the video (30 minutes, 145 megabytes) and the slides.
David Mandala gave a not very technical talk about UME, Ubuntu Mobile and Embedded. He first described the type of devices targeted by UME: the devices are called MIDs, for Mobile Internet Devices. He described them as « consumer centric devices », « task oriented devices », offering a simple and rich experience with an intuitive UI and an "invisible" Linux OS.
He then described Ubuntu Mobile & Embedded as a completely new product based on Ubuntu core technology. It incorporates open source components from maemo.org, adds new mobile applications developed by Intel and adapts existing open source applications for mobile devices. The challenges for UME are mainly that applications can't fit on small screens and that applications are designed for keyboard and mouse, not fingers and touch screens. The big focus of UME is on these two problems, not on other embedded-related issues such as system size, boot time, memory consumption, or ports to other architectures. This is a point that was raised by Rob Landley at the end of the talk, and it seems that at the moment, these topics are not on the radar of the UME project.
David Mandala listed the differences between UME and the standard Ubuntu desktop: GNOME Mobile (Hildon) is used instead of the standard GNOME desktop, applications are optimized to fit 4.5" to 7" touch LCDs, optimizations are made for power consumption (with a reference to the LPIA acronym, which seems to stand for Low Power on Intel Architecture), and there are built-in drivers for WiFi, WiMax, 3G and Bluetooth. The size of the system will be around 500 megabytes; it targets devices with more than 2 gigabytes of flash. Not something we can call resource-constrained.
The global architecture of Ubuntu Mobile is similar to a normal Linux desktop : the kernel with its drivers, X11 with Cairo, Pango, OpenGL, a networking layer, basic frameworks like Gtk, HAL, D-Bus and GStreamer, and then applications for PIM, e-mail, web browsing, instant messaging, etc. David Mandala also mentioned the problem of proprietary applications with redistribution restrictions, such as a Flash player and video codecs.
Mandala mentioned the Moblin website, « a place for specific Intel software for MIDs ». Its projects focus on things such as an image creator, a power policy manager and a web browser. Ubuntu Mobile integrates applications and solutions from the standard Ubuntu, from Moblin and from GNOME Mobile.
The Canonical representative then discussed the community they are building around Ubuntu Mobile. It works pretty much like the standard Ubuntu community : a transparent community process, a MID/Mobile track at the Ubuntu Developer Summit, a code of conduct, open and transparent community councils and boards, use of launchpad.net, etc. Canonical will dedicate a three-person team to the mobile community, and they will engage with upstream communities to work with them on improving mobile solutions.
It is also worth noting that LWN published a short report of this talk.
Link to the video (28 minutes, 159 megabytes) and the slides.
In this talk, Justin Treon, from Numonyx, explained how he hacked into a phone running Linux and how he modified it to use « phase change memory ». He first explained how he managed to get serial and JTAG working, then how he reduced the amount of SDRAM from 48 megabytes to 32 megabytes (thanks to the use of phase change memory, less RAM is actually needed to run the same system and set of applications).
Phase change memory, or PCM for short, is a type of non-volatile memory that combines the advantages of the existing types of memory without having their drawbacks. PCM allows execution in place, like NOR flash; it is fast to write, like NAND flash; and it doesn't require erasing and can be modified on a bit-by-bit basis, like RAM. Using PCM greatly simplifies the software stack (no need for a flash translation layer, for erasing, block management or garbage collection) and improves system performance, he said. PCM is backward compatible with flash : it supports traditional erase and write commands. But it also offers new commands like « Bit-Alterable Write One Word » and « Bit-Alterable Buffer Write », with which block erasing is no longer needed.
Justin Treon then explained how he hacked the flash code of the Linux kernel to support PCM. His modifications are very hacky at the moment (a direct hack of mtdblock), but he wants to improve them in the future.
Link to the video (47 minutes, 261 megabytes) and the slides.
Grant Likely, who works for Secret Lab Technologies Ltd., is an experienced kernel developer on the PowerPC port : he works on the device tree and gave a talk entitled A symphony of flavours: using the device tree to describe embedded hardware, which your editor unfortunately couldn't attend due to the three simultaneous tracks of ELC. In this talk, however, Grant wanted to share his experience with running Linux on FPGAs.
At the beginning of his talk, he gave some context on running Linux on FPGAs. He first presented the typical architecture of a System-on-Chip (SoC) : in a single chip, one has a CPU, an interrupt controller, a memory controller and several peripheral controllers, such as an Ethernet MAC, UARTs, GPIOs and other external buses. On his diagram, all that stuff fit inside a big grey box representing the chip, and was connected to external boxes (DDR2 RAM, Ethernet PHY, serial transceiver, etc.).
Then, he showed an "FPGA system". The big grey box is now completely empty : one has to implement everything inside it, using specific languages such as VHDL or Verilog. This is nice, because it's very flexible : the full chip can be programmed in a completely custom way.
However, people using FPGAs soon discovered that they were often implementing the same blocks, and that they could benefit from having these blocks directly in hardware : it makes them faster and reduces the consumption of programmable gates. Grant presented the architecture of an FPGA with a higher cool factor, the Virtex 4FX FPGA. It's an FPGA in which one or two PowerPC processor blocks, two or four Ethernet MACs, and between 0 and 24 RocketIO serial transceivers are implemented in hardware (see this product table). These are fixed in hardware and cannot be changed, but they are available in the same chip as the normal FPGA fabric, which can be programmed for custom applications. The rest of Grant's talk focused on running Linux on the PowerPC, not on the FPGA itself, because, as he said, he is not an FPGA engineer.
Grant then presented the status of Virtex FPGA Linux support. There is basic support in mainline, for the serial ports, the ML300/403 framebuffer and the SystemACE device. The extra drivers are available in the Xilinx public git tree : Ethernet devices, DMA, I2C, GPIO and Microblaze support. At the time of the talk, this git tree was up to date with 2.6.24-rc8, but some rewrite work was needed before a mainline merge.
The first lesson learned that the speaker wanted to share with the audience was summarized by Don't make developers' lives hard. As his slides say, hardware engineers don't like to compile kernels, and software engineers don't like to synthesize bitstreams. He explained that when doing development on an FPGA, the peripheral addresses and configurations can be changed at any time by the hardware people. At the beginning, the synthesis process generated a file with address definitions, which was then included by the Linux kernel to compile the drivers properly. This meant that any time a hardware engineer wanted to run Linux on a modified FPGA, he had to recompile the kernel (which he doesn't like to do). This is where the device tree comes into play. It is basically a file that one gives to the kernel at boot time (and not at compile time), and that lists the configuration of the various peripherals of the system. This file can for example be generated by the synthesis process of the hardware engineers, so that they don't have to mess with kernel compilation anymore, and software engineers don't have to mess with bitstream synthesis anymore.
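As an illustration, a device tree node describing a serial port could look like the following fragment (a hypothetical example from your editor, with made-up addresses and interrupt numbers; the compatible string follows the Xilinx naming convention) :

uart0: serial@84000000 {
        compatible = "xlnx,xps-uartlite-1.00.a";
        reg = <0x84000000 0x10000>;
        interrupts = <3 2>;
};

The kernel reads such nodes at boot time and instantiates the corresponding drivers, instead of relying on addresses hardcoded at compile time.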
The second lesson learned was to get the drivers into mainline, and Grant referred to Andrew Morton's talk from the morning, which made exactly the same point. Grant Likely said « You're not doing anything that novel anyway. No: you're really not », trying to fight the usual "intellectual property protection" argument.
The third lesson learned is that with FPGAs, « hardware is the new software », so one should follow software best practices : revision control, automated builds and peer review. Grant even suggested letting the software people have a look at the hardware design, and vice-versa.
The fourth lesson is that when you have a problem, it « really might be a hardware bug ». Grant advises talking to your hardware engineer immediately when you have problems, because he is able to probe any signal inside the FPGA design.
The fifth lesson is to not spend all the budget on "boring" stuff, such as getting PCI, USB, Ethernet or serial working. Grant Likely cited Matt Mackall, who said : « if your vendor isn't pushing stuff to mainline, go beat them up ». You should probably be spending your time on interesting stuff, such as developing the custom application logic you want to put inside the FPGA. So Grant suggests choosing the platform carefully at the design stage, by making sure that Linux support is correct and that viable device drivers are available. He mentioned his experience with the Cypress c67x00 USB driver that he had to develop. It took three months for something that was not directly interesting for the project, but the piece of hardware was there, in their design, and he had no other choice than to develop a driver for it.
The sixth lesson learned is a classic of software engineering : make things work first before you try to optimize them (to make them faster, smaller, or more clever).
The seventh lesson is to prepare for dynamic hardware in the kernel. When working with FPGAs, one should expect things to change, at a much faster rate than with SoCs.
The next lesson was entitled « User space sucks ». Grant thinks that cross-compiling kernels is easy, but that cross-compiling userspace is hard. So he suggests getting the userspace problem solved early.
After that, the question and answer session started. The first question raised concerns about the increase in boot time caused by the use of the device tree. Grant said that using the device tree is mandatory, but that one could still limit its use to only a few things. He was, however, a bit skeptical that the device tree is actually responsible for increasing the boot time. He suggested making some measurements, because he has always found the kernel decompression step to be the biggest time consumer in the kernel boot process.
During the discussion, he used the term « SystemACE », and one person in the audience asked what SystemACE was, so Grant explained what it actually is. The SystemACE is a separate chip, next to the FPGA. On one side, it has a CompactFlash interface. On the other side, it has an 8-bit bus connection and a JTAG connection to the FPGA. When the board is powered up, the SystemACE reads data files from the CompactFlash card and pushes that data stream to the FPGA to configure it. After boot, the SystemACE can also be used as an interface for the FPGA to read more information from the CompactFlash card. The SystemACE mechanism is also documented on the Xilinx website. It is not the only solution to configure the FPGA at boot time; CPLDs are also commonly used. The data file used by the SystemACE is actually a list of JTAG commands, so one can use it not only to push the bitstream to the FPGA, but also to load the kernel into memory, for example (although this is slow because of the limited JTAG speed, Grant said). Following a question from the audience, Grant suggested having a normal flash chip connected to the FPGA : the SystemACE is used to load a simple loader, which then runs on the FPGA and loads the kernel from flash.
The next question was how to handle, from the Linux perspective, the flow of data into and out of the FPGA, considering that this flow of data is usually very high-speed, but that the CPU doesn't need to touch it. Grant then offered an interesting view of a hardware architecture that allows transferring large amounts of data from a high-speed source to the DDR, using the MPMC, the Multi Port Memory Controller, which is available on Xilinx chips. Grant then offered very technical and precise recommendations on how to handle that from the Linux perspective. Your editor recommends that anyone interested in the details look at the video of the talk.
Link to the video (113 minutes, 694 megabytes) and the slides.
During this two-hour tutorial, Mike Anderson first described the development of a simple character device driver, and then moved on to the debugging of the Linux kernel and Linux kernel modules using a JTAG probe. He described what JTAG is, what kind of hardware and software you need, how you can use them with gdb, and how you configure them. This tutorial is a very good introduction to the use of JTAG devices for those who have never or only rarely used that kind of debugging technology.
However, as this talk was a tutorial with lots of live demonstrations, it's probably not worth making a full report of it. Your editor rather suggests that the reader directly watch the video. Mike Anderson speaks very clearly, with a loud voice, making his tutorial very easy to understand, even for non-native English speakers.
Day 2 ended with the usual social event for such conferences. It started with a nice barbecue in the garden next to the Computer History Museum building. The conference attendees were able to prolong their discussions around tables, with lots of meat, wine and beer. After the barbecue, the attendees were invited to the Mountain View Laser Quest, on the other side of the street, to have some fun fighting with laser guns. The Laser Quest employees were a bit puzzled by the nicknames chosen by the participants : fbflush, sbin init, dev zero, kill -9 or rm -rf /. Such social events are always a nice addition to a conference, in that they allow attendees to make more contacts with each other.
Link to the video (44 minutes, 139 megabytes).
Your editor thought that sharing the video of such a talk with the community would be very interesting, and Kevin Hilman, a colleague of Deepak Saxena at Montavista, kindly agreed to record the talk. Thanks !
During that talk, your editor attended Adding framebuffer support for Freescale SoCs, detailed below.
This talk followed the demonstration, made by York Sun, of a new Freescale CPU with impressive framebuffer capabilities, the MPC8610. It is a high-performance chip with interesting controllers, but the one discussed during this talk was the LCD controller. It is able to do real-time blending of up to three planes, and handles transparency between the planes. Inside each plane, several non-overlapping windows can also be created to render different applications, videos or pictures. The chip is able to display at 1280x1024, 60 Hz, with a color depth of up to 24 bits.
York Sun described the implementation of the Linux framebuffer driver for such a chip. He decided to export several framebuffer devices to userspace, one for each plane : /dev/fb0, /dev/fb1 and /dev/fb2. The first is the main plane, while the last two are the secondary planes. An application can render to any of these planes, and the hardware magically does the blending in real time. However, there are still some differences between the primary plane and the secondary planes. System-wide configuration can only be made on the primary plane, for example.
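To give an idea of what this looks like from userspace, here is a minimal sketch (your editor's illustration, not code from the talk, and assuming a 32 bits per pixel mode) that draws into the secondary plane using the standard Linux framebuffer API, letting the hardware blend the result with the main plane :

/* Fill a square on the secondary plane /dev/fb1; the LCD controller
 * blends it with the main plane in hardware. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/fb.h>

int main(void)
{
    int fd = open("/dev/fb1", O_RDWR);   /* first secondary plane */
    if (fd < 0) { perror("open /dev/fb1"); return 1; }

    struct fb_var_screeninfo var;
    struct fb_fix_screeninfo fix;
    if (ioctl(fd, FBIOGET_VSCREENINFO, &var) < 0 ||
        ioctl(fd, FBIOGET_FSCREENINFO, &fix) < 0) {
        perror("ioctl"); return 1;
    }

    uint8_t *fb = mmap(NULL, fix.smem_len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) { perror("mmap"); return 1; }

    /* Draw a 100x100 red square in the top-left corner (assumes a
     * 32-bit pixel format). */
    for (unsigned y = 0; y < 100 && y < var.yres; y++)
        for (unsigned x = 0; x < 100 && x < var.xres; x++)
            *(uint32_t *)(fb + y * fix.line_length + x * 4) = 0x00ff0000;

    munmap(fb, fix.smem_len);
    close(fd);
    return 0;
}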
This talk was interesting because it was an illustration of new hardware capabilities that create new challenges for Linux device drivers. The existing frameworks constantly have to be redesigned and refactored to take new hardware capabilities into account.
Link to the video (54 minutes, 160 megabytes) and the slides.
In this talk, Jong-Sun Kim, whose very soft voice made the talk difficult to understand, reported on his work on MIPS back-tracing, which happens to be a complex topic.
Of course, everyone knows that backtracing is very useful for debugging. However, backtracing facilities such as gcc's __builtin_return_address() or glibc's backtrace(3) and backtrace_symbols(3) are not available on MIPS. Jong-Sun then described a typical real-world MIPS stack frame in order to explain why back-tracing on MIPS is a difficult thing : there is no easy way to get the address of the caller's stack frame.
So, the only solution, detailed by Jong-Sun, is binary code scanning. The speaker presented typical function prologue and epilogue code, and the backtracing procedure that can be used on MIPS (both in English and in C). The procedure scans the prologue of the function to extract, from the instructions themselves, the stack frame size, and deduces the location of the caller's stack frame from that.
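To illustrate the idea, here is a simplified sketch of such a scanner (your editor's illustration, not the speaker's code) : it walks backward from the current position, looking for the two instructions of a typical prologue, the addiu sp,sp,-N stack adjustment and the sw ra,off(sp) store of the return address.

/* Simplified sketch of MIPS prologue scanning.
 * Typical prologue instructions:
 *   addiu sp, sp, -framesize   -> encoded as 0x27bdXXXX
 *   sw    ra, offset(sp)       -> encoded as 0xafbfXXXX
 * Leaf, assembly-coded or highly optimized functions break these
 * assumptions, as the speaker noted. */
#include <stdint.h>

int scan_frame(const uint32_t *pc, const uint32_t *code_start,
               int32_t *frame_size, int32_t *ra_offset)
{
    int found_size = 0, found_ra = 0;

    for (const uint32_t *insn = pc; insn >= code_start; insn--) {
        if (!found_size && (*insn & 0xffff0000) == 0x27bd0000) {
            *frame_size = -(int16_t)(*insn & 0xffff);  /* addiu sp,sp,-N */
            found_size = 1;
        } else if (!found_ra && (*insn & 0xffff0000) == 0xafbf0000) {
            *ra_offset = (int16_t)(*insn & 0xffff);    /* sw ra,off(sp) */
            found_ra = 1;
        }
        if (found_size && found_ra)
            return 0;  /* caller address at *(sp + ra_offset),
                          caller frame at sp + frame_size */
    }
    return -1;  /* prologue not recognized */
}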
Then, Jong-Sun presented the challenges posed by back-tracing inside a signal handler. The execution context of a signal handler is a bit special, and the back-tracing procedure has to be adapted to handle this case properly.
However, these procedures are not perfect : leaf functions (which do not save registers) or assembly-coded or highly-optimized functions can have non-typical prologue code that defeats the proposed back-tracing procedures. The speaker then showed some demonstrations of his back-tracing procedures, and said that he is currently working on releasing these functions as an open-source library or as part of the MIPS port of the C library.
After this presentation, the question and answer session started, in which the speaker didn't really participate, probably due to language difficulties. It turned out that four different people in the audience had already made four different implementations of back-tracing procedures for MIPS. Some of them claimed to have better implementations than the one proposed by Jong-Sun (for some obscure details that your editor couldn't catch), but none of these implementations is currently released or available as part of the C library.
After the long discussion on what to do about back-tracing on MIPS, the speaker went back to his presentation with an interesting appendix on a crash report system used in LG Electronics products. The goal of this system is to guarantee that information about system crashes cannot be lost. Their system includes a watchdog which, on expiration, triggers an in-kernel procedure that stores the contents of the circular log buffer used for the console, along with debug information, into NVRAM for later debugging. Jong-Sun then explained the implementation details of their solution and showed an example of its use.
Link to the video (57 minutes, 160 megabytes).
Takanari started the talk by giving a short presentation of DirectFB. It is lightweight and has a small footprint (less than 700 kilobytes on SH4, the architecture of interest for the presenter). It doesn't have any client/server model like X11 has. DirectFB offers a hardware abstraction layer for graphics acceleration : anything not supported by the hardware will still be supported in software. DirectFB has multi-process support and other features, such as a built-in window manager and more.
The first embedded chip supported by the mainline version of DirectFB was the Renesas SH7722.
The speaker then detailed the architecture used to support this device in DirectFB. At the lowest level, one finds the hardware : the video memory and the video hardware accelerator. On top of that, they run an unmodified Linux kernel, to which they add a kernel module that handles the video hardware accelerator. It receives commands from a userspace part, integrated as a module in DirectFB, called the gfxdriver, which is specific to a given video hardware. They also use the already existing devmem system module of DirectFB, which allows accessing the video memory through the /dev/mem character device. So all they had to implement was a kernel module and a userspace gfxdriver; all the rest was existing code.
Takanari then presented important terms of the DirectFB terminology. From his slides : DirectFB is composed of several modules that can be extended in an object-oriented way : system modules, graphics drivers, graphics devices, screens and layers. A system module should implement the functions of the CoreSystemFuncs structure declared in core_system.h and should be declared with DFB_CORE_SYSTEM(). A graphics driver should implement the GraphicsDriverFuncs structure declared in graphic_driver.h and be declared using DFB_GRAPHICS_DRIVER(). A graphics device object should implement the methods of the GraphicsDeviceFuncs structure (in gfxcard.h) and be registered inside the driver_init_driver() method of GraphicsDriverFuncs. Screen objects should implement the ScreenFuncs methods (in screens.h) and be registered with dfb_screens_register(). Finally, layer objects should implement the DisplayLayerFuncs methods (in layers.h) and be registered using dfb_layers_register(). At this point of the talk, things were a bit confused in your editor's mind, who is only a beginner in DirectFB internal concepts. Fortunately, during the rest of the talk, Takanari explained the role of each module and how to actually implement them.
He first discussed the system module. A system module provides access to the hardware resources (framebuffer and hardware management). As of DirectFB 1.1.0, several system modules are available : fbdev (the default, which uses the kernel framebuffer interface through the /dev/fb* devices), osx, sdl, vnc, x11 and devmem. The system module to use can be specified in the directfbrc file, using system=devmem for example.
For embedded systems, the devmem system module, merged in DirectFB 1.0.1, is particularly interesting. It uses /dev/mem to access the graphics hardware and the framebuffer. According to the speaker, it's a convenient way for those using memory-mapped I/O and uniform memory between the CPU and the graphics accelerator, and it seems that most embedded devices fall into this category. When using devmem, one must specify additional parameters : video-phys, the physical address of the beginning of the video memory; video-length, its length; mmio-phys, the physical address of the beginning of the MMIO area used to control the graphics hardware; mmio-length, its length; and accelerator, an ID used by the DirectFB core to select the graphics driver. These values have to be specified in the directfbrc configuration file.
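For example, a directfbrc file for a devmem-based system could contain something like the following (the addresses and the accelerator ID are made up; the actual values depend on the hardware) :

system=devmem
video-phys=0x8c800000
video-length=0x00800000
mmio-phys=0xfe940000
mmio-length=0x00100000
accelerator=50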
Takanari described how DirectFB matches systems and gfxdrivers. The DirectFB core calls the driver_probe() method implemented in each gfxdriver to ask each driver whether it supports a particular piece of detected hardware. If the hardware is supported by the gfxdriver, the driver_probe() method should return a non-zero value. When using devmem, there is a special case : the value passed to driver_probe() is the value given by the accelerator parameter, so one has to make sure that they match.
Then, Takanari went to the core of the talk : writing the graphics driver. It consists of several components : the graphics driver module, the graphics device module, the screen module (optional for fbdev, but mandatory for devmem) and the layer module (also optional for fbdev, but mandatory for devmem). To get your graphics accelerator working, this is the code you must write, he said. One can use devmem, so that there is no need to write a kernel framebuffer driver usable through fbdev.
First, the graphics driver. At the beginning of the file, give a name to your graphics driver, using DFB_GRAPHICS_DRIVER(yourname). After that, DirectFB expects to find six functions. First, driver_probe() (detailed earlier), then driver_get_info() to get meta-information about the driver, and driver_init_driver() and driver_close_driver() to initialize and close the driver. driver_init_driver() is responsible for acquiring all the hardware resources (setting up the mappings, etc.) and then for registering screens and layers. It is also responsible for setting pointers to acceleration functions in a GraphicsDeviceFuncs structure. Finally, one must define driver_init_device() and driver_close_device() to initialize and close the device. In driver_init_device(), you must for example set the device capabilities, so that the DirectFB core knows what the device can do in hardware, and what it will have to do in software.
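To summarize, the skeleton of a gfxdriver looks roughly like this (a structural sketch from your editor, with a hypothetical name and deliberately elided prototypes; the exact definitions are in DirectFB's graphic_driver.h and gfxcard.h) :

/* Structural sketch of a gfxdriver source file ("mychip" is a
 * hypothetical name). */

DFB_GRAPHICS_DRIVER( mychip )

/* Return non-zero if this driver handles the detected hardware; with
 * devmem, the value tested comes from the accelerator option. */
static int driver_probe( /* device */ );

/* Fill in meta-information about the driver (name, vendor, version). */
static void driver_get_info( /* device, info */ );

/* Acquire the hardware resources (MMIO mappings, etc.), register
 * screens and layers, and set the acceleration function pointers in
 * the GraphicsDeviceFuncs structure. */
static DFBResult driver_init_driver( /* device, funcs, ... */ );
static DFBResult driver_close_driver( /* device, driver_data */ );

/* Declare the device capabilities, so that the core knows what is
 * done in hardware and what falls back to software. */
static DFBResult driver_init_device( /* device, device_info, ... */ );
static DFBResult driver_close_device( /* driver_data, device_data */ );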
Takanari then detailed again the initialization steps of a gfxdriver. From his slides : DirectFB calls driver_probe() in each gfxdriver on the system with a graphics device identifier, to find the gfxdriver corresponding to the device. If driver_probe() returns non-zero, then DirectFB calls driver_init_driver(); in this function, the driver should register the graphics device functions, screens and layers. DirectFB then calls driver_init_device(), in which the driver should set the capabilities supported by the device in a GraphicsDeviceInfo structure.

The GraphicsDeviceFuncs structure lists the functions supported by the driver; they are set during driver_init_driver() (see src/core/gfxcard.h for the definition of this structure and many other important structures). The driver developer does not have to set all the functions, only the ones for which a specific implementation is wanted. According to the speaker, the most interesting and important ones are : reset/sync of the graphics accelerator (EngineReset(), EngineSync()), check/set state of the graphics accelerator (CheckState(), SetState()) and blitting/drawing functions (Blit(), StretchBlit(), FillRectangle(), DrawLine(), etc.). In total, there are twenty-two functions that can be set through this GraphicsDeviceFuncs structure.
The acceleration process was then described by Takanari. DirectFB starts by calling the CheckState() method to ask the driver whether it is possible to execute a specific operation with a specific state. The driver can answer either that it supports the operation or that it doesn't. If not, DirectFB will fall back to software rendering. If the hardware supports the operation, then DirectFB calls SetState(), which gives the driver the opportunity to program the hardware for the execution of the given operation in the given state. Once done, DirectFB finally calls the appropriate drawing/blitting function, such as Blit(). As Takanari explained, thanks to this modular approach to acceleration support, one can start with a very basic driver with no acceleration and incrementally add support for the acceleration of the different operations, one by one.
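A rough sketch of how these three hooks cooperate, for a hypothetical device (your editor's illustration; the prototypes follow the usual gfxdriver conventions, but check gfxcard.h for the authoritative definitions) :

/* The operations our hypothetical hardware accelerates. */
#define MYCHIP_SUPPORTED_FUNCTIONS  (DFXL_BLIT | DFXL_FILLRECTANGLE)

static void mychipCheckState( void *drv, void *dev,
                              CardState *state, DFBAccelerationMask accel )
{
     /* Flag only the operations the hardware can do; everything else
      * is rendered in software by the DirectFB core. */
     if (accel & MYCHIP_SUPPORTED_FUNCTIONS)
          state->accel |= accel;
}

static void mychipSetState( void *drv, void *dev, GraphicsDeviceFuncs *funcs,
                            CardState *state, DFBAccelerationMask accel )
{
     /* Program the hardware (source/destination surfaces, color,
      * blending mode) for the operation about to be executed. */
}

static bool mychipBlit( void *drv, void *dev,
                        DFBRectangle *rect, int dx, int dy )
{
     /* Kick the blitter with the parameters programmed in SetState(). */
     return true;
}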
The speaker then mentioned the possibility of queuing draw/blit commands, if the graphics accelerator supports this. If so, you can queue as many draw/blit operations as you can and then kick the hardware. This is implemented through the EmitCommands operation, and an example of this is visible in the sh7722 gfxdriver.
Now that the graphics driver and device modules had been covered, Takanari went on to discuss the screen module. A screen represents an output device, such as an LCD. For a fixed-size screen, the minimum functions that the driver developer has to define are InitScreen() and GetScreenSize(). The screen operations must be listed in a ScreenFuncs structure (see src/core/screens.h for the definition), and registered from the driver_init_driver() operation using the dfb_screens_register() function.
Then, layers. They represent independent graphics buffers, which are combined by the hardware when they get displayed on the screen (usually with alpha blending). Layers are required to be able to change their size, pixel format, buffering mode and colour-lookup table (CLUT), and to be able to flip buffers. The layer operations must be implemented and listed in a DisplayLayerFuncs structure (see src/core/layers.h for the definition), and registered from the driver_init_driver() operation using the dfb_layers_register() function. The important display layer operations are : LayerDataSize() (returns the size of the layer data to be stored in shared memory), RegionDataSize() (returns the size of the region data to be stored in shared memory), InitLayer() (initializes the layer), TestRegion() (checks whether the given parameters are supported), SetRegion() (programs the hardware with the given parameters), RemoveRegion() (removes the region) and FlipRegion() (flips the frame buffer).
To finish the talk, Takanari gave detailed information about surface allocation. DirectFB 1.0 used a single one-dimensional linear surface allocator. Using the DSCAPS_VIDEOONLY flag, one could request a surface in the form of a contiguous memory block, while with DSCAPS_SYSTEMONLY, the surface is allocated using malloc(). For embedded graphics accelerators, Takanari said that one is likely required to use physically contiguous memory, so that DSCAPS_VIDEOONLY is the only solution. The only way to customize the surface allocation was through the Layer Driver API, mostly for primary surfaces. Takanari then gave examples of cases where one would want to customize the surface allocation, and explained how this can be done in DirectFB 1.0. Then, he introduced a new concept that has been added in DirectFB 1.1 : surface pools, which greatly simplify the topic of surface allocation. Takanari detailed this new concept, and suggested reading the documentation of DirectFB Surface Pools.
Link to the video (49 minutes, 141 megabytes) and the slides.
Matthew Locke works for Embedded Alley, so he is a colleague of Matt Porter, who gave a talk on the first day of the conference about leveraging existing free and open source software in embedded projects. Matthew's talk was a very good presentation of OpenEmbedded, with step-by-step details on how to use it.
Matthew started his talk with a bit of background on OpenEmbedded. It was started inside the OpenZaurus project, in order to make it easy to build applications for Zaurus PDAs. After some time, the build tool was rewritten to separate the build tool from the metadata describing how to build the various applications. The build tool, named bitbake, is based on concepts found in Gentoo's Portage tool. OpenEmbedded is now used by many open source projects : handhelds.org, Linksys routers, Motorola phones, MythTV hardware and, more recently, OpenMoko.
The speaker then defined OpenEmbedded as a « self contained cross build system for embedded devices ». It contains a collection of recipes that describe how to build thousands of packages, including bootloaders, kernels, libraries and applications. It targets more than 60 machines and provides over 40 package/machine configurations, which are basically custom distributions for a given device. OpenEmbedded doesn't contain the source code of the applications; it fetches it from tarballs or SVN, thanks to instructions in the metadata. At the end, OpenEmbedded outputs individual packages and filesystem images (jffs2, ext3, etc.).
The build philosophy of OpenEmbedded is to build from scratch. By default, it builds the latest version of all components by downloading their source code and possibly applying patches to them.
OpenEmbedded is implemented using bitbake, whose role is to parse the recipes and configurations, to create a database of how to fetch, configure, build and install each package, to determine the dependencies between packages, and to build them in the correct order, in parallel when possible. It uses the IPK packaging format for individual packages.
To set up OpenEmbedded, one should first decide which metadata version to use (the latest one or a stable snapshot), then install bitbake. The speaker suggests setting up a pristine OpenEmbedded directory, and not making your changes directly inside this set of reference metadata. There is a mechanism called overlay that allows keeping your modifications separate from the OpenEmbedded code. The speaker also suggests setting up an internal mirror of the upstream software you're using, so that you are sure of the versions you will be using and there won't be any surprises such as upstream server downtime, etc. Switching to an internal mirror is made very simple in OpenEmbedded.
bitbake parses all the configuration and recipe files found in the directories listed in the BBPATH environment variable. So one can set up an overlay directory that contains specific configuration files and internal package metadata, and that may overload any pristine metadata. BBPATH should then include two directories : the one with the pristine OpenEmbedded metadata, and the directory with the overlay information. The overlay directory should contain two subdirectories : conf/, with custom and overloaded configuration files, and packages/, with internal and overloaded package files. Using that, you can for example overload the default Busybox configuration provided in the pristine OpenEmbedded metadata with your own Busybox configuration.
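Concretely, with hypothetical paths, this could look like :

export BBPATH="/work/overlay:/work/openembedded"
# /work/overlay/conf/     -> custom and overloaded configuration files
# /work/overlay/packages/ -> internal and overloaded package metadata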
The configuration files define how the build environment is set up, package versions and information, global inheritance, target boards, final image configuration, etc. There are four types of configuration files, and Matthew went through them one by one to give an idea of what they are used for.
The first configuration file is the distro configuration file. It defines the toolchain and package versions, the package configuration and high-level settings such as whether udev should be used or not, and the final image format. Matthew detailed a specific example of such a configuration file, which simply defines a set of variables.
The second configuration file is the machine configuration file, defining board-specific versions and features : architecture, compiler options, kernel version and package provider, and board-specific things. Again, the speaker gave an example, in which one could see the architecture specification, the specification of the udev version and of the kernel version, and a list of features that should be included (alsa, host-usb, gadget-usb, mtd, wifi, etc.). That list of features triggers the build and inclusion of various packages into the final image.
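As an illustration (hypothetical names and versions from your editor, with the feature list variable borrowed from the task example shown later), such configuration files could contain something like :

# mydistro.conf : distro configuration
DISTRO_NAME = "mydistro"
DISTRO_VERSION = "1.0"
PREFERRED_VERSION_udev = "118"

# myboard.conf : machine configuration
TARGET_ARCH = "arm"
PREFERRED_PROVIDER_virtual/kernel = "linux-myboard"
PREFERRED_VERSION_linux-myboard = "2.6.24"
DEVEL_FEATURES = "alsa host-usb gadget-usb mtd wifi"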
The next configuration files are the recipe files, with the .bb extension. They contain the necessary information to build a package, in the form of functions such as do_fetch(), do_stage(), do_configure(), do_compile() and do_install(). There are also four types of bb files. First, classes, which define common steps for a whole class of packages (for example, all kernel packages share the same building procedure). Then packages, which usually inherit classes and add or override package-specific settings and steps. The next type of bb file is tasks, which define collections of packages to be built. And the last type, images, allows creating filesystem images out of tasks.
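A minimal package recipe could look like this (a hypothetical sketch : the package name, URL and build commands are made up, but the do_compile()/do_install() structure is the one described above) :

DESCRIPTION = "Hypothetical internal application"
SRC_URI = "http://mirror.example.com/sources/myapp-1.0.tar.gz"

do_compile() {
    oe_runmake
}

do_install() {
    install -d ${D}${bindir}
    install -m 0755 myapp ${D}${bindir}/myapp
}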
Matthew then went into more detail about tasks. They allow dividing packages into logical groups, enabling developers to work on building blocks, or separating production and development. Typically, one could define four tasks : base (with the base system : kernel, glibc, busybox, etc.), core (with the core open source middleware components), apps (with the product applications) and UI (with the interface-specific components). Of course, this is completely flexible.
A task file for base could look like :
RDEPENDS = "\
${@base_contains("DEVEL_FEATURES", "alsa", "${ALSA_PKGS}","",d)} \
base-files base-passwd busybox-devel \
kernel kernel-modules \
initscripts sysvinit udev \
${@base_contains("DEVEL_FEATURES", "mtd", "mtd-utils", "", d)} \
${@base_contains("DEVEL_FEATURES", "wifi", "wireless-tools", "", d)} \
dropbear \
"
It gives a list of packages, with some packages being included only if some features have been enabled in the machine configuration file.
The speaker then gave some advice on how to use OpenEmbedded in the development of a commercial product. First, he suggested making local copies of the open source components, as tarballs on a local server. He also thinks that locking down the versions of the components in a Bill of Materials configuration file is a good idea. This configuration file, using the PREFERRED_VERSION_<pkgname> = <version> directive, makes sure that the system is built with the correct package versions. Then, he suggests creating metadata for the internal components as well, using parallel builds to speed up the build, and creating and distributing a prebuilt build environment (SDK).
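For example, a Bill of Materials configuration file could simply contain a list of such directives (the versions here are made up) :

PREFERRED_VERSION_busybox = "1.2.1"
PREFERRED_VERSION_dropbear = "0.49"
PREFERRED_VERSION_myapp = "1.0"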
Once the base system can be built using OpenEmbedded, the custom applications still have to be developed. The speaker gave some tips on how to do that with OpenEmbedded. There are basically two ways. The first one is to use OpenEmbedded directly during application development : create a bb recipe for the application, keep the revision control system updated with the changes you want to test, build using bitbake <packagename> and integrate the package into the system by adding it to the appropriate package file. It is a powerful method, but it has drawbacks : your application is completely recompiled every time, and you must commit your changes to the SCM to actually be able to test them. The second method is to export the SDK from OpenEmbedded : set up OpenEmbedded to export the toolchain and libraries to an environment that is independent from OpenEmbedded, then build your applications the usual way from your local sources, and only integrate them into OpenEmbedded when they are ready.
Finally, Matthew described the result of an OpenEmbedded build. It creates several directories, the interesting ones being conf, with the build-specific configuration files; deploy, with the images (in the images/ subdirectory) and the individual packages (in the ipk/ subdirectory); staging, which contains the intermediate install of libraries and headers; work, which is the build directory; cross, which contains the host tools for the target; and rootfs, which contains the expanded root filesystem generated by OpenEmbedded.
Matthew concluded his talk by describing OpenEmbedded as a « very powerful metadata system », with many advantages : a layered design that eases customization, easy support for commercial software development, many supported packages, great flexibility, and a large community using and maintaining it. However, he admitted that the learning curve is quite steep, and that finding a version of the metadata that "just works" can be a challenge.
The resources pointed to by the speaker are the OpenEmbedded official website, the Bitbake manual and the OpenMoko wiki, which contains a lot of information on how to build a complete software stack with OpenEmbedded.
In the end, your editor found this talk particularly interesting. He was fond of Buildroot, another tool with similar capabilities, but discovered that OpenEmbedded has some real advantages over Buildroot, and that it is probably worth spending some time testing and playing with OpenEmbedded.
Link to the video (27 minutes, 190 megabytes). The slides can be found on elinux.org.
In this talk, Guido Madaus and his colleague, two German developers, presented Disko, a framework to develop multimedia applications for digital media devices. The project started under the name MorphineTV, an application for set-top boxes. After some time, the developers figured out that a more generic framework, allowing others to create custom applications and plugins, would be interesting, and the result of this effort is the Disko framework. It allows creating GUI applications for multimedia devices using a simple XML language. The applications are then fully skinnable and themeable.
Link to the video (49 minutes, 112 megabytes) and the slides.
In his conference-closing keynote, Tim Bird started by giving a list of kernel highlights : things that happened over the last year and that will probably happen in the coming months, with of course a focus on embedded-related features. So, for 2.6.24, he highlighted :
Then came 2.6.25, which was released right during the conference, so that Tim wasn't even aware of it when starting the talk. But of course, the well-informed audience noticed it and told Tim Bird the good news. Tim highlighted the following features :
On the radar, Tim sees two interesting things. First, the latency trace system, based on gcc's -mcount feature, which supports multiple tracers and is based on the latency tracer available in the RT tree. And also mem_notify, which allows processes to avoid the OOM killer by responding to events and shrinking their memory usage voluntarily (see this LWN article).
He then mentioned the Technology Watch List maintained by the CE Linux Forum. It's a list of technologies that they are interested in and that they are watching, like the Kernel Weather Forecast, but with an embedded focus. This page is available at http://elinux.org/Technology_Watch_List on the elinux.org wiki. The list of technologies is very impressive, and Tim went over it quickly with a few slides.
The first preoccupation is kernel size. Tim of course mentioned the Linux-Tiny work, which is now maintained by Michael Opdenacker, from Free Electrons, and the CE Linux Forum is contracting Free Electrons for this work. He mentioned the patches being mainlined by your editor, who also works for Free Electrons, and suggested seeing http://elinux.org/Linux_Tiny_Patch_Details for more information about the patches. Then Tim mentioned the kpagemap work done by Matt Mackall, funded by CELF and merged in 2.6.25. Matt Mackall also released Bloatwatch 2.0, a tool to track kernel size regressions, which was covered in his talk Kernel Size Report during the conference.
Tim gave some details about kpagemap. kpagemap gives details about every allocated page in the system, and allows the introduction of new metrics to measure the memory consumption of userspace applications. The existing metric, RSS, is not convenient because it counts the memory consumed by shared pages once for every process mapping these pages. The new metrics, PSS, for Proportional Set Size, and USS, for Unique Set Size, should give a better idea of memory consumption. For shared pages, PSS divides the memory consumption by the number of processes actually mapping these pages. USS simply doesn't count shared pages. Tim gave some interesting pointers : an ELC presentation, a LWN article and a visualization tool.
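To make these metrics concrete with a hypothetical example : a process using 4 megabytes of private memory plus a 12 megabyte shared library mapped by three processes in total has an RSS of 16 megabytes, a PSS of 4 + 12/3 = 8 megabytes, and a USS of only 4 megabytes.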
The next topic was filesystems. First, SquashFS, the famous compressed read-only filesystem with better compression than CramFS. It is actively maintained (last release in February 2008), but still not mainlined. The main developer, Phillip Lougher, injured his hand, so he cannot work much on the project, and some help would be appreciated to get that filesystem merged into the kernel tree. The second filesystem was AXFS, covered by a talk by Jared Hulbert during the conference. It is an advanced XIP filesystem, in that it can profile applications and then use XIP only on some blocks. It allows fine-grained control over how much flash vs. RAM is used for an application set. A mainline merge was requested in summer 2007, and the main developer said that he would try again soon. The third filesystem was LogFS, also covered by a talk, from Jörn Engel, during ELC. It's a flash filesystem that solves the scalability problems of JFFS2. It reduces the memory consumption and the mount time when compared to JFFS2, but still has some outstanding problems to be solved. Tim mentioned that CELF is paying for work on this filesystem. The last filesystem was UBIFS, which is built on top of UBI, a new flash layer merged into the kernel. UBIFS inclusion into mainline has recently been requested by Nokia. See this whitepaper and this LWN article for more information about UBIFS.
Tim Bird is also very interested in tracing solutions. On the LTTng front, the markers infrastructure was merged in 2.6.24, and the next thing to merge is the core of LTTng. The markers are an infrastructure for static instrumentation, so they do not compete with Kprobes, which allows dynamic instrumentation. The goal of the kernel markers is to have a very low overhead when not in use, thanks to the use of immediate values. Tim Bird also mentioned SystemTap and the work done by Lineo engineers to adapt it to cross-compiled environments (see our demonstrations report), Kernel Function Trace, which is now maintained by Nicholas McGuire, and printk-times architecture support.
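To give an idea of the static instrumentation side (a minimal sketch from your editor; the marker API was still evolving at the time, so the exact form may differ between kernel versions), instrumenting a spot in kernel code with a marker looked roughly like this :

#include <linux/marker.h>

/* A static instrumentation point somewhere in kernel code; the event
 * name and format string here are made up for the example. When no
 * probe is attached, the cost is almost zero, thanks to the immediate
 * values mechanism mentioned above. */
static void my_subsys_do_work(int req_id)
{
        trace_mark(my_subsys_work_start, "req_id %d", req_id);
        /* ... actual work ... */
}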
The next topic was security, with a discussion of Tomoyo Linux and AppArmor, both of which are still out of mainline, and Smack, which is now part of 2.6.25. He also mentioned the work on embedded SELinux, and the talks given by Nakamura and Kohei during the conference. SELinux requires a filesystem with extended attributes support, and usually comes with enormous configurations. People were able to reduce this to only 700 kilobytes, which makes SELinux usable in some embedded contexts.
On the power management front, PowerTOP was mentioned, but also PM QoS, merged in 2.6.24, and the work of Wolfson Microelectronics on voltage and current regulators.
In the real-time area, the high-resolution timers were merged in 2.6.21, but some work is still needed on some architectures. Large pieces of linux-rt still remain to be merged : threaded interrupts, sleeping spinlocks and the latency tracer, amongst others. As said previously, work to mainline the latency tracer is currently being done.
Tim said that this list should of course be updated with the progress of the different projects. The fact that the page is on the Embedded Linux Wiki will probably help.
Then, some other topics of interest to CELF members were briefly presented : boot-up time, system size, licensing, graphics (with an interest in GstOpenMAX), middleware (a discussion about DLNA), and the mobile phone stack wars (with Android, LiMo and the new ARM Ultra-Mobile PC initiative).
At the end of the talk, Tim Bird gave a quick presentation of the CE Linux Forum, which is focused on the advancement of Linux as an open source platform for consumer electronics devices. It was founded in June 2003 and now has about 50 member companies, including Panasonic, Sony, Hitachi, Toshiba, Sharp, Philips, Samsung, NEC, IBM, etc. Interestingly, more than half of the CELF members are in Asia, around a third in the US and around ten percent in Europe. There is an almost equal representation of consumer electronics, semiconductor and software players in the CELF. The CELF does technical work through workgroups, contract work, conferences, technical output and special projects. There are many workgroups in CELF : audio, video and graphics, boot technologies, digital television, memory management, power management, real time, security, system size, etc. Tim had some slides about them, but skipped them to make the presentation shorter.
Tim highlighted the contract work done by Matt Mackall, Matt Locke, Bill Traynor, Michael Opdenacker, Nicholas McGuire and Jörn Engel. CELF is ready to fund other projects of interest for Linux on embedded devices. CELF also organizes or is present at several conferences : the Embedded Linux Conference, the Ottawa Linux Symposium, regional jamborees, ELC-Europe and Japanese trade shows. The next ELC-Europe conference will take place on November 6th and 7th in Ede, The Netherlands.
Before closing, Tim again said that elinux.org is open to contributions, and that it should become a central place of information for the use of Linux on embedded devices. Finally, Tim said that « Linux is destined to dominate the embedded market, so let's have fun doing it ! ».
After the talk, a game was organized, at the end of which one person could win a Nokia Internet Tablet. The first stage of the game consisted in solving Tango puzzles. After that stage, three people were selected for the second stage : Jörn Engel, Matt Mackall and Liam Girdwood. During the second stage, they played a wheel-of-fortune game, in which they had to guess phrases such as Linus Torvalds or Tux The Penguin. In the end, Matt Mackall won that second stage and went back home with a shiny new Nokia Internet Tablet. Cool !
Of course, many other interesting talks took place at the Embedded Linux Conference, on a wide variety of topics. Here is a list of the other talks, along with links to their corresponding slides when available on the conference website :
For your editor, it was his first edition of the Embedded Linux Conference. The content of the conference is very good : highly technical, and of sufficient variety to allow all attendees to find valuable information matching their interests. The conference organization was also absolutely perfect : nice venue, free lunch every day, a fun social event, etc. The demonstration session was also very interesting, and one could wish for more projects to be present next year.
Congratulations to the organization team, and thanks to Tim Bird and the CE Linux Forum for setting up such a great conference every year !
As a suggestion for next year, your editor would propose setting up a proper video recording team, with a connection to the room audio system. This would allow recording all the talks in high quality, so that attendees could watch the talks they missed, and it would also benefit people who couldn't come to the conference for various reasons.