cOOol

Checks LibreOffice / OpenOffice.org documents for broken links

cOOol is a simple Python script that looks for broken hyperlinks in LibreOffice / OpenOffice.org documents.

  • cOOol only supports documents in the OpenDocument format.
  • cOOol is fast: it doesn’t start LibreOffice / OpenOffice.org and runs link checks in parallel threads.
  • cOOol supports most kinds of hyperlinks, including links within the documents.
  • cOOol is easy to use. Just download the script and run it!
  • cOOol is free. It is released under the terms of the GNU General Public License.

Here is why an automatic link checker for your documents is useful:

  • External references can be a very valuable part of your documents. Broken links reduce their usefulness as well as the impression they make. They also give the feeling that your documents are outdated and older than they are.
  • Web sites evolve frequently. Having an automated way of detecting obsolete links is essential to keeping your documents up to date.
  • You may be much more familiar with your target websites than your readers. They may not be able to find a new location by themselves. You’d better be aware of the change and do this for them!
  • When you rename a page (for example), LibreOffice and OpenOffice.org don’t update all the references to it.

Usage

Usage: coool [options] [OpenOffice.org document files]

Checks OpenOffice.org documents for broken Links

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         display progress information
  -d, --debug           display debug information
  -t MAX_THREADS, --max-threads=MAX_THREADS
                        set the maximum number of parallel threads to create
  -r MAX_REQUESTS_PER_HOST, --max-requests-per-host=MAX_REQUESTS_PER_HOST
                        set the maximum number of parallel requests per host
  -x EXCLUDE_HOSTS, --exclude-hosts=EXCLUDE_HOSTS
                        ignore urls which host name belongs to the given list

When a broken link is found, open the document in OpenOffice.org and use the search facility to look for the link text.
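
The effect of the -t, -r and -x options can be pictured with the minimal Python sketch below. It is not cOOol's actual code: the function name, the default values and the `check` callback (which would perform the real network request) are assumptions made for illustration.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def check_links(urls, check, max_threads=100, max_requests_per_host=5,
                exclude_hosts=()):
    """Run check(url) in parallel and return a {url: result} dictionary.

    At most max_threads checks run at once, and at most
    max_requests_per_host of them target the same host name.
    URLs whose host name is in exclude_hosts are skipped.
    """
    # One semaphore per host limits the parallel requests to that host.
    host_slots = defaultdict(
        lambda: threading.Semaphore(max_requests_per_host))
    lock = threading.Lock()

    def worker(url):
        host = urlsplit(url).hostname or ""
        with lock:              # protect the creation of new semaphores
            slot = host_slots[host]
        with slot:              # wait for a free slot on this host
            return url, check(url)

    todo = [u for u in urls
            if (urlsplit(u).hostname or "") not in exclude_hosts]
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return dict(pool.map(worker, todo))
```

A real checker would pass a `check` function that opens the URL and reports whether the request succeeded.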

Configuration file

Rather than configuring cOOol from the command line, it is possible to
define the same settings in a ~/.cooolrc file.

Example:

# Configuration file for cOOol

verbose = True
exclude_hosts = "lxr.bootlin.com www.example.com"
max_threads = 200

You can see that configuration file settings have the same name as long
options, except that dash (-) characters are replaced by
underscores (_).
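
The naming rule can be illustrated with a short Python sketch. It uses the standard configparser module rather than the module cOOol itself relies on, and the `[coool]` section name is a hypothetical placeholder added only because configparser requires a section header.

```python
import configparser

SAMPLE = '''\
# Configuration file for cOOol

verbose = True
exclude_hosts = "lxr.bootlin.com www.example.com"
max_threads = 200
'''


def setting_name(long_option):
    """Turn a long option such as --max-threads into max_threads."""
    return long_option.lstrip("-").replace("-", "_")


def read_settings(text):
    """Parse the settings into a dictionary of strings."""
    parser = configparser.ConfigParser()
    # Hypothetical section header: the real file has none.
    parser.read_string("[coool]\n" + text)
    # Drop the optional double quotes around values.
    return {key: value.strip('"') for key, value in parser["coool"].items()}


settings = read_settings(SAMPLE)
print(setting_name("--max-requests-per-host"))
print(settings["exclude_hosts"].split())
```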

Usage through a proxy

cOOol can be used through a proxy. The Python classes it uses rely on standard Unix environment variables for proxy definition, as in the following bash example:

export http_proxy="proxy.server.com:8080"
export ftp_proxy="proxy.server.com:8080"
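
To see how Python picks up these variables, here is a minimal sketch assuming Python 3's urllib.request module (cOOol itself may rely on a different module). It sets the two variables from the example above and asks the standard library which proxies it detects in the environment.

```python
import os
import urllib.request

# Same values as in the bash example above.
os.environ["http_proxy"] = "proxy.server.com:8080"
os.environ["ftp_proxy"] = "proxy.server.com:8080"

# Returns a {scheme: proxy} dictionary built from *_proxy variables.
proxies = urllib.request.getproxies_environment()
print(proxies.get("http"), proxies.get("ftp"))
```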

Prerequisites

You first need to install the configparse module.

Downloads

cOOol can be found in our training scripts git tree.

Screenshot

cOOol screenshot

Implementation

cOOol parses the xml components of each document file, looking for hyperlinks.
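
The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not cOOol's actual code: it only reads content.xml, only looks at text:a elements, and the `extract_links` and `make_sample_document` names are made up for the example. An OpenDocument file is a zip archive, so no office suite is needed to read it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaced names of the text:a element and its xlink:href attribute.
TEXT_A = "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}a"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def extract_links(document):
    """Return the hyperlink targets found in content.xml."""
    with zipfile.ZipFile(document) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    return [a.get(XLINK_HREF) for a in root.iter(TEXT_A)
            if a.get(XLINK_HREF)]


def make_sample_document():
    """Build a tiny in-memory archive with two links (hypothetical data)."""
    content = (
        '<office:document-content'
        ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
        ' xmlns:xlink="http://www.w3.org/1999/xlink">'
        '<office:body><office:text><text:p>'
        '<text:a xlink:href="http://www.example.com/">example</text:a>'
        '<text:a xlink:href="#Introduction">internal</text:a>'
        '</text:p></office:text></office:body>'
        '</office:document-content>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


links = extract_links(make_sample_document())
print(links)
```

Targets starting with `#` point inside the document itself, so a checker would have to resolve them against the document contents instead of the network.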

It would have been cleaner and safer to use the OpenOffice.org API to explore the documents. However, there are also benefits in a standalone Python implementation:

  • No need to start OpenOffice.org and load documents in memory. This saves a lot of time and RAM!
  • No need to have an OpenOffice.org install. Nice if you need to implement a validation server using cOOol.
  • Last but not least, no need to understand OpenOffice.org’s API and the internal structure of documents! By the way, that’s what makes exchange formats like XML attractive. However, we would be delighted if somebody could come up with a simpler and safer implementation based on the API, that could be run within OpenOffice.org user interface!

Testing cOOol

We are using 2 documents to make sure that cOOol finds all the kinds of broken links it is supposed to support:

Limitations and possible improvements

  • cOOol doesn’t check for e-mail links. It could at least check that the corresponding domain is valid.
  • cOOol doesn’t give you page numbers for broken links. You have to open the document and use the search facilities to locate each link.
  • cOOol still crashes on some documents with Unicode strings (for example with Chinese text).
  • cOOol has trouble with link text containing quotes, as in “what’s new”. The text it outputs is truncated.

clink

Compacts directories by replacing duplicate files by symbolic links

clink is a simple Python script that replaces duplicate files in Unix filesystems by symbolic links.

  • clink saves space. It works particularly well with automatically generated directory structures, such as compiling toolchains.
  • clink uses relative links, making it possible to move processed directory structures.
  • clink is fast. It reads each file only once and its runtime is mainly the time taken to read files.
  • clink is light. It consumes very little RAM. No problem to run it on huge filesystems!
  • clink is easy to use. Just download the script and run it!
  • clink is free. It is released under the terms of the GNU General Public License.

Usage

usage: clink [options] [files or directories]

Compacts folders by replacing identical files by symbolic links

options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -d, --dry-run  just reports identical files, doesn't make any change.

Screenshot

clink screenshot

Downloads

Here is the OpenPGP key used to generate the signatures.

How it works

clink reads all the files one by one and computes their SHA (20 bytes) and MD5 (16 bytes) checksums. The trick to easily find identical files is a dictionary of file lists indexed by SHA checksum.

Files with the same SHA checksum are not immediately considered identical: their MD5 checksums and sizes are compared as well. The probability that files meeting all 3 criteria at once are different is extremely low. You are much more likely to face file corruption because of a hardware failure on your computer!

Hard links to the same contents are treated as regular files. Keeping one instance and replacing the others by symbolic links is harmless. Files implemented by symbolic links also have the advantage of not having their contents duplicated in tar archives.
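
The approach described above can be sketched as follows. This is a simplified illustration, not clink's actual code: the function names are hypothetical, SHA-1 is assumed from the 20-byte checksum size, and error handling is left out.

```python
import hashlib
import os
from collections import defaultdict


def fingerprint(path, chunk_size=1 << 16):
    """Read the file once and return its (SHA-1, MD5, size) triple."""
    sha, md5, size = hashlib.sha1(), hashlib.md5(), 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
            md5.update(chunk)
            size += len(chunk)
    return sha.digest(), md5.digest(), size


def find_duplicates(top):
    """Return (kept_file, duplicate) pairs found below top."""
    by_sha = defaultdict(list)      # SHA checksum -> list of files
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()             # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue            # skip existing links and special files
            sha, md5, size = fingerprint(path)
            by_sha[sha].append((md5, size, path))
    pairs = []
    for entries in by_sha.values():
        kept = {}                   # (MD5, size) -> first file seen
        for md5, size, path in entries:
            if (md5, size) in kept:
                pairs.append((kept[(md5, size)], path))
            else:
                kept[(md5, size)] = path
    return pairs


def replace_by_symlinks(pairs, dry_run=True):
    """Replace each duplicate by a relative symbolic link."""
    for original, duplicate in pairs:
        target = os.path.relpath(original, os.path.dirname(duplicate))
        print(duplicate, "->", target)
        if not dry_run:
            os.remove(duplicate)
            os.symlink(target, duplicate)
```

Because the link target is computed relative to the directory of the duplicate, the whole tree can be moved without breaking the links.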

Limitations and possible improvements

  • File permissions: clink just keeps one copy of duplicate files. The permissions of this file may be less strict than those of other duplicates. If permissions matter, enforce them by yourself after running clink.
  • Directory structure: even when entire directories are identical, clink just creates links between files. This is not fully optimal in this case, but it keeps clink simple.

Similar tools or alternatives

  • dupmerge2: replaces identical files by hardlinks.
  • finddup: finds identical files.

Demo: tiny qemu arm system with a DirectFB interface

A tiny embedded Linux system running on the qemu arm emulator, with a DirectFB interface, everything in 2.1 MB (including the kernel)!

Overview

This demo embedded Linux system has the following features:

  • Very easy to run demo, just 1 file to download and 1 command line to type!
  • Runs on qemu (easy to get for most GNU/Linux distributions), emulating an ARM Versatile PB board.
  • Available through a single file (kernel and root filesystem), sizing only 2.1 MB!
  • DirectFB graphical user interface.
  • Demonstrates the capabilities of qemu, the Linux kernel, BusyBox, DirectFB, and shows the benefits of system size and boot time reduction techniques as advertised and supported by the CE Linux Forum.
  • License: GNU GPL for root filesystem scripts. Each software component has its own license.

How to run the demo

  • Make sure the qemu emulator is installed on your GNU/Linux distribution. The demo works with qemu 0.8.2 and beyond, but it may also work with earlier versions.
  • Download the vmlinuz-qemu-arm-2.6.20 binary.
  • Run the below command:
    qemu-system-arm -M versatilepb -m 16 -kernel vmlinuz-qemu-arm-2.6.20 -append "clocksource=pit quiet rw"
  • When you reach the console prompt, you can try regular Unix commands but also the graphical demo:
    run_demo
    

FAQ / Troubleshooting

  • Q: I get “Could not initialize SDL - exiting” when I try to run qemu.

    That’s a qemu issue (qemu uses the SDL library). Check that you can start graphical applications from your terminal (try xeyes or xterm for example). You may also need to check that you have name servers listed in /etc/resolv.conf. In any case, you will find solutions for this issue on the Internet.

Screenshots

Screenshots of the console and of the df_andi, df_dok, df_dok2, df_neo and df_input programs.

How to rebuild this demo

All the files needed to rebuild this demo are available here:

  • You can rebuild or upgrade the (Vanilla) kernel by using the given kernel configuration file.
  • The configuration file expects to find an initramfs source directory in ../rootfs, which you can create by extracting the contents of the rootfs.tar.7z archive.
  • Of course, you can make changes to this root filesystem!

Tools and optimization techniques used in this demo

Software and development tools

  • The demo was built using Scratchbox, a fantastic development tool that makes cross-compiling transparent!
  • The demo includes BusyBox 1.4.1, a toolbox implementing most UNIX commands in a few hundred KB. In our case, BusyBox includes the most common commands (like a vi implementation), and is only 192 KB in size!
  • The root filesystem is shipped within the Linux kernel image, using the initramfs technique, which makes the kernel simpler and saves a dramatic amount of RAM (compared to an init ramdisk).
  • The demo is interfaced by DirectFB example programs (version 0.9.25, with DirectFB 1.0.0-rc4), which demonstrate the amazing capabilities of this library, created to meet the needs of embedded systems.

Size optimization techniques

The below optimization techniques were used to reduce the filesystem size from 74 MB to 3.3 MB (before compression in the Linux kernel image):

  • Removing development files: C headers and manual pages copied when installing tools and libraries, .a library files, gdbserver, strace, /usr/lib/libfakeroot, /usr/local/lib/pkgconfig
  • Files not used by the demo programs: libstdc++, and any library or resource file.
  • Stripping and even super stripping (see sstrip) executables and libraries.
  • Reducing the kernel size using CONFIG_EMBEDDED switches, mainly from the Linux Tiny project.

Techniques to reduce boot time

We used the below techniques to reduce boot time:

  • Disabled console output (quiet boot option, printk support was disabled anyway), which saves time scrolling the framebuffer console.
  • Use the Preset Loops per Jiffy technique to disable delay loop calculation, by feeding the kernel with a value measured in an earlier boot (lpj setting, which you may update according to the speed of your own workstation).

All these optimization techniques and other ones we haven’t tried yet are described either on the elinux.org Wiki or in our embedded Linux optimizations presentation.

Future work

We plan to implement a generic tool which would apply some of these techniques in an automatic way, to shrink an existing GNU/Linux or embedded Linux root filesystem without any loss in functionality. More in the next weeks or months!

OLS 2008 videos

29 videos from the Linux Symposium in Ottawa

We are pleased to release 29 videos that we took at the Linux Symposium in Ottawa, Canada, in July 2008:

  • Keynote: The Kernel: 10 Years in Review, by Matthew Wilcox (Intel)
    video (57 minutes, 175M)
  • Talk: Tux on the Air: State of Linux Wireless Networking, by John W. Linville (Red Hat)
    paper, video (52 minutes, 168M)
  • Talk: Suspend to RAM in Linux: State of the Union, by Len Brown and Rafael Wysocki (Intel)
    paper, video (52 minutes, 163M)
  • Talk: Real Time vs Real Fast: How To Choose?, by Paul E. McKenney (IBM)
    paper, video (45 minutes, 166M)
  • Tutorial: ftrace: latency tracer, by Steven Rostedt (Red Hat)
    video (98 minutes, 772M)
  • BOF: Embedded Linux, by Tim R. Bird (Sony)
    video (42 minutes, 200M)
  • BOF: Embedded Microcontroller Linux, by Michael Durrant (Arcturus Networks)
    video (42 minutes, 243M)
  • Talk: Energy-aware task and interrupt management, by Vaidyanathan Srinivasan (IBM)
    paper, video (52 minutes, 182M)
  • Talk: Application Testing Under Realtime Linux, by Luis Claudio R. Gonçalves (Red Hat)
    paper, slides, video (54 minutes, 297M)
  • Talk: Application Framework for Your Mobile Device, by Shreyas Srinivasan (Geodesic Information Systems)
    paper, video (25 minutes, 146M)
  • Keynote: The Making of OpenMoko Neo, by Werner Almesberger (OpenMoko)
    video (94 minutes, 463M)
  • BOF: U-Boot, by Wolfgang Denk (Denx)
    video (54 minutes, 362M)
  • BOF: Linux Compiler, by Rob Landley (Impact Linux)
    video (100 minutes, 765M)
  • Tutorial: Practical Guide to Using Git, by James Bottomley (Hansen Partnership)
    video (61 minutes, 357M)
  • Talk: Advanced XIP File System, by Jared Hulbert (Numonyx)
    paper, video (49 minutes, 160M)
  • Talk: SELinux for Consumer Electronic Devices, by Yuichi Nakamura (Hitachi)
    paper, video (31 minutes, 113M)
  • Talk: Around the Linux File System World in 45 Minutes, by Steve French (IBM)
    paper, slides, video (49 minutes, 298M)
  • BOF: Linux The Easy Way with LTIB, by Stuart Hughes (Freescale)
    slides, video (25 minutes, 144M)
  • Keynote: The Joy of Synchronicity: Coordinating the Releases of Upstream and Distributions, by Mark Shuttleworth (Canonical)
    slides, video (76 minutes, 458M)
  • Talk: Smack in Embedded Computing, by Casey Schaufler
    paper, video (59 minutes, 211M)
  • Talk: Bazillions of Pages: The Future of Memory Management, by Christoph H. Lameter (SGI)
    paper, video (49 minutes, 258M)
  • Tutorial: Writing application fault handlers, by Gilad Ben-Yossef (Codefidence)
    video (49 minutes, 275M)
  • Talk: Linux, Open Source and System Bringup Tools, by Tim Hockin (Google)
    paper, video (51 minutes, 229M)
  • Talk: DCCP Reached Mobiles, by Leandro Melo Sales (Federal University of Campina Grande)
    paper, video (42 minutes, 193M)
  • Talk: Building a robust Linux kernel, by Subrata Modak (IBM)
    paper, slides, video (51 minutes, 249M)
  • CELF BOF presentation: Best of recent CELF Conferences, by Tim Bird (Sony)
    slides, video (10 minutes, 88M)
  • CELF BOF presentation: Developing Embedded Linux with Target Control, by Tim Bird (Sony)
    slides, video (17 minutes, 145M)
  • CELF BOF presentation: Embedded Building Tools – An Audience Survey, by Michael Opdenacker (Bootlin)
    slides, video (17 minutes, 127M)
  • CELF BOF presentation: GCC Tips and Tricks Highlights, by Gene Sally
    video (14 minutes, 62M)

See also all the papers, and a report from the CELF BOF.

We could only shoot the presentations we attended. You can see that our main interests are embedded systems and the Linux kernel.

Conference videos and report

27 free videos from the ELC and FOSDEM 2008 conferences. Extensive technical report from ELC 2008.

After participating in the Embedded Linux Conference (ELC) in Mountain View, and in FOSDEM in Brussels, we are pleased to release the videos that we managed to shoot.

These videos should be useful to anyone interested in the multiple topics covered by these very interesting conferences, either to people who couldn’t join these conferences, or to single core participants who couldn’t attend more than one presentation at once. These videos are also interesting opportunities to see and hear key community members like Andrew Morton, Keith Packard, Henry Kingman, Tim Bird and many others!

While we’ve been releasing free technical videos for a few years now, ELC is the first conference for which we are also offering an extensive report, written by Thomas Petazzoni, one of our kernel and embedded system developers. This report sums up the most interesting things learned at this conference, at least from the presentations Thomas could attend. This way, you shouldn’t have to view all the videos to identify the most interesting talks.

In agreement with the speakers, these videos and the report are released under the terms of the Creative Commons Attribution-ShareAlike 3.0 license.

We hope that sharing this knowledge will attract new contributors and users, and will bring our community one step closer to world domination…

Embedded Linux Conference, Mountain View, Apr. 2008

Don’t miss our detailed report on the below presentations!

  • Keynote: The Relationship Between kernel.org Development and the Use of Linux for Embedded Applications, by Andrew Morton (Google):
    video, slides (55 minutes, 240 MB)
  • UME – Ubuntu Mobile and Embedded, by David Mandala (Canonical):
    video, slides (30 minutes, 145 MB)
  • Appropriate Community Practices: Social and Technical Advice, by Deepak Saxena (MontaVista):
    video (thanks to Kevin Hilman, MontaVista) (44 minutes, 139 MB)
  • Adventures In Real-Time Performance Tuning, by Frank Rowand:
    video, slides (50 minutes, 251 MB)
  • Shifting Sands: Lessons Learned from Linux on an FPGA, by Grant Likely:
    video, slides (44 minutes, 262 MB)
  • Disko – An Application Framework for Digital Media Devices, by Guido Madaus:
    video (27 minutes, 190 MB)
  • Keynote: Tux in Lights, by Henry Kingman (LinuxDevices.com):
    video, slides (44 minutes, 139 MB)
  • Back-tracing in MIPS-based Linux Systems, by Jong-Sung Kim (LG Electronics):
    video, slides (54 minutes, 160 MB)
  • Making a Phone Call With Phase Change Memory, by Justin Treon (Numonyx):
    video, slides (28 minutes, 159 MB)
  • Building Blocks for Embedded Power Management, by Kevin Hilman (MontaVista):
    We couldn’t film his presentation, but we already shot a similar presentation he gave at Fosdem 2008: video (56 minutes, 183 MB)
  • Using Real-Time Linux, by Klaas van Gend (MontaVista):
    video, slides (53 minutes, 263 MB)
  • Every Microamp is Sacred – A Dynamic Voltage and Current Control Interface for the Linux Kernel, by Liam Girdwood (Wolfson Microelectronics):
    video, slides (35 minutes, 71 MB)
  • Power Management Quality of Service and How You Could Use it in Your Embedded Application, by Mark Gross (Intel):
    video, slides (57 minutes, 401 MB)
  • OpenEmbedded for product development, by Matt Locke (Embedded Alley):
    video, slides (49 minutes, 141 MB)
  • Kernel Size Report, and Bloatwatch Update, by Matt Mackall (Selenic Consulting):
    video (49 minutes, 146 MB)
  • Leveraging Free and Open Source Software in a Product Development Environment, by Matt Porter (Embedded Alley):
    video, slides (45 minutes, 220 MB)
  • Using a JTAG for Linux Driver Debugging, by Mike Anderson (PTR Group):
    video, slides (113 minutes, 694 MB)
  • DirectFB Internals – Things You Need to Know to Write Your DirectFB gfxdriver, by Takanari Hayama:
    video (43 minutes, 200 MB)
  • Linux Tiny – Penguin Weight Watchers, by Thomas Petazzoni (Bootlin):
    video (thanks to Jean Pihet, MontaVista), slides (32 minutes, 140 MB)
  • Keynote: Status of Embedded Linux and CELF Plenary Meeting, by Tim Bird (Sony):
    video, slides (49 minutes, 112 MB)

Slides are collected on http://www.celinux.org/elc08_presentations/.

Fosdem, Brussels, Feb. 2008

  • Modest, email client for embedded systems, by Dirk-Jan Binnema (Nokia):
    video (34 minutes, 121 MB)
  • Design a Linux robot companion with 8 bits microcontrollers, by David Bourgeois:
    video (54 minutes, 211 MB)
  • Linux on the PS3, by Olivier Grisel:
    video (47 minutes, 272 MB)
  • Xen for Secure Isolation on ARM11, by Jean Pihet (MontaVista):
    video (41 minutes, 207 MB)
  • Building blocks for Embedded Power Management, by Kevin Hilman (MontaVista):
    video (56 minutes, 183 MB)
  • Emdebian Update: Rootfs, GPE and tdebs, by Neil Williams:
    video (47 minutes, 226 MB)
  • pjsip: lightweight portable SIP stack, by Perry Ismangil:
    video (55 minutes, 194 MB)

Additional video

  • Roadmap to recovery – pain and redemption in X driver development, by Keith Packard:
    video (44 minutes, 168 MB)

ELCE 2007 videos

Free videos of CELF’s Embedded Linux Conference Europe / 9th Real-Time Linux Workshop in Linz, Austria, November 2007.

We are happy to release the videos that we took at the CELF Embedded Linux Conference Europe 2007 / 9th Real-Time Linux Workshop, held in Linz, Austria, in November 2007.

  • Detection & Resolution of Real Time Issues Using TimeDoctor, by François Audeon (NXP):
    video (32 minutes, 359 MB)
  • Fancy and Fast GUIs on Embedded Devices, by Gustavo Sverzut Barbieri (INDT):
    video, slides (46 minutes, 146 MB)
  • arch/ppc, arch/powerpc and Device Trees – A Walk Through a Port, by Hugh Blemings (IBM):
    video (30 minutes, 534 MB)
  • Free Software, Licensing and Business Processes, by Shane Martin Coughlan (FSF Europe):
    video, slides (40 minutes, 138 MB)
  • Introduction to LogFS, by Jörn Engel:
    video, slides (46 minutes, 260 MB)
  • WebKit on Linux and How It Compares to Other Open Source Engines, by Holger Freyther (Trolltech):
    video, slides (49 minutes, 205 MB)
  • Status Overview of Real-Time, by Thomas Gleixner (Linutronix.de):
    video (47 minutes, 236 MB)
  • Kernel Summit Report, by Thomas Gleixner (Linutronix.de):
    video (34 minutes, 520 MB)
  • Writing DirectFB gfxdriver For Your Embedded System, by Takanari Hayama (igel):
    video, slides (31 minutes, 223 MB)
  • Improving JFFS2 RAM Usage and Performance, by Alexey Korolev (Intel):
    video, slides (20 minutes, 141 MB)
  • YAFFS, by Wookey:
    video, slides (45 minutes, 194 MB)
  • Parallelizing Linux boot on CE Devices, by Vitaly Wool (Embedded Alley Solutions):
    video, slides (40 minutes, 185 MB)
  • Linux Suspend-to-Disk Objectives for Consumer Electronic Devices, by Vitaly Wool (Embedded Alley Solutions):
    video, slides (35 minutes, 652 MB)
  • Evaluation of Linux rt-preempt for embedded industrial devices for Automation and Power technologies – A case study, by Morten Mossige, Pradyumna Sampath, Rachana Rao (ABB):
    video, paper (22 minutes, 224 MB)
  • Assessment of the Realtime Preemption Patches (RT-Preempt) and their impact on the general purpose performance of the system, by Arthur Siro (DSLab / OSADL):
    video, paper (31 minutes, 224 MB)
  • Panel: the ideal embedded Linux distribution, by Tim Bird (Sony):
    video (65 minutes, 465 MB)

To speed up the processing of these videos, we contracted Jan Gerber, the developer of ffmpeg2theora, to add denoising support to this tool. Thanks to this contribution, anyone in the community can now directly denoise DV camcorder input and generate Ogg/Theora video in just 1 step. Previously, it was necessary to use mencoder’s denoising filter, and because mencoder couldn’t process DV input properly, a preprocessing stage with ffmpeg was also required. This new functionality can also improve the quality and compression rate of live Ogg/Theora video broadcasts.

Offering free training to community contributors

Free seats to community contributors in our public embedded Linux training sessions

At Bootlin, we owe a lot to the Free Software community, and we’re doing our best to give back as much as we can.

For each of our public embedded Linux training sessions, we decided to offer a free seat to a deserving contributor to the community, with a commercial value of 1750 € (including materials, lunch and laptop rental).

Linux USB drivers

Learning how to write USB device drivers for Linux

Bootlin is proud to release a new set of training slides from its embedded Linux training materials. These new ones cover writing USB device drivers for Linux.

Like everything we create, these new materials are released to the user and developer community under a free license. They can be freely downloaded, copied, distributed or even modified according to the terms of the Creative Commons Attribution-ShareAlike 2.5 license.

2007 – Year of the Penguin

Best wishes for 2007!

Bootlin is happy to send its best wishes to the entire Free Software and Open Source user and developer community. Whether or not you believe in Finnish Astrology, let 2007 be the Year of the Penguin! For our customers, we also wish that their competitors continue to use proprietary operating systems!

Do not hesitate to reuse our New Year’s card for your own needs. It’s Free as in Free Speech!

2007 wish card, front
2007 wish card, inside 1

2007 wish card, inside 2

  • License: right to copy and modify if the copyright notice is kept. Graphic elements (astrological symbols) can be copied and modified with no restriction (Public Domain).
  • Downloads: source (Scalable Vector Graphics, created with Inkscape) and generated files can be found here.

Embedded Linux and Ecology

Embedded Linux contributions to the Linux Ecology HOWTO.

Bootlin has contributed major updates to the Linux Ecology HOWTO, a Linux Documentation Project document that gathers ideas and techniques for using Linux in an environmentally friendly way.

In particular, Bootlin took advantage of its experience with embedded Linux system development to add new techniques which can reduce power consumption or make it possible to extend the lifetime of old systems with limited resources.

Bootlin also contributed an overview presentation on this HOWTO. The latest HOWTO version with our updates (waiting for the next official release) can also be found on the same page.