A custom PipeWire node

As described in previous articles (Introduction to PipeWire, Hands-on installation of PipeWire), the PipeWire daemon is responsible for running the graph execution. Nodes inside this graph can be implemented by any process that has access to the PipeWire socket that is used for IPC. PipeWire provides a shared object library that abstracts the communication with the main daemon and the communication with the modules that are required by the client.

In this blog post, our goal will be to implement an audio source node that plays audio coming from a file, in a loop. This will be an excuse to see a lot of code, showing what the library API looks like and how it should be used. To introduce some dynamism to a rather static setup, we’ll rely on an input from a Wii Nunchuk, connected using a custom Linux driver and relying on the input event userspace API.

A node, in the PipeWire sense, is a graph object that can consume and/or produce buffers. The main characteristic of a node is to have ports; those are graph objects described by a direction (input or output), an ID and some properties.

A basic PipeWire source node

To create a node, we will have to:

  1. Create an instance of one of the event loop implementations (pw_data_loop, pw_thread_loop or pw_main_loop for the moment);
  2. Create a pw_context using pw_context_new;
  3. Connect the context to the PipeWire daemon using pw_context_connect which returns a proxy to the core object (whose ID is zero);
  4. Create the node object (spa_node), initialise its properties, register its methods (set_param, process, enum_params, etc.; see struct spa_node_methods for the full list) and export it to the core so that the object becomes part of the graph (using pw_core_export);
  5. For each port, emit a port_info event on the node object, that will be picked up by the node implementation which will create and export the global port object.

Steps 4 and 5 are rather complex and prone to errors and, as such, PipeWire provides two abstractions to facilitate node creation:

  • A filter is a thin abstraction over steps 4 and 5 above. The filter constructor takes a proxy to the core object, a name and properties. Input or output ports can then be added using pw_filter_add_port, which returns a pointer to user memory allocated for the port. A filter exposes a few events that can be listened to; the only required one is process, which is responsible for consuming data from input ports and producing data for output ports. The audio data format for both is raw floats. In the process callback, the pointer returned by pw_filter_add_port identifies the port and is used to retrieve the port’s data buffer.
  • A stream is also an abstraction over steps 4 and 5 above, but a more focused one. The main issue it targets is format negotiation: it defines a list of supported formats, and a format negotiation happens before it is connected to another node. The adapter module is used to support input format conversion, channel-mixing, resampling and output format conversion.

What follows is therefore the full implementation of a PipeWire client that creates a source node using pw_stream. The code is heavily commented, so that it documents itself.

#include <errno.h>
#include <math.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

#include <pipewire/pipewire.h>
#include <sndfile.h>
#include <spa/param/audio/format-utils.h>

/* A common pattern for PipeWire is to provide a user data void
   pointer that can be used to pass data around, so that we have a
   reference to our memory structures when in callbacks. The norm
   is therefore to store all the state required by the client in a
   struct declared in main, and passed to PipeWire as a pointer.
   struct data is just that. */
struct data {
    /* Keep some references to PipeWire objects. */
    struct pw_main_loop *loop;
    struct pw_core *core;
    struct pw_stream *stream;

    /* libsndfile stuff used to read samples from the input audio
       file. */
    SNDFILE *file;
    SF_INFO fileinfo;
};

static void on_process(void *userdata);
static void do_quit(void *userdata, int signal_number);
static const struct pw_stream_events stream_events;

int main(int argc, char **argv)
{
    struct data data = { 0, };

    /* A single argument: path to the audio file. */
    if (argc != 2) {
        fprintf(stderr,
            "expected an argument: the file to open\n");
        return 1;
    }

    /* We initialise libsndfile, the library we'll use to convert
       the audio file's content into interleaved float samples. */
    memset(&data.fileinfo, 0, sizeof(data.fileinfo));
    data.file = sf_open(argv[1], SFM_READ, &data.fileinfo);
    if (data.file == NULL) {
        fprintf(stderr, "file opening error: %s\n",
            sf_strerror(NULL));
        return 1;
    }

    /* We initialise libpipewire. This mainly reads some
       environment variables and initialises logging. */
    pw_init(NULL, NULL);

    /* Create the event loop. */
    data.loop = pw_main_loop_new(NULL);

    /* Create the context. This is the main interface we'll use to
       interact with PipeWire. It parses the appropriate
       configuration file, loads PipeWire modules declared in the
       config and registers event sources to the event loop. */
    struct pw_context *context = pw_context_new(
        pw_main_loop_get_loop(data.loop),
        pw_properties_new(
            /* Explicitly ask for the realtime configuration. */
            PW_KEY_CONFIG_NAME, "client-rt.conf",
            NULL),
        0);
    if (context == NULL) {
        perror("pw_context_new() failed");
        return 1;
    }

    /* Connect the context, which returns us a proxy to the core
       object. */
    data.core = pw_context_connect(context, NULL, 0);
    if (data.core == NULL) {
        perror("pw_context_connect() failed");
        return 1;
    }

    /* Add signal listeners to cleanly close the event loop and
       process when requested. */
    pw_loop_add_signal(pw_main_loop_get_loop(data.loop), SIGINT,
        do_quit, &data);
    pw_loop_add_signal(pw_main_loop_get_loop(data.loop), SIGTERM,
        do_quit, &data);

    /* Initialise a string that will be used as a property to the
       stream. We request a specific sample rate, the one found in
       the opened file. Note that the sample rate will not be
       enforced: see PW_KEY_NODE_FORCE_RATE for that. */
    char rate_str[64];
    snprintf(rate_str, sizeof(rate_str), "1/%u",
        data.fileinfo.samplerate);

    /* Create the pw_stream. This does not add it to the graph. */
    data.stream = pw_stream_new(
        data.core, /* Core proxy. */
        argv[1],   /* Media name associated with the stream, which
                      is different to the node name. */
        pw_properties_new(
            /* Those describe the node type and are required to
               allow the session manager to auto-connect us to a
               sink node. */
            PW_KEY_MEDIA_TYPE, "Audio",
            PW_KEY_MEDIA_CATEGORY, "Playback",
            PW_KEY_MEDIA_ROLE, "Music",

            /* Our node name. */
            PW_KEY_NODE_NAME, "Audio source",

            PW_KEY_NODE_RATE, rate_str,
            NULL));

    /* Register event callbacks. stream_events is a struct with
       function pointers to the callbacks. The most important one
       is `process`, which is called to generate samples. We'll
       see its implementation later on. */
    struct spa_hook event_listener;
    pw_stream_add_listener(data.stream, &event_listener,
        &stream_events, &data);

    /* This is the stream's mechanism to define the list of
       supported formats. A format is specified by the sample
       format (32-bit floats, unsigned 8-bit integers, etc.), the
       sample rate, the channel number and their positions. Here,
       we define a single format that matches what we read from
       the file for the sample rate and the channel number, and we
       use a float format for samples, regardless of what the file
       contains. */
    const struct spa_pod *params[1];
    uint8_t buffer[1024];
    struct spa_pod_builder b = SPA_POD_BUILDER_INIT(buffer,
        sizeof(buffer));
    params[0] = spa_format_audio_raw_build(&b,
            SPA_PARAM_EnumFormat,
            &SPA_AUDIO_INFO_RAW_INIT(
                .format = SPA_AUDIO_FORMAT_F32,
                .channels = data.fileinfo.channels,
                .rate = data.fileinfo.samplerate ));

    /* This starts by calling pw_context_connect if it wasn't
       called, then it creates the node object, exports it and
       creates its ports. This makes the node appear in the graph,
       and it can then be detected by the session manager that is
       responsible for establishing the links from this node's
       output ports to input ports elsewhere in the graph (if it
       can).

       The third parameter indicates a target node identifier.

       The fourth parameter is a list of flags:
        - we ask the session manager to auto-connect us to a sink;
        - we want to automatically memory-map the memfd buffers;
        - we want to run the process event callback in the realtime
          thread rather than in the main thread.
       */
    pw_stream_connect(data.stream,
              PW_DIRECTION_OUTPUT,
              PW_ID_ANY,
              PW_STREAM_FLAG_AUTOCONNECT |
              PW_STREAM_FLAG_MAP_BUFFERS |
              PW_STREAM_FLAG_RT_PROCESS,
              params, 1);

    /* We start the event loop. Underneath, it relies on an epoll
       call that listens on an eventfd. In this example, the
       process gets woken up regularly to evaluate the process
       event handler. */
    pw_main_loop_run(data.loop);

    /* pw_main_loop_run returns when the event loop has been asked
       to quit, using pw_main_loop_quit. */
    pw_stream_destroy(data.stream);
    spa_hook_remove(&event_listener);
    pw_context_destroy(context);
    pw_main_loop_destroy(data.loop);
    pw_deinit();
    sf_close(data.file);

    return 0;
}

/* do_quit gets called on SIGINT and SIGTERM, upon which we ask the
   event loop to quit. */
static void do_quit(void *userdata, int signal_number)
{
    struct data *data = userdata;
    pw_main_loop_quit(data->loop);
}

/* This is a structure containing function pointers to event
   handlers. It is a common pattern in PipeWire: when something
   allows event listeners, a function _add_listener is available
   that takes a structure of function pointers, one for each
   event. Those APIs are versioned using the first field which is
   an integer version number, associated with a constant declared
   in the header file.

   Not all event listeners need to be implemented; the only
   required one for a stream or filter is `process`. */
static const struct pw_stream_events stream_events = {
    PW_VERSION_STREAM_EVENTS,
    .process = on_process,
};

/* on_process is responsible for generating the audio samples when
   the stream should be outputting audio. It might not get called,
   if the ports of the stream are not connected using links to
   input ports.

   The general process is the following:
     - pw_stream_dequeue_buffer() to retrieve a buffer from the
       buffer queue;
     - fill the buffer with data and set its properties
       (offset, stride and size);
     - pw_stream_queue_buffer() to hand the buffer back to
       PipeWire.

   We'll use the following terminology: a frame is composed of
   multiple samples, one per channel. */
static void on_process(void *userdata)
{
    /* Retrieve our global data structure. */
    struct data *data = userdata;

    /* Dequeue the buffer which we will fill up with data. */
    struct pw_buffer *b;
    if ((b = pw_stream_dequeue_buffer(data->stream)) == NULL) {
        pw_log_warn("out of buffers: %m");
        return;
    }

    /* Retrieve buf, a pointer to the actual memory address at
       which we'll put our samples. */
    float *buf = b->buffer->datas[0].data;
    if (buf == NULL)
        return;

    /* stride is the size of one frame. */
    uint32_t stride = sizeof(float) * data->fileinfo.channels;
    /* n_frames is the number of frames we will output. We decide
       to output the maximum we can fit in the buffer we were
       given, or the requested amount if one was given. */
    uint32_t n_frames = b->buffer->datas[0].maxsize / stride;
    if (b->requested)
        n_frames = SPA_MIN(n_frames, b->requested);

    /* We can now fill the buffer! We keep reading from libsndfile
       until the buffer is full. */
    sf_count_t current = 0;
    while (current < n_frames) {
        sf_count_t ret = sf_readf_float(data->file,
            &buf[current*data->fileinfo.channels],
            n_frames-current);
        if (ret < 0) {
            fprintf(stderr, "file reading error: %s\n",
                sf_strerror(data->file));
            goto error_after_dequeue;
        }

        current += ret;

        /* If libsndfile did not manage to fill the buffer we asked
           it to fill, we assume we reached the end of the file
           (as described by libsndfile's documentation) and we
           seek back to the start. */
        if (current != n_frames &&
                sf_seek(data->file, 0, SEEK_SET) < 0) {
            fprintf(stderr, "file seek error: %s\n",
                sf_strerror(data->file));
            goto error_after_dequeue;
        }
    }

    /* We describe the buffer we just filled before handing it back
       to PipeWire.  */
    b->buffer->datas[0].chunk->offset = 0;
    b->buffer->datas[0].chunk->stride = stride;
    b->buffer->datas[0].chunk->size = n_frames * stride;
    pw_stream_queue_buffer(data->stream, b);

    return;

error_after_dequeue:
    /* If an error occurred after dequeuing a buffer, we end the
       event loop. The current buffer will be sent to the next
       node so we need to make it empty to avoid sending corrupted
       data. */

    pw_main_loop_quit(data->loop);
    b->buffer->datas[0].chunk->offset = 0;
    b->buffer->datas[0].chunk->stride = 0;
    b->buffer->datas[0].chunk->size = 0;
    pw_stream_queue_buffer(data->stream, b);
}
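
The refill loop of on_process can be exercised on its own, without PipeWire or libsndfile. What follows is only a minimal sketch under that assumption: struct mem_source, mem_readf and fill_looping are hypothetical names that do not exist in either library, and mem_readf merely mimics the short-read behaviour of sf_readf_float using an in-memory array. It also guards against an empty source, a case in which the loop shown above would never terminate.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the audio file: `frames` frames
   of `channels` interleaved float samples, with a read cursor. */
struct mem_source {
    const float *samples;
    size_t frames;
    size_t channels;
    size_t pos; /* Index of the next frame to read. */
};

/* Mimics sf_readf_float(): copies up to `want` frames into `out` and
   returns how many were copied, which is fewer than `want` once the
   end of the source is reached. */
static size_t mem_readf(struct mem_source *src, float *out, size_t want)
{
    size_t left = src->frames - src->pos;
    size_t n = want < left ? want : left;

    memcpy(out, &src->samples[src->pos * src->channels],
        n * src->channels * sizeof(float));
    src->pos += n;
    return n;
}

/* Fills `out` with exactly `n_frames` frames, rewinding the source
   every time its end is reached. Returns 0 on success, or -1 for an
   empty source, which could never fill the buffer. */
static int fill_looping(struct mem_source *src, float *out,
    size_t n_frames)
{
    if (src->frames == 0)
        return -1;

    size_t current = 0;
    while (current < n_frames) {
        current += mem_readf(src, &out[current * src->channels],
            n_frames - current);

        /* A short read means the end was reached: rewind, as
           sf_seek(file, 0, SEEK_SET) does in on_process. */
        if (current != n_frames)
            src->pos = 0;
    }
    return 0;
}
```

With a three-frame mono source and a request for eight frames, the output wraps around twice and the cursor ends up on the third frame of the source.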

We will now want to compile this and integrate it into our root filesystem image. The easiest way for that is to create a custom Buildroot package. If in doubt during this step, the Buildroot manual can be a good ally as it is well written and does not gloss over any detail.

Before creating a Buildroot package, we’ll finish packaging our program by putting it into $WORK_DIR/pw-nodes/basic-source.c and creating a minimal Makefile that can compile it (used later on by the Buildroot package):

CFLAGS += $(shell pkg-config --cflags libpipewire-0.3 sndfile)
LDLIBS += -lm $(shell pkg-config --libs libpipewire-0.3 sndfile)

basic-source: basic-source.o
    $(CC) -o $@ $^ $(LDLIBS)

%.o: %.c
    $(CC) -c -o $@ $< $(CFLAGS)

.PHONY: clean
clean:
    find . -name "*.o" | xargs -r $(RM)
    $(RM) basic-source

The steps to create our package are as follows:

  • Create the buildroot/package/pw-nodes/pw-nodes.mk makefile, which declares the package’s vital information, its build step and the steps to copy it into the staging and target environments.
  • Create the buildroot/package/pw-nodes/Config.in Kconfig file, that registers a new configuration option.
  • Edit buildroot/package/Config.in to declare the new Kconfig file. Config.in files from packages do not get discovered automatically.

Here is what the makefile could look like:

################################################################################
#
# pw-nodes
#
################################################################################

PW_NODES_VERSION = 0
# TODO: substitute $WORK_DIR manually
PW_NODES_SITE = $WORK_DIR/pw-nodes
PW_NODES_SITE_METHOD = local
PW_NODES_DEPENDENCIES = host-pkgconf pipewire libsndfile

define PW_NODES_BUILD_CMDS
    $(TARGET_MAKE_ENV) $(TARGET_CONFIGURE_OPTS) \
        $(MAKE) -C $(@D) basic-source
endef

define PW_NODES_INSTALL_TARGET_CMDS
    $(INSTALL) -m 0755 -D $(@D)/basic-source $(TARGET_DIR)/usr/bin/basic-source
endef

$(eval $(generic-package))

And the Kconfig file:

config BR2_PACKAGE_PW_NODES
    bool "pw-nodes"
    help
      A custom package that includes our own PipeWire nodes.

Tip: check your package’s files using the check-package tool that can be found in the utils directory of every Buildroot install.

We can now update our project’s configuration to add the new package using make menuconfig, and recompile our project so that the new program gets added to the root filesystem image. We can extract our root filesystem image and reboot our board to test our program:

cd $WORK_DIR/buildroot
make
cd ..
sudo rm -rf rootfs
mkdir rootfs
tar xf buildroot/output/images/rootfs.tar -C rootfs

Tip: since PipeWire runs on both desktop and embedded systems, programs such as the previous one can easily be run on the host, provided it uses PipeWire as its audio processing engine. This is a practical way to experiment and iterate quickly with PipeWire’s specifics.

Dynamic, I2C-controlled PipeWire node

Now that we have a working source program that can output audio to the PipeWire graph, let’s make it dynamic! We’ll rely upon a Wii Nunchuk for this, connected over I2C. We’ll expose this peripheral using a custom driver that will make our Nunchuk available over the input subsystem userspace API. A few steps are required in order to make those changes:

  • The driver has to be added to the kernel, and enabled.
  • The device tree has to be updated in order to register the new peripheral, that will communicate over an I2C bus.
  • Our program has to be updated to rely upon evdev, the generic input event interface, for receiving input events and acting appropriately.

As the first two bullet points are outside the scope of our experiments, here is a patch that provides those changes.

We connect our Nunchuk to the exposed SCL0 and SDA0 pins. To find its address, the easiest way is to rely on the i2cdetect tool, which scans an I2C bus and returns a table of detected devices. BusyBox provides an implementation:

# i2cdetect -qy 0
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          -- -- -- -- -- -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- 52 -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --

Our modification to the device tree is therefore a new node in i2c0, of type joystick at address 0x52, with compatible = "nintendo,nunchuk".

Once the changes are applied and our system has booted with the new device tree and kernel, the i2cdetect output changes, which confirms that our driver is handling the 0x52 address on I2C bus 0:

# i2cdetect -qy 0
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          -- -- -- -- -- -- -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
50: -- -- UU -- -- -- -- -- -- -- -- -- -- -- -- --
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
70: -- -- -- -- -- -- -- --

i2cdetect works by opening /dev/i2c-0 (for bus 0), then doing an ioctl(fd, I2C_SLAVE, address) call for each possible address. errno == EBUSY means that the address is reserved by a kernel driver, and i2cdetect prints UU in this case. Otherwise, it tries to read or write depending on the mode i2cdetect was invoked with (-r or -q). In write mode, it sends an empty SMBus command; a successful operation means there is a device at that address.
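
The first half of that probing logic can be sketched as follows. This is a simplified illustration rather than i2cdetect's actual code: i2c_address_state and enum probe_result are hypothetical names, and the sketch stops after the I2C_SLAVE ioctl, so it only tells whether a kernel driver has claimed the address. It does not attempt the read or write that detects unclaimed devices.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/i2c-dev.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical result type for the sketch. */
enum probe_result {
    PROBE_ERROR = -1,  /* The bus could not be opened or queried. */
    PROBE_FREE = 0,    /* No kernel driver has claimed the address. */
    PROBE_CLAIMED = 1, /* A kernel driver owns the address ("UU"). */
};

/* Opens the I2C bus character device at `bus_path` (for instance
   /dev/i2c-0) and asks the kernel to select `address` on it. */
static enum probe_result i2c_address_state(const char *bus_path,
    int address)
{
    int fd = open(bus_path, O_RDWR);
    if (fd < 0)
        return PROBE_ERROR;

    enum probe_result result = PROBE_FREE;
    if (ioctl(fd, I2C_SLAVE, (long)address) < 0) {
        /* EBUSY is the only failure that means "claimed by a
           driver"; anything else is a genuine error. */
        result = (errno == EBUSY) ? PROBE_CLAIMED : PROBE_ERROR;
    }

    close(fd);
    return result;
}
```

On the board described here, one would expect i2c_address_state("/dev/i2c-0", 0x52) to report PROBE_CLAIMED once the Nunchuk driver is bound, mirroring the UU cell in the table above.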

Now that we know that our driver handles the I2C address, we can check that the input events are accessible through the generic input subsystem’s userspace interface: evdev. Each input is represented by a /dev/input/eventN character special file (that is, a streaming file). evtest, packaged by Buildroot under the BR2_PACKAGE_EVTEST symbol, is a tool that reads events from an input and displays them in a user-friendly manner:

# evtest
No device specified, trying to scan all of /dev/input/event*
Available devices:
/dev/input/event0:  Wii Nunchuk
/dev/input/event1:  gpio_keys
Select the device event number [0-1]: 0
Input driver version is 1.0.1
Input device ID: bus 0x18 vendor 0x0 product 0x0 version 0x0
Input device name: "Wii Nunchuk"
Supported events:
  Event type 0 (EV_SYN)
  Event type 1 (EV_KEY)
    Event code 304 (BTN_SOUTH)
    Event code 305 (BTN_EAST)
    Event code 306 (BTN_C)
    Event code 307 (BTN_NORTH)
    Event code 308 (BTN_WEST)
    Event code 309 (BTN_Z)
    Event code 310 (BTN_TL)
    Event code 311 (BTN_TR)
    Event code 312 (BTN_TL2)
    Event code 313 (BTN_TR2)
    Event code 314 (BTN_SELECT)
    Event code 315 (BTN_START)
    Event code 316 (BTN_MODE)
  Event type 3 (EV_ABS)
    Event code 0 (ABS_X)
      Value    126
      Min       30
      Max      220
      Fuzz       4
      Flat       8
    Event code 1 (ABS_Y)
      Value    132
      Min       40
      Max      200
      Fuzz       4
      Flat       8
Properties:
Testing ... (interrupt to exit)
Event: time 60.719169, type 3 (EV_ABS), code 0 (ABS_X), value 1
Event: time 60.719169, -------------- SYN_REPORT ------------
Event: time 60.919177, type 3 (EV_ABS), code 0 (ABS_X), value 126
Event: time 60.919177, -------------- SYN_REPORT ------------
Event: time 62.619168, type 3 (EV_ABS), code 0 (ABS_X), value 167
Event: time 62.619168, -------------- SYN_REPORT ------------
Event: time 62.719190, type 3 (EV_ABS), code 0 (ABS_X), value 254
Event: time 62.719190, -------------- SYN_REPORT ------------
Event: time 63.019155, type 3 (EV_ABS), code 0 (ABS_X), value 126
Event: time 63.019155, -------------- SYN_REPORT ------------
Event: time 64.319164, type 3 (EV_ABS), code 1 (ABS_Y), value 255
Event: time 64.319164, -------------- SYN_REPORT ------------
Event: time 64.619156, type 3 (EV_ABS), code 1 (ABS_Y), value 132
Event: time 64.619156, -------------- SYN_REPORT ------------
Event: time 65.318825, type 3 (EV_ABS), code 1 (ABS_Y), value 0
Event: time 65.318825, -------------- SYN_REPORT ------------
Event: time 65.618830, type 3 (EV_ABS), code 1 (ABS_Y), value 99
Event: time 65.618830, -------------- SYN_REPORT ------------
Event: time 65.719171, type 3 (EV_ABS), code 1 (ABS_Y), value 132
Event: time 65.719171, -------------- SYN_REPORT ------------
Event: time 67.419274, type 1 (EV_KEY), code 309 (BTN_Z), value 1
Event: time 67.419274, -------------- SYN_REPORT ------------
Event: time 67.619177, type 1 (EV_KEY), code 309 (BTN_Z), value 0
Event: time 67.619177, -------------- SYN_REPORT ------------
Event: time 68.319155, type 1 (EV_KEY), code 306 (BTN_C), value 1
Event: time 68.319155, -------------- SYN_REPORT ------------
Event: time 68.519157, type 1 (EV_KEY), code 306 (BTN_C), value 0
Event: time 68.519157, -------------- SYN_REPORT ------------

evtest starts by listing the detected inputs, with the names declared by their drivers. Once one is selected, it prints information about the input, including the list of supported event types and codes. Events are defined using the following structure in the evdev interface:

struct input_event {
    // Time of the event.
    struct timeval time;

    // Type of the event. EV_KEY means a change of button state.
    // EV_ABS means a change to an absolute axis value, which has
    // a range. EV_SYN are metadata markers that separate events
    // (notify dropped events, separate events in time or space).
    // There are others that won't concern us.
    unsigned short type;

    // The event identifier: BTN_{C,Z} for the two buttons on the
    // Nunchuk, ABS_{X,Y} for the two axes of our joystick.
    unsigned short code;

    // The value associated with the event: boolean for buttons,
    // between 0 and 255 inclusive for our two axes. The kernel
    // declares it as a signed 32-bit integer.
    int value;
};
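
To make the role of the three fields concrete, here is a small sketch that turns an event into a readable line, in the spirit of what evtest prints. describe_event is a hypothetical helper, not part of any library, and it only names the types and codes that appear in the Nunchuk trace above.

```c
#include <linux/input.h>
#include <stdio.h>

/* Writes a one-line description of `e` into `out` (of `size` bytes)
   and returns `out`. Only the buttons and axes used by the Nunchuk
   get a dedicated name; everything else is reported generically. */
static const char *describe_event(const struct input_event *e,
    char *out, size_t size)
{
    if (e->type == EV_KEY &&
            (e->code == BTN_C || e->code == BTN_Z)) {
        snprintf(out, size, "button %s %s",
            e->code == BTN_C ? "C" : "Z",
            e->value ? "pressed" : "released");
    } else if (e->type == EV_ABS &&
            (e->code == ABS_X || e->code == ABS_Y)) {
        snprintf(out, size, "axis %s at %d",
            e->code == ABS_X ? "X" : "Y", (int)e->value);
    } else if (e->type == EV_SYN) {
        snprintf(out, size, "sync marker");
    } else {
        snprintf(out, size, "other event (type %u, code %u)",
            (unsigned)e->type, (unsigned)e->code);
    }
    return out;
}
```

Dispatching on type first and code second is the same structure that the event handler of our node uses later on.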

We now have everything we require to edit our previous source node to add some dynamism to it. We’ll add a pause on the press of a button, as well as volume control. At init, we now have to:

  1. Find the correct file in /dev/input that represents our Nunchuk device. For that, we’ll scan their names and use the one that has the “Nunchuk” keyword, as given by the kernel driver.
  2. Add a file descriptor pointing to this file to the event-loop. This means that once the event-loop is started, the callback we will register will be called when events are available.

When our callback gets called, we need to know why we got woken up (it might be an error, a hang-up or available data). We can then read the structures and act as we like.

dynamic-source.c will be similar to basic-source.c, with the following additions:

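#include <dirent.h>
#include <fcntl.h>
#include <linux/input.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define DEV_INPUT_DIR "/dev/input"
#define FILE_PREFIX "event"
#define NUNCHUK_NAME_KEYWORD "Nunchuk"

/* The helpers below are used before they are defined. */
static int nunchuk_open_fd(void);
static void nunchuk_on_events(void *userdata, int fd,
    uint32_t mask);
static void nunchuk_on_event(struct data *data,
    struct input_event e);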

/* Add our file-descriptor and callback to the event-loop. A
   negative return value means an error occurred. */
int nunchuk_register(struct data *data)
{
    int fd = nunchuk_open_fd();
    if (fd < 0) {
        return -1;
    }

    int ret = fcntl(fd, F_SETFL, O_NONBLOCK);
    if (ret < 0) {
        perror("fcntl() failed");
        close(fd);
        return -1;
    }

    /* We register our file descriptor to the event-loop. The third
       argument is a mask on the fd's events we want to be woken
       up for: in our case, we care when it is readable or when an
       error or hang-up occurred. The fourth argument means that
       the event-loop will automatically close the fd when done.
       We then pass on the callback and a user pointer, which
       we'll want to retrieve in the callback. */
    if (pw_loop_add_io(pw_main_loop_get_loop(data->loop), fd,
            SPA_IO_IN | SPA_IO_ERR | SPA_IO_HUP, true,
            nunchuk_on_events, data) == NULL) {
        fprintf(stderr, "pw_loop_add_io() failed\n");
        close(fd);
        return -1;
    }

    return 0;
}


/* Retrieve a file descriptor to the Nunchuk character device.

   This works by going over every device found in the input
   subsystem, exposed in /dev/input, and looking at their names.
   We keep the one that matches a specific keyword open and return
   its file descriptor. */
static int nunchuk_open_fd(void)
{
    DIR *d = opendir(DEV_INPUT_DIR);
    if (d == NULL) {
        perror("opendir() failed");
        return -1;
    }

    struct dirent *dir;
    while ((dir = readdir(d)) != NULL) {
        /* Filter out non-character devices and files that don't
           have the right prefix. */
        if (dir->d_type != DT_CHR ||
                strncmp(FILE_PREFIX, dir->d_name,
                    strlen(FILE_PREFIX)) != 0)
            continue;

        char filepath[512];
        snprintf(filepath, sizeof(filepath), "%s/%s",
            DEV_INPUT_DIR, dir->d_name);
        int fd = open(filepath, O_RDONLY);
        if (fd < 0) {
            perror("open() failed");
            continue;
        }

        char name[256] = {0,};
        ioctl(fd, EVIOCGNAME(sizeof(name)), name);

        /* Search for the keyword in the input's name. */
        if (strstr(name, NUNCHUK_NAME_KEYWORD) != NULL) {
            closedir(d);
            return fd;
        }

        close(fd);
    }

    closedir(d);
    fprintf(stderr, "no Nunchuk input device found in %s\n",
        DEV_INPUT_DIR);
    return -1;
}

/* This function will be called by the event-loop when data is
   readable in the Nunchuk file descriptor, that is, when events
   are awaiting. We therefore read from the file descriptor and
   call the event handler for each event.

   We do a single read; if it fills up and there are still awaiting
   events, the event-loop will not go to sleep and we'll be called
   again. We could even read one event at a time if we were lazy.

   The mask tells us why we got woken up. We registered for input,
   errors and hang-ups. */
static void nunchuk_on_events(void *userdata, int fd,
    uint32_t mask)
{
    struct data *data = userdata;

    if (mask & (SPA_IO_ERR | SPA_IO_HUP)) {
        fprintf(stderr, "error or hang-up on the Nunchuk fd\n");
        pw_main_loop_quit(data->loop);
        return;
    }

    /* mask == SPA_IO_IN, that is we can read on the input
       character device. */

    struct input_event events[16];
    ssize_t ret = read(fd, events, sizeof(events));

    if (ret <= 0) {
        /* read() returns 0 at end of file and -1 on error. */
        if (ret == 0)
            fprintf(stderr,
                "read() on the Nunchuk input returned EOF\n");
        else
            perror("read() on the Nunchuk input failed");

        pw_main_loop_quit(data->loop);
        return;
    }

    for (size_t i = 0; i < ret / sizeof(struct input_event); i++)
        nunchuk_on_event(data, events[i]);
}

/* Called once for each Nunchuk event. */
static void nunchuk_on_event(struct data *data,
    struct input_event e)
{
    if (e.type == EV_KEY && e.code == BTN_Z) {
        /* This pauses the stream while the button is pressed.
           PulseAudio would call this corking, as in
           pa_stream_cork(). */
        pw_stream_set_active(data->stream, e.value == 0);
    } else if (e.type == EV_ABS && e.code == ABS_Y) {
        /* 0 <= e.value <= 255, which means data->volume will take
           a value between 0 and 1 inclusive. */
        data->volume = (float)e.value / 255;
    }
}

/* The return value is x, limited to the [a, b] interval. */
static float clampf(float x, float a, float b)
{
    if (x <= a) return a;
    if (x >= b) return b;
    else return x;
}

In addition to those function definitions, we:

  1. add a float volume field to struct data;
  2. call nunchuk_register(&data) from main;
  3. set data.volume = 0.5f from main, to avoid having a default value of zero for the volume;
  4. add the following block to on_process, after the buffer has been filled and before it is handed back with pw_stream_queue_buffer, to apply the volume change:
    // Clamping at the very end to avoid any bug that could hurt ears.
    float volume = clampf(data->volume, 0.0f, 1.0f);
    if (volume != 1.0f) {
        for (size_t i = 0; i < n_frames * data->fileinfo.channels; i++)
            buf[i] *= volume;
    }

Those changes enable us to control our stream using our I2C device. One thing might be surprising when running the experiment for the first time: the volume changes do not feel responsive to inputs. The reason is that we apply our volume filtering when we generate the sample buffer. Volume changes therefore apply to whole blocks of samples, with a latency that depends on the graph’s execution frequency. If no follower node imposes a latency, the driver node’s latency is used. In the current setup, the latency is set to 1024/48000, that is around 21 ms.
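
The arithmetic behind that figure is a simple ratio, sketched below. quantum_to_ms is a hypothetical helper written for this article, not a PipeWire function.

```c
#include <stdint.h>

/* Duration of one graph cycle, in milliseconds, for a quantum
   expressed in frames at the given sample rate in Hz. Returns 0 for
   an invalid (zero) rate. */
static double quantum_to_ms(uint32_t quantum, uint32_t rate)
{
    if (rate == 0)
        return 0.0;
    return 1000.0 * (double)quantum / (double)rate;
}
```

quantum_to_ms(1024, 48000) gives roughly 21.3 ms, whereas a quantum of 256 frames at the same rate brings the block duration down to roughly 5.3 ms.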

The number of samples handled during one cycle is called the quantum. The stream abstracts this and allows one to provide more or fewer samples, by providing buffering. The quantum is exposed as the requested field of the buffer structure.

To request another latency, we can set the node.latency property on our stream node when we create it. It works in the same way as node.rate:

#define QUANTUM 256
char rate_str[16], latency_str[16];

snprintf(rate_str, sizeof(rate_str), "1/%u", samplerate);
snprintf(latency_str, sizeof(latency_str), "%u/%u", QUANTUM, samplerate);

struct pw_stream *stream = pw_stream_new(core, media_name, pw_properties_new(
        /* ... */
        PW_KEY_NODE_RATE, rate_str,
        PW_KEY_NODE_LATENCY, latency_str,
        /* ... */
        NULL));

As can be seen, latency is expressed as a fraction. The denominator is often, but not necessarily, the sample rate. The quantum is the buffer size in samples (per audio channel) that a node handles; it is therefore the numerator of the latency fraction when its denominator is the sample rate.

The quantum value is provided to us through the requested buffer field. The stream abstracts it and allows us to provide more samples, providing buffering, which is why the buffers are bigger than the quantum.
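
The relationship between the buffer capacity and the requested amount boils down to the following sketch, which isolates the computation done at the top of on_process. frames_to_write is a hypothetical helper; its parameters stand for the maxsize of the data buffer, the frame size in bytes and the requested field of the buffer.

```c
#include <stdint.h>

/* Number of frames to produce for one dequeued buffer. `maxsize` is
   the capacity of the buffer in bytes, `stride` the size of one
   frame in bytes and `requested` the number of frames asked for by
   the graph (0 meaning that no amount was requested). */
static uint32_t frames_to_write(uint32_t maxsize, uint32_t stride,
    uint64_t requested)
{
    if (stride == 0)
        return 0;

    uint32_t capacity = maxsize / stride;
    if (requested != 0 && requested < capacity)
        return (uint32_t)requested;
    return capacity;
}
```

For a stereo float stream (stride of 8 bytes) and a buffer of 8192 bytes, a requested value of 256 yields 256 frames, while a requested value of zero fills the whole buffer with 1024 frames.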

Note: a stream that wants to set its own volume should rather use pw_stream_set_control(stream, SPA_PROP_volume, 1, volume_as_float, 0). This lets the stream’s audioconvert do the volume mixing (possibly using SIMD) and it triggers a flush, meaning the change is instantaneous. We only did it manually to get a feel for what the quantum is.

Graph execution timings

PipeWire provides a profiler module, loaded by the default configuration, that gives access to timing information about the nodes’ execution.

The first important thing is that a node can be of two types: it can be a driver or a follower node. For each subgraph, there is a single driver node that is responsible, in addition to processing samples, for providing the timing, that is, deciding when a graph execution cycle should start. Other nodes are followers; they get executed at every cycle.

Note: most follower nodes support not being connected to a driver node. They stay in a suspended state, with their process callback not being called. However, some nodes (in particular JACK nodes) do not support this, which is one of the reasons the graph always contains a “Dummy-Driver” node. Another rather specific node is “Freewheel-Driver”, which is used to record samples as fast as possible: it is a driver node that starts the next cycle as soon as the previous one has ended.

Every node, whether driver or follower, has the following three timestamps available through the profiler module, for each execution cycle:

  • Signal: the time at which a node was asked to run. The driver node’s signal time is the start of the new graph execution cycle.
  • Awake: the time at which the node’s sample processing started. For a driver node, that is when the timeout occurred, meaning that the underlying device expects to read (or write) samples if it is a sink (or source). A driver can wake up before the execution is finished, which leads to an underrun.
  • Finish: the time at which the node’s sample processing is done. For a driver node, it means the next execution cycle will run (equal to the next cycle’s signal).

The driver node is not always the one that signals follower nodes when they should process samples. If we have a graph with a source, a filter and an ALSA sink, the ALSA sink will act as the driver and trigger the start of each cycle, which will signal the source node. Once the source node is finished, it will signal the filter node. On the other hand, if there are no inter-dependencies, nodes will be signaled at the same time and will execute concurrently.
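The signaling order described above can be modelled with per-node dependency counters. The following C sketch is a simplified, single-threaded model written for illustration; struct node, run_cycle and MAX_NODES are hypothetical names and do not reflect PipeWire’s actual data structures. Each node waits for a number of other nodes to finish, and a node that finishes signals the targets whose counter reaches zero.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8

/* Simplified model of a graph node (hypothetical, for illustration). */
struct node {
    const char *name;
    int required;              /* nodes that must finish before this one runs */
    int pending;               /* remaining count for the current cycle */
    size_t n_targets;
    size_t targets[MAX_NODES]; /* indices of the nodes to signal once finished */
};

/* Simulate one cycle. The order in which nodes get signaled is written
 * to order[]; the number of signaled nodes is returned. */
size_t run_cycle(struct node *nodes, size_t n_nodes, size_t *order)
{
    size_t queue[MAX_NODES], head = 0, tail = 0, i;

    assert(nodes != NULL && order != NULL && n_nodes <= MAX_NODES);

    /* Cycle start: nodes without dependencies are signaled right away. */
    for (i = 0; i < n_nodes; i++) {
        nodes[i].pending = nodes[i].required;
        if (nodes[i].pending == 0)
            queue[tail++] = i;
    }

    while (head < tail) {
        struct node *n = &nodes[queue[head]];

        order[head] = queue[head];
        head++;

        /* The node would process its samples here. Once finished,
         * it signals every target that has no other pending dependency. */
        for (i = 0; i < n->n_targets; i++) {
            size_t target = n->targets[i];

            if (--nodes[target].pending == 0)
                queue[tail++] = target;
        }
    }
    return head;
}
```

For the source, filter and sink example, the source has no dependency and is signaled at cycle start, the filter is signaled when the source finishes, and the sink’s processing comes last. Two nodes without dependencies would both land in the queue at cycle start, which in the real graph means they can run concurrently.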

Tools rundown

Now that we have our own node running, let’s go through the tools PipeWire provides to get a sense of the graph state.

PIPEWIRE_DEBUG variable

This isn’t a tool as such, but rather an environment variable. It is however really useful as it tells the PipeWire library to dump logs to stderr. By default, no logging is enabled. It can be raised to print warnings and errors using a value of 2. A value of 5 prints all possible messages, up to trace messages (some can still be disabled because of the FASTPATH compile-time variable). See the documentation for more information about the variable format.

pw-dump

The pw-dump tool outputs the graph as a JSON array of all exported objects. Its main goal is to allow sharing the graph’s overall state when reporting a bug or describing a situation. It can be filtered to a specific ID by passing in a parameter. Its output is rather verbose and, for more interactive debugging sessions, pw-cli is better suited.

pw-cli

It provides a command-line interface to edit or view the graph, for debugging purposes. Its list-objects or ls command is particularly useful. It can be filtered by object type or identifier and it is rather brief. Here is how we would find the current format of a node, by knowing its name:

$ pw-cli ls Node
id 28, type PipeWire:Interface:Node/3
    object.serial = "28"
    factory.id = "10"
    priority.driver = "20000"
    node.name = "Dummy-Driver"
id 29, type PipeWire:Interface:Node/3
    object.serial = "29"
    factory.id = "10"
    priority.driver = "19000"
    node.name = "Freewheel-Driver"
...
id 62, type PipeWire:Interface:Node/3
    object.serial = "74"
    object.path = "alsa:pcm:0:front:0:playback"
    factory.id = "18"
    client.id = "32"
    device.id = "47"
    priority.session = "1009"
    priority.driver = "1009"
    node.description = "CalDigit Thunderbolt 3 Audio Analog Stereo"
    node.name = "alsa_output.usb-CalDigit__Inc._CalDigit_Thunderbolt_3_Audio-00.analog-stereo"
    node.nick = "CalDigit Thunderbolt 3 Audio"
    media.class = "Audio/Sink"
id 69, type PipeWire:Interface:Node/3
    object.serial = "197"
    factory.id = "8"
    client.id = "81"
    node.name = "Audio source"
    media.class = "Stream/Output/Audio"

$ # Now that we know the ID of the node we are looking for,
$ # we can dump its general information, properties and param list:
$ pw-cli info 69
    id: 69
    permissions: rwxm
    type: PipeWire:Interface:Node/3
    input ports: 0/0
    output ports: 2/65
    state: "running"
    properties:
        media.type = "Audio"
        media.category = "Playback"
        media.role = "Music"
        node.name = "Audio source"
        node.rate = "1/44100"
        media.name = "/root/example.wav"
        stream.is-live = "true"
        node.autoconnect = "true"
        node.want-driver = "true"
        media.class = "Stream/Output/Audio"
        adapt.follower.spa-node = ""
        object.register = "false"
        factory.id = "8"
        clock.quantum-limit = "8192"
        factory.mode = "split"
        audio.adapt.follower = ""
        library.name = "audioconvert/libspa-audioconvert"
        client.id = "81"
        object.id = "69"
        object.serial = "197"
    params:
      3 (Spa:Enum:ParamId:EnumFormat) r-
      1 (Spa:Enum:ParamId:PropInfo) r-
      2 (Spa:Enum:ParamId:Props) rw
      4 (Spa:Enum:ParamId:Format) rw
      10 (Spa:Enum:ParamId:EnumPortConfig) r-
      11 (Spa:Enum:ParamId:PortConfig) rw
      15 (Spa:Enum:ParamId:Latency) rw
      16 (Spa:Enum:ParamId:ProcessLatency) rw

$ # And now get the format param (index 4 or "Format")
$ pw-cli enum-params 69 Format
  Object: size 256, type Spa:Pod:Object:Param:Format (262147), id Spa:Enum:ParamId:Format (4)
    Prop: key Spa:Pod:Object:Param:Format:mediaType (1), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 20 4
        Id 1        (Spa:Enum:MediaType:audio)
    Prop: key Spa:Pod:Object:Param:Format:mediaSubtype (2), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 20 4
        Id 1        (Spa:Enum:MediaSubtype:raw)
    Prop: key Spa:Pod:Object:Param:Format:Audio:format (65537), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 24 4
        Id 283      (Spa:Enum:AudioFormat:F32LE)
        Id 283      (Spa:Enum:AudioFormat:F32LE)
    Prop: key Spa:Pod:Object:Param:Format:Audio:rate (65539), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 24 4
        Int 44100
        Int 44100
    Prop: key Spa:Pod:Object:Param:Format:Audio:channels (65540), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 24 4
        Int 2
        Int 2
    Prop: key Spa:Pod:Object:Param:Format:Audio:position (65541), flags 00000000
      Choice: type Spa:Enum:Choice:None, flags 00000000 32 16
        Array: child.size 4, child.type Spa:Id
          Id 0        (Spa:Enum:AudioChannel:UNK)
          Id 0        (Spa:Enum:AudioChannel:UNK)

Each object type exposes different params. Devices, for example, expose a read-write Profile param for their currently selected profile or for setting the profile we want. They also expose the read-only EnumProfile for a list of available profiles.

pw-top

The pw-top tool provides a list of the graph’s nodes, associated with various instantaneous statistics coming from the profiler module. Its aim is to offer a quick overview of the graph execution state.

We can study a frame of pw-top’s output, with the dynamic source node we developed previously outputting to an ALSA PCM sink:

S  ID  QUANT   RATE    WAIT    BUSY   W/Q   B/Q  ERR  NAME

S  26      0      0   0.0µs   0.0µs  0.00  0.00    0  Dummy-Driver
S  27      0      0   0.0µs   0.0µs  0.00  0.00    0  Freewheel-Driver
R  33    256  48000   1.1ms   1.1ms  0.21  0.22    0  alsa_output..
R  42    256  44100 185.0µs   1.0ms  0.03  0.19   36   + Audio source
S  34      0      0   0.0µs   0.0µs  0.00  0.00    0  alsa_input..

The name column for ALSA nodes has been shortened for terseness. Here are the columns that can be found, remembering that each row represents a node:

  • S: describes its status. The letter S means stopped while R is for running.
  • ID: its integer identifier.
  • QUANT: its selected quantum. Follower nodes have a displayed quantum equal to zero if they are not requesting a specific latency for their execution (in which case the driver node’s quantum is picked).
  • RATE: its sample rate. It displays as zero if no specific latency is requested by the node.
  • WAIT: its last scheduling time. It is the delta between signal and awake timings.
  • BUSY: its last execution time. Expressed differently, it is the delta between the awake and finish timings.
  • W/Q: fraction of WAIT over quantum.
  • B/Q: fraction of BUSY over quantum.
  • ERR: number of errors, that is currently the number of xruns (overruns or underruns) reported by the profiler module plus the number of cycles monitored by pw-top for which the node wasn’t finished executing.
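The WAIT, BUSY, W/Q and B/Q columns can be derived from the three timestamps described earlier. Here is a minimal sketch in C; the structure and function names are hypothetical, and timestamps are assumed to be in nanoseconds. It is a model of the computation, not the code used by pw-top.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Timestamps of one node for one cycle (assumed to be in nanoseconds). */
struct node_timing {
    uint64_t signal;
    uint64_t awake;
    uint64_t finish;
};

struct node_stats {
    double wait_us;    /* WAIT: awake - signal */
    double busy_us;    /* BUSY: finish - awake */
    double wait_ratio; /* W/Q */
    double busy_ratio; /* B/Q */
};

/* quantum is in samples, rate in Hz; quantum/rate is the cycle duration. */
struct node_stats compute_stats(const struct node_timing *t,
                                uint32_t quantum, uint32_t rate)
{
    struct node_stats s;
    double quantum_ns, wait_ns, busy_ns;

    assert(t != NULL && quantum != 0 && rate != 0);
    assert(t->signal <= t->awake && t->awake <= t->finish);

    quantum_ns = 1e9 * (double)quantum / (double)rate;
    wait_ns = (double)(t->awake - t->signal);
    busy_ns = (double)(t->finish - t->awake);

    s.wait_us = wait_ns / 1e3;
    s.busy_us = busy_ns / 1e3;
    s.wait_ratio = wait_ns / quantum_ns;
    s.busy_ratio = busy_ns / quantum_ns;
    return s;
}
```

In the frame above, the “Audio source” row (185 µs of WAIT, 1.0 ms of BUSY) gives ratios of about 0.03 and 0.19 when divided by the driver’s 256/48000 cycle duration, which matches the displayed values.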

pw-top is good at providing an overview, with some values coming from the profiler. The issue with its display method is that it hides a lot of information and, most importantly, it might miss some timing spikes that can explain why and when audio glitches occur. To solve that, pw-profiler can be used.

pw-profiler

In the same way as pw-top, pw-profiler subscribes to the profiler module’s events. Its mode of operation is different however: it logs a line for each execution cycle to a profiler.log file, until interrupted. Together with this file, it generates Gnuplot .plot files that describe various graphs and a generate_timings.sh shell script that should be called to turn those .plot files into .svg files, used by a Timings.html document. The workflow is simple in practice:

pw-profiler
# Interrupt it using Ctrl+C

# Possibly move profiler.log, Timing*.plot, generate_timings.sh and
# Timings.html to the host PC which has gnuplot installed

sh generate_timings.sh
xdg-open Timings.html

I strongly encourage you to run pw-profiler, even if it is on your desktop computer, and study its output. It currently outputs 5 plots that express the following:

  1. “Audio driver delay” is the reported total delay between the current audio position and the hardware. “Audio period” is the time from a cycle start to the next. “Audio estimated” is an estimation of the current cycle duration.
  2. “Driver end date” is the time from a cycle start to the end of the driver execution.
  3. “Clients end date” shows the time from a cycle start to the end of each client execution. That is different from the client execution time, as it includes the time prior to its execution: waiting for its dependencies to run and the signaling time.
  4. “Clients scheduling latency” is the time from the client being scheduled to it starting to run. This graph can highlight issues in your system’s IPC latencies.
  5. “Clients duration” is the time for the client to run. It can highlight spikes in your node processing time.

Here is what the scheduling latency could look like in a setup with two nodes (a source and a filter), without and with the PREEMPT_RT patch:

Without PREEMPT_RT
With PREEMPT_RT

Also, here is the impact of resampling on the clients duration (notice the legend: the colors are inverted between the two plots):

With resampling
Without resampling

A remote patchbay

A type of tool that is frequently useful is a patchbay: it displays your graph as is, in a 2D plane, in real time. It also allows you to modify it by deleting and creating links. Helvum is one such patchbay, aimed at showing your local PipeWire instance. It can however be tricked into connecting to a remote instance using the following commands:

ssh $LOGIN@$IP "socat TCP4-LISTEN:8000 UNIX-CONNECT:/run/pipewire-0" &
socat UNIX-LISTEN:/tmp/pipewire-0 TCP4:$IP:8000 &
PIPEWIRE_RUNTIME_DIR=/tmp helvum

This requires the socat command on the board and our desktop. We link the local /tmp/pipewire-0 Unix socket through a TCP tunnel hosted by our board on port 8000, linked to the /run/pipewire-0 Unix socket.

That last file is the socket created by the PipeWire daemon to allow clients to connect to it. As such, when we spawn Helvum with PIPEWIRE_RUNTIME_DIR set, it connects as a standard client to our remote PipeWire instance.

wpctl

Interactions with WirePlumber are to be done using the wpctl CLI tool. It gives access to overall information using wpctl status. The main way WirePlumber controls which output audio goes to is the default sink, which can be set using wpctl set-default $ID. The get-volume, set-volume and set-mute commands expose sound volume control. As an example, here is the command you would need to run to raise the current output volume by 10%: wpctl set-volume @DEFAULT_SINK@ 10%+.

Conclusion

We have therefore seen the PipeWire API through an example source, and modified our kernel to implement a new input device, which we used from our source to control the audio. We then did a rundown of the various tools that are useful when dealing with the PipeWire ecosystem.

Hands-on installation of PipeWire

Let’s jump right in! In the previous article, we went through a theoretical overview of PipeWire. Our goal will now be to install and configure a minimal Linux-based system that runs PipeWire in order to output audio to an ALSA sink. The hardware for this demo will be a SAMA5D3 Xplained board and a generic USB sound card (a Logitech USB Headset H340 in our case, as reported by /sys/bus/usb/devices/MAJOR-MINOR/product).

We won’t bother with the bootloader setup (in our case U-Boot) as this is out of scope of our topic; if needed, Bootlin has training sessions for embedded Linux system development for which the training materials are freely available.

We will rely on Buildroot for the root filesystem, and compile our Linux kernel outside Buildroot for ease of development. In the chronological order, here are the steps we’ll follow:

  1. Download Buildroot and configure it. This step will provide us with two things: a cross-compiling toolchain and a root filesystem. We will use a pre-compiled toolchain as compiling a GCC toolchain is a slow process.
  2. Download, configure and build the kernel. This will require small tweaks to ensure the right drivers are compiled-in. We will rely upon the Buildroot-provided toolchain, which will allow our project to be self-contained and reduce the number of dependencies installed system-wide. This also leads to a more reproducible routine.
  3. Boot our board; this requires a kernel image and a root filesystem. We’ll rely upon U-Boot’s TFTP support to retrieve the kernel image and Linux’s NFS support for root filesystems to allow for quick changes.
  4. Iterate on 1, 2 and 3 as needed! We might want to change kernel options or add packages to our root filesystem.

Feel free to skip the steps that you do not need. If you plan to follow along, expect to make some small configuration changes here and there on your side.

Buildroot: toolchain & root filesystem

Let’s start with Buildroot:

$ export WORK_DIR=PATH/TO/WORKING/DIRECTORY/
$ cd $WORK_DIR

# Download and extract Buildroot
$ export BR2_VERSION=2022.02
$ wget "https://buildroot.org/downloads/buildroot-$BR2_VERSION.tar.gz"
$ tar xf buildroot-$BR2_VERSION.tar.gz
$ mv buildroot-$BR2_VERSION buildroot

# Hop into the config menu
$ cd buildroot
$ make menuconfig
# nconfig, xconfig and gconfig are also available options

It’s config time! We’ll use a pre-compiled glibc-based toolchain.

  • In “Target options”:
    • “Target architecture” should be “ARM (little endian)” (BR2_arm symbol);
    • “Target architecture variant” should be “cortex-A5” (BR2_cortex_a5);
    • “Enable VFP extension support” should be true (BR2_ARM_ENABLE_VFP);
  • In “Toolchain”:
    • “Toolchain type” should be “External toolchain” (BR2_TOOLCHAIN_EXTERNAL);
    • “Toolchain” should be “Bootlin toolchains” (BR2_TOOLCHAIN_EXTERNAL_BOOTLIN);
    • “Bootlin toolchain variant” should be “armv7-eabihf glibc stable 2021.11-1” (BR2_TOOLCHAIN_EXTERNAL_BOOTLIN_ARMV7_EABIHF_GLIBC_STABLE);
    • “Copy gdb server to the target” can be set to true, this might come in useful in such experiments (BR2_TOOLCHAIN_EXTERNAL_GDB_SERVER_COPY).
  • In “Build options”, various options could be modified based on preferences: “build packages with debugging symbols”, “build packages with runtime debugging info”, “strip target binaries” and “gcc optimization level”.
  • In “System configuration”, the root password can be defined (BR2_TARGET_GENERIC_ROOT_PASSWD symbol). Changing this from the default empty password will allow us to login using SSH.
  • In “Target packages”, we’ll list them using symbol names as that is easier to search:
    • BR2_PACKAGE_ALSA_UTILS with its APLAY option, to enable testing devices directly using ALSA;
    • BR2_PACKAGE_DROPBEAR to enable the Dropbear SSH server, its client option can be disabled;
    • BR2_PACKAGE_PIPEWIRE, today’s topic.

From this article’s introduction, we know that we still need a session manager to go along with PipeWire. Both pipewire-media-session and WirePlumber are packaged by Buildroot but we’ll stick with WirePlumber as it’s the recommended option. Where its entry should appear in menuconfig, there is instead a message telling us that we are missing dependencies:

*** wireplumber needs a toolchain w/ wchar, threads and Lua >= 5.3 ***

If in doubt about what causes this message to appear, as it lists multiple dependencies, we can find the exact culprit by searching for the BR2_PACKAGE_WIREPLUMBER symbol in menuconfig, which tells us which symbols WirePlumber depends on:

Symbol: BR2_PACKAGE_WIREPLUMBER [=n]
Type  : bool
Prompt: wireplumber
  Location:
    -> Target packages
      -> Libraries
      (1)     -> Graphics
  Defined at package/wireplumber/Config.in:1
  Depends on: BR2_PACKAGE_PIPEWIRE [=y] &&
    (BR2_PACKAGE_LUA_5_3 [=n] || BR2_PACKAGE_LUA_5_4 [=n]) &&
    BR2_USE_WCHAR [=y] && BR2_TOOLCHAIN_HAS_THREADS [=y] &&
    BR2_USE_MMU [=y]
  Selects: BR2_PACKAGE_LIBGLIB2 [=n]

The “Depends on” entry tells us the boolean expression that needs to be fulfilled for BR2_PACKAGE_WIREPLUMBER to be available. Next to each symbol name is its current value in square brackets.

Note: this process could have been done manually, by looking for the WirePlumber symbol definition in buildroot/package/wireplumber/Config.in and grepping our current .config, seeing what was missing.

The conclusion is that we are missing Lua, which is the scripting language used throughout WirePlumber. Enabling BR2_PACKAGE_LUA makes the BR2_PACKAGE_WIREPLUMBER option available, which we enable.

In the Buildroot version we selected, the WirePlumber package lists PACKAGE_DBUS as an unconditional dependency in the WIREPLUMBER_DEPENDENCIES variable, in package/wireplumber/wireplumber.mk. However, WirePlumber can be built fine without it and we therefore need to remove it manually to build our image successfully. This has been fixed for upcoming Buildroot versions.

As often in Buildroot, packages have optional features that get enabled if dependencies are detected. make menuconfig won’t tell us about those; the best way is to browse the package/$PKG/$PKG.mk files of the packages that interest us and see what gets conditionally enabled. By visiting PipeWire’s and WirePlumber’s makefiles, we can see that we might want to enable:

  • BR2_PACKAGE_DBUS for various D-Bus-related features which we have explored in the first article; this allows building the SPA D-Bus support plugin relied upon by both PipeWire and WirePlumber, which explains why WirePlumber doesn’t directly depend upon D-Bus;
  • BR2_PACKAGE_HAS_UDEV to support detection of events on ALSA, V4L2 and libcamera devices;
  • BR2_PACKAGE_SYSTEMD for systemd unit files to get generated and systemd-journald support (logging purposes);
  • BR2_PACKAGE_ALSA_LIB for ALSA devices support (which also requires BR2_PACKAGE_ALSA_LIB_{SEQ,UCM} and BR2_PACKAGE_HAS_UDEV);
  • BR2_PACKAGE_AVAHI_LIBAVAHI_CLIENT for network discovery in various PipeWire modules: search for the avahi_dep symbol in PipeWire’s meson.build files for the list;
  • BR2_PACKAGE_NCURSES_WCHAR to build the pw-top monitoring tool;
  • BR2_PACKAGE_LIBSNDFILE to build the pw-cat tool (equivalent of alsa-tools’ aplay);
  • and a few others.

One option that needs discussion is BR2_PACKAGE_HAS_UDEV. It is required for the -Dalsa=enabled option to be passed at PipeWire’s configure step. As can be seen in PipeWire’s spa/meson.build, this option enforces that ALSA support gets built:

alsa_dep = dependency('alsa', required: get_option('alsa'))

This line seems to indicate that to have ALSA support, we could simply add ALSA as a dependency and rely on the fact that the build system will find it. However, later on in the same Meson build file, we notice:

libudev_dep = dependency(
    'libudev',
    required: alsa_dep.found() or
        get_option('udev').enabled() or
        get_option('v4l2').enabled())

This line means that if the ALSA dependency is found, the libudev dependency is required which would lead to a failing build if we don’t have udev support.
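The logic of those two Meson statements can be restated as a small truth function. The C sketch below is only a model of the condition quoted above, with hypothetical names; it covers the libudev dependency alone and ignores every other check done by the build system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the inputs to the libudev dependency check. */
struct spa_build_state {
    bool alsa_found;          /* alsa_dep.found() */
    bool udev_option_enabled; /* get_option('udev').enabled() */
    bool v4l2_option_enabled; /* get_option('v4l2').enabled() */
    bool libudev_found;       /* is libudev available to the build? */
};

/* Mirrors the "required:" expression of the libudev dependency. */
bool libudev_required(const struct spa_build_state *s)
{
    assert(s != NULL);
    return s->alsa_found || s->udev_option_enabled || s->v4l2_option_enabled;
}

/* A required dependency that is not found makes the configure step fail. */
bool configure_succeeds(const struct spa_build_state *s)
{
    return !libudev_required(s) || s->libudev_found;
}
```

With alsa_found set and libudev_found unset, configure_succeeds returns false, which is the failing build we want to avoid.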

As we expect ALSA support, we’ll make sure BR2_PACKAGE_HAS_UDEV is enabled. To find out what provides this config entry, the easiest way is a search through Buildroot for the select BR2_PACKAGE_HAS_UDEV string, which returns two results:

$ grep -sR "select BR2_PACKAGE_HAS_UDEV" .
./package/eudev/Config.in:      select BR2_PACKAGE_HAS_UDEV
./package/systemd/Config.in:    select BR2_PACKAGE_HAS_UDEV

We’ll stick with eudev and avoid importing the whole of systemd in our root filesystem. To do so, we tell Buildroot to use eudev for /dev management in the “System configuration” submenu (the BR2_ROOTFS_DEVICE_CREATION_DYNAMIC_EUDEV symbol, which automatically selects BR2_PACKAGE_EUDEV).

In turn, PipeWire’s build configuration automatically enables some options if specific dependencies are found. That is why the package/pipewire/pipewire.mk file has sections such as:

ifeq ($(BR2_PACKAGE_NCURSES_WCHAR),y)
PIPEWIRE_DEPENDENCIES += ncurses
endif

Then, in PipeWire’s meson.build, we see ncurses_dep = dependency('ncursesw', required : false) and in src/tools/meson.build:

if ncurses_dep.found()
  executable('pw-top',
    'pw-top.c',
    install: true,
    dependencies : [pipewire_dep, ncurses_dep],
  )
endif

That means pw-top will get built if ncursesw is found; for ncurses the trailing w means wide.

In our specific case, two tools that get conditionally built interest us: pw-top and pw-cat (and its aliases pw-play, pw-record, etc.). The first one will help us monitor the state of active nodes (their busy time, time quantum, etc.) and the second one is capable of playing an audio file by creating a PipeWire source node; it’s the equivalent of aplay, arecord, aplaymidi and arecordmidi. We therefore enable BR2_PACKAGE_NCURSES, BR2_PACKAGE_NCURSES_WCHAR and BR2_PACKAGE_LIBSNDFILE.

One last thing: let’s include an audio test file in our root filesystem image, for easy testing later on. We’ll create a root filesystem overlay directory for this:

$ cd $WORK_DIR
# Create an overlay directory with a .WAV example file
$ mkdir -p overlay/root
# This file is available under a CC BY 3.0 license, see:
# https://en.wikipedia.org/wiki/File:Crescendo_example.ogg
$ wget -O example.ogg \
    "https://upload.wikimedia.org/wikipedia/en/6/68/Crescendo_example.ogg"
# aplay only supports the .voc, .wav, .raw or .au formats
$ ffmpeg -i example.ogg overlay/root/example.wav
$ rm example.ogg

# Set BR2_ROOTFS_OVERLAY to "../overlay"
# This can be done through menuconfig as well
$ sed -i 's/BR2_ROOTFS_OVERLAY=""/BR2_ROOTFS_OVERLAY="..\/overlay"/' \
    buildroot/.config

We now have a Buildroot configuration that includes BusyBox for primitive needs, Dropbear as an SSH server, PipeWire and its associated session manager WirePlumber, with automatic /dev management and tools that will help us in our tests (aplay and pw-play for outputting audio and pw-top to get an overview of PipeWire’s state). WirePlumber comes with a tool called wpctl that gets unconditionally built. make can be run in Buildroot’s folder so that both the cross-compiling toolchain and the root filesystem get generated and put into Buildroot’s output folder; see the manual for more information about Buildroot’s output/ directory. The toolchain’s GCC and binutils programs in particular can be accessed in output/host/bin/, all prefixed with arm-linux-.

Linux kernel

As we now have an available toolchain, we can go ahead by fetching, configuring and compiling the kernel:

# Download and extract the Linux kernel
$ export LINUX_VERSION=5.17.1
$ wget "https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-$LINUX_VERSION.tar.xz"
$ tar xf linux-$LINUX_VERSION.tar.xz
$ mv linux-$LINUX_VERSION linux

If we compiled the kernel as is, the build system wouldn’t know what our target architecture is or which toolchain to use (it would use whatever can be found through our $PATH environment variable, which is most probably not right). We therefore need to inform it using three environment variables:

  • Update the $PATH to add access to the recently-acquired toolchain, the one available in Buildroot’s output/host/bin/;
  • Set the $ARCH to the target’s architecture, that is arm in our case;
  • Set $CROSS_COMPILE to the prefix on our binutils tools, arm-linux- in our scenario.

To avoid forgetting those every time we interact with the kernel’s build system, we’ll use a small script that throws us into a shell with the right variables:

#!/bin/sh
# Make sure $WORK_DIR is absolute
export WORK_DIR=$(dirname $(realpath $0))
export PATH="$WORK_DIR/buildroot/output/host/bin:$PATH"
export ARCH=arm
export CROSS_COMPILE=arm-linux-

This script will be called kernel.sh from now on.

We can now configure our kernel, using the SAMA5 defconfig as groundwork:

$ source kernel.sh
$ cd linux

$ make sama5_defconfig
$ make menuconfig
  • In “General setup”:
    • Set “Kernel compression mode” to “LZO” (optional, CONFIG_KERNEL_LZO symbol);
    • Set “Preemption model” to “Preemptible kernel” for slightly better latencies (optional, CONFIG_PREEMPT symbol); if low-latency audio is necessary, the PREEMPT_RT patch is probably the first step, along with many other configuration tweaks; Bootlin’s PREEMPT_RT training might be of use;
  • Enable the CONFIG_SND_USB_AUDIO option, for support of USB sound cards in ALSA.

It’s time for compilation using make, without forgetting the -jN option to allow N simultaneous jobs.

Booting our board

We can now boot the kernel on our SAMA5D3 Xplained board. On the host side, that requires prepping a TFTP server with both the kernel image and the device tree binary, as well as an NFS server (using the Linux kernel NFS server) for the root filesystem:

# Export the kernel image and device tree binary to the TFTP's
# root folder
$ sudo cp \
    linux/arch/arm/boot/{zImage,dts/at91-sama5d3_xplained.dtb} \
    /var/lib/tftpboot

# Create the root filesystem folder
$ mkdir rootfs
# Extract it from Buildroot's output
$ tar xf buildroot/output/images/rootfs.tar -C rootfs
# Allow read/write access to IP 192.168.0.100
$ echo "$WORK_DIR/rootfs 192.168.0.100(rw,no_root_squash,no_subtree_check)" \
    | sudo tee -a /etc/exports
# Tell the NFS server about our changes to /etc/exports
$ sudo exportfs -a

Do not forget to configure your host’s network interface to use a static IP and routing table, with a command such as the following:

nmcli con add type ethernet ifname $DEVICE_NAME ip4 192.168.0.1/24

On the target side, we configure U-Boot’s network stack, boot command and boot arguments.

# Connect to the board using a serial adapter
$ picocom -b 115200 /dev/ttyUSB0

# In U-Boot's command line interface:

=> env default -a
=> env set ipaddr 192.168.0.100
=> env set serverip 192.168.0.1
=> env set ethaddr 00:01:02:03:04:05
=> env set bootcmd "tftp 0x21000000 zImage ;
        tftp 0x22000000 at91-sama5d3_xplained.dtb ;
        bootz 0x21000000 - 0x22000000"
=> # $WORK_DIR has to be substituted manually
=> env set bootargs "console=ttyS0 root=/dev/nfs
        nfsroot=192.168.0.1:$WORK_DIR/rootfs,nfsvers=3,tcp
        ip=192.168.0.100:::::eth0 rw"
=> env save
=> boot

Outputting audio

That leads to a successful kernel boot! Once connected through SSH we can start outputting sound, first using ALSA directly:

# The password comes from BR2_TARGET_GENERIC_ROOT_PASSWD
$ ssh root@192.168.0.100

$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: H340 [Logitech USB Headset H340], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

$ cd /root
$ aplay example.wav
Playing WAVE 'example.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Mono

It’s time to start fiddling with PipeWire. The current Buildroot packaging for PipeWire and WirePlumber does not provide init scripts for the BusyBox init system; it only provides systemd service and socket units, for systems that use systemd. We’ll have to start them both manually. Naively running pipewire won’t work but it will make the issue explicit:

$ pipewire
[W][00120.281504] pw.context   | [       context.c:  353 pw_context_new()] 0x447970: can't load dbus library: support/libspa-dbus
[E][00120.313251] pw.module    | [   impl-module.c:  276 pw_context_load_module()] No module "libpipewire-module-rt" was found
[E][00120.318522] mod.protocol-native | [module-protocol-:  565 init_socket_name()] server 0x460aa8: name pipewire-0 is not an absolute path and no runtime dir found. Set one of PIPEWIRE_RUNTIME_DIR, XDG_RUNTIME_DIR or USERPROFILE in the environment
[E][00120.320760] pw.conf      | [          conf.c:  559 load_module()] 0x447970: could not load mandatory module "libpipewire-module-protocol-native": No such file or directory
[E][00120.322600] pw.conf      | [          conf.c:  646 create_object()] can't find factory spa-node-factory

The daemon, during startup, tries to create the UNIX socket that will be used by clients to communicate with it; its default name is pipewire-0. However, without specific environment variables, PipeWire does not know where to put it. The fix is therefore to invoke pipewire with the XDG_RUNTIME_DIR variable set:

$ XDG_RUNTIME_DIR=/run pipewire
[W][03032.468669] pw.context   | [       context.c:  353 pw_context_new()] 0x507978: can't load dbus library: support/libspa-dbus
[E][03032.504804] pw.module    | [   impl-module.c:  276 pw_context_load_module()] No module "libpipewire-module-rt" was found
[E][03032.530877] pw.module    | [   impl-module.c:  276 pw_context_load_module()] No module "libpipewire-module-portal" was found

Some warnings and errors are still logged, but they do not prevent PipeWire from running:

  • The first line is to be expected, as we compiled PipeWire without D-Bus support.
  • The second one is because the default configuration loads a PipeWire module that makes the daemon process realtime using setpriority(2) and its threads using pthread_setschedparam(3) with SCHED_FIFO. This module, until recently, was not getting compiled if D-Bus support wasn’t available, as it had a fallback upon RTKit (D-Bus RPC to ask for augmented process priority, used to avoid giving the privileges to every process). This is fixed in newer versions as the module is now being compiled without the RTKit fallback if D-Bus is not available, but the stable Buildroot version we are using is packaging an older version of PipeWire.
  • The third one refers to portal as in xdg-desktop-portal, a D-Bus based interface to expose various APIs to Flatpak applications. This does not matter to us for an embedded use.

The PipeWire daemon’s default configuration can be overridden to remove those warnings: support.dbus in context.properties controls the loading of the D-Bus library, and modules to be loaded are declared in context.modules. The default configuration is located at /usr/share/pipewire/pipewire.conf and a good way to override it is to create a file with the same name in /etc/pipewire.

Tip: PipeWire’s logging is controlled using the PIPEWIRE_DEBUG environment variable, as described in the documentation.

We can therefore use various PipeWire clients and connect to the daemon: XDG_RUNTIME_DIR=/run pw-top should display both the dummy and freewheel drivers doing nothing, and XDG_RUNTIME_DIR=/run pw-dump gives us a JSON of the list of objects in PipeWire’s global registry.

The reason we do not see our ALSA PCM device is that PipeWire is not responsible for monitoring /dev and adding new nodes to the graph; that is our session manager’s responsibility. WirePlumber’s configuration needs to be updated from the default to avoid it crashing because of the lack of a few optional dependencies. To update it, the recommended way is the same as for PipeWire: by overriding the configuration file with one located in /etc/wireplumber. Here are the issues with a default config:

  • It expects the SPA bluez library which has as unconditional dependencies libm, dbus, sbc and bluez. It therefore does not get built and cannot be found at runtime by WirePlumber. wireplumber.conf has a { name = bluetooth.lua, type = config/lua } component, which should be commented out to disable Bluetooth support.
  • v4l2 support, through the SPA v4l2 library, has not been built. This can be enabled using the BR2_PACKAGE_PIPEWIRE_V4L2 flag. Disabling the v4l2 monitor instead requires not calling v4l2_monitor.enable(): that call needs to be commented out in /usr/share/wireplumber/main.lua.d/90-enable-all.lua (Lua’s comments start with two dashes).
  • The ALSA monitor tries to reserve ALSA devices using the org.freedesktop.ReserveDevice1 D-Bus-based protocol.
  • Similarly to PipeWire’s libpipewire-module-portal, WirePlumber has support for Flatpak’s portal, which needs to be disabled as it relies on D-Bus.

The last two issues can be solved by using the following Lua configuration script, in /etc/wireplumber/main.lua.d/90-disable-dbus.lua:

alsa_monitor.properties["alsa.reserve"] = false
default_access.properties["enable-flatpak-portal"] = false

Once all that is done, WirePlumber’s daemon keeps running and successfully connects to PipeWire:

$ XDG_RUNTIME_DIR=/run wireplumber
M 03:05:17.904989                 pw ../src/pipewire/context.c:353:pw_context_new: 0x4f21d8: can't load dbus library: support/libspa-dbus

The remaining warning can be removed by setting support.dbus = false in the context.properties section of WirePlumber’s primary configuration.

Tip: those modifications can be added to our filesystem overlay for persistence across rebuilds of our root filesystem image.

That’s it! WirePlumber now has detected our ALSA sink and source, adding them as nodes to the PipeWire graph. It will detect source nodes that we add to the graph and will link them to the ALSA sink node, outputting audio for our ears to enjoy.

pw-dot, called without any argument, will generate a pw.dot file that represents the active nodes, their ports and the links in the current graph. A .dot file is a textual description of a graph which can be rendered graphically using a tool from the Graphviz project. It is simpler to install Graphviz on your host PC, using your favorite package manager, and copy the pw.dot file from the target to the host (a simple local copy as we are using an NFS root filesystem). An SVG file can then be generated as such:

dot -Tsvg pw.dot > pw.svg

Here is what the graph looks like when audio is being outputted using a single source:

PipeWire graph generated by pw-dot

Conclusion

We have managed to create a rather bare image, with WirePlumber monitoring ALSA devices and adding them as devices and nodes to the PipeWire graph. WirePlumber automatically creates links between source nodes and the default sink node, which means that audio is outputted.

The next step is to create our own custom PipeWire source node. We’ll be able to use the PipeWire API through libpipewire and see what information and capabilities it exposes relative to the overall graph.

An introduction to PipeWire

This blog post is the first part of a series of 3 articles related to the PipeWire project and its usage in embedded Linux systems.

Introduction

PipeWire is a graph-based processing engine that focuses on handling multimedia data (audio, video and MIDI mainly).

It gained traction early on by allowing screen sharing on Wayland desktops: Wayland, for security reasons, does not allow an application to access any framebuffer that does not concern it. The PipeWire daemon was run with sufficient privileges to access screen data and gave requesting applications access through a D-Bus service, with file-descriptor passing for the actual video transfer. As such, it was bundled in the Fedora distribution starting with version 27.

Later on, the idea was to expand this to also allow handling audio streams in the processing graph. Great progress has been made by Wim Taymans on this front, and PipeWire has been the default sound server of the desktop Fedora distribution since version 34.

The project is currently in active development. It happens in the open, led by Wim Taymans. The API and ABI can both be considered stable, even though version 1.0 has not been released yet. The changelog exposes very few breaking changes (two years without one) and many bug fixes. It is developed in C, using a Meson and Ninja based build system. It has very few unconditional runtime dependencies, but we’ll go through those during our first install.

Throughout this series of blog articles, our goal will be to discover PipeWire and the possibilities it provides, focusing upon audio usage on embedded platforms. A detailed theoretical overview at the start will allow us to follow up with a hands-on approach. Starting with a minimal Buildroot setup on a Microchip SAMA5D3 Xplained board, we will then create our own custom PipeWire source node. We will then study how dynamic, low-latency routing can be done. We’ll end with experiments regarding audio-over-ethernet.

A note: we will start with many theoretical aspects that are useful to get a good mental model of the way PipeWire works and how it can be used to implement any wanted behavior. This introduction might therefore get a little exhaustive at times, and it could be a good approach to skip ahead even if a concept isn’t fully grasped, and to come back later during the hands-on parts when details on a specific subject are required.

Sky-high overview

A PipeWire graph is composed of nodes. Each node receives data through an arbitrary number of input ports, does some processing over this multimedia data, and sends data out of its output ports. The edges in the graph are here called links. They are capable of connecting an output port to an input port.

Nodes can have an arbitrary number of ports. A node with only output ports is often called a source, and a sink is a node that only possesses input ports. For example, a stereo ALSA PCM playback device can be seen as a sink with two input ports: front-left and front-right.

Here is a visual representation of a PipeWire graph instance, provided by the Helvum GTK patchbay:

Screenshot provided by the Helvum project

Visual attributes are used in Helvum to describe the state of nodes, ports and links:

  • Node names are in white, with their ports being underneath the names. Input ports are on the left while output ports are on the right.
  • “Dummy-Driver” and “Freewheel-Driver” nodes have no ports. Those two are particular sinks (with dynamic input ports, that appear when we connect a node to them) used in specific conditions by PipeWire.
  • Red means MIDI, yellow means video and blue means audio.
  • Links are solid when active (data is “passing-through” them) and dashed when in a paused state.

Note: if your Linux desktop is running PipeWire, try installing Helvum to graphically monitor and edit your multimedia graph! It is currently packaged on Fedora, Arch Linux, Flathub, crates.io and others.

Design choices

There are a few noticeable design choices that explain why PipeWire is being adopted for desktop and embedded Linux use cases.

Session and policy management

One first design choice was to avoid tackling any management logic directly inside PipeWire; context-dependent behaviour such as monitoring for new ALSA devices, and configuring them so that they appear as nodes, or automatically connecting nodes using links is not handled. It rather provides an API that allows spawning and controlling those graph objects. This API is then relied upon by client processes to control the graph structure, without having to worry about the graph execution process.

A pattern that is often used and is recommended is to have a single client be a daemon that deals with the whole session and policy management. Two implementations are known as of today:

  • pipewire-media-session, which was the first implementation of a session manager. It is now called an example and used mainly in debugging scenarios.
  • WirePlumber, which takes a modular approach: it provides another, higher-level API compared to the PipeWire one, and runs Lua scripts that implement the management logic using the said API. In particular, this session manager has been used in Fedora since version 35. It ships with default scripts and configuration that handle linking policies as well as monitoring and automatic spawning of ALSA, bluez, libcamera and v4l2 devices. The API is available from any process, not only from WirePlumber’s Lua scripts.

Individual node execution

As described above, the PipeWire daemon is responsible for handling the proper processing of the graph (executing nodes in the right order at the right time and forwarding data as described by links) and exposing an API to allow authorized clients to control the graph. Another key point of PipeWire’s design is that the node processing can be done in any Linux process. This has a few implications:

  • The PipeWire daemon is capable of doing some node processing. This can be useful to expose a statically-configured ALSA device to the graph for example.
  • Any authorized process can create a PipeWire node and be responsible for the processing involved (getting some data from input ports and generating data for output ports). A process that wants to play stereo audio from a file could create a node with two output ports.
  • A process can create multiple PipeWire nodes. That allows one to create more complex applications; a browser would for example be able to create a node per tab that requests the ability to play audio, letting the session manager handle the routing: this allows the user to route different tab sources to different sinks. Another example would be an application that requires many inputs.

API and backward compatibility

As we will see later on, PipeWire introduces a new API that allows one to read and write to the graph’s overall state. In particular, it allows one to implement a source and/or sink node that will be handling audio samples (or other multimedia data).

One key point for PipeWire’s quick adoption is a focus on providing shim layers for the audio APIs that are currently widespread in the Linux environment. That is:

    • It can obviously expose ALSA sinks or sources inside the graph. This is at the heart of what makes PipeWire useful: it can interact with local audio hardware. It uses alsa-lib as any other ALSA client. PipeWire is also capable of creating virtual ALSA sinks or sources, to interface with applications that rely solely upon the alsa-lib API.
    • It can implement the PulseAudio API in place of PulseAudio itself. This simply requires starting a second PipeWire daemon, with a specific pulse configuration. Each PulseAudio sink/source will appear in the graph, as if native. PulseAudio is the main API used by Linux desktop users and this feature allows PipeWire to be used as a daily-driver while supporting all standard applications. An anecdote: relying on the PulseAudio API is still recommended for simple audio applications, for its more widespread and simpler API.
    • It also implements the JACK Audio Connection Kit (or JACK); this API has been in use by the pro-audio audience and targets low-latency for audio and MIDI connections between applications. This requires calling JACK-based applications using pw-jack COMMAND, which does the following according to its manual page:

pw-jack modifies the LD_LIBRARY_PATH environment variable so that applications will load PipeWire’s reimplementation of the JACK client libraries instead of JACK’s own libraries. This results in JACK clients being redirected to PipeWire.

Schema illustrating the way PulseAudio and JACK applications are supported

About compatibility with Linux audio standards, the PipeWire FAQ has an interesting answer to the expected question whenever something new appears: why another audio standard, Linux already has 13 of them? For exhaustiveness, here is a quick rundown of the answer: it describes how Linux has one kernel audio subsystem (ALSA) and only two userspace audio servers: PulseAudio and JACK. Others are either frameworks relying on various audio backends, dead projects or wrappers around audio backends. PipeWire’s goal, on the audio side, is to provide an alternative to both PulseAudio and JACK.

Real-time execution: push or pull?

In the simple case of a producer and a consumer of data, two execution models are in theory possible:

  • Push, where the producer generates data when it can into a shared buffer, from which the consumer reads. This is often associated with blocking writes to signal the producer when the buffer is full.
  • Pull, where the producer gets signaled when data is needed for the consumer, at which point the producer should generate data as fast as possible into the given shared buffer.

In a real-time scenario, latency is optimal when the data quantity in the shared buffer is minimised: when the producer adds data to the buffer, all the data already present in the buffer needs to be consumed before the new data gets processed as well. As such, the pull method allows the system to monitor the shared buffer state and signal the producer before the shared buffer gets empty; this guarantees data that is as up-to-date as possible, as it was generated as late as possible.

That was for a generic overview of pushed versus pulled communication models. PipeWire adopts the pull model as it has low latencies as a goal. Some notes:

  • The structure is more complex compared to a single producer and single consumer architecture, as there can be many more producers and consumers, possibly with nodes depending on multiple other nodes.
  • The PipeWire daemon handles the signaling of nodes. Those get woken up, fill a shared memory buffer and pass it on to their target nodes; those are the nodes that take their output as an input (as described by link objects).
  • The concept of driver nodes is introduced; other nodes are called followers. For each component (subgraph of the whole PipeWire graph), one node is the driver and is responsible for timing information. It is the one that signals PipeWire when a new execution cycle is required. For the simple case of an audio source node (the producer) and an ALSA sink node (the consumer), the ALSA sink will send data to the hardware according to a timer, signaling PipeWire to start a new cycle when it has no more data to send: it pulls data from the graph by telling it that it needs more.

Note: in this simple example, the buffer size provided to ALSA by PipeWire determines the time we have to generate new data. If we fail to execute the entire graph in time before the timer, the ALSA sink node will have no data and this will lead to an underrun.
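To make that time budget concrete, here is a small worked sketch. The function name and the example figures (1024 frames at 48 kHz) are illustrative assumptions, not values read from a PipeWire configuration; the only thing it shows is the arithmetic linking buffer size, sample rate and cycle duration.

```c
#include <assert.h>
#include <stdint.h>

/* Duration of one graph cycle, in microseconds: the time the whole graph
 * has to produce `frames` audio frames when the sink consumes `rate`
 * frames per second. The result is rounded down. */
static uint64_t cycle_budget_us(uint32_t frames, uint32_t rate)
{
    assert(rate > 0);
    return (uint64_t)frames * 1000000u / rate;
}
```

With these assumed figures, cycle_budget_us(1024, 48000) gives 21333, that is a bit more than 21 ms to run every node of the graph. Dividing the buffer size by four divides the budget (and the latency added by this buffer) by four as well.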

Implementation overview

This introduction and the big design decisions naturally lead us to have a look at the actual implementation concepts. Here are the questions we will try to answer:

  • How is the graph state represented?
  • How can a client process get access to the graph state and make changes?
  • How is IPC communication handled?

Graph state representation: objects, objects everywhere

As said previously, PipeWire’s goal is to maintain, execute and expose a graph-structured multimedia execution engine. The graph state is maintained by the PipeWire daemon, which runs the core object. A fundamental principle is the concept of an object. Clients communicate with the core using IPC, and can create objects of various types, which can then be exported. Exporting an object means telling the core and its registry about it, so that the object becomes a part of the graph state.

Every object has at least the following: a unique integer identifier, some permissions flags for various operations, an object type, string key-value pairs of properties, methods and event types.

Object types

There is a fixed type list, so let’s go through the main existing types to understand the overall structure better:

  • The core is the heart of the PipeWire daemon. There can only be one core per graph instance and it has the identifier zero. It maintains the registry, which has the list of exported objects.
  • A client object is the representation of an open connection with a client process, from within the daemon process.
  • A module is a shared object that is used to add functionality to a PipeWire client. It has an initialisation function that gets called when the module gets loaded. Modules can be loaded in the core process or in any client process. Clients do not export to the registry the modules they load. We’ll see examples of modules and how to load them later on.
  • A node is a producer and/or consumer of data; its main characteristic is to have input and output port objects, which can be connected using link objects to create the graph structure.
  • A port belongs to a node and represents an input or output of data. As such, it has a direction, a data format and can have a channel position if it is audio data that is being transferred.
  • A link object connects two ports of opposite direction together; it describes a graph edge.
  • A device is a handle representing an underlying API, which is then used to create nodes or other devices. Examples of devices are ALSA PCM cards or V4L2 devices. A device has a profile, which allows one to configure it.
  • A factory is an object whose sole capability is to create other objects. Once a factory is created, it can only emit the type of object it declared. Those are most often delivered as a module: the module creates the factory and stays alive to keep it accessible for clients.
  • A session object is supposed to represent the session manager, and allow it to expose APIs through the PipeWire communication methods. It is not currently used by WirePlumber but this is planned.
  • An endpoint is the concept of a (possibly empty) grouping of nodes. Associated with endpoint streams and links, they can represent a higher-level graph that is handled by the session manager. Those would allow modeling complex behaviors such as mutually-exclusive sinks (think laptop speakers and line-out port) or nodes to which PipeWire cannot send audio streams, such as analog peripherals for which the streams do not go through the CPU. Those peripherals would therefore appear in the graph, be controlled with the same API (routing using links, setting volume, muting, etc.) but the processing would be done outside PipeWire’s reach. See PipeWire’s documentation for more information on the potential of those advanced features.

Permissions

The session and policy manager (most often WirePlumber) is also responsible for defining the list of permissions each client has. Each permission entry is an object ID and four flags. A special PW_ID_ANY ID means that those permissions are the default, to be used if a specific object is not described by any other permission. Here are the four flags:

  • Read: the object can be seen and events can be received;
  • Write: the object can be modified, usually through methods (which requires the execute flag);
  • eXecute: methods can be called;
  • Metadata: metadata can be set on the object.

This is not leveraged much yet, as all clients get default permissions of rwxm: read, write, execute, metadata.
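The lookup rule described above (a specific entry wins, otherwise the default entry applies) can be sketched as follows. This is an illustration only: the PERM_* and ID_ANY names, their numeric values and the struct layout are assumptions made for this sketch and are not copied from the PipeWire headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define PERM_R 0x1u        /* read */
#define PERM_W 0x2u        /* write */
#define PERM_X 0x4u        /* execute */
#define PERM_M 0x8u        /* metadata */
#define ID_ANY 0xffffffffu /* stands in for the "default" entry */

struct permission {
    uint32_t id;    /* object ID, or ID_ANY */
    uint32_t flags; /* combination of PERM_* */
};

/* Return the flags that apply to object `id`: an entry for that exact ID
 * wins, otherwise the ID_ANY entry is used, otherwise nothing is allowed. */
static uint32_t effective_permissions(const struct permission *perms,
                                      size_t n, uint32_t id)
{
    uint32_t fallback = 0;

    assert(perms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (perms[i].id == id)
            return perms[i].flags;
        if (perms[i].id == ID_ANY)
            fallback = perms[i].flags;
    }
    return fallback;
}

/* Modifying an object goes through methods, so it needs both W and X. */
static int can_write(uint32_t flags)
{
    return (flags & (PERM_W | PERM_X)) == (PERM_W | PERM_X);
}
```

A client given a default entry of rwxm plus a read-only entry for object 42 would therefore be able to see object 42 but not modify it.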

Properties

All objects also have properties attributed to them, which is a list of string key-value pairs. Those are arbitrary and various keys are expected for various object types. An example link object has the following properties (as reported by pw-cli dump LINK_ID):

# Link ID
object.id = "95"

# Source port
link.output.node = "91"
link.output.port = "93"

# Destination port
link.input.node = "80"
link.input.port = "86"

# Client that created the link
client.id = "32"

# Factory that was called to create the link
factory.id = "20"

# Serial identifier: an incremental identifier that guarantees no
# duplicate across a single instance. That exists because standard
# IDs get reused to keep them user-friendly.
object.serial = "677"
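Because every value is a string, a client that wants to follow a link to its ports has to parse those identifiers itself. The sketch below shows that idea on a plain array of key-value pairs; the struct prop type and both helper functions are hypothetical names made up for this example, not part of libpipewire.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One string key-value pair, as found in an object's properties. */
struct prop {
    const char *key;
    const char *value;
};

/* Return the value stored under `key`, or NULL if the key is absent. */
static const char *prop_get(const struct prop *props, size_t n,
                            const char *key)
{
    assert(key != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(props[i].key, key) == 0)
            return props[i].value;
    return NULL;
}

/* Parse the property `key` as an unsigned 32-bit ID.
 * Returns 0 on success, -1 if the key is missing or is not a number. */
static int prop_get_u32(const struct prop *props, size_t n,
                        const char *key, uint32_t *out)
{
    const char *s = prop_get(props, n, key);
    char *end;
    unsigned long v;

    /* Require a leading digit so that signs and blanks are rejected. */
    if (s == NULL || s[0] < '0' || s[0] > '9')
        return -1;
    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT32_MAX)
        return -1;
    *out = (uint32_t)v;
    return 0;
}
```

Applied to the link shown above, looking up link.output.port would yield the integer 93, which is the ID of the port object on the source side.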

Parameters

Some object types also have parameters (often abbreviated as params): a fixed-length list of parameters that the object possesses, specific to the object type. Currently, nodes, ports, devices, sessions, endpoints and endpoint streams have those. Those params have flags that define if they can be read and/or written, allowing things like constant parameters defined at the object creation.

Parameters are what allows WirePlumber to negotiate data formats and port configuration with nodes: does the hardware support multiple sample rates? Which channel count and positions? Which sample format? Should monitor ports be enabled? Nodes expose enumerations of what they are capable of, and the session manager writes the format/configuration it chose.

Methods & events

An object’s implementation is defined by its list of methods. Each object type has a list of methods that it needs to implement. One noteworthy method is process, which can be found on nodes. It is the one that eats up data from input ports and provides data for each output port.

Every object implements at least the add_listener method, which allows any client to register event listeners. Events are used throughout the PipeWire API to expose information about an object that might change over time (the state of a node for example).

Exposing the graph to clients: libpipewire and its configuration

Once an object is created in a process, it can be exported to the core’s registry so that it becomes a part of the graph. Once exported, an object is exposed and can be accessed by other clients; this leads us into this new section: how clients can get access to and interact with the graph.

The easiest way to interact with a PipeWire instance is to rely upon the libpipewire shared object library. It is a C library that allows one to connect to the core. The connection steps are as follows:

  1. Initialise the library using pw_init, whose main goal is to set up logging.
  2. Create an event-loop instance, of which PipeWire provides multiple implementations. The library will later plug into this event-loop to register event listeners when requested.
  3. Create a PipeWire context instance using pw_context_new. The context will handle the communication process with PipeWire, adding what it needs to the event-loop. It will also find and parse a configuration file from the filesystem.
  4. Connect the context to the core daemon using pw_context_connect. This does two things: it initialises the communication method and it returns a proxy to the core object.

Proxies

A proxy is an important concept. It gives the client a handle to interact with a PipeWire object which is located elsewhere but which has been registered in the core’s registry. This allows one to get information about this specific object, modify it and register event listeners.

Event listeners are therefore callbacks that clients can register on proxy objects using pw_*_add_listener, which takes a struct pw_*_events defining a list of function pointers; the star should be replaced by the object type. The libpipewire library will tell the remote object about this new listener, so that it notifies the client when a new event occurs.

We’ll take an example to describe the concept of proxies:

Schema of a daemon and two clients, with one client having a proxy pointing to the remote node

In this schema, green blocks are objects (the core, clients and a node) and grey ones are proxies. Dotted blocks represent processes. Here is what would happen, in order, assuming client process 2 wants to get the state of a node that lives in client process 1:

  1. Client process 2 creates a connection with the core, that means:
    • On the daemon side, a client object is created and exported to the registry;
    • On the client side, a proxy to the core object is acquired, which represents the connection with the core.
  2. It then uses the proxy to core and the pw_core_get_registry function to get a handle on the registry.
  3. It registers an event listener on the registry’s global event, by passing a struct pw_registry_events to pw_registry_add_listener. That event listener will get called once for each object exported to the registry.
  4. The global event handler will therefore get called once with the node as argument. When this happens, a proxy to the node can be obtained using pw_registry_bind and the info event can be listened upon using pw_node_add_listener on the node proxy with a struct pw_node_events containing the list of function pointers used as event handlers.
  5. The info event handler will therefore be called once with a struct pw_node_info argument, that contains the node’s state. It will then be called each time the state changes.

The same thing is done in tutorial6.c to print every client’s information.

Context configuration

When a PipeWire context is created using pw_context_new, we mentioned that it finds and parses a configuration file from the filesystem. To find a configuration file, PipeWire requires its name. It then searches for this file in the following locations, $sysconfdir and $datadir being PipeWire build variables:

  1. Firstly, it checks in $XDG_CONFIG_HOME/pipewire/ (most probably ~/.config/pipewire/);
  2. Then, it looks in $sysconfdir/pipewire/ (most probably /etc/pipewire/);
  3. As a last resort, it tries $datadir/pipewire/ (most probably /usr/share/pipewire/).
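The lookup order above can be sketched in a few lines of C. This is not PipeWire’s actual implementation: the function names are hypothetical, SYSCONFDIR and DATADIR are hard-coded to the "most probable" values quoted above, and the fallback to $HOME/.config when XDG_CONFIG_HOME is unset is an assumption taken from the usual XDG convention.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed build-time values; a real build may use other directories. */
#define SYSCONFDIR "/etc"
#define DATADIR    "/usr/share"

/* Write the idx-th candidate path (0, 1 or 2) for the configuration file
 * `name` into `out`. Returns 0 on success, -1 if there is no such
 * candidate or if the buffer is too small. */
static int config_candidate(int idx, const char *name, char *out, size_t len)
{
    const char *xdg, *home;
    int n;

    assert(name != NULL && out != NULL);
    switch (idx) {
    case 0: /* personal configuration */
        xdg = getenv("XDG_CONFIG_HOME");
        home = getenv("HOME");
        if (xdg != NULL && xdg[0] != '\0')
            n = snprintf(out, len, "%s/pipewire/%s", xdg, name);
        else if (home != NULL && home[0] != '\0')
            n = snprintf(out, len, "%s/.config/pipewire/%s", home, name);
        else
            return -1;
        break;
    case 1: /* global configuration */
        n = snprintf(out, len, "%s/pipewire/%s", SYSCONFDIR, name);
        break;
    case 2: /* defaults shipped with the package */
        n = snprintf(out, len, "%s/pipewire/%s", DATADIR, name);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Store in `out` the first candidate that exists and is readable.
 * Returns 0 when one is found, -1 otherwise. */
static int find_config(const char *name, char *out, size_t len)
{
    for (int i = 0; i < 3; i++)
        if (config_candidate(i, name, out, len) == 0 &&
            access(out, R_OK) == 0)
            return 0;
    return -1;
}
```

The first readable file wins, which is why dropping a file in /etc/pipewire/ is enough to shadow the one shipped in /usr/share/pipewire/.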

PipeWire ships with default configuration files, which are often put in the $datadir/pipewire/ path by distributions, meaning those get used as long as they have not been overridden by custom global configuration files (in $sysconfdir/pipewire/) or personal configuration files (in $XDG_CONFIG_HOME/pipewire/). Those are namely:

  • pipewire.conf, the daemon’s configuration file;
  • pipewire-pulse.conf, for the daemon process that implements the PulseAudio API;
  • client.conf, for processes that want to communicate using the PipeWire API;
  • client-rt.conf, for processes that want to implement node processing, RT meaning realtime;
  • jack.conf, used by the PipeWire implementation of the JACK shared object library;
  • minimal.conf, meant as an example for those that want to run PipeWire without a session manager (static configuration of an ALSA device, nodes and links).

The default configuration name used by a context is client.conf. This can be overridden either through the PIPEWIRE_CONFIG_NAME environment variable or through the PW_KEY_CONFIG_NAME property, given as an argument to pw_context_new. The search path can also be modified using the PIPEWIRE_CONFIG_PREFIX environment variable.

Make sure to go through one of them to get familiar with them! The format is described as a “relaxed JSON variant”, where strings do not need to be quoted, the key-value separator is an equal symbol, commas are unnecessary and comments are allowed, starting with a hash mark. Here are the sections that can be found in a configuration file:

  • context.properties, that configures the context (log level, memory locking, D-Bus support, etc.). It is also used extensively by pipewire.conf (the daemon’s configuration) to configure the graph default and allowed settings.
  • context.spa-libs defines the shared object library that should be used when a SPA factory is asked for. The default values are best left alone.
  • context.modules lists the PipeWire modules that should be loaded. Each entry has an associated comment that explains clearly what each module does. As an example, the difference between client.conf and client-rt.conf is the loading of libpipewire-module-rt that turns on real-time priorities for the process and its threads.
  • context.objects allows one to statically create objects by providing a factory name associated with arguments. This is what is used by the daemon’s pipewire.conf to create the dummy node, or by minimal.conf to statically create an ALSA device and node as well as a static node.
  • context.exec lists programs that will be executed as children of the process (using fork(2) followed by execvp(3)). This was primarily used to start the session manager; it is however recommended to handle its boot separately, using your init system of choice.
  • filter.properties and stream.properties are used in client.conf and client-rt.conf to configure node implementations. Filters and streams are the two abstractions that can be used to implement custom nodes, which we will discuss in detail in a later article.

Inter-Process Communication (IPC)

Being a project that handles multimedia data, transfers it in-between processes and aims for low-latency, the inter-process communication it uses is at the heart of its implementation.

Event loop

The event-loop described previously is the scheduling mechanism for every PipeWire process (the daemon and every PipeWire client process, including WirePlumber, pipewire-pulse and others). This loop is an abstraction layer over the epoll(7) facility. The concept is rather simple: it allows one to monitor multiple file descriptors with a single blocking call, that will return once one file descriptor is available for an operation.

The main entry point to this event loop is pw_loop_add_source or its wrapper pw_loop_add_io, which adds a new file descriptor to be listened for and a callback to take action once an operation is possible. In addition to the loop instance, the file descriptor and the callback, it takes the following arguments:

  • A mask describing the operations for which we should be woken up: read(2) is possible (SPA_IO_IN), write(2) is possible (SPA_IO_OUT), an error occurred (SPA_IO_ERR) and a hang-up occurred (SPA_IO_HUP);
  • A boolean describing whether the file descriptor should be closed automatically at the end or not;
  • A void pointer given to the callback; this is often called user data which means we can avoid static global variables.

Note: this event loop implementation is not reserved to PipeWire-related processing; it can be used as a main event loop in your processes.

That leads us to the other synchronisation and communication primitives used, which are all file-descriptor-based for integration with the event loop.

File-descriptor-based IPC

eventfd(2) is used as the main wake-up method when that is required, such as with node objects that must run their process method. signalfd(2) is used to register signal callbacks in the event-loop.
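To illustrate how these primitives fit together, here is a minimal sketch that uses the raw Linux calls directly rather than the pw_loop API: an eventfd is registered in an epoll instance together with a callback and its user data, the eventfd is signaled, and one loop iteration dispatches the callback. The struct source type and the function names are made up for this example and do not mirror PipeWire’s internal structures.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* A registered source: a file descriptor, the callback to run when it is
 * ready, and the user data pointer handed back to that callback. */
struct source {
    int fd;
    void (*func)(void *data, int fd, uint32_t mask);
    void *data;
};

/* Callback: drain the eventfd counter and add it to the caller's total. */
static void on_wakeup(void *data, int fd, uint32_t mask)
{
    uint64_t count = 0;

    assert(data != NULL);
    if ((mask & EPOLLIN) &&
        read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
        *(uint64_t *)data += count;
}

/* Signal an eventfd `signals` times, run a single loop iteration and
 * return the count seen by the callback, or -1 on error. */
static int64_t wake_and_dispatch(unsigned int signals)
{
    uint64_t total = 0, one = 1;
    struct source src = { .fd = -1, .func = on_wakeup, .data = &total };
    struct epoll_event ev = { .events = EPOLLIN }, fired;
    int64_t res = -1;
    int n, epfd;

    epfd = epoll_create1(EPOLL_CLOEXEC);
    src.fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (epfd < 0 || src.fd < 0)
        goto out;

    /* Register the source; epoll hands the pointer back when it fires. */
    ev.data.ptr = &src;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, src.fd, &ev) < 0)
        goto out;

    /* Each write adds one to the eventfd counter (the "wake-up"). */
    for (unsigned int i = 0; i < signals; i++)
        if (write(src.fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
            goto out;

    /* The single blocking call; 100 ms timeout so the sketch cannot hang. */
    n = epoll_wait(epfd, &fired, 1, 100);
    if (n < 0)
        goto out;
    if (n == 1) {
        struct source *s = fired.data.ptr;
        s->func(s->data, s->fd, fired.events);
    }
    res = (int64_t)total;
out:
    if (src.fd >= 0)
        close(src.fd);
    if (epfd >= 0)
        close(epfd);
    return res;
}
```

Note that several signals sent before the loop runs are coalesced into a single wake-up: the callback runs once and reads the accumulated counter. Passing &total as user data is what lets the callback report its result without any global variable.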

epoll(7), eventfd(2) and signalfd(2) being Linux-specific, it should be noted that there is an abstraction layer that allows one to use other primitives for implementations. Currently, Xenomai primitives are supported through this layer.

The main communication protocol is based upon a local streaming socket(2): socket(PF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0). The encoding scheme used is called Plain Object Data (POD) and is a rather simple format: a POD has a 32-bit size and a 32-bit type, followed by the content. There are basic types (none, bool, int, string, bytes, etc.) and container types (array, struct, object and sequence). On top of this encoding scheme, the Simple Plugin API (SPA) is provided, which implements a sort of Remote Procedure Call (RPC). See the PipeWire under the hood blog article, which has a detailed section on POD, SPA and example usage of the provided APIs.
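As an illustration of the POD layout, the sketch below hand-encodes a single integer and sends it across a local stream socket pair. Several points are assumptions rather than facts taken from this article: the type value 4 is assumed to match SPA_TYPE_Int, the body is assumed to be padded to an 8-byte boundary, and the values are written in host byte order. All function names are hypothetical, the sockets are left blocking for simplicity, and the actual PipeWire message framing is not reproduced. Real code should use the spa_pod builder and parser helpers instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define POD_TYPE_INT 4u  /* assumption: matches SPA_TYPE_Int */
#define POD_INT_LEN  16  /* 8-byte header + 4-byte body + 4-byte padding */

/* Encode value as a POD: 32-bit size, 32-bit type, then the body,
 * padded to 8 bytes (assumed alignment). Returns the bytes written. */
size_t pod_encode_int(uint8_t out[POD_INT_LEN], int32_t value)
{
    uint32_t size = sizeof(value);
    uint32_t type = POD_TYPE_INT;
    memset(out, 0, POD_INT_LEN);
    memcpy(out, &size, 4);
    memcpy(out + 4, &type, 4);
    memcpy(out + 8, &value, 4);
    return POD_INT_LEN;
}

/* Decode an int POD; returns -1 if the header does not describe one. */
int pod_decode_int(const uint8_t in[POD_INT_LEN], int32_t *value)
{
    uint32_t size, type;
    memcpy(&size, in, 4);
    memcpy(&type, in + 4, 4);
    if (size != sizeof(*value) || type != POD_TYPE_INT)
        return -1;
    memcpy(value, in + 8, 4);
    return 0;
}

/* Send one int POD over a local stream socket pair, decode it on the
 * other end. Returns 0 on success and stores the decoded value. */
int pod_roundtrip(int32_t value, int32_t *result)
{
    int sv[2];
    uint8_t tx[POD_INT_LEN], rx[POD_INT_LEN];
    if (socketpair(PF_LOCAL, SOCK_STREAM | SOCK_CLOEXEC, 0, sv) < 0)
        return -1;
    size_t len = pod_encode_int(tx, value);
    int ok = write(sv[0], tx, len) == (ssize_t)len &&
             read(sv[1], rx, sizeof(rx)) == (ssize_t)sizeof(rx);
    close(sv[0]);
    close(sv[1]);
    return ok ? pod_decode_int(rx, result) : -1;
}
```

The fixed header is what lets a receiver skip over PODs whose type it does not understand: the size field always says how many body bytes follow.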

D-Bus

PipeWire and WirePlumber also optionally depend on the higher-level D-Bus communication protocol for specific features:

  • Flatpaks are sandboxed desktop applications that rely on a portal (a process that exposes D-Bus interfaces) to access system-wide features such as printing and audio. In our case, libpipewire-module-portal allows the portal process to handle audio-related permission management for Flatpak applications. See module-portal.c and xdg-desktop-portal for more information.
  • WirePlumber, through its module-reserve-device, supports the org.freedesktop.ReserveDevice1 D-Bus interface. It allows one to reserve an audio device for exclusive use. See the quick and to-the-point specification about the interface for more information.
  • D-Bus support is required if Bluetooth is wanted, to allow communication with the BlueZ process. See the SPA bluez5 plugin.

Conclusion

Now that the overall concepts as well as design and implementation choices have been covered, it is time for some hands-on! We will carry on with a bare install based upon a Linux kernel and a Buildroot-built root filesystem image. Our goal will be to output sound from an audio file to a USB ALSA PCM sink.

Do not hesitate to come back to this article later on; it might help you clear up some blurry concepts if needed!