Testing audio: the beauty of sine-waves

As part of a recent project involving advanced sound cards, Bootlin engineer Miquèl Raynal had to find a way to automate audio hardware loopback testing. In hand, he had a PCI audio device with many external interfaces, each of them featuring an XLR connector. The connectors were wired to analog and digital inputs and outputs. In a company of sound engineers, playing loud music through amplifiers and speakers is probably the norm, but to keep his colleagues' ears from bleeding during his ALSA/DMA debug sessions, he decided to anticipate the problem and spare himself any whining from nearby desks.

Configuring ALSA controls from an application

A common task when handling audio on Linux is modifying the configuration of the sound card, for example adjusting the output volume or selecting the capture channels. On an embedded system, it can be enough to set the controls once using alsamixer or amixer, and then save the configuration with alsactl store. This saves the driver state to a configuration file, by default /var/lib/alsa/asound.state. Once done, this file can be included in the build system and shipped with the root filesystem. Usual distributions already include a script that invokes alsactl at boot time to restore the settings. If that is not the case, it is simply a matter of calling alsactl restore.

However, defining a static configuration may not be enough. For example, some codecs have advanced routing features that allow routing the audio channels to different outputs, and the application may want to decide at runtime where the audio goes.

Instead of invoking amixer using system(3), it is possible, although not straightforward, to use the alsa-lib API directly to set controls.

Let’s start with some required includes:

#include <stdio.h>
#include <alsa/asoundlib.h>

alsa/asoundlib.h is the header of interest here, as it is where the ALSA API lies. Then we define an id lookup function, which is actually the tricky part: each control has a unique identifier, and to be able to manipulate controls, it is necessary to find this unique identifier. In our sample application, we will use the control name to do the lookup.

int lookup_id(snd_ctl_elem_id_t *id, snd_ctl_t *handle)
{
	int err;
	snd_ctl_elem_info_t *info;
	snd_ctl_elem_info_alloca(&info);

	snd_ctl_elem_info_set_id(info, id);
	if ((err = snd_ctl_elem_info(handle, info)) < 0) {
		fprintf(stderr, "Cannot find the given element from card\n");
		return err;
	}
	snd_ctl_elem_info_get_id(info, id);

	return 0;
}

This function allocates a snd_ctl_elem_info_t and sets its id to the one passed as the first argument. At this point, the id only includes the control interface type and its name, not its unique numeric identifier. The snd_ctl_elem_info() function then looks up the element on the sound card whose handle was passed as the second argument, and snd_ctl_elem_info_get_id() retrieves the now completely filled-in id.

Then the controls can be modified as follows:

int main(int argc, char *argv[])
{
	int err;
	snd_ctl_t *handle;
	snd_ctl_elem_id_t *id;
	snd_ctl_elem_value_t *value;
	snd_ctl_elem_id_alloca(&id);
	snd_ctl_elem_value_alloca(&value);

This declares and allocates the necessary variables. Allocations are done using alloca, so it is not necessary to free them: they are released automatically when the function returns.

	if ((err = snd_ctl_open(&handle, "hw:0", 0)) < 0) {
		fprintf(stderr, "Card open error: %s\n", snd_strerror(err));
		return err;
	}

This gets a handle on the sound card, in this case hw:0, which is the first sound card in the system.

	snd_ctl_elem_id_set_interface(id, SND_CTL_ELEM_IFACE_MIXER);
	snd_ctl_elem_id_set_name(id, "Headphone Playback Volume");
	if ((err = lookup_id(id, handle)))
		return err;

This sets the interface type and name of the control we want to modify, and then calls the lookup function.

	snd_ctl_elem_value_set_id(value, id);
	snd_ctl_elem_value_set_integer(value, 0, 55);
	snd_ctl_elem_value_set_integer(value, 1, 77);

	if ((err = snd_ctl_elem_write(handle, value)) < 0) {
		fprintf(stderr, "Control element write error: %s\n",
			snd_strerror(err));
		return err;
	}

Now, this changes the value of the control. snd_ctl_elem_value_set_id() sets the id of the control to be changed, then snd_ctl_elem_value_set_integer() sets the actual value. It is called once per member because this control has multiple members (in this case, the left and right channels). Finally, snd_ctl_elem_write() commits the value.

Note that snd_ctl_elem_value_set_integer() is called directly because we know this control is an integer, but it is possible to query the kind of value a control takes by calling snd_ctl_elem_info_get_type() on the snd_ctl_elem_info_t. The scale of the integer is also device specific and can be retrieved with the snd_ctl_elem_info_get_min(), snd_ctl_elem_info_get_max() and snd_ctl_elem_info_get_step() helpers.
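
For example, a hedged sketch of such a query, reusing the lookup pattern from above, could look like this (the helper name query_range() is made up for illustration):

int query_range(snd_ctl_t *handle, snd_ctl_elem_id_t *id)
{
	int err;
	snd_ctl_elem_info_t *info;
	snd_ctl_elem_info_alloca(&info);

	/* Fill the info structure for the given control id */
	snd_ctl_elem_info_set_id(info, id);
	if ((err = snd_ctl_elem_info(handle, info)) < 0)
		return err;

	/* Only integer controls have a min/max/step scale */
	if (snd_ctl_elem_info_get_type(info) == SND_CTL_ELEM_TYPE_INTEGER)
		printf("min=%ld max=%ld step=%ld\n",
		       snd_ctl_elem_info_get_min(info),
		       snd_ctl_elem_info_get_max(info),
		       snd_ctl_elem_info_get_step(info));

	return 0;
}

Back to our example: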

	snd_ctl_elem_id_clear(id);
	snd_ctl_elem_id_set_interface(id, SND_CTL_ELEM_IFACE_MIXER);
	snd_ctl_elem_id_set_name(id, "Headphone Playback Switch");
	if ((err = lookup_id(id, handle)))
		return err;

	snd_ctl_elem_value_clear(value);
	snd_ctl_elem_value_set_id(value, id);
	snd_ctl_elem_value_set_boolean(value, 1, 1);

	if ((err = snd_ctl_elem_write(handle, value)) < 0) {
		fprintf(stderr, "Control element write error: %s\n",
			snd_strerror(err));
		return err;
	}

This unmutes the right channel of the headphone playback; this time, the value is a boolean. The other common kind of element is SND_CTL_ELEM_TYPE_ENUMERATED, used for enumerated contents such as channel muxing or selecting de-emphasis values. snd_ctl_elem_value_set_enumerated() has to be used to set the selected item; a sketch is shown after the example.

	return 0;
}
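
As for enumerated elements, here is a hedged sketch of selecting an item. It reuses the handle, id and value variables from the example above; the control name "Capture Mux" and the item index are made up for illustration and will differ from card to card:

	/* Hypothetical enumerated control: name and item index differ per card */
	snd_ctl_elem_id_clear(id);
	snd_ctl_elem_id_set_interface(id, SND_CTL_ELEM_IFACE_MIXER);
	snd_ctl_elem_id_set_name(id, "Capture Mux");
	if ((err = lookup_id(id, handle)))
		return err;

	snd_ctl_elem_value_clear(value);
	snd_ctl_elem_value_set_id(value, id);
	/* Select item 1 for member 0 */
	snd_ctl_elem_value_set_enumerated(value, 0, 1);

	if ((err = snd_ctl_elem_write(handle, value)) < 0) {
		fprintf(stderr, "Control element write error: %s\n",
			snd_strerror(err));
		return err;
	}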

This concludes this simple example and should be enough to get you started writing smarter applications that don't rely on external programs to configure the sound card controls.

Audio multi-channel routing and mixing using alsalib

Recently, one of our customers designing an embedded Linux system with specific audio needs had a sound card with more than one audio channel, and needed to separate the individual channels so that they could be used by different applications. This is a fairly common use case, and we would like to share in this blog post how we achieved it, for both input and output audio channels.

The most common use case is splitting a 4- or 8-channel sound card into multiple stereo PCM devices. For this, alsa-lib, the userspace API interface to the ALSA drivers, provides PCM plugins. Those plugins are configured through configuration files, usually /etc/asound.conf or $(HOME)/.asoundrc. However, through the configuration of /usr/share/alsa/alsa.conf, it is also possible, and in fact recommended, to use a card-specific configuration, named /usr/share/alsa/cards/<card_name>.conf.

The syntax of this configuration is documented in the alsa-lib configuration documentation; the part most relevant for our purpose is the pcm plugin documentation.

Audio inputs

For example, let’s say we have a 4-channel input sound card that we want to split into 2 mono inputs and one stereo input, as follows:

[Figure: Audio input example]

In the ALSA configuration file, we start by defining the input pcm:

pcm_slave.ins {
	pcm "hw:0,1"
	rate 44100
	channels 4
}

pcm "hw:0,1" refers to the the second subdevice of the first sound card present in the system. In our case, this is the capture device. rate and channels specify the parameters of the stream we want to set up for the device. It is not strictly necessary but this allows to enable automatic sample rate or size conversion if this is desired.

Then we can split the inputs:

pcm.mic0 {
	type dsnoop
	ipc_key 12342
	slave ins
	bindings.0 0
}

pcm.mic1 {
	type plug
	slave.pcm {
		type dsnoop
		ipc_key 12342
		slave ins
		bindings.0 1
	}
}

pcm.mic2 {
	type dsnoop
	ipc_key 12342
	slave ins
	bindings.0 2
	bindings.1 3
}

mic0 is of type dsnoop, the plugin that splits capture PCMs. The ipc_key is an integer that has to be unique: it is used internally to share buffers. slave indicates the underlying PCM that will be split; it refers to the PCM device we defined before under the name ins. Finally, bindings is an array mapping the channels of the new PCM to the channels of its slave. This is why mic0 and mic1, which are mono inputs, both only use bindings.0, while mic2, being stereo, has both bindings.0 and bindings.1. Overall, mic0 will have channel 0 of our input PCM, mic1 will have channel 1, and mic2 will have channels 2 and 3.

The final interesting thing in this example is the difference between mic0 and mic1. While mic0 and mic2 will not do any conversion on their stream and pass it as-is to the slave PCM, mic1 uses the automatic conversion plugin, plug. Whatever type of stream the application requests, what the sound card provides will be converted to the correct format and rate. This conversion is done in software and therefore runs on the CPU, which is usually something to avoid on an embedded system.
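
To check the result from an application, a minimal capture sketch can open one of these PCMs by name. This is a hedged example: it assumes 16-bit interleaved samples on the stereo mic2 device defined above, and since dsnoop does no conversion, the format and rate must match what the slave PCM actually delivers:

#include <stdio.h>
#include <alsa/asoundlib.h>

int main(void)
{
	snd_pcm_t *pcm;
	short buf[2 * 1024]; /* 1024 stereo S16_LE frames */
	int err;

	/* "mic2" resolves through the configuration defined above */
	if ((err = snd_pcm_open(&pcm, "mic2", SND_PCM_STREAM_CAPTURE, 0)) < 0) {
		fprintf(stderr, "open error: %s\n", snd_strerror(err));
		return err;
	}

	/* Format and rate are assumptions; adjust to the actual hardware */
	if ((err = snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
				      SND_PCM_ACCESS_RW_INTERLEAVED,
				      2, 44100, 0, 500000)) < 0) {
		fprintf(stderr, "setup error: %s\n", snd_strerror(err));
		return err;
	}

	if ((err = snd_pcm_readi(pcm, buf, 1024)) < 0)
		fprintf(stderr, "read error: %s\n", snd_strerror(err));

	snd_pcm_close(pcm);
	return 0;
}

The same check can be done from the command line with arecord -D mic2 -f S16_LE -r 44100 -c 2 test.wav.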

Also, note that the channel splitting happens at the dsnoop level. Doing it at a higher level would mean that the 4 channels would be copied before being split. For example, the following configuration would be a mistake:

pcm.dsnoop {
    type dsnoop
    ipc_key 512
    slave {
        pcm "hw:0,0"
        rate 44100
    }
}

pcm.mic0 {
    type plug
    slave dsnoop
    ttable.0.0 1
}

pcm.mic1 {
    type plug
    slave dsnoop
    ttable.0.1 1
}

Audio outputs

For this example, let’s say we have a 6-channel output that we want to split into 2 mono outputs and 2 stereo outputs:

[Figure: Audio output example]

As before, let’s define the slave PCM for convenience:

pcm_slave.outs {
	pcm "hw:0,0"
	rate 44100
	channels 6
}

Now, for the split:

pcm.out0 {
	type dshare
	ipc_key 4242
	slave outs
	bindings.0 0
}

pcm.out1 {
	type plug
	slave.pcm {
		type dshare
		ipc_key 4242
		slave outs
		bindings.0 1
	}
}

pcm.out2 {
	type dshare
	ipc_key 4242
	slave outs
	bindings.0 2
	bindings.1 3
}

pcm.out3 {
	type dmix
	ipc_key 4242
	slave outs
	bindings.0 4
	bindings.1 5
}

out0 is of type dshare. While dmix is usually presented as the reverse of dsnoop, dshare is more efficient: it simply gives exclusive access to channels instead of potentially software-mixing multiple streams into one. Again, the difference can be significant in terms of CPU utilization in the embedded space. The rest brings nothing new compared to the audio input example before:

  • out1 allows sample format and rate conversion
  • out2 is stereo
  • out3 is stereo and allows multiple concurrent users, which will be mixed together, as it is of type dmix

A common mistake here would be to use the route plugin on top of dmix to split the streams: this would first transform the mono or stereo streams into 6-channel streams and then mix them all together. All these operations would be costly in terms of CPU utilization, while dshare is basically free.

Duplicating streams

Another common use case is trying to copy the same PCM stream to multiple outputs. For example, we have a mono stream, which we want to duplicate into a stereo stream, and then feed this stereo stream to specific channels of a hardware device. This can be achieved using the following configuration snippet:

pcm.out4 {
	type route
	slave.pcm {
		type dshare
		ipc_key 4242
		slave outs
		bindings.0 0
		bindings.1 5
	}
	ttable.0.0 1
	ttable.0.1 1
}

The route plugin duplicates the mono stream into a stereo stream using the ttable property. The dshare plugin then takes the first channel of this stereo stream and sends it to the first hardware channel (bindings.0 0), while sending the second channel of the stereo stream to the sixth hardware channel (bindings.1 5).
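
As a usage sketch, an application could then play a mono buffer through out4 and let the route plugin fan it out to both hardware channels. Again, this is hedged: the format and rate below are assumptions and must match what the slave accepts:

#include <stdio.h>
#include <string.h>
#include <alsa/asoundlib.h>

int main(void)
{
	snd_pcm_t *pcm;
	short buf[44100]; /* one second of mono S16_LE samples */
	int err;

	memset(buf, 0, sizeof(buf)); /* silence, just to exercise the path */

	if ((err = snd_pcm_open(&pcm, "out4", SND_PCM_STREAM_PLAYBACK, 0)) < 0) {
		fprintf(stderr, "open error: %s\n", snd_strerror(err));
		return err;
	}

	/* Mono stream: the route plugin duplicates it to the two
	   channels of the dshare slave defined above */
	if ((err = snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
				      SND_PCM_ACCESS_RW_INTERLEAVED,
				      1, 44100, 0, 500000)) < 0) {
		fprintf(stderr, "setup error: %s\n", snd_strerror(err));
		return err;
	}

	if ((err = snd_pcm_writei(pcm, buf, 44100)) < 0)
		fprintf(stderr, "write error: %s\n", snd_strerror(err));

	snd_pcm_drain(pcm);
	snd_pcm_close(pcm);
	return 0;
}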

Conclusion

When properly used, the dsnoop, dshare and dmix plugins can be very efficient. In our case, simply rewriting the alsalib configuration on an i.MX6-based system with a 16-channel sound card dropped the CPU utilization from 97% to 1-3%, leaving plenty of CPU time to run further audio processing and other applications.