Boris Brezillon – Bootlin

In this article, we would like to introduce our work on the spi-mem Linux kernel framework, which will allow to re-use SPI controller drivers for both SPI NOR devices and regular SPI devices, as well as SPI NAND devices in the future.

From SPI to Dual, Quad and Octo SPI

In the good old days, SPI was a simple protocol, with only 3 signals shared by all devices present on the bus:

MISO: Master In Slave Out
MOSI: Master Out Slave In
SCLK: Serial Clock

and 1 extra signal per device to select the device we want to communicate with:

SS: Slave Select (also called CS for Chip Select sometimes)

But then SPI memories appeared. It started with small and relatively slow ones, like dataflash, EEPROMs and SRAMs and progressively moved to rather large SPI NORs and SPI NANDs. As usual when it comes to dealing with memories, we want to get the best performances out of them. SPI bus limitations quickly became the bottleneck, so vendors decided to add more I/O lines and make the MISO/MOSI lines bi-directional. Nowadays we see SPI controllers that are supporting up to 8 I/O lines. For those who are familiar with the terms, that’s what we call DualSPI, QuadSPI and OctoSPI.

In order to use all I/O lines when doing a master to slave or slave to master transfer, there must be some kind of contract between the slave and the master so that both of them know when they can receive or transmit data on the I/O lines and how many of them they should use. This is done through a predefined set of operations exposed by the slave that the master has to follow to enter a specific transmit or receive state. A SPI memory operation is usually composed of:

1 byte opcode encoding the operation to be executed (but be prepared for 2 bytes opcode, they are coming)
0 to N address bytes whose meaning is dependent on the opcode (can be an absolute memory address or something else)
0 to N dummy bytes to leave the slave enough time to enter the specific state requested through the opcode. Again, the number of dummy bytes depends on the opcode
0 to N IN or OUT data bytes, the direction depends on the opcode

Note that, while this protocol tends to be used by memory devices, there’s nothing restricting it to memories, and I wouldn’t be surprised if some FPGA were using the same kind of transactions for things that are not memory oriented at all.

The Linux SPI ecosystem

Linux has been supporting Dual and Quad SPI mode for quite some time already (v3.12), and SPI device drivers could specify the number of I/O lanes for each SPI transfer. This way, a SPI memory operation could be broken out in several SPI transfer each of them using a pre-defined number of I/O lanes.

That worked just fine until some IP vendors decided to make their SPI controllers smarter and embed some kind of high-level interface to execute SPI memory operations in a single step instead of splitting it in several transfers (actually, most SPI controllers are even smarter than that and allow you to directly map a SPI memory in the CPU address space, but let’s keep that for a future post). At this point, we needed to give more control to the SPI controller so that it could decide what to do exactly, without having to reconstruct a SPI memory operation out of a group of SPI transfers.

At that time, the decision has been to dedicate those controllers to one single task: control SPI NORs (which were the only user of quad and dual mode back then), and the SPI NOR framework was created for this reason.

Due to this decision, we now have in Linux a SPI NOR frame work which connects SPI NOR controller drivers to the SPI NOR logic on one side (spi-nor subsystem), and we have regular SPI controller drivers which can do basic SPI transfers (spi subsystem). However, from a hardware point of view, advanced SPI controllers that provide special features for SPI NOR can often also do basic transfers, and therefore control regular SPI devices. Unfortunately, with the current split between the spi-nor and spi subsystems, if a SPI controller is supported by a driver in the spi-nor subsystem, it cannot be used to communicate with regular SPI devices normally managed by the spi subsystem.

As a partial solution to solve this problem, the ->spi_flash_read() operation was added to struct spi_controller. This allowed regular SPI controller drivers in the spi subsystem to provide an optimized way to read from a SPI NOR memory, which is used by the generic SPI NOR driver m25p80. However, this solution is partial, as it only optimizes reading and is limited to SPI NORs.

Current SPI NOR stack

In the current stack, we have:

The SPI NOR framework, which knows the protocol to talk to SPI NOR memories. This framework relies on an interface listed in struct spi_nor, which is implemented by:
- Specialized SPI NOR controllers, which support advanced SPI controllers dedicated to SPI NORs.
- The m25p80 driver, which provides the same interface, but on top of dumb/regular SPI controller drivers, with the possible optimization of ->spi_flash_read()

What led us to propose the SPI memory interface?

We’ve seen before that the SPI NOR case was already properly supported thanks to the SPI NOR framework. But NORs are not the only memory devices you’ll find on a SPI bus, SPI NANDs are becoming more and more popular.
Peter Pan proposed a framework to handle SPI NAND devices which was following the SPI NOR model: SPI controllers had to implement the SPI NAND controller interface in order to control SPI NANDs. But when we got more deeply involved in this development, we quickly realized how messy it would be to follow this path, because that meant forcing SPI controllers to implement both the SPI NOR and SPI NAND interface if they want to be able control both kind of device. And what will happen when the trend will be at SPI NVRAM or any other kind of storage manufacturers decide to put on a SPI bus? Adding yet another interface that SPI controllers would again have to implement didn’t sound like a good idea.

So instead, we decided to take the problem from the other end, and tried to figure out what SPI NANDs and SPI NORs have in common.
The instruction set is different, the behavior and constraints are different (mainly due to NOR vs NAND differences), but they both follow the same SPI memory operation semantic when interacting with the device, the same one advanced controllers are trying to optimize.

The SPI memory layer is just a way for SPI controller drivers to be passed high-level SPI memory operations instead of letting them handle SPI transfers and try to optimize them themselves. It also simplifies SPI memory drivers’ life, since all they have to care about is sending SPI memory operations based on the SPI memory specs. No need for a complex, constantly evolving, memory dependent interface.

SPI memory stack

With this new stack, both SPI NOR and SPI NAND can be supported on top of the same SPI controller drivers. The m25p80 driver is modified to use the spi-mem interface instead of the limited ->spi_flash_read() interface. For now, we still have dedicated SPI NOR controller drivers, but the goal is to rid of them, and implement them as regular SPI controllers in drivers/spi. Help and contributions in this direction are very welcome!

What does the SPI memory API look like?

The SPI memory API is exposed in include/linux/spi/spi-mem.h.

SPI device drivers who want to use the SPI memory API should declare themselves as spi_mem_drivers and implement the ->probe() and ->remove() functions.
They will be passed a spi_mem object which is just a thin wrapper around a spi_device object. The reason we have a different object is because we want to be able to extend the spi_mem object and attach more information to it if required (like the type of memory, the memory organization and other kind of information advanced SPI controllers might want to know).
When a driver wants to execute a SPI memory operation, it will fill a spi_mem_op struct and call spi_mem_exec_op(). One can also test if a SPI controller is supporting a specific memory operation with spi_mem_supports_op() or try to split data transfers so that they don’t exceed the max transfer size supported by the controller with spi_mem_adjust_op_size().

Now, let’s have a look at the controller side of things. An SPI controller who wants to optimize SPI memory operations can implement the spi_mem_ops interface which contains 3 methods that are directly matching the user API:

->exec_op(): execute the memory operation or return -ENOTSUPP if it’s not supported
->supports_op(): just check if the memory operation is supported
->adjust_op_size(): adjust the size of the data transfer of a memory operation to cope with alignment and max FIFO size constraints

Note that when spi_mem_ops is not implemented, the core will add generic support for this feature by creating SPI messages formed of several SPI transfers, just as the generic SPI NOR controller driver (named m25p80) was doing before.

As you can see, the API is pretty straightforward, so hopefully, more SPI memory drivers will be converted to use it instead of manually creating SPI messages containing several SPI transfers.

Current status

A number of things have already been contributed and merged, scheduled to be part of the 4.18 Linux kernel release:

The spi-mem layer itself, in commit c36ff266dc82f4ae797a6f3513c6ffa344f7f1c7.
The two SPI controller drivers who were implementing the ->spi_flash_read interface now implement the spi-mem interface instead: bcm-qspi and ti-qspi.
The ->spi_flash_read interface is removed in commit c1f5ba70decfc2f35edcc10505e3e78fb528d212.
The m25p80 driver is modified to use the spi-mem layer in commit 4120f8d158ef904fb305b27e4a4524649faf3096.

What’s next?

Advanced SPI controllers can do more than just optimizing the execution of SPI memory operations, they can often hide all the complexity of memory accesses behind a directly-mapped IOMEM region, which, every time it is accessed, triggers a SPI memory operation on the bus and retrieves or sends the data for you, thus behaving like a memory that would be placed on a parallel memory bus. As you can imagine, this allows for even higher throughput and less CPU time consumed for SPI mem op management, but it’s also something that is hard to expose in a generic way. We have posted on the linux-mtd mailing list a proposal to support such a direct mapping capability.

As detailed earlier, another challenging topic is the conversion of all SPI NOR controller drivers to the SPI mem model so that all QSPI controllers are really exposed as SPI controllers and not SPI NOR controllers. This is likely to take a bit of time since we currently have 10 drivers in drivers/mtd/spi-nor and we’re only aware of 2 of them being converted to the SPI mem approach (fsl-quadspi and atmel-quadspi).

Raspberry Pi logo For a few months, Bootlin has been helping the Raspberry Pi Foundation upstream to the Linux kernel a number of display related features for the Raspberry Pi platform.

The main goal behind this upstreaming process is to get rid of the closed-source firmware that is used on non-upstream kernels every time you need to enable/access a specific hardware feature, and replace it by something that is both open-source and compliant with upstream Linux standards.

Eric Anholt has been working hard to upstream display related features. His biggest contribution has certainly been the open-source driver for the VC4 GPU, but he also worked on the display controller side, and we were contracted to help him with this task.

Our first objective was to add support for SDTV (composite) output, which appeared to be much easier than we imagined. As some of you might already know, the display controller of the Raspberry Pi already has a driver in the DRM subsystem. Our job was to add support for the SDTV encoder (also called VEC, for Video EnCoder). The driver has been submitted just before the 4.10 merge window and surprisingly made it into 4.10 (see also the patches). Eric Anholt explained on his blog:

The Raspberry Pi Foundation recently started contracting with Bootlin to give me some support on the display side of the stack. Last week I got to review and release their first big piece of work: Boris Brezillon’s code for SDTV support. I had suggested that we use this as the first project because it should have been small and self contained. It ended up that we had some clock bugs Boris had to fix, and a bug in my core VC4 CRTC code, but he got a working patch series together shockingly quickly. He did one respin for a couple more fixes once I had tested it, and it’s now out on the list waiting for devicetree maintainer review. If nothing goes wrong, we should have composite out support in 4.11 (we’re probably a week late for 4.10).

Our second objective was to help Eric with HDMI audio support. The code has been submitted on the mailing list 2 weeks ago and will hopefully be queued for 4.12. This time on, we didn’t write much code, since Eric already did the bulk of the work. What we did though is debugging the implementation to make it work. Eric also explained on his blog:

Probably the biggest news of the last two weeks is that Boris’s native HDMI audio driver is now on the mailing list for review. I’m hoping that we can get this merged for 4.12 (4.10 is about to be released, so we’re too late for 4.11). We’ve tested stereo audio so far, no compresesd audio (though I think it should Just Work), and >2 channel audio should be relatively small amounts of work from here. The next step on HDMI audio is to write the alsalib configuration snippets necessary to hide the weird details of HDMI audio (stereo IEC958 frames required) so that sound playback works normally for all existing userspace, which Boris should have a bit of time to work on still.

On our side, it has been a great experience to work on such topics with Eric, and you should expect more contributions from Bootlin for the Raspberry Pi platform in the next months, so stay tuned!