After managing to shake off the impostor's syndrome long enough for another post, I'm back with the follow-up to last month's GPIO war stories. In the previous entry we looked at the horror of GPIO registers, and managed to assuage the fears of the borrow checker with the promises of a clean, pure, typestate-powered pin representation. That's well and good, but it's time to make those platonic pins drive some real, honest hardware.

A word of warning: This entry contains copious amounts of unsafe. We're entering the sinister world of arbitrary memory dereferences, so please consult your physician, priest, or board of directors before attempting to replicate anything you're about to see. That said, not all unsafe blocks are made equal, and a theme through this entry will be to choose the ones least likely to come back and bite us. Onward!

A naive, freestyle approach

Last time, partly for dramatic effect, partly to segue into the clever typestate bits, we introduced a rather crude way of driving pins that is all too familiar to people who know that Misra is not a town in Final Fantasy VII:

unsafe fn set_gpio(port: char, index: u8) {
    assert!(in_range(port, index));
    let register = match port {
        'A' => 0x0001000,
        /*...*/
    } as *mut u32;
    *register |= 1 << index;
}

Yeah, don't do that.

Why am I bringing it up? Because now that we have some scaffolding in the form of unique, managed Pin structs, it doesn't really seem that bad to fill them in with implementations like the above, right? This is where we left off:

impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_low(&mut self) { unimplemented!() }
    fn set_high(&mut self) { unimplemented!() }
}

Putting them both together, we could envision something like this:

impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
   fn set_high(&mut self) {
       let port_base_address = PORT_BASE_ADDRESS[PORT as usize - 'A' as usize];
       let register = (port_base_address + SET_REGISTER_OFFSET) as *mut u32;
       unsafe { *register |= 1 << INDEX; }
   }
   //...
}

Is this wrong? Well...

I'd say it's the code equivalent of that chair scene in Contact, and it makes me uneasy for two reasons. The immediate one is that you can't guarantee the compiler won't mess with the write (core::ptr::write_volatile exists for that), but even if we were to solve that by calling the right core library function we'd face deeper problems. Let's look at the questions we asked ourselves last time, and wonder if this approach provides good responses.

Is the pin correctly configured? - Yes, enforced by the typestate.
Is the read + write operation atomic? - Nope.
What if something else is writing to the same port register? - Horror.
Am I the only one writing to a specific pin? - Yes, enforced by the limited constructor.
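To see why the second and third answers are so worrying, here is a minimal host-side sketch of a lost update. It assumes a plain u32 on the stack standing in for the port register, and it interleaves two read-modify-write sequences by hand to mimic an interrupt firing between the read and the write. The names lost_update_demo and fake_port are made up for this illustration. Note that it already uses volatile accesses, so this is not something write_volatile alone can fix.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Replays, step by step, what happens when an interrupt handler sets bit 1
/// while the main context is halfway through setting bit 0.
fn lost_update_demo() -> u32 {
    // Hypothetical stand-in for a memory-mapped port register.
    let mut fake_port: u32 = 0;
    let register: *mut u32 = &mut fake_port;

    // SAFETY: `register` points to a live, aligned local variable.
    unsafe {
        // Main context: the "read" half of `*register |= 1 << 0`.
        let main_snapshot = read_volatile(register);

        // An interrupt fires here and runs its own complete read-modify-write.
        let isr_snapshot = read_volatile(register);
        write_volatile(register, isr_snapshot | (1 << 1));

        // Main context resumes and writes back its stale snapshot.
        write_volatile(register, main_snapshot | (1 << 0));

        read_volatile(register)
    }
}
```

Calling lost_update_demo() yields 0b01: the bit set by the simulated interrupt is silently overwritten by the main context.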

We get two out of four, so we're off the mark. Even if we scored better against that checklist, the reality is that writing modules this way doesn't scale. Conveniently hiding out of view, the PORT_BASE_ADDRESS and SET_REGISTER_OFFSET constants would have to be manually written, likely copied from the datasheet. Each module would depend on dozens of these constants, gated behind feature flags for chip variants. Manual duplication breeds errors, the kind that are difficult to find and diagnose. So how do we fix this?

A big blob of XML

Enter CMSIS-SVD files. That is, if you're using Cortex-M, of course, or at the very least an architecture for which an SVD file can be generated. If you aren't, I wish you good luck and godspeed, and ask that you bring back some nice souvenirs from your travels.

SVD files are big chunks of XML that define exactly how and where to access a microcontroller's collection of peripherals. When I say big, I mean it: efm32gg11's one stands at an impressive 200k lines. If we dig for the GPIO "data out" section we saw on a screenshot in the last entry, it looks like this:

All the information required to interface with this peripheral is neatly contained in these tags. Great! Now all we need is a way to access this information from our gpio module.

What is a PAC, man?

Since CMSIS-SVD is XML, and hence easy to parse, there are tools out there to convert all that information to C header files, and fortunately, also to Rust modules. Many microcontrollers already have such modules available as published crates, but don't be discouraged if yours doesn't: The instructions in the svd2rust crate are pretty easy to follow and the process surprisingly painless.

Following the steps in the documentation will turn your SVD file into a shiny PAC (Peripheral Access Crate). I'm not going to lie to you, the source you'll get is a bit crowded and robotic, with the metallic taste of auto-generated code. Thankfully cargo doc packages it in a fairly digestible form, with all the vendor-provided explanatory notes copied over as docstring comments. Clever!

NOTE: Through this entry I will often refer to GPIO (note the capitalization), which is the name of the PAC object granting us low-level access to GPIO peripherals. This is as opposed to Gpio, which is the type we defined in the last entry, which constructs Pin structs.

The svd2rust crate documentation covers how to use its peripheral API to read, write and modify registers. While last post was all about writing the rules of peripheral access, missing the mechanics, it seems we get the mechanics for free, so it should just be a matter of putting them together. Easy! Rust is great!

Hold on a moment.

There are no happy endings in embedded. This is a bleak, lifeless world of suffering and datasheet squinting. The few happy endings we get are hard-fought, not the kind where we save Earth but those in which it explodes and we find another planet to repopulate or something. So let's give our problem a first stab, and find out exactly what dangers await us.

impl<const INDEX: u8> OutputPin for Pin<Output, 'A', INDEX> {
    fn set_high(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })
            }
        })
    }

    fn set_low(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) })
            }
        })
    }
}

Jeez, that's some high-density symbol soup. Let's break it down.

Anatomy of a write

impl<const INDEX: u8> OutputPin for Pin<Output, 'A', INDEX>

The above reads "we're implementing the OutputPin trait for any output-configured pin in port A". This is not how we want things to look in the final implementation (if we did, we'd have to replicate this code block for ports 'B', 'C', 'D'...), but we're doing it this way for now for the sake of simplicity.

   (*GPIO::ptr()).pa_dout

This scary looking bit means that we're stealing the data out register off the hierarchy that our PAC enforces. We're unsafely bypassing the ownership rules of the PAC and getting a "back door" into the GPIO struct member. So early, and we're already breaking the rules! For most other peripherals, we'd want to respect the PAC hierarchy and build a wrapper around the PAC type:

pub struct Flash {
    /// MSC stands for Memory System Controller
    msc: efm32pac::MSC,
}

//... Somewhere in our entry point:
let mut peripherals = efm32pac::Peripherals::take().unwrap();
// We pass ownership of the unique PAC object to our wrapper
let mcu_flash = flash::Flash::new(peripherals.MSC);

And then simply access the specific registers through it:

// We access the `status` register directly through the owned PAC object.
fn is_busy(&self) -> bool { self.msc.status.read().busy().bit_is_set() }

Look at how tame and sane that looks! But that wouldn't make for a good blog entry, because we've come here to suffer. This approach doesn't work for the gpio module, because of the split responsibility problem we covered in the first entry: we want Pins to reside in completely different, independent sections of the application, without the need to worry about interactions at a distance.

We could try to make each pin hold a reference to its associated port register, but then they'd have to hold a pointer all the way back to their Gpio parent, which is heavy and unnecessary. It also wouldn't solve the main problem of calling modify from different contexts, which is a method defined on a !Sync type. If we want to interface with a !Sync type from different contexts we're going to need the full power of unsafe anyway, so at that stage we might as well just steal the registers from the PAC with GPIO::ptr() and save us the cost of a reference.

If we're lucky, we may be able to ignore the synchronization problem altogether if our implementation only calls read and write (for example, in the case our microcontroller offers set and clear registers). read and write are typically atomic operations and thus we don't need to worry about preemption. If we plan to call modify() however, we need to provide an answer to the question "What happens if I'm halfway through a modify() and an interrupt triggers another modify() call on the same register?".
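As a rough illustration of why set and clear registers sidestep the problem, the sketch below models such a port on the host. SimulatedPort, write_set and write_clear are hypothetical names, not part of any PAC, and the merging that real hardware would do in silicon is imitated in plain Rust. The point is that the caller never reads the current value, so there is no stale snapshot to write back.

```rust
/// Hypothetical model of a port exposing dedicated "set" and "clear" registers.
struct SimulatedPort {
    /// Current state of the output latch, one bit per pin.
    dout: u32,
}

impl SimulatedPort {
    /// Models a write to the "set" register: every 1 bit in `mask` drives
    /// its pin high, every 0 bit leaves its pin untouched.
    fn write_set(&mut self, mask: u32) {
        self.dout |= mask;
    }

    /// Models a write to the "clear" register: every 1 bit in `mask` drives
    /// its pin low, every 0 bit leaves its pin untouched.
    fn write_clear(&mut self, mask: u32) {
        self.dout &= !mask;
    }
}

/// Drives pins 0 and 1 high from two independent call sites, then pin 0 low.
fn set_clear_demo() -> u32 {
    let mut port = SimulatedPort { dout: 0 };
    port.write_set(1 << 0); // e.g. issued from the main context
    port.write_set(1 << 1); // e.g. issued from an interrupt handler
    port.write_clear(1 << 0);
    port.dout
}
```

Each call is a single write from the caller's point of view, so the order in which the two contexts run affects timing but never loses a bit.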

This may seem obvious, but note that marking our Pin structs !Sync doesn't help since they share internal references to a single !Sync object. If Pa0 and Pa1 exist in different contexts and they can both call modify on the same register, we're in trouble.

Making port access safe and polite

The first instinct of an experienced rustacean could be to reach for the typical synchronization toolkit and concoct a mix of Arc, Rc, Mutex, RwLock so each pin can hold a well-behaved reference to its port register. This won't work for us for a few different reasons.

Most of what I've mentioned isn't available in no_std.
We'd be burdening our pins, which are thus far nicely zero-sized, with heavy machinery.
You really, really, really don't want to lock a mutex from within an interrupt context, and we want interrupt service routines to be able to drive and read pins.

The root of the problem is that the modify call is not atomic. If it were impossible to preempt, we could just spread the pins far and wide without having to worry about data races. So the simplest solution to the problem seems to be to disable preemption during the brief read-write cycle required to update the value of a register. And thus, after a long detour, we come to understand this line:

cortex_m::interrupt::free(|_| { /* ... */ })

The closure we pass will be executed inside a critical section, guaranteeing* that our modify operation will behave atomically. Since most methods in our Pin structs are one-liners with a single register operation, this approach will serve us well enough for the entire module.

* This guarantee only holds for single-core scenarios, which is thankfully our case. The only source of data races we have to worry about is interrupts. If we have to consider multiple cores, the problem becomes significantly more complicated and beyond the scope of this post.
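For intuition, this is roughly what a critical section of this kind does, modelled on the host with a simulated interrupt-enable flag. SimulatedCpu and its field are assumptions made up for this sketch; the real cortex_m implementation operates on the processor's interrupt mask rather than on a struct field. The detail worth noticing is that the previous state is restored instead of blindly re-enabling interrupts, which keeps nested critical sections well behaved.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the processor's global interrupt-enable state.
struct SimulatedCpu {
    interrupts_enabled: Cell<bool>,
}

impl SimulatedCpu {
    /// Runs `f` with interrupts masked, then restores the previous state.
    fn free<R>(&self, f: impl FnOnce() -> R) -> R {
        let was_enabled = self.interrupts_enabled.get();
        self.interrupts_enabled.set(false);
        let result = f();
        if was_enabled {
            self.interrupts_enabled.set(true);
        }
        result
    }
}

/// Returns the interrupt-enable flag as observed inside a nested section,
/// right after the nested section ends, and after the outer section ends.
fn critical_section_demo() -> (bool, bool, bool) {
    let cpu = SimulatedCpu { interrupts_enabled: Cell::new(true) };
    let (inside, after_inner) = cpu.free(|| {
        let inside = cpu.free(|| cpu.interrupts_enabled.get());
        (inside, cpu.interrupts_enabled.get())
    });
    (inside, after_inner, cpu.interrupts_enabled.get())
}
```

In this model the flag stays cleared until the outermost section finishes, so an inner section cannot accidentally unmask interrupts for its caller.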

Removing the last footgun

Alright, we've made it impossible for the Pin structs to cause trouble to each other, no matter how far their travels take them. However, there remains one risk: What if the user decides to play with the PAC GPIO struct themselves? They can do this safely through the PAC hierarchy, and it could lead to unsoundness, given that we unsafely steal that reference with our dirty ptr() trick. This means we need to lock the user away from touching the PAC GPIO struct:

impl Gpio {
    pub fn new(_: efm32pac::GPIO) -> Self {
        matrix! { construct_gpio [a b c d e f g h i j k l] [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15] }
    }
}

This is the very same Gpio struct we defined last entry, which you'll remember is the only thing in the universe capable of generating objects of type Pin. By taking an efm32pac::GPIO by value, we throw away the only safe key to the PAC GPIO peripheral. Now Gpio, and only Gpio, can drive these registers.

Incidentally, this addition also fulfills one of the requirements from last entry: it forces the Gpio struct to be unique. By requiring the ownership of a PAC GPIO in order to construct our Gpio wrapper, we guarantee both safety and uniqueness.
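The ownership trick can be tried out on a host machine with stand-in types. In the sketch below, PacGpio is a hypothetical substitute for the PAC's GPIO singleton and its take function imitates the hand-out-once behaviour of Peripherals::take(). Neither is the real generated code, and the Gpio here is an empty shell rather than the pin factory from the last entry.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Tracks whether the one and only token has been handed out already.
static GPIO_TAKEN: AtomicBool = AtomicBool::new(false);

/// Hypothetical stand-in for the PAC's GPIO singleton.
/// The private field prevents construction from outside this module.
pub struct PacGpio {
    _private: (),
}

impl PacGpio {
    /// Returns the token the first time it is called, `None` afterwards.
    pub fn take() -> Option<Self> {
        if GPIO_TAKEN.swap(true, Ordering::SeqCst) {
            None
        } else {
            Some(PacGpio { _private: () })
        }
    }
}

/// Stand-in for our wrapper; constructing it consumes the token.
pub struct Gpio {
    _private: (),
}

impl Gpio {
    pub fn new(_: PacGpio) -> Self {
        Gpio { _private: () }
    }
}

/// Returns whether a first and a second attempt at building a `Gpio` succeed.
fn uniqueness_demo() -> (bool, bool) {
    let first = PacGpio::take().map(Gpio::new);
    let second = PacGpio::take().map(Gpio::new);
    (first.is_some(), second.is_some())
}
```

Because the token is moved into Gpio::new, there is no safe way left to obtain a second one, and therefore no safe way to build a second Gpio.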

Continuing on with another bit of our set_high function...

unsafe {}

Ah, our good, misunderstood friend unsafe. We need it here for two reasons. One is the dereference of a raw pointer GPIO::ptr() as discussed. The second is subtler but also worth highlighting; writing multiple bits to a register at once is considered unsafe by the PAC, simply because the SVD file isn't expressive enough to guarantee every combination of bits written to a register is valid.

Finally, all that's left is the operation itself:

.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })

A closure is passed that reads the register, sets a specific bit and writes the result back.
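The bit arithmetic inside these closures is easy to check in isolation. The two helpers below are hypothetical, written only for this illustration, and operate on plain integers rather than on registers.

```rust
/// Returns `bits` with bit `index` forced to 1; all other bits are unchanged.
const fn with_bit_set(bits: u32, index: u8) -> u32 {
    bits | (1u32 << index)
}

/// Returns `bits` with bit `index` forced to 0; all other bits are unchanged.
const fn with_bit_cleared(bits: u32, index: u8) -> u32 {
    bits & !(1u32 << index)
}
```

Setting a bit that is already set leaves the value unchanged, so the operation is a "set" rather than a "toggle", and calling it twice in a row is harmless. Both helpers assume an index below 32, since a larger shift would overflow.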

Expanding our horizons

Now that we have a working set_high method, it isn't too difficult to imagine the rest. Pin reads, toggles and mode changes are pretty similar to what we've written, and you can check them out in the finished module source linked at the end of this entry. Our next step is, then, to address the limitation we mentioned at the start: Our set_high function is only implemented for port A, and we want it defined on every port.

Let's give it a try, shall we?

impl<const INDEX: u8, const PORT: char> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_high(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).?????.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })
            }
        })
    }

    fn set_low(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).?????.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) })
            }
        })
    }
}

So far so good. We've expanded the impl block to be generic not only over the index, but also over the port. All that's necessary now is fill those question marks so that set_high and set_low write to the appropriate register depending on the port they're defined on.

Alright, let's take a look at the PAC GPIO register block in more detail. In particular, the types of its registers named pa_dout, pb_dout, etc:

pub type PA_DOUT = crate::Reg<u32, _PA_DOUT>;
pub struct _PA_DOUT;

// ...
pub type PB_DOUT = crate::Reg<u32, _PB_DOUT>;
pub struct _PB_DOUT;

Ha, that looks familiar, doesn't it? Looks like the PAC is also typestate based, nice! Surely, this will make it very easy to interface with those types. We just need to find a common trait that allows us to refer to a "writable" register generically. In fact, we could summarize our last requirement as a function of the form you see below, where ????? is some trait that allows us to write bits to a register:

fn get_data_out_register<const PORT: char>(gpio: &GPIO) -> &dyn ????? {
   match PORT {
      'A' => &gpio.pa_dout,
      'B' => &gpio.pb_dout,
      //...
   }
}

Well... About that...

Look, the svd2rust crate is very smart. It's capable of understanding each register's unique rules of access and generating a precise API for it. However, it isn't smart enough to notice when groups of registers are closely related enough to be treated in a common way. There is a common Writable trait, but it's just a marker. For all intents and purposes every register is its own unique type, no matter how frustratingly similar they may seem.

It seems then that our approach above is not going to work. Bummer. What do we do when we have multiple things that behave very similarly (in fact, identically in terms of syntax) but are seen as different by the compiler? That's right, we reach for our powerful friend the macro.

As far as I'm aware there's no better solution than reaching for macros here, and every HAL crate I've seen does it, but there's a chance I'm missing a cleaner way to solve the problem. If you know it, please let me know via email or reddit! (Links at the end).

With macros, it's often a good idea to start at the point of usage with an ideal syntax. We want to be able to modify registers in a generic and compact way, so let's draft it:

impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_low(&mut self) {
        unsafe { gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
    }
    fn set_high(&mut self) {
        unsafe { gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() | (1 << INDEX)) }) }
    }
}

Note how we've kept unsafe, but not the critical section. Hiding unsafe in a macro would be all kinds of yucky, but an irrelevant detail like the ::cortex_m::interrupt::free invocation can be conveniently swept under the rug, especially in the case an outer unsafe block is already calling the user's attention to this area of the code.

Let's build our macro from top to bottom:

macro_rules! gpio_modify {
    ($port:ident, $register_name:ident, |$read:ident, $write:ident| $block:block) => {
        gpio_modify_inner!(
            ['A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L']
            [a b c d e f g h i j k l]
            $port, $register_name, |$read, $write| $block
        );
    };
}

Macro syntax can be intimidating, but this one isn't really doing much. It just takes a set of arguments (most importantly $port and $register_name) and passes them down to an inner macro, alongside a big blob of letters from A to L in two different formats. All the meat is in gpio_modify_inner:

macro_rules! gpio_modify_inner {
    ([$($character:literal)+] [$($letter:ident)+] $port:ident, $register_name:ident, |$read:ident, $write:ident| $block:block) => {
        paste::item! {
            ::cortex_m::interrupt::free(|_| {
                match $port {
                    $($character => { (*GPIO::ptr()).[<p $letter _ $register_name>].modify(|$read, $write| $block) })+
                    _ => core::panic!("Unexpected port"),
                }
            })
        }
    };
}

Again, we have to thank our local hero David Tolnay for the paste crate, which allows us to conjure identifiers out of nothing. The best way to understand what the macro above is doing is to expand it manually:

gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() & !(1 << INDEX)) })

// The above expands into...
::cortex_m::interrupt::free(|_| {
    match PORT {
       'A' => { (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
       'B' => { (*GPIO::ptr()).pb_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
       // ...
        _ => core::panic!("Unexpected port"),
    }
})

Since PORT is known at compile time, we can expect the match to be optimized away, leaving only the arm for the selected port. The match is simply a trick which allows us to select an operation on a completely different type for each value of a compile time constant.
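The same trick can be reproduced on a host machine without a PAC. The sketch below is a simplified variant under stated assumptions: PaDout, PbDout and FakeGpio are hypothetical stand-ins for the generated register types, their modify method takes and returns a plain u32 instead of reader and writer proxies, and the field names are passed to the macro explicitly so that the paste crate isn't needed.

```rust
/// Two hypothetical register types with identical methods but no shared trait,
/// mimicking how each PAC register is its own unique type.
struct PaDout(u32);
struct PbDout(u32);

impl PaDout {
    fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        self.0 = f(self.0);
    }
}

impl PbDout {
    fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        self.0 = f(self.0);
    }
}

/// Hypothetical stand-in for the PAC's GPIO register block.
struct FakeGpio {
    pa_dout: PaDout,
    pb_dout: PbDout,
}

/// Expands to one match arm per port; each arm calls `modify` on a different type.
macro_rules! fake_gpio_modify {
    ($gpio:expr, $port:ident, [$($character:literal => $field:ident),+], |$bits:ident| $body:expr) => {
        match $port {
            $($character => $gpio.$field.modify(|$bits| $body),)+
            _ => panic!("Unexpected port"),
        }
    };
}

/// Sets bit `INDEX` on whichever register the compile-time `PORT` selects.
fn set_high<const PORT: char, const INDEX: u8>(gpio: &mut FakeGpio) {
    fake_gpio_modify!(gpio, PORT, ['A' => pa_dout, 'B' => pb_dout], |bits| bits | (1u32 << INDEX))
}

/// Sets pin 3 of port B and returns the resulting (port A, port B) values.
fn const_match_demo() -> (u32, u32) {
    let mut gpio = FakeGpio { pa_dout: PaDout(0), pb_dout: PbDout(0) };
    set_high::<'B', 3>(&mut gpio);
    (gpio.pa_dout.0, gpio.pb_dout.0)
}
```

Only port B changes, even though every arm of the match type-checks against a different register type.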

Whew, we're done! All that's left is replicating the approach above for the read and write calls, then filling in the obvious blanks. We're finally free!

You may have noticed a significant drop in the amount and quality of banter in this last section, for which I deeply apologize. I really racked my brain to make it as understandable as I could. I'll make sure to include twice the colourful analogies in the next entry!

Conclusions and source

I'm once again aware that this kind of code is not what most people need to do on the regular, but I'm hoping that regardless of your background you were able to take with you something of value. All the source for this module is available here, and it will of course be part of Loadstone as soon as we're ready to open source it.

In the next entry, I'll address all reader questions (I have a neat list of comments from various sources on the first entry, which haven't gone unnoticed, I promise) and finally take a look at pin alternate functions, as well as how to limit peripheral drivers to only be able to take supported pins. Stay tuned!

As always, looking forward to your feedback in the rust subreddit, community discord (I'm Corax over there) and over at my email.

Happy rusting!

The GPIO war: macro bunkers for typestate explosions (2)