The GPIO war: macro bunkers for typestate explosions (2)
After managing to shake off the impostor's syndrome long enough for another post, I'm back with the follow-up to last month's GPIO war stories. In the previous entry we looked at the horror of GPIO registers, and managed to assuage the fears of the borrow checker with the promises of a clean, pure, typestate-powered pin representation. That's well and good, but it's time to make those platonic pins drive some real, honest hardware.
A word of warning: this entry contains copious amounts of unsafe. We're entering the sinister world of arbitrary memory dereferences, so please consult your physician, priest, or board of directors before attempting to replicate anything you're about to see. That said, not all unsafe blocks are made equal, and a theme through this entry will be to choose the ones least likely to come back and bite us. Onward!
A naive, freestyle approach
Last time, partly for dramatic effect, partly to segue into the clever typestate bits, we introduced a rather crude way of driving pins that is all too familiar to people who know that Misra is not a town in Final Fantasy VII:
unsafe fn set_gpio(port: char, index: u8) {
    assert!(in_range(port, index));
    let register = match port {
        'A' => 0x0001000,
        /*...*/
    } as *mut u32;
    *register |= 1 << index;
}
Yeah, don't do that.
Why am I bringing it up? Because now that we have some scaffolding in the form of unique, managed Pin structs, it doesn't really seem that bad to fill them in with implementations like the above, right? This is where we left off:
impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_low(&mut self) { unimplemented!() }
    fn set_high(&mut self) { unimplemented!() }
}
Putting them both together, we could envision something like this:
impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_high(&mut self) {
        let port_base_address = PORT_BASE_ADDRESS[PORT as usize - 'A' as usize];
        let register = (port_base_address + SET_REGISTER_OFFSET) as *mut u32;
        unsafe { *register |= 1 << INDEX; }
    }
    //...
}
Is this wrong? Well...
I'd say it's the code equivalent of that chair scene in Contact, and it makes me uneasy for two reasons. The immediate one is that you can't guarantee the compiler won't mess with the write (core::ptr::write_volatile exists for that), but even if we were to solve that by calling the right core library function we'd face deeper problems. Let's look at the questions we asked ourselves last time, and see whether this approach provides good answers.
- Is the pin correctly configured? - Yes, enforced by the typestate.
- Is the read + write operation atomic? - Nope.
- What if something else is writing to the same port register? - Horror.
- Am I the only one writing to a specific pin? - Yes, enforced by the limited constructor.
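On the shallower of the two concerns, here is a minimal sketch of what a volatile read-modify-write looks like with core::ptr::read_volatile and core::ptr::write_volatile. The helper name set_bit_volatile is made up for illustration, and it is meant to be exercised against an ordinary variable rather than a real register address. Note that it only stops the compiler from eliding or merging the accesses; the sequence is still two separate operations, so the atomicity question in the checklist remains unanswered.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Sets bit `index` of the 32-bit word behind `register`.
/// (Hypothetical helper, not part of the module discussed in this entry.)
///
/// # Safety
/// `register` must be valid for volatile reads and writes of a `u32`.
unsafe fn set_bit_volatile(register: *mut u32, index: u8) {
    unsafe {
        // Volatile accesses can't be elided or merged by the compiler...
        let current = read_volatile(register);
        // ...but an interrupt can still fire between the read and the write.
        write_volatile(register, current | (1 << index));
    }
}
```

Pointing it at a local u32 holding 0b0001 and asking for bit 3 leaves 0b1001 behind, which is all the "hardware" we need to check the bit arithmetic.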
We get two out of four, so we're off the mark. Even if we scored better against that checklist, the reality is that writing modules this way doesn't scale. Conveniently hiding out of view, the PORT_BASE_ADDRESS and SET_REGISTER_OFFSET constants would have to be manually written, likely copied from the datasheet. Each module would depend on dozens of these constants, gated behind feature flags for chip variants. Manual duplication breeds errors, the kind that are difficult to find and diagnose. So how do we fix this?
A big blob of XML
Enter CMSIS-SVD files. That is, if you're using Cortex-M, of course, or at the very least an architecture for which an SVD file can be generated. If you aren't, I wish you good luck and godspeed, and ask that you bring back some nice souvenirs from your travels.
SVD files are big chunks of XML that define exactly how and where to access a microcontroller's collection of peripherals. When I say big, I mean it: the one for the efm32gg11 stands at an impressive 200k lines. If we dig for the GPIO "data out" section we saw on a screenshot in the last entry, it looks like this:
All the information required to interface with this peripheral is neatly contained in these tags. Great! Now all we need is a way to access this information from our gpio module.
What is a PAC, man?
Since CMSIS-SVD is XML, and hence easy to parse, there are tools out there to convert all that information to C header files and, fortunately, also to Rust modules. Many microcontrollers already have such modules available as published crates, but don't be discouraged if yours doesn't: the instructions in the svd2rust crate are pretty easy to follow and the process surprisingly painless.
Following the steps in the documentation will turn your SVD file into a shiny PAC (Peripheral Access Crate). I'm not going to lie to you, the source you'll get is a bit crowded and robotic, with the metallic taste of auto-generated code. Thankfully cargo doc packages it in a fairly digestible form, with all the vendor-provided explanatory notes copied over as docstring comments. Clever!
NOTE: Through this entry I will often refer to GPIO (note the capitalization), which is the name of the PAC object granting us low-level access to GPIO peripherals. This is as opposed to Gpio, which is the type we defined in the last entry, which constructs Pin structs.
The svd2rust crate documentation covers how to use its peripheral API to read, write and modify registers. While last post was all about writing the rules of peripheral access, missing the mechanics, it seems we get the mechanics for free, so it should just be a matter of putting them together. Easy! Rust is great!
Hold on a moment.
There are no happy endings in embedded. This is a bleak, lifeless world of suffering and datasheet squinting. The few happy endings we get are hard-fought, not the kind where we save Earth but those in which it explodes and we find another planet to repopulate or something. So let's give our problem a first stab, and find out exactly what dangers await us.
impl<const INDEX: u8> OutputPin for Pin<Output, 'A', INDEX> {
    fn set_high(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })
            }
        })
    }
    fn set_low(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) })
            }
        })
    }
}
Jeez, that's some high-density symbol soup. Let's break it down.
Anatomy of a write
impl<const INDEX: u8> OutputPin for Pin<Output, 'A', INDEX>
The above reads "we're implementing the OutputPin trait for any output-configured pin in port A". This is not how we want things to look in the final implementation (if we did, we'd have to replicate this code block for ports 'B', 'C', 'D'...), but we're doing it this way for now for the sake of simplicity.
(*GPIO::ptr()).pa_dout
This scary looking bit means that we're stealing the data out register off the hierarchy that our PAC enforces. We're unsafely bypassing the ownership rules of the PAC and getting a "back door" into the GPIO struct member. So early, and we're already breaking the rules! For most other peripherals, we'd want to respect the PAC hierarchy and build a wrapper around the PAC type:
pub struct Flash {
/// MSC stands for Memory System Controller
msc: efm32pac::MSC,
}
//... Somewhere in our entry point:
let mut peripherals = efm32pac::Peripherals::take().unwrap();
// We pass ownership of the unique PAC object to our wrapper
let mcu_flash = flash::Flash::new(peripherals.MSC);
And then simply access the specific registers through it:
// We access the `status` register directly through the owned PAC object.
fn is_busy(&self) -> bool { self.msc.status.read().busy().bit_is_set() }
Look at how tame and sane that looks! But that wouldn't make for a good blog entry, because we've come here to suffer. This approach doesn't work for the gpio module, because of the split responsibility problem we covered in the first entry: we want Pins to reside in completely different, independent sections of the application, without the need to worry about interactions at a distance.
We could try to make each pin hold a reference to its associated port register, but then they'd have to hold a pointer all the way back to their Gpio parent, which is heavy and unnecessary. It also wouldn't solve the main problem: modify is a method defined on a !Sync type, and we'd be calling it from different contexts. If we want to interface with a !Sync type from different contexts we're going to need the full power of unsafe anyway, so at that stage we might as well just steal the registers from the PAC with GPIO::ptr() and save us the cost of a reference.
If we're lucky, we may be able to ignore the synchronization problem altogether if our implementation only calls read and write (for example, in the case our microcontroller offers set and clear registers). read and write are typically atomic operations and thus we don't need to worry about preemption. If we plan to call modify() however, we need to provide an answer to the question "What happens if I'm halfway through a modify() and an interrupt triggers another modify() call on the same register?".
This may seem obvious, but note that marking our Pin structs !Sync doesn't help since they share internal references to a single !Sync object. If Pa0 and Pa1 exist in different contexts and they can both call modify on the same register, we're in trouble.
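To make the "halfway through a modify()" question concrete, here is a small, deterministic replay of the bad interleaving. It runs on a plain u32 standing in for the port register, and the function name preempted_modify is invented for this illustration; nothing here touches real hardware or the PAC.

```rust
/// Replays, step by step, a read-modify-write that gets preempted halfway.
/// `register` is a plain variable standing in for a port register.
fn preempted_modify(register: &mut u32) {
    // Main context: wants to set bit 0, and starts by reading the register.
    let stale = *register;

    // An interrupt fires here and runs its own complete modify, setting bit 1.
    *register |= 1 << 1;

    // Main context resumes and writes back a value computed from the stale read.
    *register = stale | (1 << 0);
}
```

Starting from a register value of 0, the final value has only bit 0 set: the interrupt's update to bit 1 is silently lost, even though neither side did anything wrong on its own.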
Making port access safe and polite
The first instinct of an experienced rustacean could be to reach for the typical synchronization toolkit and concoct a mix of Arc, Rc, Mutex, RwLock so each pin can hold a well-behaved reference to its port register. This won't work for us for a few different reasons.
- Most of what I've mentioned isn't available in no_std.
- We'd be burdening our pins, which are thus far nicely zero-sized, with heavy machinery.
- You really, really, really don't want to lock a mutex from within an interrupt context, and we want interrupt service routines to be able to drive and read pins.
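The "zero-sized" point can be checked directly. The sketch below assumes a Pin shape along the lines of the one from the previous entry (a mode marker plus const generics for port and index); the exact definition here, as well as PinWithRef and pin_sizes, are stand-ins written for this comparison.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

/// Marker type for the "output" typestate.
struct Output;

/// A pin whose identity lives entirely in its type parameters (assumed shape).
struct Pin<MODE, const PORT: char, const INDEX: u8> {
    _mode: PhantomData<MODE>,
}

/// Hypothetical alternative: a pin that carries a reference to its port register.
struct PinWithRef<'a> {
    _register: &'a u32,
}

/// Returns (size of the typestate pin, size of the reference-carrying pin).
fn pin_sizes() -> (usize, usize) {
    (
        size_of::<Pin<Output, 'A', 0>>(),
        size_of::<PinWithRef<'static>>(),
    )
}
```

The typestate pin occupies no memory at all, while the reference-carrying variant costs a pointer per pin, and that is before adding any locking machinery on top.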
The root of the problem is that the modify call is not atomic. If it were impossible to preempt, we could just spread the pins far and wide without having to worry about data races. So the simplest solution to the problem seems to be to disable preemption during the brief read-write cycle required to update the value of a register. And thus, after a long detour, we come to understand this line:
cortex_m::interrupt::free(|_| { /* ... */ })
The closure we pass will be executed inside a critical section, guaranteeing* that our modify operation will behave atomically. Since most methods in our Pin structs are one-liners with a single register operation, this approach will serve us well enough for the entire module.
* This guarantee only holds for single-core scenarios, which is thankfully our case. The only source of data races we have to worry about is interrupts. If we have to consider multiple cores, the problem becomes significantly more complicated and beyond the scope of this post.
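For intuition, here is a host-side model of the critical-section pattern: remember whether interrupts were enabled, mask them, run the closure, and restore the previous state. As far as I understand it this is roughly the shape of what a single-core critical section does, but everything in the sketch (the INTERRUPTS_ENABLED flag, critical_section_model, interrupts_enabled) is a made-up stand-in that runs on a desktop, not the cortex_m implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the CPU's global interrupt-enable state (an assumption of this model).
static INTERRUPTS_ENABLED: AtomicBool = AtomicBool::new(true);

/// Reports whether the modelled interrupts are currently enabled.
fn interrupts_enabled() -> bool {
    INTERRUPTS_ENABLED.load(Ordering::SeqCst)
}

/// Masks interrupts, runs `f`, then restores the previous state.
/// Restoring (rather than blindly re-enabling) keeps nested sections correct.
fn critical_section_model<R>(f: impl FnOnce() -> R) -> R {
    let was_enabled = INTERRUPTS_ENABLED.swap(false, Ordering::SeqCst);
    let result = f();
    if was_enabled {
        INTERRUPTS_ENABLED.store(true, Ordering::SeqCst);
    }
    result
}
```

The detail worth noticing is the restore step: an inner section that finds interrupts already masked leaves them masked on exit, so the outer section's guarantee isn't cut short.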
Removing the last footgun
Alright, we've made it impossible for the Pin structs to cause trouble to each other, no matter how far their travels take them. However, there remains one risk: What if the user decides to play with the PAC GPIO struct themselves? They can do this safely through the PAC hierarchy, and it could lead to unsoundness, given that we unsafely steal that reference with our dirty ptr() trick. This means we need to lock the user away from touching the PAC GPIO struct:
impl Gpio {
pub fn new(_: efm32pac::GPIO) -> Self {
matrix! { construct_gpio [a b c d e f g h i j k l] [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15] }
}
}
This is the very same Gpio struct we defined last entry, which you'll remember is the only thing in the universe capable of generating objects of type Pin. By taking an efm32pac::GPIO by value, we throw away the only safe key to the PAC GPIO peripheral. Now Gpio, and only Gpio, can drive these registers.
Incidentally, this addition also fulfills one of the requirements from last entry: it forces the Gpio struct to be unique. By requiring the ownership of a PAC GPIO in order to construct our Gpio wrapper, we guarantee both safety and uniqueness.
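The ownership trick can be tried out without any hardware. In the sketch below, PacGpio and take_pac_gpio are hypothetical stand-ins for the PAC peripheral and its take()-style constructor, and this Gpio is an empty shell rather than the real one from the module.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the PAC's GPIO peripheral type (hypothetical).
pub struct PacGpio {
    _private: (),
}

/// Tracks whether the one and only PacGpio has been handed out.
static TAKEN: AtomicBool = AtomicBool::new(false);

/// Mimics a `take()`-style constructor: hands out the peripheral exactly once.
pub fn take_pac_gpio() -> Option<PacGpio> {
    if TAKEN.swap(true, Ordering::SeqCst) {
        None
    } else {
        Some(PacGpio { _private: () })
    }
}

/// Empty shell standing in for the wrapper type.
pub struct Gpio {
    _private: (),
}

impl Gpio {
    /// Consuming the peripheral by value means no safe handle to it survives.
    pub fn new(_: PacGpio) -> Self {
        Gpio { _private: () }
    }
}
```

Once the single PacGpio has been moved into Gpio::new, a second call to take_pac_gpio returns None, so there is no safe way left to build another wrapper or to reach the peripheral directly.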
Continuing on with another bit of our set_high function...
unsafe {}
Ah, our good, misunderstood friend unsafe. We need it here for two reasons. One is the dereference of the raw pointer returned by GPIO::ptr(), as discussed. The second is subtler but also worth highlighting: writing multiple bits to a register at once is considered unsafe by the PAC, simply because the SVD file isn't expressive enough to guarantee every combination of bits written to a register is valid.
Finally, all that's left is the operation itself:
.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })
A closure is passed that reads the register, sets a specific bit and writes the result back.
Expanding our horizons
Now that we have a working set_high method, it isn't too difficult to imagine the rest. Pin reads, toggles and mode changes are pretty similar to what we've written, and you can check them out in the finished module source linked at the end of this entry. Our next step is, then, to address the limitation we mentioned at the start: Our set_high function is only implemented for port A, and we want it defined on every port.
Let's give it a try, shall we?
impl<const INDEX: u8, const PORT: char> OutputPin for Pin<Output, PORT, INDEX> {
    fn set_high(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).?????.modify(|r, w| { w.bits(r.bits() | (1 << INDEX)) })
            }
        })
    }
    fn set_low(&mut self) {
        cortex_m::interrupt::free(|_| {
            unsafe {
                (*GPIO::ptr()).?????.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) })
            }
        })
    }
}
So far so good. We've expanded the impl block to be generic not only over the index, but also over the port. All that's necessary now is to fill in those question marks so that set_high and set_low write to the appropriate register depending on the port they're defined on.
Alright, let's take a look at the PAC GPIO register block in more detail. In particular, the types of its registers named pa_dout, pb_dout, etc:
pub type PA_DOUT = crate::Reg<u32, _PA_DOUT>;
pub struct _PA_DOUT;
// ...
pub type PB_DOUT = crate::Reg<u32, _PB_DOUT>;
pub struct _PB_DOUT;
Ha, that looks familiar, doesn't it? Looks like the PAC is also typestate based, nice! Surely, this will make it very easy to interface with those types. We just need to find a common trait that allows us to refer to a "writable" register generically. In fact, we could summarize our last requirement as a function of the form you see below, where ????? is some trait that allows us to write bits to a register:
fn get_data_out_register<const PORT: char>(gpio: &GPIO) -> &dyn ????? {
match PORT {
'A' => &gpio.pa_dout,
'B' => &gpio.pb_dout,
//...
}
}
Well... About that...
Look, the svd2rust crate is very smart. It's capable of understanding each register's unique rules of access and generating a precise API for it. However, it isn't smart enough to notice when groups of registers are closely related enough to be treated in a common way. There is a common Writable trait, but it's just a marker. For all intents and purposes every register is its own unique type, no matter how frustratingly similar they may seem.
It seems then that our approach above is not going to work. Bummer. What do we do when we have multiple things that behave very similarly (in fact, identically in terms of syntax) but are seen as different by the compiler? That's right, we reach for our powerful friend the macro.
As far as I'm aware there's no better solution than reaching for macros here, and every HAL crate I've seen does it, but there's a chance I'm missing a cleaner way to solve the problem. If you know it, please let me know via email or reddit! (Links at the end).
With macros, it's often a good idea to start at the point of usage with an ideal syntax. We want to be able to modify registers in a generic and compact way, so let's draft it:
impl<const PORT: char, const INDEX: u8> OutputPin for Pin<Output, PORT, INDEX> {
fn set_low(&mut self) {
unsafe { gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
}
fn set_high(&mut self) {
unsafe { gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() | (1 << INDEX)) }) }
}
}
Note how we've kept unsafe, but not the critical section. Hiding unsafe in a macro would be all kinds of yucky, but an irrelevant detail like the ::cortex_m::interrupt::free invocation can be conveniently swept under the rug, especially in the case an outer unsafe block is already calling the user's attention to this area of the code.
Let's build our macro from top to bottom:
macro_rules! gpio_modify {
    ($port:ident, $register_name:ident, |$read:ident, $write:ident| $block:block) => {
        gpio_modify_inner!(
            ['A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I' 'J' 'K' 'L']
            [a b c d e f g h i j k l]
            $port, $register_name, |$read, $write| $block
        );
    };
}
Macro syntax can be intimidating, but this one isn't really doing much. It just takes a set of arguments (most importantly $port and $register_name) and passes them down to an inner macro, alongside a big blob of letters from A to L in two different formats. All the meat is in gpio_modify_inner:
macro_rules! gpio_modify_inner {
    ([$($character:literal)+] [$($letter:ident)+] $port:ident, $register_name:ident, |$read:ident, $write:ident| $block:block) => {
        paste::item! {
            ::cortex_m::interrupt::free(|_| {
                // Match on the port the caller passed in, rather than on a
                // name that merely happens to be in scope at the call site.
                match $port {
                    $($character => { (*GPIO::ptr()).[<p $letter _ $register_name>].modify(|$read, $write| $block) })+
                    _ => core::panic!("Unexpected port"),
                }
            })
        }
    };
}
Again, we have to thank our local hero David Tolnay for the paste crate, which allows us to conjure identifiers out of nothing. The best way to understand what the macro above is doing is to expand it manually:
gpio_modify!(PORT, dout, |r, w| { w.bits(r.bits() & !(1 << INDEX)) })
// The above expands into...
::cortex_m::interrupt::free(|_| {
    match PORT {
        'A' => { (*GPIO::ptr()).pa_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
        'B' => { (*GPIO::ptr()).pb_dout.modify(|r, w| { w.bits(r.bits() & !(1 << INDEX)) }) }
        // ...
        _ => core::panic!("Unexpected port"),
    }
})
Since PORT is known at compile time, the compiler can resolve the match statically, so we can expect it to be optimized away in optimized builds. The match statement is simply a trick which allows us to select an operation on a completely different type for each value of a compile time constant.
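The same trick can be tried on a desktop with stand-in types. In this sketch PaDout, PbDout and FakeGpio are invented replacements for the PAC registers: like the real ones they are unrelated types that happen to share a method name, with no common trait. The modify_dout macro is a simplified cousin of gpio_modify that takes the field names explicitly instead of building them with paste, and it leaves out the critical section and the raw pointer.

```rust
/// Two unrelated register types that happen to expose the same method.
struct PaDout(u32);
struct PbDout(u32);

impl PaDout {
    fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        self.0 = f(self.0);
    }
}

impl PbDout {
    fn modify(&mut self, f: impl FnOnce(u32) -> u32) {
        self.0 = f(self.0);
    }
}

/// Hypothetical register block with one differently-typed field per port.
struct FakeGpio {
    pa_dout: PaDout,
    pb_dout: PbDout,
}

/// Expands to one match arm per port, each calling `modify` on its own type.
macro_rules! modify_dout {
    ($gpio:expr, $port:ident, [$($character:literal => $field:ident),+], $f:expr) => {
        match $port {
            $($character => $gpio.$field.modify($f),)+
            _ => panic!("Unexpected port"),
        }
    };
}

/// Generic over port and index, yet it touches a different type per port.
fn set_high<const PORT: char, const INDEX: u8>(gpio: &mut FakeGpio) {
    modify_dout!(gpio, PORT, ['A' => pa_dout, 'B' => pb_dout], |bits| bits | (1 << INDEX));
}
```

Calling set_high::<'A', 3> only ever changes pa_dout and set_high::<'B', 0> only pb_dout; because PORT is a constant in each instantiation, the arms that don't apply are dead code the optimizer is free to drop.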
Whew, we're done! All that's left is replicating the approach above for the read and write calls, then filling in the obvious blanks. We're finally free!
You may have noticed a significant drop in the amount and quality of banter in this last section, for which I deeply apologize. I really racked my brain to make it as understandable as I could. I'll make sure to include twice the colourful analogies in the next entry!
Conclusions and source
I'm once again aware that this kind of code is not what most people need to do on the regular, but I'm hoping that regardless of your background you were able to take with you something of value. All the source for this module is available here, and it will of course be part of Loadstone as soon as we're ready to open source it.
In the next entry, I'll address all reader questions (I have a neat list of comments from various sources on the first entry, which haven't gone unnoticed, I promise) and finally take a look at pin alternate functions, as well as how to limit peripheral drivers to only be able to take supported pins. Stay tuned!
As always, looking forward to your feedback in the rust subreddit, community discord (I'm Corax over there) and over at my email.
Happy rusting!