@Measter
Created September 8, 2020 16:04

Revisions

  1. Measter created this gist Sep 8, 2020.
    1,339 changes: 1,339 additions & 0 deletions abstraction_adventures.md
    # My Adventures in MMIO Abstraction

    Some years ago, I came across a simple Roguelike on Reddit called [coreRL](https://www.roguelikeeducation.org/2.html).
    It's very simplistic; levels are just a box with two walls, only one enemy type with basic AI, no health or
    character attributes, and the only goal is to see how far you can get before you die. Having nothing better to do, I
    thought it'd be a fun little project to write a port for an Arduino Nano. The only inputs needed are the four movement
    keys, and the display can just be a basic SSD1306-driven 128x64 OLED panel.

    I could, of course, do this in C++. The language is a known quantity for the ATmega328P that powers the Arduino Nano.
    The toolchain is mature, as are the abstractions for interacting with the onboard peripherals. There are also libraries
    for SSD1306-driven displays. It's the obvious choice. All I would have to really do is write the game logic.

    But where's the fun in that?

    ## Memory-Mapped IO

    To actually do anything useful, the microcontroller needs to interact with the outside world. The ATmega328P has four
    peripherals that I would be using for this project: An IO Port, a Timer, the Two-Wire Interface (TWI) bus, and the
    USART for sending debugging info back to my PC.

    These are used by manipulating Memory-Mapped IO (MMIO) registers. Normally, when you read or write to a
    memory address, you are accessing some sort of, well... memory. Memory-Mapped IO is what its name suggests: it's
    a piece of hardware mapped to a memory address, so that instead of simply accessing a value in memory, you access the
    connected hardware. The specifics of how exactly this happens are dependent on the hardware in question. Writing to the
    IO ports, for example, just sets a couple of flip-flops. But other devices will be more complex.

    ## Hello World!

    To illustrate how you can use these, I'll start with the hello world of microcontrollers: turn on an LED. The Arduino
    Nano has an LED connected to pin PB5. The name tells us what set of registers, and which bit we'll need to
    manipulate: **P**ort **B**, bit **5**. The IO Ports on the 328P are quite simple: data direction is set using the
    `DDRn` registers (set a bit to 0 for input, 1 for output), while output level is controlled with the `PORTn` registers
    (set a bit to 0 for low, 1 for high).

    We want port B, so this will be the `DDRB` and `PORTB` registers. The addresses for which are:

    * `PORTB`: 0x25
    * `DDRB`: 0x24

    This part, however, is something that Rust makes a bit awkward. Rust is very... particular about safety, and the problem
    here is that, to Rust, these are arbitrary memory addresses that I've magicked out of nowhere. It doesn't know
    anything about them, and certainly not that they're IO peripherals. This means I need to access them as raw pointers.
    Manipulating raw pointers requires `unsafe`. Additionally, to prevent an overzealous optimizer removing our
    (apparently) unused memory accesses, you need to use volatile reads and writes. In C++ this is easy: you just mark
    the entire pointer type as volatile, and every access is handled correctly. Rust is different. In Rust, whether an
    access is volatile or not is determined by the site of access, not the type of pointer.
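
    To make that last point concrete, here is a minimal host-side sketch. It is not part of the original project: a local
    byte stands in for a register, and `write_then_read` and `volatile_demo` are hypothetical names. The pointer is a plain
    `*mut u8`; only the `write_volatile` and `read_volatile` calls make the accesses volatile.

    ```rust
    /// Stores `val` through `addr` with a volatile write, then loads it back
    /// with a volatile read. (Hypothetical helper, for illustration only.)
    ///
    /// # Safety
    /// `addr` must be valid for reads and writes of a `u8`.
    unsafe fn write_then_read(addr: *mut u8, val: u8) -> u8 {
        unsafe {
            // `addr` is an ordinary raw pointer; volatility comes from the call.
            addr.write_volatile(val);
            addr.read_volatile()
        }
    }

    /// Host-side stand-in: a local byte plays the part of the register.
    fn volatile_demo() -> u8 {
        let mut fake_reg: u8 = 0;
        unsafe { write_then_read(&mut fake_reg, 1 << 5) }
    }
    ```

    On real hardware the pointer would come from a fixed address instead of a local variable, but the access pattern is the
    same.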

    So, with that in mind, here's how I can turn on that LED. I first define some constants:

    ```rust
    const PORTB: *mut u8 = 0x25 as *mut _;
    const DDRB: *mut u8 = 0x24 as *mut _;
    const PB5: u8 = 5;
    ```

    And then I can set the specific bits in those registers:

    ```rust
    unsafe {
        DDRB.write_volatile(1 << PB5);
        PORTB.write_volatile(1 << PB5);
    }
    ```

    And my LED lights up.

    ## Abstraction

    This is clearly going to get very messy, very quickly. All this noise around the memory access is going to end up
    making code that interacts with the registers hard to read. Hard-to-read code is hard to understand, and that leads to bugs.
    And it gets worse if you want to just set one bit without altering any of the others:

    ```rust
    unsafe {
        let port_val = PORTB.read_volatile() | (1 << PB5);
        PORTB.write_volatile(port_val);
    }
    ```
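
    The difference between overwriting a register and a read-modify-write can be tried on a PC. The following is a rough
    sketch under the assumption that a plain `u8` variable is an acceptable stand-in for the register; `overwrite`,
    `set_bit_rmw` and `clobber_demo` are made-up names.

    ```rust
    /// Overwrites the whole byte, like a bare `write_volatile(1 << bit)`.
    fn overwrite(reg: &mut u8, bit: u8) {
        let p: *mut u8 = reg;
        // Safety: `p` comes from a live `&mut u8`.
        unsafe { p.write_volatile(1u8 << bit) }
    }

    /// Read-modify-write: only the requested bit changes.
    fn set_bit_rmw(reg: &mut u8, bit: u8) {
        let p: *mut u8 = reg;
        // Safety: `p` comes from a live `&mut u8`.
        unsafe { p.write_volatile(p.read_volatile() | (1u8 << bit)) }
    }

    /// Starts both stand-in registers with bits 0 and 1 set, then sets bit 5.
    fn clobber_demo() -> (u8, u8) {
        let mut a: u8 = 0b0000_0011;
        let mut b: u8 = 0b0000_0011;
        overwrite(&mut a, 5); // bits 0 and 1 are lost
        set_bit_rmw(&mut b, 5); // bits 0 and 1 survive
        (a, b)
    }
    ```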

    What I want to do is hide away the details of twiddling the bits, and just leave me with the higher level concept.
    To do this, I thought about what common functionality these registers will have:

    * They all have some address in memory.
    * Reading or replacing the entire contents of the register.
    * Setting or clearing a specific bit.
    * Getting the value of a specific bit.

    There are a couple of ways this could be done, but what I settled on was this trait:

    ```rust
    pub trait Register {
        const ADDR: *mut u8;

        unsafe fn set_value(val: u8) {
            Self::ADDR.write_volatile(val);
        }

        unsafe fn get_value() -> u8 {
            Self::ADDR.read_volatile()
        }

        unsafe fn get_bit(bit: u8) -> bool {
            let bit = 1 << bit;

            (Self::get_value() & bit) != 0
        }

        unsafe fn set_bit(bit: u8) {
            let bit = 1 << bit;
            let val = Self::get_value();
            Self::set_value(val | bit);
        }

        unsafe fn clear_bit(bit: u8) {
            let bit = 1 << bit;
            let val = Self::get_value();
            Self::set_value(val & !bit);
        }
    }
    ```

    The function implementations are identical for each register, so I just have a default implementation. The functions
    should be unsafe, because there's no way to prove here that what we're doing is actually correct. That will be down to
    the caller. With this abstraction, I can then change my register definitions to this:

    ```rust
    struct PORTB;
    impl Register for PORTB {
    const ADDR: *mut u8 = 0x25 as *mut _;
    }
    struct DDRB;
    impl Register for DDRB {
    const ADDR: *mut u8 = 0x24 as *mut _;
    }
    ```

    And finally, lighting the LED, *without clobbering the other bits*, now looks like this:

    ```rust
    unsafe {
        DDRB::set_bit(PB5);
        PORTB::set_bit(PB5);
    }
    ```
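
    The trait-with-default-methods idea can also be exercised off-target. This is only a sketch under two assumptions that
    differ from the code above: the address comes from a function (`addr`) rather than a `const ADDR`, and it points at a
    static `FakeReg` in ordinary memory rather than at real hardware. `FakeReg`, `FAKE_PORTB` and `register_demo` are
    hypothetical names.

    ```rust
    use core::cell::UnsafeCell;

    /// A byte of ordinary memory standing in for an MMIO register.
    struct FakeReg(UnsafeCell<u8>);
    // Safety: this sketch only touches the cell from a single thread.
    unsafe impl Sync for FakeReg {}

    static FAKE_PORTB: FakeReg = FakeReg(UnsafeCell::new(0));

    trait Register {
        /// Assumption: a function instead of `const ADDR`, so the address
        /// can refer to a static on the host.
        fn addr() -> *mut u8;

        unsafe fn get_value() -> u8 {
            unsafe { Self::addr().read_volatile() }
        }

        unsafe fn set_value(val: u8) {
            unsafe { Self::addr().write_volatile(val) }
        }

        unsafe fn set_bit(bit: u8) {
            unsafe { Self::set_value(Self::get_value() | (1u8 << bit)) }
        }

        unsafe fn clear_bit(bit: u8) {
            unsafe { Self::set_value(Self::get_value() & !(1u8 << bit)) }
        }
    }

    struct PORTB;
    impl Register for PORTB {
        fn addr() -> *mut u8 {
            FAKE_PORTB.0.get()
        }
    }

    /// Returns the register value after setting bit 5, then after clearing bit 0.
    fn register_demo() -> (u8, u8) {
        unsafe {
            PORTB::set_value(0b0000_0001); // known starting state
            PORTB::set_bit(5);
            let after_set = PORTB::get_value();
            PORTB::clear_bit(0);
            (after_set, PORTB::get_value())
        }
    }
    ```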

    ## Improving (or Complicating) the Abstraction

    There are two problems with the current implementation which, while not huge, do bug me. The first is that not
    all registers are 8-bit; some are 16-bit. Now, I could just define the high and low bytes of the registers separately,
    and this is what the AVR C headers do, but I would prefer to be able to address them in one operation.

    The second is that all the inputs for bit manipulation are just plain `u8`. This means I could do `PORTB::set_bit(30)`
    and have the compiler accept an incorrect input. It's also not immediately clear whether I should be passing in a bit ID
    or a pre-shifted value. There's an additional problem: not all bits in a register have meaning. For example TWI Control
    Register (`TWCR`) bit 1 has no function. Yet I can just pass in a 1 to the `set_bit` function without issue. This could
    all be part of the documentation, but wouldn't it be better if I just couldn't do it wrong in the first place?

    The first one is a little easier to tackle, so I'll do that first. I need the register to be generic over the type it
    stores. That's fairly simple: just introduce a generic type `T` and replace all instances of `u8` with `T`. If you just
    do that, you'll run into a series of compile errors along the lines of:

    ```
    error[E0277]: no implementation for `{integer} << T`
    --> src\register.rs:16:21
    |
    16 | let bit = 1 << bit;
    | ^^ no implementation for `{integer} << T`
    |
    = help: the trait `core::ops::Shl<T>` is not implemented for `{integer}`
    ```

    I need to constrain `T`. The compiler errors give an idea of what traits we'll need: `Shl`, `BitAnd`, `BitOr`,
    `Not`, and `Eq`. I should also constrain it to only the types that the registers *can* be: `u8` and `u16`. It should also
    be `Copy`, firstly because we're implicitly copying the value, but also because I'm doing pointer reads and writes, which
    do not play well with drop implementations and the soundness concerns around them, so it should enforce that there's no
    complex drop behaviour. `Copy` does this too, because `Copy` cannot be implemented for types implementing `Drop`.

    In addition to all of this, I'm also using two constants: 0 and 1. The compiler has no idea in these function
    implementations that the 0 and 1 literals are `T`s. I'll need associated constants.

    To cover these requirements, I introduce another trait, `RegisterType`, which requires the types listed above, and
    implement it for `u8` and `u16`:

    ```rust
    use core::ops::{BitAnd, BitOr, Not, Shl};

    pub trait RegisterType:
        Copy + BitAnd<Output = Self> + BitOr<Output = Self> + Shl<Output = Self> + Not<Output = Self> + Eq + PartialEq
    {
        const ZERO: Self;
        const ONE: Self;
    }

    impl RegisterType for u8 {
        const ZERO: Self = 0;
        const ONE: Self = 1;
    }

    impl RegisterType for u16 {
        const ZERO: Self = 0;
        const ONE: Self = 1;
    }
    ```
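
    To see why the associated constants matter, here is a small self-contained sketch that repeats the trait (with the
    `core::ops` imports it needs) and adds two generic helpers. `bit_mask` and `is_set` are hypothetical names, not part
    of the project; they just show that `T::ONE` and `T::ZERO` let the same code serve both register widths.

    ```rust
    use core::ops::{BitAnd, BitOr, Not, Shl};

    pub trait RegisterType:
        Copy + BitAnd<Output = Self> + BitOr<Output = Self> + Shl<Output = Self> + Not<Output = Self> + Eq
    {
        const ZERO: Self;
        const ONE: Self;
    }

    impl RegisterType for u8 {
        const ZERO: Self = 0;
        const ONE: Self = 1;
    }

    impl RegisterType for u16 {
        const ZERO: Self = 0;
        const ONE: Self = 1;
    }

    /// Hypothetical helper: `1 << bit`, written once for every register width.
    fn bit_mask<T: RegisterType>(bit: T) -> T {
        T::ONE << bit
    }

    /// Hypothetical helper: reports whether bit number `bit` is set in `value`.
    fn is_set<T: RegisterType>(value: T, bit: T) -> bool {
        (value & bit_mask(bit)) != T::ZERO
    }
    ```

    A plain `1` or `0` literal would not type-check inside these functions, since the compiler cannot know that `T` is an
    integer.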

    Now to solve the other problem. The bits that can and cannot be used are specific to the register, and also have names
    (e.g. the `TWCR` register's bit 5 is called `TWSTA`). So what we have is a fixed set of specific, named, values. An enum
    is perfect for this. This is how we could represent the valid bits for `TWCR`:

    ```rust
    enum TWCRBits {
    TWIE,
    TWEN,
    TWWC,
    TWSTO,
    TWSTA,
    TWEA,
    TWINT
    }
    ```

    I then want to ensure that the bit twiddling functions can only take the enum representing that register's bits.
    For that I need an associated type on the `Register` trait, so I can name it in the function signatures:

    ```rust
    pub trait Register<T: RegisterType> {
    const ADDR: *mut T;
    type BitType;

    ...

    unsafe fn get_bit(bit: Self::BitType) -> bool {
    ...
    }

    unsafe fn set_bit(bit: Self::BitType) {
    ...
    }

    unsafe fn clear_bit(bit: Self::BitType) {
    ...
    }
    }
    ```

    Because all the logic here is based on shifting, I then need to be able to get a `T` from the `BitType` telling us
    which bit the variant represents. So the `BitType` needs to implement a function returning that. This function will also
    need to return a `u8` or `u16`, depending on the register. Enter the `NamedBits` trait:

    ```rust
    pub trait NamedBits: Copy {
    type DataType: RegisterType;

    fn bit_id(self) -> Self::DataType;
    }
    ```

    I then change the associated type in `Register` to properly restrict the bit type to only those that implement
    `NamedBits`, and update the bit twiddling functions to call the `bit_id` function on their input:

    ```rust
    pub trait Register<T: RegisterType> {
    const ADDR: *mut T;
    type BitType: NamedBits<DataType = T>;

    ...

    unsafe fn get_bit(bit: Self::BitType) -> bool {
    let bit = T::ONE << bit.bit_id();

    (Self::get_value() & bit) != T::ZERO
    }

    unsafe fn set_bit(bit: Self::BitType) {
    let bit = T::ONE << bit.bit_id();
    let val = Self::get_value();
    Self::set_value(val | bit);
    }

    unsafe fn clear_bit(bit: Self::BitType) {
    let bit = T::ONE << bit.bit_id();
    let val = Self::get_value();
    Self::set_value(val & !bit);
    }
    }
    ```

    The implementations for `PORTB` and `DDRB` now look like this:

    ```rust
    #[derive(Copy, Clone)]
    enum PortBBits {
    PB0,
    PB1,
    PB2,
    PB3,
    PB4,
    PB5,
    PB6,
    PB7,
    }

    impl NamedBits for PortBBits {
    type DataType = u8;
    fn bit_id(self) -> Self::DataType {
    use PortBBits::*;
    match self {
    PB0 => 0,
    PB1 => 1,
    PB2 => 2,
    PB3 => 3,
    PB4 => 4,
    PB5 => 5,
    PB6 => 6,
    PB7 => 7,
    }
    }
    }

    struct PORTB;
    impl Register<u8> for PORTB {
    const ADDR: *mut u8 = 0x25 as *mut _;
    type BitType = PortBBits;
    }

    struct DDRB;
    impl Register<u8> for DDRB {
    const ADDR: *mut u8 = 0x24 as *mut _;
    type BitType = PortBBits;
    }
    ```

    With this implementation, lighting that LED is done like this:

    ```rust
    unsafe {
    DDRB::set_bit(PortBBits::PB5);
    PORTB::set_bit(PortBBits::PB5);
    }
    ```

    There's no longer any ambiguity whether the input is a pre-shifted value or not, nor can I just throw in arbitrary
    numbers like before.

    There is a final issue: some registers are just data storage. An example of this would be the USART's `UDR0`
    register, which stores the byte being sent or received over the bus. In that case, the register is a byte of data, not a
    collection of control bits, so being able to set a specific bit doesn't make sense. However, the abstraction here
    requires a type representing the bits.

    My solution to this was to create a struct called `NoBits`, with a private field so it couldn't be constructed outside
    of its parent module:

    ```rust
    #[derive(Copy, Clone)]
    pub struct NoBits<T>(PhantomData<T>);
    impl<T: RegisterType> NamedBits for NoBits<T> {
    type DataType = T;
    fn bit_id(self) -> Self::DataType {
    T::ZERO
    }
    }
    ```

    The reason I went for a struct and not an enum with no variants is that it needs to be usable for both 8-bit and
    16-bit registers, meaning it does need to be generic, and not using a generic type parameter is a compile error. This
    means I can now define the `UDR0` register, and still use the `get_value` and `set_value` functions, but not the
    bit-related ones:

    ```rust
    struct UDR0;
    impl Register<u8> for UDR0 {
    const ADDR: *mut u8 = 0xC6 as *mut _;
    type BitType = NoBits<u8>;
    }
    ```

    At this point, those familiar with Rust might be wondering why not just use the `From` or `Into` traits from `core` for
    converting the bit enum to `T`? I did try these at first, but quickly found that they, for some reason, don't optimize
    well. Even with all the inputs being known at compile time, it would end up not inlining the `into` call, so you'd get an
    unnecessary function call in the final binary. Defining my own specific conversion trait resulted in the inlining taking
    place, meaning the entire thing got optimised away.

    ## Multiple Bits

    Ok, I can twiddle a single bit in a way that is easy to read, and harder to get wrong. But I also want to set
    (or clear) multiple bits in one operation. I could do this with successive calls to `set_bit` (or `clear_bit`), but the
    volatile accesses start to become a problem. So far, with what I have, setting `PORTB`'s `PB5` bit optimises to
    this:

    ```asm
    sbi 0x05, 5
    ```

    But what if we set `PB2`, `PB3`, `PB5`, `PB6`, and `PB7`? We get this:

    ```asm
    sbi 0x05, 2
    sbi 0x05, 3
    sbi 0x05, 5
    sbi 0x05, 6
    sbi 0x05, 7
    ```

    Because the access is volatile, the compiler doesn't know that it can collect all the bits together and set it all at
    once, so I need to do that myself. As before, I don't want to expose manual bit twiddling, so I'll implement two
    functions (`set_bits` and `clear_bits`) to do it for me. Rust doesn't have variadic functions, but it does have slices,
    so I'll make them take a slice:

    ```rust
    pub trait Register<T: RegisterType> {

    ...

    unsafe fn set_bits(bits: &[Self::BitType]) {
    // Construct the final bit pattern by ORing the shifted bits together.
    let bits = bits.iter().copied()
    .map(NamedBits::bit_id)
    .fold(T::ZERO, |acc , b| acc | (T::ONE << b));

    let val = Self::get_value();
    Self::set_value(val | bits);
    }

    unsafe fn clear_bits(bits: &[Self::BitType]) {
    let bits = bits.iter().copied()
    .map(NamedBits::bit_id)
    .fold(T::ZERO, |acc , b| acc | (T::ONE << b));

    let val = Self::get_value();
    Self::set_value(val & !bits);
    }
    }
    ```

    And now we can just set all our pins like this:

    ```rust
    PORTB::set_bits(&[
    PortBBits::PB2,
    PortBBits::PB5,
    PortBBits::PB7,
    PortBBits::PB3,
    PortBBits::PB6,
    ]);
    ```

    And have it optimise to a single operation (0xEC is what you get when you OR together the above bits):

    ```asm
    in r24, 0x05
    ori r24, 0xEC
    out 0x05, r24
    ```
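
    The folding step is easy to check in isolation. The sketch below is a simplification that uses bare `u8` bit indices
    instead of the typed enum, and a plain variable instead of a register; `fold_mask` and `set_bits_once` are hypothetical
    names.

    ```rust
    /// Hypothetical helper: ORs together `1 << id` for every bit index in `ids`.
    fn fold_mask(ids: &[u8]) -> u8 {
        ids.iter().copied().fold(0u8, |acc, id| acc | (1u8 << id))
    }

    /// Applies the folded mask to a stand-in register with one read and one write.
    fn set_bits_once(reg: &mut u8, ids: &[u8]) {
        let mask = fold_mask(ids);
        let p: *mut u8 = reg;
        // Safety: `p` comes from a live `&mut u8`.
        unsafe { p.write_volatile(p.read_volatile() | mask) }
    }
    ```

    Calling `fold_mask(&[2, 5, 7, 3, 6])` gives the same `0xEC` constant that appears in the assembly above, regardless of
    the order of the indices.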

    ## Replacing Bits

    What I've got so far works pretty nicely for setting or clearing in one operation. But one thing that comes up a
    couple times is when you want to replace the values of certain bits, but leave the others intact. This could be done with
    successive calls to `clear_bits` then `set_bits`, but that runs into the volatile register issue we had before, with the
    complication that actually clearing the bits in the register could do unexpected things for the more complex peripherals.

    The bitwise logic involved is fairly simple:

    ```rust
    let bits_to_replace = (1 << 2) | (1 << 4) | (1 << 7);
    let replace_val = (1 << 2) | (1 << 7);
    let reg_val = REG::get_value();

    let masked = reg_val & !bits_to_replace;
    let new_reg = masked | replace_val;
    REG::set_value(new_reg);
    ```

    But having a function that does it for me makes it easier and reduces the chance of an error. The logic here is just a
    combination of the `set_bits` and `clear_bits` functions, so let's just do that. It will need to take two sets of bits:
    one representing the bits to replace, and a second for the value to replace them with.

    ```rust
    unsafe fn replace_bits(mask: &[Self::BitType], value: &[Self::BitType]) {
    let mask = mask.iter().copied()
    .map(NamedBits::bit_id)
    .fold(T::ZERO, |acc , b| acc | (T::ONE << b));

    let value = value.iter().copied()
    .map(NamedBits::bit_id)
    .fold(T::ZERO, |acc , b| acc | (T::ONE << b));

    let masked_value = value & mask;
    let masked_reg = Self::get_value() & !mask;

    Self::set_value(masked_reg | masked_value);
    }
    ```

    One thing we should be careful about here is to ensure the incoming value is also masked (but *not* with the
    inverted mask!), so it doesn't clobber bits outside the masked area. So now I can do this:

    ```rust
    PORTB::replace_bits(
    &[
    PortBBits::PB2,
    PortBBits::PB4,
    PortBBits::PB7,
    ],
    &[
    PortBBits::PB2,
    PortBBits::PB7
    ]
    );
    ```

    And have it optimize to this:

    ```asm
    in r24, 0x05
    andi r24, 0x6B
    ori r24, 0x84
    out 0x05, r24
    ```
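
    The masking arithmetic can be checked without any hardware. This is a sketch of the same logic as a pure function on
    plain `u8` values; `replaced` is a hypothetical name, and the real `replace_bits` works on the register directly.

    ```rust
    /// Hypothetical pure version of the replace logic: returns the new register value.
    fn replaced(reg: u8, mask: u8, value: u8) -> u8 {
        let masked_value = value & mask; // drop any value bits outside the mask
        let masked_reg = reg & !mask; // clear the bits that are being replaced
        masked_reg | masked_value
    }
    ```

    With the mask for bits 2, 4 and 7 (`0x94`) and the value for bits 2 and 7 (`0x84`), a register holding `0xFF` becomes
    `0xEF`: only bit 4 is cleared. A stray bit in the value, such as bit 0 in `0x85`, is discarded by the `value & mask`
    step.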

    ## Usage Ergonomics

    So far, this is looking OK to use for bit twiddling. The need to take in a slice for the multi-bit operations isn't
    great, and there's also the fact that I need to explicitly import the bits enum, and know its name. Both of these
    are mildly irritating, but not a huge problem. However, they can be solved. The first, we'll come back to later, but the
    second one is trivial to deal with.

    The solution is simple enough: associated constants. We simply declare a bunch of associated constants on the
    register, which point to the enum variants:

    ```rust
    pub struct TWCR;
    impl TWCR {
    pub const TWIE: TWCRBits = TWCRBits::TWIE;
    pub const TWEN: TWCRBits = TWCRBits::TWEN;
    pub const TWWC: TWCRBits = TWCRBits::TWWC;
    pub const TWSTO: TWCRBits = TWCRBits::TWSTO;
    pub const TWSTA: TWCRBits = TWCRBits::TWSTA;
    pub const TWEA: TWCRBits = TWCRBits::TWEA;
    pub const TWINT: TWCRBits = TWCRBits::TWINT;
    }
    ```

    And now, when I want to bit-twiddle the `TWCR` register, I can just get the bit names through the register itself:

    ```rust
    TWCR::set_bit(TWCR::TWEN);
    ```

    There's another, much larger, issue when it comes to setting register values. At the moment, I can only set the register
    value with a raw integer. But it's entirely reasonable to want to set the value based on a set of bits. An example
    would be when configuring the TWI bus. There's several points where you need to replace the entire value; for example,
    when releasing the bus after an arbitration loss, which currently looks like this:

    ```rust
    let bits = (1 << TWCR::TWEN.bit_id()) | (1 << TWCR::TWEA.bit_id()) | (1 << TWCR::TWINT.bit_id());
    TWCR::set_value(bits);
    ```

    What's the point in doing all this abstraction when we're back to that? I could make the `set_value` function take a
    slice of bits like the `set_bits`, etc. functions, but there are still times when you need to set an integer value. The
    approach I chose was (yet) another trait:

    ```rust
    pub trait SetValueType<T> {
    fn as_value(self) -> T;
    }
    ```

    I then implemented it for all register types, and slices of register bits, and updated the `set_value` function:

    ```rust
    impl<T: RegisterType> SetValueType<T> for T {
    fn as_value(self) -> T {
    self
    }
    }

    impl<T: NamedBits> SetValueType<T::DataType> for &[T] {
    fn as_value(self) -> T::DataType {
    self.iter()
    .copied()
    .map(NamedBits::bit_id)
    .fold(T::DataType::ZERO, |acc, b| acc | (T::DataType::ONE << b))
    }
    }

    pub trait Register<T: RegisterType> {
    ...

    unsafe fn set_value<V: SetValueType<T>>(val: V) {
    let val = val.as_value();
    Self::ADDR.write_volatile(val);
    }

    ...
    }
    ```

    This allows me to set the value both ways (though passing in a slice requires `as_ref` here, which isn't great):

    ```rust
    DDRB::set_value(0x25);
    DDRB::set_value([DDRB::PB0, DDRB::PB2, DDRB::PB5].as_ref());
    ```

    It also optimizes nicely:

    ```asm
    ldi r24, 0x25
    out 0x04, r24 // First line
    out 0x04, r24 // Second line
    ```
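
    The "one function, two kinds of argument" trick can be reduced to a few lines. The following is a stripped-down sketch,
    not the project's code: the trait is non-generic and fixed to `u8`, and `Bit`, `value_of` and `slice_demo` are
    hypothetical names.

    ```rust
    #[derive(Copy, Clone)]
    enum Bit {
        PB0 = 0,
        PB2 = 2,
        PB5 = 5,
    }

    /// Simplified stand-in for `SetValueType`, fixed to 8-bit registers.
    trait SetValueType {
        fn as_value(self) -> u8;
    }

    // A raw integer is already the value.
    impl SetValueType for u8 {
        fn as_value(self) -> u8 {
            self
        }
    }

    // A slice of bits is folded into the value.
    impl SetValueType for &[Bit] {
        fn as_value(self) -> u8 {
            self.iter()
                .copied()
                .fold(0u8, |acc, b| acc | (1u8 << (b as u8)))
        }
    }

    /// Accepts either form, the way the updated `set_value` does.
    fn value_of<V: SetValueType>(val: V) -> u8 {
        val.as_value()
    }

    fn slice_demo() -> u8 {
        // An explicit `&[Bit]` annotation is one alternative to `as_ref`.
        let bits: &[Bit] = &[Bit::PB0, Bit::PB2, Bit::PB5];
        value_of(bits)
    }
    ```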

    ## Dealing With All These Slices

    I don't like all these slices. They look weird and awkward. There's also a couple points when operating the TWI bus where
    the `TWCR` register is being replaced in both branches of an if-statement, but the difference is only one bit. For
    example, when handling a packet, you need to configure the register, but may not want to send an ACK signal. Currently
    you need to do something like this:

    ```rust
    if ack {
        TWCR::set_value([TWCR::TWEN, TWCR::TWIE, TWCR::TWINT, TWCR::TWEA].as_ref());
    } else {
        TWCR::set_value([TWCR::TWEN, TWCR::TWIE, TWCR::TWINT].as_ref());
    }
    ```

    It would be nice if I could build up the value, optionally set the `TWEA` bit, *then* set the register. In fact, it
    would be pretty great if I could do something like this while still retaining some measure of idiot-protection:

    ```rust
    let mut bits = TWCR::TWEN | TWCR::TWIE | TWCR::TWINT;
    if ack {
    bits |= TWCR::TWEA;
    }
    TWCR::set_value(bits);
    ```

    You may notice that what I'm wanting to do is similar to the rather ugly mess I started with:

    ```rust
    let bits = (1 << TWCR::TWEN.bit_id()) | (1 << TWCR::TWEA.bit_id()) | (1 << TWCR::TWINT.bit_id());
    ```

    So what I need is some way to do the same but in a way that retains information about what register they came from. My
    solution was the `BitBuilder`. It should retain information about what bit type it's used for, so it needs to be generic
    over that. Because the bit type is now part of the overall type, we can make the internal field be the register's data
    type. We'll make that field private so it can't just be replaced with an arbitrary value.

    ```rust
    #[derive(Copy, Clone)]
    pub struct BitBuilder<T: NamedBits>(T::DataType);

    impl<T: NamedBits> BitBuilder<T> {
    pub fn new() -> Self {
    BitBuilder(T::DataType::ZERO)
    }

    fn set_bit(&mut self, b: T) {
    self.0 = self.0 | (T::DataType::ONE << b.bit_id());
    }
    }
    ```

    Now, to get the ORing behaviour I want, I can simply implement the `BitOr` and `BitOrAssign` traits for when the right
    hand side is both a Bit *of the same type* as the `BitBuilder`'s Bit type, and when it's another `BitBuilder` over the
    same Bit type:

    ```rust
    impl<T: NamedBits> BitOr<Self> for BitBuilder<T> {
    type Output = BitBuilder<T>;
    fn bitor(mut self, rhs: Self) -> Self::Output {
    self.0 = self.0 | rhs.0;
    self
    }
    }

    impl<T: NamedBits> BitOr<T> for BitBuilder<T> {
    type Output = BitBuilder<T>;
    fn bitor(mut self, rhs: T) -> Self::Output {
    self.set_bit(rhs);
    self
    }
    }

    impl<T: NamedBits> BitOrAssign<Self> for BitBuilder<T> {
    fn bitor_assign(&mut self, rhs: Self) {
    self.0 = self.0 | rhs.0;
    }
    }

    impl<T: NamedBits> BitOrAssign<T> for BitBuilder<T> {
    fn bitor_assign(&mut self, rhs: T) {
    self.set_bit(rhs);
    }
    }
    ```

    With that, I can now build up a bit pattern like this:

    ```rust
    let bits = BitBuilder::new() | DDRB::PB0 | DDRB::PB2 | DDRB::PB5;
    ```

    Much closer to what I want. Of course, I can't actually *use* it, because our `Register` doesn't take `BitBuilder`.
    Time to change every function that takes a slice, to instead take a `BitBuilder`:

    ```rust
    pub trait Register<T: RegisterType> {

    ...

    unsafe fn set_bits(bits: BitBuilder<Self::BitType>) {
    let val = Self::get_value();
    Self::set_value(val | bits.0);
    }

    unsafe fn clear_bits(bits: BitBuilder<Self::BitType>) {
    let val = Self::get_value();
    Self::set_value(val & !bits.0);
    }

    unsafe fn replace_bits(mask: BitBuilder<Self::BitType>, new_val: BitBuilder<Self::BitType>) {
    let reg_val = Self::get_value() & !mask.0;
    Self::set_value(reg_val | (new_val.0 & mask.0));
    }
    }
    ```

    I also need to implement `SetValueType` for the `BitBuilder` so that `set_value` can take it:

    ```rust
    impl<T: NamedBits> SetValueType<T::DataType> for BitBuilder<T> {
    fn as_value(self) -> T::DataType {
    self.0
    }
    }
    ```

    Now to get rid of that awkward construction at the beginning of the OR operation. The way to do that is to implement
    `BitOr` on the bit type itself, but instead of the output type being the same bit type, it's a `BitBuilder` over the bit
    type. The implementation is similar to `BitBuilder`:

    ```rust
    impl BitOr for PortBBits {
    type Output = BitBuilder<PortBBits>;
    fn bitor(self, rhs: Self) -> Self::Output {
    BitBuilder::new() | self | rhs
    }
    }

    impl BitOr<BitBuilder<PortBBits>> for PortBBits {
    type Output = BitBuilder<PortBBits>;
    fn bitor(self, rhs: BitBuilder<PortBBits>) -> Self::Output {
    rhs | self
    }
    }
    ```

    And now, finally, I can do what I wanted, and just OR together bits:

    ```rust
    DDRB::set_value(DDRB::PB0 | DDRB::PB2 | DDRB::PB5);
    ```
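
    For anyone who wants to poke at the operator overloading on its own, here is a cut-down sketch. It assumes a
    non-generic `BitBuilder` fixed to `u8` and a three-variant enum, which is simpler than the generic version above;
    `value` and `pattern` are hypothetical names.

    ```rust
    use core::ops::BitOr;

    #[derive(Copy, Clone)]
    enum PortBBits {
        PB0 = 0,
        PB2 = 2,
        PB5 = 5,
    }

    /// Accumulated bit pattern. The field is private, so outside this module
    /// the only way to build one is by ORing bits together.
    #[derive(Copy, Clone)]
    struct BitBuilder(u8);

    impl BitBuilder {
        fn value(self) -> u8 {
            self.0
        }
    }

    // bit | bit starts a builder.
    impl BitOr for PortBBits {
        type Output = BitBuilder;
        fn bitor(self, rhs: Self) -> BitBuilder {
            BitBuilder((1u8 << (self as u8)) | (1u8 << (rhs as u8)))
        }
    }

    // builder | bit extends it.
    impl BitOr<PortBBits> for BitBuilder {
        type Output = BitBuilder;
        fn bitor(self, rhs: PortBBits) -> BitBuilder {
            BitBuilder(self.0 | (1u8 << (rhs as u8)))
        }
    }

    /// Builds a pattern, optionally including PB5, before anything is written.
    fn pattern(with_pb5: bool) -> u8 {
        let mut bits = PortBBits::PB0 | PortBBits::PB2;
        if with_pb5 {
            bits = bits | PortBBits::PB5;
        }
        bits.value()
    }
    ```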

    One final problem is that if I try to do this:

    ```rust
    DDRB::set_value(DDRB::PB0);
    ```

    I get the following error:

    ```
    error[E0277]: the trait bound `PortBBits: register::SetValueType<u8>` is not satisfied
    --> src\main.rs:94:25
    |
    94 | DDRB::set_value(DDRB::PB0);
    | ^^^^^^^^^ the trait `register::SetValueType<u8>` is not implemented for `PortBBits`
    |
    ::: src\register.rs:107:5
    |
    107 | unsafe fn set_value<V: SetValueType<T>>(val: V) {
    | ----------------------------------------------- required by `register::Register::set_value`
    ```

    Which feels *really* inconsistent. So to fix that, `SetValueType` needs to be implemented for the bit type, too:

    ```rust
    impl SetValueType<u8> for PortBBits {
    fn as_value(self) -> u8 {
    1 << self.bit_id()
    }
    }
    ```

    And now it builds. You might be concerned at this point about how well this optimises. There is a fair bit of
    indirection now. But, the following:

    ```rust
    DDRB::set_value(DDRB::PB0 | DDRB::PB2 | DDRB::PB5);

    DDRB::set_bits(DDRB::PB0 | DDRB::PB2 | DDRB::PB5);
    DDRB::clear_bits(DDRB::PB0 | DDRB::PB2 | DDRB::PB5);
    DDRB::replace_bits(
    DDRB::PB0 | DDRB::PB2 | DDRB::PB5,
    DDRB::PB2 | DDRB::PB5
    );
    ```

    Compiles down to:

    ```asm
    // set_value
    ldi r24, 0x25
    out 0x04, r24
    // set_bits
    in r24, 0x04
    ori r24, 0x25
    out 0x04, r24
    // clear_bits
    in r24, 0x04
    andi r24, 0xDA
    out 0x04, r24
    // replace_bits
    in r24, 0x04
    andi r24, 0xDA
    ori r24, 0x24
    out 0x04, r24
    ```

    And also, because the bit type is a part of the `BitBuilder`, mixing bit types becomes a compile error:

    ```
    error[E0277]: no implementation for `PortBBits | TWCRBits`
    --> src\main.rs:148:30
    |
    148 | let bits = DDRB::PB5 | TWCR::TWEN;
    | ^ no implementation for `PortBBits | TWCRBits`
    |
    = help: the trait `core::ops::BitOr<TWCRBits>` is not implemented for `PortBBits`
    ```

    One issue remains, which is that this is accepted:

    ```rust
    let bits = DDRB::PB5 | DDRB::PB4;
    TWCR::set_value(bits);
    ```

    That is because the trait bound on `set_value` only requires that the register type match, with no way to ensure that
    the bit type also matches. A way to prevent that is to add a new function for setting a raw value, and change the
    `SetValueType` to be generic over the bit type, not the register type:

    ```rust
    pub trait SetValueType<T: NamedBits> {
    fn as_value(self) -> T::DataType;
    }

    pub trait Register<T: RegisterType> {

    ...

    unsafe fn set_raw_value(val: T) {
    Self::ADDR.write_volatile(val);
    }

    #[inline(always)]
    unsafe fn set_value<V: SetValueType<Self::BitType>>(val: V){
    let val = val.as_value();
    Self::ADDR.write_volatile(val);
    }
    }
    ```

    And now we get a nice compile error:

    ```
    error[E0277]: the trait bound `register::BitBuilder<PortBBits>: register::SetValueType<TWCRBits>` is not satisfied
    --> src\main.rs:90:25
    |
    90 | TWCR::set_value(bits);
    | ^^^^ the trait `register::SetValueType<TWCRBits>` is not implemented for `register::BitBuilder<PortBBits>`
    |
    ::: src\register.rs:103:5
    |
    103 | unsafe fn set_value<V: SetValueType<Self::BitType>>(val: V){
    | ----------------------------------------------------------- required by `register::Register::set_value`
    |
    = help: the following implementations were found:
    <register::BitBuilder<T> as register::SetValueType<T>>
    ```

    ## Declaration Ergonomics

    Ok, so I've got an abstraction for the registers and their associated bits which: is easy to use; is harder to get
    wrong compared to bare bitwise; isn't noisy; and compiles well. Everything's great, right? We just have to define the
    types for the register, and we're good to go!

    ```rust
    #[derive(Copy, Clone, Eq, PartialEq)]
    pub enum TWCRBits {
        TWIE,
        TWEN,
        TWWC,
        TWSTO,
        TWSTA,
        TWEA,
        TWINT,
    }

    impl NamedBits for TWCRBits {
        type DataType = u8;
        fn bit_id(self) -> u8 {
            match self {
                TWCRBits::TWIE => 0,
                TWCRBits::TWEN => 2,
                TWCRBits::TWWC => 3,
                TWCRBits::TWSTO => 4,
                TWCRBits::TWSTA => 5,
                TWCRBits::TWEA => 6,
                TWCRBits::TWINT => 7,
            }
        }
    }

    impl SetValueType<TWCRBits> for TWCRBits {
        fn as_value(self) -> u8 {
            1 << self.bit_id()
        }
    }

    impl BitOr for TWCRBits {
        type Output = BitBuilder<TWCRBits>;
        fn bitor(self, rhs: TWCRBits) -> Self::Output {
            BitBuilder::new() | self | rhs
        }
    }

    impl BitOr<BitBuilder<TWCRBits>> for TWCRBits {
        type Output = BitBuilder<TWCRBits>;
        fn bitor(self, rhs: BitBuilder<TWCRBits>) -> Self::Output {
            rhs | self
        }
    }

    pub struct TWCR;
    impl TWCR {
        pub const TWIE: TWCRBits = TWCRBits::TWIE;
        pub const TWEN: TWCRBits = TWCRBits::TWEN;
        pub const TWWC: TWCRBits = TWCRBits::TWWC;
        pub const TWSTO: TWCRBits = TWCRBits::TWSTO;
        pub const TWSTA: TWCRBits = TWCRBits::TWSTA;
        pub const TWEA: TWCRBits = TWCRBits::TWEA;
        pub const TWINT: TWCRBits = TWCRBits::TWINT;
    }

    impl Register<u8> for TWCR {
        const ADDR: *mut u8 = 0xBC as *mut u8;
        type BitType = TWCRBits;
    }
    ```

    If you're like me, you just pulled quite a face looking at that. There's *so much boilerplate!* And it's going to be the
    same for every register. There's a lot of repeated information, namely the register type (`u8`), and the name of the
    bit type (`TWCRBits`) along with all its variants. It would be tedious and error-prone to write this out for the dozens
    of registers needed. Fortunately, it's all very orderly, and, as mentioned, basically the same for every register.

    Let's tackle the bit enum first, given it's two thirds of the entire definition. How much information is really *needed*
    to construct this? The type name, the register type, the variant names, and which bits they are. Everything else is based
    on that, so if we discard all the other boilerplate, and poke it round a bit to look Rust-like, we end up with this:

    ```rust
    TWCRBits: u8 {
        TWIE = 0,
        TWEN = 2,
        TWWC = 3,
        TWSTO = 4,
        TWSTA = 5,
        TWEA = 6,
        TWINT = 7
    }
    ```

    That's all of the unique information needed. All the rest is boilerplate with copies of that information. Macros are
    perfect for this kind of boilerplate creation. I'll call this one `reg_named_bits`, and start on our token pattern.
    Looking at the definition above, the first token is an ident (`TWCRBits`), followed by a colon, followed by a type
    (`u8`), then a pair of braces. Inside the braces is a sequence of entries separated by commas, each one an ident
    (`TWIE`, etc.), followed by an equals sign, followed by an expression (`0`, etc.). That's not too complex to define:

    ```rust
    #[macro_export]
    macro_rules! reg_named_bits {
        (
            $name:ident : $type:ty {
                $( $bit:ident = $id:expr ),+ $(,)*
            }
        ) => {

        };
    }
    ```
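
    To get a feel for what this pattern captures before filling in the real body, here's a throwaway sketch. The macro
    name `dump_named_bits` and the helper `twcr_pairs` are made up for illustration and aren't part of the project; the
    macro just expands its captures into an array of name/index pairs, so the match can be checked on a desktop machine.

    ```rust
    // Hypothetical macro: same matcher as above, but the body only echoes the captures.
    macro_rules! dump_named_bits {
        (
            $name:ident : $type:ty {
                $( $bit:ident = $id:expr ),+ $(,)*
            }
        ) => {
            [ $( (stringify!($bit), $id as $type) ),+ ]
        };
    }

    /// Returns the captured (name, bit index) pairs for a cut-down TWCR.
    fn twcr_pairs() -> [(&'static str, u8); 3] {
        let pairs = dump_named_bits! {
            TWCRBits: u8 {
                TWIE = 0,
                TWEN = 2,
                TWINT = 7, // A trailing comma is accepted here.
            }
        };
        pairs
    }
    ```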

    The little `$(,)*` is so you can have an optional trailing comma. Now to actually use this information to generate the
    boilerplate. One thing we need to be aware of is namespaces. There's no guarantee that traits and types will be imported
    at the macro invocation site, so we need to fully qualify the traits and types we're using. The macro body is basically
    what we have above, except with the specific bits replaced with the pattern matches:

    ```rust
    #[macro_export]
    macro_rules! reg_named_bits {
        (
            $name:ident : $type:ty {
                $( $bit:ident = $id:expr ),+ $(,)*
            }
        ) => {
            #[derive(Copy, Clone, Eq, PartialEq)]
            pub enum $name {
                $( $bit ),*
            }

            impl crate::register::NamedBits for $name {
                type DataType = $type;
                fn bit_id(self) -> $type {
                    match self {
                        $( $name::$bit => $id ),*
                    }
                }
            }

            impl crate::register::SetValueType<$name> for $name {
                fn as_value(self) -> $type {
                    1 << self.bit_id()
                }
            }

            impl core::ops::BitOr for $name {
                type Output = crate::register::BitBuilder<$name>;
                fn bitor(self, rhs: $name) -> Self::Output {
                    crate::register::BitBuilder::new() | self | rhs
                }
            }

            impl core::ops::BitOr<crate::register::BitBuilder<$name>> for $name {
                type Output = crate::register::BitBuilder<$name>;
                fn bitor(self, rhs: crate::register::BitBuilder<$name>) -> Self::Output {
                    rhs | self
                }
            }
        };
    }
    ```

    Note the very verbose paths to the traits and `BitBuilder`. Another piece of boilerplate is the associated constants on
    the struct. This could be done with the third and final macro, but it would be nice if the registers on the same port
    (e.g. `PORTB`, `DDRB`, and `PINB`) could share the same bit type, as the bits mean the same thing. So with that in
    mind, we'll add another macro. This one will be similar to the first, with an input that looks like this:

    ```rust
    TWCR: TWCRBits {
        TWIE,
        TWEN,
        TWWC,
        TWSTO,
        TWSTA,
        TWEA,
        TWINT
    }
    ```

    That one's just ident, colon, ident, brace, comma-separated ident sequence, brace. And the output should be an
    impl on the struct with the constants:

    ```rust
    #[macro_export]
    macro_rules! reg_bit_consts {
        (
            $struct_name:ident : $bits_name:ident {
                $( $bit:ident ),+ $(,)*
            }
        ) => {
            impl $struct_name {
                $( pub const $bit: $bits_name = $bits_name::$bit; )*
            }
        }
    }
    ```
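
    As a sketch of the sharing this allows, here's how three port registers might reuse one bit type. The `PortBBits`
    enum is written out by hand and cut down to two pins to keep the example short, and the unit structs stand in for the
    real register types; the macro is the one above, minus `#[macro_export]`.

    ```rust
    macro_rules! reg_bit_consts {
        (
            $struct_name:ident : $bits_name:ident {
                $( $bit:ident ),+ $(,)*
            }
        ) => {
            impl $struct_name {
                $( pub const $bit: $bits_name = $bits_name::$bit; )*
            }
        };
    }

    // Hand-written stand-in for the enum that `reg_named_bits!` would generate.
    #[derive(Copy, Clone, Eq, PartialEq)]
    pub enum PortBBits {
        PB0,
        PB5,
    }

    // Unit structs standing in for the three registers of port B.
    pub struct PORTB;
    pub struct DDRB;
    pub struct PINB;

    // One bit type, three sets of associated constants.
    reg_bit_consts! { PORTB: PortBBits { PB0, PB5 } }
    reg_bit_consts! { DDRB: PortBBits { PB0, PB5 } }
    reg_bit_consts! { PINB: PortBBits { PB0, PB5, } }
    ```

    Because all three structs hand back the same `PortBBits` values, `DDRB::PB5` and `PINB::PB5` compare equal.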

    This has shrunk the definitions down a lot, but there's still repeated information about the bit names. In most cases
    this could be condensed further by declaring another macro. However, this third macro needs to take into account three
    cases:

    * No bits.
    * Associating with an existing bits definition.
    * Fully generating the bits.

    All three of these cases share some of the same information: the register name (ident), the register type, and the
    register address (expression). So a similar structure to above can be used for this too:

    ```rust
    TWCR: u8 {
        addr: 0xBC
    }
    ```

    This is enough for the first case, but what about the other two? The most obvious solution to me was to have
    another part of the pattern, which names the bit type:

    ```rust
    TWCR: u8 {
        addr: 0xBC,
        bits: TWCRBits
    }
    ```

    That handles the second case. For the third case, this could be extended with the same list from the first macro
    defined earlier:

    ```rust
    TWCR: u8 {
        addr: 0xBC,
        bits: TWCRBits {
            TWIE = 0,
            TWEN = 2,
            TWWC = 3,
            TWSTO = 4,
            TWSTA = 5,
            TWEA = 6,
            TWINT = 7
        }
    }
    ```

    So, that's three cases, all with simple patterns:

    ```rust
    #[macro_export]
    macro_rules! reg {
        (
            $name:ident : $type:ty {
                addr: $addr:expr $(,)*
            }
        ) => {

        };

        (
            $name:ident : $type:ty {
                addr: $addr:expr,
                bits: $bits_name:path $(,)*
            }
        ) => {

        };

        (
            $name:ident : $type:ty {
                addr: $addr:expr,
                bits: $bits_name:ident {
                    $( $bit:ident = $id:expr ),+ $(,)*
                }
            }
        ) => {

        };
    }
    ```

    One thing to note is that in the second case, the `$bits_name` is a `path`, not an `ident`. This allows the user to
    name a type in another module (e.g. `crate::register::NoBits`). The first pattern is trivial to implement; we just
    forward to the second pattern, specifying that the bits type is the `NoBits` type we defined earlier. This is why we
    needed path support for the second pattern.

    ```rust
    (
        $name:ident : $type:ty {
            addr: $addr:expr $(,)*
        }
    ) => {
        reg! {
            $name: $type {
                addr: $addr,
                bits: crate::register::NoBits<$type>,
            }
        }
    };
    ```
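
    The difference between the two fragment types is easy to check in isolation. In this sketch, `HasBits`, `with_bits!`,
    and the `bits` module are hypothetical names invented for the example; only the `$bits_name:path` matcher mirrors the
    real macro. A multi-segment path like `bits::NoBits` matches `path`, but wouldn't match `ident`.

    ```rust
    // Hypothetical module holding a placeholder bit type.
    pub mod bits {
        pub struct NoBits;
    }

    // Hypothetical trait, standing in for the `BitType` part of `Register`.
    pub trait HasBits {
        type BitType;
    }

    macro_rules! with_bits {
        ( $name:ident : $bits_name:path ) => {
            pub struct $name;
            impl HasBits for $name {
                type BitType = $bits_name;
            }
        };
    }

    // A path with more than one segment is accepted by the `path` fragment.
    with_bits!(UDR0: bits::NoBits);

    // Compiles only if `UDR0`'s bit type really is `bits::NoBits`.
    fn udr0_bits() -> <UDR0 as HasBits>::BitType {
        bits::NoBits
    }
    ```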

    The second is where the struct and `Register` implementation live:

    ```rust
    (
        $name:ident : $type:ty {
            addr: $addr:expr,
            bits: $bits_name:path $(,)*
        }
    ) => {
        pub struct $name;
        impl crate::register::Register<$type> for $name {
            const ADDR: *mut $type = $addr as *mut $type;
            type BitType = $bits_name;
        }
    };
    ```

    And finally, the third pattern. This one forwards to the second pattern, as well as calling out to the two macros
    defined earlier:

    ```rust
    (
        $name:ident : $type:ty {
            addr: $addr:expr,
            bits: $bits_name:ident {
                $( $bit:ident = $id:expr ),+ $(,)*
            }
        }
    ) => {
        reg_named_bits! {
            $bits_name: $type {
                $( $bit = $id ),+
            }
        }

        reg! {
            $name: $type {
                addr: $addr,
                bits: $bits_name,
            }
        }

        reg_bit_consts! {
            $name : $bits_name {
                $( $bit ),+
            }
        }
    };
    ```

    With these three macros, the complete register definition listed earlier is now significantly smaller, and easier to read:

    ```rust
    reg! {
        TWCR: u8 {
            addr: 0xBC,
            bits: TWCRBits {
                TWIE = 0,
                TWEN = 2,
                TWWC = 3,
                TWSTO = 4,
                TWSTA = 5,
                TWEA = 6,
                TWINT = 7
            }
        }
    }
    ```
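
    As a quick sanity check on those bit positions, here's a host-side sketch that builds a mask by hand, without any of
    the register machinery. The `mask_of` helper and the constants are made up for this example. Setting `TWINT`, `TWSTA`,
    and `TWEN` together is the usual way to request a START condition on the TWI bus, and with the positions above that
    works out to `0xA4`.

    ```rust
    // Bit positions copied from the `reg!` definition of TWCR.
    const TWEN: u8 = 2;
    const TWSTA: u8 = 5;
    const TWINT: u8 = 7;

    /// Hypothetical helper: ORs together `1 << id` for every bit index given.
    fn mask_of(bit_ids: &[u8]) -> u8 {
        bit_ids.iter().fold(0u8, |acc, &id| acc | (1u8 << id))
    }
    ```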

    ## Final Example

    To show it all in use, here's code declaring the registers for the USART, and then using them to send a string over
    serial:

    ```rust
    /// The CPU clock frequency in Hz.
    const CPU_FREQ: u32 = 16_000_000;
    /// The baud rate we'll use for serial.
    const BAUD_RATE: u32 = 9600;
    /// The calculated value to put into the UBRR register to set the baud rate.
    /// Dividing by 8 (rather than 16) assumes the 2x speed bit (`U2X0`) is set, as it is below.
    const UBRR_VAL: u16 = ((CPU_FREQ / 8 / BAUD_RATE) - 1) as u16;

    reg! {
        UCSR0A: u8 {
            addr: 0xC0,
            bits: UCSR0ABits {
                MPCM0 = 0,
                U2X0 = 1,
                UDRE0 = 5,
            }
        }
    }

    reg! {
        UCSR0B: u8 {
            addr: 0xC1,
            bits: UCSR0BBits {
                TXEN0 = 3,
                RXEN0 = 4,
            }
        }
    }

    reg! {
        UCSR0C: u8 {
            addr: 0xC2,
            bits: UCSR0CBits {
                UCSZ00 = 1,
                UCSZ01 = 2
            }
        }
    }

    reg! {
        UDR0: u8 {
            addr: 0xC6,
        }
    }

    reg! {
        UBRR0: u16 {
            addr: 0xC4,
        }
    }

    #[no_mangle]
    extern "C" fn main() {
        unsafe {
            // Set our baud rate.
            UBRR0::set_raw_value(UBRR_VAL);

            // Configure for:
            // * 2x speed
            // * 8-bit characters
            // * 1 stop bit
            // * No parity
            // * Async mode
            // * Enable RX/TX
            UCSR0A::set_value(UCSR0A::U2X0 | UCSR0A::MPCM0);
            UCSR0B::set_value(UCSR0B::RXEN0 | UCSR0B::TXEN0);
            UCSR0C::set_value(UCSR0C::UCSZ01 | UCSR0C::UCSZ00);

            let message = "Hello World!";

            message.bytes().for_each(|b| {
                // Wait for the data register to become available.
                while !UCSR0A::get_bit(UCSR0A::UDRE0) {}

                // Stick the byte into the buffer to send it.
                UDR0::set_raw_value(b);
            });
        }
    }
    ```
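
    The baud rate arithmetic is worth a second look, since the divisor depends on whether `U2X0` is set. This sketch uses
    a hypothetical `ubrr_for` helper (not part of the code above) that follows the same truncating integer formula as the
    constant in the example, dividing by 8 in 2x mode and by 16 otherwise:

    ```rust
    /// Hypothetical helper mirroring the constant's formula.
    /// Integer division truncates, so the result is rounded down.
    const fn ubrr_for(cpu_freq: u32, baud: u32, double_speed: bool) -> u16 {
        let divisor = if double_speed { 8 } else { 16 };
        ((cpu_freq / divisor / baud) - 1) as u16
    }

    // 16 MHz clock at 9600 baud, with and without the 2x speed bit.
    const UBRR_2X: u16 = ubrr_for(16_000_000, 9600, true);
    const UBRR_1X: u16 = ubrr_for(16_000_000, 9600, false);
    ```

    With a 16 MHz clock and 9600 baud this gives 207 in 2x mode and 103 in normal mode.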