Background:
- When sharing data, we need to provide "mutex"-like behavior to make sure we don't allow simultaneous mutable access
- When sharing data on a platform with threads, it is reasonable to use a spinlock for short-held coordination locks
- When sharing data on a platform with interrupts, it is NOT reasonable to use a spinlock, as the pre-emptee could be holding the lock, meaning that the pre-empter would deadlock
- When sharing data on a platform with interrupts, it is ONLY reasonable to use a critical section, to prevent pre-emption
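As a concrete reference point, the "spinlock for short-held coordination" above is just a test-and-set loop. A hedged, std-only sketch (an embedded version uses the same atomics, without threads, and is exactly the thing that deadlocks if an interrupt pre-empts the holder):

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

// A classic test-and-set spinlock: fine between threads (or cores)
// that always make independent progress, but a deadlock hazard the
// moment a pre-empter can interrupt the current lock holder.
struct SpinLock(AtomicBool);

impl SpinLock {
    const fn new() -> Self {
        SpinLock(AtomicBool::new(false))
    }

    fn lock(&self) {
        // Spin until we win the false -> true transition
        while self
            .0
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
    }

    fn unlock(&self) {
        self.0.store(false, Ordering::Release);
    }
}

static COUNT: AtomicU32 = AtomicU32::new(0);

fn main() {
    let lock = Arc::new(SpinLock::new());
    let mut handles = vec![];
    for _ in 0..4 {
        let l = lock.clone();
        handles.push(thread::spawn(move || {
            for _ in 0..1000 {
                l.lock();
                // Critical region: the lock's Acquire/Release ordering
                // makes this read-modify-write exclusive across threads.
                let v = COUNT.load(Ordering::Relaxed);
                COUNT.store(v + 1, Ordering::Relaxed);
                l.unlock();
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(COUNT.load(Ordering::Relaxed), 4000);
    println!("count = {}", COUNT.load(Ordering::Relaxed));
}
```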
- There is a portable library for providing access to critical sections, the `critical-section` crate
  - The `critical-section` crate provides this interface, but it ONLY provides a re-entrant interface: only proving "pre-emption is guaranteed not to happen", and NOT guaranteeing exclusive access to a resource
- On the topic of interrupts, there are actually two relevant use cases:
  - When we have "thread mode" and "raw interrupts":
    - In this case, we MUST hold a critical-section + mutex to guard data, as we must not attempt to lock the mutex within an interrupt while the mutex is actively taken
    - The interrupt must not try to take the mutex twice within the ISR: it must be a NON-REENTRANT mutex
  - When we have "multiprio interrupts", e.g. multiple executors at multiple priority levels, where the high prio tasks CAN yield if blocked:
    - we can potentially decompose the mutex into two pieces:
      - An inner, critical section, blocking mutex, that ONLY guards two pieces of "quick to update" information:
        - "is there a guard live now?", e.g. is there currently access to the guarded content
        - The Waker or intrusive queue of wakers for tasks to be registered, awoken when access is available, or de-registered if the future is dropped
      - An outer, async mutex, which contains the above blocking mutex, as well as the `UnsafeCell<T>` containing the data
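That decomposition can be sketched in a hedged, std-only model: `std::sync::Mutex` stands in for the inner critical-section blocking mutex, and a single `Option<Waker>` stands in for the intrusive waker queue. Only the quick-to-update state lives behind the inner lock; the data sits outside it in the `UnsafeCell`. (This shows just the state handling that a real async mutex's poll/drop paths would wrap.)

```rust
use std::cell::UnsafeCell;
use std::sync::Mutex; // stand-in for the critical-section blocking mutex
use std::task::Waker;

// Inner state: ONLY the "quick to update" information.
struct LockState {
    taken: bool,
    waker: Option<Waker>, // a real impl would use an intrusive queue
}

// Outer async mutex: the blocking mutex plus the guarded data.
struct AsyncMutex<T> {
    state: Mutex<LockState>,
    data: UnsafeCell<T>,
}

unsafe impl<T: Send> Sync for AsyncMutex<T> {}

impl<T> AsyncMutex<T> {
    fn new(data: T) -> Self {
        Self {
            state: Mutex::new(LockState { taken: false, waker: None }),
            data: UnsafeCell::new(data),
        }
    }

    /// The fast path a future's `poll` would take: flip `taken` under
    /// the inner lock, which is held only for this brief update.
    fn try_lock(&self) -> Option<*mut T> {
        let mut st = self.state.lock().unwrap();
        if st.taken {
            None // a real future would register its Waker here and park
        } else {
            st.taken = true;
            Some(self.data.get())
        }
    }

    fn unlock(&self) {
        let mut st = self.state.lock().unwrap();
        st.taken = false;
        if let Some(w) = st.waker.take() {
            w.wake(); // hand access to a parked task
        }
    }
}

fn main() {
    let m = AsyncMutex::new(0u32);
    let p = m.try_lock().expect("first lock succeeds");
    assert!(m.try_lock().is_none(), "second lock fails while held");
    unsafe { *p += 1 }; // exclusive access while `taken` is set
    m.unlock();
    assert!(m.try_lock().is_some(), "lock available again");
}
```

The key property is that the inner lock is never held while user code runs against the data, so critical sections stay short even if a guard is held across an `.await`.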
- When we have multiple cores:
- We could get away with a spinlock mutex if none of the cores are pre-emptable, as the cores may make independent progress (as long as there is no other source of deadlock)
- But if we have pre-emption on either core, we're back to requiring a critical section, additionally one that works across cores (often using a hardware semaphore).
- There is a portable library for providing locking primitives, `lock-api`
  - It has a `RawMutex` trait that describes a NON-REENTRANT mutex, which provides an interface with a blocking `lock()` method
  - It has a `ReentrantMutex` struct, but it is generic over the generic `RawMutex` trait to provide that behavior
  - Neither of these is a good match for `critical-section` style APIs
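The mismatch is one of API shape. A hedged sketch (these traits are illustrative simplifications, not the crates' exact items): lock-api wants free-standing acquire and release calls, while critical-section's model is a closure whose restore token never escapes.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// lock-api shape: acquire and release are SEPARATE calls, so a guard
// can hold the lock across arbitrary code and release it later.
trait LockApiStyle {
    fn lock(&self);
    fn unlock(&self);
}

// critical-section shape: the entire locked region is a closure; the
// "restore interrupt state" token never escapes it, so there is no
// free-standing unlock() to hand to a guard.
trait CriticalSectionStyle {
    fn with<R>(&self, f: impl FnOnce() -> R) -> R;
}

struct Flag(AtomicBool);

impl LockApiStyle for Flag {
    fn lock(&self) {
        while self.0.swap(true, Ordering::Acquire) {
            std::hint::spin_loop();
        }
    }
    fn unlock(&self) {
        self.0.store(false, Ordering::Release);
    }
}

impl CriticalSectionStyle for Flag {
    // The scoped shape can always be built FROM lock/unlock...
    fn with<R>(&self, f: impl FnOnce() -> R) -> R {
        self.lock();
        let r = f();
        self.unlock();
        r
    }
}

fn main() {
    let flag = Flag(AtomicBool::new(false));
    // ...but going the other way means keeping the closure open for a
    // guard's entire lifetime, which is the impedance mismatch.
    let answer = flag.with(|| 42);
    assert_eq!(answer, 42);
    println!("scoped lock returned {answer}");
}
```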
- There is a portable library for providing async primitives, `embassy-sync`
  - It has a `RawMutex` trait that describes a REENTRANT mutex, intended to be used as the blocking inner-mutex for an outer async-mutex (as described above)
  - The `embassy-sync` `RawMutex` trait IS compatible with `critical-section`'s guarantees, and provides an implementation for this
  - `embassy-sync`'s `Mutex` is not necessarily suitable for use with raw interrupts: the critical section is only used to gate access to the "is taken" flag
  - `embassy-sync` also has an issue: it primarily uses an `AtomicWaker`-like structure for "parked" tasks
    - This means ONLY ONE task may be waiting at a time, or there will be "waker churn": the tasks will take turns kicking each other's waker out, fighting for the spot, burning CPU and causing odd-appearing ordering: it's a coin flip which task will "win" the race, rather than having a more reasonable FIFO access to the resource
    - "waker churn" is aggravated by "multiprio" use cases: the higher priority executor will pre-empt every time one of its tasks is "kicked out" of the atomic waker slot, causing additional wasted CPU and jitter in lower priority tasks
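The churn mechanism can be shown with a simplified, std-only model of a single-waker slot (this is NOT embassy-sync's actual `AtomicWaker` code; waking the evicted waker on replacement is the usual lost-wakeup mitigation, and it is exactly what produces the spurious wake-ups described above):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

// A single-slot waker store: registering a new waker evicts the old
// one and wakes it, so the evicted task retries (and churns).
struct WakerSlot(Mutex<Option<Waker>>);

impl WakerSlot {
    fn register(&self, w: &Waker) {
        let old = self.0.lock().unwrap().replace(w.clone());
        if let Some(old) = old {
            old.wake(); // evicted task gets a spurious wake-up
        }
    }
}

// A test waker that just counts how many times it was woken.
struct CountingWake(AtomicUsize);
impl Wake for CountingWake {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let slot = WakerSlot(Mutex::new(None));
    let a = Arc::new(CountingWake(AtomicUsize::new(0)));
    let b = Arc::new(CountingWake(AtomicUsize::new(0)));

    slot.register(&Waker::from(a.clone())); // task A parks
    slot.register(&Waker::from(b.clone())); // task B evicts A: A is
                                            // woken though the resource
                                            // is still unavailable
    assert_eq!(a.0.load(Ordering::SeqCst), 1);
    assert_eq!(b.0.load(Ordering::SeqCst), 0);
    println!("task A was spuriously woken by eviction");
}
```

With two waiters, each re-registration wakes the other, and which task eventually wins is a race rather than FIFO order.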
- There is a portable library for providing async primitives, `maitake-sync`
  - `maitake-sync` provides two core async primitives:
    - `WaitCell` - a "single waker" or "AtomicWaker"-like structure, which does not require a blocking inner-mutex
    - `WaitQueue` - an "intrusive queue" suitable for holding an unbounded number of wakers, but does require a blocking inner-mutex to allow tasks to de-register themselves
  - Today, it always uses a spinlock to provide the blocking inner-mutex behavior, making it unsuitable for use with interrupts or multiprio environments
  - Eliza (the primary author of maitake-sync) has considered using `lock-api` to allow parameterization of async primitives over different lock kinds. There is a mostly-complete branch implementing this
With all of that background information set, I can now outline my actual goals:
- `embassy-sync` needs some way to allow for multiple waiters
  - This isn't always or even often needed: in most cases a single waker slot is acceptable, more efficient, and less complex
  - However it IS sometimes needed, for example in MPSC-like cases, where multiple tasks may be waiting for access to a single resource (like room in a bounded queue to send)
  - It would be VERY possible to use `maitake-sync`'s `WaitQueue` as an existing implementation of an intrusive waiter queue, EXCEPT for the issue that maitake-sync always uses a spinlock
- `maitake-sync` needs some way to allow for different inner blocking mutexes
  - If this is not fixed, it cannot reasonably be used in multiprio cases
- It would be preferable if `embassy-sync` and `maitake-sync` used the SAME way of parameterizing over blocking locks, for consistency and compatibility
  - It was originally thought that `lock-api`'s traits were an option for this
  - But I now realize that there is an impedance mismatch here, and this route is not viable
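To make "the SAME way of parameterizing over blocking locks" concrete: embassy-sync's `RawMutex` is (roughly, from memory) a scoped trait like the sketch below, and extracting something of this shape would let any primitive take the lock kind as a type parameter. Names and the toy `Counter` are illustrative, not any crate's real API:

```rust
use std::cell::Cell;

// Sketch of an embassy-sync-style scoped raw mutex trait, the sort of
// thing that could live in a small, stable-ish crate of its own.
trait RawMutex {
    const INIT: Self;
    fn lock<R>(&self, f: impl FnOnce() -> R) -> R;
}

// A "no-op" impl suitable for single-threaded, thread-mode-only use;
// real impls would disable interrupts or take a hardware semaphore.
struct NoopRawMutex;
impl RawMutex for NoopRawMutex {
    const INIT: Self = NoopRawMutex;
    fn lock<R>(&self, f: impl FnOnce() -> R) -> R {
        f()
    }
}

// Any consumer needing a blocking inner-mutex (e.g. a WaitQueue) can
// then be generic over M instead of hard-coding a spinlock:
struct Counter<M: RawMutex> {
    raw: M,
    value: Cell<u32>,
}

impl<M: RawMutex> Counter<M> {
    fn new() -> Self {
        Self { raw: M::INIT, value: Cell::new(0) }
    }
    fn incr(&self) -> u32 {
        // short, scoped critical section around the quick update
        self.raw.lock(|| {
            let v = self.value.get() + 1;
            self.value.set(v);
            v
        })
    }
}

fn main() {
    let c: Counter<NoopRawMutex> = Counter::new();
    c.incr();
    assert_eq!(c.incr(), 2);
    println!("counter = 2");
}
```

Note the scoped `lock(f)` shape matches critical-section's closure model directly, which is exactly what the guard-based lock-api shape struggles with.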
That leaves us with three options that I can see, moving forward:
- Extract `embassy-sync`'s `RawMutex` trait into a separate, stable-ish crate
  - We can then refactor `embassy-sync` to depend on this crate
  - We can then refactor `maitake-sync` to depend on this crate
  - We can then consider depending on `maitake-sync` from within `embassy-sync` to allow for use of `WaitQueue`
- We move forward with using `lock-api` in `maitake-sync`
  - We can maybe implement a slightly hacky impl of the Lock API traits using a critical section, and hold it very carefully in a way that happens to work for `maitake-sync`
  - This will make it difficult or impossible for `embassy-sync` to use `maitake-sync` as a dep.
- I can give up, and walk into the sea