@korken89
Last active February 14, 2023 16:34
An RTIC monotonic based on RTC for nRF52 series
// RTIC Monotonic impl for the RTCs
use crate::hal::pac::{rtc0, RTC0, RTC1, RTC2};
pub use fugit::{self, ExtU32};
use rtic_monotonic::Monotonic;

pub struct MonoRtc<T: InstanceRtc> {
    overflow: u8,
    rtc: T,
}

impl<T: InstanceRtc> MonoRtc<T> {
    pub fn new(rtc: T) -> Self {
        unsafe { rtc.prescaler.write(|w| w.bits(0)) };

        MonoRtc { overflow: 0, rtc }
    }

    pub fn is_overflow(&self) -> bool {
        self.rtc.events_ovrflw.read().bits() == 1
    }
}

impl<T: InstanceRtc> Monotonic for MonoRtc<T> {
    type Instant = fugit::TimerInstantU32<32_768>;
    type Duration = fugit::TimerDurationU32<32_768>;

    const DISABLE_INTERRUPT_ON_EMPTY_QUEUE: bool = false;

    unsafe fn reset(&mut self) {
        self.rtc
            .intenset
            .write(|w| w.compare0().set().ovrflw().set());
        self.rtc
            .evtenset
            .write(|w| w.compare0().set().ovrflw().set());

        self.rtc.tasks_clear.write(|w| w.bits(1));
        self.rtc.tasks_start.write(|w| w.bits(1));
    }

    #[inline(always)]
    fn now(&mut self) -> Self::Instant {
        let cnt = self.rtc.counter.read().bits();
        let ovf = if self.is_overflow() {
            self.overflow.wrapping_add(1)
        } else {
            self.overflow
        } as u32;

        Self::Instant::from_ticks((ovf << 24) | cnt)
    }

    fn set_compare(&mut self, instant: Self::Instant) {
        let now = self.now();

        // `ticks()` is a `u32` for the 32-bit fugit types used above.
        const MIN_TICKS_FOR_COMPARE: u32 = 3;

        // Since the timer may or may not overflow based on the requested compare val, we check
        // how many ticks are left.
        //
        // Note: The RTC cannot have a compare value too close to the current timer value,
        // so we use the `MIN_TICKS_FOR_COMPARE` to set a minimum offset from now to the set value.
        let val = match instant.checked_duration_since(now) {
            // Too close to `now`: push the compare value out to `now + MIN_TICKS_FOR_COMPARE`.
            // The guard keeps the subtraction from underflowing.
            Some(x) if x.ticks() <= MIN_TICKS_FOR_COMPARE => {
                instant
                    .duration_since_epoch()
                    .ticks()
                    .wrapping_add(MIN_TICKS_FOR_COMPARE - x.ticks())
                    & 0xffffff
            }
            // Within one 24-bit counter period: will not overflow
            Some(x) if x.ticks() <= 0xffffff => instant.duration_since_epoch().ticks() & 0xffffff,
            // Will overflow or in the past, set the same value as after overflow to not get
            // extra interrupts
            _ => 0,
        };

        unsafe { self.rtc.cc[0].write(|w| w.bits(val)) };
    }

    fn clear_compare_flag(&mut self) {
        unsafe { self.rtc.events_compare[0].write(|w| w.bits(0)) };
    }

    #[inline(always)]
    fn zero() -> Self::Instant {
        Self::Instant::from_ticks(0)
    }

    fn on_interrupt(&mut self) {
        if self.is_overflow() {
            self.overflow = self.overflow.wrapping_add(1);
            self.rtc.events_ovrflw.write(|w| unsafe { w.bits(0) });
        }
    }
}

pub trait InstanceRtc: core::ops::Deref<Target = rtc0::RegisterBlock> {}
impl InstanceRtc for RTC0 {}
impl InstanceRtc for RTC1 {}
impl InstanceRtc for RTC2 {}
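
The counter extension in now() can be tried on a host machine with plain integers. The sketch below is not part of the gist: the function name extend_counter and its parameters are hypothetical and only model how the 8-bit software overflow count and the 24-bit hardware counter are combined into a 32-bit tick count, including the case where an overflow event is pending but on_interrupt has not run yet.

```rust
/// Hypothetical host-side model of `MonoRtc::now()` (not part of the gist).
/// `overflow` is the software overflow count, `overflow_pending` mirrors
/// `is_overflow()`, and `counter` is the 24-bit hardware counter value.
fn extend_counter(overflow: u8, overflow_pending: bool, counter: u32) -> u32 {
    // If the overflow event is set but not yet handled, count it already.
    let ovf = if overflow_pending {
        overflow.wrapping_add(1)
    } else {
        overflow
    } as u32;

    // Upper 8 bits: overflow count. Lower 24 bits: hardware counter.
    (ovf << 24) | (counter & 0x00ff_ffff)
}
```

For example, extend_counter(1, false, 5) and extend_counter(0, true, 5) both give (1 << 24) + 5, so a pending overflow is counted the same way as a handled one.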
@korken89
Author

@eflukx Hmm, I have not seen a case like this before.
Do you have a way so I could test it locally?
It does sound like either:

  1. There is a bug in the timer queue handling (however this is very well tested code)
  2. There is a bug in the monotonic
  3. A bug in how task spawns are handled in the app which causes a spawn to fail and the event loop breaks

@eflukx

eflukx commented Sep 24, 2022

There is a bug in the timer queue handling (however this is very well tested code)

Agreed, this seems unlikely... I figured this would probably be a known bug in RTIC, but I couldn't find any issues on the repo describing a similar problem. (I must confess I only did a QuickScan™️.)

There is a bug in the monotonic

This seems more likely, as it is provided outside the RTIC framework and as such contains self-maintained/custom code. (That's the reason I posted the issue here :) )

A bug in how task spawns are handled in the app which causes a spawn to fail and the event loop breaks

Could be, but unrelated (timer) tasks that worked perfectly fine before also stop being queued. Furthermore, when using a SysTick-based monotonic the problem doesn't arise. Also, for testing I reduced the tasks in question to the absolute minimum (essentially empty tasks apart from the (self-)spawning 'logic').

In the meantime... I made two new observations:

1. Adding some delay before setting the compare register (rtc.cc[0]) in the MonoRtc set_compare() function seems to alleviate the problem. Like this:

    cortex_m::asm::delay(1000); // <-- add some delay cycles
    unsafe { self.rtc.cc[0].write(|w| w.bits(val as u32)) };

I've had the app running overnight and it kept working correctly. Could it be there's some race condition when setting the compare register in the set_compare function?

The trait documentation specifically states that this should not be a problem, i.e. it's handled in the framework. But this could be a bug, of course... From the Monotonic trait:

    /// Set the compare value of the timer interrupt.
    ///
    /// **Note:** This method does not need to handle race conditions of the monotonic, the timer
    /// queue in RTIC checks this.
    fn set_compare(&mut self, instant: Self::Instant);

2. Some tasks are run (at some other interrupt?)

At first it seemed that all timed tasks stopped working altogether, but I noticed that tasks do run "sometimes". I have a (sensor-measuring) task that normally runs every 20 seconds, but in the "hanging" state it runs every ~1024 seconds. I figure this is probably caused by some other source triggering an (overflow) interrupt. There is a second RTC instance running from my BSP that is used for keeping uptime; it will be removed, but for now it is still there. Removing or disabling it does not alleviate the problem, so it seems unrelated to the issue at hand, but it does cause periodic overflow interrupts.

Do you have a way so I could test it locally?

I'll try to isolate the case for further dissection... :)

@eflukx

eflukx commented Sep 24, 2022

Do you have a way so I could test it locally?

I have created an isolated example that shows the problem (on my hardware it is very easily reproduced).
Please have a look at: https://github.com/eflukx/rtic-rtc-example/blob/rtc_fast_spawn/src/bin/rtic_rtc_spawn_fail.rs

Thanks in advance! 👍

@korken89
Author

@eflukx Hi, I've been running your code for a while to get a grip on the issue.
I adapted it by changing the HAL to nrf52832 (that's what I have at hand).

So far I've been able to get the error to happen for me as well. I'll give it a deeper look and see what the issue is.
We use this monotonic impl in production, so I'll probably look deeper at this on Monday at work as well. :)

@korken89
Author

I've added some debugging code that checks what instant we set the RTC to and at what time we exit the ISR.

    INFO  fast_task first spawned!
    └─ rtic_rtc_spawn_fail::app::fast_task @ src/bin/rtic_rtc_spawn_fail.rs:63
    DEBUG set_compare 98310, now 98308
    └─ rtic_nrf_rtc::monotonic_nrf52_rtc::{impl#1}::set_compare @ src/monotonic_nrf52_rtc.rs:66
    DEBUG isr, now 98309
    └─ rtic_nrf_rtc::monotonic_nrf52_rtc::{impl#1}::on_interrupt @ src/monotonic_nrf52_rtc.rs:81

So we set the compare to a value that is ~2 ticks into the future.
After that the handling takes some time, and we exit the ISR when there is 1 tick left.
Here is the weird part: We don't get any interrupt from the RTC!

I think there is some race condition in the RTC HW if the wait is too short.

@korken89
Author

Aha! The datasheet specifies that if you set the compare to a value that is within 2 cycles the COMPARE event may not happen.
So we need to add a check that sets the value to 2 or more ticks later.

@korken89
Author

Here is an updated set_compare that solves the issue.
Unfortunately, it does increase the minimum possible delay.

    fn set_compare(&mut self, instant: Self::Instant) {
        let now = self.now();

        const MIN_TICKS_FOR_COMPARE: u64 = 3;

        // Since the timer may or may not overflow based on the requested compare val, we check
        // how many ticks are left.
        let val = match instant.checked_duration_since(now) {
            Some(x) if x.ticks() <= 0xffffff && x.ticks() > MIN_TICKS_FOR_COMPARE => {
                instant.duration_since_epoch().ticks() & 0xffffff
            } 
            Some(x) => {
                (instant.duration_since_epoch().ticks() + (MIN_TICKS_FOR_COMPARE - x.ticks()))
                    & 0xffffff
            } 
            _ => 0, // Will overflow or in the past, set the same value as after overflow to not get extra interrupts
        } as u32;

        unsafe { self.rtc.cc[0].write(|w| w.bits(val)) };
    }

@korken89
Author

If you come up with a better fix, feel free to ping me!

@korken89
Author

Also, remember to set this flag as you are using an extended timer:

    const DISABLE_INTERRUPT_ON_EMPTY_QUEUE: bool = false;

@eflukx

eflukx commented Sep 25, 2022

Great to have a working solution! Having a minimum spawn-delay of 3 ticks (~10kHz) would work for me. (and not having an app that hangs "for no apparent reason" works for me as well.. 👍 )
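
To put a number on the 3-tick minimum, here is a small arithmetic sketch. It assumes the 32768 Hz tick rate from the monotonic's type parameters (prescaler 0); the constant and function names are made up for illustration.

```rust
/// RTC tick rate with the prescaler set to 0, as in the gist.
const TICK_HZ: u32 = 32_768;
/// Minimum compare distance used by the updated `set_compare`.
const MIN_TICKS_FOR_COMPARE: u32 = 3;

/// Shortest schedulable delay in nanoseconds (integer division, rounded down).
fn min_delay_ns() -> u64 {
    MIN_TICKS_FOR_COMPARE as u64 * 1_000_000_000 / TICK_HZ as u64
}

/// Highest re-spawn rate in Hz that the minimum delay allows (rounded down).
fn max_spawn_rate_hz() -> u32 {
    TICK_HZ / MIN_TICKS_FOR_COMPARE
}
```

This gives a minimum delay of roughly 91.5 µs, i.e. a maximum rate of about 10.9 kHz.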

Also, remember to set this flag as you are using an extended timer:

    const DISABLE_INTERRUPT_ON_EMPTY_QUEUE: bool = false;

Good catch! I did not even notice this (probably as it has a default value set in the trait). As the run queue in my specific use case is never empty, leaving it at the default didn't manifest as a real problem. Still, a foot gun lurking in the deep.. ;)

If you come up with a better fix, feel free to ping me!

Yep.. I'll dive into the Nordic datasheets and errata. I would expect this behavior probably to be documented in there somewhere...

What isn't exactly clear to me is why the added delay "solution" (using asm::delay()) seemed to solve the issue as well... (as the actual time-to-interrupt only becomes shorter by adding delay)

@korken89
Author

The delay solution works because RTIC's timer queue checks whether the time has already expired and, if so, sidesteps the interrupt handler. So if the remaining time is too short, RTIC catches that :)
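
A rough model of that check, as a sketch only: this is not RTIC's actual timer-queue code. The Timer trait, the Armed enum, the arm function and FakeTimer are all hypothetical names, and the model assumes the tick count does not wrap. It shows why extra time spent before or during the register write turns a "too close" compare into one that is handled in software instead of relying on the COMPARE event.

```rust
/// Hypothetical stand-in for the two monotonic methods that matter here.
trait Timer {
    fn now(&mut self) -> u32;
    fn set_compare(&mut self, instant: u32);
}

/// What the caller does after arming the compare register.
#[derive(Debug, PartialEq)]
enum Armed {
    /// The instant is still ahead: wait for the COMPARE interrupt.
    WaitForInterrupt,
    /// The instant has already passed: handle it in software right away.
    HandleNow,
}

/// Arm the timer, then re-read the time to catch an instant that expired
/// while the compare value was being written.
/// Assumption: tick counts do not wrap during the example.
fn arm<T: Timer>(timer: &mut T, instant: u32) -> Armed {
    timer.set_compare(instant);
    if timer.now() >= instant {
        Armed::HandleNow
    } else {
        Armed::WaitForInterrupt
    }
}

/// Fake timer for the example: `set_compare` takes `set_cost` ticks,
/// which models time passing while the compare value is written.
struct FakeTimer {
    ticks: u32,
    set_cost: u32,
}

impl Timer for FakeTimer {
    fn now(&mut self) -> u32 {
        self.ticks
    }

    fn set_compare(&mut self, _instant: u32) {
        self.ticks += self.set_cost;
    }
}
```

With a FakeTimer starting at tick 98308 and a target of 98310, a set_cost of 0 yields WaitForInterrupt (the case where the hardware may miss the event), while a set_cost of 5 yields HandleNow.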

@eflukx

eflukx commented Feb 14, 2023

In reaction to

Here is an updated set_compare that solves the issue.

There seems to be a logic error that can result in an overflow condition... consider

(instant.duration_since_epoch().ticks() + (MIN_TICKS_FOR_COMPARE - x.ticks()))

The match arm above that is conditional, so our code is executed only if the following evaluates to false:

Some(x) if x.ticks() <= 0xffffff && x.ticks() > MIN_TICKS_FOR_COMPARE => {

It is, however, also executed when x.ticks() > 0xffffff. In that case x.ticks() > MIN_TICKS_FOR_COMPARE holds as well, so the subtraction MIN_TICKS_FOR_COMPARE - x.ticks() results in an (arithmetic) overflow. Shouldn't just checking for x.ticks() > MIN_TICKS_FOR_COMPARE in the first match arm condition be enough?
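
A host-side sketch of that suggestion on plain u32 values, so it can be tried without hardware. The function compare_value and the COUNTER_MASK constant are hypothetical; the until parameter stands in for the result of instant.checked_duration_since(now). With only the x > MIN_TICKS_FOR_COMPARE guard, the second arm is reached only when x <= MIN_TICKS_FOR_COMPARE, so the subtraction cannot underflow. One consequence worth noting: instants more than one counter period away then get their masked value programmed instead of 0, which assumes the timer queue re-checks the time if the compare fires early.

```rust
const MIN_TICKS_FOR_COMPARE: u32 = 3;
/// The RTC counter and compare registers are 24 bits wide.
const COUNTER_MASK: u32 = 0x00ff_ffff;

/// Hypothetical model of the value written to the compare register.
/// `instant_ticks`: requested instant in ticks since the epoch.
/// `until`: ticks from now until the instant, or `None` if it is in the past.
fn compare_value(instant_ticks: u32, until: Option<u32>) -> u32 {
    match until {
        // Far enough ahead: use the requested instant as-is.
        Some(x) if x > MIN_TICKS_FOR_COMPARE => instant_ticks & COUNTER_MASK,
        // Too close: x <= MIN_TICKS_FOR_COMPARE here, so no underflow.
        Some(x) => instant_ticks.wrapping_add(MIN_TICKS_FOR_COMPARE - x) & COUNTER_MASK,
        // In the past: same value as after a counter overflow.
        None => 0,
    }
}
```

Using the numbers from the debug log above (now 98308, instant 98310, so 2 ticks left), compare_value(98_310, Some(2)) returns 98311, i.e. 3 ticks after now.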
