NuttX on Ox64: getprime + hello crashes

Read the article...

getprime + hello crashes on NuttX 12.4.0 RC0:

## Downloaded from NuttX 12.4.0 RC0: `wget -r -nH --cut-dirs=100 --no-parent https://dist.apache.org/repos/dist/dev/nuttx/12.4.0-RC0/`
## https://github.com/apache/nuttx/tree/nuttx-12.4.0-RC0
## https://github.com/apache/nuttx-apps/tree/nuttx-12.4.0-RC0

NuttShell (NSH) NuttX-12.4.0
nsh> uname -a
NuttX 12.4.0 96c2707737 Jan 15 2024 10:35:29 risc-v ox64
nsh> 
nsh> getprime
Set thread priority to 10
Set thread policy to SCHED_RR
Start thread #0
thread #0 started, looking for primes < 10000, doing 10 run(s)
thread #0 finished, found 1230 primes, last one was 9973
Done
getprime took 0 msec
nsh> 
nsh> hello
riscv_exception: EXCEPTION: Store/AMO page fault. MCAUSE: 000000000000000f, EPC: 0000000050208fcc, MTVAL: 0000000080200000
riscv_exception: PANIC!!! Exception = 000000000000000f

(See the NuttX Log)

What is EPC 5020_8fcc? Why is it writing to invalid address MTVAL 8020_0000?

0x5020_8fcc is the memset() function in NuttX Kernel: libs/libc/string/lib_memset.c

FAR void *memset(FAR void *s, int c, size_t n) {
...
nuttx/libs/libc/string/lib_memset.c:126
                  *(FAR uint64_t *)addr = val64;
    50208fc8:	40ce0eb3          	sub	t4,t3,a2
    50208fcc:	011eb023          	sd	a7,0(t4)

0x8020_0000 is the Virtual Address of the Dynamic Heap (malloc) for the hello app: boards/risc-v/bl808/ox64/configs/nsh/defconfig

CONFIG_ARCH_DATA_NPAGES=128
CONFIG_ARCH_DATA_VBASE=0x80100000
CONFIG_ARCH_HEAP_NPAGES=128
CONFIG_ARCH_HEAP_VBASE=0x80200000
CONFIG_ARCH_TEXT_NPAGES=128
CONFIG_ARCH_TEXT_VBASE=0x80000000
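As a quick sanity check (assuming the standard 4 KiB RISC-V page size; the macro names below are illustrative, not from NuttX), these settings give a 512 KiB App Heap at Virtual Address 0x8020_0000, which matches the heapsize=0x80000 we'll see in the up_addrenv_create logs later:

/* Hypothetical names, for the arithmetic only */
#define PGSIZE           4096                         /* 4 KiB pages */
#define ARCH_HEAP_NPAGES 128                          /* CONFIG_ARCH_HEAP_NPAGES */
#define ARCH_HEAP_SIZE   (ARCH_HEAP_NPAGES * PGSIZE)  /* 0x80000 = 512 KiB */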

Somehow the NuttX Kernel crashed while writing to the App Heap for hello, which should never happen!

NuttX Kernel crashes when initing App Heap

Who called memset()?

In the Stack Trace we search for 0x50?????? (Kernel Code Addresses). We see 0x5020_976e, which is in mm_initialize(): mm/mm_heap/mm_initialize.c

FAR struct mm_heap_s *mm_initialize(FAR const char *name, FAR void *heapstart, size_t heapsize) {
...
nuttx/mm/mm_heap/mm_initialize.c:245 (discriminator 2)
  DEBUGASSERT(MM_MIN_CHUNK >= MM_SIZEOF_ALLOCNODE);

  /* Set up global variables */
  memset(heap, 0, sizeof(struct mm_heap_s));
    50209762:	4581                	li	a1,0
    50209764:	2a800613          	li	a2,680
    50209768:	8522                	mv	a0,s0
    5020976a:	fb0ff0ef          	jal	ra,50208f1a <memset>
nuttx/mm/mm_heap/mm_initialize.c:239 (discriminator 2)
  heapstart = (FAR char *)heap_adj + sizeof(struct mm_heap_s);
    5020976e:	2a840993          	addi	s3,s0,680

And there's the memset()! This means NuttX Kernel was initing the App Heap for hello so...

  • NuttX Kernel calls mm_initialize()
  • Which calls memset()
  • Which failed to write to the App Heap Memory

Why did NuttX Kernel fail when initing the Application Heap for the NuttX App hello? Let's find out...

What if we run hello twice?

It works OK. (See the NuttX Log)

QEMU is OK

Does this happen with QEMU 64-bit RISC-V (rv-virt:nsh64), since the MMU Code is similar?

QEMU is OK (phew)...

NuttShell (NSH) NuttX-12.3.0-RC1
nsh> uname -a
NuttX 12.3.0-RC1 96c2707 Jan 15 2024 14:40:22 risc-v rv-virt
nsh> getprime
Set thread priority to 10
Set thread policy to SCHED_RR
Start thread #0
thread #0 started, looking for primes < 10000, doing 10 run(s)
thread #0 finished, found 1230 primes, last one was 9973
Done
getprime took 363 msec
nsh> hello
Hello, World!!
nsh>

QEMU is OK but Ox64 is not. Any recent changes to QEMU?

QEMU had a recent patch: rv-virt/knsh: Set correct RAM_START and RAM_SIZE

So we apply the same patch to Ox64: boards/risc-v/bl808/ox64/configs/nsh/defconfig

CONFIG_RAM_SIZE=2097152
CONFIG_RAM_START=0x50200000
CONFIG_RAM_VSTART=0x50200000

But the problem still happens.

What about Upstream NuttX for Ox64?

Same problem happens on Upstream NuttX:

## Based on
## https://github.com/apache/nuttx/tree/fe5ca39a79ec13e16290d8a903fcd8bb2f87b4b5
## https://github.com/apache/nuttx-apps/tree/8930743831dcae07c2d52b04b3f4266a521c3f2a
NuttShell (NSH) NuttX-12.4.0-RC0
nsh> uname -a
NuttX 12.4.0-RC0 fe5ca39 Jan 15 2024 12:2:37 risc-v ox64
nsh> 
nsh> getprime
Set thread priority to 10
Set thread policy to SCHED_RR
Start thread #0
thread #0 started, looking for primes < 10000, doing 10 run(s)
thread #0 finished, found 1230 primes, last one was 9973
Done
getprime took 15404 msec
nsh> 
nsh> hello
riscv_exception: EXCEPTION: Store/AMO page fault. MCAUSE: 000000000000000f, EPC: 0000000050209134, MTVAL: 0000000080200000
riscv_exception: PANIC!!! Exception = 000000000000000f

BL808 Timer

Is Latest Upstream NuttX different from 12.4.0 RC0?

The BL808 Timer Implementation is missing from 12.4.0 RC0 but present in the latest build. Since the latest build also crashes, this problem is not due to the Timer Implementation. We need more analysis.

Why do we need BL808 Timer?

The BL808 Timer is needed for ostest to succeed, so 12.4.0 RC0 is somewhat incomplete for Ox64. (See the ostest Log for Upstream NuttX)

Since the BL808 Timer is missing from 12.4.0 RC0 and since getprime + hello crashes, I suggest we proceed with the rollout of 12.4.0 RC0. The next release will have BL808 Timer and getprime + hello fixed.

MMU Log

Was the MMU Virtual Address Mapping clobbered when getprime was terminated?

Is there a problem swapping the MMU SATP Register?

Let's log the MMU and SATP Register like this: Log the MMU.

From the MMU Log we see the MMU Page Tables...

Page Tables for getprime:

## `getprime` maps Virtual Address 0x80200000 to Physical Address 0x506a6000
## Level 1 Page Table is at 0x5069d000
## Level 2 Page Table is at 0x5069e000
## Level 3 Page Table is at 0x506a5000
nsh> getprime
up_addrenv_create: textsize=0x27c0, datasize=0xc, heapsize=0x80000, addrenv=0x50409980
mmu_ln_setentry: ptlevel=1, lnvaddr=0x5069d000, paddr=0x5069e000, vaddr=0x80100000, mmuflags=0
...
mmu_ln_setentry: ptlevel=2, lnvaddr=0x5069e000, paddr=0x506a5000, vaddr=0x80200000, mmuflags=0
mmu_ln_setentry: ptlevel=3, lnvaddr=0x506a5000, paddr=0x506a6000, vaddr=0x80200000, mmuflags=0x16

Page Tables for hello:

## `hello` maps Virtual Address 0x80200000 to Physical Address 0x506a4000
## Level 1 Page Table is at 0x5069d000
## Level 2 Page Table is at 0x5069e000
## Level 3 Page Table is at 0x506a3000
nsh> hello
up_addrenv_create: textsize=0xf24, datasize=0xc, heapsize=0x80000, addrenv=0x50409940
mmu_ln_setentry: ptlevel=1, lnvaddr=0x5069d000, paddr=0x5069e000, vaddr=0x80100000, mmuflags=0
...
mmu_ln_setentry: ptlevel=2, lnvaddr=0x5069e000, paddr=0x506a3000, vaddr=0x80200000, mmuflags=0
mmu_ln_setentry: ptlevel=3, lnvaddr=0x506a3000, paddr=0x506a4000, vaddr=0x80200000, mmuflags=0x16

(Inside the Level 1 and 2 Page Tables for NuttX Apps)

(Inside the Level 3 Page Table for NuttX Apps)

Also from the MMU Log we see the MMU SATP Register...

MMU SATP Register for getprime

## SATP 5069d points to the `getprime` Level 1 Page Table at 0x5069d000
## We assume that SATP 50600 points to the NuttX Kernel Level 1 Page Table at 0x50600000
mmu_satp_reg: pgbase=0x5069d000, asid=0x0, reg=0x800000000005069d
up_addrenv_select: addrenv=0x50409980, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_select: addrenv=0x50409980, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_select: addrenv=0x50409980, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_select: addrenv=0x50409980, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
...
getprime took 0 msec
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
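To double-check, we can decode an SATP value ourselves. Below is a standalone sketch (plain C, not NuttX code) using the Sv39 SATP layout: MODE in Bits 60 to 63 (8 = Sv39), ASID in Bits 44 to 59, Physical Page Number in Bits 0 to 43...

#include <stdint.h>
#include <stdio.h>

int main(void) {
  uint64_t satp = 0x800000000005069dULL;      /* From the MMU Log above */
  uint64_t mode = satp >> 60;                 /* 8 means Sv39 */
  uint64_t asid = (satp >> 44) & 0xffff;      /* 0 */
  uint64_t ppn  = satp & ((1ULL << 44) - 1);  /* 0x5069d */

  /* Prints: mode=8 asid=0 pgbase=0x5069d000 */
  printf("mode=%llu asid=%llu pgbase=0x%llx\n",
         (unsigned long long)mode, (unsigned long long)asid,
         (unsigned long long)(ppn << 12));
  return 0;
}

The decoded Page Table Base 0x5069d000 matches the pgbase in the mmu_satp_reg log above.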

MMU SATP Register for hello

## SATP 5069d points to the `hello` Level 1 Page Table at 0x5069d000
## We assume that SATP 50600 points to the NuttX Kernel Level 1 Page Table at 0x50600000
mmu_satp_reg: pgbase=0x5069d000, asid=0x0, reg=0x800000000005069d
up_addrenv_select: addrenv=0x50409940, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_select: addrenv=0x50409940, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_select: addrenv=0x50409940, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
riscv_exception: EXCEPTION: Store/AMO page fault. MCAUSE: 000000000000000f, EPC: 0000000050209066, MTVAL: 0000000080200000
riscv_exception: PANIC!!! Exception = 000000000000000f

hello crashes after NuttX Kernel swaps MMU SATP Register to 5069d, which points to the Level 1 Page Table at 0x5069d000

Dump the Level 1 Page Tables

Is the Level 1 Page Table at 0x5069d000 clobbered?

Let's dump the Level 1 Page Tables inside up_addrenv_select.
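Here's a minimal sketch of the dump (the helper name and placement are assumptions, though lib_dumpbuffer() is NuttX's standard hex-dump helper and produces the format shown below):

#include <inttypes.h>
#include <stdint.h>
#include <debug.h>          /* _info() */
#include <nuttx/lib/lib.h>  /* lib_dumpbuffer() */

/* Sketch: call this inside up_addrenv_select() to dump the first 0x50
 * bytes of the Level 1 Page Table that the SATP Register points to */
static void dump_l1_page_table(const arch_addrenv_t *addrenv)
{
  /* Sv39 SATP: Bits 0 to 43 hold the PPN of the Level 1 Page Table */
  uintptr_t page_table = (addrenv->satp & ((1ULL << 44) - 1)) << 12;

  _info("page_table=0x%" PRIxPTR "\n", page_table);
  lib_dumpbuffer("*page_table", (FAR const uint8_t *)page_table, 0x50);
}

According to the Page Table Log...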

Level 1 Page Table for getprime: (0x5069d000)

up_addrenv_select: addrenv=0x50409980, satp=0x800000000005069d, page_table=0x5069d000
*page_table= (0x5069d000):
0000  e7 00 00 00 00 00 00 90 21 14 10 14 00 00 00 00  ........!.......
0010  01 78 1a 14 00 00 00 00 21 10 10 14 00 00 00 00  .x......!.......
0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
mmu_write_satp: reg=0x800000000005069d
Set thread priority to 10
Set thread policy to SCHED_RR
Start thread #0

Level 1 Page Table for hello: (0x5069d000)

up_addrenv_select: addrenv=0x50409940, satp=0x800000000005069d, page_table=0x5069d000
*page_table= (0x5069d000):
0000  e7 00 00 00 00 00 00 90 21 14 10 14 00 00 00 00  ........!.......
0010  01 78 1a 14 00 00 00 00 21 10 10 14 00 00 00 00  .x......!.......
0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
mmu_write_satp: reg=0x800000000005069d
riscv_exception: EXCEPTION: Store/AMO page fault. MCAUSE: 000000000000000f, EPC: 0000000050209066, MTVAL: 0000000080200000
riscv_exception: PANIC!!! Exception = 000000000000000f

The Level 1 Page Tables for getprime and hello are identical. So the hello Level 1 Page Table is not clobbered.

(8 bytes per Page Table Entry)

(Inside the Level 1 and 2 Page Tables for NuttX Apps)

(Inside the Level 3 Page Table for NuttX Apps)

How to get Level 2 Page Table?

Based on the Sv39 MMU spec, Page Table Entry Bits 10 to 53 contain the Physical Page Number of the next-level Page Table. (Physical Page Number = Physical Address >> 12)

So the third Page Table Entry above (Index 2, at Offset 0x10): 01 78 1a 14 00 00 00 00 (little-endian, i.e. 0x141a7801) gives Physical Page Number = 0x5069e, which means the Physical Address of the L2 Page Table is 0x5069_e000
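As a standalone sanity check (plain C, not NuttX code), here's that extraction applied to the entry above:

#include <stdint.h>
#include <stdio.h>

int main(void) {
  /* Page Table Entry from the dump: 01 78 1a 14 00 00 00 00 (little-endian) */
  uint64_t pte = 0x141a7801ULL;

  /* Bits 10 to 53 hold the Physical Page Number (PPN) */
  uint64_t ppn = (pte >> 10) & ((1ULL << 44) - 1);

  /* Physical Address = PPN << 12, for 4 KiB pages */
  /* Prints: ppn=0x5069e paddr=0x5069e000 */
  printf("ppn=0x%llx paddr=0x%llx\n",
         (unsigned long long)ppn, (unsigned long long)(ppn << 12));
  return 0;
}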

Dump the Level 2 and 3 Page Tables

What about the Level 2 and 3 Page Tables for getprime and hello?

This is how we dump the Level 2 and 3 Page Tables. From the L2 and L3 Page Table Log:

Level 2 and 3 Page Tables for getprime:

up_addrenv_select: l2_page_table=0x5069e000
*l2_page_table= (0x5069e000):
0000  01 7c 1a 14 00 00 00 00 01 94 1a 14 00 00 00 00  .|..............
0010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

up_addrenv_select: l3_page_table=0x506a5000
*l3_page_table= (0x506a5000):
0000  d7 98 1a 14 00 00 00 00 d7 9c 1a 14 00 00 00 00  ................
0010  d7 a0 1a 14 00 00 00 00 d7 a4 1a 14 00 00 00 00  ................
0020  d7 a8 1a 14 00 00 00 00 d7 ac 1a 14 00 00 00 00  ................
0030  d7 b0 1a 14 00 00 00 00 d7 b4 1a 14 00 00 00 00  ................
0040  d7 b8 1a 14 00 00 00 00 d7 bc 1a 14 00 00 00 00  ................

Level 2 and 3 Page Tables for hello:

up_addrenv_select: l2_page_table=0x5069e000
*l2_page_table= (0x5069e000):
0000  01 7c 1a 14 00 00 00 00 01 8c 1a 14 00 00 00 00  .|..............
0010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0030  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0040  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

up_addrenv_select: l3_page_table=0x506a3000
*l3_page_table= (0x506a3000):
0000  d7 90 1a 14 00 00 00 00 d7 94 1a 14 00 00 00 00  ................
0010  d7 98 1a 14 00 00 00 00 d7 9c 1a 14 00 00 00 00  ................
0020  d7 a0 1a 14 00 00 00 00 d7 a4 1a 14 00 00 00 00  ................
0030  d7 a8 1a 14 00 00 00 00 d7 ac 1a 14 00 00 00 00  ................
0040  d7 b0 1a 14 00 00 00 00 d7 b4 1a 14 00 00 00 00  ................

The Level 2 and 3 Page Tables for getprime and hello look similar. So the hello Level 2 and 3 Page Tables are not clobbered.
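As a head start on the first TODO below: the Flag Byte 0xd7 in every Level 3 Entry above decodes to sensible Sv39 flags (standard RISC-V Page Table Entry bits, shown here as a quick check)...

/* Sv39 PTE Flag Bits (low byte of each 8-byte entry) */
#define PTE_V (1 << 0)  /* Valid */
#define PTE_R (1 << 1)  /* Readable */
#define PTE_W (1 << 2)  /* Writable */
#define PTE_X (1 << 3)  /* Executable */
#define PTE_U (1 << 4)  /* User-accessible */
#define PTE_G (1 << 5)  /* Global */
#define PTE_A (1 << 6)  /* Accessed */
#define PTE_D (1 << 7)  /* Dirty */

/* 0xd7 = PTE_V | PTE_R | PTE_W | PTE_U | PTE_A | PTE_D:
 * a valid User Read/Write data page (no Execute), as expected for a Heap */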

TODO: Check the MMU Flags

TODO: Compare with QEMU SATP Log

Are the Address Environments created and destroyed correctly?

We logged up_addrenv_destroy. According to this log, the Address Environments are created and destroyed correctly, in the right sequence...

## `getprime`: Create Address Environment 0x50409980
nsh> getprime
up_addrenv_create: textsize=0x27c0, datasize=0xc, heapsize=0x80000, addrenv=0x50409980
...

## `getprime`: Destroy Address Environment 0x50409980
getprime took 0 msec
up_addrenv_select: addrenv=0x5040b830, satp=0x8000000000050600
mmu_write_satp: reg=0x8000000000050600
up_addrenv_destroy: addrenv=0x50409980

## `hello`: Create Address Environment 0x50409940
## Wonder why the address is smaller than `getprime`?
nsh> hello
up_addrenv_create: textsize=0xf24, datasize=0xc, heapsize=0x80000, addrenv=0x50409940
...
up_addrenv_select: addrenv=0x50409940, satp=0x800000000005069d
mmu_write_satp: reg=0x800000000005069d
riscv_exception: EXCEPTION: Store/AMO page fault. MCAUSE: 000000000000000f, EPC: 0000000050209066, MTVAL: 0000000080200000

We also enabled Memory Manager Logging, but it doesn't show anything interesting. (See the NuttX Log)

T-Head MMU

Does T-Head C906 need Special RISC-V Instructions to Flush the MMU, after swapping the MMU SATP Register?

This crash happened just after swapping MMU SATP Register from NuttX Kernel to NuttX App. Maybe C906 MMU wasn't flushed correctly after swapping?

Below is the Linux Errata for T-Head C906:

/*
 * th.dcache.ipa rs1 (invalidate, physical address)
 * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
 *   0000001    01010      rs1       000      00000  0001011
 * th.dcache.iva rs1 (invalidate, virtual address)
 *   0000001    00110      rs1       000      00000  0001011
 *
 * th.dcache.cpa rs1 (clean, physical address)
 * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
 *   0000001    01001      rs1       000      00000  0001011
 * th.dcache.cva rs1 (clean, virtual address)
 *   0000001    00101      rs1       000      00000  0001011
 *
 * th.dcache.cipa rs1 (clean then invalidate, physical address)
 * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
 *   0000001    01011      rs1       000      00000  0001011
 * th.dcache.civa rs1 (clean then invalidate, virtual address)
 *   0000001    00111      rs1       000      00000  0001011
 *
 * th.sync.s (make sure all cache operations finished)
 * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
 *   0000000    11001     00000      000      00000  0001011
 */
#define THEAD_INVAL_A0	".long 0x02a5000b"
#define THEAD_CLEAN_A0	".long 0x0295000b"
#define THEAD_FLUSH_A0	".long 0x02b5000b"
#define THEAD_SYNC_S	".long 0x0190000b"

#define THEAD_CMO_OP(_op, _start, _size, _cachesize)			\
asm volatile("mv a0, %1\n\t"						\
	     "j 2f\n\t"							\
	     "3:\n\t"							\
	     THEAD_##_op##_A0 "\n\t"					\
	     "add a0, a0, %0\n\t"					\
	     "2:\n\t"							\
	     "bltu a0, %2, 3b\n\t"					\
	     THEAD_SYNC_S						\
	     : : "r"(_cachesize),					\
		 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
		 "r"((unsigned long)(_start) + (_size))			\
	     : "a0")

TODO: Shall we call sync.s, dcache.ipa, dcache.cpa, dcache.cva, dcache.cipa, dcache.civa? They are documented in the C906 User Manual (Page 548)

TODO: Who calls ALT_CMO_OP?

https://github.com/torvalds/linux/blob/master/arch/riscv/mm/pmem.c#L20

https://github.com/torvalds/linux/blob/master/arch/riscv/mm/dma-noncoherent.c#L28

TODO: Who calls arch_wb_cache_pmem and arch_dma_cache_wback?

What if we disable the T-Head C906 Shareable MMU Flag?

Same result.

Found Workaround: Bypass the Destroying of Address Environment

What if we disable the deallocation of the App Page Tables? This might help to determine whether it's an MMU Cache Issue.

The MMU no longer crashes, yay! The T-Head C906 MMU was somehow reusing the Page Tables of the Previous Process, even though we had updated the MMU SATP Register to point to the New Page Tables.

This means we found an (awful) workaround for the MMU Problem: we Bypass the Destroying of the Address Environment, which prevents the Page Tables of the Previous Process (getprime) from getting deallocated. (See the sketch below)
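A hedged sketch of the workaround, assuming the bypass goes into NuttX's up_addrenv_destroy() (the exact placement in the port is an assumption):

/* Workaround sketch: skip freeing the App Page Tables (leaking them)
 * so the C906 MMU can never walk stale, freed Page Tables */
int up_addrenv_destroy(arch_addrenv_t *addrenv)
{
  DEBUGASSERT(addrenv);
  return OK;  /* Bypass: don't tear down the text / data / heap mappings */
}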

TODO: Why is the MMU still using the Page Tables of the Previous Process? When we have already updated the MMU SATP Register to point to the New Page Tables?

TODO: Is MMU still caching the old SATP Register? Try clearing the MMU Cache with the T-Head C906 Caching Instructions: sync.s, dcache.ipa, dcache.cpa, dcache.cva, dcache.cipa, dcache.civa (from above)

Next, write to the Offending Data Address 0x8020_0000: does it modify the Old App Heap (getprime) or the New App Heap (hello)? This confirms which App Page Tables are actually in effect. (Sketched below)
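A hedged sketch of that write test (the Physical Address 0x506a4000 comes from the hello Page Table dump above; the logging fragment assumes <inttypes.h> and is illustrative):

/* Poke the Virtual Address, then check whether hello's Physical Page
 * (the New App Heap at 0x506a4000) was the one that changed */
volatile uint64_t *vaddr = (volatile uint64_t *)0x80200000;
volatile uint64_t *paddr = (volatile uint64_t *)0x506a4000;

_info("Before Update: *paddr is 0x%" PRIx64 "\n", *paddr);
*vaddr = 1;

/* If hello's Page Tables are in effect, both should now read 0x1 */
_info("After Update: *vaddr is 0x%" PRIx64 ", *paddr is 0x%" PRIx64 "\n",
      *vaddr, *paddr);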

The writes to Virtual and Physical Addresses look OK

What if we restore the original code that Destroys the Address Environment?

NuttX crashes when reading the New App Heap (hello)

What if we Enforce Strong Ordering in T-Head C906 MMU for the RAM Regions?

NuttX won't boot:

Starting kernel ...

ABCmmu_ln_map_region: ptlevel=1, lnvaddr=0x50406000, paddr=0, vaddr=0, size=0x40000000, mmuflags=0x8000000000000026
mmu_ln_map_evel=2, lnvaddr=0x50404000, padd00, vaddr=0xe0000000, size=0x10000000, mmuflags=0x8000000000000026
mmu_ln_map_region: ptlevel=2, lnvaddr=0x50405000, paddr=0x50600000, vaddr=0x50600000, size=0x1400000, mmuflags=0x8000000000000026
mmu_satp_reg: pgbase=0x50406000, asid=0x0, reg=0x8000000000050406
mmu_write_satp: reg=0x8000000000050406

What if we set to No Cache?

https://github.com/torvalds/linux/blob/master/arch/riscv/include/asm/pgtable-64.h#L126-L142

// T-Head Memory Type Definitions in Linux
#define _PAGE_PMA_THEAD     ((1UL << 62) | (1UL << 61) | (1UL << 60))
#define _PAGE_NOCACHE_THEAD ((1UL << 61) | (1UL << 60))
#define _PAGE_IO_THEAD      ((1UL << 63) | (1UL << 60))
#define _PAGE_MTMASK_THEAD  (_PAGE_PMA_THEAD | _PAGE_IO_THEAD | (1UL << 59))

// [63:59] T-Head Memory Type definitions:
// Bit[63] SO  - Strong Order
// Bit[62] C   - Cacheable
// Bit[61] B   - Bufferable
// Bit[60] SH  - Shareable
// Bit[59] Sec - Trustable

// 00110 - NC:  Weakly-Ordered, Non-Cacheable, Bufferable, Shareable, Non-Trustable
// 01110 - PMA: Weakly-Ordered, Cacheable, Bufferable, Shareable, Non-Trustable
// 10010 - IO:  Strongly-Ordered, Non-Cacheable, Non-Bufferable, Shareable, Non-Trustable

We changed to No-Cache:

void bl808_kernel_mappings(void) {
  ...
  /* Map the kernel text and data for L2/L3 */
  #define _PAGE_NOCACHE_THEAD ((1UL << 61) | (1UL << 60))
  map_region(KFLASH_START, KFLASH_START, KFLASH_SIZE, MMU_KTEXT_FLAGS | _PAGE_NOCACHE_THEAD);
  map_region(KSRAM_START, KSRAM_START, KSRAM_SIZE, MMU_KDATA_FLAGS | _PAGE_NOCACHE_THEAD);

  /* Connect the L1 and L2 page tables for the kernel text and data */
  mmu_ln_setentry(1, PGT_L1_VBASE, PGT_L2_PBASE, KFLASH_START, PTE_G);

  /* Map the page pool */
  mmu_ln_map_region(2, PGT_L2_VBASE, PGPOOL_START, PGPOOL_START, PGPOOL_SIZE,
                    MMU_KDATA_FLAGS | _PAGE_NOCACHE_THEAD);
}

static int create_region(arch_addrenv_t *addrenv, uintptr_t vaddr,
                         size_t size, uint32_t mmuflags) {
  ...
  /* Then map the virtual address to the physical address */
  #define _PAGE_NOCACHE_THEAD ((1UL << 61) | (1UL << 60))
  mmu_ln_setentry(ptlevel + 1, ptlast, paddr, vaddr, mmuflags | _PAGE_NOCACHE_THEAD);

We see the same problem...

up_addrenv_select: l3_page_table=0x506a3000
*l3_page_table= (0x506a3000):
0000  d7 90 1a 14 00 00 00 30 d7 94 1a 14 00 00 00 30  .......0.......0
0010  d7 98 1a 14 00 00 00 30 d7 9c 1a 14 00 00 00 30  .......0.......0
0020  d7 a0 1a 14 00 00 00 30 d7 a4 1a 14 00 00 00 30  .......0.......0
0030  d7 a8 1a 14 00 00 00 30 d7 ac 1a 14 00 00 00 30  .......0.......0
0040  d7 b0 1a 14 00 00 00 30 d7 b4 1a 14 00 00 00 30  .......0.......0
up_addrenv_select: Virtual Address 0x80200000 maps to Physical Address 0x506a4000
up_addrenv_select: Before Update: *0x506a4000 is 0
riscv_exception: EXCEPTION: Load page fault. MCAUSE: 000000000000000d, EPC: 000000005020c036, MTVAL: 0000000080200000
riscv_exception: PANIC!!! Exception = 000000000000000d

Reminder: map_region's mmuflags parameter must be widened from 32-bit to 64-bit, so the T-Head Memory Type Flags in Bits 59 to 63 aren't truncated. (Sketch below)
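A hedged sketch of that widening (the exact signature and file are assumptions; in the Ox64 port, map_region is a helper in the BL808 MMU setup code):

/* Before: the T-Head Memory Type bits (Bits 59 to 63) are silently dropped */
static void map_region(uintptr_t paddr, uintptr_t vaddr, size_t size,
                       uint32_t mmuflags);

/* After: a 64-bit mmuflags preserves Bits 59 to 63 */
static void map_region(uintptr_t paddr, uintptr_t vaddr, size_t size,
                       uint64_t mmuflags);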

What if we call THEAD_SYNC_S?

int up_addrenv_select(const arch_addrenv_t *addrenv)
{
  DEBUGASSERT(addrenv && addrenv->satp);
  mmu_write_satp(addrenv->satp);

  // #define THEAD_SYNC_S	".long 0x0190000b"
  __asm__ __volatile__
    (  
      ".long 0x0190000b\n" // THEAD_SYNC_S
    );

Same problem

Found Solution: Flush the cache with DCACHE.CALL, DCACHE.IALL, ICACHE.IALL

After writing MMU SATP: What if we flush the MMU Cache with DCACHE.IALL?

// Flush the MMU Cache for T-Head C906
void weak_function mmu_flush_cache(void)
{
  __asm__ __volatile__
    ( 
      ".long 0x0020000b\n" // DCACHE.IALL: Invalidates all page table entries in the D-Cache.
      ".long 0x0190000b\n" // THEAD_SYNC_S: th.sync.s (make sure all cache operations finished)
    );  
}
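Here's a sketch of one way to wire in the flush, right after the SATP swap in up_addrenv_select() (per the excerpt earlier; where the upstream fix actually hooks it is not shown here):

int up_addrenv_select(const arch_addrenv_t *addrenv)
{
  DEBUGASSERT(addrenv && addrenv->satp);
  mmu_write_satp(addrenv->satp);
  mmu_flush_cache();  /* C906: drop stale cached Page Table Entries */
  return OK;
}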

It works yay! hello no longer crashes after getprime...

nsh> hello
up_addrenv_select: l3_page_table=0x506a3000
*l3_page_table= (0x506a3000):
0000  d7 90 1a 14 00 00 00 00 d7 94 1a 14 00 00 00 00  ................
0010  d7 98 1a 14 00 00 00 00 d7 9c 1a 14 00 00 00 00  ................
0020  d7 a0 1a 14 00 00 00 00 d7 a4 1a 14 00 00 00 00  ................
0030  d7 a8 1a 14 00 00 00 00 d7 ac 1a 14 00 00 00 00  ................
0040  d7 b0 1a 14 00 00 00 00 d7 b4 1a 14 00 00 00 00  ................
up_addrenv_select: Virtual Address 0x80200000 maps to Physical Address 0x506a4000
up_addrenv_select: Before Update: *0x506a4000 is 0
up_addrenv_select: Before Update: *0x80200000 is 0
up_addrenv_select: Expected Values: *0x80200000 is 0x1, *0x506a4000 is 0x1 (not 0xffffffffffffffff)
up_addrenv_select: Actual Values: *0x80200000 is 0x1, *0x506a4000 is 0x1

(DCACHE.CALL and ICACHE.IALL are not needed)

This fix has been Upstreamed to NuttX Mainline yay!

(See the Pull Request)
