Some random notes about Besta RTOS. These will probably end up on a wiki somewhere once Project Muteki becomes mostly usable.
NOPE. Not even close.
For unknown reasons, instead of using e.g. GCC, it seems that Besta decided to bend the MSVC CE toolchain to build applets for their own not-Windows-CE OS. It's speculated that coredll.dll was included solely for the compiler helpers it provides (namely soft floating point emulation, 64-bit arithmetic and integer division routines). As an unfortunate side effect, this also caused a lot of things to be (seemingly) broken, including C++ exceptions and threading/TLS, because Windows CE-specific helpers obviously do not work on a completely different OS.
For developing Besta RTOS applets with the Windows CE toolchain, a custom CRT0 must be used and coredll functions must NOT be used for anything unless the functions are OS-independent (like the previously mentioned helper routines). All syscalls must either go through sdklib/krnllib or be invoked via bare SVCs (e.g. with help from mutekishims).
There are also some Win32 API-looking routines provided by sdklib/krnllib, although there is no technical reason for them to be Windows-ish (other than offering some familiarity to Windows developers, sometimes with huge caveats).
Looking at the scheduler code, Besta RTOS seems to be based on a heavily modified uC/OS-II kernel with a drastically different OS API exposed to user code. The rest of the system seems to be developed in-house, sometimes utilizing various existing open-source (e.g. SQLite3, FFmpeg, YAFFS2) and closed-source (e.g. Voxware) components.
Maybe. I haven't tried it yet and I'm not a lawyer but maybe.
Diagnostic mode can be used to identify the board type (usually a 5-character identifier starting with BA, CA, EA, etc., which is different from the device model number) and verify the integrity of the system ROM. On some newer TLCS-900 based systems where a diagnostic menu is known to exist (tested on CA736), it can also be used to dump the system ROM currently installed on the on-board NAND flash.
Open the TAD app (it's usually called Service Home, 服务中心 (服務中心), etc. depending on your language settings) and type "diagnostic" using the keyboard. This works on all Arm-based devices (except for HP Prime since it does not have the TAD app installed) and some later devices based on the TLCS-900 architecture.
For HP Prime calculators, holding the C, F and O buttons while pressing the Reset button enters diagnostic mode.
Once called (in Arm mode), push r0 and lr to the stack in this exact order (needs 2 instructions) and initiate an SVC call with the desired syscall #.
Syscalls will not work directly in THUMB mode due to the instruction size limit. Interworking is needed in order to make syscalls from THUMB code.
Besta RTOS uses a single address space memory layout, with the kernel, all applets and shared libraries sharing the same address space. The MMU is not used even on SoCs that have one, so there's zero memory protection, meaning user space code can have direct access to hardware registers, etc. Beware that this also makes NULL a valid address, which makes NULL dereferences harder to debug.
At first it seemed that there might be separate heaps for kernel and userspace, but under further inspection this doesn't seem to be the case. There is, however, an AllocBlock() function that is stubbed in BA110L. What is it?
Applet executables are mainly in PE format, with ELF being an alternative option. The Windows CE subsystem type is not required, although elf2bestape would add it for consistency with some newer Besta RTOS applets. An applet can either be relocatable (true for most of the PE files) or loadable to an absolute base address, although an applet of the latter type is in practice only runnable if it is the "init" program (i.e. the first program to run after the OS is initialized).
It's unclear whether ELF applets support relocation or not, since the only ELF applet known to exist is Prime G1's armfir.elf and it loads to an absolute address.
Like applets, shared libraries can either be in PE format or ELF. The _start function seems to get ignored when loading them.
Like under Windows, putting a shared library in the same directory as the applet overshadows the system version. This could be used to e.g. trace syscalls.
The thread model seems to be very similar to uC/OS-II (down to the algorithm level almost source-line-to-source-line), although the public API is totally different.
A total of 64 threads can exist at the same time, 38 of which can be allocated directly via OSCreateThread().
The code to handle THUMB mode in the CPU context initializer, which is in stock uC/OS-II's Arm Generic port, seems to be missing. Does that mean THUMB mode is broken? (Maybe not, but using a THUMB function as an entry point might not work without workarounds, i.e. an interwork function or patching the saved CPSR. Given that we might need a stack aligner for EABI->OABI conversion anyway, this might not be so bad.)
Priority is implied by the natural order of the threads in the global thread table (uC/OS-II just calls this the priority table). Some slots in the table seem to be reserved (8 at the top and 18 at the bottom) and are not accessible by just allocating the thread with OSCreateThread(). Users can move threads to these reserved slots by calling the OSSetThreadPriority() function.
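The slot counts mentioned in these notes (64 total, 8 reserved at the top, 18 reserved at the bottom, 38 generally available) can be checked with a small host-side sketch. It assumes slot 0 is the highest priority, as in stock uC/OS-II, so that the "top" reserved slots are 0 to 7 and the "bottom" ones are 46 to 63; the notes do not confirm this numbering. All identifiers below are hypothetical and are not part of any Besta or muteki API.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures taken from the notes; the identifiers themselves are hypothetical. */
enum {
    THREAD_SLOTS_TOTAL = 64,
    THREAD_SLOTS_RESERVED_TOP = 8,
    THREAD_SLOTS_RESERVED_BOTTOM = 18,
    /* Slots left for plain OSCreateThread() allocation: 64 - 8 - 18 = 38. */
    THREAD_SLOTS_GENERAL = THREAD_SLOTS_TOTAL
                         - THREAD_SLOTS_RESERVED_TOP
                         - THREAD_SLOTS_RESERVED_BOTTOM
};

/* ASSUMPTION: slot 0 is the highest priority, as in stock uC/OS-II. */
bool thread_slot_is_reserved(int slot) {
    assert(slot >= 0 && slot < THREAD_SLOTS_TOTAL);
    return slot < THREAD_SLOTS_RESERVED_TOP
        || slot >= THREAD_SLOTS_TOTAL - THREAD_SLOTS_RESERVED_BOTTOM;
}
```

Under that assumption, a thread that should outrank everything created through OSCreateThread() would have to be moved into one of slots 0 to 7 with OSSetThreadPriority().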
The scheduler always executes the task that has the highest priority, so using OSSleep() is necessary to prevent one thread from holding the CPU for too long.
(TODO figure out if the thread can be yielded when waiting for IO)
1 jiffy is 1ms.
The OSSleep(jiffies) syscall calls the OSTimeDly() function in the uC/OS-II scheduler, which then puts the thread to sleep for the specified number of jiffies. Since 1 jiffy is 1ms in Besta RTOS, this in practice delays the thread for at most the specified number of milliseconds.
There's also a Delay() syscall that can delay beyond INT16_MAX jiffies, all within a single SVC call.
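When only OSSleep() is available, a long delay can be split into several shorter sleeps. The following is a minimal host-side sketch, assuming OSSleep() accepts at most INT16_MAX jiffies per call (the description of Delay() suggests this, but the notes do not state it outright). The sleep_fn_t type, sleep_chunked() and fake_sleep() are hypothetical names; the real OSSleep() prototype may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: stand-in for the OSSleep() signature; the real one may differ. */
typedef void (*sleep_fn_t)(int16_t jiffies);

/* Sleep for total_jiffies using chunks of at most INT16_MAX jiffies each.
 * Returns the number of sleep calls that were issued. */
unsigned sleep_chunked(uint32_t total_jiffies, sleep_fn_t sleep_fn) {
    unsigned calls = 0;
    assert(sleep_fn != NULL);
    while (total_jiffies > 0) {
        uint32_t chunk = total_jiffies > (uint32_t) INT16_MAX
                       ? (uint32_t) INT16_MAX
                       : total_jiffies;
        sleep_fn((int16_t) chunk);
        total_jiffies -= chunk;
        calls++;
    }
    return calls;
}

/* Host-side fake for experimentation: records the jiffies requested so far. */
static uint32_t fake_slept_total;
static void fake_sleep(int16_t jiffies) {
    fake_slept_total += (uint32_t) jiffies;
}
```

On a device, the fake would be replaced by a thin wrapper around OSSleep(). Each chunk is a separate syscall, so Delay() remains the cheaper option where it is available.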
Events are a stripped-down version of uC/OS-II mboxes. They don't have the ability to pass arbitrary message pointers like mboxes do.
Events have one extra flag compared to mboxes. Once set by OSCreateEvent(), it prevents the event flag from being cleared when an OSWaitForEvent() call completes without hitting a timeout or error.
Critical sections provide mutually exclusive access to shared resources between threads. They seem to be recursive (as the context struct seems to hold a copy of the reent struct whenever it enters from the same thread/has the same reent struct pointer).
When a thread acquires a free critical section, it only changes the state of that critical section and nothing else on the kernel side is touched. (Unless, of course, when another thread tries to acquire the same critical section. Then that thread will be set to wait for that critical section.)
They also seem to have some kind of index value and a byte array, whose purpose was unknown at first. These turn out to be the standard uC/OS-II thread wait states.
Since the context holds a copy of the current thread pointer, it is possible to use critical sections to know which thread is currently running. To do this, create a critical section locally first. This ensures that no other thread is acquiring it. After this, simply acquire the descriptor with OSEnterCriticalSection() and read out the pointer.
One safe implementation (4 syscalls) is shown as follows:
    #include <stddef.h>
    #include <muteki/threading.h>

    thread_t *get_current_thread(void) {
        thread_t *thr = NULL;
        critical_section_t mutex;

        OSInitCriticalSection(&mutex);
        OSEnterCriticalSection(&mutex);
        // The owner field now points to the calling thread's descriptor.
        thr = mutex.thr;
        OSLeaveCriticalSection(&mutex);
        OSDeleteCriticalSection(&mutex);
        return thr;
    }
There is also a faster but hackier way. It abuses an implementation detail of the critical section: no other resource is allocated and no other state is changed when a free critical section is acquired for the first time. By only barely initializing the critical section and calling OSEnterCriticalSection() without any cleanup, the number of syscalls required drops to just 1. This works on both CD-580+ and WuDi V7.
    #include <stddef.h>
    #include <muteki/threading.h>

    thread_t *get_current_thread(void) {
        critical_section_t mutex;

        // Magic is not checked so not needed here
        mutex.thr = NULL;
        mutex.refcount = 0;
        // Never left or deleted: acquiring a free section only touches the struct.
        OSEnterCriticalSection(&mutex);
        return mutex.thr;
    }
The error code is stored in the thread descriptor.
A code set via OSSetLastError() will have the flag 0x20000000 set when read back by _GetLastError().
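A minimal sketch of handling that flag on the reading side follows. The macro and function names are hypothetical and do not come from muteki/errno.h; only the value 0x20000000 is taken from the notes. The sketch assumes the flag is simply OR-ed into the stored code, which has not been verified.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag observed on codes set through OSSetLastError(); the name is hypothetical. */
#define ERR_FLAG_SET_BY_USER 0x20000000u

/* True if the raw value read back carries the flag. */
static inline bool err_was_set_by_user(uint32_t raw) {
    return (raw & ERR_FLAG_SET_BY_USER) != 0;
}

/* Recover the code that was originally passed in, with the flag removed. */
static inline uint32_t err_strip_flag(uint32_t raw) {
    return raw & ~ERR_FLAG_SET_BY_USER;
}

/* ASSUMPTION: the value expected on read-back is the code with the flag OR-ed in. */
static inline uint32_t err_expected_readback(uint32_t code) {
    return code | ERR_FLAG_SET_BY_USER;
}
```

For example, under that assumption, setting code 5 would read back as 0x20000005, and err_strip_flag() would turn it back into 5.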
See muteki/errno.h for error codes documented by parsing the FormatMessage() string table.
See hca.xxdm
Would be useful for e.g. emulators.
GetActiveVRamAddress() returns a framebuffer descriptor. It includes the framebuffer as well as its format. This could be a potential way of accessing the framebuffer with e.g. a software renderer that has no ties to the kernel.
TODO.
(OpenPCMCodec and ClosePCMCodec look suspicious)
This can be used to e.g. simplify syscall black box testing or implement untethered other OS booting.
Create the file C:\SYSTEM\DESKTOP.INI with DOS (CRLF) line endings and put

    [DESKTOP SETTING]
    ENTRY = <dos-8.3-path-to-exe-you-want-to-run>

into the file.
WARNING: This will replace the home screen with the file you specified and might cause the system to not boot properly. If this happens, a full system reset (clearing settings and wiping the C: drive) will fix it, although it will erase all data in system memory and settings. Alternatively, if chainloading a secondary program is possible, you can also run a program that can help you recover from this situation (e.g. using \\.\EXPLORE.ROM to delete the ini file and reboot). Note that using \\.\EXPLORE.ROM will not be possible on most systems without a loader that strips the v4 args.
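Since the file needs DOS line endings, it may be easier to generate its contents than to rely on an editor. The sketch below formats the two lines with explicit CRLF terminators into a buffer; format_desktop_ini() is a hypothetical helper and the path in the usage note is only an illustration. Writing the buffer to C:\SYSTEM\DESKTOP.INI is left out, and the warning above applies to any file produced this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format DESKTOP.INI contents with explicit CRLF line endings.
 * Returns what snprintf() returns: the length the full text needs,
 * excluding the NUL terminator (output is truncated if size is too small). */
int format_desktop_ini(char *buf, size_t size, const char *entry_path) {
    assert(buf != NULL && entry_path != NULL);
    return snprintf(buf, size,
                    "[DESKTOP SETTING]\r\n"
                    "ENTRY = %s\r\n",
                    entry_path);
}
```

Calling format_desktop_ini(buf, sizeof(buf), "C:\\APPS\\HELLO.EXE") fills buf with the section header and the ENTRY line, each terminated by CRLF. Comparing the return value against the buffer size tells whether the text was truncated.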
PATH_MAX is 256 UTF-16 CUs (512 bytes) including the NUL terminator. For the 8.3 paths used by CreateFile(), PATH_MAX is 80 bytes including the NUL terminator. For 8.3 paths in CWD, PATH_MAX seems to be 64 bytes including the NUL terminator.
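These limits can be turned into a simple length check before a path is handed to the OS. This is a minimal sketch: the constant names and dos_path_fits() are hypothetical, the numbers are the ones listed above, and the 64-byte CWD limit is itself only an observation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Limits from the notes, all counting the NUL terminator; names are hypothetical. */
enum {
    BESTA_PATH_MAX_UTF16 = 256, /* UTF-16 code units (512 bytes) */
    BESTA_PATH_MAX_DOS = 80,    /* bytes, 8.3 paths passed to CreateFile() */
    BESTA_PATH_MAX_CWD = 64     /* bytes, 8.3 paths in CWD (observed, unconfirmed) */
};

/* True if a NUL-terminated byte path, terminator included, fits within limit bytes. */
bool dos_path_fits(const char *path, size_t limit) {
    assert(path != NULL);
    return strlen(path) + 1 <= limit;
}
```

With these numbers, a 79-character 8.3 path is the longest that fits the CreateFile() limit, while the same path would already be too long for the CWD limit.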
Neither am I.