wwylele/CRO_doc.md

3DS CRO Information

AKA reference for citra CRO code.

wwylele: I am not a native English speaker, so there may be some strange words and sentences below. Suggestions for improvement are welcome.

Warning: CRO is still not completely understood, and there may be mistakes below. Please read with a healthy dose of skepticism.
Terminology

Note: some terms were chosen by analogy with similar concepts elsewhere. They may be inaccurate, or even incorrect.

Module is a chunk of executable code and data. Modules can be linked to each other.
Static module refers to the main application (i.e. ExeFS). Its executable code is loaded when the application boots, while its symbol information is loaded at runtime.
Dynamic module is a module that can be loaded and unloaded at runtime.
RO service (ldr:ro) is the system service for managing modules.
CRO is a file with the extension .cro, which contains executable code, data and symbol information for a dynamic module.
CRS is a file with the extension .crs, in the same format as CRO, which contains symbol information for the static module.
CRR is a file with the extension .crr, which contains verification data for all modules. The RO service verifies every dynamic module against the CRR file when loading it.
Symbol is the location of a function or a variable(?) in a module, which other modules can look up. The location is described by a SegmentTag. There are three types of symbols:

Named symbol is a symbol with a name, which is usually a (mangled) function name or a variable name(?). A module can refer to a named symbol in another module by name.
Indexed symbol is a symbol without a name but with an index. A module can refer to an indexed symbol in another module by index.
Anonymous symbol is a symbol without a name or an index. A module must(?) keep a list of imported anonymous symbols to refer to them.


Importing is declaring the usage of symbols in other modules.
Exporting is making symbols available to other modules.
Resolving is making an imported symbol available by looking it up in the module that exports it, and patching the code in the importing module.
Linking is looking up the import / export information of multiple modules and resolving symbols between them.
Auto-link module is a loaded module that automatically links with newly loaded modules. The static module is always an auto-link module. A game can specify whether a module is auto-link when loading it. A module automatically links with all auto-link modules when it is loaded, regardless of whether it is auto-link itself.
Manual-link module is a module that is not auto-link.
Rebasing is converting all offsets in a CRO file to actual virtual addresses (i.e. adding the address at which the CRO is loaded to every offset). A CRO is rebased when loading, and unrebased when unloading.
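
As a rough illustration of rebasing, the sketch below adds a load address to a handful of header offsets and subtracts it again on unloading. The field names, the dictionary representation and the chosen subset of fields are assumptions made for this example only; real rebasing also touches offsets stored inside the tables, which is not shown here.

```python
# Hypothetical subset of header fields that hold offsets (names are illustrative).
OFFSET_FIELDS = ("name_offset", "code_offset", "data_offset", "segment_table_offset")


def rebase(header, load_address):
    """Return a copy of `header` with each offset field turned into a virtual address."""
    rebased = dict(header)
    for field in OFFSET_FIELDS:
        rebased[field] = (header[field] + load_address) & 0xFFFFFFFF
    return rebased


def unrebase(header, load_address):
    """Inverse of rebase(): turn virtual addresses back into file offsets."""
    restored = dict(header)
    for field in OFFSET_FIELDS:
        restored[field] = (header[field] - load_address) & 0xFFFFFFFF
    return restored


header = {
    "name_offset": 0x180,
    "code_offset": 0x1000,
    "data_offset": 0x4000,
    "segment_table_offset": 0x2000,
    "file_size": 0x5000,  # a size, not an offset, so it is left alone
}
loaded = rebase(header, 0x00400000)
print(hex(loaded["code_offset"]))  # 0x401000
```

Unrebasing the result with the same load address gives back the original offsets, which is what allows a CRO to be returned to its on-disk form.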

CRO format

A CRO file consists of a header, executable code, non-executable data and several tables. They are usually laid out in this order:

header
.text segment
.rodata segment
tables
.data segment

The header has the following format:


| Offset | Size | Description |
| --- | --- | --- |
| 0x0000 | 0x20 | SHA-256 from 0x80 to `Code offset` |
| 0x0020 | 0x20 | SHA-256 from `Code offset` to `Module name offset` |
| 0x0040 | 0x20 | SHA-256 from `Module name offset` to `Data offset` |
| 0x0060 | 0x20 | SHA-256 from `Data offset` to the end of CRO |
| 0x0080 | 0x04 | Magic "CRO0" |
| 0x0084 | 0x04 | Name offset |
| 0x0088 | 0x04 | Next module ** |
| 0x008C | 0x04 | Previous module ** |
| 0x0090 | 0x04 | File size |
| 0x0094 | 0x04 | .bss segment size(?). * |
| 0x0098 | 0x04 | Fixed size ** |
| 0x009C | 0x04 | Zero? * |
| 0x00A0 | 0x04 | `nnroControlObject_` function segment tag. * |
| 0x00A4 | 0x04 | "OnLoad" function segment tag. 0xFFFFFFFF if not exists * |
| 0x00A8 | 0x04 | "OnExit" function segment tag. 0xFFFFFFFF if not exists * |
| 0x00AC | 0x04 | "OnUnresolved" function segment tag. 0xFFFFFFFF if not exists |
| 0x00B0 | 0x04 | Code offset |
| 0x00B4 | 0x04 | Code size |
| 0x00B8 | 0x04 | Data offset |
| 0x00BC | 0x04 | Data size |
| 0x00C0 | 0x04 | Module name offset. Equals to name offset (?) |
| 0x00C4 | 0x04 | Module name size |
| 0x00C8 | 0x04 | Segment table offset |
| 0x00CC | 0x04 | Segment count |
| 0x00D0 | 0x04 | Exported named symbol table offset |
| 0x00D4 | 0x04 | Exported named symbol count |
| 0x00D8 | 0x04 | Exported indexed symbol table offset |
| 0x00DC | 0x04 | Exported indexed symbol count |
| 0x00E0 | 0x04 | Exported strings offset |
| 0x00E4 | 0x04 | Exported strings size |
| 0x00E8 | 0x04 | Exported name tree offset |
| 0x00EC | 0x04 | Exported name tree node count |
| 0x00F0 | 0x04 | Imported module table offset |
| 0x00F4 | 0x04 | Imported module count |
| 0x00F8 | 0x04 | External patch table offset |
| 0x00FC | 0x04 | External patch count |
| 0x0100 | 0x04 | Imported named symbol table offset |
| 0x0104 | 0x04 | Imported named symbol count |
| 0x0108 | 0x04 | Imported indexed symbol table offset |
| 0x010C | 0x04 | Imported indexed symbol count |
| 0x0110 | 0x04 | Imported anonymous symbol table offset |
| 0x0114 | 0x04 | Imported anonymous symbol count |
| 0x0118 | 0x04 | Imported strings offset |
| 0x011C | 0x04 | Imported strings size |
| 0x0120 | 0x04 | Static anonymous symbol(?) table offset |
| 0x0124 | 0x04 | Static anonymous symbol(?) count |
| 0x0128 | 0x04 | Internal patch table offset |
| 0x012C | 0x04 | Internal patch count |
| 0x0130 | 0x04 | Static anonymous patch(?) table offset |
| 0x0134 | 0x04 | Static anonymous patch(?) count |


* RO service doesn't touch these fields
** Zero in CRO file. RO service will write to them
See code: HeaderField, GetField, SetField
All the "offset" fields in the header are relative to the beginning of the file. However, they are converted to virtual addresses when loading (i.e. rebasing).
For the detailed structure of each table, refer to the related struct in the code.
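
To make the header layout more concrete, here is a minimal sketch that reads a few fields from an unrebased CRO image and recomputes the four SHA-256 regions listed in the table. It assumes the fields are little-endian 32-bit integers (as expected on the 3DS's ARM CPU) and that "the end of CRO" means the end of the buffer. The function and field names are made up for this example and are not citra's; whether and when the RO service checks these hashes is not covered here.

```python
import hashlib
import struct

# Offsets taken from the header table; the field names are illustrative.
FIELD_OFFSETS = {
    "name_offset": 0x84,
    "file_size": 0x90,
    "code_offset": 0xB0,
    "code_size": 0xB4,
    "data_offset": 0xB8,
    "data_size": 0xBC,
    "module_name_offset": 0xC0,
    "module_name_size": 0xC4,
}
HEADER_SIZE = 0x138  # the last field starts at 0x134 and is 4 bytes long


def parse_header(cro):
    """Read a few header fields from a CRO image that has not been rebased."""
    if len(cro) < HEADER_SIZE:
        raise ValueError("buffer is smaller than a CRO header")
    if bytes(cro[0x80:0x84]) != b"CRO0":
        raise ValueError("bad magic")
    return {
        name: struct.unpack_from("<I", cro, offset)[0]
        for name, offset in FIELD_OFFSETS.items()
    }


def region_hashes(cro, fields):
    """Compute SHA-256 over the four regions described at 0x00, 0x20, 0x40 and 0x60."""
    bounds = [
        0x80,
        fields["code_offset"],
        fields["module_name_offset"],
        fields["data_offset"],
        len(cro),  # assumption: the last region runs to the end of the buffer
    ]
    return [
        hashlib.sha256(cro[start:end]).digest()
        for start, end in zip(bounds, bounds[1:])
    ]


def hashes_match(cro, fields):
    """Compare the stored hashes with freshly computed ones."""
    stored = [bytes(cro[i:i + 0x20]) for i in range(0, 0x80, 0x20)]
    return stored == region_hashes(cro, fields)
```

Because the hashes cover file offsets, a check like this only makes sense before rebasing, or after the image has been unrebased.
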
Memory mapping

The application loads CRO and CRS files from RomFS into a memory buffer, then specifies another address when calling the RO service loading functions (Initialize, i.e. "LoadCRS", LoadCRO, and LoadCRO_New), and the RO service maps the original buffer to the specified address. The RO service always reads and writes data at the mapped address, while the application can read data at both addresses and can write to the original buffer. This requires a proper memory aliasing implementation, which citra does not have yet. The current workaround is to map a new buffer at the mapped address, copy the data, and synchronize at the beginning and the end of each service call (see MemorySynchronizer).
Registration

See code: Register, Unregister
Modules form two doubly linked lists in RAM: each module has a previous and a next field in its header, which are set to point to other modules when loading. The previous and next fields of the static module point to the heads of the two lists: the manual-link list and the auto-link list, respectively. The previous field of the head of each list points to the tail of that list. The next field of the tail is set to 0.

When a dynamic module is loaded, it is added to the tail of one of the lists ("registering"), depending on whether it is specified to be auto-link; when it is unloaded, it is removed from its list ("unregistering"). The RO service (and probably the application as well) uses these two linked lists to iterate over modules when linking.
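
The list layout described above can be modelled with a short sketch. Python objects stand in for module addresses and `None` plays the role of 0. The class and function names are hypothetical, and the code follows the textual description rather than citra's Register / Unregister line by line.

```python
class Module:
    def __init__(self, name):
        self.name = name
        self.previous = None  # stands in for the "previous module" header field
        self.next = None      # stands in for the "next module" header field


def list_head(static, auto_link):
    # Static module: next -> head of auto-link list, previous -> head of manual-link list.
    return static.next if auto_link else static.previous


def set_list_head(static, auto_link, module):
    if auto_link:
        static.next = module
    else:
        static.previous = module


def register(static, module, auto_link):
    """Append `module` to the tail of the chosen list."""
    head = list_head(static, auto_link)
    if head is None:
        set_list_head(static, auto_link, module)
        module.previous = module  # a lone head is also the tail
    else:
        tail = head.previous      # the head's previous field points to the tail
        tail.next = module
        module.previous = tail
        head.previous = module
    module.next = None


def unregister(static, module):
    """Remove `module` from whichever list it is in."""
    for auto_link in (True, False):
        if list_head(static, auto_link) is module:
            new_head = module.next
            if new_head is not None:
                new_head.previous = module.previous  # keep pointing at the tail
            set_list_head(static, auto_link, new_head)
            break
    else:
        module.previous.next = module.next
        if module.next is not None:
            module.next.previous = module.previous
        else:
            # Removing the tail: the head that pointed at it needs the new tail.
            for auto_link in (True, False):
                head = list_head(static, auto_link)
                if head is not None and head.previous is module:
                    head.previous = module.previous
    module.previous = module.next = None


def iterate(static, auto_link):
    node = list_head(static, auto_link)
    while node is not None:
        yield node
        node = node.next
```

Storing the tail in the head's previous field is what makes appending cheap without a separate tail pointer, at the cost of the extra bookkeeping seen in `unregister`.
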
Segment

A module has several segments, with a segment table whose entries point to each segment. A segment can be of type 0 (.text), 1 (.rodata), 2 (.data), or 3 (.bss). For the static module, all these segment entries are set in the CRS and point directly to the corresponding userland memory addresses (for example, .text begins at 0x00100000). For dynamic modules, .text, .rodata and .data are stored in the CRO, and the entries point to that data. During CRO loading and rebasing, the .text and .rodata entries are set to point to where they are mapped in memory, and the .data entry is set to point to an application-specified buffer (the RO service does not copy .data from the CRO to the buffer; the application does that). The .bss entry is also set to an application-specified buffer.
A location in a segment is always represented by a segment tag. A segment tag is a 32-bit value: the lower 4 bits are the segment index, and the upper 28 bits are the offset into the segment. Symbol locations and patch targets are all represented by segment tags.
See code: DecodeSegmentTag, SegmentTagToAddress
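
The bit layout of a segment tag can be shown in a few lines. The helpers below are illustrative Python counterparts, not citra's functions. The segment table is simplified to a list of `(base_address, size)` pairs, and the bounds check returning `None` is an assumption of this sketch.

```python
def decode_segment_tag(tag):
    """Split a 32-bit segment tag into (segment_index, offset_into_segment)."""
    return tag & 0xF, (tag >> 4) & 0x0FFFFFFF


def encode_segment_tag(segment_index, offset):
    """Pack a segment index (4 bits) and an offset (28 bits) into a tag."""
    return ((offset & 0x0FFFFFFF) << 4) | (segment_index & 0xF)


def segment_tag_to_address(tag, segment_table):
    """Resolve a tag against a list of (base_address, size) pairs.

    Returns None when the tag points outside the table or the segment.
    """
    index, offset = decode_segment_tag(tag)
    if index >= len(segment_table):
        return None
    base, size = segment_table[index]
    if offset >= size:
        return None
    return base + offset


# Hypothetical segment table: .text, then .rodata.
segments = [(0x00100000, 0x1000), (0x00101000, 0x800)]
tag = encode_segment_tag(1, 0x20)
print(hex(tag))                                    # 0x201
print(hex(segment_tag_to_address(tag, segments)))  # 0x101020
```
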
Fixing

The word "fix" comes from Subv's branch and from the CRO header magic (see below).
See code: GetFixEnd, Fix
An application can specify that a dynamic module be "fixed" after loading. Fixing crops away some data from the end of the CRO: the RO service unmaps that memory and returns it to the application for other use, which saves memory.
A fix level can be specified. A higher level crops away more data and loses more features.


Level 0 does not crop at all. Also, if a module with fix level 0 is unloaded, the RO service restores all the data of the module as if it had never been loaded (see Patch, "clear patch"). Therefore, a module with fix level 0 can be loaded and unloaded multiple times without reloading it from RomFS (?).


Level 1 crops away

static anonymous symbol table(?),
internal patch table, and
static anonymous patch table(?).
(and also very likely the data of the .data segment)

A module with fix level 1 can't be reloaded after unloading(?), since the internal patch information is lost and the internal patches cannot be reapplied to the newly allocated .data and .bss buffers(?).


Level 2 crops away

all data that level 1 crops,
imported module table,
external patch table,
imported named symbol table,
imported indexed symbol table,
imported anonymous symbol table, and
imported strings.

Because the import information is lost, a module with fix level 2 can't resolve symbols imported from modules that are loaded after it.


Level 3 crops away

all data that level 2 crops,
exported named symbol table,
exported indexed symbol table,
exported strings, and
exported name tree.

Because the export information is lost, a module with fix level 3 can't export symbols to modules that are loaded after it.


For modules with a fix level other than 0, the magic field in the header is changed to "FIXD" when loading.
Note that only fix level 1 is known to be used by games (?), so the actual behavior of the other fix levels is unclear due to a lack of test cases.
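
The fix levels above can be summarized as "everything past the last table that is kept gets cropped". The sketch below computes that cut-off point from a map of region names to `(offset, size_in_bytes)`. The region names, the dictionary input and the function name are assumptions for this example. It also assumes the cropped regions all sit after the kept ones, and it treats the .data contents as cropped from level 1 on, which the list above only marks as very likely.

```python
# Regions cropped at each level, following the lists above. Names are illustrative.
LEVEL_1 = (
    "static_anonymous_symbols",
    "internal_patches",
    "static_anonymous_patches",
    "data_segment",  # "very likely" cropped, per the note in the level 1 list
)
LEVEL_2 = LEVEL_1 + (
    "imported_modules",
    "external_patches",
    "imported_named_symbols",
    "imported_indexed_symbols",
    "imported_anonymous_symbols",
    "imported_strings",
)
LEVEL_3 = LEVEL_2 + (
    "exported_named_symbols",
    "exported_indexed_symbols",
    "exported_strings",
    "exported_name_tree",
)
CROPPED = {0: (), 1: LEVEL_1, 2: LEVEL_2, 3: LEVEL_3}


def fix_end(regions, file_size, level):
    """Return the offset past which the CRO would be cropped for `level`.

    `regions` maps a region name to (offset, size_in_bytes).
    """
    if level == 0:
        return file_size  # level 0 keeps the whole file
    cropped = set(CROPPED[level])
    return max(
        offset + size
        for name, (offset, size) in regions.items()
        if name not in cropped
    )


# Hypothetical layout, loosely following the usual order of a CRO file.
regions = {
    "header": (0x0000, 0x138),
    "code": (0x0180, 0x1000),
    "exported_named_symbols": (0x1200, 0x100),
    "imported_strings": (0x1300, 0x80),
    "internal_patches": (0x1380, 0x200),
    "data_segment": (0x1600, 0x400),
}
for level in range(4):
    print(level, hex(fix_end(regions, 0x1A00, level)))
```

With this layout the cut-off moves from 0x1a00 (level 0) to 0x1380, 0x1300 and 0x1180 for levels 1 to 3.
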
Patch

See code: PatchEntry, ApplyPatch, ApplyPatchBatch
Note: this should probably be called "relocation" instead.
Patching is how symbol resolution is implemented. The module exporting symbols keeps the address (as a segment tag) of each exported symbol, and the module importing symbols keeps a list of patches indicating where (also as segment tags) and how to write the symbol address. One imported symbol corresponds to several patches, which together are called a patch batch. Patches have different types (PatchType), but only two of them, AbsoluteAddress and RelativeAddress, are known to be used by games. Other types are left unimplemented for lack of test cases. Patch types apparently match the relocation types in ELF for ARM.
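
As a hedged illustration of the two known patch types, the sketch below applies a patch batch to a `bytearray` that stands in for memory, with addresses used directly as indices. The addend parameter and the two formulas are borrowed from the corresponding ELF for ARM relocations (absolute: symbol + addend; relative: symbol + addend - target) and are assumptions here, as are the tuple layout and function names. Targets are assumed to be already converted from segment tags to addresses.

```python
import struct


def write32(memory, address, value):
    """Write a little-endian 32-bit value into the simulated memory."""
    struct.pack_into("<I", memory, address, value & 0xFFFFFFFF)


def read32(memory, address):
    return struct.unpack_from("<I", memory, address)[0]


def apply_patch(memory, target, patch_type, addend, symbol_address):
    """Write a resolved symbol address at `target` according to `patch_type`."""
    if patch_type == "AbsoluteAddress":
        value = symbol_address + addend
    elif patch_type == "RelativeAddress":
        value = symbol_address + addend - target
    else:
        # Other types are not known to be used by games.
        raise NotImplementedError(patch_type)
    write32(memory, target, value)


def apply_patch_batch(memory, batch, symbol_address):
    """Apply every patch that belongs to one imported symbol."""
    for target, patch_type, addend in batch:
        apply_patch(memory, target, patch_type, addend, symbol_address)


memory = bytearray(0x40)
batch = [
    (0x10, "AbsoluteAddress", 4),
    (0x20, "RelativeAddress", 0),
]
apply_patch_batch(memory, batch, symbol_address=0x30)
print(hex(read32(memory, 0x10)))  # 0x34
print(hex(read32(memory, 0x20)))  # 0x10
```
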
Patches are "reset" when a module is loaded, before linking, and when a module is unlinked: the places to patch are written with the address of the "OnUnresolved" function instead of the imported symbol address. The "OnUnresolved" function is specified by the CRO header. See code: ResetExternalPatches, ResetImportNamedSymbol, ResetImportIndexedSymbol, ResetImportAnonymousSymbol, ResetExportNamedSymbol, ResetModuleExport.
Patches are "cleared" when a module with fix level 0 is unloaded: the places to patch are written with zero. The purpose is to restore the CRO to its state before loading(?) (see Fixing, Level 0). See code: ClearPatch, ClearExternalPatches, ClearInternalPatches.
Patches are also used in two other ways. One is internal patches. These can be treated as a module exporting symbols to itself. They are needed so that segments can refer to each other, because segment addresses change on rebasing. Internal patches are slightly different from normal patches (see InternalPatchEntry): they are not organized in batches, and they store not only where to write the address but also what address to write (i.e. the exported symbol address). Also note that internal patches are applied upon rebasing (not linking!). See code: ApplyInternalPatches.
The other use of patches is for static anonymous symbols, which are symbols exported from dynamic modules to the static module. They work much like normal symbols and patches, but are applied when rebasing dynamic modules (not linking, again), and are never reset even when the module is unloaded. No games (?) are actually known to use this feature, so it is unclear what it is for. See code: ApplyStaticAnonymousSymbolToCRS.
Importing and exporting

A module exporting symbols records them in two tables: named symbols in the exported named symbol table, and indexed symbols in the exported indexed symbol table. The module does not keep a record of exported anonymous symbols.
A module importing symbols records them in several tables: named symbols are recorded directly in the imported named symbol table, while indexed and anonymous symbols are grouped by the module exporting them; the imported module table records the referenced modules and the indexed / anonymous symbols imported from each.
[Figure: relationship between the import and export tables]
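
Independently of the diagram, the way the import tables refer to the export tables can be sketched as a small data model. Everything here is hypothetical: the class layout, the use of plain addresses instead of segment tags, and the omission of anonymous imports and of the patch batches attached to each import. Named imports are searched across all loaded modules, while indexed imports are looked up only in the module named by the imported module table.

```python
class Module:
    def __init__(self, name, named_exports=None, indexed_exports=None):
        self.name = name
        self.named_exports = dict(named_exports or {})      # symbol name -> address
        self.indexed_exports = list(indexed_exports or [])  # index -> address
        self.named_imports = []     # symbol names, not tied to a specific module
        self.imported_modules = {}  # exporting module name -> list of imported indices
        self.resolved = {}          # filled in by resolve_imports()


def resolve_imports(importer, loaded_modules):
    """Look up the importer's named and indexed imports in the loaded modules."""
    # Named symbols: search every loaded module for a matching export name.
    for symbol in importer.named_imports:
        for exporter in loaded_modules:
            if symbol in exporter.named_exports:
                importer.resolved[symbol] = exporter.named_exports[symbol]
                break
    # Indexed symbols: grouped by exporting module in the imported module table.
    by_name = {module.name: module for module in loaded_modules}
    for module_name, indices in importer.imported_modules.items():
        exporter = by_name.get(module_name)
        if exporter is None:
            continue  # the exporting module is not loaded (yet)
        for index in indices:
            importer.resolved[(module_name, index)] = exporter.indexed_exports[index]


lib = Module("lib", named_exports={"foo": 0x1000}, indexed_exports=[0x2000, 0x2004])
app = Module("app")
app.named_imports = ["foo", "missing"]
app.imported_modules = {"lib": [1]}
resolve_imports(app, [lib])
print(app.resolved)
```

In the real format each resolved address would then be written through the import's patch batch rather than stored in a dictionary.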

Exported name tree

See code: FindExportNamedSymbol
A module contains a tree for fast symbol name lookups. When the RO service looks up a named symbol, it does not scan the exported named symbol table, but looks up this tree instead.
The tree itself is a trie-like structure. Here is a reimplementation in C++. The tree structure in CRO corresponds to bit_trie<symbol_name, symbol_index, string_tester> in that reimplementation. There are some differences: the structure in CRO doesn't store keys in nodes, and it uses absolute offsets in Branch, whereas the reimplementation uses relative offsets.
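
To give a feel for how a bit trie without stored keys can be searched, here is a generic sketch. It is not the CRO on-disk node layout and not the C++ reimplementation mentioned above: the `Branch` / `Leaf` classes, the bit numbering (byte index times 8 plus bit index, least significant bit first) and the hand-built tree are all assumptions. Since nodes hold no keys, the lookup ends with a comparison against the name stored in the export table.

```python
class Leaf:
    def __init__(self, export_index):
        self.export_index = export_index  # index into the exported named symbol table


class Branch:
    def __init__(self, test_bit, left, right):
        self.test_bit = test_bit  # bit position in the name: byte_index * 8 + bit_index
        self.left = left          # followed when the tested bit is 0
        self.right = right        # followed when the tested bit is 1


def bit_at(name, position):
    """Return the tested bit of `name`, or 0 if the name is too short."""
    byte_index, bit_index = divmod(position, 8)
    if byte_index >= len(name):
        return 0
    return (name[byte_index] >> bit_index) & 1


def find_export(root, export_names, name):
    """Walk the trie, then confirm the candidate by comparing the full name."""
    node = root
    while isinstance(node, Branch):
        node = node.right if bit_at(name, node.test_bit) else node.left
    if export_names[node.export_index] == name:
        return node.export_index
    return None


# b"ab" and b"ac" first differ in bit 0 of byte 1, so one branch is enough.
export_names = [b"ab", b"ac"]
root = Branch(8, Leaf(0), Leaf(1))
print(find_export(root, export_names, b"ac"))  # 1
print(find_export(root, export_names, b"ad"))  # None
```

The lookup cost depends on the number of distinguishing bits rather than on the size of the export table, which is presumably why the RO service prefers the tree to a linear scan.
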