THIS IS A DRAFT IN REVIEW. DO NOT IMPLEMENT THIS!
- Version: 0 (DRAFT!)
Pipsqueak is a position-independent patch format. It is designed so that multiple patches can be applied together, composably, to simple ROM formats such as GBA. It does this by attaching semantic information to the content of a patch so that addresses can be relocated if required.
Goals:
- A patch format that supports relocations.
Non-goals:
- No compression is specified in the format. The patch may be subsequently compressed with a compression algorithm, if required.
- Patches are designed to be applied only to flat binary image formats, e.g. GBA ROMs. Anything involving a file system or compressed segments is not supported.
The two most common patch formats today are IPS and BPS.
- IPS ("International Patching System") is a very simple linear patch format that specifies an offset, a size, and a chunk of data to replace in the program image. There is also some run-length encoding for compression, but it's not important to its core functionality. The format is very straightforward and easy to interpret, but it has two pitfalls: it carries no semantic information, and it has no way to express that data has simply been moved around in the program image, so any such move inflates the patch size.
- BPS ("Binary Patching System") is a more sophisticated delta patch format where patches are composed of a series of commands that copy data from the source file to the target file. This can save a lot of space, as it is able to account for data being moved around in the program image, but it is actually less convenient for us here, as its dictionary-based approach is not suitable for encoding semantic information.
Pipsqueak is, in spirit, a lot more similar to IPS than BPS: Pipsqueak contains simple blocks of data to be inserted into the program image, but also encodes semantic information such that these blocks of data may be moved around appropriately.
Note: all data in the patch is little endian; however, patches may be applied to a big-endian target.
struct Pipsqueak {
header: Header,
num_replacements: POINTER_SIZE,
replacements: [Replacement; num_replacements],
num_far_chunks: POINTER_SIZE,
far_chunks: [FarChunk; num_far_chunks],
}
enum Endianness {
Little = 0x00,
Big = 0x01,
}
struct Header {
// P I P S in ASCII.
magic: [0x50, 0x49, 0x50, 0x53],
// Version number, incremented per revision of the data format.
version: 0x01,
// Endianness of the _target_. All addresses in the _patch_ are little endian.
endianness: Endianness,
// Pointer size of the target.
pointer_size: u8 = POINTER_SIZE,
// Base address of the target in memory. For GBA, this is 0x08000000.
base_address: POINTER_SIZE,
}
struct Replacement {
// Offset to start replacing at.
offset: POINTER_SIZE,
// Data to replace at the offset.
length: POINTER_SIZE,
data: [u8; length],
// Relocations within the data that may point into far memory.
num_relocations: POINTER_SIZE,
relocations: [Relocation; num_relocations],
}
Replacements have an identical purpose to those in IPS patches: they specify an offset to start replacing at, then the data that should overwrite the original data at that location. However, in Pipsqueak, each replacement has additional relocation information that allows pointers within the replacement to point into far chunks.
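As an illustration of the layout above, here is a minimal Python sketch that decodes a header and one replacement from raw patch bytes. It assumes that `version` and `endianness` occupy one byte each and that `pointer_size` is expressed in bytes; the draft does not state either. The names `Header`, `read_uint`, `parse_header`, and `parse_replacement` are hypothetical and not part of the format.

```python
from dataclasses import dataclass


@dataclass
class Header:
    version: int
    big_endian_target: bool
    pointer_size: int   # assumed to be a byte count
    base_address: int


def read_uint(buf, pos, size):
    """Read a little-endian unsigned integer of `size` bytes; return (value, new_pos)."""
    return int.from_bytes(buf[pos:pos + size], "little"), pos + size


def parse_header(buf):
    """Decode the header, assuming one byte each for version and endianness."""
    if buf[:4] != b"PIPS":
        raise ValueError("bad magic")
    version, endianness, pointer_size = buf[4], buf[5], buf[6]
    base_address, pos = read_uint(buf, 7, pointer_size)
    return Header(version, endianness == 0x01, pointer_size, base_address), pos


def parse_replacement(buf, pos, pointer_size):
    """Decode one Replacement starting at `pos`; return ((offset, data, relocations), new_pos)."""
    offset, pos = read_uint(buf, pos, pointer_size)
    length, pos = read_uint(buf, pos, pointer_size)
    data = bytes(buf[pos:pos + length])
    pos += length
    count, pos = read_uint(buf, pos, pointer_size)
    relocations = []
    for _ in range(count):
        rel_offset, pos = read_uint(buf, pos, pointer_size)
        far_chunk, pos = read_uint(buf, pos, pointer_size)
        ptr_offset, pos = read_uint(buf, pos, pointer_size)
        relocations.append((rel_offset, far_chunk, ptr_offset))
    return (offset, data, relocations), pos
```

Every multi-byte field is read as little endian regardless of the target's endianness, since only the pointers written into the program image follow the target's byte order.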
struct FarChunk {
// Data to insert in far memory.
length: POINTER_SIZE,
data: [u8; length],
// Relocations within the data that may point into far memory.
num_relocations: POINTER_SIZE,
relocations: [Relocation; num_relocations],
}
Far chunks are free-floating segments of data. Traditionally, such data is either inserted into free space somewhere within the program image, or the program image is expanded and the data is appended to the end. In Pipsqueak, far chunks are not tied to a fixed address. Anything that refers to far memory must do so via a relocation.
struct Relocation {
// Offset of the pointer to rewrite, relative to the start of the containing replacement or far chunk.
rel_offset: POINTER_SIZE,
// Index of the far chunk that the relocation refers to.
far_chunk: POINTER_SIZE,
// Offset added to the resolved address of the referenced far chunk, i.e. where the pointer points within that chunk.
ptr_offset: POINTER_SIZE,
}
Relocations are present in both replacements and far chunks: they specify an offset in the chunk where a sequence of bytes of length POINTER_SIZE of the appropriate endianness is to be replaced with the resolved address of a far chunk. Far chunks may point to other far chunks, including themselves or far chunks defined earlier.
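A small Python sketch of resolving and writing a single relocation may make this concrete. It assumes the program image is held in a `bytearray` and that the far chunk's position in the image is already known. The function names and the example values (an image offset of 0x200 for the far chunk) are hypothetical.

```python
def resolve(base_address, far_chunk_offset, ptr_offset):
    """Resolved pointer value: base address + far chunk position in the image + pointer offset."""
    return base_address + far_chunk_offset + ptr_offset


def apply_relocation(image, chunk_start, rel_offset, target_address, pointer_size, big_endian):
    """Overwrite pointer_size bytes at chunk_start + rel_offset with target_address,
    encoded in the target's byte order."""
    order = "big" if big_endian else "little"
    where = chunk_start + rel_offset
    image[where:where + pointer_size] = target_address.to_bytes(pointer_size, order)


# Example: a chunk starting at image offset 2 holds a pointer at relative offset 1.
# The referenced far chunk is assumed to sit at image offset 0x200 on a GBA target.
image = bytearray(8)
address = resolve(0x08000000, 0x200, 4)
apply_relocation(image, 2, 1, address, 4, big_endian=False)
```

Note that the byte order used here is that of the target, taken from the header, and not the little-endian order used for fields inside the patch itself.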
Patch application:
- For each patch, append each far chunk to the end of the program image.
  - While doing so, create a mapping from (logical patch, far chunk index) to the physical far chunk address in the program image.
- For each physical far chunk, apply its relocations: rewrite each relocated location in the program image with the base address + the resolved physical far chunk address + the pointer offset.
- For each patch, overwrite the data in the program image with each replacement.
  - Then apply the replacement's relocations: rewrite each relocated location in the program image with the base address + the resolved physical far chunk address + the pointer offset.
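The steps above can be sketched in Python as follows. This is only one possible reading of the draft, not a reference implementation. It assumes each patch has already been decoded into a dictionary with the hypothetical keys `base_address`, `pointer_size`, `big_endian`, `replacements` (tuples of offset, data, relocations) and `far_chunks` (tuples of data, relocations), where each relocation is a tuple `(rel_offset, far_chunk, ptr_offset)`. It also assumes that all replacements fall within the original image.

```python
def apply_patches(image, patches):
    """Apply decoded patches in order and return the new program image."""
    image = bytearray(image)
    placed = {}  # (patch index, far chunk index) -> offset of the chunk in the image

    # Step 1: append every far chunk and record where it landed.
    for p, patch in enumerate(patches):
        for c, (data, _relocs) in enumerate(patch["far_chunks"]):
            placed[(p, c)] = len(image)
            image += data

    def relocate(patch, p, start, relocs):
        # Write base address + far chunk position + pointer offset at each relocation.
        order = "big" if patch["big_endian"] else "little"
        size = patch["pointer_size"]
        for rel_offset, far_chunk, ptr_offset in relocs:
            address = patch["base_address"] + placed[(p, far_chunk)] + ptr_offset
            where = start + rel_offset
            image[where:where + size] = address.to_bytes(size, order)

    # Step 2: resolve relocations inside the far chunks themselves.
    for p, patch in enumerate(patches):
        for c, (_data, relocs) in enumerate(patch["far_chunks"]):
            relocate(patch, p, placed[(p, c)], relocs)

    # Step 3: write each replacement, then resolve its relocations.
    for p, patch in enumerate(patches):
        for offset, data, relocs in patch["replacements"]:
            image[offset:offset + len(data)] = data
            relocate(patch, p, offset, relocs)

    return bytes(image)


# Usage: one patch with a self-referencing far chunk and one relocated replacement.
patch = {
    "base_address": 0x08000000,
    "pointer_size": 4,
    "big_endian": False,
    "replacements": [(0, b"\xAA\xAA\xAA\xAA", [(0, 0, 1)])],
    "far_chunks": [(b"\x01\x02\x03\x04", [(0, 0, 0)])],
}
patched = apply_patches(bytes(8), [patch])
```

Keying the mapping by patch index as well as far chunk index keeps the far chunk numbering of different patches separate, which is what lets several patches share one image.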
While the patches are composable, they are neither commutative nor associative. Use with care.
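As a minimal illustration of the order dependence, the Python sketch below applies two overlapping replacements in both orders. It covers only the non-commutative case and ignores far chunks and relocations entirely. The helper `overwrite` and the sample data are hypothetical.

```python
def overwrite(image, offset, data):
    """Return a copy of image with data written at offset, as a replacement would do."""
    out = bytearray(image)
    out[offset:offset + len(data)] = data
    return bytes(out)


rom = bytes(4)
patch_a = (0, b"\x11\x11")  # replaces bytes 0..1
patch_b = (1, b"\x22\x22")  # replaces bytes 1..2, overlapping patch_a at byte 1

a_then_b = overwrite(overwrite(rom, *patch_a), *patch_b)
b_then_a = overwrite(overwrite(rom, *patch_b), *patch_a)
```

The two results differ at the overlapping byte, so whichever patch is applied last wins there.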