The Source Archive Format is very similar to the formats used for object file libraries and various other archive schemes. It does not adopt those schemes because they vary from platform to platform, require code to support features that Source Archive Format files do not need, and do not account for D's particular needs. The format is meant to be friendly to memory-mapped file access and does not have alignment issues.
The file extension is sar.
The file is broken up into sequential blocks. The start of each block is padded to a 16-byte alignment to enable aligned SIMD access to that block's contents.
All integers are in little endian format and are unsigned.
Except for the header, every block starts with a four-byte block ID followed by an eight-byte length field.
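As a rough illustration of these layout rules, the Python sketch below rounds an offset up to the next 16-byte boundary and reads the 12-byte prefix shared by the non-header blocks. The names `align16` and `read_block_prefix` are hypothetical, and the sketch assumes the length field is packed directly after the block ID with no padding in between.

```python
import struct

ALIGNMENT = 16  # every block starts on a 16-byte boundary


def align16(offset: int) -> int:
    """Round offset up to the next multiple of 16."""
    return (offset + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


def read_block_prefix(buf: bytes, offset: int):
    """Return (block_id, length) for a non-header block starting at offset."""
    # "<4sQ": little endian, 4-byte ID, unsigned 8-byte length, no padding.
    block_id, length = struct.unpack_from("<4sQ", buf, offset)
    return block_id, length
```

A reader walking the file would typically call `align16` on the end of one block to find where the next one may start.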
A required header denotes the version of the file and verifies that it is a Source Archive Format file.
The block structure is as follows:
offset | length | Name | Value |
---|---|---|---|
0 | 4 | Block ID | In hex: 4D 73 61 72 , in ASCII: Msar |
4 | 4 | Zero terminator | 00 00 00 00 |
8 | 8 | Length Of Header | 16 |
Future versions may add fields; their presence is signaled by a larger length-of-header value.
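The header check can be sketched in Python as follows. This is only an illustration under the layout in the table above: `parse_header` is a hypothetical name, and accepting any length of 16 or more is an assumption about how a reader might tolerate future versions.

```python
import struct

MAGIC = b"Msar"  # 4D 73 61 72


def parse_header(buf: bytes) -> int:
    """Validate the header and return its length in bytes."""
    if len(buf) < 16:
        raise ValueError("too short to hold a header")
    # "<4s4sQ": block ID, zero terminator, unsigned 8-byte header length.
    magic, zero, header_len = struct.unpack_from("<4s4sQ", buf, 0)
    if magic != MAGIC or zero != b"\x00" * 4:
        raise ValueError("not a Source Archive Format file")
    if header_len < 16:
        raise ValueError("header length is smaller than the base header")
    # A value above 16 is assumed to mean extra fields from a later version.
    return header_len
```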
A source entries block serves as a table of contents for all the embedded source files. It includes support for applying a CLI argument string if any of these files are used.
There may be multiple of these blocks in a Source Archive Format file, which enables blind appending by tooling (although checking for an existing entry and zeroing out its file name and auxiliary name would be a good idea).
offset | length | Name | Value |
---|---|---|---|
0 | 4 | Block ID | In hex: 45 4E 54 53 , in ASCII: ENTS |
4 | 8 | Size of block | |
12 | 4 | Number of entries | |
16 | 4 | Length of CLI arguments that are to be applied for all source files | |
20 | value of the previous field | CLI argument string; individual arguments may be delimited by wrapping them in double quotes | |
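A minimal Python sketch of reading the fixed part of a source entries block is given below. `parse_ents_header` is a hypothetical name. The sketch assumes the block ID is stored as the ASCII bytes `ENTS`, and it returns the CLI string as raw bytes because the table does not say whether its length counts a terminator.

```python
import struct


def parse_ents_header(buf: bytes, offset: int = 0):
    """Return (block_size, entry_count, cli_bytes, end_of_cli_offset)."""
    # "<4sQII": block ID, 8-byte block size, 4-byte entry count,
    # 4-byte length of the CLI argument string. 20 bytes in total.
    block_id, size, count, cli_len = struct.unpack_from("<4sQII", buf, offset)
    if block_id != b"ENTS":  # assumption: the ASCII spelling is what is stored
        raise ValueError("not a source entries block")
    start = offset + 20
    cli = buf[start:start + cli_len]
    return size, count, cli, start + cli_len
```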
Following this is a variable-length array whose elements are laid out as follows:
offset | length | Name | Value |
---|---|---|---|
0 | 4 | Filename length | |
4 | 4 | Auxiliary name length | The D module name |
8 | 4 | CLI argument string length | Must be zero terminated |
12 | 8 | File contents length | Does not include zero termination in length |
20 | 8 | File contents offset after this block | |
28 | File name length | File name | Must be zero terminated |
28 + file name length | Auxiliary name length | Auxiliary name | Must be zero terminated |
28 + file name length + auxiliary name length | CLI argument string length | CLI argument string if this file is used | Must be zero terminated |
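Reading one entry might look like the Python sketch below. It rests on several assumptions that the table does not settle: the five numeric fields are packed back to back with the listed sizes (4, 4, 4, 8 and 8 bytes), so the fixed part is 28 bytes and the names start right after it; each string length counts its zero terminator; and the strings are UTF-8. `Entry` and `parse_entry` are hypothetical names.

```python
import struct
from typing import NamedTuple


class Entry(NamedTuple):
    file_name: str
    aux_name: str        # the D module name
    cli_args: str
    contents_length: int
    contents_offset: int


# Three 4-byte lengths, then 8-byte contents length and offset (28 bytes).
FIXED = struct.Struct("<IIIQQ")


def parse_entry(buf: bytes, offset: int):
    """Return (Entry, offset just past this entry)."""
    name_len, aux_len, cli_len, c_len, c_off = FIXED.unpack_from(buf, offset)
    pos = offset + FIXED.size
    strings = []
    for n in (name_len, aux_len, cli_len):
        raw = buf[pos:pos + n]
        # Drop the zero terminator before decoding (UTF-8 is an assumption).
        strings.append(raw.rstrip(b"\x00").decode("utf-8"))
        pos += n
    return Entry(strings[0], strings[1], strings[2], c_len, c_off), pos
```

Calling `parse_entry` repeatedly, feeding each returned offset back in, would walk the whole array for the number of entries given in the block.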
Following this are the file contents, each at the offset and with the length specified by its entry.
All file contents must be aligned to 16 bytes and end with 16 zero bytes to enable faster lexing. Padding that aligns the next entry may contribute to the zeros at the end of the file contents.
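One way a writer could satisfy both rules at once is sketched below in Python. The helper name `padded_contents` is hypothetical, and the sketch reads the rule as: append the smallest run of zeros that is at least 16 bytes long and leaves the next write position on a 16-byte boundary.

```python
def padded_contents(contents: bytes, start: int) -> bytes:
    """Return contents plus trailing zeros for a write at offset start."""
    if start % 16 != 0:
        raise ValueError("file contents must start on a 16-byte boundary")
    # Reserve the 16 mandatory zeros, then round up to the next boundary so
    # that the alignment padding counts toward the trailing zeros.
    end = start + len(contents) + 16
    aligned_end = (end + 15) & ~15
    return contents + bytes(aligned_end - start - len(contents))
```

Under this reading the trailing zero run is always between 16 and 31 bytes long.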