.fpk files can be found in the installation location of «Sid Meier's Pirates!» and contain most assets of the game.
An .fpk is essentially a simple archive of files. It was probably a format invented quickly to discourage people from messing with the assets. The layout is pretty simple, but the filenames are 'obfuscated' by adding 1 to each byte.
Apparently, Firaxis (the game's developers) released a «Civ4 PakBuilder» tool to pack/unpack .fpk files: https://forums.civfanatics.com/threads/civ4-pakbuild.136023/
The script below (re)extracts all assets of an .fpk with their original directory structure and file names. To use it, run:
./fpkextract.py d <extraction directory> <location of .FPK file> [<location of .FPK file>...]
./fpkextract.py d extracted_assets Pak1.FPK
extracted_assets_db.json will be created. You can 'add' more .FPK files to an existing extraction by running the command again with the same extraction directory.
When you have modified some assets and want to recreate the .fpk files, run:
./fpkextract.py a extracted_assets Pak1.FPK Pak3.FPK
This will recreate the files named on the command line (here Pak1.FPK and Pak3.FPK).
The file begins with a short Header that holds a table of the assets included in the .FPK. Each entry contains a filename length, the filename itself, two integers of uncertain meaning, the length (in bytes) of the asset, and the file offset where the asset's contents start.
The Header is:
- 32LE integer, version? (expected to be 2)
- 32LE integer, how many entries there are in the table.
- The asset table, which is a concatenation of Assets.
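A minimal sketch of reading the two fixed header fields, assuming "32LE" means a 32-bit little-endian integer and that the values are unsigned (the signedness is a guess). The name `read_fpk_header` is made up for illustration and is not taken from fpkextract.py.

```python
import struct

def read_fpk_header(data: bytes):
    """Return (version, entry_count, table_offset) for an in-memory .fpk.

    Assumption: both fields are unsigned 32-bit little-endian integers.
    """
    version, entry_count = struct.unpack_from("<II", data, 0)
    # The asset table starts right after these two integers.
    table_offset = 8
    return version, entry_count, table_offset
```

A caller would probably want to stop if `version` is not 2, since that is the only value these notes expect.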
An Asset is:
- 32LE integer, the filename length in bytes
- The filename, padded to 4-byte blocks, therefore the length in bytes of this part is 4 * ceil(filename_length / 4). See below for details about how the filename is encoded.
- 32LE integer, ??? (maybe crc of the asset?)
- 32LE integer, tag?? (looks like a timestamp; values repeat often or with +1/-1 variations. time.ctime(x*36) makes some sense)
- 32LE integer, the length in bytes of the asset.
- 32LE integer, the fpk file offset where the asset bytes start.
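The entry layout listed above could be parsed roughly like this. It is only a sketch: `AssetEntry` and `read_asset_entry` are hypothetical names, the integers are assumed to be unsigned, and the filename field is assumed to occupy the filename length rounded up to the next multiple of 4 (with no extra padding when the length is already a multiple of 4, which has not been verified). The filename is returned still obfuscated.

```python
import struct
from typing import NamedTuple

class AssetEntry(NamedTuple):
    raw_name: bytes  # obfuscated filename, padding stripped
    unknown: int     # meaning unknown (maybe a CRC)
    tag: int         # meaning unknown (maybe a timestamp)
    length: int      # asset size in bytes
    offset: int      # where the asset bytes start in the .fpk

def read_asset_entry(data: bytes, pos: int):
    """Parse one table entry at pos; return (entry, position of next entry)."""
    (name_len,) = struct.unpack_from("<I", data, pos)
    pos += 4
    # Assumption: the name field is rounded up to a multiple of 4 bytes.
    padded_len = -(-name_len // 4) * 4
    raw_name = data[pos:pos + name_len]
    pos += padded_len
    unknown, tag, length, offset = struct.unpack_from("<IIII", data, pos)
    pos += 16
    return AssetEntry(raw_name, unknown, tag, length, offset), pos
```

Calling it in a loop, once per entry counted in the Header, walks the whole table.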
Each byte of the filename has had 1 added to it (modulo 256), and the result is then padded with either 02 00 or 03 00 00 as needed.
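As a sketch of the filename obfuscation, the two helpers below subtract or add 1 per byte with wrap-around. The names `decode_name` and `encode_name` are made up, the padding is assumed to have been stripped already (using the filename length from the entry), and latin-1 is used only because it maps every byte to a character; the real character set of the filenames is an assumption.

```python
def decode_name(raw: bytes) -> str:
    """Undo the +1 obfuscation on an unpadded filename."""
    return bytes((b - 1) % 256 for b in raw).decode("latin-1")

def encode_name(name: str) -> bytes:
    """Apply the +1 obfuscation; padding still has to be appended."""
    return bytes((b + 1) % 256 for b in name.encode("latin-1"))
```

For example, `encode_name("a.nif")` gives `b"b/ojg"`, and `decode_name` turns it back.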
An extractor should check that the file sections indicated by Assets don't overlap, but there may be 1-4 bytes of padding between them. The table is expected to be sorted (i.e. entries appear in the same order as the actual asset bytes).
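One way such a sanity check could look, as a sketch: `check_layout` is a hypothetical helper that takes (offset, length) pairs in table order and reports anything suspicious. Accepting a gap of 0 bytes as well as 1-4 is an assumption.

```python
def check_layout(entries, max_gap=4):
    """Return a list of problems found in (offset, length) pairs.

    entries must be in table order. An empty list means no overlaps,
    correct ordering, and at most max_gap bytes between sections.
    """
    problems = []
    prev_end = None
    for i, (offset, length) in enumerate(entries):
        if prev_end is not None:
            gap = offset - prev_end
            if gap < 0:
                problems.append(f"entry {i} overlaps or is out of order (gap {gap})")
            elif gap > max_gap:
                problems.append(f"entry {i} follows an unexpected gap of {gap} bytes")
        prev_end = offset + length
    return problems
```

Returning the problems instead of raising lets the caller decide whether to abort or just warn.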
Types of files seen in the game's .fpks:
- Gamebryo Asset Files (.kfm files) -> nifskope can be used to view them
- .ddg textures (usually referenced from .nif) -> gimp-dds can be useful
- DLLs for libraries (
- TGA, BMP, JPG, PCX images
- some weird
- some weird
For textual data:
- UTF-16LE text files (.txt), with BOM and \r\n lines. (pirateopedia, cinematics, city names, etc.)
- Some ASCII files
- CSV files
- "STBL files" (.str) (which are just a table of translated strings) -> the format is (partially) described below, and I don't know if it's the same format used in, e.g., The Sims 3.
- XML files
- some internal notes or logs the developers left
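For the UTF-16LE .txt files, Python's built-in "utf-16" codec reads the BOM, picks the byte order and drops the BOM from the result, so a reader can be very short. This is a sketch; `read_game_text` is a made-up name.

```python
def read_game_text(data: bytes) -> list:
    """Decode an extracted .txt asset and return its lines."""
    # The "utf-16" codec consumes the BOM and selects the byte order.
    text = data.decode("utf-16")
    # splitlines() treats each \r\n pair as a single line break.
    return text.splitlines()
```

If a file turns out to have no BOM, decoding with "utf-16-le" explicitly would be the fallback to try.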
An STBL (.str) file has the following structure:
- A Language Header, starting with bytes
- List of Strings
The Language Header has the following structure:
- Integer (always 1? version?)
- Integer, language code? (italian 0x17, spanish 0x24, english 0x7)
- Integer (always 0?) ???
- Integer, how many Strings are in this Language (has to be 18293)
- Integer, how many bytes the string text is padded to (has to be 0x160 - 4 = 0x15c or 0x150?)
- Integer, apparently the number of bytes that follow (always 146344)
- The remaining 146344 bytes. I have no idea what they are for, but apparently they are the same across headers.
Note that 146344 / 8 = 18293, so there may be two integers for every string.
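A sketch of reading the Language Header fields listed above. The notes only say "Integer", so 32-bit little-endian unsigned values are an assumption here. `read_language_header` is a hypothetical name, and `pos` must point at the first integer, i.e. the caller has to skip whatever bytes the header starts with.

```python
import struct

def read_language_header(data: bytes, pos: int = 0):
    """Parse the six integers and the opaque block that follows them.

    Returns (header dict, position of the first String).
    Assumption: every integer is 32-bit little-endian, unsigned.
    """
    version, language, zero, count, pad_to, extra_len = struct.unpack_from(
        "<6I", data, pos
    )
    pos += 24
    extra = data[pos:pos + extra_len]  # purpose unknown
    pos += extra_len
    header = {
        "version": version,
        "language": language,
        "zero": zero,
        "string_count": count,
        "padded_size": pad_to,
        "extra": extra,
    }
    return header, pos
```

With a parsed header, checking `len(header["extra"]) == 8 * header["string_count"]` tests the "two integers per string" guess.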
Each String has the following structure:
- An integer, the length of the string in bytes (4 bytes)
- The string bytes, padded with \0 so that they take as many bytes as the header says.
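Reading one String could look like the sketch below. It assumes the 4-byte length is little-endian and unsigned, that the padded size from the header counts only the text bytes (not the length field), and that the text is latin-1, which is still unverified. `read_string` is a made-up name.

```python
import struct

def read_string(data: bytes, pos: int, padded_size: int):
    """Return (text, position of the next String)."""
    (length,) = struct.unpack_from("<I", data, pos)
    pos += 4
    # Only the first `length` bytes are text; the rest is \0 padding.
    text = data[pos:pos + length].decode("latin-1")
    return text, pos + padded_size
```

Because every record has the same size under these assumptions, string number n could also be located directly instead of walking the list.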
Text is encoded in latin-1 (verify) and uses
Game variables are written as