@iamgreaser
Last active April 15, 2021 19:30
THUG2 LZSS compression scheme (as used by the *.prx files)
Documented by GreaseMonkey in 2017
Document version V1
I release this document into the public domain.
AWWW YEAAAAH! Datz RIGHT b0!Z! We got a ... yeah whatever I'm not doing the
ASCII art required for that kind of introduction.
Well, they could've packed it a bit better, but hey, it took 50 minutes to
crack so I'm not complaining, and it is at least a decent compression scheme.
On the other hand, zlib is a lot better, and has a licence which makes the MIT
licence look restrictive.
As for the actual PRX structure, files and filenames are padded to the nearest
4-byte boundary, and the rest is pretty easy to work out - if not... OK, the
XeNTaX wiki pretty much lies, but it IS an IHIH main header / IIII file header
structure.
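
To make the padding and header sizes concrete, here is a minimal sketch. It assumes that "IHIH" and "IIII" are meant as Python `struct` format characters (I = 32-bit unsigned, H = 16-bit unsigned), that the fields are little-endian and tightly packed, and that padding rounds up. None of that is confirmed above, and `align4`, `MAIN_HEADER` and `FILE_HEADER` are made-up names. What each field means is deliberately left out.

```python
import struct


def align4(n):
    """Round n up to the next multiple of 4 (ASSUMPTION: padding rounds up)."""
    return (n + 3) & ~3


# ASSUMPTION: little-endian, no implicit padding between fields ("<").
MAIN_HEADER = struct.Struct("<IHIH")  # 4 + 2 + 4 + 2 = 12 bytes
FILE_HEADER = struct.Struct("<IIII")  # 4 * 4 = 16 bytes

# A 5-byte filename would occupy 8 bytes once padded.
padded_name_length = align4(5)
```

If the real headers turn out to be laid out differently, only the two format strings need changing.
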
Data is written to a 4KB ring buffer as it gets decompressed.
LZSS runs are stored as offset, (length-3), packed into two bytes: 12 bits of
offset and 4 bits of length.
This means that all LZSS runs are at least 3 bytes and at most 18 bytes long.
Offsets are absolute indices into the ring buffer.
The ring buffer starts decoding at index 0xFEE. Don't ask me why. It just does.
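
A rough sketch of the ring buffer on its own, to show the wrap-around. `RingBuffer`, `put`, `get` and `fill` are hypothetical names, and the initial contents of the buffer are not documented here, so the zero fill is purely an assumption. One observation, not a confirmed explanation: 0xFEE equals 4096 minus 18, the buffer size minus the longest run a 4-bit length field can describe, which is also the starting index used by Haruhiko Okumura's classic LZSS.C.

```python
RING_SIZE = 0x1000          # 4KB ring buffer
RING_MASK = RING_SIZE - 1   # AND with 0xFFF to wrap an index
RING_START = 0xFEE          # first index that gets written
MIN_RUN = 3                 # stored length 0 means a 3-byte run
MAX_RUN = 0xF + 3           # 4-bit stored length tops out at 18 bytes


class RingBuffer:
    def __init__(self, fill=0x00):
        # ASSUMPTION: the initial contents are unknown; `fill` is a guess.
        self.data = bytearray([fill]) * RING_SIZE
        self.pos = RING_START

    def put(self, byte):
        """Store one byte at the write index, then advance it with wrap."""
        self.data[self.pos] = byte
        self.pos = (self.pos + 1) & RING_MASK

    def get(self, index):
        """Fetch the byte at an absolute index into the ring buffer."""
        return self.data[index & RING_MASK]


# 20 bytes: the first 18 land at 0xFEE..0xFFF, the last 2 wrap to 0 and 1.
ring = RingBuffer()
for b in b"0123456789ABCDEFGHIJ":
    ring.put(b)
```
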
Main decode loop is as follows:
1. Read a byte. These are your type bits, one per item, bottom bit first.
2. If the bottom bit of the type bits is 1 (literal byte):
   A. If the file pointer is >= the compressed file length, END RIGHT HERE.
   B. Read a byte.
   C. Output that byte and store it into the ring buffer at the current
      write index, then add 1 to the write index modulo 4096.
3. Otherwise, if it's 0 (LZSS run):
   A. Read a byte. Call this b0. These are the lower 8 bits of the offset.
   B. Read another byte. Call this b1.
   C. Take the top 4 bits of b1. These are the upper 4 bits of the offset.
   D. Take the bottom 4 bits of b1. Add 3. This defines the length.
   E. For `length` bytes:
      a. Get the byte at `offset` in the ring buffer.
      b. Output that byte and store it into the ring buffer, same as in 2C.
      c. Add 1 to `offset` modulo 4096 (0x1000, or just AND with 0xFFF).
4. Shift the type bits right by one.
5. If you have used fewer than 8 type bits from this byte, go to 2.
   Otherwise, go to 1.
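
The loop above can be sketched in Python roughly like this. It is an untested-against-real-files sketch, not a reference implementation: `decompress_prx_lzss` and `fill` are hypothetical names, the zero-filled ring buffer is an assumption (the initial contents are not documented here), and stopping when the input runs out before a type byte or in the middle of a run is an extra safety assumption on top of the end check in step 2A.

```python
def decompress_prx_lzss(data, fill=0x00):
    """Decode one LZSS stream following the numbered steps in this document."""
    ring = bytearray([fill]) * 0x1000  # ASSUMPTION: initial contents unknown
    wpos = 0xFEE                       # ring buffer write index
    out = bytearray()
    pos = 0                            # "file pointer" into the input
    end = len(data)

    while pos < end:                   # ASSUMPTION: stop if no type byte left
        type_bits = data[pos]          # step 1
        pos += 1
        for _ in range(8):             # steps 2-5, eight items per type byte
            if type_bits & 1:
                # Step 2: literal byte.
                if pos >= end:         # step 2A: the documented end condition
                    return bytes(out)
                byte = data[pos]
                pos += 1
                out.append(byte)
                ring[wpos] = byte
                wpos = (wpos + 1) & 0xFFF
            else:
                # Step 3: LZSS run.
                if pos + 1 >= end:     # ASSUMPTION: truncated run ends stream
                    return bytes(out)
                b0 = data[pos]
                b1 = data[pos + 1]
                pos += 2
                offset = b0 | ((b1 & 0xF0) << 4)  # top nibble = offset bits 8-11
                length = (b1 & 0x0F) + 3          # bottom nibble = length - 3
                for _ in range(length):
                    byte = ring[offset]
                    out.append(byte)
                    ring[wpos] = byte
                    wpos = (wpos + 1) & 0xFFF
                    offset = (offset + 1) & 0xFFF
            type_bits >>= 1            # step 4
    return bytes(out)


# Hand-built stream: literals "abc", then a 3-byte run from index 0xFEE.
# Type byte 0xF7 = 0b11110111: three literals, one run, then spare 1 bits.
sample = bytes([0xF7]) + b"abc" + bytes([0xEE, 0xF0])
decoded = decompress_prx_lzss(sample)
```

Copying one byte at a time matters: a run is allowed to overlap the bytes it is currently writing, which is how a single literal followed by a longer run expands into a repeated byte.
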
None of this was ripped from any actual Tony Hawk engine code, compiled or
source, so you are free to use this for whatever.