iamgreaser/thps-format-psx.md

*.psx - Apocalypse / THPS / Spiderman / whatever model format
Documented by GreaseMonkey
Document version v2
I release this document into the public domain.
This document aims to cover the format as used by the PS1 and PC versions of THPS2.
Let me know if anything needs clarifying, I can usually be found under the name iamgreaser in various places.
Most fixed point values are in s19.12 (32-bit) or s3.12 (16-bit) fixed format. Occasionally you will see values in s7.24 fixed format - these will be indicated explicitly.
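
As a rough illustration of these formats, the helpers below convert raw integers to floats and back. The function names are made up for this sketch; only the number of fractional bits comes from the text above.

```python
def fixed_to_float(raw, frac_bits=12):
    """Convert a raw signed fixed-point integer to a float.

    frac_bits=12 covers s19.12 and s3.12; pass frac_bits=24 for s7.24.
    """
    return raw / float(1 << frac_bits)


def float_to_fixed(value, frac_bits=12):
    """Convert a float to the nearest raw fixed-point integer (no range check)."""
    return int(round(value * (1 << frac_bits)))


print(fixed_to_float(0x1000))         # 0x1000 is 1.0 with 12 fractional bits
print(fixed_to_float(0x1000000, 24))  # 0x1000000 is 1.0 with 24 fractional bits
```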

Components

File header

Python struct string is "<HHI".

H - Magic number: 0x0004
H - Magic number: 0x0002
I - Pointer to the start of the tagged chunk section.
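
A minimal sketch of reading this header, assuming the whole file has been loaded into a bytes object. The name read_header is hypothetical, and rejecting files with different magic numbers is a choice made here, not something the games are known to do.

```python
import struct

HEADER_FORMAT = "<HHI"  # two magic halfwords, then the tagged chunk pointer
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def read_header(data):
    """Check both magic numbers and return the tagged chunk pointer."""
    magic_a, magic_b, chunk_ptr = struct.unpack_from(HEADER_FORMAT, data, 0)
    if (magic_a, magic_b) != (0x0004, 0x0002):
        raise ValueError("bad magic: %04X %04X" % (magic_a, magic_b))
    return chunk_ptr


# Usage with a hand-built header rather than a real file.
sample = struct.pack(HEADER_FORMAT, 0x0004, 0x0002, 0x1234)
print(hex(read_header(sample)))
```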

Objects

This is prefixed with a 32-bit word indicating the number of objects.
Python struct string is "<IiiiIHHhhII".

I - Unknown. Probably flags of some sort.
iii - 3D position. s7.24 fixed format.
IH - Unknown.
H - Model index. Several objects can share a model but this is not a good idea!
hh - 2D... thing. Purpose unknown but it may relate to texturing.
I - Unknown.
I - Pointer to a 256-entry array of 4-byte tuples in an (R, G, B, x) arrangement.
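
The record layout above could be read along these lines. This is only a sketch: the field names (including the unk_* ones for the unknown fields) are invented here, and the caller has to supply the offset of the object count, since where the section starts is not spelled out above.

```python
import struct
from collections import namedtuple

OBJECT_FORMAT = "<IiiiIHHhhII"
OBJECT_SIZE = struct.calcsize(OBJECT_FORMAT)  # 36 bytes

# Illustrative field names; unk_* are the fields documented as unknown.
PsxObject = namedtuple("PsxObject", [
    "flags", "px", "py", "pz", "unk_a", "unk_b",
    "model_index", "u2d", "v2d", "unk_c", "palette_ptr",
])


def read_objects(data, offset):
    """Read the 32-bit object count at `offset`, then that many records.

    Returns (list of PsxObject, offset just past the last record).
    """
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    objects = []
    for _ in range(count):
        fields = struct.unpack_from(OBJECT_FORMAT, data, offset)
        objects.append(PsxObject._make(fields))
        offset += OBJECT_SIZE
    return objects, offset
```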

Models

This is prefixed with a 32-bit word indicating the number of models.
It is then followed by that many 32-bit words, each a pointer to one model in the file.
Each model starts with a header whose Python struct string is "<HHHHIhhhhhhI".

H - Unknown flags. Usually 0x0008, but sometimes 0x000A.
H - Number of vertices.
H - Number of planes. Often equal to the number of faces.
H - Number of faces.
I - Spherical radius from the model origin. s7.24 format.
hh - xmax, xmin bounds. Take note of the order.
hh - ymax, ymin bounds.
hh - zmax, zmin bounds.
I - Unknown. Usually 0xFFFF7FFF, but not always. Probably two halfword values.

The vertices follow, in this format:

hhh - 3D position.
h - Usually 0, most likely padding.

Then the planes:

hhh - 3D normal, should be of length 0x1000 which is 1.0 in the fixed-point scheme.
h - Usually 0, most likely padding.
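
Putting the model header, vertices and planes together, a reader might look like the sketch below. The function name and the shape of the returned dict are assumptions for illustration; the padding halfwords are read and discarded.

```python
import struct

MODEL_FORMAT = "<HHHHIhhhhhhI"
MODEL_SIZE = struct.calcsize(MODEL_FORMAT)  # 28 bytes


def read_model_geometry(data, offset):
    """Read one model header plus its vertex and plane arrays.

    Returns (header dict, vertices, planes, offset of the first face).
    """
    (flags, nverts, nplanes, nfaces, radius,
     xmax, xmin, ymax, ymin, zmax, zmin, unknown) = struct.unpack_from(
        MODEL_FORMAT, data, offset)
    offset += MODEL_SIZE

    header = {
        "flags": flags,
        "nfaces": nfaces,
        "radius": radius,
        # Stored max-first in the file; reordered to (min, max) here.
        "bounds": ((xmin, xmax), (ymin, ymax), (zmin, zmax)),
        "unknown": unknown,
    }

    vertices = []
    for _ in range(nverts):
        x, y, z, _pad = struct.unpack_from("<hhhh", data, offset)
        vertices.append((x, y, z))
        offset += 8

    planes = []
    for _ in range(nplanes):
        nx, ny, nz, _pad = struct.unpack_from("<hhhh", data, offset)
        planes.append((nx, ny, nz))
        offset += 8

    return header, vertices, planes, offset
```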

Then the faces... sometimes these have extra fields.


H - Base flags.

0x0003 - both bits set if textured, both cleared if untextured (flat-coloured).

Having either enabled enables texturing, but both should be enabled.
0x0001 on its own enables texturing and texcoords, but does not look up the correct texture.
0x0002 on its own enables texturing, but gives garbage coordinates.


0x0010 - set if triangle, cleared if quad.
0x0080 - set if invisible and non-physical, cleared if visible and physical.
0x0800 - set if gouraud-shaded, cleared if flat-shaded.
0x1000 - set if this polygon needs to be subdivided.

This should be enabled for textured polys, and disabled for untextured.
Attempting to subdivide an untextured poly results in it disappearing.


H - Length, starting from base flags.


BBBB - Vertex indices. For triangles the last one is 0.


BBBB - Depends on if this is flat or gouraud...

Gouraud case: Per-vertex RGBs palette indices. For triangles the last one is 0.
Flat case: The first 32-bit word of a PS1 GPU command...

First three bytes are (R, G, B).
Fourth byte is the command:

0x20 - Untextured, opaque, flat-shaded triangle.
0x22 - Untextured, translucent, flat-shaded triangle.
0x24 - Textured, opaque, flat-shaded triangle.
0x26 - Textured, translucent, flat-shaded triangle.
0x28 - Untextured, opaque, flat-shaded quad.
0x2A - Untextured, translucent, flat-shaded quad.
0x2C - Textured, opaque, flat-shaded quad.
0x2E - Textured, translucent, flat-shaded quad.


H - Plane index.


H - Surface flags.

0x0010 - set if wallrideable.
0x0040 - set for a quarterpipe's "large polygon". Typically has base flag 0x0080 set.
0x0080 - set if... I don't know what this does actually.
0x0100 - cleared if you can skate on it.
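
For reference, the flag bits listed above can be unpacked into something readable. This is a sketch with made-up names; the bit meanings for the flat-case command byte (0x02 translucent, 0x04 textured, 0x08 quad) are simply read off the 0x20..0x2E table, and treating either texture bit as "textured" follows the note under 0x0003.

```python
# Base flag bits.
BASE_TEXTURE_BITS = 0x0003
BASE_TRIANGLE = 0x0010
BASE_INVISIBLE = 0x0080
BASE_GOURAUD = 0x0800
BASE_SUBDIVIDE = 0x1000

# Surface flag bits.
SURF_WALLRIDE = 0x0010
SURF_QP_LARGE = 0x0040
SURF_NOT_SKATEABLE = 0x0100


def describe_base_flags(flags):
    """Return the documented base flag bits as booleans."""
    return {
        "textured": bool(flags & BASE_TEXTURE_BITS),
        "triangle": bool(flags & BASE_TRIANGLE),
        "invisible": bool(flags & BASE_INVISIBLE),
        "gouraud": bool(flags & BASE_GOURAUD),
        "subdivide": bool(flags & BASE_SUBDIVIDE),
    }


def describe_surface_flags(flags):
    """Return the documented surface flag bits as booleans."""
    return {
        "wallrideable": bool(flags & SURF_WALLRIDE),
        "quarterpipe_large_poly": bool(flags & SURF_QP_LARGE),
        "skateable": not (flags & SURF_NOT_SKATEABLE),
    }


def describe_flat_command(cmd):
    """Decode the fourth byte of the flat-case colour word (0x20..0x2E)."""
    return {
        "textured": bool(cmd & 0x04),
        "translucent": bool(cmd & 0x02),
        "quad": bool(cmd & 0x08),
    }
```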


If this face is textured, this follows:

I - Texture index.
BB BB BB BB - Array of 4 2D texture coordinates. For triangles the last one is (0,0).

There exist other flags which add zeros after this, but the purpose of those flags is unknown.
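
Because of those extra zeros, the safest way to walk the face array is to trust the length field rather than a computed record size. A sketch, with hypothetical names, assuming the texture fields are present whenever either texture bit is set and that each texture coordinate pair is stored as (u, v):

```python
import struct


def read_face(data, offset):
    """Read one face record starting at `offset`.

    Returns (face dict, offset of the next face). The next offset comes from
    the stored length, so unknown trailing fields are skipped.
    """
    base_flags, length = struct.unpack_from("<HH", data, offset)
    face = {
        "base_flags": base_flags,
        "vertex_indices": struct.unpack_from("<BBBB", data, offset + 4),
        # Palette indices (gouraud) or R, G, B, command (flat).
        "colour": struct.unpack_from("<BBBB", data, offset + 8),
    }
    plane_index, surface_flags = struct.unpack_from("<HH", data, offset + 12)
    face["plane_index"] = plane_index
    face["surface_flags"] = surface_flags

    if base_flags & 0x0003:  # assumption: either bit means texture data follows
        (face["texture_index"],) = struct.unpack_from("<I", data, offset + 16)
        uv = struct.unpack_from("<8B", data, offset + 20)
        face["texcoords"] = [(uv[i], uv[i + 1]) for i in range(0, 8, 2)]

    return face, offset + length
```

An untextured face with no extra fields comes to 16 bytes and a textured one to 28, which is a handy sanity check against the length field.
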
Tagged chunks

Terminates when you hit an FF FF FF FF chunk. No length follows that chunk; instead you go straight into the model names array.
Otherwise:

I - Chunk type. Sometimes it's a string, sometimes it's a number, but it's always 4 bytes.
I - Chunk length. Length calculation starts STRAIGHT AFTER this word.
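
The chunk walk described above might be sketched like this. The function name is hypothetical; tags are kept as raw 4-byte strings so that both the string-style and number-style chunk types can be compared directly.

```python
import struct

END_TAG = b"\xff\xff\xff\xff"


def read_chunks(data, offset):
    """Collect (tag, payload) pairs until the FF FF FF FF terminator.

    Returns (chunks, offset just past the terminator), which is where the
    model names array begins.
    """
    chunks = []
    while True:
        tag = data[offset:offset + 4]
        if len(tag) < 4:
            raise ValueError("ran off the end of the file without a terminator")
        offset += 4
        if tag == END_TAG:
            return chunks, offset
        (length,) = struct.unpack_from("<I", data, offset)
        offset += 4  # the length counts from straight after this word
        chunks.append((tag, data[offset:offset + length]))
        offset += length
```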

RGBs palette

Used for gouraud shading.
256 4-byte tuples in the form of (R, G, B, 0).
You could theoretically have several of these, but this hasn't been tested, and doesn't show up in the official levels.
0A 00 00 00 - Blockmap

Used for level physics. If you were wondering why you were being sucked into the middle of the map, it's probably because you forgot this or stuffed this up.
Python struct string is "<iiiiHH".

ii - xmin, zmin. s7.24 format.
ii - xmax, zmax. s7.24 format.
HH - xcells, zcells.

After this there are xcells * zcells entries each of this form:

II - Both 0.
I - Number of objects in this cell, immediately followed by an array of:

I - Object index.


I - 0.
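
Reading the blockmap back is the mirror image of that layout. A sketch with a hypothetical function name, taking the chunk payload (everything after the chunk length word) as input; the two leading words and the trailing word of each cell are skipped without being checked.

```python
import struct

BLOCKMAP_FORMAT = "<iiiiHH"
BLOCKMAP_HEADER_SIZE = struct.calcsize(BLOCKMAP_FORMAT)  # 20 bytes


def read_blockmap(payload):
    """Parse a blockmap chunk payload.

    Returns ((xmin, zmin, xmax, zmax, xcells, zcells), cells), where cells is
    a list of object-index lists in file order.
    """
    bounds = struct.unpack_from(BLOCKMAP_FORMAT, payload, 0)
    xcells, zcells = bounds[4], bounds[5]
    offset = BLOCKMAP_HEADER_SIZE

    cells = []
    for _ in range(xcells * zcells):
        _zero_a, _zero_b, count = struct.unpack_from("<III", payload, offset)
        offset += 12
        indices = list(struct.unpack_from("<%dI" % count, payload, offset))
        offset += 4 * count
        offset += 4  # trailing zero word
        cells.append(indices)
    return bounds, cells
```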

There are some restrictions:

Each cell must be an integer-dimensioned square.

Cell size is calculated by (xmax-xmin)/xcells or (zmax-zmin)/zcells. These MUST be equal, and MUST be integers.
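
A quick check for these restrictions, as a sketch. It reads "integer" as "the raw fixed-point extent divides evenly by the cell count", which is an assumption; the function name is made up.

```python
def blockmap_cell_size(xmin, zmin, xmax, zmax, xcells, zcells):
    """Return the cell size, or raise ValueError if the grid is not regular."""
    xsize, xrem = divmod(xmax - xmin, xcells)
    zsize, zrem = divmod(zmax - zmin, zcells)
    if xrem or zrem:
        raise ValueError("cell size is not an integer")
    if xsize != zsize:
        raise ValueError("cells are not square: %d vs %d" % (xsize, zsize))
    return xsize


print(blockmap_cell_size(0, 0, 200, 100, 2, 1))  # a 2x1 grid of 100-unit cells
```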


Model names

For every model, there is a 32-bit word giving that model's "name".
Textures

Names

32-bit word indicates the number of these.
Then there is an array of 32-bit words indicating the name of each texture.
4bpp / 16-colour palettes

32-bit word indicates the number of these.
Then there is an array of the following:

I - Texture name which this palette belongs to.
"H"*16 - 15bpp PS1-format palette entries.

8bpp / 256-colour palettes

32-bit word indicates the number of these.
Then there is an array of the following:

I - Texture name which this palette belongs to.
"H"*256 - 15bpp PS1-format palette entries.

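Both palette arrays share the same record shape, so one reader can serve for either. In this sketch the names are hypothetical, and the colour decoding assumes the usual PS1 layout of 5 bits each for red, green and blue (red in the low bits) with the top bit as the semi-transparency flag.

```python
import struct


def ps1_colour_to_rgb(entry):
    """Expand one 16-bit PS1 colour to (r, g, b, stp) with 8-bit channels."""
    r5 = entry & 0x1F
    g5 = (entry >> 5) & 0x1F
    b5 = (entry >> 10) & 0x1F
    stp = bool(entry & 0x8000)

    def expand(c):
        # Replicate the top bits so that 0x1F maps to 0xFF.
        return (c << 3) | (c >> 2)

    return expand(r5), expand(g5), expand(b5), stp


def read_palette(data, offset, ncolours):
    """Read one palette record with `ncolours` entries (16 or 256).

    Returns (texture name, list of (r, g, b, stp), offset past the record).
    """
    (name,) = struct.unpack_from("<I", data, offset)
    entries = struct.unpack_from("<%dH" % ncolours, data, offset + 4)
    colours = [ps1_colour_to_rgb(e) for e in entries]
    return name, colours, offset + 4 + 2 * ncolours
```
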
Texture data

This is prefixed with a 32-bit word indicating the number of textures actually present in this file.
Python struct string is "<IIIIHH".

I - Unknown. Either 0 or 1.
I - Number of colours in the palette.
I - Name of the texture.
I - Index of the texture. I think this was an index into the texture names array.
HH - Width and height.

Alignment behaviour is something like this:

4bpp texture widths are rounded up to the nearest 4-pixel boundary.
8bpp texture widths are rounded up to the nearest 2-pixel boundary.

Unsure if it rounds the whole texture size up to a 4-byte boundary. It probably does, but I haven't encountered that scenario.
Advice? Don't use stupid texture sizes.
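
To make the width rounding concrete, here is a sketch that reads a texture header and works out the padded row size. The names are hypothetical, and the byte counts assume 4bpp pixels are packed two to a byte and 8bpp pixels one to a byte; any whole-texture padding is not handled, since that case is unconfirmed.

```python
import struct

TEXTURE_FORMAT = "<IIIIHH"
TEXTURE_HEADER_SIZE = struct.calcsize(TEXTURE_FORMAT)  # 20 bytes


def padded_row_bytes(width, ncolours):
    """Bytes per row after rounding the width up as described above."""
    if ncolours == 16:
        return ((width + 3) & ~3) // 2  # 4-pixel boundary, 2 pixels per byte
    if ncolours == 256:
        return (width + 1) & ~1         # 2-pixel boundary, 1 pixel per byte
    raise ValueError("unexpected palette size: %d" % ncolours)


def read_texture_header(data, offset):
    """Read one texture header; returns (header dict, offset past the header)."""
    unknown, ncolours, name, index, width, height = struct.unpack_from(
        TEXTURE_FORMAT, data, offset)
    header = {
        "unknown": unknown, "ncolours": ncolours, "name": name,
        "index": index, "width": width, "height": height,
    }
    return header, offset + TEXTURE_HEADER_SIZE
```

Either way, every row ends up a multiple of 2 bytes.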

Algorithms

All code snippets are written in Python 3.5.
I release all code written here into the public domain.
Model radius

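Get the largest vertex distance from the origin. This is the radius of your sphere.
import math

# Shifting the squared length left by 24 before the square root scales the
# result by 1<<12, which gives the 32-bit fixed-point radius directly.
self.radius = max(
    int(math.ceil(math.sqrt((v[0]**2 + v[1]**2 + v[2]**2) << 24))) & ~0xFFF
    for v in self.vertices)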

Zeroing the bottom 12 bits is optional. The official tool appears to do this.
Planes

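Take a cross product and normalise the result to 0x1000.
import struct

# First three vertices of the face; each vertex is an (x, y, z, pad) tuple.
v0 = self.vertices[face.vidxs[0]]
v1 = self.vertices[face.vidxs[1]]
v2 = self.vertices[face.vidxs[2]]
x0,y0,z0,_, = v0
x1,y1,z1,_, = v1
x2,y2,z2,_, = v2
# Edge vectors from v0.
dx1,dy1,dz1 = x1-x0,y1-y0,z1-z0
dx2,dy2,dz2 = x2-x0,y2-y0,z2-z0
# Cross product of the two edges, scaled down to keep the floats small.
fx = float((dy2*dz1)-(dz2*dy1))/4096.0
fy = float((dz2*dx1)-(dx2*dz1))/4096.0
fz = float((dx2*dy1)-(dy2*dx1))/4096.0
# Normalise, guarding against a zero-length normal on degenerate faces.
norm = 1.0/max((fx*fx+fy*fy+fz*fz)**0.5, 0.0001)
# Scale so that a unit normal has length 0x1000.
x = int(round(fx*norm*4096))
y = int(round(fy*norm*4096))
z = int(round(fz*norm*4096))
# Plane record: normal plus one halfword of padding.
fp.write(struct.pack("<hhhh", x, y, z, 0))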

The official tool appears to have slightly shoddy rounding here.
Blockmap generation

I bothered so you don't have to.
This isn't optimal but it works. It also assumes that there is a 1-to-1 mapping from objects to models. Adjust to suit.
For an alternative, just use the objects when forming the bounding box, ignore the model bounding boxes, and give a large enough padding.
import struct

# Number of cells along each axis.
GDIVX = 20
GDIVZ = 20
g_xmin = min(map(lambda o,m: o.px + (m.xmin<<12), self.objs, self.mdls))-0x20000
g_zmin = min(map(lambda o,m: o.pz + (m.zmin<<12), self.objs, self.mdls))-0x20000
g_xmax = max(map(lambda o,m: o.px + (m.xmax<<12), self.objs, self.mdls))+0x20000
g_zmax = max(map(lambda o,m: o.pz + (m.zmax<<12), self.objs, self.mdls))+0x20000
g_xlen = (g_xmax-g_xmin+GDIVX-1)//GDIVX
g_zlen = (g_zmax-g_zmin+GDIVZ-1)//GDIVZ
g_xlen = g_zlen = max(g_xlen, g_zlen) # grid must be regular!
g_xmax = g_xmin + g_xlen*GDIVX
g_zmax = g_zmin + g_zlen*GDIVZ
fp.write(struct.pack("<i", g_xmin))
fp.write(struct.pack("<i", g_zmin))
fp.write(struct.pack("<i", g_xmax))
fp.write(struct.pack("<i", g_zmax))
fp.write(struct.pack("<HH", GDIVX, GDIVZ))

for z in range(GDIVZ):
    for x in range(GDIVX):
        xmin = g_xmin + (x+0)*g_xlen
        xmax = g_xmin + (x+1)*g_xlen
        zmin = g_zmin + (z+0)*g_zlen
        zmax = g_zmin + (z+1)*g_zlen

        L = []
        for (i, (o, m,),) in enumerate(zip(self.objs, self.mdls)):
            if o.px+(m.xmax<<12) < xmin: continue
            if o.pz+(m.zmax<<12) < zmin: continue
            if o.px+(m.xmin<<12) > xmax: continue
            if o.pz+(m.zmin<<12) > zmax: continue
            L.append(i)

        fp.write(struct.pack("<II", 0, 0))
        fp.write(struct.pack("<I", len(L)))
        for n in L:
            fp.write(struct.pack("<I", n))
        fp.write(struct.pack("<I", 0))


Changelog

v2


Add changelog.
Elaborate on face texturing and subdivision flags.

v1

Initial release.