@Pokechu22
Last active April 17, 2021 05:24
(Partially incorrect) notes while investigating the Luigi's Mansion portrait issue (https://bugs.dolphin-emu.org/issues/11462)

The painting in Luigi's Mansion in JMC's FIFO on frame 0 is drawn by objects 373 and 374. Object 373 shows Mario, while object 374 shows Bowser; a lot of the early objects are rendering either Mario or Bowser (and both are rendered, even though Bowser only shows up in the ending cutscene). The switch between them is presumably done by fading BPMEM_TEV_COLOR_RA's alpha from 0xff to 0 for object 373 and from 0 to 0xff for object 374; in any case, it is at 0 for object 374 in the given frame.

I dumped textures while playing back the FIFO with the software renderer, and got tar364_stage0_map0_mip0.png, tar365_stage0_map0_mip0.png, and tar364_ind0_map1_mip0.png (and an identical copy as tar365_ind0_map1_mip0.png). Although that last file may appear invisible, it has contents; its alpha channel just happens to be all zeros, so it doesn't show up properly. noise.png shows what it looks like with the alpha channel removed.

Lots of objects
Obj 372 @ 000e2b99:
XF register XFMEM_SETNUMTEXGENS
Number of tex gens: 1

Obj 373 @ 000e2d67:
BP register BPMEM_IREF
Stage 0 ntexmap: 1
Stage 0 ntexcoord: 1
Stage 1 ntexmap: 0
Stage 1 ntexcoord: 0
Stage 2 ntexmap: 0
Stage 2 ntexcoord: 0
Stage 3 ntexmap: 0
Stage 3 ntexcoord: 0

BP register BPMEM_RAS1_SS0
Indirect texture stages 0 and 1:
Even stage S scale: 0 (1)
Even stage T scale: 0 (1)
Odd stage S scale: 0 (1)
Odd stage T scale: 0 (1)

BP register BPMEM_IND_CMD command 0
Indirect tex stage ID: 0
Format: ITF_8 (0)
Bias: STU (7)
Bump alpha: Off (0)
Offset matrix ID: 1
Regular coord S wrapping factor: Off (0)
Regular coord T wrapping factor: Off (0)
Use modified texture coordinates for LOD computation: No
Add texture coordinates from previous TEV stage: No

BP register BPMEM_TEV_COLOR_RA Tev register 1
Type: Color (0)
Alpha: 0ff (this one varies; object 374 (Bowser) has it at 0)
Red: 0ff

Obj 373 @ 000e2dab:

BP register BPMEM_TREF number 0
Stage 0 texmap: 0
Stage 0 tex coord: 0
Stage 0 enable texmap: Yes
Stage 0 color channel: Zero (7)
Stage 1 texmap: 0
Stage 1 tex coord: 0
Stage 1 enable texmap: No
Stage 1 color channel: Zero (7)

Obj 373 @ 000e2df1:
BP register BPMEM_IND_IMASK
No description available [parameter is 2]

BP register BPMEM_GENMODE
Num tex gens: 1
Num color channels: 1
Unused bit: 0
Flat shading (unconfirmed): No
Multisampling: No
Num TEV stages: 0
Cull mode: Back-facing primitives only (1)
Num indirect stages: 1
ZFreeze: No

CP register VCD_LO
Position and normal matrix index: Not present
Texture Coord 0 matrix index: Not present
Texture Coord 1 matrix index: Not present
Texture Coord 2 matrix index: Not present
Texture Coord 3 matrix index: Not present
Texture Coord 4 matrix index: Not present
Texture Coord 5 matrix index: Not present
Texture Coord 6 matrix index: Not present
Texture Coord 7 matrix index: Not present
Position: Direct (1)
Normal: Not present (0)
Color 0: Not present (0)
Color 1: Not present (0)

CP register VCD_HI
Texture Coord 0: Direct (1)
Texture Coord 1: Direct (1)
Texture Coord 2: Not present (0)
Texture Coord 3: Not present (0)
Texture Coord 4: Not present (0)
Texture Coord 5: Not present (0)
Texture Coord 6: Not present (0)
Texture Coord 7: Not present (0)

XF register XFMEM_VTXSPECS
Num colors: 0
Num normals: None (0)
Num textures: 2

CP register CP_VAT_REG_A - Format 0
Position elements: 3 (x, y, z) (1)
Position format: Float (4)
Position shift: 0 (1)
Normal elements: 1 (n) (0)
Normal format: Float (4)
Color 0 elements: 4 (r, g, b, a) (1)
Color 0 format: RGBA 32 bits 8888 (5)
Color 1 elements: 4 (r, g, b, a) (1)
Color 1 format: RGBA 32 bits 8888 (5)
Texture coord 0 elements: 2 (s, t) (1)
Texture coord 0 format: Float (4)
Texture coord 0 shift: 0 (1)
Byte dequant: shift applies to u8/s8 components
Normal index 3: single index per normal

CP register CP_VAT_REG_B - Format 0
Texture coord 1 elements: 2 (s, t) (1)
Texture coord 1 format: Float (4)
Texture coord 1 shift: 0 (1)
Texture coord 2 elements: 2 (s, t) (1)
Texture coord 2 format: Float (4)
Texture coord 2 shift: 0 (1)
Texture coord 3 elements: 2 (s, t) (1)
Texture coord 3 format: Float (4)
Texture coord 3 shift: 0 (1)
Texture coord 4 elements: 2 (s, t) (1)
Texture coord 4 format: Float (4)
Enhance VCache (must always be on): Yes

CP register CP_VAT_REG_C - Format 0
Texture coord 4 shift: 0 (1)
Texture coord 5 elements: 2 (s, t) (1)
Texture coord 5 format: Float (4)
Texture coord 5 shift: 0 (1)
Texture coord 6 elements: 2 (s, t) (1)
Texture coord 6 format: Float (4)
Texture coord 6 shift: 0 (1)
Texture coord 7 elements: 2 (s, t) (1)
Texture coord 7 format: Float (4)
Texture coord 7 shift: 0 (1)

Primitive GX_DRAW_TRIANGLES VAT 0

450ac000c3340000c5e45000 3dcccccd00000000 0000000000000000 
45147000c3bb8000c5e45000 3f6666663f800000 3f8000003f800000 
450ac000c3bb8000c5e45000 3dcccccd3f800000 000000003f800000 

Primitive GX_DRAW_TRIANGLES VAT 0

450ac000c3340000c5e45000 3dcccccd00000000 0000000000000000 
45147000c3340000c5e45000 3f66666600000000 3f80000000000000 
45147000c3bb8000c5e45000 3f6666663f800000 3f8000003f800000 

The above primitives decoded (https://godbolt.org/z/EE6qbdWhY), basically just a rectangle (but they didn't use a quad for some reason):

2220 -180 -7306  0.1 0  0 0
2375 -375 -7306  0.9 1  1 1
2220 -375 -7306  0.1 1  0 1

2220 -180 -7306  0.1 0  0 0
2375 -180 -7306  0.9 0  1 0
2375 -375 -7306  0.9 1  1 1
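
As a rough stand-in for the linked decoder, the sketch below decodes the same vertex rows in Python. It assumes each row is seven big-endian 32-bit floats laid out as position (x, y, z), tex coord 0 (s, t), and tex coord 1 (s, t), which is what the VCD and VAT dumps above indicate. The name `decode_vertex` is made up for illustration.

```python
import struct

def decode_vertex(row: str) -> tuple:
    """Decode one hex vertex row into 7 floats: x, y, z, s0, t0, s1, t1."""
    raw = bytes.fromhex(row.replace(" ", ""))
    # 7 big-endian IEEE-754 singles: 3 for position, 2 + 2 for texture coords
    return struct.unpack(">7f", raw)

rows = [
    "450ac000c3340000c5e45000 3dcccccd00000000 0000000000000000",
    "45147000c3bb8000c5e45000 3f6666663f800000 3f8000003f800000",
    "450ac000c3bb8000c5e45000 3dcccccd3f800000 000000003f800000",
]
for row in rows:
    # Round for display; 0x3dcccccd and 0x3f666666 are only approximately 0.1 and 0.9
    print([round(v, 1) for v in decode_vertex(row)])
```

Tex coord 0 only spans 0.1 to 0.9 horizontally, while tex coord 1 spans the full 0 to 1 range.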

VCD_HI specifies 2 texture coords and XFMEM_VTXSPECS specifies 2 textures, but BPMEM_GENMODE and XFMEM_SETNUMTEXGENS specify 1 tex gen. (The latter is a leftover from object 372 and may not be relevant; I think the BP one is more likely to be what matters. The current patch uses the BP one, though my initial software renderer implementation used the XF one because I didn't realize that both existed.) And in case it's not clear, this effect is achieved through an indirect texture, which to my understanding serves to implement bump mapping: one texture acts as offsets to another texture, which with the noise image gives the scrambling effect.

The one thing that seems important is the use of BPMEM_IND_IMASK, which Dolphin doesn't have implemented. It's not entirely clear what it does, but from looking at Kirby's Dream Collection's __GXSetIndirectMask (relevant code is at 80037b60 and 800320d4) it seems that it takes a byte as a parameter, and the default value is 0 (though the variable starts at 0xff). Setting it to 0x02 is thus suspicious; there are up to 8 texture coordinates, so maybe it has one bit for each?

Libogc also has the same code, calling it __GX_SetIndirectMask, which is only used in GX_Init (it's also initialized there). The register is also written by __GX_FlushTextureState, and on GameCube only for some reason by __GX_UpdateBPMask. I don't know why it's GameCube only; it was changed from Wii only without explanation. But, assuming this function is correct and the names have the right meanings, the code gets bpmem.genMode.numindstages, and then tries to iterate over them, but then just hardcodes numbers anyways (the GX_INDTEXSTAGE0 constants simply expand to 0 through 3, and numindstages is a 3-bit field so it can represent 0 through 7). __gx->tevRasOrder[2] refers to bpmem.tevindref (for whatever reason libogc uses 11 fields, combining the two texscale fields, tevindref, and the 8 tevorders together). It sets ntexmap from what Dolphin calls bi0/bi1/bi2/bi4 (but YAGCD numbers as 0/1/2/3). It then puts 1 << ntexmap into the mask. Since the Luigi's Mansion one has bi0/ntexmap as 1, the mask is and should be 2. That seems consistent.
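
The mask computation described above can be sketched like this. It is a paraphrase of the logic as I read it, not a copy of libogc's code; `indirect_mask` and its arguments are hypothetical names.

```python
def indirect_mask(ntexmaps, num_ind_stages):
    """Set one bit per texture map referenced by an active indirect stage."""
    mask = 0
    for stage in range(num_ind_stages):
        mask |= 1 << ntexmaps[stage]
    return mask & 0xFF  # the register parameter appears to be a single byte

# Object 373: BPMEM_IREF stage 0 ntexmap is 1, and GENMODE has 1 indirect stage.
print(hex(indirect_mask([1, 0, 0, 0], 1)))  # 0x2
```

Under this reading the mask has one bit per texture map rather than per texture coordinate, which would match the parameter of 2 seen in the FIFO.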

Looks like texscale is also for 4 indirect stages.

OK... if the indirect texture is #1, and the direct texture is #0, what exactly is going on? And what texture IS Dolphin using?

Alright, seems like both tevind and tevorders are indexed from 0 to 15 by numtevstages. I buy that.


My current working hypothesis is that the texture being used for indirect purposes (the noise one) isn't being loaded right somehow. For instance, in the software renderer, on the line after TextureSampler::Sample, for object 373, IndirectTex was always eeeeee00 (ee for non-alpha components). This led me to test differences between the software and hardware renderers, and I actually found something quite odd: Mario was in a different place. But actually, that doesn't even require the software renderer; I've gotten (or at least I think I've gotten; it may have been experimental error with some other changes left behind) different places with just OpenGL too. Unfortunately I can't consistently reproduce that aspect, but the software-versus-OpenGL difference seems consistent: take a look at the fingers in MarioOGL.png and MarioSoftware.png; they're offset differently (note that both of these images are on the first frame and were captured on 5.0-13963).

To test further, I need to see what the portrait is actually using for the indirect texture. Since, for some reason, the indirect texture is transparent, this requires hacking up Dolphin's texture decoding code by hardcoding alpha to be 0xff here and here. I also have to edit the FIFO to do this; I wrote c9 at offset 001e786f, changing 28 3803c0 to 28 3803c9 (making BPMEM_TREF number 0 use texture and tex coord 1 instead of 0). This gives the results in MarioTex1OGL.png and MarioTex1Software.png (software is gray but OGL shows the noise texture). Note that it's still applying the indirect texture in this case (so the noise is getting offset with itself); if the number of indirect stages in BPMEM_GENMODE is set to 0 by writing 00 to 001e78b8, then I get MarioTex1NoIndOGL.png and MarioTex1NoIndSoftware.png, with OGL having a different color than software. Since the color corresponds to the offset more or less, this explains the different shifts between OGL and software.
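
To see what the `3803c0` to `3803c9` edit changes, here is a small decoder for the 24-bit BPMEM_TREF value. The bit layout (3-bit texmap, 3-bit tex coord, 1-bit enable, 3-bit color channel, repeated at bit 12 for the odd stage) is an assumption on my part, chosen because it reproduces the register dump above; `decode_tref` is a hypothetical helper.

```python
def decode_tref(value):
    """Split a 24-bit BPMEM_TREF value into its even and odd TEV stage fields."""
    stages = []
    for shift in (0, 12):  # even stage in the low 12 bits, odd stage above it
        half = (value >> shift) & 0xFFF
        stages.append({
            "texmap": half & 7,
            "texcoord": (half >> 3) & 7,
            "enable": bool((half >> 6) & 1),
            "colorchan": (half >> 7) & 7,
        })
    return stages

print(decode_tref(0x3803C0)[0])  # original: texmap 0, tex coord 0
print(decode_tref(0x3803C9)[0])  # patched: texmap 1, tex coord 1
```

Only the low byte differs, so stage 1 and the color channel fields are left as they were.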

It's also worth looking at MarioTex1OGL_5K.png; you can see the noise texture much more clearly and that it's offset weirdly (look for the ᅥ‾ pattern near the center, which is on the bottom right of noise_upscaled.png).

So, to clarify, I've looked over the commands used to render the relevant object (373), and everything seems reasonable. It's supposed to use indirect textures/bump mapping with texture 1 (the noise image) to distort texture 0 (the Mario image). The only problem is that the number of tex gens is set to 1, not 2, so texture 1 isn't valid(?). The current "fix" changes texture 1 to texture 0, which would cause the Mario image to be distorted by the Mario image — which seems to produce some distortion, but I don't think it's the right distortion. (This was a misunderstanding of what was happening, see below.)

To test what it'd look like if the number of tex gens is 2, I did some more fifolog hex-editing. At 001e7840, replace 61 E3 0FF0FF 61 E3 0FF0FF (two redundant BPMEM_TEV_COLOR_BG commands; for some reason there tend to be 3 of these for each BPMEM_TEV_COLOR_RA) with 10 0000103f 00000002 00 (XF_SETNUMTEXGENS 2, followed by a NOP). Also write 12 to 001e78ba and 001e7b20 to change 00 014011 to 00 014012, i.e. to set BPMEM_GENMODE's num tex gens to 2.
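
A sketch of how these edits could be scripted instead of done by hand. The `patch` helper is hypothetical and runs against a synthetic buffer here, not the real fifolog; the byte strings are the ones quoted above. The check that both strings have the same length matters because the rest of the file must not shift.

```python
def patch(buf: bytearray, offset: int, expected: bytes, replacement: bytes) -> None:
    """Overwrite bytes in place, refusing if the buffer doesn't hold what we expect."""
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the file length")
    if buf[offset:offset + len(expected)] != expected:
        raise ValueError("unexpected bytes at offset")
    buf[offset:offset + len(replacement)] = replacement

# Two redundant BP writes (opcode 61, register E3, value 0FF0FF), 5 bytes each
OLD = bytes.fromhex("61E30FF0FF" "61E30FF0FF")
# XF load (opcode 10), address 103f, value 2, then one NOP byte
NEW = bytes.fromhex("10" "0000103f" "00000002" "00")

demo = bytearray(b"\xAA" * 4 + OLD + b"\xBB" * 4)  # stand-in for the fifolog
patch(demo, 4, OLD, NEW)
print(demo.hex())

# Num tex gens sits in the low 4 bits of BPMEM_GENMODE, per the dump above
print(0x014011 & 0xF, 0x014012 & 0xF)
```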

This gives the following results in 5.0-13963: Mario2TexGensOGL.png and Mario2TexGensSoftware.png. And that result is... much more blocky than the version from #8296 (Mario8296.png).

Unfortunately, it seems like the path I assumed was intended isn't what's used. Looking at a longplay, the closest frame I could find is Screenshot1h10m32s17f.png; the hand at least matches very closely. You can clearly see that it matches the smaller pixels in Mario8296.png. Furthermore, a later cutscene has a close-up; Screenshot3h13m29s00f.png again shows much smaller pixels. Although I don't have a FIFO to compare to for that later cutscene, I think it's safe to assume that it's set up the same way there and has the same 64×64 noise texture as texture 1 but uses the Mario texture (or some other smaller texture) for offsetting instead.

In other words, #8296 is correct or at least not incorrect in this case, and hardware testing is needed to confirm the behavior when indirect textures are used with an insufficient number of tex gens.


CORRECTION: I previously said that the fix changes texture 1 to texture 0. That's wrong; it uses texture coord 0 instead, but continues using the noise texture. This means that the Mario texture is not being distorted by the Mario texture; it's being distorted by the noise texture as before. That explains why the distortion is the same even as the Mario texture moves. Since XF multiplies texture coordinates by the corresponding size, tex coord 0 ranges from 0 to 128 and tex coord 1 would range from 0 to 64 (BPMEM_SU_SSIZE/TSIZE number 0 use 128, while BPMEM_SU_SSIZE/TSIZE number 1 use 64; this is consistent with the width and height in BPMEM_TX_SETIMAGE0 for texture unit 0/1 respectively). However, since the number of tex gens is 1, tex coord 1 isn't processed by XF, and isn't used here. The range from 0 to 128 instead of 0 to 64 is what causes the finer-grained noise, as the texture is repeating twice horizontally and twice vertically. (This can be confirmed by looking for the row of 3 black pixels near the top left corner of noise_upscaled.png, which appears 4 times in MarioTex1OGL_5K.png. The pattern I looked for before was very close to the edge and got cut off, but can be partially seen at the bottom of the painting as well.)
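
The repetition can be illustrated with a little arithmetic. This is a simplified sketch that assumes repeat wrapping and ignores filtering, bias, and the indirect matrix; `texel_index` and `repeats` are made-up names.

```python
def texel_index(coord, coord_scale, tex_size):
    """Scale a normalized coordinate, then wrap it into the texture (repeat mode assumed)."""
    return int(coord * coord_scale) % tex_size

def repeats(coord_scale, tex_size):
    """How many times the texture tiles along one axis."""
    return coord_scale / tex_size

# Tex coord 0 is scaled by 128, but the noise texture is only 64 texels wide,
# so points a half-painting apart land on the same noise texel.
print(texel_index(0.25, 128, 64), texel_index(0.75, 128, 64))  # 32 32
print(repeats(128, 64))  # 2.0 per axis, so 4 copies over the painting
print(repeats(64, 64))   # 1.0, what tex coord 1 would have given
```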
