I have noticed that ST texture coordinates behave differently on PCSX2 than on PS2 hardware.
On hardware, ST coordinates behave as if the texture had half its actual width and height in texels. That is, only half of the texture can be addressed in the ST coordinate range [0,1] on each axis (so only 1/4 of the total texture can be addressed!)
Here are images captured on emulator vs hardware demonstrating the problem:
In demo 1 the top left corner has ST coordinate (0,0), the bottom right has ST coordinate (1,1).
In demo 2 the top left corner has ST coordinate (1,1), the bottom right has ST coordinate (2,2).
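A minimal sketch of the mapping described above, in Python for convenience. It assumes the usual GS rule that a normalized coordinate S/Q is scaled by the texture width 2^TW (and T/Q by 2^TH). The "hardware" case is only a model of the symptom reported here (as if TW and TH were one smaller), not a confirmed explanation; the function name st_to_texel is hypothetical.

```python
def st_to_texel(s, t, q, tw, th):
    """Map ST/Q coordinates to texel coordinates for a 2**tw x 2**th texture."""
    return (s / q) * (1 << tw), (t / q) * (1 << th)

TW = TH = 6  # 64x64 texture, as set in TEX0 in this report

# What I expect (and what PCSX2 shows): ST = (1,1) reaches the far corner.
expected = st_to_texel(1.0, 1.0, 1.0, TW, TH)

# Model of the hardware symptom (assumption): behaves like TW-1 / TH-1,
# so ST = (1,1) only reaches the middle of the texture.
observed = st_to_texel(1.0, 1.0, 1.0, TW - 1, TH - 1)

print("expected texel:", expected)
print("observed texel (modelled):", observed)
```

Under this model, ST = (2,2) is what reaches the far corner on hardware, which matches what demo 2 shows.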
Check out this commit of this repo: https://github.com/phy1um/ps2-homebrew-livestreams/tree/a16848c3bf0ad479755b3ab1213e3e3c4b995176
Then either compile using pre-installed PS2SDK and extras, or for simplicity do:
make docker-image
make assets
make docker-elf
If you have PCSX2 installed and on your path, test easily with make run
To run on hardware, manually launch the file dist/test.elf with ps2client. There are Makefile rules for this: PS2HOST=x.x.x.x make runps2 and make resetps2.
This program cycles between 2 different ST coordinates for the bottom-right corner as you press X. First (1,1) then (2,2).
Check the function sprite(...) defined in script/draw2d.lua for how the buffers are being constructed. It's easy to poke around and change things if you understand how the GS works (I hope!)
The following is an annotated capture of the buffer being sent to GS each frame, from PCSX2 stdout.
-- clear the screen (from PS2SDK)
1000000000000001 000000000000000e
000000000003200f 0000000000000047
1000000000000001 000000000000000e
000000000003200f 0000000000000047
1000000000000002 000000000000000e
0000000000000000 000000000000001a
0000000000000400 000000000000001b
1000000000000002 000000000000000e
0000000000000006 0000000000000000
3f800000802b2b2b 0000000000000001
2400000000000014 0000000000000055
00000000f1f9ebf9 000000000dfaedf9
00000000f1f9edf9 000000000dfaeff9
00000000f1f9eff9 000000000dfaf1f9
00000000f1f9f1f9 000000000dfaf3f9
00000000f1f9f3f9 000000000dfaf5f9
00000000f1f9f5f9 000000000dfaf7f9
00000000f1f9f7f9 000000000dfaf9f9
00000000f1f9f9f9 000000000dfafbf9
00000000f1f9fbf9 000000000dfafdf9
00000000f1f9fdf9 000000000dfafff9
00000000f1f9fff9 000000000dfa01f9
00000000f1f901f9 000000000dfa03f9
00000000f1f903f9 000000000dfa05f9
00000000f1f905f9 000000000dfa07f9
00000000f1f907f9 000000000dfa09f9
00000000f1f909f9 000000000dfa0bf9
00000000f1f90bf9 000000000dfa0df9
00000000f1f90df9 000000000dfa0ff9
00000000f1f90ff9 000000000dfa11f9
00000000f1f911f9 000000000dfa13f9
1000000000000001 000000000000000e
0000000000000001 000000000000001a
### my stuff begins here ###
-- set texture registers
1000000000000004 000000000000000e
-- set TEXA to 0x80,0x80 (no transparency by default)
0000008000000080 000000000000003b
-- set TEX1...
0000000000000101 0000000000000014
-- set TEX0
-- image has width=64, height=64, PSM=32 so we expect to see
-- TBW = 0x1
-- the TW and TH fields set to 0x6
-- this looks correct to me!
0000000598007499 0000000000000006
-- set PRIM register for SPRITE with texturing flag on
0000000000000016 0000000000000000
-- 1 loop setting 6 registers - (ST, RGBAQ, XYZ2) x2
6000000000000001 0000000000512512
-- ST = 0,0 Q = 1.0
0000000000000000 000000003f800000
-- RGBA = (0x80,0x80,0x80,0x80)
0000008000000080 0000008000000080
-- top left corner coordinate
00007e8000007880 0000000000000000
-- ST = 1.0, 1.0 Q = 1.0
3f8000003f800000 000000003f800000
-- RGBA = (0x80,0x80,0x80,0x80)
0000008000000080 0000008000000080
-- bottom right corner coordinate
00008b0000008500 0000000000000000
-- draw finish
1000000000008001 000000000000000e
0000000000000001 0000000000000061
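To double-check the annotations in the capture, here is a small Python sketch that decodes the TEX0 value and the sprite GIFtag from the dump. The bit offsets are my reading of the GS register and GIFtag layouts, so treat them as assumptions; the helper names (bits, decode_tex0, decode_giftag) are hypothetical and not part of the repo.

```python
def bits(value, lo, width):
    """Extract `width` bits of `value` starting at bit `lo`."""
    return (value >> lo) & ((1 << width) - 1)

def decode_tex0(tex0):
    """Decode the low fields of a TEX0 value (assumed layout, see lead-in)."""
    return {
        "TBP0": bits(tex0, 0, 14),   # texture base pointer
        "TBW": bits(tex0, 14, 6),    # buffer width in units of 64 texels
        "PSM": bits(tex0, 20, 6),    # pixel storage mode (0 = 32-bit)
        "TW": bits(tex0, 26, 4),     # log2 of texture width
        "TH": bits(tex0, 30, 4),     # log2 of texture height
        "TCC": bits(tex0, 34, 1),
        "TFX": bits(tex0, 35, 2),
    }

# Only the register descriptors that appear in this capture.
REG_NAMES = {0x1: "RGBAQ", 0x2: "ST", 0x5: "XYZ2"}

def decode_giftag(lo, hi):
    """Decode a GIFtag given as the two 64-bit words printed on one dump line."""
    nreg = bits(lo, 60, 4) or 16  # an NREG field of 0 means 16 registers
    regs = [bits(hi, 4 * i, 4) for i in range(nreg)]
    return {
        "NLOOP": bits(lo, 0, 15),
        "EOP": bits(lo, 15, 1),
        "FLG": bits(lo, 58, 2),   # 0 = PACKED
        "REGS": [REG_NAMES.get(r, hex(r)) for r in regs],
    }

tex0 = decode_tex0(0x0000000598007499)
print(tex0)  # TW and TH both decode to 6, i.e. a 64x64 texture

tag = decode_giftag(0x6000000000000001, 0x0000000000512512)
print(tag)   # one loop over ST, RGBAQ, XYZ2, ST, RGBAQ, XYZ2
```

If these offsets are right, the buffer contents match the annotations, which suggests the TEX0 value itself is not where the emulator and hardware disagree.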
The obvious but not very satisfying way to solve this is to simply add 1 to the TW and TH fields of the TEX0 register when running on real hardware. This is equivalent to doubling the width and height. This sucks but is easy to implement and produces the correct results on hardware (but incorrect results on PCSX2).
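For illustration, the workaround amounts to patching two 4-bit fields of the TEX0 value before it goes into the buffer. This is a sketch in Python, not the code in the repo: the field offsets (TW at bit 26, TH at bit 30) are my reading of the TEX0 layout, and bump_tex0_size and the on_hardware flag are hypothetical names.

```python
TW_SHIFT, TH_SHIFT, FIELD_MASK = 26, 30, 0xF

def bump_tex0_size(tex0):
    """Return `tex0` with TW and TH each increased by 1 (width and height doubled)."""
    tw = (tex0 >> TW_SHIFT) & FIELD_MASK
    th = (tex0 >> TH_SHIFT) & FIELD_MASK
    if tw + 1 > FIELD_MASK or th + 1 > FIELD_MASK:
        raise ValueError("TW/TH would overflow their 4-bit fields")
    # Clear both fields, then write the incremented values back.
    cleared = tex0 & ~((FIELD_MASK << TW_SHIFT) | (FIELD_MASK << TH_SHIFT))
    return cleared | ((tw + 1) << TW_SHIFT) | ((th + 1) << TH_SHIFT)

tex0 = 0x0000000598007499   # value from the capture, TW = TH = 6
on_hardware = True          # hypothetical flag; would be False under PCSX2

patched = bump_tex0_size(tex0) if on_hardware else tex0
print(f"{patched:016x}")
```

The flag is the unsatisfying part: the same buffer can no longer be used unchanged on both targets.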
I am not sure if there is a difference in my DMA buffer data between PS2 and PCSX2, as ps2client is not printing to stdout. I would appreciate it if someone ran this code on hardware and pasted a DMA buffer dump from a single frame in the comments for comparison!
Hey there, this is interesting. Is there any chance you can provide a precompiled version in your dist folder?