melonDS compute renderer testing notes
Cheep Cheep Beach in MKDS displayed garbage even at 5x after going into the
water. (Happened once, can't reproduce.)
I tried using a debug context, enabling the debug output from the compute
renderer (even though it shouldn't be needed because debug contexts enable it by
default) and giving it a function to print debug messages. It didn't seem to
print any more info than when I tested with a regular context.
I also tested with Mesa. It prints even less stuff than Nvidia's driver
(Nvidia's driver only printed messages about successful buffer allocations).
Newer versions of Mesa have the red and blue channels swapped. A fresh install
of Debian 12.2 works fine. It breaks after dist-upgrading to testing (Trixie),
which at the time of writing is using Mesa 24.0.8. I suppose it's a small
behavior change in newer Mesa releases. It still doesn't log any errors.
I checked all compute shaders with the glslang validator and got no complaints
from it.
Tried increasing maxYSpanIndices (there's a comment saying the current values
are a bad guess). It didn't change anything.
There are a couple of glMemoryBarrier calls using GL_SHADER_STORAGE_BUFFER,
which looks wrong to me. The documentation doesn't list GL_SHADER_STORAGE_BUFFER
as a glMemoryBarrier flag, but it does list GL_SHADER_STORAGE_BARRIER_BIT, which
later glMemoryBarrier calls use. Replacing GL_SHADER_STORAGE_BUFFER with
GL_SHADER_STORAGE_BARRIER_BIT did not get rid of the garbage. Adding barriers
after the first two glDispatchCompute calls did not fix it either, and neither
did adding a barrier after the glDispatchCompute call in the loop.
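
To illustrate why GL_SHADER_STORAGE_BUFFER looks wrong as a glMemoryBarrier
argument, here is a minimal sketch. It assumes the enum values from the Khronos
headers (0x90D2 and 0x00002000), redefined under local names so it builds
without a GL loader; worth double-checking against your own headers. The helper
names are hypothetical and not part of melonDS.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values from the Khronos OpenGL headers, copied under local names.
constexpr std::uint32_t kShaderStorageBuffer     = 0x90D2;     // GL_SHADER_STORAGE_BUFFER (a buffer binding target)
constexpr std::uint32_t kShaderStorageBarrierBit = 0x00002000; // GL_SHADER_STORAGE_BARRIER_BIT (a glMemoryBarrier flag)

// True if `barriers`, read as a glMemoryBarrier bitfield, has the shader
// storage barrier bit set.
constexpr bool requestsShaderStorageBarrier(std::uint32_t barriers)
{
    return (barriers & kShaderStorageBarrierBit) != 0;
}

// Hypothetical debug-build guard: asserts that the flags about to be passed to
// glMemoryBarrier include the shader storage bit, then returns them unchanged.
inline std::uint32_t checkedStorageBarrier(std::uint32_t barriers)
{
    assert(requestsShaderStorageBarrier(barriers));
    return barriers;
}
```

With these values the binding-target enum does not have the 0x2000 bit set, so
read as a bitfield it would not request a shader storage barrier at all.
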
I checked that the vast majority of the pipeline wasn't exceeding
GL_MAX_COMPUTE_WORK_GROUP_COUNT, and it wasn't. For the stages using
glDispatchCompute, OpenGL should log errors if the counts were too high, and it
doesn't, so they must be fine. glDispatchComputeIndirect will not log OpenGL
errors even if the compute work group counts are too high, and checking these
would be annoying since I'd have to read the buffer back from VRAM. However,
RenderDoc did capture the values used for these. At 6x none of the calls ever go
above 65535, the minimum limit the spec guarantees. At 7x there is one call that
does exceed it (captured as glDispatchComputeIndirect(1, 1, 79414)). This
exceeds the maximum supported by my 1080Ti with the 552 driver (it sticks to the
minimum of 65535 in the Z axis). I tried replacing the glDispatchComputeIndirect
call in the loop with glDispatchCompute(1, 1, 65535). This DID have an effect on
the garbage, but it did not fix it. It's likely this is the problem on Nvidia.
The local sizes all seem fine.
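
The limit check, and one way an oversized Z count could be split to stay under
the limit, can be sketched as below. This is only an illustration: the names
ZChunk, fitsWorkGroupLimits and splitZ are hypothetical, the limits would come
from glGetIntegeri_v(GL_MAX_COMPUTE_WORK_GROUP_COUNT, axis, ...), and using the
chunks for real would require the shader to accept a base group offset, which I
have not verified the existing shaders can do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One slice of a dispatch along Z: `count` groups starting at group `base`.
struct ZChunk
{
    std::uint32_t base;
    std::uint32_t count;
};

// True if every requested group count is within the per-axis limit.
inline bool fitsWorkGroupLimits(const std::uint32_t groups[3],
                                const std::uint32_t limits[3])
{
    for (int axis = 0; axis < 3; ++axis)
        if (groups[axis] > limits[axis])
            return false;
    return true;
}

// Splits `totalZ` work groups into consecutive chunks of at most `maxZ` groups.
inline std::vector<ZChunk> splitZ(std::uint32_t totalZ, std::uint32_t maxZ)
{
    assert(maxZ > 0);
    std::vector<ZChunk> chunks;
    std::uint32_t base = 0;
    while (base < totalZ)
    {
        const std::uint32_t count = std::min(maxZ, totalZ - base);
        chunks.push_back({base, count});
        base += count;
    }
    return chunks;
}
```

For the 7x case, splitZ(79414, 65535) gives two chunks: 65535 groups starting
at 0, then the remaining 13879 starting at 65535. The fixed
glDispatchCompute(1, 1, 65535) replacement only covers the first of those, which
might be part of why it changed the garbage without fixing it (a guess).
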
Tried grabbing a couple of captures with RenderDoc while running Phantom
Hourglass. The low res framebuffer seems to be affected in a similar way to the
high res one. The high res framebuffer also looks like it's got the red and blue
channels swapped before presentation (much like what I was seeing on Mesa). |
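
One way to confirm that a captured framebuffer differs only by a red/blue swap
is to swap the channels back on the CPU and compare against a known-good image.
A minimal sketch, assuming the capture is exported as packed 32-bit pixels with
8 bits per channel and red and blue in the lowest and third-lowest bytes; the
function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Swaps bits 0-7 with bits 16-23 of a packed 32-bit pixel, leaving the other
// two bytes (green and alpha under the assumed layout) untouched.
constexpr std::uint32_t swapRedBlue(std::uint32_t pixel)
{
    return (pixel & 0xFF00FF00u)
         | ((pixel & 0x00FF0000u) >> 16)
         | ((pixel & 0x000000FFu) << 16);
}

// Applies the swap to `count` pixels in place.
inline void swapRedBlueInPlace(std::uint32_t* pixels, std::size_t count)
{
    assert(pixels != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
        pixels[i] = swapRedBlue(pixels[i]);
}
```

If the swapped capture matches the reference, the rendering itself is probably
fine and only the channel order at presentation is off.
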