Created
March 15, 2018 08:07
Notes on how I might go about fixing the bottleneck that is WashingtonDC's texture memory emulation

Need to unify texture cache and framebuffer to support render-to-texture
without needlessly copying the framebuffer back to texture memory after every
STARTRENDER command.

Also need to support framebuffer access from the SH4 (meaning situations where
the CPU writes directly to the framebuffer without using the graphics hw at
all, or situations where the CPU reads from something that's already been
rendered).

Previously I've attempted to solve this problem the "easy" way by only copying
data back from OpenGL when the CPU accesses memory that overlaps with the
framebuffer (and the framebuffer has been updated since the last sync) or when
the framebuffer moves (fb_w_sof1/fb_w_sof2). This turned out to be entirely
fruitless because the framebuffer moves every frame due to double-buffering;
ergo the framebuffer still got copied back to PVR2 texture memory every frame,
and nothing had improved over the original implementation.

Because of this, the only viable solution is to expand the role of the texture
cache so that the framebuffer itself is a texture. Currently the framebuffer is
already an OpenGL texture, but the idea I'm thinking of here is to actually
track the framebuffer from within the pvr2 tex_cache. The framebuffer would
not receive any special treatment other than being a rendering target, and the
texture which serves as the framebuffer would change whenever the framebuffer
moves or is reconfigured.
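
A rough sketch of what "the framebuffer is just another cache entry" could look
like. Everything here is hypothetical (struct tex_ent, struct tex_cache,
tex_cache_set_framebuffer and the toy cache size are not taken from the
WashingtonDC source); it only shows the render target being re-pointed at a
looked-up or newly created entry whenever the framebuffer address or dimensions
change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TEX_CACHE_LEN 8 /* toy size, assumption for this sketch */

/* Hypothetical cache entry: identified by address and dimensions. */
struct tex_ent {
    bool valid;
    uint32_t addr;          /* offset into PVR2 texture memory */
    unsigned width, height; /* not required to be powers of two */
};

struct tex_cache {
    struct tex_ent ents[TEX_CACHE_LEN];
    int render_target;      /* index of the entry acting as framebuffer, or -1 */
};

/* Return the index of a matching entry, creating one if there is room. */
static int tex_cache_find_or_add(struct tex_cache *c, uint32_t addr,
                                 unsigned w, unsigned h)
{
    int free_slot = -1;
    for (int i = 0; i < TEX_CACHE_LEN; i++) {
        struct tex_ent *e = &c->ents[i];
        if (e->valid && e->addr == addr && e->width == w && e->height == h)
            return i;
        if (!e->valid && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1; /* cache full; a real cache would evict something */
    c->ents[free_slot] = (struct tex_ent){ true, addr, w, h };
    return free_slot;
}

/* Call when the framebuffer moves or is reconfigured. */
static int tex_cache_set_framebuffer(struct tex_cache *c, uint32_t addr,
                                     unsigned w, unsigned h)
{
    assert(w > 0 && h > 0);
    int idx = tex_cache_find_or_add(c, addr, w, h);
    if (idx >= 0)
        c->render_target = idx;
    return idx;
}
```

With double-buffering, the two framebuffer addresses would simply end up as two
entries that take turns being the render target.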

This has the following ramifications:
* Not all textures will be power-of-two (because the framebuffer dimensions
  are typically not powers of two).
* Not all textures will be resident in texture memory (because we only want to
  copy them back when they're needed).
* There will be situations where the framebuffer overlaps with another texture
  while that texture resides in OpenGL (render-to-texture). In this case the
  second texture needs to be updated whenever the first texture is rendered to
  even though the second texture is not being used as the render target.
* When a texture is written to, it may first need to be synced from OpenGL to
  texture memory.
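
The overlap case in the third bullet boils down to an address-range test run
against every other cache entry after a render. A minimal sketch, assuming each
entry records its start offset and byte length in texture memory; struct
tex_entry, its stale flag and mark_overlapping_stale are made-up names for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry covering [addr, addr + len) in texture memory. */
struct tex_entry {
    uint32_t addr;
    uint32_t len;
    bool stale; /* set when an overlapping render target has been drawn to */
};

/* True if the two half-open byte ranges share at least one byte. */
static bool ranges_overlap(uint32_t a_first, uint32_t a_len,
                           uint32_t b_first, uint32_t b_len)
{
    /* 64-bit ends so that first + len cannot wrap around */
    uint64_t a_end = (uint64_t)a_first + a_len;
    uint64_t b_end = (uint64_t)b_first + b_len;
    return a_len && b_len && a_first < b_end && b_first < a_end;
}

/*
 * After rendering to ents[target], flag every other entry that overlaps it.
 * Returns the number of entries that were flagged.
 */
static size_t mark_overlapping_stale(struct tex_entry *ents, size_t n,
                                     size_t target)
{
    size_t count = 0;
    assert(target < n);
    for (size_t i = 0; i < n; i++) {
        if (i == target)
            continue;
        if (ranges_overlap(ents[target].addr, ents[target].len,
                           ents[i].addr, ents[i].len)) {
            ents[i].stale = true;
            count++;
        }
    }
    return count;
}
```

A linear scan is probably fine for a small cache; anything smarter can wait
until it shows up in a profile.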

To facilitate this, a simple residency system needs to be used. The residency
will have three states:
* Texture resides in PVR2 tex mem: this means that it needs to be uploaded to
  gfx the next time it gets mapped.
* Texture resides in gfx (i.e. OpenGL): this means that if the CPU tries to
  read/write to a memory region that overlaps with the texture, the texture
  first needs to be synced from gfx to pvr2 tex mem.
* Texture resides in both gfx and PVR2 tex mem: this means that both PVR2 tex
  mem's version of the texture and gfx's version of the texture are identical
  and therefore equally valid. Either system can read from the texture
  immediately without syncing, but as soon as either system tries to write,
  the residency state changes to reflect that the texture only resides in one
  of the two systems. This is also the state the texture gets placed in
  immediately after a sync.
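
The three states and their transitions can be written down as a tiny state
machine. This is only a sketch of the rules listed above; the enum and function
names (tex_residency, tex_side, tex_needs_sync, tex_after_access) are
assumptions, and the actual copying between gfx and tex mem is left out.

```c
#include <assert.h>
#include <stdbool.h>

enum tex_residency {
    TEX_RES_TEXMEM, /* only the PVR2 tex mem copy is valid */
    TEX_RES_GFX,    /* only the gfx (OpenGL) copy is valid */
    TEX_RES_BOTH    /* both copies are identical */
};

/* Which system is touching the texture. */
enum tex_side { TEX_SIDE_CPU, TEX_SIDE_GFX };

/* True if a copy is required before `side` may read or write the texture. */
static bool tex_needs_sync(enum tex_residency res, enum tex_side side)
{
    assert(res == TEX_RES_TEXMEM || res == TEX_RES_GFX || res == TEX_RES_BOTH);
    if (res == TEX_RES_BOTH)
        return false;
    return side == TEX_SIDE_CPU ? res == TEX_RES_GFX : res == TEX_RES_TEXMEM;
}

/* Residency state after `side` has finished a read or a write. */
static enum tex_residency tex_after_access(enum tex_residency res,
                                           enum tex_side side, bool is_write)
{
    /* a sync, if one was needed, leaves both copies valid */
    if (tex_needs_sync(res, side))
        res = TEX_RES_BOTH;
    if (!is_write)
        return res;
    /* a write invalidates the other system's copy */
    return side == TEX_SIDE_CPU ? TEX_RES_TEXMEM : TEX_RES_GFX;
}
```

For example, a CPU read of a texture that only lives in gfx forces a sync and
leaves it in the "both" state; a later render to it moves it back to gfx-only.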

As mentioned above, the most difficult special case to solve will be the case
wherein a texture resides in gfx/OpenGL, and needs to be updated because it
overlaps with the framebuffer (but is not itself the framebuffer). Ideally the
texture would be copied over entirely within OpenGL without any CPU involvement.
There is probably a way to do this but for the near future I intend to go with
the more naive approach, which is to sync from gfx to tex memory, and then sync
from tex memory to gfx.
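
The naive round trip can be modelled without any OpenGL at all. In the sketch
below a plain byte array stands in for each texture's gfx-side storage; in a
real implementation the two copies would presumably be a read-back such as
glGetTexImage and an upload such as glTexSubImage2D, but that mapping is an
assumption. struct fake_tex and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TEX_MEM_LEN 64 /* toy texture memory size for this sketch */

/* Hypothetical texture: a range in tex mem plus a stand-in for its gfx copy. */
struct fake_tex {
    uint32_t addr;                 /* offset into tex mem */
    uint32_t len;                  /* length in bytes */
    uint8_t gfx_copy[TEX_MEM_LEN]; /* stands in for OpenGL storage */
};

/* Copy the gfx-side data back into tex mem. */
static void sync_gfx_to_texmem(const struct fake_tex *t, uint8_t *tex_mem)
{
    assert(t->addr <= TEX_MEM_LEN && t->len <= TEX_MEM_LEN - t->addr);
    memcpy(tex_mem + t->addr, t->gfx_copy, t->len);
}

/* Re-upload the texture's range of tex mem to its gfx-side copy. */
static void sync_texmem_to_gfx(struct fake_tex *t, const uint8_t *tex_mem)
{
    assert(t->addr <= TEX_MEM_LEN && t->len <= TEX_MEM_LEN - t->addr);
    memcpy(t->gfx_copy, tex_mem + t->addr, t->len);
}

/*
 * Naive update of a texture that overlaps the framebuffer: write the
 * framebuffer back to tex mem, then re-upload the overlapping texture.
 */
static void naive_update_overlapping(const struct fake_tex *fb,
                                     struct fake_tex *tex, uint8_t *tex_mem)
{
    sync_gfx_to_texmem(fb, tex_mem);
    sync_texmem_to_gfx(tex, tex_mem);
}
```

This costs two full CPU-side copies per update, which is exactly the overhead a
GPU-only copy would avoid.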

One mystery which needs to be solved before I implement any of this is the
mystery of how Crazy Taxi is able to map the framebuffer to a texture without
any apparent change in dimensions. PVR2 cannot map a texture with
non-power-of-two dimensions, and yet somehow this game can map the framebuffer
to a texture even though the framebuffer has non-power-of-two dimensions.