@uyjulian
Created January 14, 2022 05:18

DECKARD/PPC-IOP discussion

157416/1 User: uyjulian (09-06-2017, 05:01 PM)
Let's have a DECKARD and PPC-IOP discussion!

The PPC-IOP is present on the SCPH-75000 and newer models. It has 4MB of memory available to it (for reference, older models have 2MB of IOP memory and TOOLs have 8MB of IOP memory).

DECKARD is the software that emulates the IOP and its functions. It runs on the PPC-IOP.

157416/2 User: TnA (09-06-2017, 05:03 PM)
Arrrr,... I rather thought @Maximus32 or @wisi would create a thread rich in information! :D

You only wanted that Sticky! :p

157416/3 User: uyjulian (09-06-2017, 05:05 PM)

Originally Posted by TnA Arrrr,... I rather thought @Maximus32 or @wisi would create a thread rich in information! :D

You only wanted that Sticky! :p

I'll fill in more information later in the OP.

157416/4 User: TnA (09-06-2017, 05:10 PM)

Originally Posted by uyjulian I'll fill in more information later in the OP.

How dare you start a thread without it?! :lol:

Yeah, it's all o.k.!

157416/5 User: Maximus32 (09-06-2017, 05:18 PM)
One of the first things we need to get started on DECKARD is a working toolchain:
https://github.com/rickgaiser/ps2toolchain/tree/deckard

And I have an example here:
https://github.com/rickgaiser/deckard_test

The example has the IOP code + IOP-PPC code, but is still missing the EE code because I loaded everything from ps2linux.

157416/6 User: wisi (09-06-2017, 07:39 PM)
BTW, it was SP193 who perhaps first noticed DECKARD. I initially didn't believe that the IOP was emulated, and even after multiple tests I couldn't find proof of that. It was when by chance I found the PPC-IOP serial port and connected to it that I saw data output from it, which proved the code was used. At first I thought that the PPC RAM would be protected from access from the EE side, so I didn't even consider reading IOP RAM from the EE side to see if there was anything unusual about it. A bit more on that here: https://assemblergames.com/threads/playstation-2-are-there-any-hardware-mysteries-left.62325/page-2#post-900268

A complete "example" based on the code by Maximus32 is attached here: http://psx-scene.com/forums/f98/hdpro-clone-132869/index5.html#post1211305
It loads the PPC and IOP loader-module code into the EE ELF. It uses a part of the big load/store LUT that is unused and very big, so one can add a lot more code and data. The base address where to be loaded can be easily changed by changing only the main Makefile (although that shouldn't be necessary). In fact, when I checked what the load/store LUT actually contains, it turned-out that the unused area is more than 220kB per LUT (one for loads and one for stores), so there is more than 440kB totally free and usable space (with a gap in the middle)!

The method Maximus32 came up with for indirectly flushing the PPC cache, so that it starts executing the loaded code, was accessing IOP RAM. However, there turned out to be a simpler way: the entry-point address of the loaded (patch) code has to be written to an unused handler pointer of the big load/store LUT. Because it is unused (and far from used ones), the data cache would never have contained it, so it will certainly be read from RAM the first time the emulated IOP reads/writes from a location handled by the handler at that address. As for the instruction cache, it would never contain data from the LUT, as the LUT only contains data and not instructions, so there are no problems there either. The cache-block size is 32 bytes.
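The handler-pointer trick can be sketched roughly as follows. This is only an illustration on an ordinary byte buffer standing in for the LUT: the real slot index and entry address are not given in this thread, the 4-byte slot width is an assumption, and `cache_block_base`, `store_be32` and `lut_install_handler` are hypothetical names. The only figures taken from the post are the 32-byte cache-block size and the PPC's big-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK_SIZE 32u  /* cache-block size stated in the post */

/* Start address of the 32-byte cache block that contains addr. */
static uint32_t cache_block_base(uint32_t addr)
{
    return addr & ~(CACHE_BLOCK_SIZE - 1u);
}

/* Store a 32-bit value in big-endian byte order (the PPC's native order). */
static void store_be32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value >> 24);
    dst[1] = (uint8_t)(value >> 16);
    dst[2] = (uint8_t)(value >> 8);
    dst[3] = (uint8_t)value;
}

/* Write the patch entry point into one unused handler slot.
 * lut is a buffer standing in for the LUT; slot is a hypothetical index
 * and the 4-byte slot width is an assumption. */
static void lut_install_handler(uint8_t *lut, size_t lut_size,
                                size_t slot, uint32_t entry_point)
{
    assert((slot + 1) * 4 <= lut_size);  /* slot must lie inside the table */
    store_be32(lut + slot * 4, entry_point);
}
```

`cache_block_base` is there to check that a chosen slot does not share a 32-byte block with a handler pointer that is in use, since only then is the "never cached" argument safe.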

The DECKARD file in the BOOT ROM is actually a directory, which contains three files: DECKARD (the actual emulator) is a PPC ELF, DPATCH is a small patch that DECKARD seems to load from an absolute address in ROM, and LOADER is an ELF loader that loads the ELF into RAM. This directory and all the data are big endian. There seem to be some non-standard opcodes in the code. The emulator also seems to use self-modifying code.
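When pulling these files out of a ROM dump, a quick sanity check is to confirm that a blob really starts with a big-endian PowerPC ELF header. The sketch below uses only the standard ELF identification fields (EI_CLASS, EI_DATA, e_machine); `is_ppc_be_elf` and `load_be16` are hypothetical helper names, and the assumption that the ELF is 32-bit is mine, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian value. */
static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 1 if buf starts with a 32-bit big-endian PowerPC ELF header. */
static int is_ppc_be_elf(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 20)
        return 0;                        /* too short to hold e_machine */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                        /* missing ELF magic */
    if (buf[4] != 1)                     /* EI_CLASS: ELFCLASS32 (assumed) */
        return 0;
    if (buf[5] != 2)                     /* EI_DATA: ELFDATA2MSB (big endian) */
        return 0;
    return load_be16(buf + 18) == 20;    /* e_machine: EM_PPC */
}
```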

The IOP's maximum RAM size is 16MB, which was available (only) on some PS1 / early PS2 development hardware (mentioned in the DTL-T10000 code). Early PS2 development hardware also seems to have had 256MB of EE RAM (the maximum).


RAM: Virtual-IOP & DECKARD (as seen from EE side):
0x1C000000        0x1C200000        0x1C400000        0x1C600000        0x1C800000        0x1CA00000        0x1CC00000        0x1CE00000
IOP-RAM           IOP-RAM           IOP-RAM           IOP-RAM           IOP-RAM           DECKARD-RAM       IOP-RAM           DECKARD-RAM

The IOP (or the PS1 CPU) is known to have been used at one point with 16MB of RAM (DIP switch settings from TOOL), in which case, all the above would be taken-up by IOP RAM.
It is visible that DECKARD-RAM is wired in such a way that the IOP can still have 8MB of RAM connected to it, for use in TOOL. This has been the case at least in the earlier revisions, where the IOP is still in a separate ASIC. The latest PS2 models have the IOP in one ASIC with most of the PS2 hardware (a PoC, "PS2 on Chip", in a way), so it cannot have RAM added (at least as far as can be told, since the pinout is not known).
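The table above can be turned into a small lookup for EE-side tools that scan this window. This is a sketch under the assumption that each column is exactly one 2MB bank starting at the listed address; `classify_ee_address` and the enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EE_IOP_WINDOW_BASE 0x1C000000u
#define BANK_SIZE          0x00200000u   /* 2MB per column of the table */
#define BANK_COUNT         8u

typedef enum { REGION_OUTSIDE, REGION_IOP_RAM, REGION_DECKARD_RAM } region_t;

/* Classify an EE-side address according to the table in the post. */
static region_t classify_ee_address(uint32_t ee_addr)
{
    if (ee_addr < EE_IOP_WINDOW_BASE)
        return REGION_OUTSIDE;
    uint32_t bank = (ee_addr - EE_IOP_WINDOW_BASE) / BANK_SIZE;
    if (bank >= BANK_COUNT)
        return REGION_OUTSIDE;
    /* Banks 5 (0x1CA00000) and 7 (0x1CE00000) are listed as DECKARD-RAM. */
    return (bank == 5u || bank == 7u) ? REGION_DECKARD_RAM : REGION_IOP_RAM;
}
```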

Another interesting fact is that the PPC TLB (configuration attached) maps the first 8MB of memory to mirrors of the 2MB IOP RAM. It also seems to have two variants of that, one of which might have been used for debugging DECKARD. There is yet another mirror with virtual address 0x40000000 (again 8MB of 2MB mirrored), with unknown purpose, and a few 2MB separate mirrors.
The IOP RAM configuration register seems to still be present so it is possible that it is functional.
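If the mirrors behave like plain address wrap-around (an assumption; the TLB configuration itself is only in the attachment), translating a PPC virtual address in either 8MB window back to an IOP RAM offset is a single mask. `mirror_to_iop_offset` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IOP_RAM_SIZE   0x00200000u  /* 2MB of IOP RAM being mirrored */
#define MIRROR_WINDOW  0x00800000u  /* 8MB window of mirrors */
#define SECOND_WINDOW  0x40000000u  /* second mirror window from the post */

/* Returns 1 and stores the IOP RAM offset if vaddr is in either window. */
static int mirror_to_iop_offset(uint32_t vaddr, uint32_t *offset)
{
    assert(offset != NULL);
    uint32_t rel;
    if (vaddr < MIRROR_WINDOW)
        rel = vaddr;
    else if (vaddr >= SECOND_WINDOW && vaddr - SECOND_WINDOW < MIRROR_WINDOW)
        rel = vaddr - SECOND_WINDOW;
    else
        return 0;
    *offset = rel & (IOP_RAM_SIZE - 1u);  /* wrap into the 2MB of real RAM */
    return 1;
}
```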

157416/7 User: wisi (09-10-2017, 08:27 AM)
I am trying to make an IRX-like PPC ELF, however there are problems with the linker options:
The IRX imports/exports work, but require having program headers in the ELF. To get relocation information, the '-r' option is needed. Adding the -r option, however, makes the program headers (all of them) (specified in the linker script) disappear! :wow:
... and I have no idea why this is happening or how to solve it. Any ideas?

I also noticed other strange things:

  • to disable page-alignment of sections and headers, there are the options:
    -n, --nmagic Do not page align data
    -N, --omagic Do not page align data, do not make text readonly
    using -n or -N does work, but using --nmagic or --omagic has no effect.
  • setting any alignment for sections (using ALIGN() or BLOCK) has no effect on the alignment value in the section header. However adding an aligned structure to that section does change the value - for example:
    struct {u32 dummyAlign;} dmyStruct __attribute__((section(".dummyAlignSct16"))) __attribute__((aligned(16))) = { 0x00 };

Maybe it is my limited knowledge of gcc that is the core cause of those issues... :p

If this gets fixed, then we may have a working SYSMEM and LOADCORE on PPC-side that can load and link IRX-like PPC modules. Currently the linking works, but because of the above the relocation doesn't so they can only be loaded at this point as static ELF at a pre-set address.

EDIT: Solved (at least for now): The reason the program headers got excluded when -r was added is that -r makes the file relocatable, but also not an "output" file, but rather a file suitable for further linking, and no program headers should be added to such a file. The correct option is -q, which includes relocation information in the *output* file. gcc doesn't recognize the -q option (unlike -r), so it needs to be passed with an "-Xlinker" option before it, i.e. "-Xlinker -q" to gcc (through to the linker ld).

157416/8 User: wisi (09-20-2017, 04:17 AM)
Attached is an experimental code for loading IRX (actually PRX PPC Relocatable eXecutable) modules on the PPC alongside with DECKARD.

For the PPC-side modified versions of SYSMEM and LOADCORE from pcsx2 Fps2Bios are used. They handle loading PRX modules and linking imports to exports.
The code also includes a very small "library" of functions and hooks to DECKARD functions to make coding easier.
A MIPS-IOP <-> PPC-IOP communication virtual interface is included. It runs over a pair of unused registers and enables each side calling functions on the other, along with passing arguments and returning a result (or a structure of data). Because (MIPS-)IOP RAM is accessible from the PPC-IOP at the same offset (0x000000) "both" can access structures in it and pass data this way (the code on the PPC-IOP has to make sure to mask the segment).
Code on the PPC can access peripheral devices registers the same way code on the virtual IOP would (with the emulation provided by DECKARD) through the use of provided load/store functions.
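Masking the segment, as mentioned above for shared structures, could look like the sketch below. It assumes the standard MIPS segment layout (KUSEG, KSEG0 at 0x80000000, KSEG1 at 0xA0000000) and 2MB of IOP RAM; `mips_ptr_to_ram_offset` is a hypothetical name and not part of the attached code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIPS_SEGMENT_MASK 0x1FFFFFFFu  /* strips the KSEG0/KSEG1 bits */
#define IOP_RAM_SIZE      0x00200000u  /* assumed 2MB of IOP RAM */

/* Convert a pointer handed over by MIPS-IOP code into an IOP RAM offset.
 * Returns 0 if the pointer does not refer to IOP RAM at all. */
static int mips_ptr_to_ram_offset(uint32_t mips_ptr, uint32_t *offset)
{
    assert(offset != NULL);
    uint32_t phys = mips_ptr & MIPS_SEGMENT_MASK;
    if (phys >= IOP_RAM_SIZE)
        return 0;                      /* e.g. a hardware register address */
    *offset = phys;
    return 1;
}
```

Rejecting anything outside RAM keeps a stray register address such as 0xBF801070 from being dereferenced as if it were a structure in memory.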

Makefiles and linker scripts are included, which can build ELF PSX (static - at fixed address, but with imports and exports) and PRX binaries for the PPC.
Example modules are included - the most complete of which is a modified version of the (old now) USBD module, that has several of its functions ported to PPC-side. This however led to no change in transfer speed as the chosen functions are neither called that often, nor so processing-intense to make a difference.
It is structured, so that its PPC-counterpart PRX is loaded as data in the IRX. From there, it is loaded to PPC-RAM. There is a bug that prevents modules with symbols stripped from being loaded correctly, so they currently consume a lot of MIPS-IOP-RAM.
From the PPC-IOP RAM there are 456kB free for loading modules and data.

The code is neither complete nor bug-free.
Although I am doubting more and more whether this would actually be of much use for making IOP modules run faster, if not that, then at least the property of this code to load multiple modules on the PPC may be useful.

157416/9 User: Maximus32 (09-20-2017, 02:03 PM)

Originally Posted by wisi Attached is an experimental code for loading IRX (actually PRX PPC Relocatable eXecutable) modules on the PPC alongside with DECKARD.

This is very useful, thank you. I like what you did with the "PRX" and module loading in a similar way to the IRX files.

Originally Posted by wisi It is structured, so that its PPC-counterpart PRX is loaded as data in the IRX. From there, it is loaded to PPC-RAM. There is a bug that prevents modules with symbols stripped from being loaded correctly, so they currently consume a lot of MIPS-IOP-RAM.
From the PPC-IOP RAM there are 456kB free for loading modules and data.

It would probably be better to load the PRX as a file, instead of embedding it into the IRX. That would save a lot of MIPS-IOP RAM.

Originally Posted by wisi The code is neither complete nor bug-free.

But it works :-). Using ps2client it runs the first time. The second time it hangs, probably because the PPC code is still loaded?

Originally Posted by wisi Although I am doubting more and more whether this would actually be of much use for making IOP modules run faster, if not that, then at least the property of this code to load multiple modules on the PPC may be useful.

There is no software involved in USB transmission. USBD just tells the OHC (hardware) what it needs to do.
I think the PPC will be very useful as an "accelerator" for processing-intensive tasks/loops, like memcpy or checksum calculation (smb driver?).

157416/10 User: wisi (09-20-2017, 04:23 PM)

Originally Posted by Maximus32 This is very useful, thank you. I like what you did with the "PRX" and module loading in a similar way to the IRX files.

Actually PRX modules seem to already be used on the PS3, but those are probably different format, so I wasn't sure if using this abbreviation is a good idea.

Originally Posted by Maximus32 It would probably be better to load the PRX as a file, instead of embedding it into the IRX. That would save a lot of MIPS-IOP RAM.

My initial idea was to have an IRX that would contain both complete MIPS-IOP code and an option to call part of its code on the PPC, so the same module could be used on both fat and DECKARD models, and the IRX would detect if the PS2 is a DECKARD model and load and use its PPC code. This would make IRX files very big though. Another option is to place the PPC code in a separate section that won't be loaded in MIPS-IOP RAM and patch the MIPS-IOP LOADCORE to send that section to the PPC-side for loading when it is detected. But I really don't want to mess with the linker scripts more, because it was quite difficult getting them to work correctly. :p
If the IRX is already bundled in an ELF, then the ELF would also need to include the PRX counterpart of the IRX (and load it somehow). Also note that communicating with the PPC-IOP from the EE is not straight-forward, because there isn't actually a clean way of doing that (that doesn't involve resources already available to the MIPS-IOP). So to load a PRX on the PPC, still an IRX would be necessary (it can unload itself afterwards).
A simpler way would be to place the PRX (somehow) at the end of the IRX (again maybe using a separate section) and after loading the PRX, free the memory of the IRX and allocate again over it, leaving the memory where the PRX is free.

Originally Posted by Maximus32 But it works :-). Using ps2client it runs the first time. The second time it hangs, probably because the PPC code is still loaded?

Thanks for testing. :)
I've tried to figure-out why it fails on the second loading, but couldn't find the cause. ...not like there aren't at least a dozen reasons for it to fail :D
The loading mechanism is the same as that of ps2madd/p2mesac (even the code of it is still inside :p ), which does work correctly on multiple reloads. I don't think it is due to moving the store LUT and shrinking both LUTs, as the part of the load-LUT used for loading is at the same place. On the second loading it hangs after the IOP-trigger-IRX accesses the register which causes the (newly-loaded) PPC code to be called, but before it prints anything, which is rather odd. This might actually be fixable, now after thinking about it. It is another matter if it is a good idea, as overwriting the patch will also reset all its local state variables to default values, but the original DECKARD code will still retain its patches from the previous run, so this can cause problems later. Maybe the main loader should include a check to determine if the main patch is already loaded and either refuse to load it, or use an "unloader-IRX" to trigger it to unpatch DECKARD if possible.
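The "already loaded" check suggested above could be as simple as a marker word at a fixed place in the patch area. Everything in this sketch is hypothetical (the `PATCH_MAGIC` value, the header layout and the function name); it only illustrates the refuse-to-reload variant, not the unloader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATCH_MAGIC 0x44504348u  /* hypothetical marker value ("DPCH") */

typedef struct {
    uint32_t magic;    /* PATCH_MAGIC once the main patch is installed */
    uint32_t version;  /* hypothetical version field */
} patch_header_t;

typedef enum { LOAD_FRESH, LOAD_REFUSED } load_decision_t;

/* Decide whether the loader may install the main patch.
 * resident points at the (hypothetical) header in PPC RAM. */
static load_decision_t decide_patch_load(patch_header_t *resident,
                                         uint32_t version)
{
    assert(resident != NULL);
    if (resident->magic == PATCH_MAGIC)
        return LOAD_REFUSED;          /* already patched: keep existing state */
    resident->magic = PATCH_MAGIC;
    resident->version = version;
    return LOAD_FRESH;
}
```

Refusing the second load leaves the patch's state variables and DECKARD's existing hooks consistent with each other, which is the concern raised in the post.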

157416/11 User: wisi (01-25-2018, 05:45 AM)
A small update on DECKARD and the PPC-IOP:

Although the MIPS-IOP Interrupt Controller's registers (0xBF801070) are functional and are connected to (most of) the interrupt sources, they are treated in a special way: the interrupt status register is read and the bits that are found to be set are immediately cleared. Then the set bits are handled and written to a software shadow register which is OR-ed with the hardware interrupt status register when it is being read by the MIPS-IOP. This is normal and is a way to guarantee that no interrupts are missed.
This means that detecting interrupts (by homebrew patches) must be done before that function that reads and clears the interrupt status register, otherwise they will (almost) never be detected as active.
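A small software model of this read-and-clear scheme shows why a late hook sees nothing. The struct and function names are hypothetical and the model ignores interrupt masking; it is not DECKARD's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t hw_status;      /* stands in for the hardware status register */
    uint32_t shadow_status;  /* software copy kept by the emulator */
} intc_model_t;

/* Emulator side: latch pending bits from hardware and clear them there. */
static uint32_t intc_poll(intc_model_t *intc)
{
    assert(intc != NULL);
    uint32_t pending = intc->hw_status;
    intc->hw_status &= ~pending;       /* bits are cleared immediately */
    intc->shadow_status |= pending;    /* and remembered in software */
    return pending;
}

/* What the emulated MIPS-IOP sees when it reads the status register. */
static uint32_t intc_read_status(const intc_model_t *intc)
{
    assert(intc != NULL);
    return intc->shadow_status | intc->hw_status;
}
```

In this model a homebrew hook that inspects `hw_status` after `intc_poll` has run always finds zero, while the emulated IOP still sees the bit through the shadow copy.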

The interrupt controller may not be connected to the PPC's external interrupt (or any interrupt for that matter). This is not thoroughly tested, but it is certain that at least the code in DECKARD only checks for interrupts from the INTC at certain places in the code and is never interrupted by them.
Some of the PPC exceptions (like alignment and bus errors) are passed to the emulated IOP directly. They are also one of the few unmasked exceptions (uncertain).

The PPC emulates a few registers in the IOP Core registers range - 0xFFFE0000 for use when emulating the BOOT ROM regions (romVer) and SPEED revision and EEPROM and parameters setting (those are registers 0xFFFE0000 +0x180 - +0x1A7). This can only be done from the IOP, and is done by the EECONF IRX (and other code).

The IOP DMAC might be emulated (by PIO accesses) or might be partially emulated. Some channels of it in PS1 mode are certainly emulated.

Up until recently, I believed that the unrecognized PPC instructions found in DECKARD are either used to trigger exceptions which are later emulated by the exception handler or are edited by self-modifying code according to the emulated MIPS instruction. DECKARD does indeed use self-modifying code in places, which I incorrectly considered to be a sign that it uses a recompiler.
At this point I believe it uses an interpreter, given that the PPC is considerably faster than the MIPS IOP, and that emulation is done close to the hardware level. Now it seems that the unrecognized instructions may actually be executed by a MIPS-emulation coprocessor connected to the PPC core. The PPC does indeed support an auxiliary processor unit (APU), so this is also a possibility. This would also mean that DECKARD cannot be ported to a different PPC, because a part of the instruction-emulation logic (this APU) would be missing. This also explains why DECKARD is so small and why there aren't big look-up tables for handling the different instructions in it.

Another downside of the MIPS-core emulation is what seems to be the lack of a way to obtain the current CPU core state. There is a big structure that seems to contain all MIPS CPU registers (r0-r31, the Cop0 registers and a few more including the program counter). It is used when PPC exceptions need to be passed to the MIPS-IOP. However, the general purpose registers there always contain zeroes, and the PC and other Cop0 registers are only updated when the code encounters an exception. The PC only contains exception vectors and no real counter, so one can't interrupt the PPC and read the MIPS PC to see where the emulated IOP program is executing. This all seems to be "locked" in the MIPS-emulation coprocessor. There must be a way to read all the registers, but that would involve using coprocessor instructions and their functions are completely unknown.
Perhaps this works (seen from the emulated IOP) because this MIPS-emulation APU makes all the MIPS registers available to the emulated CPU, but not to DECKARD (at least not constantly).

It is possible that initially a full software emulator of the MIPS IOP was written for the PPC, but later it was found to be too slow on the PPC440/405, so the MIPS-emulation APU was added. This is just speculation, of course. This is yet another reason why the PS3 emulator of the PS2 IOP is probably considerably different from DECKARD: it most likely lacks the MIPS-emulation APU (note that I haven't seen the emulator on the PS3 that much, so this is mostly an assumption).

Emulation of some devices (memory accesses) does not go through the common load/store handlers, but rather is done by the code that calls them. This includes emulation of the BOOT ROM accesses (and RomVer patching) and emulation of the MIPS-core registers at 0xFFFE0000, as well as RAM emulation.
The MIPS-Core registers at 0xFFFE0000 seem to also be made accessible at 0x1FFE0000 in the high part of the BOOT ROM and when the IOP is rebooted in PS1 mode their addresses at 0x1FFE0000 are used. This makes emulating the BOOT ROM challenging, but not impossible.
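For anyone writing their own emulation of this area, the address check might look like the sketch below. It takes the +0x180 to +0x1A7 range and the 0x1FFE0000 alias from the posts above; the function name is hypothetical and the assumption that the alias uses the same offsets is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CORE_REG_BASE    0xFFFE0000u
#define CORE_REG_MIRROR  0x1FFE0000u  /* alias used in PS1 mode */
#define EMU_REG_FIRST    0x180u
#define EMU_REG_LAST     0x1A7u

/* Returns 1 and stores the register offset if addr hits the emulated range
 * through either the normal base or the 0x1FFE0000 alias. */
static int emulated_core_reg_offset(uint32_t addr, uint32_t *offset)
{
    assert(offset != NULL);
    uint32_t rel;
    if (addr >= CORE_REG_BASE)
        rel = addr - CORE_REG_BASE;
    else if (addr >= CORE_REG_MIRROR && addr - CORE_REG_MIRROR <= EMU_REG_LAST)
        rel = addr - CORE_REG_MIRROR;
    else
        return 0;
    if (rel < EMU_REG_FIRST || rel > EMU_REG_LAST)
        return 0;
    *offset = rel;
    return 1;
}
```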

EDIT:

So it turned-out the PPC-IOP is not at all what it was considered to be - a PPC core that emulates the MIPS-IOP and parts of its peripherals.
The PPC-IOP actually *contains* a MIPS-IOP Core APU! :wow:
This makes a lot of sense because it answers the question of why (perhaps) a simple MIPS core running at 36MHz would be difficult to produce in 2005 - because it wasn't! They simply integrated the MIPS-IOP in the PPC-IOP. But why?! Most likely the reason was in order to fix compatibility problems that had arisen in the previous IOP revisions in the SCPH-70000 models. I don't know much about that and it would be better if somebody more knowledgeable of the PS2 compatibility adds some information on that.

There are additional instructions for that APU, used to read and write MIPS-IOP core registers and configuration registers of the APU. Also there are instructions for reading and writing all the MIPS core general purpose registers (r0-r31) and all the GTE's registers and even a few more registers of the GTE, making their total number 0x100 registers (the GTE has 0x40 known registers)!
I don't know yet if there are any instructions that do something more than transfer data between the APU and the PPC core. Also I haven't yet determined how the APU fetches its code for execution: whether it is sent to it by the PPC core or the APU reads it directly from the data bus. If the code is read directly from the data bus, there would be no way to emulate storage devices with code executable directly from them, i.e. the BOOT ROM.
There are a few APU control registers with unknown functions. Using PPC device-control registers 0x180(address) 0x181(data - bi-directional) one can read (at least) 0x40 APU registers. They include 0x20 control registers, where hi, lo, program counter, and a few of the Cop0 registers can be found (the rest are unknown) and 0x20 MIPS general purpose r0-r31. The MIPS registers are also accessible through APU instructions that read/write only them. A similar set of instructions is used to access the GTE registers.
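The address/data register pair works like a classic index/data port. The sketch below models it with a mock backend so it can run anywhere; on the real PPC the two `dcr_*` helpers would presumably be replaced by the `mtdcr`/`mfdcr` instructions. The helper names, the 0x40-entry limit check and the mock storage are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DCR_APU_ADDR 0x180u  /* index register, per the post */
#define DCR_APU_DATA 0x181u  /* data register, per the post */

/* Mock backend standing in for the real device-control registers. */
static uint32_t mock_apu_regs[0x40];
static uint32_t mock_apu_index;

static void dcr_write(uint32_t dcr, uint32_t value)
{
    if (dcr == DCR_APU_ADDR)
        mock_apu_index = value & 0x3Fu;
    else if (dcr == DCR_APU_DATA)
        mock_apu_regs[mock_apu_index] = value;
}

static uint32_t dcr_read(uint32_t dcr)
{
    if (dcr == DCR_APU_DATA)
        return mock_apu_regs[mock_apu_index];
    if (dcr == DCR_APU_ADDR)
        return mock_apu_index;
    return 0;
}

/* Select an APU register through 0x180, then read it through 0x181. */
static uint32_t apu_reg_read(uint32_t index)
{
    assert(index < 0x40u);
    dcr_write(DCR_APU_ADDR, index);
    return dcr_read(DCR_APU_DATA);
}

/* Select an APU register through 0x180, then write it through 0x181. */
static void apu_reg_write(uint32_t index, uint32_t value)
{
    assert(index < 0x40u);
    dcr_write(DCR_APU_ADDR, index);
    dcr_write(DCR_APU_DATA, value);
}
```

A register dump for debugging is then just a loop over indices 0 to 0x3F calling `apu_reg_read`; which index holds which control register is not specified here.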
It is possible that some APU instructions might be directly mapped to MIPS/GTE instructions, so the PPC could execute them if necessary (speculation).
The above also means that the state of the MIPS core can be read and a simple feature to print all registers when an external interrupt occurs can be easily implemented. This can help debug cases when the MIPS program is in unknown state. The interrupt can be fed from the controller ports - /HTR1 (connecting it to ground) triggers interrupt 19.

157416/12 User: wisi (06-10-2018, 12:25 PM)
Somehow I never noticed this thread from 2007, which states that the IOP has been changed to a PPC CPU:
https://forum.beyond3d.com/threads/pstwo-i-o-processor-now-a-powerpc-440.35154/

Here are other related threads (for completeness):
Mainly related to PS3 and its components, which may relate to why the replacement IOP is a PPC CPU:
https://forum.beyond3d.com/threads/japanese-article-about-ps3-backwards-compatablilty.29141/page-4
https://forum.beyond3d.com/threads/if-sony-had-to-use-ps2-components-to-do-a-next-gen-system.11252/
https://forum.beyond3d.com/threads/ps3-internals.32889/page-16#post-866756
https://forum.beyond3d.com/threads/playstation-3-may-not-have-lsi-chips-in-its-design.8895/

Regarding DECKARD and the PPC IOP:
https://assemblergames.com/threads/the-iop-on-the-scph-75000-and-newer.57048/
https://assemblergames.com/threads/playstation-2-are-there-any-hardware-mysteries-left.62325/
http://psx-scene.com/forums/f19/ps2-slim-deckard-emulator-usb-loading-possible-157330/

EEUG99 commented May 3, 2023

The APU mentioned in this discussion does not fetch the code for execution on its own. The PPC core does (lwz - apu - handle result - repeat).
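The "lwz - apu - handle result - repeat" cycle can be pictured with a toy loop like the one below. It is not DECKARD's code: the real APU instructions and their semantics are unknown here, so the stand-in "APU" only understands the MIPS ADDIU encoding and reports everything else back to the caller. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t gpr[32];  /* MIPS general purpose registers */
    uint32_t pc;       /* byte offset into the code buffer */
} toy_mips_t;

/* Toy stand-in for the APU: executes only ADDIU, returns 0 otherwise. */
static int toy_apu_execute(toy_mips_t *cpu, uint32_t insn)
{
    if ((insn >> 26) != 0x09u)            /* ADDIU opcode */
        return 0;
    uint32_t rs = (insn >> 21) & 0x1Fu;
    uint32_t rt = (insn >> 16) & 0x1Fu;
    uint32_t imm = insn & 0xFFFFu;
    if (imm & 0x8000u)
        imm |= 0xFFFF0000u;               /* sign-extend the immediate */
    if (rt != 0)                          /* r0 is hard-wired to zero */
        cpu->gpr[rt] = cpu->gpr[rs] + imm;
    return 1;
}

/* Fetch a word, hand it to the "APU", handle the result, repeat. */
static size_t toy_run(toy_mips_t *cpu, const uint32_t *code, size_t words)
{
    assert(cpu != NULL && code != NULL);
    size_t executed = 0;
    while (cpu->pc / 4 < words) {
        uint32_t insn = code[cpu->pc / 4];   /* the "lwz" step */
        if (!toy_apu_execute(cpu, insn))     /* the "apu" step */
            break;                           /* "handle result": stop here */
        cpu->pc += 4;
        executed++;
    }
    return executed;
}
```

Because the PPC side does the fetch, it can source instruction words from anywhere, which would also allow code in emulated storage such as the BOOT ROM to be executed.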