157416/1 User: uyjulian (09-06-2017, 05:01 PM)
Let's have a DECKARD and PPC-IOP discussion!
The PPC-IOP is present on the SCPH-75000 and newer. This PPC-IOP has 4MB of memory available to it (for reference, older models have 2MB of IOP memory and TOOLs have 8MB of IOP memory).
DECKARD is the software that emulates the IOP and its functions. It runs on the PPC-IOP.
157416/2 User: TnA (09-06-2017, 05:03 PM)
Arrrr,... I rather thought
@Maximus32 or
@wisi would create a
thread rich in information! :D
You only wanted that Sticky! :p
157416/3 User: uyjulian (09-06-2017, 05:05 PM)
Originally Posted by TnA Arrrr,... I rather thought @Maximus32 or @wisi would create a thread rich in information! :D
You only wanted that Sticky! :p
I'll fill in more information later in the OP.
157416/4 User: TnA (09-06-2017, 05:10 PM)
Originally Posted by uyjulian I'll fill in more information later in the OP.
How dare you start a thread without it?! :lol:
Yeah, it's all o.k.!
157416/5 User: Maximus32 (09-06-2017, 05:18 PM)
One of the first things we need to get started on DECKARD is a working
toolchain:
https://github.com/rickgaiser/ps2toolchain/tree/deckard
And I have an example here:
https://github.com/rickgaiser/deckard_test
The example has the IOP code + IOP-PPC code, but is still missing the EE code because I loaded everything from ps2linux.
157416/6 User: wisi (09-06-2017, 07:39 PM)
BTW, it was SP193 who perhaps first noticed DECKARD. I initially
didn't believe that the IOP was emulated, and even after multiple
tests I couldn't find proof of that. Only when I found the PPC-IOP
serial port by chance and connected to it did I see data output from
it, which proved the code was used. At first I thought that the PPC
RAM would be protected from access from the EE side, so I didn't even
consider reading IOP RAM from the EE side to see if there was anything
unusual about it. A bit more on that here:
https://assemblergames.com/threads/playstation-2-are-there-any-hardware-mysteries-left.62325/page-2#post-900268
A complete "example" based on the code by Maximus32 is attached here:
http://psx-scene.com/forums/f98/hdpro-clone-132869/index5.html#post1211305
It loads the PPC and IOP loader-module code into the EE ELF. It uses a
part of the big load/store LUT that is unused and very large, so one can
add a lot more code and data. The load base address can be changed
easily by editing only the main Makefile (although that shouldn't be
necessary). In fact, when I checked what the load/store LUT actually
contains, it turned out that the unused area is more than 220kB per
LUT (one for loads and one for stores), so there is more than 440kB
of free, usable space in total (with a gap in the middle)!
The method Maximus32 came up with for indirectly flushing the PPC cache, so that it starts executing the loaded code, was to access IOP RAM. However, there turned out to be a simpler way: the entry-point address of the loaded (patch) code is written to an unused handler pointer of the big load/store LUT. Because it is unused (and far from the used ones), the data cache will never have contained it, so the pointer is certain to be read from RAM the first time the emulated IOP reads or writes a location handled by that handler. As for the instruction cache, it would never contain data from the LUT, as the LUT only contains data and not instructions, so there are no problems there either. The cache-block size is 32 bytes.
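The handler-pointer trick above can be sketched on a host machine as follows. This is only a model: the LUT is a plain byte array, and `lut_install_entry`, `cache_block_index`, the slot number and the entry address are hypothetical names and values picked for illustration, since the real LUT address and layout are not given in this thread. The pointer is stored big-endian, which is how the PPC would read it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK_SIZE 32u /* cache-block size quoted in the post */

/* Index of the 32-byte cache block that contains a given byte offset. */
uint32_t cache_block_index(uint32_t offset)
{
    return offset / CACHE_BLOCK_SIZE;
}

/* Store a 32-bit handler pointer big-endian into slot `slot` of a LUT image.
 * `lut` models the table as raw bytes; its real address is an assumption. */
void lut_install_entry(uint8_t *lut, size_t lut_size, size_t slot, uint32_t entry)
{
    size_t off = slot * 4u;
    assert(off + 4u <= lut_size); /* the slot must lie inside the table */
    lut[off + 0] = (uint8_t)(entry >> 24);
    lut[off + 1] = (uint8_t)(entry >> 16);
    lut[off + 2] = (uint8_t)(entry >> 8);
    lut[off + 3] = (uint8_t)(entry);
}
```

Comparing `cache_block_index()` of the chosen slot with that of every slot DECKARD really uses is one way to check the "far from the used ones" condition before relying on the pointer being fetched from RAM.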
The DECKARD file in the BOOT ROM is actually a directory containing three files: DECKARD (the actual emulator), which is a PPC ELF; DPATCH, a small patch that DECKARD seems to load from an absolute address in ROM; and LOADER, an ELF loader that loads DECKARD into RAM. This directory and all its data are big-endian. There seem to be some non-standard opcodes in the code. The emulator also seems to use self-modifying code.
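For anyone extracting these files from a ROM dump, a quick sanity check is to confirm that a candidate blob really is a big-endian PowerPC ELF. The sketch below uses only standard ELF header facts (magic bytes, the EI_DATA byte, and e_machine at offset 18); the function names are made up for this example, and it does not parse the DECKARD directory format itself, which is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EI_DATA      5   /* index of the data-encoding byte in e_ident */
#define ELFDATA2MSB  2   /* big-endian encoding */
#define EM_PPC       20  /* e_machine value for 32-bit PowerPC */

/* Read a big-endian 16-bit value. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return 1 if buf starts with a big-endian PowerPC ELF header, else 0. */
int is_ppc_be_elf(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 20) /* need e_ident (16 bytes), e_type and e_machine */
        return 0;
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;
    if (buf[EI_DATA] != ELFDATA2MSB)
        return 0;
    return read_be16(buf + 18) == EM_PPC; /* e_machine is at offset 18 */
}
```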
The IOP's maximum RAM size is 16MB, which was available (only) on some PS1 / early PS2 development hardware (mentioned in the DTL-T10000 code). Early PS2 development hardware also seems to have had 256MB of EE RAM (the maximum).
RAM: Virtual-IOP & DECKARD (as seen from EE side):
EE address   Contents
0x1C000000   IOP-RAM
0x1C200000   IOP-RAM
0x1C400000   IOP-RAM
0x1C600000   IOP-RAM
0x1C800000   IOP-RAM
0x1CA00000   DECKARD-RAM
0x1CC00000   IOP-RAM
0x1CE00000   DECKARD-RAM
The IOP (or the PS1 CPU) is known to have been used at one point with
16MB of RAM (DIP switch settings from TOOL), in which case, all the
above would be taken-up by IOP RAM.
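The EE-side layout above is regular enough to express as a lookup: eight 2MB windows starting at 0x1C000000, of which the sixth and eighth are DECKARD-RAM. A minimal sketch, with `classify_ee_addr` and the enum names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

#define EE_IOP_BASE   0x1C000000u
#define WINDOW_SIZE   0x00200000u /* 2MB per window */
#define WINDOW_COUNT  8u

/* Eight 2MB windows cover the 16MB maximum mentioned in the post. */
static_assert(WINDOW_SIZE * WINDOW_COUNT == 0x01000000u, "windows must span 16MB");

enum region { REGION_NONE, REGION_IOP_RAM, REGION_DECKARD_RAM };

/* Classify an EE-side physical address according to the table above. */
enum region classify_ee_addr(uint32_t addr)
{
    if (addr < EE_IOP_BASE || addr - EE_IOP_BASE >= WINDOW_SIZE * WINDOW_COUNT)
        return REGION_NONE;
    uint32_t window = (addr - EE_IOP_BASE) / WINDOW_SIZE;
    /* Windows 5 (0x1CA00000) and 7 (0x1CE00000) are listed as DECKARD-RAM. */
    return (window == 5 || window == 7) ? REGION_DECKARD_RAM : REGION_IOP_RAM;
}
```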
It can be seen that DECKARD-RAM is wired in such a way that the IOP
can still have 8MB of RAM connected to it, for use in TOOL. This has
been the case at least in the earlier revisions, where the IOP is
still in a separate ASIC. The latest PS2 models have the IOP in one
ASIC with most of the PS2 hardware (a PoC, "PS2 on Chip", in a way),
so it cannot have RAM added (at least not without a known pinout, and
the pinout is not known).
Another interesting fact is that the PPC TLB (configuration attached)
maps the first 8MB of memory to mirrors of the 2MB IOP RAM. It also
seems to have two variants of that, one of which might have been used
for debugging DECKARD. There is yet another mirror with virtual
address 0x40000000 (again 8MB of 2MB mirrored), with unknown purpose,
and a few 2MB separate mirrors.
The IOP RAM configuration register seems to still be present so it is
possible that it is functional.
157416/7 User: wisi (09-10-2017, 08:27 AM)
I am trying to make an IRX-like PPC ELF, however there are problems
with the linker options:
The IRX imports/exports work, but require having program headers in
the ELF. To get relocation information, the '-r' option is needed.
Adding the -r option, however, makes the program headers (all of them)
(specified in the linker script) disappear! :wow:
... and I have no idea why this is happening or how to solve it. Any
ideas?
I also noticed other strange things:
- to disable page-alignment of sections and headers, there are the
options:
-n, --nmagic Do not page align data
-N, --omagic Do not page align data, do not make text readonly
Using -n or -N does work, but using --nmagic or --omagic has no effect.
- setting any alignment for sections (using ALIGN() or BLOCK) has no
effect on the alignment value in the section header. However, adding an
aligned structure to that section does change the value, for
example:
struct {u32 dummyAlign;} dmyStruct __attribute__((section(".dummyAlignSct16"))) __attribute__((aligned(16))) = { 0x00 };
Maybe it is my limited knowledge of gcc that is the core cause of those issues... :p
If this gets fixed, then we may have a working SYSMEM and LOADCORE on the PPC side that can load and link IRX-like PPC modules. Currently the linking works, but because of the above the relocation doesn't, so at this point modules can only be loaded as static ELFs at a pre-set address.
EDIT: Solved (at least for now): The program headers were excluded when -r was added because -r makes the file relocatable, which also means it is not a final "output" file but rather a file suitable for further linking, and no program headers are added to such a file. The correct option is -q, which includes relocation information in the *output* file. gcc doesn't recognize the -q option (unlike -r), so it needs to be passed with an "-Xlinker" option before it, i.e. "-Xlinker -q", so that gcc passes it through to the linker (ld).
157416/8 User: wisi (09-20-2017, 04:17 AM)
Attached is experimental code for loading IRX (actually PRX: PPC
Relocatable eXecutable) modules on the PPC alongside DECKARD.
On the PPC side, modified versions of SYSMEM and LOADCORE from pcsx2
Fps2Bios are used. They handle loading PRX modules and linking imports
to exports.
The code also includes a very small "library" of functions and hooks
to DECKARD functions to make coding easier.
A virtual MIPS-IOP <-> PPC-IOP communication interface is included.
It runs over a pair of unused registers and lets each side call
functions on the other, passing arguments and returning a result (or
a structure of data). Because (MIPS-)IOP RAM is accessible from the
PPC-IOP at the same offset (0x000000), both sides can access
structures in it and pass data this way (the code on the PPC-IOP has
to make sure to mask the segment).
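"Masking the segment" refers to the fact that MIPS pointers usually carry kseg0/kseg1 bits (0x80000000 or 0xA0000000) on top of the physical address. A minimal host-side sketch of what the PPC-side code has to do with a pointer received from the MIPS side; `mips_ptr_to_ram` and the byte-array RAM model are illustrative assumptions and are not taken from the attached code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIPS_PHYS_MASK 0x1FFFFFFFu /* strips the kseg0/kseg1 segment bits */

/* Convert a pointer value received from the MIPS side into a host pointer
 * inside `iop_ram`, or NULL if [addr, addr + len) falls outside the RAM. */
uint8_t *mips_ptr_to_ram(uint8_t *iop_ram, size_t ram_size,
                         uint32_t mips_addr, size_t len)
{
    assert(iop_ram != NULL);
    uint32_t phys = mips_addr & MIPS_PHYS_MASK;
    if (phys > ram_size || len > ram_size - phys)
        return NULL;
    return iop_ram + phys;
}
```

The bounds check matters because a structure pointer passed across the interface is untrusted from the point of view of the receiving side.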
Code on the PPC can access peripheral devices registers the same way
code on the virtual IOP would (with the emulation provided by DECKARD)
through the use of provided load/store functions.
Makefiles and linker scripts are included, which can build ELF PSX
(static - at fixed address, but with imports and exports) and PRX
binaries for the PPC.
Example modules are included - the most complete of which is a
modified version of the (old now) USBD module, that has several of its
functions ported to PPC-side. This however led to no change in
transfer speed as the chosen functions are neither called that often,
nor so processing-intense to make a difference.
It is structured, so that its PPC-counterpart PRX is loaded as data in
the IRX. From there, it is loaded to PPC-RAM. There is a bug that
prevents modules with symbols stripped from being loaded correctly, so
they currently consume a lot of MIPS-IOP-RAM.
Of the PPC-IOP RAM, 456kB are free for loading modules and
data.
The code is neither complete nor bug-free.
Although I am doubting more and more whether this would actually be of
much use for making IOP modules run faster, if not that, then at least
the property of this code to load multiple modules on the PPC may be
useful.
157416/9 User: Maximus32 (09-20-2017, 02:03 PM)
Originally Posted by wisi Attached is an experimental code for loading IRX (actually PRX PPC Relocatable eXecutable) modules on the PPC alongside with DECKARD.
This is very useful, thank you. I like what you did with the "PRX" and module loading in a similar way to the IRX files.
Originally Posted by wisi It is structured, so that its PPC-counterpart PRX is loaded as data in the IRX. From there, it is loaded to PPC-RAM. There is a bug that prevents modules with symbols stripped from being loaded correctly, so they currently consume a lot of MIPS-IOP-RAM.
From the PPC-IOP RAM there are 456kB free for loading modules and data.
It would probably be better to load the PRX as a file, instead of embedding it into the IRX. That would save a lot of MIPS-IOP RAM.
Originally Posted by wisi The code is neither complete nor bug-free.
But it works :-). Using ps2client it runs the first time. The second time it hangs, probably because the PPC code is still loaded?
Originally Posted by wisi Although I am doubting more and more whether this would actually be of much use for making IOP modules run faster, if not that, then at least the property of this code to load multiple modules on the PPC may be useful.
There is no software involved in USB transmission. USBD just tells the
OHC (hardware) what it needs to do.
I think the PPC will be very useful as an "accelerator" for
processing intensive tasks/loops, like memcpy or checksum calculation
(smb driver?).
157416/10 User: wisi (09-20-2017, 04:23 PM)
Originally Posted by Maximus32 This is very useful, thank you. I like what you did with the "PRX" and module loading in a similar way to the IRX files.
Actually PRX modules seem to already be used on the PS3, but those are probably a different format, so I wasn't sure if using this abbreviation is a good idea.
Originally Posted by Maximus32 It would probably be better to load the PRX as a file, instead of embedding it into the IRX. That would save a lot of MIPS-IOP RAM.
My initial idea was to have an IRX that would contain both complete
MIPS-IOP code and an option to run part of its code on the PPC, so the
same module could be used on both fat and DECKARD models, and the IRX
would detect if the PS2 is a DECKARD model and load and use its PPC
code. This would make IRX files very big though. Another option is to
place the PPC code in a separate section that won't be loaded in
MIPS-IOP RAM and patch the MIPS-IOP LOADCORE to send that section to
the PPC-side for loading when it is detected. But I really don't want
to mess with the linker scripts more, because it was quite difficult
getting them to work correctly. :p
If the IRX is already bundled in an ELF, then the ELF would also need
to include the PRX counterpart of the IRX (and load it somehow). Also
note that communicating with the PPC-IOP from the EE is not
straight-forward, because there isn't actually a clean way of doing
that (that doesn't involve resources already available to the
MIPS-IOP). So to load a PRX on the PPC, an IRX would still be
necessary (it can unload itself afterwards).
A simpler way would be to place the PRX (somehow) at the end of the
IRX (again maybe using a separate section) and after loading the PRX,
free the memory of the IRX and allocate again over it, leaving the
memory where the PRX is free.
Originally Posted by Maximus32 But it works :-). Using ps2client it runs the first time. The second time it hangs, probably because the PPC code is still loaded?
Thanks for testing. :)
I've tried to figure-out why it fails on the second loading, but
couldn't find the cause. ...not like there aren't at least a dozen
reasons for it to fail :D
The loading mechanism is the same as that of ps2madd/p2mesac (even the
code of it is still inside :p ), which does work correctly on multiple
reloads. I don't think it is due to moving the store LUT and shrinking
both LUTs, as the part of the load-LUT used for loading is at the same
place. On the second loading it hangs after the IOP-trigger-IRX
accesses the register which causes the (newly-loaded) PPC code to be
called, but before it prints anything, which is rather odd. This might
actually be fixable, now after thinking about it. It is another matter
if it is a good idea, as overwriting the patch will also clear to
default values all its local state-variables but the original DECKARD
code will still retain its patches from the previous run, so this can
cause problems later. Maybe the main loader should include a check to
determine if the main patch is already loaded and either refuse to
load it, or use an "unloader-IRX" to trigger it to unpatch DECKARD if
possible.
157416/11 User: wisi (01-25-2018, 05:45 AM)
A small update on DECKARD and the PPC-IOP:
Although the MIPS-IOP Interrupt Controller's registers (0xBF801070)
are functional and are connected to (most of) the interrupt sources,
they are treated in a special way: the interrupt status register is
read and the bits that are found to be set are immediately cleared.
Then the set bits are handled and written to a software shadow
register which is OR-ed with the hardware interrupt status register
when it is being read by the MIPS-IOP. This is normal and is a way to
guarantee that no interrupts are missed.
This means that detecting interrupts (by homebrew patches) must be
done before that function that reads and clears the interrupt status
register, otherwise they will (almost) never be detected as active.
The interrupt controller may not be connected to the PPC's external
interrupt (or any interrupt for that matter). This is not thoroughly
tested, but it is certain that at least the code in DECKARD only
checks for interrupts from the INTC at certain places in the code and
is never interrupted by them.
Some of the PPC exceptions (like alignment and bus errors) are passed
to the emulated IOP directly. They are also one of the few unmasked
exceptions (uncertain).
The PPC emulates a few registers in the IOP core register range at 0xFFFE0000. They are used when emulating the BOOT ROM regions (romVer), the SPEED revision, and the EEPROM and parameter settings (those are registers 0xFFFE0000 +0x180 to +0x1A7). This can only be done from the IOP, and is done by the EECONF IRX (and other code).
The IOP DMAC might be emulated (by PIO accesses) or might be partially emulated. Some channels of it in PS1 mode are certainly emulated.
Up until recently, I believed that the unrecognized PPC instructions
found in DECKARD are either used to trigger exceptions which are later
emulated by the exception handler or are edited by self-modifying code
according to the emulated MIPS instruction. DECKARD does indeed use
self-modifying code in places, which I incorrectly considered to be a
sign that it uses a recompiler.
At this point I believe it uses an interpreter, given that the PPC is
considerably faster than the MIPS IOP, and that emulation is done close
to the hardware level. Now it seems that the unrecognized instructions
may actually be executed by a MIPS-emulation coprocessor connected to
the PPC core. The PPC does indeed support an auxiliary processor unit
(APU), so this is a possibility. This would also mean that DECKARD
cannot be ported to a different PPC, because a part of the
instruction-emulation logic (this APU) would be missing. This also
explains why DECKARD is so small and why there aren't big look-up
tables for handling the different instructions in it.
Another downside of the MIPS-core emulation is what seems to be the
lack of a way to obtain the current CPU core state. There is a big
structure that seems to contain all MIPS CPU registers (r0-r31, the
Cop0 registers and a few more including the program counter). It is
used when PPC exceptions need to be passed to the MIPS-IOP. However,
the general purpose registers there always contain zeroes, and the PC
and other Cop0 registers are only updated when the code encounters an
exception. The PC only contains exception vectors and no real counter,
so one can't interrupt the PPC and read the MIPS PC to see where the
emulated IOP program is executing. This all seems to be "locked" in
the MIPS-emulation coprocessor. There must be a way to read all the
registers, but that would involve using coprocessor instructions and
their functions are completely unknown.
Perhaps this works (seen from the emulated IOP) because this
MIPS-emulation APU makes all the MIPS registers available to the
emulated CPU, but not to DECKARD (at least not constantly).
It is possible that initially a full emulator of the MIPS IOP was written for the PPC, but it was later found to be too slow on the PPC440/405, so the MIPS-emulation APU was added. This is just speculation, of course. It is also one more reason why the PS3's emulator of the PS2 IOP is probably considerably different from DECKARD: it most likely lacks the MIPS-emulation APU (note that I haven't looked at the PS3 emulator much, so this is mostly an assumption).
Emulation of some devices (memory accesses) does not go through the
common load/store handlers, but rather is done by the code that calls
them. This includes emulation of the BOOT ROM accesses (and RomVer
patching) and emulation of the MIPS-core registers at 0xFFFE0000, as
well as RAM emulation.
The MIPS-Core registers at 0xFFFE0000 seem to also be made accessible
at 0x1FFE0000 in the high part of the BOOT ROM and when the IOP is
rebooted in PS1 mode their addresses at 0x1FFE0000 are used. This
makes emulating the BOOT ROM challenging, but not impossible.
EDIT:
So it turned-out the PPC-IOP is not at all what it was considered to
be - a PPC core that emulates the MIPS-IOP and parts of its
peripherals.
The PPC-IOP actually *contains* a MIPS-IOP Core APU! :wow:
This makes a lot of sense because it answers the question of why
(perhaps) a simple MIPS core running at 36MHz would be difficult to
produce in 2005 - because it wasn't! They simply integrated the
MIPS-IOP in the PPC-IOP. But why?! Most likely the reason was in order
to fix compatibility problems that had arisen in the previous IOP
revisions in the SCPH-70000 models. I don't know much about that and
it would be better if somebody more knowledgeable of the PS2
compatibility adds some information on that.
There are additional instructions for that APU, used to read and write
MIPS-IOP core registers and configuration registers of the APU. Also
there are instructions for reading and writing all the MIPS core
general purpose registers (r0-r31) and all the GTE's registers and
even a few more registers of the GTE, making their total number 0x100
registers (the GTE has 0x40 known registers)!
I don't know yet if there are any instructions that do something more
than transfer data between the APU and the PPC core. Also I haven't
yet determined how the APU fetches its code for execution: whether it
is sent to it by the PPC core or whether the APU reads it directly from
the data bus. If the code is read directly from the data bus, there
would be no way to emulate storage devices with code executable
directly from them, i.e. the BOOT ROM.
There are a few APU control registers with unknown functions. Using
PPC device-control registers 0x180(address) 0x181(data -
bi-directional) one can read (at least) 0x40 APU registers. They
include 0x20 control registers, where hi, lo, program counter, and a
few of the Cop0 registers can be found (the rest are unknown) and 0x20
MIPS general purpose r0-r31. The MIPS registers are also accessible
through APU instructions that read/write only them. A similar set of
instructions is used to access the GTE registers.
It is possible that some APU instructions might be directly mapped to
MIPS/GTE instructions, so the PPC could execute them if necessary
(speculation).
The above also means that the state of the MIPS core can be read and a
simple feature to print all registers when an external interrupt
occurs can be easily implemented. This can help debug cases when the
MIPS program is in unknown state. The interrupt can be fed from the
controller ports - /HTR1 (connecting it to ground) triggers interrupt
19.
157416/12 User: wisi (06-10-2018, 12:25 PM)
Somehow I never noticed this thread from 2007! which states that the
IOP has been changed to a PPC CPU :
https://forum.beyond3d.com/threads/pstwo-i-o-processor-now-a-powerpc-440.35154/
Here are other related threads (for completeness):
Mainly related to PS3 and its components, which may relate to why the
replacement IOP is a PPC CPU:
https://forum.beyond3d.com/threads/japanese-article-about-ps3-backwards-compatablilty.29141/page-4
https://forum.beyond3d.com/threads/if-sony-had-to-use-ps2-components-to-do-a-next-gen-system.11252/
https://forum.beyond3d.com/threads/ps3-internals.32889/page-16#post-866756
https://forum.beyond3d.com/threads/playstation-3-may-not-have-lsi-chips-in-its-design.8895/
Regarding DECKARD and the PPC IOP:
https://assemblergames.com/threads/the-iop-on-the-scph-75000-and-newer.57048/
https://assemblergames.com/threads/playstation-2-are-there-any-hardware-mysteries-left.62325/
http://psx-scene.com/forums/f19/ps2-slim-deckard-emulator-usb-loading-possible-157330/
The APU mentioned in this discussion does not fetch the code for execution on its own. The PPC core does (lwz - apu - handle result - repeat).
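The "lwz - apu - handle result - repeat" cycle can be pictured as the loop below. This is purely illustrative: the real APU instructions and their result encoding are not documented in this thread, so the APU is replaced by a callback, the result codes are invented, and branches are ignored (the program counter simply advances).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; the real APU interface is not documented. */
enum apu_result { APU_OK, APU_NEEDS_HANDLING, APU_STOP };

typedef enum apu_result (*apu_exec_fn)(void *ctx, uint32_t mips_opcode);

/* Run until the APU stand-in reports APU_STOP or the code runs out.
 * Returns the number of opcodes that needed software handling. */
size_t run_fetch_loop(const uint32_t *code, size_t count,
                      apu_exec_fn apu_exec, void *ctx)
{
    assert(code != NULL && apu_exec != NULL);
    size_t handled = 0;
    for (size_t pc = 0; pc < count; pc++) {
        uint32_t opcode = code[pc];                /* "lwz": fetched by the PPC core */
        enum apu_result r = apu_exec(ctx, opcode); /* "apu": handed to the APU */
        if (r == APU_STOP)
            break;
        if (r == APU_NEEDS_HANDLING)               /* "handle result" */
            handled++;
    }
    return handled;
}

/* Demo stand-in: opcode 0 means "stop", odd opcodes need software help.
 * ctx points to a counter of opcodes that were executed. */
enum apu_result demo_apu_exec(void *ctx, uint32_t mips_opcode)
{
    size_t *executed = ctx;
    if (mips_opcode == 0)
        return APU_STOP;
    (*executed)++;
    return (mips_opcode & 1u) ? APU_NEEDS_HANDLING : APU_OK;
}
```

Since the PPC core performs the fetch, it can substitute the opcode source, which is what would make emulating code executed directly from the BOOT ROM feasible.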