<plaintext> I'll also explain the part where I create a new dummy
<plaintext> because the original used to work and now it doesn't
<plaintext> Here is the OpenCL code for a kernel that produces a relatively long ISA
<plaintext> https://gist.github.com/balidani/a76cc41b1b041f02f980
<plaintext> this produces ~2.7K of ISA
<plaintext> I save this as sample.cl and then run
<plaintext> run/binary_gen 1 0 sample.bin sample.cl
<plaintext> "1 0" means device 0 of platform 1, which is the Tahiti device we need
<plaintext> sample.bin will be the generated binary
<plaintext> I find the code section and overwrite every instruction word with "00 00 80 bf" (= NOP)
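The NOP-filling step above can be sketched in Python. This is only an illustration: the function name `nop_fill` and the way the offset and length are passed in are assumptions, and finding the code section inside sample.bin is not covered here. The only value taken from the log is the byte pattern "00 00 80 bf".

```python
# Byte pattern quoted in the log; read as a little-endian 32-bit word it is 0xBF800000.
NOP = bytes.fromhex("000080bf")


def nop_fill(blob: bytes, start: int, length: int) -> bytes:
    """Return a copy of blob with bytes [start, start + length) replaced by NOP words.

    start and length are hypothetical inputs: the caller has to know where the
    code section lives in the binary.
    """
    if length % len(NOP) != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    if start < 0 or start + length > len(blob):
        raise ValueError("range falls outside the binary")
    patched = bytearray(blob)
    patched[start:start + length] = NOP * (length // len(NOP))
    return bytes(patched)
```

The file size is unchanged, so anything else in the binary that records offsets or sizes keeps pointing at the same places.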
<plaintext> Then I create the ISA for a much simpler kernel:
<plaintext> https://gist.github.com/balidani/f5745068b7e140acec0f
<plaintext> Even though the AMD KernelAnalyzer can generate ISA, I used OpenCL itself to do it, because I trust it
<plaintext> so I create a sample2.cl and run it through the binary_gen program again
<plaintext> this time I use the _temp_0_Tahiti_main.isa file that is generated because of the "-save-temps" option
<plaintext> I cut out the ISA and change the comment syntax, because gcnasm uses ";" instead of "//"
<plaintext> I also add a 0 to the last instruction, s_endpgm, because gcnasm can't handle instructions with no arguments yet
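The two manual edits described above (switching "//" comments to ";" and giving s_endpgm a dummy 0 operand) could be automated roughly as follows. This is a sketch, not part of the tools mentioned in the log: the function name `adapt_isa_for_gcnasm` is made up, and it assumes that gcnasm accepts a trailing ";" comment on an instruction line.

```python
def adapt_isa_for_gcnasm(text: str) -> str:
    """Rewrite '//' comments as ';' comments and append ' 0' to a bare s_endpgm."""
    out = []
    for line in text.splitlines():
        # Split the line into the instruction part and an optional '//' comment.
        code, sep, comment = line.partition("//")
        code = code.rstrip()
        if code.strip() == "s_endpgm":
            # Workaround from the log: give the operand-less instruction a 0.
            code += " 0"
        if sep:
            line = (code + " ;" + comment) if code else (";" + comment)
        else:
            line = code
        out.append(line)
    return "\n".join(out)
```

For example, a line such as `  s_endpgm // 00000004: BF810000` becomes `  s_endpgm 0 ; 00000004: BF810000`, while instruction lines without comments pass through unchanged apart from trailing whitespace.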
<plaintext> I save the ISA to a file (test.isa)
<plaintext> and then run the assembler script
<plaintext> I also tried doing by hand what the assembler script does. Here are the steps for that too:
<plaintext> I create the microcode for the ISA file using this command:
<plaintext> run/gcnasm test.isa test.bin
<plaintext> now I use the python patching script
<plaintext> python tools/dummy_elf_patcher/patch_dummy.py sample.bin test.bin output.bin
<plaintext> output.bin will be the patched ELF
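The source of patch_dummy.py is not shown in this log, so the following is only a guess at the general idea behind such a patch: find the NOP-filled region of the dummy binary and copy the assembled microcode over its start. The names `find_nop_run` and `patch`, and the 16-word minimum run length, are assumptions made for illustration.

```python
from typing import Tuple

NOP = bytes.fromhex("000080bf")  # NOP pattern quoted earlier in the log


def find_nop_run(blob: bytes, min_words: int = 16) -> Tuple[int, int]:
    """Return (offset, length in bytes) of the first run of at least min_words NOPs."""
    start = blob.find(NOP * min_words)
    if start < 0:
        raise ValueError("no NOP run found in the dummy binary")
    end = start + len(NOP) * min_words
    # Extend the run word by word for as long as NOPs continue.
    while blob[end:end + len(NOP)] == NOP:
        end += len(NOP)
    return start, end - start


def patch(dummy: bytes, microcode: bytes) -> bytes:
    """Overwrite the start of the NOP run with microcode; file size stays the same."""
    start, length = find_nop_run(dummy)
    if len(microcode) > length:
        raise ValueError("microcode does not fit into the NOP region")
    out = bytearray(dummy)
    out[start:start + len(microcode)] = microcode
    return bytes(out)
```

Usage would look like `open("output.bin", "wb").write(patch(open("sample.bin", "rb").read(), open("test.bin", "rb").read()))`. The sketch does not touch any ELF headers, sizes, or metadata; whether the real script does is not stated in the log.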
<plaintext> now I load this into OpenCL using this command:
<plaintext> run/binary_gen 1 0 output.bin none.cl
<plaintext> "none.cl" is a file that doesn't exist; if output.bin is found, the source is not used
<plaintext> and the result I get is an "error": whatever I load as the ISA, I get back the output of the last GPU execution
<plaintext> so if my small test contained out[gid] = 1337, and I change the ISA to "out[gid] = 777", I will still find 1337 in the output
<plaintext> I also tried changing the vgpr and sgpr counts in the binary's ATI CAL comments, but it didn't help
<plaintext> tell me if something is hard to understand, it was a bit rushed, sorry
<ukasz_> are you able to patch any binary at the moment?
<plaintext> no, it looks like I'm not