@balidani
Created August 23, 2013 12:19
<plaintext> I'll explain the part where I create a new dummy too
<plaintext> because the original used to work and now it doesn't
<plaintext> Here is the OpenCL code for a kernel that produces a relatively long ISA
<plaintext> https://gist.github.com/balidani/a76cc41b1b041f02f980
<plaintext> this produces ~2.7K of ISA
<plaintext> I load this into sample.cl and then run
<plaintext> run/binary_gen 1 0 sample.bin sample.cl
<plaintext> "1 0" means device 0 of platform 1, which is the Tahiti device we need
<plaintext> sample.bin will be the generated binary
<plaintext> I find the code section and overwrite every 4-byte instruction word in it with "00 00 80 bf" (= NOP)
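A minimal sketch of that NOP-filling step, assuming the start offset and length of the code section are already known (locating the section inside the ELF is not shown in this log). The function name `fill_with_nops` and its parameters are hypothetical; only the byte pattern "00 00 80 bf" comes from the log.

```python
import struct

# Little-endian encoding of the 32-bit word 0xBF800000,
# i.e. the bytes 00 00 80 bf that the log uses as NOP.
S_NOP = struct.pack("<I", 0xBF800000)


def fill_with_nops(blob: bytes, start: int, length: int) -> bytes:
    """Return a copy of blob with blob[start:start+length] replaced by NOP words.

    start and length are hypothetical inputs: the caller has to find
    the code section in the binary by some other means.
    """
    if length % 4 != 0:
        raise ValueError("code section length must be a multiple of 4 bytes")
    if start < 0 or start + length > len(blob):
        raise ValueError("code section lies outside the binary")
    return blob[:start] + S_NOP * (length // 4) + blob[start + length:]
```

The result has the same size as the input, so every other offset in the binary stays where it was.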
<plaintext> Then I create the ISA for a much simpler kernel:
<plaintext> https://gist.github.com/balidani/f5745068b7e140acec0f
<plaintext> Even though the AMD KernelAnalyzer can generate ISA, I used OpenCL itself to do it, because I trust it
<plaintext> so I create a sample2.cl, and load it again with the binary_gen program
<plaintext> this time I use the _temp_0_Tahiti_main.isa file that was generated due to the "-save-temps" option
<plaintext> I cut the ISA and modify the comment syntax, because gcnasm uses ";" instead of "//"
<plaintext> I also add a 0 to the last instruction, s_endpgm, because gcnasm can't handle instructions with no arguments yet
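The two manual edits above (comment syntax and the bare `s_endpgm`) could be scripted roughly like this. It is only a sketch: the function name `adapt_isa_for_gcnasm` is hypothetical, and it assumes the ISA dump uses "//" purely as a comment marker and has one instruction per line.

```python
def adapt_isa_for_gcnasm(isa_text: str) -> str:
    """Rewrite '//' comments as ';' comments and give a bare s_endpgm a 0 operand."""
    out = []
    for line in isa_text.splitlines():
        # Split the line into the instruction part and an optional '//' comment.
        code, sep, comment = line.partition("//")
        code = code.rstrip()
        # gcnasm (per the log) cannot parse an instruction without arguments yet.
        if code.strip() == "s_endpgm":
            code += " 0"
        if sep:
            line = f"{code} ;{comment}" if code else f";{comment}"
        else:
            line = code
        out.append(line)
    return "\n".join(out)
```

For example, `adapt_isa_for_gcnasm("  s_endpgm // done")` yields `"  s_endpgm 0 ; done"`.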
<plaintext> I save the ISA to a file (test.isa)
<plaintext> and then run the assembler script
<plaintext> I tried doing the same as the assembler script by hand. Here are the steps for that too:
<plaintext> I create the microcode for the ISA file using this command:
<plaintext> run/gcnasm test.isa test.bin
<plaintext> now I use the python patching script
<plaintext> python tools/dummy_elf_patcher/patch_dummy.py sample.bin test.bin output.bin
<plaintext> output.bin will be the patched ELF
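The contents of patch_dummy.py are not shown in this log, so the following is only a guess at the general idea: copy the assembled microcode over the start of the NOP-filled code region and leave the rest of the dummy untouched. The names `patch_code_region`, `code_offset` and `code_size` are hypothetical, and the real script may locate the region and handle ELF metadata differently.

```python
def patch_code_region(dummy: bytes, code_offset: int, code_size: int,
                      microcode: bytes) -> bytes:
    """Overwrite the start of the dummy's code region with microcode.

    code_offset and code_size are assumed to be known in advance;
    bytes of the region past the microcode keep whatever the dummy had
    (NOPs, if the region was NOP-filled beforehand).
    """
    if code_offset < 0 or code_offset + code_size > len(dummy):
        raise ValueError("code region lies outside the dummy binary")
    if len(microcode) > code_size:
        raise ValueError("microcode does not fit into the dummy's code region")
    end = code_offset + len(microcode)
    return dummy[:code_offset] + microcode + dummy[end:]
```

Because the file size does not change, this only works when the dummy's code region is at least as large as the new microcode, which is why the dummy comes from a kernel with a long ISA.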
<plaintext> now I load this into OpenCL using this command:
<plaintext> run/binary_gen 1 0 output.bin none.cl
<plaintext> "none.cl" is a file that doesn't exist; if output.bin is found, the source is not used
<plaintext> and the result I get is an "error": whatever I loaded as the ISA, I get the output from the last GPU execution
<plaintext> so if my small test contained out[gid] = 1337, and I change the ISA to "out[gid] = 777", I will still find 1337 in the output
<plaintext> I also tried changing the VGPR and SGPR counts in the binary's ATI CAL comments, but it didn't help
<plaintext> tell me if something is hard to understand, it was a bit rushed, sorry
<ukasz_> are you able to patch any binary at the moment?
<plaintext> no, it looks like I'm not