@gynvael
Last active April 16, 2023 17:56
Notes on Infiltration (JerseyCTF III '23)

A rogue AI has infiltrated a game server's custom VM run on PPC and its code is now traversing the user base. The developers have decompiled and given the current executing script the memory it was accessing at the time and opcode documentation. You are tasked with investigating the nature of this threat.

You were given three files:

  • opcodes.md with an incomplete description of opcodes
  • ctf.xsa with assembly (as in: text) for some architecture
  • memory.bin with 16KB of binary data (high entropy apart from the header)

Apart from that, a somewhat useful hint was who made the challenge (Native Function), as they had a lot of "RAGE VM" related repos/tools on their GitHub.

So yeah, this challenge was about the Rockstar Advanced Game Engine aka RAGE (which I learnt a few hours into the challenge), or rather its scripting language. Or rather its assembly form, in a dialect made by the folks behind the RASM (dis)assembler. The fact that it was a dialect made it harder to Google for, which made the whole challenge more confusing to solve.

Eventually the approach I took was to start implementing an assembly parser + emulator based on the instruction documentation, covering only the instructions that were actually used. And by that I mean I told GPT-4 to implement what it could and then started fixing that by analyzing the provided documentation and the actual code.

GPT-4 was also useful for translating the assembly code into Python based on the documentation. The result wasn't correct, but it gave me a decent idea of what I was dealing with.

In the end there were 4 steps to this challenge:

  1. Load the header (4 big-endian ints: magic, data size, list start and initial key) and check some values.
  2. Decrypt the memory and iterate the key (decrypt function).
  3. Walk through the linked list and gather 5 ints (set function).
  4. "Decrypt" these 5 ints and print them together in ASCII as the flag.
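Step 1 can be sketched roughly as below. This is only an illustration of the header layout listed above (four big-endian 32-bit ints); the `Header` and `parse_header` names are made up here, and the demo magic and key values are invented, since the write-up does not state the real ones.

```python
import struct
from typing import NamedTuple


class Header(NamedTuple):
    """Hypothetical container for the four header fields."""
    magic: int
    data_size: int
    list_start: int
    initial_key: int


def parse_header(blob: bytes) -> Header:
    """Unpack the first 16 bytes as four big-endian unsigned 32-bit ints."""
    if len(blob) < 16:
        raise ValueError("need at least 16 bytes for the header")
    return Header(*struct.unpack(">4I", blob[:16]))


# Synthetic header for illustration only: the magic and key are invented,
# while 0x4000 and 0x1830 are the size and list start used in the scripts.
demo = struct.pack(">4I", 0x12345678, 0x4000, 0x1830, 0xCAFEBABE)
header = parse_header(demo)
print(header)
```

With the real memory.bin, `parse_header(open("memory.bin", "rb").read())` would give the values that the scripts below hard-code.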

I got my emulator to do steps 1 and 2, and then switched to rewriting the assembly code in Python, since I understood it well enough at that point.

So here's the Python code for step 2:

import struct

def decrypt_2(sz, key):
    # Words 0..3 are the header, so the encrypted data starts at word 4.
    local_4 = 0

    while local_4 < (sz - 16) // 4:
        #print(hex(local_4*4 + 16), hex(key))

        static_0[4 + local_4] ^= key

        # Double the key (32-bit wraparound); reseed it once it hits zero.
        key = key * 2
        key &= 0xffffffff
        if key == 0:
            key = local_4 % 15 + 1

        local_4 += 1

    print("final key:", hex(key))

def dumpstatic():
    # The step 3/4 script below opens "step2.bin", presumably this dump renamed.
    with open("dump_plain.static_0", "wb") as f:
        f.write(struct.pack(f">{len(static_0)}I", *static_0))

def main():
    global static_0

    with open("memory.bin", "rb") as f:
        d = f.read()

    # Treat the whole file as big-endian 32-bit ints.
    static_0 = list(struct.unpack(f">{len(d)//4}I", d))

    # static_0[3] is the initial key from the header.
    decrypt_2(0x4000, static_0[3])
    dumpstatic()


main()
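To make the key schedule easier to see in isolation, here is a small side-effect-free sketch of the same XOR loop. The `xor_words` name and the demo data are made up for illustration; the doubling and reseeding logic is copied from `decrypt_2` above. Since XOR is its own inverse, running it twice with the same starting key restores the input.

```python
MASK32 = 0xFFFFFFFF


def xor_words(words, key):
    """XOR each 32-bit word with a rolling key; return (new_words, final_key)."""
    out = []
    for i, word in enumerate(words):
        out.append(word ^ key)
        key = (key * 2) & MASK32   # shift left by one bit, drop the overflow
        if key == 0:               # every set bit has been shifted out
            key = i % 15 + 1       # reseed from the word index (1..15)
    return out, key


# Made-up plaintext: 40 words, starting key 1.
plain = list(range(40))
enc, final_key = xor_words(plain, 1)
dec, _ = xor_words(enc, 1)

print(dec == plain)      # the round trip restores the data
print(hex(final_key))    # key state after 40 words
```

As a sanity check on the real run: the final key reported in the next script, 0xf000, equals 15 << 12, which is consistent with a reseed to 15 followed by twelve more doublings.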

And here's the code for steps 3 and 4:

import struct

with open("step2.bin", "rb") as f:
  d = f.read()

start = 0x1830  # This is the list start from the header.

# This is the loop in main + (set) function translated to Python.
# Each node is two big-endian ints: a pointer to the value and a pointer to
# the next node (both are byte offsets into the decrypted memory).
off = start
values = []
for i in range(5):
  v_ptr, next_ptr = struct.unpack(">II", d[off:off+8])
  v = struct.unpack(">I", d[v_ptr:v_ptr+4])[0]
  values.append(v)
  print(hex(v_ptr), hex(v), hex(next_ptr))
  off = next_ptr

print(values)

# And this is the decryption of the values in (main) function.

"""
GetLocalP 6
SetLocal 14
GetLocal 5
Push 936175996
Add
SetLocal 15
"""
local_5 = 0xf000  # Final key from previous step.
local_15 = local_5 + 936175996  # 0x37cce97c
print("key", hex(local_15))

"""
GetLocal 14
Dup
pGet
GetLocal 15
Xor
pPeekSet
Drop

this has to be jctf  6a637466
"""
values[0] = values[0] ^ local_15


"""
GetLocal 14
GetImmP 1
Dup
pGet
GetLocal 15
Xor
pPeekSet
Drop
GetLocal 14
GetImmP 1
Dup
pGet
Push 7536640
Or
pPeekSet
Drop
"""
values[1] = (values[1] ^ local_15) | 0x730000

"""
GetLocal 14
GetImmP 2
Dup
pGet
GetLocal 15
Xor
pPeekSet
Drop
GetLocal 14
"""
values[2] = values[2] ^ local_15

"""
GetImmP 3
Dup
pGet
GetLocal 15
Xor
pPeekSet
Drop
GetLocal 14
GetImmP 3
Dup
pGet
Push 28416
Or
pPeekSet
Drop
"""
values[3] = (values[3] ^ local_15) | 0x6f00

"""
GetLocal 14
GetImmP 4
pGet
Push -16777216
And
Push 8192000
Or
GetLocal 14
GetImmP 4
pSet
GetLocalP 6
"""
values[4] = ((values[4] & 0xff000000) | 0x7d0000)

# And printing out the values and the flag.
o = ""
for i in range(5):
  hv = hex(values[i])[2:]
  print(hv, b''.fromhex(hv))
  o += hv

f = b''.fromhex(o)
print(f)
print(len(f))

# b'jctf{stickie_bomb}\x00\x00'
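A slightly more defensive variant of the list walk and the final printing could look like the sketch below. It is not what I ran during the CTF: the helper names and the synthetic memory are invented for illustration. The main difference is packing the words with `struct.pack` instead of concatenating hex strings, which drops leading zero bytes (or fails outright on an odd number of hex digits).

```python
import struct


def to_u32(value: int) -> int:
    """Reinterpret a possibly negative int as an unsigned 32-bit word."""
    return value & 0xFFFFFFFF


def walk_list(mem, start: int, count: int) -> list:
    """Follow (value_ptr, next_ptr) nodes; both are big-endian byte offsets."""
    values, off = [], start
    for _ in range(count):
        value_ptr, next_ptr = struct.unpack_from(">II", mem, off)
        values.append(struct.unpack_from(">I", mem, value_ptr)[0])
        off = next_ptr
    return values


def words_to_bytes(words) -> bytes:
    """Pack words as big-endian 32-bit ints, keeping leading zero bytes."""
    return struct.pack(f">{len(words)}I", *words)


# Synthetic memory, made up for illustration: two nodes and two values.
mem = bytearray(64)
struct.pack_into(">II", mem, 0, 32, 8)   # node at 0: value at 32, next node at 8
struct.pack_into(">II", mem, 8, 36, 0)   # node at 8: value at 36, next unused
struct.pack_into(">II", mem, 32, 0x6A637466, 0x007D0000)

words = walk_list(mem, 0, 2)
print(words_to_bytes(words))        # second word keeps its leading zero byte
print(hex(to_u32(-16777216)))       # the "Push -16777216" mask as unsigned
```

The `to_u32` helper is there because the assembly pushes the mask as a signed value (-16777216), which corresponds to 0xff000000 when treated as an unsigned 32-bit int.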

I didn't get the emulator fully running, mostly because the provided opcode description didn't make it clear whether memory is treated as a list of ints or a list of bytes, and apparently some instructions used one interpretation and some the other. After finding the SC-CL source, and especially https://github.com/NativeFunction/SC-CL/blob/master/bin/include/intrinsics.h, it became a bit clearer to me, but I decided not to rework the emulator and just did steps 2-4 in Python.
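To illustrate the ambiguity: in my emulator `GetImmP` adds its operand directly to a word index, while `Add` on a pointer divides the int by 4 first, i.e. treats it as a byte offset. The sketch below shows the two views on the same buffer; the function names and the sample data are made up, and this says nothing authoritative about how the real VM defines each opcode.

```python
import struct


def read_by_word_index(mem: bytes, index: int) -> int:
    """Treat memory as an array of big-endian 32-bit ints."""
    return struct.unpack_from(">I", mem, index * 4)[0]


def read_by_byte_offset(mem: bytes, offset: int) -> int:
    """Treat memory as raw bytes; only 4-byte aligned offsets are accepted."""
    if offset % 4:
        raise ValueError("unaligned offset")
    return struct.unpack_from(">I", mem, offset)[0]


# Made-up sample memory holding four words.
mem = struct.pack(">4I", 10, 20, 30, 40)

# Word index 2 and byte offset 8 name the same cell.
print(read_by_word_index(mem, 2), read_by_byte_offset(mem, 8))
```

Mixing the two up (for example, using a byte offset as a word index) reads the wrong cell or runs off the end of the buffer, which is roughly what kept breaking in the emulator.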

All in all it was a fun challenge, though I probably should have spent more time at the start trying to find out what architecture it was, and maybe looking for an existing emulator.

Anyway, in the end it worked, and after asking the admins to fix the flag in their system 🙃 I got first blood on it! :)

-- Gynvael

import struct
import sys


class Pointer:
    def __init__(self, what, idx, name=""):
        self.what = what
        self.idx = idx
        self.name = name

    def get(self):
        return self.what[self.idx]

    def set(self, v):
        self.what[self.idx] = v

    def __repr__(self):
        return f"Pointer[{self.name},{self.idx}]"


class VM:
    def __init__(self):
        self.stack = []
        #self.memory = bytearray(16 * 1024) # 16KB memory
        #with open('memory.bin', 'rb') as f:
        #    self.memory = bytearray(f.read())
        self.labels = {}
        self.call_stack = []
        self.call_stack_locals = []
        self.locals = []

    def execute(self, parsed_instructions):
        for i, (inst, params) in enumerate(parsed_instructions):
            if inst.startswith(':'):
                label = inst[1:]
                self.labels[label] = i
                print(f"Label {label} at {i}")

        self.pc = self.labels["EntryPoint"]
        while True:
            instruction, params = parsed_instructions[self.pc]
            #print(f"----- [{self.call_stack}] {instruction}, {params}")
            #print("Locals:", self.locals)
            #print("Stack:", self.stack)
            if instruction.startswith(':'):
                self.pc += 1
                continue
            f = getattr(self, instruction)
            if f is None:
                print("MISSING", f)
            old_pc = self.pc
            f(*params)
            if self.pc == old_pc:
                self.pc += 1

    # Instruction methods
    def Call(self, *params):
        dumpstatic()
        # Assuming the implementation of the function call will be done
        jump_pos = self.labels[params[0][1:]]
        self.call_stack.append(self.pc + 1)
        self.pc = jump_pos
        pass

    def pSet(self, *params):
        ptr = self.stack.pop()
        value = self.stack.pop()
        ptr.what[ptr.idx] = value

    def Or(self):
        a = self.stack.pop()
        b = self.stack.pop()
        self.stack.append(a | b)

    def ToStack(self, *params):
        ptr = self.stack.pop()
        num_items = self.stack.pop()
        # ORDER HERE MIGHT BE WRONG ------------------------------------------
        for i in range(num_items):
            print(ptr, "Actual idx:", ptr.idx + i, "Value:", ptr.what[ptr.idx + i])
            self.stack.append(ptr.what[ptr.idx + i])

    def Add(self):
        a = self.stack.pop()
        b = self.stack.pop()
        if type(a) is Pointer and type(b) is int:
            #print("POINTER ADDITION", a, b)
            self.stack.append(Pointer(a.what, a.idx + b // 4, a.name))
            return
        if type(b) is Pointer and type(a) is int:
            #print("POINTER ADDITION", a, b)
            self.stack.append(Pointer(b.what, b.idx + a // 4, b.name))
            return
        if type(a) is int and type(b) is int:
            self.stack.append((a + b) & 0xffffffff)
            return
        sys.exit("Two pointers being added???")

    def Xor(self):
        a = self.stack.pop()
        b = self.stack.pop()
        self.stack.append(a ^ b)

    def Return(self, *params):
        if not self.call_stack:
            sys.exit("Execution finished")
        self.pc = self.call_stack.pop()
        self.locals = self.call_stack_locals.pop()

    def SetLocal(self, *params):
        value, idx = self.stack.pop(), int(params[0])
        self.locals[idx] = value

    def SetDefaultStatic(self, *params):
        idx, value = int(params[0]), int(params[1])
        self.memory[idx] = value

    def Push(self, value):
        self.stack.append(int(value) & 0xffffffff)
        #self.stack.append(int(value))

    def GetImmP(self, *params):
        ptr, idx = self.stack.pop(), int(params[0])
        self.stack.append(Pointer(ptr.what, ptr.idx + idx, ptr.name))

    def FromStack(self, *params):
        ptr = self.stack.pop()
        num_items = self.stack.pop()
        items = [self.stack.pop() for _ in range(num_items)][::-1]
        for i in range(num_items):
            ptr.what[ptr.idx + i] = items[i]
            print(ptr, "Actual idx:", ptr.idx + i, "Value:", ptr.what[ptr.idx + i])

    def Function(self, *params):
        args_count, var_count = int(params[0]), int(params[1])
        prev_locals = self.locals
        self.call_stack_locals.append(prev_locals)
        self.locals = [0] * var_count
        items = []
        for i in range(args_count):
            items.append(self.stack.pop())
        items = items[::-1]
        for i in range(args_count):
            self.locals[i] = items[i]
            print(f" Arg {i} => {self.locals[i]}")

    def Dup(self):
        self.stack.append(self.stack[-1])

    def AddImm(self, *params):
        value = int(params[0])
        self.stack[-1] += value

    def Not(self):
        self.stack[-1] = int(not self.stack[-1])

    def Mult(self):
        a = self.stack.pop()
        b = self.stack.pop()
        self.stack.append((a * b) & 0xffffffff)

    def pGet(self):
        ptr = self.stack.pop()
        self.stack.append(ptr.get())

    def SetStaticsCount(self, *params):
        value = int(params[0])
        # Assuming the implementation of setting the max statics will be done

    def Sub(self):
        a = self.stack.pop()
        b = self.stack.pop()
        if type(b) is Pointer and type(a) is int:
            self.stack.append(Pointer(b.what, b.idx - a // 4, b.name))
            return
        if type(a) is int and type(b) is int:
            self.stack.append(b - a)  # not sure if this shouldn't be unsigned
            return
        sys.exit("unsupported sub types")

    def CmpLT(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(int(a < b))

    def CmpGE(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(int(a >= b))

    def CmpLE(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(int(a <= b))

    def CmpEQ(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(int(a == b))

    def pPeekSet(self):
        val = self.stack.pop()
        ptr = self.stack.pop()
        ptr.what[ptr.idx] = val
        self.stack.append(val)

    def Drop(self):
        self.stack.pop()

    def GetLocalP(self, *params):
        idx = int(params[0])
        self.stack.append(Pointer(self.locals, idx, "locals"))

    def JumpFalse(self, *params):
        tos = self.stack.pop()
        if not tos:
            #print("TAKEN")
            self.pc = self.labels[params[0][1:]]
        else:
            pass
            #print("not taken")
        # Assuming the implementation of the jump will be done

    def GetImmPs(self):
        idx = self.stack.pop()
        ptr = self.stack.pop()
        self.stack.append(Pointer(ptr.what, ptr.idx + idx, ptr.name))

    def GetLocal(self, *params):
        idx = int(params[0])
        self.stack.append(self.locals[idx])

    # MISSING SetSignature
    # Assuming the implementation of the SetSignature will be done

    def GetStatic(self, *params):
        self.stack.append(Pointer(static_0, 0, "static"))
        #idx = int(params[0])
        #return self.memory
        #self.stack.append(struct.unpack("I", self.memory[idx:idx+4])[0])

    def MultImm(self, *params):
        value = int(params[0])
        self.stack[-1] *= value

    def Mod(self):
        a = self.stack.pop()
        b = self.stack.pop()
        self.stack.append(b % a)

    def PushB3(self, *params):
        for p in params:
            self.stack.append(int(p))

    def CallNative(self, *params):
        # Assuming the implementation of the native function call will be done
        print("CALL_NATIVE:", params)
        #sys.exit("CallNative")

    def And(self):
        a = self.stack.pop()
        b = self.stack.pop()
        self.stack.append(a & b)

    def Div(self):
        b = self.stack.pop()
        a = self.stack.pop()
        if b == 0:
            raise ZeroDivisionError("division by zero")
        #print(f"{a} // {b} --> {a // b}")
        self.stack.append(a // b)

    def PushString(self, string):
        ptr = len(self.memory)  # Assuming we append strings to the end of the memory
        self.memory.extend(string.encode() + b'\0')
        self.stack.append(ptr)


def parse_assembly_code(assembly_code):
    parsed_instructions = []
    for line in assembly_code.split('\n'):
        line = line.strip()
        line = line.split('//')[0].strip()
        # Ignore comments and empty lines
        if line.startswith('//') or not line:
            continue
        # Split instruction and parameters
        parts = line.split()
        instruction = parts[0]
        parameters = parts[1:]
        # Save the parsed instruction and parameters
        parsed_instructions.append((instruction, parameters))
    return parsed_instructions


def read_assembly_code(file_name):
    with open(file_name, 'r') as f:
        assembly_code = f.read()
    return assembly_code


def dumpstatic():
    global static_0
    with open("dump.static_0", "wb") as f:
        #for i in static_0:
            #print(i)
        #    f.write(struct.pack(">I", i))
        f.write(struct.pack(f">{len(static_0)}I", *static_0))


def main():
    with open("memory.bin", "rb") as f:
        d = f.read()
    global static_0
    static_0 = list(struct.unpack(f">{len(d)//4}I", d))
    # Make the memory bigger by a bit.
    """
    for i in range(0x1000):
        static_0.append(0)
    """
    file_name = 'ctf.xsa'
    assembly_code = read_assembly_code(file_name)
    parsed_instructions = parse_assembly_code(assembly_code)
    vm = VM()
    vm.execute(parsed_instructions)
    """
    instr_set = set()
    for instruction, parameters in parsed_instructions:
        #print(instruction, parameters)
        if not instruction.startswith(':'):
            instr_set.add(instruction)
    for i in list(instr_set):
        print(i)
    """


if __name__ == '__main__':
    main()