@meithecatte
Created August 29, 2018 00:10
oxfoo1m3 crackme writeup

oxfoo1m3 is a relatively simple crackme with elements of anti-debugging, anti-disassembly, and, as the author put it, anti-libbfd.

I created a new Vagrant virtual machine, and after a bit of fiddling with shared folders, ran the binary:

vagrant@debian9:/vagrant/oxfoo1m3$ ./oxfoo1m3
oxfoo1m3 started ;]
3nt4 p455w0rD:
ABCDABCDABCD

[1]+  Stopped                 ./oxfoo1m3
vagrant@debian9:/vagrant/oxfoo1m3$ D
-bash: D: command not found

We can see that the binary reads 11 characters (the twelfth, D, is left over for the shell) and then, since they look nothing like the password, sends a SIGSTOP to itself. I decided to run it under strace, and to my surprise...

vagrant@debian9:/vagrant/oxfoo1m3$ strace ./oxfoo1m3
execve("./oxfoo1m3", ["./oxfoo1m3"], [/* 18 vars */]) = 0
strace: [ Process PID=4467 runs in 32 bit mode. ]
ptrace(PTRACE_TRACEME)                  = -1 EPERM (Operation not permitted)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xf001} ---
+++ killed by SIGSEGV +++
Segmentation fault

... it segfaults. We can see that the si_addr spells out "fool" in hexspeak, so this is definitely intentional. We can also see that the first thing it does is a ptrace(PTRACE_TRACEME). This is a simple anti-debugging technique - when a debugger spawns its debuggee, it inserts this call between the fork and execve, to allow for debugging regardless of the security settings. Since a process that is already being traced cannot use PTRACE_TRACEME again, this call, when made by the program itself, fails with EPERM whenever the program runs under a debugger (or strace).

This also explains the SIGSTOP - when a signal is delivered to a process that is being traced, the process is stopped and the parent is notified.

I decided to objdump the file, only to find another carefully planted obstacle:

vagrant@debian9:/vagrant/oxfoo1m3$ objdump -x oxfoo1m3
objdump: oxfoo1m3: File format not recognized

Since Linux only reads the ELF header and the program header table, the creator of an ELF file can put whatever garbage they desire in the section header table, and the binary will still execute flawlessly, while thwarting every tool that trusts the section headers.

I opened the binary in Hopper, which, being a tool designed partly with malware analysis in mind, didn't surrender just because a nonessential header was corrupted. However, only two instructions were identified as code:

        ; ================ B E G I N N I N G   O F   P R O C E D U R E ================


             EntryPoint:
08048080         call       EntryPoint+6
08048085         jmp        0x13c701e4
                        ; endp

As you can see, the jump points to a nonsense address, and the call destination is in the middle of the jump. This is a very simple anti-disassembly technique - jumping into the operand bytes of instructions that are never executed. I marked the jump as data, and the subsequent "offset" bytes as code, to make the instructions that are actually executed display properly. This continued for a while, and then I realised that any readable version of this code would have to use assembler macros. I decided to convert the hexdump to a nasm source file and modify it until it made sense, while making sure it still assembled to the same binary, somewhat like what the guys from pret did.
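
To make the overlap concrete, here is a minimal Python sketch that computes both branch targets by hand. The ten bytes are an assumption: they are reconstructed from the Hopper listing above and from the ip2edx macro shown later, not dumped from the binary, and `rel32_target` is a hypothetical helper.

```python
import struct

BASE = 0x08048080  # address of EntryPoint in the Hopper listing

# Assumed bytes: call rel32 (e8 ...), then what a linear disassembler
# takes for jmp rel32 (e9 ...). Reconstructed, not copied from the file.
code = bytes.fromhex("e801000000" "e95a81c20b")

def rel32_target(addr, insn):
    """Target of a 5-byte call/jmp rel32 instruction located at addr."""
    (rel,) = struct.unpack_from("<i", insn, 1)
    return (addr + 5 + rel) & 0xFFFFFFFF

call_target = rel32_target(BASE, code[0:5])
jmp_target = rel32_target(BASE + 5, code[5:10])

# The call lands one byte into the "jmp": the e9 opcode is skipped and
# its operand bytes are executed as instructions (0x5a is pop edx).
print(hex(call_target), hex(jmp_target), hex(code[call_target - BASE]))
```

Under these assumptions the bogus jmp operand is just the encoding of `pop edx` followed by the start of `add edx, imm32`, which is why the displayed jump target looks random.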

First, I tried reproducing the ELF header with nasm, but after realising that a linker would be necessary, I decided to just spell out the header by hand, using the Wikipedia page on ELF as a reference:

bits 32
org 0x08048000

%define __NR_read 3
%define __NR_ptrace 26

elf_header:
	db 0x7f, 'ELF'
	db 1 ; 32-bit
	db 1 ; little endian
	db 1 ; ELF version
	db 0 ; System V ABI
	db 0 ; ABI version
	times 7 db 0 ; padding
	dw 2 ; ET_EXEC
	dw 3 ; x86
	dd 1 ; ELF version, again
	dd start
	dd program_header - $$
	dd section_headers - $$
	dd 0 ; architecture-specific flags
	dw program_header - $$ ; ELF header size
	dw program_header_end - program_header ; program header size
	dw 1 ; program header count
	dw 0x28 ; section header size
	dw 4 ; section header count
	dw 3 ; section name header index

program_header:
	dd 1 ; PT_LOAD
	dd 0 ; offset in file
	dd elf_header ; load address
	dd elf_header ; physical address (unused)
	dd code_end - $$ ; size on disk
	dd code_end - $$ ; size in memory
	dd 0b111 ; flags - rwx
	dd 0x1000 ; alignment

program_header_end:

	times 0x80 - ($ - $$) db 0

start:
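
As a cross-check of the field layout, the same 52-byte header can be packed and parsed with Python's struct module. This is only a sketch: the values are copied from the listing above, `e_shoff = 0xc50` is an assumption based on the annotated hexdump further down, and `parse_elf32_header` is a hypothetical helper name.

```python
import struct

# ELF32 header: e_ident[16] followed by e_type .. e_shstrndx.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
          "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
          "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf32_header(blob):
    """Unpack the first 52 bytes of blob into a field dictionary."""
    return dict(zip(FIELDS, ELF32_EHDR.unpack_from(blob)))

# Magic, class, endianness, version, ABI, ABI version, 7 bytes of padding.
ident = b"\x7fELF" + bytes([1, 1, 1, 0, 0]) + bytes(7)

# e_shoff (0xC50) is assumed; everything else follows the nasm listing.
header = ELF32_EHDR.pack(ident, 2, 3, 1, 0x08048080, 0x34, 0xC50, 0,
                         0x34, 0x20, 1, 0x28, 4, 3)
hdr = parse_elf32_header(header)

# Fields that locate the entry point and the program header table.
print({k: hex(hdr[k]) for k in ("e_entry", "e_phoff", "e_phentsize", "e_phnum")})
# Fields that only describe the section header table.
print({k: hex(hdr[k]) for k in ("e_shoff", "e_shentsize", "e_shnum", "e_shstrndx")})
```

The second group is the part that can hold garbage without affecting execution, which is what objdump tripped over.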

I then added all the assembly code as raw bytes using this ad-hoc sed monstrosity:

vagrant@debian9:/vagrant/oxfoo1m3$ xxd -s 0x80 oxfoo1m3 | sed 's/^[0-9a-f]*: /\tdb /g;s/\([0-9a-f]\{2\}\)\([0-9a-f]\{2\}\)/0x\1, 0x\2,/g;s/,  .\{16\}//g' >> test.s
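
For readers who would rather not decipher the sed expression, a rough Python equivalent might look like the sketch below. It is not what I actually ran; `to_db_lines` is a hypothetical helper, and unlike the sed version it also handles a final line shorter than 16 bytes.

```python
def to_db_lines(data, width=16):
    """Render bytes as nasm `db` lines, `width` bytes per line."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        lines.append("\tdb " + ", ".join(f"0x{b:02x}" for b in chunk))
    return lines

# Small demonstration on made-up sample bytes.
sample = bytes([0xE8, 0x01, 0x00, 0x00, 0x00, 0xE9])
for line in to_db_lines(sample, width=4):
    print(line)
```

To reproduce the original step, one would read the binary, skip the first 0x80 bytes, and append the resulting lines to test.s.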

Then, I slowly identified the elements of the code by cross-referencing my manual disassembly with Hopper. The most basic building block is what I called ip2edx, which does exactly what you would expect - it copies EIP to EDX:

%macro ip2edx 0
	call %%calldest
%%startedx:
	db 0xe9 ; jmp, anti-re
%%calldest:
	pop edx
	add edx, strict dword %%end - %%startedx
	push edx
	ret
	db 0xe9 ; jmp, anti-re
%%end:
%endmacro

This is then used in mcall, which, apart from its code size and the fact that it clobbers EDX, behaves just like a normal call:

%macro mcall 1
	ip2edx
	add edx, strict dword %%code - $
	push edx
	push %1
	ret
	db 0xe8 ; call, anti-re
%%code:
%endmacro
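
The claim that mcall behaves like a call can be checked with a small stack model in Python. This is a sketch under assumptions: the instruction sizes (5 bytes for `call rel32`, 6 for `add edx, imm32`, 5 for `push imm32`, 1 for everything else) are my reading of the macros, and `model_mcall` is a hypothetical name.

```python
def model_mcall(start, target):
    """Model the stack effects of `mcall target` expanded at address `start`."""
    stack = []

    # ip2edx
    stack.append(start + 5)   # call pushes the address of %%startedx
    edx = stack.pop()         # pop edx
    edx += 11                 # add edx, %%end - %%startedx (assumed 11 bytes)
    stack.append(edx)         # push edx
    eip = stack.pop()         # ret: continue at %%end, skipping the junk byte

    # rest of mcall; eip now points at the `add edx` instruction ($)
    edx += 14                 # add edx, %%code - $ (assumed 14 bytes)
    stack.append(edx)         # push edx: the return address (%%code)
    stack.append(target)      # push %1
    eip = stack.pop()         # ret: jump to the target
    return eip, stack, edx

eip, stack, edx = model_mcall(0x08048080, 0x08048100)
print(hex(eip), [hex(x) for x in stack], hex(edx))
```

Execution ends up at the target with a single return address on the stack, pointing just past the macro, which is exactly the state a plain `call` would leave behind, except that EDX has been overwritten.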

Finally, a very common function used with mcall was _nop:

_nop:
	ret

... so I wrapped that in a macro called kdx (for Kill eDX):

%macro kdx 0
	mcall _nop
%endmacro

This made it possible to analyze the first part of the code:

start:
	mcall dexor_code
	mcall antidebug
	jmp strict near dexored_entry
	db 0xe8
dexor_code:
	mov esi, xor_begin
	jmp strict near dexor
_nop:
	ret
	db 0xe8
dexor:
	mov edi, esi
	kdx
	cld
	kdx
	mov ecx, xor_end - xor_begin
	kdx
	mov al, [unkatend] ; dead read
.loop:
	lodsb
	kdx
	xor al, 0x58
	kdx
	stosb
	kdx
	loop .loop
	ret

As you can see, the kdx macro is used between every pair of instructions, which makes the expanded code pretty confusing, to say the least. In comparison, a quick glance at this snippet reveals that this is a classic static-key xor decoder. I looked at the hexdump, and, indeed, there were a lot of 0x58 bytes in it. I also noticed some data past the end of the decoded range that looked like it was xored with 0x58, even though the length parameter excluded it from decoding:

00000c10: 9bb1 39c5 95d8 c858 0c30 3d78 163d 2c2f  ..9....X.0=x.=,/ ; decoding ends at 0c16
00000c20: 313c 3d78 192b 2b3d 353a 343d 2a78 6876  1<=x.++=5:4=*xhv
00000c30: 6160 766b 6058 5876 2b30 2b2c 2a2c 393a  a`vk`XXv+0+,*,9:
00000c40: 5876 2c3d 202c 5876 3b37 3535 3d36 2c58  Xv,= ,Xv;755=6,X
00000c50: 5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX ; the elf header claims section headers are here
00000c60: 5858 5858 5858 5858 5858 5858 5858 5858  XXXXXXXXXXXXXXXX
00000c70: 5858 5858 5858 5858 5358 5858 5958 5858  XXXXXXXXSXXXYXXX
00000c80: 5e58 5858 d8d8 5c50 d858 5858 cf53 5858  ^XXX..\P.XXX.SXX
00000c90: 5858 5858 5858 5858 4858 5858 5858 5858  XXXXXXXXHXXXXXXX
00000ca0: 4958 5858 5958 5858 0000 0000 0000 0000  IXXXYXXX........ ; the actual end seems to be at 0ca8
00000cb0: 170c 0000 1f00 0000 0000 0000 0000 0000  ................
00000cc0: 0100 0000 0000 0000 0100 0000 0300 0000  ................
00000cd0: 0000 0000 0000 0000 360c 0000 1a00 0000  ........6.......
00000ce0: 0000 0000 0000 0000 0100 0000 0000 0000  ................
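
The hunch is easy to test by hand before touching the whole file. The sketch below XORs the 29 bytes at offsets 0x0c18 to 0x0c34, copied from the hexdump above, with 0x58. The variable names are mine.

```python
KEY = 0x58

# Bytes copied from the hexdump rows above, just past where the
# assembly decoder stops.
tail = bytes.fromhex(
    "0c303d78163d2c2f"                  # 0x0c18 .. 0x0c1f
    "313c3d78192b2b3d353a343d2a786876"  # 0x0c20 .. 0x0c2f
    "6160766b60"                        # 0x0c30 .. 0x0c34
)

decoded = bytes(b ^ KEY for b in tail)
print(decoded)  # b'The Netwide Assembler 0.98.38'
```

Readable text falling out of the supposedly unencoded region is a good sign that the data past 0x0c16 was encoded with the same key.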

Trusting my gut feeling, I decided to ignore what the code was saying and wrote a Python script that decoded a few more bytes than the assembly decoder:

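with open("oxfoo1m3", "rb") as f:
    data = list(f.read())

KEY = 0x155    # file offset of the decoder's key byte (0x58)
START = 0x196  # file offset of the first encoded byte
STOP = 0xca8   # one past the last byte to decode (the assembly stops at 0xc16)

# XOR is its own inverse, so applying the key once more decodes the range.
data[START:STOP] = [x ^ data[KEY] for x in data[START:STOP]]
data[KEY] = 0  # with a zero key, the built-in decoder changes nothing

with open("oxfoo1m3-dexored", "wb") as f:
    f.write(bytes(data))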

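The script also sets the key byte in the output binary to zero. XOR with zero changes nothing, so the built-in decoder leaves the already-decoded bytes untouched and the binary behaves identically, at least in the part that matters. Surprisingly, this fixed the section headers and objdump started working fine: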

oxfoo1m3-dexored:     file format elf32-i386
oxfoo1m3-dexored
architecture: i386, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x08048080

Program Header:
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00000c17 memsz 0x00000c17 flags rwx

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000b97  08048080  08048080  00000080  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .comment      0000001f  00000000  00000000  00000c17  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
no symbols

There's even a comment left!

00000c17: 0054 6865 204e 6574 7769 6465 2041 7373  .The Netwide Ass
00000c27: 656d 626c 6572 2030 2e39 382e 3338 0000  embler 0.98.38..

To continue, I replaced the encoded blob in my source with the decoded bytes, and compared the output from the assembler with the decoded binary instead.

The encoded part of the code is constructed a bit differently. Namely, the obfuscation macros preserve all registers and flags:

%macro kn 0 ; for kill nothing
	pushfd
	pushad
	mcall ip2ecx
	add ecx, strict dword %%code - $
	push ecx
	ret
	db 0xe9 ; jmp, anti-re
%%code:
	popad
	popfd
%endmacro
...
ip2ecx:
	pop ecx
	push ecx
	ret

After labeling a bit more of the code, I identified the part responsible for checking the password (noop macros removed):

passworddata:
	db 'XXXXXXXXXXX'
	db 0x6d, 0x79, 0x6e, 0x65, 0x7b, 0x78, 0x74, 0x76, 0x66, 0x77, 0x7e
.end:

checkpassword:
	mov eax, __NR_read
	mov ebx, 0
	mov ecx, passworddata
	mov edx, 11
	int 0x80
	push eax
	pop edx
	mov esi, passworddata
	mov edi, passworddata ; unused
	mov ecx, 11
.loop:
	lodsb
	xor al, dl
	inc dl
	push ecx
	neg ecx
	add ecx, strict dword passworddata.end
	cmp al, [ecx] ; [passworddata.end - ecx], ecx goes backwards so this pointer goes forwards
	je .skip
	mcall fool
.skip:
	pop ecx
	loop .jmploop
...
.jmploop:
	jmp strict near .loop

This was enough to reconstruct the algorithm in Python and get the password:

orig = [0x6d, 0x79, 0x6e, 0x65, 0x7b, 0x78, 0x74, 0x76, 0x66, 0x77, 0x7e]
key = 11  # return value of read(): the number of bytes read
out = ''

for b in orig:
    out += chr(b ^ key)
    key += 1

print(out)
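
As a sanity check, the forward direction of the check can be modelled too. This is a sketch of my reading of checkpassword, not a byte-exact emulation: it ignores a failing read(), and `check_password` and `recover_password` are hypothetical names.

```python
ENCODED = bytes([0x6d, 0x79, 0x6e, 0x65, 0x7b, 0x78,
                 0x74, 0x76, 0x66, 0x77, 0x7e])

def check_password(candidate):
    """Model of checkpassword: True if no byte would trigger `fool`."""
    buf = bytearray(b"X" * 11)      # passworddata before the read
    data = candidate[:11]           # read() takes at most 11 bytes
    buf[:len(data)] = data
    dl = len(data) & 0xFF           # read()'s return value seeds DL
    for i in range(11):
        if (buf[i] ^ dl) != ENCODED[i]:
            return False            # the binary would call fool here
        dl = (dl + 1) & 0xFF        # inc dl
    return True

def recover_password():
    """Invert the check, assuming all 11 bytes are read (DL starts at 11)."""
    return bytes(b ^ (11 + i) for i, b in enumerate(ENCODED))

print(check_password(recover_password()))
print(check_password(b"ABCDABCDABC"))
```

One consequence of seeding the key with the return value of read() is that a shorter input shifts every key byte, so the recovered password only works when all 11 bytes arrive in a single read.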