@sunapi386
Last active October 26, 2019 01:15
stack smash with copy_elision and lying to c++ compiler
# About
smash.cpp attempts to create a case where the stack gets smashed when you lie to the compiler by claiming that your function returns void.
# Story
In the C++ world, copy_elision is a compiler optimization.
Compilers have been allowed to elide copies since C++98, and C++17 made elision mandatory when a function returns a prvalue. Without copy_elision, when you write code like this...
Person makePerson() {
Person p;
p.name = "Bob";
return p;
}
... the compiler creates a Person object in makePerson's stack frame, and that object is copied/moved to the caller's stack when makePerson finishes.
But with copy_elision, the compiler skips the move/copy: the caller passes the address of the storage for the result, and makePerson constructs its Person directly at that address. In other words, there is no move/copy.
As cppreference explains (https://en.cppreference.com/w/cpp/language/copy_elision):
The objects are constructed directly into the storage where they would otherwise be copied/moved to.
This optimization is usually unnoticed.
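To make the effect visible, here is a minimal sketch that counts copy and move constructor calls. The Person type with its counters and the helper copiesAndMovesForOneCall are assumptions made up for this illustration; they are not part of smash.cpp. With a C++17 compiler the count should be zero, because returning a prvalue constructs the object directly in the caller's storage. Older standards allow the same result but do not require it.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the Person type used above; the counters
// record how often a copy or move constructor actually runs.
struct Person {
    static int copies;
    static int moves;
    std::string name;

    Person() = default;
    Person(const Person& other) : name(other.name) { ++copies; }
    Person(Person&& other) noexcept : name(std::move(other.name)) { ++moves; }
};

int Person::copies = 0;
int Person::moves = 0;

// Returns a prvalue: since C++17 no copy or move is performed here.
Person makePerson() {
    return Person{};
}

// Constructs one Person via makePerson() and reports how many copy and
// move constructor calls that took.
int copiesAndMovesForOneCall() {
    Person::copies = 0;
    Person::moves = 0;
    Person p = makePerson();
    p.name = "Bob";
    return Person::copies + Person::moves;
}
```

Calling copiesAndMovesForOneCall() from main and printing the result is enough to try it out. Compiling with GCC's -fno-elide-constructors under an older standard such as -std=c++14 should show the copies/moves coming back.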
Great! So what? Why do I need to care about this?
Tl;dr: When you lie to the compiler.
As a general rule: if you lie to the compiler, there lie bugs, undefined behavior, and probably stack smashing.
Let me tell a story about using ROS.
ROS creates a layer of abstraction for inter-process communication (IPC) with sockets, in a publisher/subscriber model.
The ROS subscribe call is heavily overloaded, as in NodeHandle offers many different signatures for "subscribe".
http://docs.ros.org/lunar/api/roscpp/html/classros_1_1NodeHandle.html
Here's what happened.
I defined an Object and created a callback. E.g. Cards object and callback_add_card(Card newCard).
Important: The callback returned an int. E.g. "int callback_add_card".
This led me to make a mistake... We shall see why later.
I constructed the object and passed that object's callback to ROS subscribe.
But the ROS subscribe overload I used expected a pointer to a member function returning void.
I.e. void callback_add_card was acceptable but int callback_add_card was not.
So I lied to the compiler and cast the callback to a void-returning member function pointer. Subtle mistake. Won't show up until later.
#define callback_add_card_fn (void (Cards::*)(const Card&))
nodeHandle.subscribe("/cards/add", subscribeQueueSize, callback_add_card_fn &Cards::callback_add_card, &my_cards);
The compiler was happy and compiled, because the "#define callback_add_card_fn ..." cast told the compiler to treat the function as returning void.
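The reason a cast was needed at all can be checked at compile time. The sketch below is an illustration only: Card and Cards are hypothetical stand-ins, and the const Card& parameter is assumed from the macro above. It shows that the int-returning member function pointer does not convert implicitly to the void-returning type, so the compiler would have rejected the call without the cast.

```cpp
#include <type_traits>

struct Card {};

// Hypothetical stand-in for the Cards class from the story.
struct Cards {
    int callback_add_card(const Card&) { return 0; }
};

using IntCallback  = int  (Cards::*)(const Card&);
using VoidCallback = void (Cards::*)(const Card&);

// The real type of the member function is the int-returning one...
static_assert(std::is_same<decltype(&Cards::callback_add_card), IntCallback>::value,
              "callback_add_card returns int");

// ...and it does not convert implicitly to the void-returning type,
// which is why the code only compiled once the cast was added.
static_assert(!std::is_convertible<IntCallback, VoidCallback>::value,
              "no implicit conversion between the two pointer types");

// Same fact as a callable check, for use at run time.
constexpr bool castWasRequired() {
    return !std::is_convertible<IntCallback, VoidCallback>::value;
}
```

The cast silences this check rather than fixing the mismatch; calling through a pointer whose type differs from the function's real type is undefined behavior.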
Code mostly worked fine.
Because an int is small (typically 4 bytes) and on x86-64 comes back in a register (%eax) that the caller simply ignores, no bug was visible, even though the call was already undefined behavior.
Sometime later, I changed the return type.
From int callback_add_card to Card callback_add_card.
Because why not just return the Card instead? So I can write some test code and see if the Card is correct.
Well! Now we have big trouble.
The compiler was told this callback_add_card had a void return type. So the compiler did not allocate any space for a return value. And because of copy_elision, the compiler had generated code for callback_add_card that constructs the result "directly into the storage where they would otherwise be copied/moved to."
In other words, calling callback_add_card smashes the stack of the caller. Because the caller expected a void return value, no space was allocated. But the copy_elision code constructed the Card object in the caller's stack frame!
So how do we get around this? How to have a callback that returns some value, but still call the ROS subscribe properly?
Answer: By wrapping the call in a lambda, creating a closure. The lambda returns no value, so it's a void function. But the lambda can capture context. E.g.
nodeHandle.subscribe<Card>("/cards/add", subscribeQueueSize, [&my_cards](Card newCard) {
cout << "the card added is: " << my_cards.callback_add_card(newCard) << "\n";
});
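The same idea can be tried without ROS. In the sketch below, FakeNodeHandle, its subscribe/publish methods, and addOneCard are hypothetical stand-ins invented for illustration; they are not the roscpp API. The point is only that the value-returning callback is called inside a void lambda, so no cast is needed.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Card {
    std::string name;
};

// Hypothetical stand-in for the Cards class from the story.
struct Cards {
    std::vector<Card> cards;

    // The callback is free to return a value.
    Card callback_add_card(const Card& newCard) {
        cards.push_back(newCard);
        return newCard;
    }
};

// Hypothetical stand-in for a subscriber registry; this is NOT the roscpp API.
// It only accepts callbacks that return void.
struct FakeNodeHandle {
    std::vector<std::function<void(const Card&)>> subscribers;

    void subscribe(std::function<void(const Card&)> callback) {
        subscribers.push_back(std::move(callback));
    }

    void publish(const Card& card) {
        for (const auto& callback : subscribers) {
            callback(card);
        }
    }
};

// Registers a void lambda that forwards to the value-returning callback,
// delivers one message, and returns the name of the card that was recorded.
std::string addOneCard(const std::string& name) {
    Cards my_cards;
    FakeNodeHandle nodeHandle;
    std::string lastAdded;

    nodeHandle.subscribe([&my_cards, &lastAdded](const Card& newCard) {
        // The returned Card is consumed inside the lambda; nothing is
        // returned to the subscriber machinery.
        lastAdded = my_cards.callback_add_card(newCard).name;
    });

    nodeHandle.publish(Card{name});
    return lastAdded;
}
```

The lambda's own frame holds the returned Card, so the code that invokes the subscriber never has to know a return value exists.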

Compare the generated assembly for the Foo-returning my_func() (left) with the int-returning my_func() (right).

Foo::Foo() [base object constructor]:                         <
        pushq   %rbp                                          <
        movq    %rsp, %rbp                                    <
        movq    %rdi, -8(%rbp)                                <
        movq    -8(%rbp), %rax                                <
        pxor    %xmm0, %xmm0                                  <
        movss   %xmm0, (%rax)   ; 1.                          <
        nop                                                   <
        popq    %rbp                                          <
        ret                                                   <


my_func():                                                      my_func(): 
        pushq   %rbp                                                    pushq   %rbp       ; Save previous stack frame addr
        movq    %rsp, %rbp      ; 4.                                    movq    %rsp, %rbp ; Address of current stack frame as new base ptr
        subq    $16, %rsp  ; save 16 bytes for local data     |         movl    $0, %eax   ; move value 0 to function return register
        movq    %rdi, -8(%rbp)  ; 3.                          |         popq    %rbp       ; unwind the stack to exit function
        movq    -8(%rbp), %rax                                <         
        movq    %rax, %rdi      ; 2.                          <
        call    Foo::Foo() [complete object constructor]      <
        movq    -8(%rbp), %rax                                <
        leave                                                 <
        ret                                                             ret

SIGSEGV Analysis:

  1. movss %xmm0, (%rax)

This instruction caused the SIGSEGV. In gdb, p/x $rax shows 0x4004d6, which is the address of my_func(). The write therefore targets the code segment, not the stack. Code pages are mapped read/execute only, so the write faults. So why is %rax pointing there?

  2. movq %rax, %rdi This is where %rdi was last written before the constructor call, in my_func. It copies %rax, which was just loaded from -8(%rbp).

  3. movq %rdi, -8(%rbp) This stored whatever my_func's caller left in %rdi at %rbp - 8. indirection received the function pointer 0x4004d6 as its first argument in %rdi and never changed it, so that is the value Foo::Foo() ends up writing through.

Why was there no SIGSEGV when changing to int return value?

In the assembly code of int my_func(), besides the standard function setup/teardown, there is only movl $0, %eax. That instruction writes to a register rather than to memory, so it cannot fault.

Misc Info

Registers

%rbp : base pointer of current stack frame (called %ebp in 32 bit)
%rsp : stack pointer (top element)
%eax : return value of a function, 32 bit register
%rax : 64 bit register, same use as %eax
%rdi : 64 bit general purpose register (holds the first integer or pointer argument in the System V AMD64 ABI)

Instructions

movss : move scalar single precision floating point value (copies 32 lowest bits from a XMM 128 bit register)
pxor : logical exclusive or
movl : move long (32 bit)
movq : move quad word (64 bit)
pushq : push quad word onto the stack

Suffix

b : 8b aka byte
s : single 32b float
w : 16b
l : 32b int or 64b float
q : 64b
t : 80b, 10 bytes

Foo::Foo() [base object constructor]:
pushq %rbp
movq %rsp, %rbp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl $0, (%rax)
nop
popq %rbp
ret
my_func():
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, %rdi
call Foo::Foo() [complete object constructor]
movq -8(%rbp), %rax
leave
ret
indirection(void (*)()):
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
call *%rax
nop
leave
ret
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $my_func(), %edi
call indirection(void (*)())
movl $0, %eax
leave
ret
my_func():
pushq %rbp
movq %rsp, %rbp
movl $0, %eax
popq %rbp
ret
indirection(void (*)()):
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
call *%rax
nop
leave
ret
main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $my_func(), %edi
call indirection(void (*)())
movl $0, %eax
leave
ret
using void_return_func_ptr_no_args = void (*)();
struct Foo {
int x;
Foo() { x = 0; } // if you comment this out, it runs
~Foo() {} // and also, if you comment this out, it runs
};
Foo my_func() { return {}; }
void indirection(void_return_func_ptr_no_args callback) { callback(); }
int main(int argc, char const* argv[]) {
indirection(void_return_func_ptr_no_args(&my_func)); // boom!
return 0;
}
using void_return_func_ptr_no_args = void (*)();
struct Foo {
int x;
Foo() { x = 0; } // if you comment this out, it runs
~Foo() {} // and also, if you comment this out, it runs
};
int my_func() { return {}; } // notice this part is int! the only difference is here!
void indirection(void_return_func_ptr_no_args callback) { callback(); }
int main(int argc, char const* argv[]) {
indirection(void_return_func_ptr_no_args(&my_func)); // no boom here, but still undefined behavior
return 0;
}
sunapi386 commented Oct 25, 2019

$ gcc -std=c++11 -g3 smash-main.cpp
$ gdb a.out
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
(gdb) r
Starting program: /home/user/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x000000000040053a in Foo::Foo (this=0x4004d6 <my_func()>) at smash-main.cpp:5
5	  Foo() { x = 0; } // if you comment this out, it runs
(gdb) d
(gdb) disas
Dump of assembler code for function Foo::Foo():
   0x000000000040052a <+0>:	push   %rbp
   0x000000000040052b <+1>:	mov    %rsp,%rbp
   0x000000000040052e <+4>:	mov    %rdi,-0x8(%rbp)
   0x0000000000400532 <+8>:	mov    -0x8(%rbp),%rax
   0x0000000000400536 <+12>:	pxor   %xmm0,%xmm0
=> 0x000000000040053a <+16>:	movss  %xmm0,(%rax)
   0x000000000040053e <+20>:	nop
   0x000000000040053f <+21>:	pop    %rbp
   0x0000000000400540 <+22>:	retq   
End of assembler dump.

@sunapi386
Author

Here's what happens if you don't lie to the compiler, i.e. when the function pointer type matches my_func():

using void_return_func = void (*)();

struct Foo {
    int x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};

using foo_return_func = Foo (*)();

Foo my_func() { return {}; }
void indirection(foo_return_func callback) { callback(); }

int main(int argc, char const* argv[]) {
    indirection(foo_return_func (&my_func)); // no boom: the pointer type matches my_func
    return 0;
}

Notice these very important lines in indirection (honest Foo (*)() version on the left, void (*)() cast on the right)

        subq    $32, %rsp				      |	        subq    $16, %rsp
        movq    %rdi, -24(%rbp)				      |	        movq    %rdi, -8(%rbp)
        leaq    -4(%rbp), %rdx				      |	        movq    -8(%rbp), %rax
        movq    -24(%rbp), %rax				      <
        movq    %rdx, %rdi				      <

The extra instructions on the left reserve room on indirection's stack for the Foo object and pass its address to my_func() in %rdi. There was no change in my_func() code.
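If indirection's void (*)() parameter has to stay as it is, another way to remain honest is to pass a captureless lambda, which converts implicitly to a plain function pointer. This is a sketch built on the repro code; the g_calls counter and the callThroughWrapper helper are additions for illustration only.

```cpp
using void_return_func_ptr_no_args = void (*)();

struct Foo {
    float x;
    Foo() { x = 0; }
    ~Foo() {}
};

// Counts calls so the effect is observable without printing
// (an addition for this sketch, not part of the original repro).
int g_calls = 0;

Foo my_func() {
    ++g_calls;
    return {};
}

void indirection(void_return_func_ptr_no_args callback) { callback(); }

// A captureless lambda converts implicitly to void (*)(), so no cast is
// needed. The Foo temporary lives in the lambda's own stack frame.
int callThroughWrapper() {
    g_calls = 0;
    indirection([] { (void)my_func(); });
    return g_calls;
}
```

Here the compiler sees the real return type at the only place my_func() is called, so it sets up the storage for Foo itself.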

@sunapi386
Author

Stack Smashing Summary

Here's the minimal code you need to repro this.

using void_return_func_ptr_no_args = void (*)();
struct Foo {
    float x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};
Foo my_func() { return {}; }
void indirection(void_return_func_ptr_no_args callback) { callback(); }
int main(int argc, char const* argv[]) {
    indirection(void_return_func_ptr_no_args(&my_func)); // boom!
    return 0;
}

In gdb, I find the reason for this crash.

Dump of assembler code for function Foo::Foo():
   0x000000000040052a <+0>:	push   %rbp
   0x000000000040052b <+1>:	mov    %rsp,%rbp
   0x000000000040052e <+4>:	mov    %rdi,-0x8(%rbp)
   0x0000000000400532 <+8>:	mov    -0x8(%rbp),%rax
   0x0000000000400536 <+12>:	pxor   %xmm0,%xmm0
=> 0x000000000040053a <+16>:	movss  %xmm0,(%rax)
   0x000000000040053e <+20>:	nop
   0x000000000040053f <+21>:	pop    %rbp
   0x0000000000400540 <+22>:	retq   
End of assembler dump.

The (%rax) points to 0x4004d6, which is where my_func begins, so writing to it would overwrite a code page. The kernel's W^X protection raises SIGSEGV instead. The cause comes from the compiler, which doesn't reserve enough stack space in the indirection function for Foo to be written to. I confirmed the difference by looking at the compiler's assembly code.

subq $32, %rsp        | subq $16, %rsp
movq %rdi, -24(%rbp)  | movq %rdi, -8(%rbp) ; notice the -24 vs -8 offsets: 32 vs 16 bytes
leaq -4(%rbp), %rdx   | movq -8(%rbp), %rax ; are reserved for local variables
movq -24(%rbp), %rax  < 
movq %rdx, %rdi       < 

rraval commented Oct 26, 2019

"it's because the compiler didn't reserve any space on indirection's stack"

Not quite. It's actually an unfortunate re-use of the %rdi register between the Foo() constructor and the indirection function.

I've modified your code to simplify and focus on certain things. Notice that main() now invokes the Foo() default constructor directly, no function call or RVO involved.

using void_return_func_ptr_no_args = void (*)();

struct Foo {
    float x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};

Foo my_func() { return {}; }
void indirection(void_return_func_ptr_no_args callback) { callback(); }

int main() {
    Foo foo;
    indirection(void_return_func_ptr_no_args(&my_func)); // boom!
    return 0;
}

Let's look at the Foo() constructor first:

0x5555555551f6 <Foo::Foo()>             push   %rbp
0x5555555551f7 <Foo::Foo()+1>           mov    %rsp,%rbp
0x5555555551fa <Foo::Foo()+4>           mov    %rdi,-0x8(%rbp)  ; (1) Take whatever was in %rdi...
0x5555555551fe <Foo::Foo()+8>           mov    -0x8(%rbp),%rax  ; (1) ... and move it into %rax
0x555555555202 <Foo::Foo()+12>          pxor   %xmm0,%xmm0      ; Set %xmm0 to 0
0x555555555206 <Foo::Foo()+16>          movss  %xmm0,(%rax)     ; Set *%rax to 0, SEGV if %rax is bad pointer
0x55555555520a <Foo::Foo()+20>          nop
0x55555555520b <Foo::Foo()+21>          pop    %rbp
0x55555555520c <Foo::Foo()+22>          retq

Most importantly, note that whatever the caller had in %rdi is assumed to be a writable pointer!

This works fine for main() since it initializes %rdi to point to the stack before calling Foo():

0x555555555194 <main()+24>      lea    -0x1c(%rbp),%rax             ; Set %rax to the pointer 0x1c before %rbp
0x555555555198 <main()+28>      mov    %rax,%rdi                    ; ... and then move it to %rdi
0x55555555519b <main()+31>      callq  0x5555555551f6 <Foo::Foo()>  ; Now we invoke `Foo()` with `%rdi` as a proper pointer

Now let's look at how indirection handles its registers:

0x555555555167 <indirection(void (*)())>        push   %rbp
0x555555555168 <indirection(void (*)())+1>      mov    %rsp,%rbp
0x55555555516b <indirection(void (*)())+4>      sub    $0x10,%rsp
0x55555555516f <indirection(void (*)())+8>      mov    %rdi,-0x8(%rbp)  ; Move whatever the caller put in %rdi
0x555555555173 <indirection(void (*)())+12>     mov    -0x8(%rbp),%rax  ; ... into %rax
0x555555555177 <indirection(void (*)())+16>     callq  *%rax            ; And jump to it as a function

So indirection is following the AMD64 ABI where the caller passes pointer sized arguments directly in registers: https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI

The calling convention is followed on Linux, macOS. The first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, R9

So... the caller puts the first argument to indirection inside %rdi. indirection doesn't do any cleanup of %rdi; it simply copies the value into %rax and calls through it.

You see where this is going: in indirection's case, the first argument is a function pointer, and so will point to the code segment. indirection leaves %rdi as the function pointer. my_func(), which returns a Foo with a user-provided destructor, expects %rdi to hold the address where its caller wants the result built, so it hands %rdi to the Foo() constructor as this. The constructor then tries to write 0 to that address, which blows up.
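One way to see why the destructor matters: under the Itanium C++ ABI used by GCC and Clang on Linux, a class with a non-trivial copy constructor, move constructor, or destructor is returned through a hidden pointer that the caller passes in %rdi. The trait check below is a rough approximation of that rule. The helper returnedViaHiddenPointer and the two struct names are made up for this sketch; it ignores the "all copy/move constructors deleted" case and says nothing about large trivial types, which are also returned in memory.

```cpp
#include <type_traits>

// Rough approximation of the Itanium C++ ABI rule "non-trivial for the
// purposes of calls" (hypothetical helper, see the caveats above).
template <typename T>
constexpr bool returnedViaHiddenPointer() {
    return !std::is_trivially_copy_constructible<T>::value ||
           !std::is_trivially_move_constructible<T>::value ||
           !std::is_trivially_destructible<T>::value;
}

struct FooWithDestructor {
    float x;
    FooWithDestructor() { x = 0; }
    ~FooWithDestructor() {}  // user-provided, hence non-trivial
};

struct FooWithoutDestructor {
    float x;
    FooWithoutDestructor() { x = 0; }
};

static_assert(returnedViaHiddenPointer<FooWithDestructor>(),
              "expects a result address in %rdi");
static_assert(!returnedViaHiddenPointer<FooWithoutDestructor>(),
              "small trivial type: returned in a register");
static_assert(!returnedViaHiddenPointer<int>(),
              "int is returned in %eax");
```

This matches the int case seen earlier: an int-returning function never reads %rdi as a result address, so a stale function pointer left there does no harm.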

Here's how main() invokes indirection:

0x5555555551a0 <main()+36>      lea    -0x5e(%rip),%rdi        # 0x555555555149 <my_func()>
0x5555555551a7 <main()+43>      callq  0x555555555167 <indirection(void (*)())>

That's basically what the paragraph above said: we load an offset from the current instruction pointer into %rdi and then call indirection. That means %rdi is guaranteed to point to the non-writable code segment.

So, this is NOT about stack smashing! This is simply Foo() writing to whatever pointer is in %rdi, and indirection using %rdi as its first argument.

To prove this, here's a slight tweak to the code that puts a writable pointer as the first argument of indirection:

#include <cstdio>

using void_return_func_ptr_no_args = void (*)();

struct Foo {
    float x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};

Foo my_func() { return {}; }
void indirection(float *y, void_return_func_ptr_no_args callback) { callback(); }

int main() {
    float z = 1000;
    ::std::printf("%f\n", z);
    indirection(&z, void_return_func_ptr_no_args(&my_func)); // no boom!
    ::std::printf("%f\n", z);
    return 0;
}

This allocates a float z in main and then passes it as the first argument of indirection. When Foo() gets called, it receives whatever is still in %rdi, which is indirection's first argument (&z), and assigns 0 through it.

If you run this code, it should print:

1000.000000
0.000000
