sunapi386/analysis.md

## readme.txt
# About
smash.cpp attempts to create a case where the stack gets smashed when you lie to the compiler and say your function returns void.

# Story

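In the C++ world, copy elision is a compiler optimization.

Compilers have been permitted to elide copies since C++98, and since C++17 it is mandatory when a function returns a prvalue. Without copy elision, when you write code like this...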

Person makePerson() {
    Person p;
    p.name = "Bob";
    return p;
}
... the compiler creates a Person object in makePerson's stack frame, and that object is copied/moved to the caller's stack when makePerson finishes.

But with copy elision, the compiler skips the move/copy: makePerson constructs its Person directly in the storage the caller set aside for the return value. In other words, there is no move/copy.

As cppreference explains (https://en.cppreference.com/w/cpp/language/copy_elision):

"The objects are constructed directly into the storage where they would otherwise be copied/moved to."

This optimization usually goes unnoticed.
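
To make the elision visible, here is a minimal sketch. The counting Person type is hypothetical and exists only for illustration; it records how often its copy and move constructors run when makePerson returns a prvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type for illustration: counts how often it is copied or moved.
struct Person {
    static int copies;
    static int moves;
    std::string name;

    explicit Person(std::string n) : name(std::move(n)) {}
    Person(const Person& other) : name(other.name) { ++copies; }
    Person(Person&& other) noexcept : name(std::move(other.name)) { ++moves; }
};

int Person::copies = 0;
int Person::moves = 0;

// Returns a prvalue. Since C++17 the result must be constructed directly in
// the caller's storage, so neither the copy nor the move constructor runs.
Person makePerson() {
    return Person("Bob");
}
```

After `Person p = makePerson();`, both counters should still be zero when built as C++17 or later. Older standards allow the elision but do not require it, so a move may be counted there if elision is disabled.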

Great! So what? Why do I need to care about this?
Tl;dr: You need to care when you lie to the compiler.

As a general rule: if you lie to the compiler, there lie bugs, undefined behavior, and probably stack smashing.

Let me tell a story about using ROS.
ROS creates a layer of abstraction for inter-process communication (IPC) over sockets, using a publisher/subscriber model.

The ROS subscriber API is heavily overloaded: NodeHandle has many overloads of "subscribe".


http://docs.ros.org/lunar/api/roscpp/html/classros_1_1NodeHandle.html

Here's what happened.

I defined an object and created a callback, e.g. a Cards object and callback_add_card(Card newCard).

Important: The callback returned an int, e.g. "int callback_add_card".
This led me to make a mistake... We shall see why later.

I constructed the object and passed that object's callback to ROS subscribe.

But the ROS subscribe overload I used expected a pointer to a member function returning void.
I.e. void callback_add_card was acceptable but int callback_add_card was not.
So I lied to the compiler and cast the callback to a void-returning member function pointer. Subtle mistake. It didn't show up until later.

#define callback_add_card_fn (void (Cards::*)(const Card&))

nodeHandle.subscribe("/cards/add", subscribeQueueSize, callback_add_card_fn &Cards::callback_add_card, &my_cards);
The compiler was happy and compiled, because the "#define callback_add_card_fn ..." cast told the compiler to treat the function as returning void.
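
The mismatch that the cast hides can be checked at compile time. The sketch below assumes simplified stand-ins for Card and Cards (not the original classes) and shows that the callback's real pointer-to-member type is neither the same as, nor implicitly convertible to, the void-returning type. That is why only an explicit cast compiles, and why calling through the cast pointer is undefined behavior.

```cpp
#include <cassert>
#include <type_traits>

// Simplified stand-ins for the classes in the story (assumptions).
struct Card { int rank; };

struct Cards {
    int count = 0;
    // Returns int, as in the story.
    int callback_add_card(const Card&) { return ++count; }
};

// The pointer-to-member type the subscriber is assumed to expect.
using VoidCallback = void (Cards::*)(const Card&);
// The type the callback really has.
using ActualCallback = decltype(&Cards::callback_add_card);

static_assert(!std::is_same<ActualCallback, VoidCallback>::value,
              "the callback does not have the void-returning type");
static_assert(!std::is_convertible<ActualCallback, VoidCallback>::value,
              "no implicit conversion exists; only an explicit cast compiles");

constexpr bool callback_types_match() {
    return std::is_same<ActualCallback, VoidCallback>::value;
}

// Calling through the pointer's true type is always well defined.
int add_card_checked(Cards& cards, const Card& card) {
    ActualCallback cb = &Cards::callback_add_card;
    int n = (cards.*cb)(card);
    assert(n == cards.count);
    return n;
}
```

A static_assert like this next to the subscribe call would have flagged the problem before the cast was ever written.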

Code mostly worked fine.

Because an int is small and, on x86-64, comes back in a register (%eax) that the caller simply ignores, there appeared to be no bugs: the undefined behavior wasn't visible.

Sometime later, I changed the return type.
From int callback_add_card to Card callback_add_card.
Because why not just return the Card instead? So I can write some test code and see if the Card is correct.

Well! Now we have big trouble.

The caller was told that callback_add_card had a void return type, so it did not allocate any space for a return value. And because of copy elision, the compiler had generated code for callback_add_card that constructs the Card "directly into the storage where they would otherwise be copied/moved to."

In other words, calling callback_add_card smashes memory that belongs to the caller. The caller expected a void return value, so no space was allocated and no valid address for the result was passed in. But the copy elision code still constructed the Card object wherever it believed the caller's return storage to be!

So how do we get around this? How do we have a callback that returns a value, but still call the ROS subscribe properly?
Answer: By wrapping the call in a lambda, creating a closure. The lambda returns no value, so it's a void function, but it can still capture context. E.g.

nodeHandle.subscribe<Card>("/cards/add", subscribeQueueSize, [&my_cards](Card newCard) {
  cout << "the card added is: " << my_cards.callback_add_card(newCard) << "\n";
});
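
The same wrapping idea can be tried without ROS. In the sketch below, FakeBus is a hypothetical stand-in for a subscriber registry and is NOT the roscpp API; Card and Cards are simplified assumptions as well. The point is only that the void lambda adapts a value-returning callback without any cast.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the classes in the story (assumptions).
struct Card { std::string name; };

struct Cards {
    std::vector<Card> cards;
    // Returns the stored card so that test code can inspect it.
    Card callback_add_card(const Card& c) {
        cards.push_back(c);
        return c;
    }
};

// Hypothetical registry that only accepts void callbacks; not roscpp.
class FakeBus {
public:
    using Callback = std::function<void(const Card&)>;
    void subscribe(Callback cb) { callbacks_.push_back(std::move(cb)); }
    void publish(const Card& card) {
        for (const auto& cb : callbacks_) cb(card);
    }
private:
    std::vector<Callback> callbacks_;
};

// Registers the value-returning callback behind a void lambda.
// my_cards must outlive the bus, because the lambda captures it by reference.
void subscribe_add_card(FakeBus& bus, Cards& my_cards) {
    bus.subscribe([&my_cards](const Card& newCard) {
        Card added = my_cards.callback_add_card(newCard);
        assert(added.name == newCard.name);
    });
}
```

The return value is consumed inside the lambda, so the caller's view (void) and the callee's view (void) agree, and no storage for a Card is ever expected from the bus.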

## analysis.md

      
Looking at the difference between the generated assembly (left: Foo my_func(), right: int my_func()).
Foo::Foo() [base object constructor]:                         <
        pushq   %rbp                                          <
        movq    %rsp, %rbp                                    <
        movq    %rdi, -8(%rbp)                                <
        movq    -8(%rbp), %rax                                <
        pxor    %xmm0, %xmm0                                  <
        movss   %xmm0, (%rax)   ; 1.                          <
        nop                                                   <
        popq    %rbp                                          <
        ret                                                   <


my_func():                                                      my_func(): 
        pushq   %rbp                                                    pushq   %rbp       ; Save previous stack frame addr
        movq    %rsp, %rbp      ; 4.                                    movq    %rsp, %rbp ; Address of current stack frame as new base ptr
        subq    $16, %rsp  ; save 16 bytes for local data     |         movl    $0, %eax   ; move value 0 to function return register
        movq    %rdi, -8(%rbp)  ; 3.                          |         popq    %rbp       ; unwind the stack to exit function
        movq    -8(%rbp), %rax                                <         
        movq    %rax, %rdi      ; 2.                          <
        call    Foo::Foo() [complete object constructor]      <
        movq    -8(%rbp), %rax                                <
        leave                                                 <
        ret                                                             ret


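SIGSEGV Analysis:

1. movss   %xmm0, (%rax)

This instruction caused the SIGSEGV. Using gdb: p/x $rax shows 0x4004d6, which is the address of my_func().
The problem is an attempt to write into the code itself, not the stack. The pages holding code are mapped read/execute only, never writable.
So why is %rax pointing there?

2. movq    %rax, %rdi

This line in my_func copies %rax into %rdi, the register that carries the this pointer into Foo::Foo(). The constructor then loads it back into its own %rax. So what did %rax contain in my_func?

3. movq    %rdi, -8(%rbp)

%rax was loaded from -8(%rbp), where my_func had saved the %rdi it received on entry. my_func treats that %rdi as the address of the caller's storage for the returned Foo. But the caller, indirection, believed it was calling a void function and never set %rdi, so it still held the value main had put there: the function pointer argument, 0x4004d6.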

Why was there no SIGSEGV when changing to int return value?

The mov instructions are what write to memory. In the assembly code of int my_func(), besides the standard function setup/teardown,
there is only movl $0, %eax, which writes to a register rather than to memory, so it isn't going to cause a problem.
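
The difference between the two return types can be probed with type traits. The sketch below uses a rough heuristic, not the actual ABI rule: it assumes an Itanium-style C++ ABI on x86-64, where a type with a non-trivial copy/move constructor or destructor is returned through a hidden pointer to caller-provided memory, while small trivially copyable types typically come back in registers. The helper name roughly_register_returnable is made up for this example.

```cpp
#include <cassert>
#include <type_traits>

struct Foo {
    int x;
    Foo() { x = 0; }
    ~Foo() {}  // user-provided, so Foo is not trivially destructible
};

// Rough heuristic (an assumption, not the ABI's exact classification):
// small trivially copyable types are candidates for register return.
template <typename T>
constexpr bool roughly_register_returnable() {
    return std::is_trivially_copyable<T>::value && sizeof(T) <= 16;
}

static_assert(!std::is_trivially_destructible<Foo>::value,
              "the user-provided destructor makes Foo non-trivial");
```

Under this heuristic `roughly_register_returnable<int>()` is true and `roughly_register_returnable<Foo>()` is false, which matches the listings: int my_func() only touches %eax, while Foo my_func() writes through the pointer it finds in %rdi.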

Misc Info

Registers
%rbp : base pointer of the current stack frame (called %ebp in 32 bit)
%rsp : stack pointer (top of the stack)
%eax : return value of a function, 32 bit register
%rax : 64 bit register, same use as %eax
%rdi : 64 bit register; carries the first integer or pointer argument in the x86-64 System V calling convention

Instructions
movss : move scalar single precision floating point value (copies the lowest 32 bits of a 128 bit XMM register)
pxor  : bitwise exclusive or (pxor %xmm0, %xmm0 zeroes the register)
movl  : move long (32 bit)
movq  : move quad word (64 bit)
pushq : push quad word onto the stack

Suffix
b : 8b aka byte
s : single 32b float
w : 16b
l : 32b int or 64b float
q : 64b
t : 80b, 10 bytes

  
## foo.asm
Foo::Foo() [base object constructor]:
        pushq   %rbp
        movq    %rsp, %rbp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        movl    $0, (%rax)
        nop
        popq    %rbp
        ret
my_func():
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        movq    %rax, %rdi
        call    Foo::Foo() [complete object constructor]
        movq    -8(%rbp), %rax
        leave
        ret
indirection(void (*)()):
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        call    *%rax
        nop
        leave
        ret
main:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
        movl    $my_func(), %edi
        call    indirection(void (*)())
        movl    $0, %eax
        leave
        ret

## int.asm
my_func():
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $0, %eax
        popq    %rbp
        ret
indirection(void (*)()):
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rax
        call    *%rax
        nop
        leave
        ret
main:
        pushq   %rbp
        movq    %rsp, %rbp
        subq    $16, %rsp
        movl    %edi, -4(%rbp)
        movq    %rsi, -16(%rbp)
        movl    $my_func(), %edi
        call    indirection(void (*)())
        movl    $0, %eax
        leave
        ret

## smash-foo.cpp
using void_return_func_ptr_no_args = void (*)();

struct Foo {
    int x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};

Foo my_func() { return {}; }
void indirection(void_return_func_ptr_no_args callback) { callback(); }

int main(int argc, char const* argv[]) {
    indirection(void_return_func_ptr_no_args(&my_func)); // boom!
    return 0;
}

## smash-int.cpp
using void_return_func_ptr_no_args = void (*)();

struct Foo {
    int x;
    Foo() { x = 0; } // if you comment this out, it runs
    ~Foo() {} // and also, if you comment this out, it runs
};

int my_func() { return {}; } // notice this part is int! the only difference is here!
void indirection(void_return_func_ptr_no_args callback) { callback(); }

int main(int argc, char const* argv[]) {
    indirection(void_return_func_ptr_no_args(&my_func)); // no boom here, but still undefined behavior
    return 0;
}