iankronquist/c functions.md

## c functions.md

      
We're going to focus on 32-bit x86 because it's the easiest to understand. I'll
then explain how x86_64 is more efficient.

So the stack is where local variables are stored. It is just a chunk of memory
available to the program. The address of the top of the stack is stored in a
special register called the stack pointer, or esp on x86. The stack grows from
higher addresses to lower addresses, so putting a value on it makes esp smaller.
To put a value on the stack you can use the push instruction.
push 42  # Push the literal number 42 to the stack
push eax # Push the value in the eax register to the stack. The value of eax
         # does not change!
push esp # You should be able to guess what this does.

To take a value off the stack you can use the pop instruction.
pop eax # Take the value at the top of the stack and put it in the eax
        # register. Then add 4 bytes to the stack pointer. The previous value
        # of the eax register is destroyed.
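
If it helps to see push and pop outside of assembly, here is a minimal sketch in C of a toy stack that grows downward. Everything in it (the `Stack` struct, `stack_init`, `stack_push`, `stack_pop`, and the 64 byte size) is made up for illustration. It models the behavior described above, it is not how the hardware is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: a 64-byte stack made of 4-byte slots. */
#define STACK_BYTES 64

typedef struct {
    uint32_t mem[STACK_BYTES / 4]; /* the chunk of memory backing the stack */
    uint32_t esp;                  /* byte offset of the top of the stack */
} Stack;

/* An empty stack starts with esp just past the highest address. */
void stack_init(Stack *s) {
    s->esp = STACK_BYTES;
}

/* push: subtract 4 from esp, then store the value at the new top. */
void stack_push(Stack *s, uint32_t value) {
    assert(s->esp >= 4);           /* no room left in this toy model */
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* pop: read the value at the top, then add 4 to esp. */
uint32_t stack_pop(Stack *s) {
    assert(s->esp < STACK_BYTES);  /* nothing to pop */
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}
```

Pushing makes `esp` smaller and popping makes it bigger again, and values come back in the reverse of the order they went in.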

Now, to go somewhere else in the program you can use the jmp, or jump, instruction.
It's basically a goto. You could use this for calling functions, but you
wouldn't know where you came from. You can store the address of the code you
came from on the stack with push, so calling a function could be expressed like
this:
# Let some_function be the address of some function
push return_here    # Save the address of the code to run after the function
jmp  some_function  # Go directly to some_function and execute code from there.
return_here:        # Execution picks up here when some_function jumps back.

Fortunately x86 has an assembly instruction that does both of these in one
step. It's called call.
# Does the same thing as above.
call some_function

Now, once we are done executing, we need to get back to where we came from. We
can use a combination of pop and jmp.
call some_function

some_function:
	# work here
	pop eax
	jmp eax

Fortunately there is a single instruction which does pop and jmp at the same
time without overwriting any of the registers. It is called ret.

I need to introduce one more instruction really quick, called mov. It's like the
cp command for registers.
mov eax, ebx  # Move the value in ebx into eax

Now, how do we pass arguments to the function? On x86 using the cdecl calling
convention (the one the System V i386 ABI uses), arguments are passed on the
stack. Integer and pointer return values are stored in the eax register. So if
you have a function like this:
int func(int a, int b) {
	return a + b;
}

int main() {
	return func(1,2);
}

That could be compiled into this. Note that cdecl pushes the arguments from
right to left, so the first argument ends up closest to the top of the stack:
main:
	push 2           # Second argument, b, is pushed first
	push 1           # First argument, a, is pushed last
	call func
	add esp, 8       # The caller removes the two 4 byte arguments
	ret

func:
	mov eax, [esp+4] # Put the number 1, which is the first argument, into the
	                 # eax register. The brackets mean "get the value at this
	                 # memory address" just like dereferencing a pointer.
	                 # [esp] itself holds the return address.
	mov ecx, [esp+8] # Put the number 2, which is the second argument, into the
	                 # ecx register. ecx is a scratch register, so func is
	                 # allowed to overwrite it.
	add eax, ecx     # eax = eax + ecx
	ret

So before the function is called, the stack looks like this:
Address | value | location of esp
0x10: [...]
0x0c: [ 2 ]
0x08: [ 1 ] <- esp

And then once it's called:
0x10: [...]
0x0c: [ 2 ]
0x08: [ 1 ]
0x04: [ return address in main ] <- esp

After the return:
0x10: [...]
0x0c: [ 2 ]
0x08: [ 1 ] <- esp

After adding space back to the stack:
0x10: [...] <- esp
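
To tie the whole sequence together, here is a rough sketch in C that walks through the same steps on a simulated stack. It assumes the usual cdecl order, where arguments are pushed right to left so the first argument sits at esp+4 inside the callee. The `Cpu` struct, the `sim_*` helpers, and the `RETURN_ADDR` value are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64
#define RETURN_ADDR 0x1234u /* made-up address of the instruction after call */

/* Hypothetical toy machine: a small stack plus the two registers we need. */
typedef struct {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp; /* byte offset of the top of the stack */
    uint32_t eax; /* return value register */
} Cpu;

void sim_push(Cpu *c, uint32_t v) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

uint32_t sim_pop(Cpu *c) {
    assert(c->esp < STACK_BYTES);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Read the 4-byte value at a stack address, like [addr] in assembly. */
uint32_t sim_load(const Cpu *c, uint32_t addr) {
    assert(addr < STACK_BYTES);
    return c->mem[addr / 4];
}

/* The callee: adds its two stack arguments, then "returns" by popping
 * the return address and handing it back to the caller. */
uint32_t sim_func(Cpu *c) {
    c->eax = sim_load(c, c->esp + 4);       /* first argument, a  */
    uint32_t ecx = sim_load(c, c->esp + 8); /* second argument, b */
    c->eax += ecx;
    return sim_pop(c);                      /* ret */
}

/* The caller: push arguments, call, clean up, read the result from eax. */
uint32_t simulate_call(uint32_t a, uint32_t b) {
    Cpu c = { .esp = STACK_BYTES, .eax = 0 };
    sim_push(&c, b);                  /* arguments go right to left */
    sim_push(&c, a);
    sim_push(&c, RETURN_ADDR);        /* call: save where to come back to... */
    uint32_t target = sim_func(&c);   /* ...and jump to the function */
    assert(target == RETURN_ADDR);    /* ret took us back to the caller */
    c.esp += 8;                       /* add esp, 8 */
    assert(c.esp == STACK_BYTES);     /* the stack is back where it started */
    return c.eax;
}
```

Calling `simulate_call(1, 2)` goes through the same four stack states as the diagrams and leaves the sum in the simulated `eax`.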

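Now, on x86_64, most arguments are not passed via the stack. Under the System V
ABI used by Linux and macOS, the first six pointer or integer arguments are
passed in the rdi, rsi, rdx, rcx, r8, and r9 registers respectively (for 32-bit
ints like a and b above, that means the lower halves edi and esi), and any
following arguments are passed via the stack. Large structs are passed on the
stack and floats are passed in the xmm floating point registers. Reading a
register is faster than reading memory, which is why this is more efficient.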
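
As a small illustration, here are two made-up C functions (`sum8` and `scale` are hypothetical names) with comments noting where each argument would be expected to land under the System V x86_64 convention. This is a sketch of the rules as I understand them. A real compiler may inline the call or behave differently on other ABIs such as Windows, so check the generated assembly if it matters.

```c
#include <assert.h>

/* Expected placement under the System V x86_64 ABI (an assumption about
 * the target, not something the C language guarantees):
 *   a -> rdi, b -> rsi, c -> rdx, d -> rcx, e -> r8, f -> r9
 *   g and h -> passed on the stack
 *   return value -> rax */
long sum8(long a, long b, long c, long d,
          long e, long f, long g, long h) {
    return a + b + c + d + e + f + g + h;
}

/* Floating point arguments use a separate set of registers:
 *   x -> xmm0, factor -> xmm1, return value -> xmm0 */
double scale(double x, double factor) {
    return x * factor;
}
```

Compiling this with `gcc -S -O0` on a Linux x86_64 machine is one way to look at the registers the compiler actually picks.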