@jtpaasch
Created October 15, 2021 17:33
Notes for ARM assembly

ARM Assembly

Tutorials

Filenames

Filenames should normally end with an .s extension. E.g., hello-world.s, not hello-world.asm.

Comments

Comments begin with:

  • @
  • //
  • /* My comment */

You can use # at the very start of a line to indicate a comment:

    # Move 10 into r0
    mov r0, #10

But you can't do this:

    mov r0, #10  # Move 10 into r0

Sections

Declare a section:

.section NAME

contents here...

Values for NAME can be:

  • .data - Stores global data (read and write, not execute)
  • .rodata - Stores global read-only data
  • .bss - Has uninitialized blank space (read and write, not execute)
  • .text - Stores code (read and execute, but not write)

It is tidy to keep sections in that order in an assembly file, but the assembler accepts them in any order.

.section .data
my_variable: .word 0xabc

.section .rodata
my_constant: .word 0x32

.section .bss
my_slot: .space 4

.section .text
.global _start

_start:
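    mov r0, #5
    mov r7, #1   @ exit system call number (assuming Linux)
    svc #0       @ exit with status 5 (_start has no caller, so there is nothing for bx lr to return to)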

Declaring symbols

  • .global foo declares foo as a global symbol

Registers

  • R0-R12 are GPRs

  • R13 is SP (stack pointer)

  • R14 is LR (link register) - stores the return address, but the value is often pushed to the stack at the start of a function and then popped at the end of the function, which effectively puts the value back in LR.

  • R15 is PC (program counter)

  • CPSR (current program status register) - holds the condition flags

Data

In the data section:

.data
msg: .ascii "My string"
location: .byte 10
my_int32: .int 3
my_int16: .short 4
my_float: .float 10.23
my_int_array: .int 10,11,12

Data

Specifying the size of a piece of data:

  • .byte 0xab - 0xab is 1 byte long
  • .hword 0xbeef - 0xbeef is 16 bits long (half word)
  • .word 0xdeadbeef - 0xdeadbeef is 32 bits long (word)

Giving a piece of data a label/name:

  • x: .word 0xdeadbeef - makes x the label for 0xdeadbeef

Pad to the next word boundary:

  • .align

Example:

.section .data

a: .byte 8
b: .byte 0xab

.align 
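
To see the padding at work, here is a minimal sketch that measures how far apart two labels end up. It assumes GNU as for 32-bit ARM, that .align with no argument pads to a 4-byte boundary, and Linux for the exit system call. The label c is hypothetical and not part of the notes above.

```asm
.section .data
a: .byte 8            @ offset 0
b: .byte 0xab         @ offset 1
.align                @ pads offsets 2 and 3
c: .word 0xdeadbeef   @ starts at offset 4 (assuming 4-byte alignment)

.section .text
.global _start

_start:
    ldr r0, =c        @ r0 = address of c
    ldr r1, =a        @ r1 = address of a
    sub r0, r0, r1    @ r0 = distance in bytes from a to c
    mov r7, #1        @ exit system call number (assumption: Linux)
    svc #0            @ exit status = r0
```

Under those assumptions the program should exit with status 4. Without the .align line, c would start right after b, at offset 2.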

Allocating space (.bss)

To allocate empty space, use .space to specify the number of bytes:

.section .bss
my_slot: .space 4 @ allocates 4 bytes of empty space
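
A minimal sketch of using such a slot, assuming GNU as for 32-bit ARM and Linux for the exit system call. The value 42 is arbitrary.

```asm
.section .bss
my_slot: .space 4        @ 4 bytes of empty space

.section .text
.global _start

_start:
    ldr r1, =my_slot     @ r1 = address of my_slot
    mov r0, #42
    str r0, [r1]         @ write 42 into the slot
    ldr r2, [r1]         @ read the slot back into r2
    mov r0, r2
    mov r7, #1           @ exit system call number (assumption: Linux)
    svc #0               @ exit status = the value read back
```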

Load and Store

Load the address of a stored piece of data by its label with ldr DEST, =LABEL:

.section .data

x: .word 0xdeadbeef

.section .text

ldr r1, =x @put the address of 'x' in r1

Note that ldr DEST, =LABEL loads the address of LABEL, not the value of LABEL.

Load a piece of data at an address with ldr DEST, [ADDR]:

ldr r2, [r1] @get the word starting at address stored in r1 and put it in r2

Note that this loads not the address in R1, but the value stored at the address in R1.

So =LABEL is like &LABEL in C, and [r1] is like *r1 in C.

Store a value at an address with str SRC, [ADDR]:

str r2, [r3]  @put the contents of r2 at whatever address is in r3

Note that ldr and str load and store words, i.e., the bytes in the next 4 addresses starting at [r1].
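
When a full word is too much, ARM also has half-word and byte variants of these instructions. The sketch below assumes GNU as for 32-bit ARM and Linux for the exit system call; the choice of registers is arbitrary.

```asm
.section .data
x: .word 0xdeadbeef

.section .text
.global _start

_start:
    ldr  r1, =x         @ r1 = address of x
    ldr  r2, [r1]       @ r2 = all 4 bytes starting at x
    ldrh r3, [r1]       @ r3 = the 2 bytes starting at x, zero-extended
    ldrb r4, [r1]       @ r4 = the 1 byte at x, zero-extended
    strb r4, [r1, #3]   @ overwrite only the byte at address x + 3
    mov r0, #0
    mov r7, #1          @ exit system call number (assumption: Linux)
    svc #0
```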

Endianness

In big-endian, bytes are numbered from left to right (or top to bottom in the memory stack). The left-most (most significant) byte is at the lowest address. So, the word is built top down: start at the top address, that's your first byte. Then go to the next memory slot down, that's your second byte.

+----------+------+
| ADDRESS  | DATA |
+----------+------+
| 00000000 | 0a   |
+----------+------+        +-----------------------------------+
| 00000001 | 0b   |   ===> |   0a   |   0b   |   0c   |   0d   |
+----------+------+        +-----------------------------------+
| 00000002 | 0c   |          byte 0   byte 1   byte 2   byte 3
+----------+------+
| 00000003 | 0d   |
+----------+------+

Little-endian is the opposite. It's built bottom up: the right-most (least significant) byte is at the lowest address. It's as if the English words are in order, but the letters in each word are backwards:

+----------+------+
| ADDRESS  | DATA |
+----------+------+
| 00000000 | 0d   |
+----------+------+        +-----------------------------------+
| 00000001 | 0c   |   ===> |   0a   |   0b   |   0c   |   0d   |
+----------+------+        +-----------------------------------+
| 00000002 | 0b   |          byte 0   byte 1   byte 2   byte 3
+----------+------+
| 00000003 | 0a   |
+----------+------+

Intel is little-endian, and ARM normally runs little-endian too (ARM cores can be configured for big-endian, but that is uncommon).
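
One way to check the byte order on a given machine is to store a word and read back only the byte at its lowest address. This is a sketch assuming GNU as for 32-bit ARM and Linux for the exit system call; the label w is hypothetical.

```asm
.section .data
w: .word 0x0a0b0c0d

.section .text
.global _start

_start:
    ldr  r1, =w       @ r1 = address of w
    ldrb r0, [r1]     @ r0 = the byte at the lowest address of w
                      @ little-endian: 0x0d, big-endian: 0x0a
    mov r7, #1        @ exit system call number (assumption: Linux)
    svc #0            @ exit status = that byte
```

On a little-endian system the exit status should be 13 (0x0d).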

Operating on values in memory

ARM is a load/store architecture, which means it can only operate on values in registers. So, if you need to modify something in memory, you have to put it into a register first, do what you want to it, then put it back.
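
The load, modify, store pattern can be sketched like this, assuming GNU as for 32-bit ARM and Linux for the exit system call. The label counter and its initial value are hypothetical.

```asm
.section .data
counter: .word 41

.section .text
.global _start

_start:
    ldr r1, =counter   @ r1 = address of counter
    ldr r0, [r1]       @ load: r0 = 41
    add r0, r0, #1     @ modify the value in a register: r0 = 42
    str r0, [r1]       @ store: counter is now 42 in memory
    mov r7, #1         @ exit system call number (assumption: Linux)
    svc #0             @ exit status = 42
```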

Values (in operands slot)

Literal and computed values as instruction operands:

  • #9 is the literal integer 9
  • #-9 is the literal integer -9
  • #0xff is the hex value 0xff.
  • #0b110 is the binary value 110
  • #'d' is the ascii character d
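  • [r0, #2] is the memory address formed by adding the literal integer 2 to the value in r0 (used as the address operand of ldr and str)
  • [r0, #-2] is the memory address formed by subtracting the integer 2 from the value in r0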
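
Offsets in square brackets are handy for walking through an array of words. The following is a sketch assuming GNU as for 32-bit ARM and Linux for the exit system call; the label arr is hypothetical.

```asm
.section .data
arr: .word 10, 11, 12

.section .text
.global _start

_start:
    ldr r1, =arr        @ r1 = address of the first element
    ldr r0, [r1, #8]    @ word at address r1 + 8: the third element (12)
    ldr r2, [r1, #4]    @ word at address r1 + 4: the second element (11)
    add r3, r1, #8      @ r3 = address of the third element
    ldr r4, [r3, #-4]   @ word at address r3 - 4: the second element again
    mov r7, #1          @ exit system call number (assumption: Linux)
    svc #0              @ exit status = r0 = 12
```

Each element is a 4-byte word, so the offsets go up in steps of 4.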

Simple instructions

Moving values:

mov r1, #100  @Place 100 in R1
mov r1, r3    @Place the value of R3 in R1

Arithmetic:

add r3, r1, #3  @r3 = r1 + 3
add r3, r1, r2  @r3 = r1 + r2
add r4, r4, #1  @r4++ (r4 + 1, and put back in r4)
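
Putting these together in a complete program, as a small sketch assuming GNU as for 32-bit ARM and Linux for the exit system call (sub works like add, but subtracts):

```asm
.section .text
.global _start

_start:
    mov r1, #7
    mov r2, #3
    add r0, r1, r2     @ r0 = 7 + 3 = 10
    sub r0, r0, #4     @ r0 = 10 - 4 = 6
    add r0, r0, #1     @ r0++ gives 7
    mov r7, #1         @ exit system call number (assumption: Linux)
    svc #0             @ exit status = 7
```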

The stack

Each function can use a region of memory called the "stack" for scratch storage. The stack has a pre-determined size and grows downwards, i.e., from higher addresses toward lower ones:

+----------+------+
| ADDRESS  | DATA |
+----------+------+
| 00000000 |      | <-- Stack limit (the lowest address the stack may use)
+----------+------+
| 00000001 |      |
+----------+------+
| 00000002 |      |
+----------+------+
| 00000003 |      |
+----------+------+
     |
     V
+----------+------+
| 00000009 |      | <-- Stack base (the first item pushed goes here)
+----------+------+

Addresses beyond the stack limit are off limits (think of them as a "red zone"): we shouldn't write to them, because they may be used by other parts of the program.

When we add things to the stack, each new item goes at the next lower address. E.g., the first item we push on the stack will go in 00000009. The next will go in 00000008, and so on. The most recently added item is the "top" of the stack. When we pop values off the stack, they come off in reverse order (the stack is LIFO - last one in, first one out).

At the start of a function, SP (i.e., R13) points to the top of the stack.

We can load and store values on the stack using ldr and str, just as we can with any other addresses in memory.

To get to the next stack address, subtract the number of bytes you care about from SP to get the new address. To go back to the previous stack address, add the number of bytes you care about to SP to get the new address.

Each address on the stack has a slot that can hold one byte. So, if we want to put a 32-bit word on the stack, we have to use four slots, one for each byte of the 32-bit word. Hence, we need to subtract/add 4 from SP to get to the next/previous available address.


.section .text
.global _start

_start:
    mov r0, #0xaa
    mov r1, #0xbb

    @ Put r0 on the stack
    sub sp, sp, #4 @ move SP down 4 slots
    str r0, [sp]   @ put r0 on the stack, at the address of SP
                   @ it fills up 4 slots, 1 byte each

    @ Put r1 on the stack
    sub sp, sp, #4 @ move SP down 4 slots
    str r1, [sp]   @ put r1 in the address of SP

    @ Restore r1, then r0 (reverse order)
    ldr r1, [sp]   @ load into r1 the 4 bytes starting at [sp]
    add sp, sp, #4 @ move SP up 4 slots
    ldr r0, [sp]   @ load into r0 the 4 bytes starting at [sp]
    add sp, sp, #4 @ move SP up 4 slots, back to where it started

Pushing and popping

Push a list of registers to the top of the stack:

push {r0, r1, r3}

This stores the 4 bytes of each listed register on the stack and decrements SP (to SP - 12). The registers are always laid out by register number, with the lowest-numbered register at the lowest address, no matter how the list is written. So the effect is the same as pushing r3 first, then r1, then r0: afterwards r0 is at [sp], r1 at [sp, #4], and r3 at [sp, #8].

To pop the same list of registers off the stack and put their values back into their respective registers:

pop {r0, r1, r3}

Note that pop undoes the push in reverse order: it pops r0 first (from the lowest address), then r1, then r3, and it updates SP (to SP + 12).
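
To check where each register lands, we can peek at the stack with ldr between the push and the pop. This is a sketch assuming GNU as for 32-bit ARM (ARM mode) and Linux for the exit system call; the values 1, 2, 3 are arbitrary.

```asm
.section .text
.global _start

_start:
    mov r0, #1
    mov r1, #2
    mov r3, #3
    push {r0, r1, r3}   @ SP = SP - 12; [sp] = r0, [sp, #4] = r1, [sp, #8] = r3
    ldr r4, [sp]        @ r4 = 1 (r0's value is at the lowest address)
    ldr r5, [sp, #8]    @ r5 = 3 (r3's value is at the highest address)
    mov r0, #0          @ clobber the registers...
    mov r1, #0
    mov r3, #0
    pop {r0, r1, r3}    @ ...and get them back: r0 = 1, r1 = 2, r3 = 3; SP = SP + 12
    mov r0, r5
    mov r7, #1          @ exit system call number (assumption: Linux)
    svc #0              @ exit status = 3
```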

Jumping and returning

ARM uses bl and bx to jump and then return. bl stands for "branch with link," and bx stands for "branch and exchange." The return address is stored in the "link register" (lr), which is r14. The "exchange" part means that bx can switch the processor between the ARM and Thumb instruction sets: the lowest bit of the target address selects Thumb mode (1) or ARM mode (0).

  • bl sets the lr register to the address right after the bl instruction.
  • bx lr returns to the address in lr.

.section .text
.global _start

foo:
    mov r0, #3   @ address 0x00
    bx lr        @ address 0x04: return to the address in lr (0x0c) and continue

_start:
    bl foo       @ address 0x08: put the next address (0x0c) in lr, and go to foo (0x00)
    mov r0, r1   @ address 0x0c
    ...
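
A complete program built on the same pattern, as a sketch assuming GNU as for 32-bit ARM and Linux for the exit system call. The function name double is hypothetical.

```asm
.section .text
.global _start

double:                 @ hypothetical helper: r0 = r0 * 2
    add r0, r0, r0
    bx lr               @ return to the address bl left in lr

_start:
    mov r0, #21
    bl double           @ lr = address of the next instruction; jump to double
    mov r7, #1          @ execution resumes here, with r0 = 42
    svc #0              @ exit status = 42 (assumption: Linux exit system call)
```

Because double makes no calls of its own, it never overwrites lr and has no need to save it on the stack.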

Calling functions

Say we have a caller function foo, and we want to call another function bar. The caller performs the following steps:

  • Put any data in r0-r3 onto the stack for later.
  • Put arguments for bar into r0-r3.
  • Use BL bar to jump to bar.

Then, the callee (i.e., bar) takes the following steps:

  • The "prologue" of the function:
    • Push lr on the stack, to save it for later.
    • Save any of the registers r4-r11 that the function will modify by pushing them onto the stack. We will need to restore these registers in the epilogue of the function, in case the caller is using them.
  • The "body" of the function:
    • Do any work we need to do in the function.
    • We can find the arguments passed to our function bar in r0-r3.
  • The "epilogue" of the function:
    • Put the value we want to return to the caller foo in r0 (a 64-bit result also uses r1).
    • Pop the registers saved in the prologue from the stack, so as to restore them.
    • Pop lr to restore it.
    • Return to the caller foo with bx lr.

Back to the caller (i.e., foo):

  • Copy any returned value out of r0 (and r1) if we need it.
  • Pop r0-r3 from the stack, to restore them. This way everything is back to the way it was when we called bar.
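
The steps above can be sketched as a small program in which _start plays the role of the caller foo. It assumes GNU as for 32-bit ARM (ARM mode) and Linux for the exit system call. The body of bar (adding its two arguments plus one) is made up for illustration.

```asm
.section .text
.global _start

@ bar(a, b) returns a + b + 1 (hypothetical callee)
bar:
    push {r4, lr}       @ prologue: save lr and the one callee-saved register we use
    mov r4, r0          @ body: arguments arrive in r0 and r1
    add r4, r4, r1
    add r0, r4, #1      @ epilogue: the return value goes in r0
    pop {r4, lr}        @ restore r4 and lr
    bx lr               @ return to the caller

_start:                 @ plays the role of the caller foo
    mov r2, #7          @ a value the caller still needs after the call
    push {r0-r3}        @ save r0-r3 for later
    mov r0, #2          @ first argument
    mov r1, #3          @ second argument
    bl bar              @ on return, r0 = 2 + 3 + 1 = 6
    mov r4, r0          @ copy the return value out before restoring r0-r3
    pop {r0-r3}         @ r2 is 7 again
    add r0, r4, r2      @ 6 + 7 = 13
    mov r7, #1          @ exit system call number (assumption: Linux)
    svc #0              @ exit status = 13
```

Note that bar only saves r4 here, since that is the only callee-saved register it touches.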