@eievui5
Last active November 28, 2021 16:31

Functions, Macros, and Encapsulation; Promoting code re-use in assembly

The Game Boy has very simple function support through the call and ret opcodes. These let you create blocks of code that can be run from any location in ROM, so the same code can be re-used from many different places.

Additionally, RGBDS supports macros: blocks of code which are expanded at assembly time into real code or data, rather than calling another location in ROM.

Understanding how and when to use these features is crucial in keeping your game size down, reducing overhead, and making your assembly code more readable.

How to write a function in assembly language

There are a few different ways to create a function on the Game Boy, but all revolve around setting up some data, calling a label, and then reading the data that was returned. The most common way to do this is with the CPU's registers, but it's also possible to use the stack or an area of RAM when register pressure is high.

An example function, using the a register to input and output data.

Main:
    ld a, -10
    call AbsA
    cp a, 10 
    ; The function changed `a` from -10 to 10, so the program will now jump to .success
    jr z, .success 

; Sets `a` to its absolute value (its distance from 0).
; Note: -128 has no positive counterpart in 8 bits and is returned unchanged.
AbsA:
    bit 7, a
    ret z
    cpl
    inc a
    ret

This function is rather short but provides a good example of the a register being used as a return value. You can expand upon this to create functions that output complex data, or to create fast and re-usable collision detection.

There are also other ways of returning data. Take this function for example:

Main:
    ld a, 10
    ld b, 20
    ld c, 10
    call TernaryEquals
    ; Exit if `a`, `b`, and `c` are equal.
    ; This will fail because 10 is not equal to 20
    ret z
    ld b, 10
    call TernaryEquals
    ; However, if we load 10 into b the function will set the Zero Flag
    ret z

; Sets the Zero Flag (`z`) if `a`, `b`, and `c` are equal
TernaryEquals:
    cp a, b
    ret nz
    cp a, c
    ret

(Note: please keep in mind that this function is kept short for the sake of brevity. The best way to handle short functions like this is explained in the macros chapter.)

This function first compares a and b which sets the Zero Flag if both registers are equal. If a is not equal to b then the function fails and exits before even checking c.
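
Because ret does not modify the flags, the caller can branch on the result directly. A minimal sketch of that pattern, assuming the same TernaryEquals as above (the .different label is a hypothetical name used only for illustration):

```asm
Main:
    ld a, 10
    ld b, 10
    ld c, 10
    call TernaryEquals
    ; `ret` leaves the flags untouched, so `z` still holds the function's result.
    jr nz, .different
    ; ... all three values matched ...
    ret
.different
    ; ... at least one value differed ...
    ret

; Sets the Zero Flag (`z`) if `a`, `b`, and `c` are equal
TernaryEquals:
    cp a, b
    ret nz
    cp a, c
    ret
```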

Passing values in registers works well, but it becomes a problem in large functions that need many registers for their own work. When all registers are needed, you will have to find other ways to store data, the simplest of which is using the stack.

DEF MAX_ENTITIES EQU 16
DEF sizeof_Entity EQU 16

; Loads a pointer and position into wEntityArray if space is available.
; @ b:  X position
; @ c:  Y position
; @ de: Entity Script
SpawnEntity::
    push bc
    push de
    ld d, MAX_ENTITIES + 1
    ld bc, sizeof_Entity
    ld hl, wEntityArray - sizeof_Entity
.loop
    dec d
    jr z, .break
    add hl, bc
    ld a, [hli]
    or a, [hl] ; `a` is zero only if both pointer bytes are zero.
    dec hl ; Step back to the start of the entry. A 16-bit `dec` leaves the flags alone.
    jr nz, .loop ; If the slot is in use, check the next position.
    ; If both bytes were clear then we can use our input values to spawn an entity!
    pop de
    pop bc
    ld a, e
    ld [hli], a ; Load the first pointer byte.
    ld a, d
    ld [hli], a ; Load the second pointer byte
    ld a, c
    ld [hli], a ; Load the Y Position
    ld a, b
    ld [hli], a ; Load the X Position
    ret
.break
    ; You need to clean up your stack if you return before popping the input values.
    ; This is because `call/ret` share the stack with `push/pop`, which can cause your functions to return to random locations if you're not careful.
    pop de
    pop bc
    ret

This may be a bit difficult to follow, but it is a realistic example of a complex function that a game engine could use. The function starts by pushing its inputs to the stack, because searching through an array needs most of the registers. When the values are needed again, they are popped and used to add an element to the array. Notice the .break label at the end: since this function uses the stack, it needs to clean it up before exiting. Otherwise, ret would jump to the last value we pushed, de, which in this case would cause the Game Boy to run some random data as code!

You may also find it more convenient at times to pass a value through RAM. This can be faster than the stack for a single value. Use it sparingly, though, and make sure you have a good understanding of RAM on the Game Boy first.

Main:
    ld a, $AB
    ldh [hFoobarInput], a
    call Foobar

Foobar:

    ; ... a long block of register-heavy code ...
    ; Let's pretend that we calculated a value in `b` and now want to compare it to the input

    ldh a, [hFoobarInput]
    cp a, b
    jr z, .equalsBranch

    ; If the values aren't equal, we continue here

.equalsBranch

    ; Otherwise, we make a relative jump to this branch

SECTION UNION "Volatile", HRAM
; This SECTION is a UNION, which is useful for cutting down on RAM usage.
; Take a look at the RGBASM docs if you want to learn about UNIONs

hFoobarInput:
    ds 1 ; Reserve just one byte for inputting data

And as one final tip, you can slightly speed up a function that ends with a call followed by a ret by replacing the pair with a single jump. This is known as a "tail call".

Consider this example:

Main:
    call Foo
    call Bar
    call Baz
    ret

Because every function ends with a ret, we can take advantage of Baz's ret and omit it from Main.

Main:
    call Foo
    call Bar
    jp Baz ; This is faster *and* smaller!

Just keep in mind that this only works for a call that occurs right before a ret!
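
As a rough count using the standard instruction timings (call: 6 cycles and 3 bytes, ret: 4 cycles and 1 byte, jp: 4 cycles and 3 bytes), the rewrite saves 6 cycles and 1 byte. There is one more catch worth sketching: anything the function pushed must be popped before the jump. Update, Foo, and Bar below are hypothetical names used only for illustration.

```asm
; Hypothetical routine that preserves `hl` across a call, then tail-calls Bar.
Update:
    push hl
    call Foo    ; Not a tail call: there is still work to do afterwards.
    pop hl      ; Balance the stack *before* jumping...
    jp Bar      ; ...because Bar's `ret` returns straight to Update's caller.

Foo:
    ret

Bar:
    ret
```

If the pop were skipped, Bar's ret would treat the saved hl as a return address, which is the same stack problem described for SpawnEntity above.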

Macros: How and when to use them

RGBDS supports the use of macros, blocks of code that expand every time they're called upon. These can be thought of as simply copy/pasting code, but have some much more advanced features and make your programs easier to read. However, it's very easy to overuse macros to the point where they make your program unreadable and confusing, not to mention impossible to optimize.

Let's start with a short macro that you may recognize from the functions chapter:

; Macros must be defined before they are used.
MACRO abs_a
    bit 7, a
    jr z, .skip\@
    cpl
    inc a
.skip\@
ENDM

Main:
    ld a, -10
    abs_a
    cp a, 10
    ; The macro changed `a` from -10 to 10, so the program will now jump to .success
    jr z, .success

Notice that we don't call abs_a, because it is not a function. Short macros like this one can replace a call to speed up the program: call and ret together cost 10 machine cycles (6 for call, 4 for ret) and 4 bytes. Adding up every instruction in the abs_a macro gives 6 cycles in the worst case and 6 bytes in total. A function is therefore inappropriate here unless you are extremely tight on ROM space, since each call is only 3 bytes smaller than the expanded macro and costs up to 10 extra cycles.
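
For reference, here is one way to tally the two versions by hand, using the standard per-instruction sizes and machine-cycle timings. The annotations are a manual count, not assembler output.

```asm
; Function version. Each use costs `call AbsA`: 3 bytes, 6 cycles.
AbsA:
    bit 7, a    ; 2 bytes, 2 cycles
    ret z       ; 1 byte,  5 cycles if taken, 2 if not
    cpl         ; 1 byte,  1 cycle
    inc a       ; 1 byte,  1 cycle
    ret         ; 1 byte,  4 cycles
; Worst case (negative input): 6 + 2 + 2 + 1 + 1 + 4 = 16 cycles.

; Macro version. Each use costs the full 6-byte body, but no call or ret.
MACRO abs_a
    bit 7, a        ; 2 bytes, 2 cycles
    jr z, .skip\@   ; 2 bytes, 3 cycles if taken, 2 if not
    cpl             ; 1 byte,  1 cycle
    inc a           ; 1 byte,  1 cycle
.skip\@
ENDM
; Worst case (negative input): 2 + 2 + 1 + 1 = 6 cycles.
```

By this count, the function body is paid for once (6 bytes) and each call site adds 3 bytes, while each macro use adds 6 bytes but runs 10 cycles faster in the worst case.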

While this provides a great example of how to write a macro, this is not a great way of using them in your program. This is because short macros like this, usually called "pseudo-ops", can promote inflated code that obscures errors and eliminates any chance of optimization. While abs_a is already fairly optimized, it should be thoroughly understood before it is used in your code.

A common and more useful pseudo-op is lb, which replaces two 8-bit loads with a single 16-bit load while keeping code easy to read. This macro relies on the most important feature of macros: arguments. Arguments substitute user-supplied text into certain parts of a macro. Consider the lb macro:

; Usage:
; @ lb r16, n8, n8
MACRO lb
    IF _NARG != 3
        FAIL "Expected 3 arguments!"
    ENDC
    ; Parenthesize the arguments so expressions such as `lb bc, 1 + 2, 3` expand safely.
    ld \1, ((\2) << 8) | (\3)
ENDM

\1, \2, and \3 denote the arguments, and the variable _NARG contains the Number of ARGuments. This macro can be used as follows:

Main:
    ; We want:
    ; ld b, 100
    ; ld c, 200
    lb bc, 100, 200

This works because ld r16, n16 is 3 bytes and 3 cycles, whereas ld r8, n8 is 2 bytes and 2 cycles. Writing ld r8, n8 twice would cost 4 bytes and 4 cycles, so combining both loads into a single instruction saves one byte and one cycle.
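
To make the expansion concrete, here is a small sketch of what the assembler ends up with for the call above. The macro is repeated (with its arguments parenthesized) so the snippet stands on its own.

```asm
MACRO lb
    ld \1, ((\2) << 8) | (\3)
ENDM

Main:
    lb bc, 100, 200
    ; Expands to:
    ;   ld bc, ((100) << 8) | (200)
    ; which is the same as:
    ;   ld bc, $64C8
    ; The high byte, $64 (100), lands in `b` and the low byte, $C8 (200), lands in `c`.
```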

Macros are also very useful for defining data without using an external program. For example, you could use a macro like this one for a bytecode-based scripting language:

; @ teleport_player n16 YPos, n16 XPos
MACRO teleport_player
    IF _NARG != 2
        FAIL "Expected 2 arguments!"
    ENDC
    db BYTE_TELEPORT_PLAYER
    dw \1, \2
ENDM
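
A script can then be written as a list of macro calls. In the sketch below, the opcode value $01, the section name, and the ExampleScript label are assumptions made up for illustration; only the macro itself comes from above.

```asm
DEF BYTE_TELEPORT_PLAYER EQU $01 ; Hypothetical opcode value.

MACRO teleport_player
    IF _NARG != 2
        FAIL "Expected 2 arguments!"
    ENDC
    db BYTE_TELEPORT_PLAYER
    dw \1, \2
ENDM

SECTION "Example Script", ROMX
ExampleScript:
    teleport_player 64, 128
    ; Emits 5 bytes: $01, then $40 $00 (64) and $80 $00 (128).
    ; `dw` stores each word low byte first.
```

The script interpreter would read the opcode byte first and then consume the two words that follow it.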

Or you could use them to define a far pointer, which is a pointer to data in another bank:

; @ far_pointer n16 Pointer
MACRO far_pointer
    IF _NARG != 1
        FAIL "Expected 1 argument!"
    ENDC
    db BANK(\1)
    dw \1
ENDM
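
As a rough sketch of how such a far pointer might be consumed: the routine below reads the bank byte, switches banks, and loads the address into hl. It assumes a mapper that selects the ROM bank with a write to $2000 (as MBC1, MBC3, and MBC5 do); LevelTable, Level1Data, and LoadFarPointer are hypothetical names.

```asm
MACRO far_pointer
    db BANK(\1)
    dw \1
ENDM

SECTION "Far Pointer Table", ROM0
LevelTable:
    far_pointer Level1Data ; Emits the bank number, then the 16-bit address.

; Switches to the bank of the far pointer at `hl` and returns its address in `hl`.
; @ hl: Pointer to a far pointer
LoadFarPointer:
    ld a, [hli]     ; Bank number.
    ld [$2000], a   ; Assumption: the mapper's ROM bank register is at $2000.
    ld a, [hli]     ; Low byte of the address.
    ld h, [hl]      ; High byte of the address.
    ld l, a
    ret

SECTION "Level 1", ROMX
Level1Data:
    db 1, 2, 3
```

Usage would be ld hl, LevelTable followed by call LoadFarPointer, after which hl points at Level1Data with its bank mapped in.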

Macros are an extremely powerful tool for defining data and optimizing code. Just keep in mind that they should not be overused for short pseudo-ops. There's nothing wrong with keeping a document full of short snippets you've picked up, and others will thank you for writing cleaner code.
