Skip to content

Instantly share code, notes, and snippets.

@DatZach
Created June 18, 2020 23:55
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save DatZach/96a30d6ae4225f8ec152719e57aed26b to your computer and use it in GitHub Desktop.
Save DatZach/96a30d6ae4225f8ec152719e57aed26b to your computer and use it in GitHub Desktop.
Idiomatic GML - Garbage Collector (Preview Revision 0)

Garbage Collector

Garbage Collectors (GC) are common features in high level languages such as JavaScript or C#. The purpose of a Garbage Collector is to reclaim memory after it is no longer being used by anything in the program, or game in our case. Realistically, a Garbage Collector is a nessesity for the design of higher level languages that don't allow explicit access to the underlying memory but instead abstract those concept away behind the meaning of that memory in the form of struct instances and variables as in GML. That is to say, you still have to be cognizant of what is and isn't tracked by the GC, and also what is and isn't in use. In effect, you are still responsible for cleaning up after yourself, the difference being calling for the custodian to take care of the mess rather than carrying it to the trashcan yourself; you still need to make it known that it's trash generally.

Tracked and Untracked data

GameMaker Language, unlike other high level languages has a catch when it comes to its GC. There is data that is tracked by the GC, such as: structs, strings, and arrays; and there is data that is untracked, such as: ds_maps, ds_lists, surfaces, and buffers. Generally speaking if the data is something coming from syntax, it is tracked by the GC; if it comes from the result of calling a _create function (and has an equivelent _destroy function) then it is not tracked by the GC.

This leads to an interesting question in GML 2.3, how should we organize our code to take advantage of the GC? There are absolutely still cases where it's appropriate to use untracked data. In GML we have always had the complexity of keeping in mind when we were mixing tracked and untracked data; for example: An array of ds_map references. With the addition of structs however, we have many more considerations to take in mind for how we design our code.

Considerations of structs

This is one of the most important parts of Idiomatic GML: When dealing with structs you should avoid allocating untracked data in the Constructor Function. The reasoning for avoiding this pattern is due to the fact that GML 2.3 does not have formalized destructors. This is very common in other high level languages, and even in ones where destructors exist, their usage is not recommended because you cannot know when the GC will clean up and subsequently run the destructor. GameMaker takes this best practice one step further by simply not providing the feature in the language. This would not be a problem if it weren't for the fact that GML2.3's legacy runtime still contains a significant amount of untracked data, especially for very common systems such as buffers and ds_lists.

Bearing this in mind, let's take a look at some patterns to deal with untracked data in structs in order of recommendation.

Replace untracked data with tracked data

This is absolutely the way to go. YoYoGames has indicated that future version of the GML runtime will be moving towards using tracked data, and deprecating untracked data in favor of tracked data. This includes improving the feature set for working with tracked data to match parity with untracked data.

TODO Are there other subs?

Explicitly stated, you should use arrays instead of ds_lists, structs instead of ds_maps. Refactor code to avoid taking ownership of other kinds of untracked data such as: ds_stacks, buffers, and surfaces. Let's take a look at how to implement some of these substitutions.

Using an array instead of a ds_list:

/* Poorly designed code */

function TreasureChest() constructor {
    items = ds_list_create();

    static addItem = function (itemId) {
        ds_list_add(itemId);
    };
}

var chest = new TreasureChest();
chest.addItem(global.items.sword);
chest.addItem(global.items.potion);
// MEMORY LEAK! Local variable 'chest' will leak untracked 'items' when cleaned by the GC


/* Well designed code */

function TreasureChest() constructor {
    items = [];

    static addItem = function (itemId) {
        items[array_length(items)] = itemId;
    };
}

var chest = new TreasureChest();
chest.addItem(global.items.sword);
chest.addItem(global.items.potion);
// GOOD! Local variable 'chest' will release last reference of 'items' when cleaned
//       resulting in both being cleaned by the GC

TODO Better example? TODO Tackle when it's possible (missing certian features from ds lists)

Using a struct instead of ds_map (this is covered in more detail in the Structs as Maps chapter):

/* Poorly designed code */

TODO Better example?

Consumer Pattern

If it is not possible to avoid using untracked data, such as in the case of buffers, surfaces and dynamically loaded sprites then we can use a Consumer Pattern to dispose of these resources. This is a pattern best suited for cases where we're composing data, such is the common case when working with buffers. The idea is that we produce a struct instance that will later be consumed by a function. That function is responsible for cleaning up the resources allocated during the production of the struct instance.

Consider the following example:

function Packet(type) constructor {
    buffer = buffer_create(3, buffer_grow, 1);
	buffer_seek(buffer, buffer_seek_start, 2);
	buffer_write(buffer, buffer_u8, type);
}

function ChatMessage(message) : Packet(PacketType.ChatMessage) constructor {
	buffer_write(buffer, buffer_string, message);
}

// Consumer Pattern
function net_send_packet(packet) {
    // 1. Guard against freeing already free resources
    if (packet.buffer != undefined) {
        // 2. Do the work
        var len = buffer_tell(packet.buffer);
        network_send_packet(global.socket, packet.buffer, len);
        
        // 3. Clean up resources and mark as freed
        buffer_delete(packet.buffer);
        packet.buffer = undefined;
    }
}

enum PacketType {
	None,
	ChatMessage,
	
	sizeof
}

Destructors Library

Another option is to use the Destructors Library (https://github.com/DatZach/Destructors). This library adds destructors as a language feature to GML2.3. However, since it is a library it cannot be considered Idiomatic GML. Additionally, it suffers from a limitation with regards to YYC and HTML5 builds in that it does not work, so you will not be able to target these platforms if you use it. Although, I expect that these limitations will be able to be resolved in future version of GML. That said, this is still a very useful library in some cases.

Consider the following example:

function Foo() constructor {
    list = ds_list_create();

    dtor_track(DtorType.List, list);
}

var a = new Foo();  // a.list exists
delete a;           // a.list has been destroyed

Often times this can be the simpliest way to deal with cleaning up untracked data inside of tracked data.

Cleanup Methods

Cleanup Methods are essentially a more obtuse version of the Consumer Pattern as previously described. Instead of the cleanup of resources being inline to a consumer function, they're encapsulated inside of the produced struct instance. This is particularly useful for cases where the data being worked with isn't nessisarily for a single purpose, or a composition of other data. Examples of this are surfaces and dynamically loaded sprites, where those resources might be drawn indefinitely. Or as in the case of surfaces might even spontaneously be freed by the GameMaker Runner.

function Minimap() constructor {
    var MINIMAP_SIZE = 240;

    surface = surface_create(MINIMAP_SIZE, MINIMAP_SIZE);

    static draw = function () {
        // ...
    };

    // ...

    static free = function () {
        if (surface != undefined) {
            surface_free(surface);
            surface = undefined;
        }
    };
}

/// mOverlay : Object
/// Create Event
minimap = new Minimap();

/// Draw Event
minimap.draw();

/// Clean Up Event
if (minimap != undefined) {
    minimap.free();
    delete minimap;
}

The unfortunate downside of this approach is observable in the Clean Up Event. If we are not careful we can forget to call free before we delete the struct instance constained in the variable minimap. We additionally should have an undefined guard around this block of code. This is relatively high cognative load which is why it's last on the list of recommendations. That said, it's still a very powerful pattern and is better than externally freeing untracked data.

There are instances with certain uses of inline structs that permit the usage of untracked data; for instance: a global singleton, or a struct used as a namespace.

Managing the Garbage Collector

The delete operator

TODO Rework this with new knowledge and recommendations

We can use the delete operator to notify the Garbage Collector that we're done with a particular reference of a struct instance. Invoking delete takes a single operand which is a variable that contains a reference to a struct instance. This operator in effect is simply syntactic sugar for assigning undefined to a variable. That said, it's still idiomatic GML to use delete over simply setting a variable to undefined as future version of the runtime might modify behavior to hint some additional behavior to the Garbage Collector.

Consider the following:

function Vector2(_x, _y) constructor {
    x = _x;
    y = _y;
}

var a = new Vector2(1, 2);
delete a;

We only have a single usage of the struct instance stored in local variable a, so it will be freed before the next step event.

Next let's take a look at how delete works when there's multiple references to a single struct instance:

function PlayerInfo(_name, _hp) constructor {
    name = _name;
    hp = _hp;
}

/// Object oPlayer : Create
info = new PlayerInfo("Username", 50);

/// Object oPlayer : Clean Up
// Nothing to do, info will implicitly become undefined when the instance is destroyed
// we could use delete here for good measure, but it's not strictly required


/// Object oEnemy : Create
targetInfo = undefined;

/// Object oEnemy : Collision with oPlayer
targetInfo = other.info;

/// Object oEnemy : Key Pressed <A>
delete targetInfo;

In the above example, let's consider the following senario: An instance of oPlayer and oEnemy both exist. oEnemy collides with oPlayer. Later, the instance of oPlayer is destroyed. What happens to info in oPlayer? It still exists! We copied the reference in the collision event of oEnemy. The struct instance will live until one of the following occurs:

  • oEnemy collides with another instance of oPlayer. Thus setting targetInfo to a new struct instance reference and dropping the reference count of the previous PlayerInfo struct instance to 0.
  • oEnemy is destroyed. Thus implicitly setting targetInfo to undefined, dropping the reference count to 0.
  • The 'A' key is pressed. Thus deleting the targetInfo, implicitly setting it to undefined and dropping the reference count to 0.

TODO Multiple deletes

A tour through the Garbage Collector

GameMaker 2.3 uses a generational garbage collector with mark and sweep reference counting.

TODO Explain what this means

Optimizing code for the Garbage Collector

Optimizing code with the Garbage Collector in mind is an exercise in reducing in the amount of work it has to do. This largely means reducing the allocation of tracked data. It is very important to keep in mind that there are different kinds of memory allocation that can happen in GML. Some of these allocations are tracked and managed by the GC, some are untracked and dealt with by the Runner via other methods. When we are optimizing with the GC in mind, all we need concern ourselves with is what adds additional work to the GC. As of the current runtime those seem to be structs, and object instances. Other data types such as strings, and arrays, while "tracked data" (as in they do not require manual deallocation), are not tracked or managed by the GC. Additionally, data such as ds_lists, ds_maps, and surfaces are also not tracked by the GC. This is not to downplay the important of managing these resources! Reducing their memory footprint will also improve the performance of your game as you will have a higher chance of achieving Cache Hits, among other benefits. However, they are not worth considering in this section as we explore reducing work for the GC, except perhaps for how they can be leveraged in the reduction of that work. Also, the advice in this section is not intended to reduce the time it takes to execute code, but rather reduce the amount of time between frames the Runner spends cleaning up from the code that was just ran. Generally we will not see an improvement in the execution speed of code by employeeing these techniques, but rather an improvement in the number of frames per second due to being able to get to the next one faster. The following are all strategies you can use to reduce Garbage Collection work.

Prefer stack allocation

TODO Reconsider cutting this section, GML is basically already optimized for this so ???

Whenever we call a function, or enter an event, stack space is allocated for the local variables. That space is freed automatically when we return from an event or method. Data on the stack is almost always faster to read/write than data in the heap. This is less the case with GML (excluding YYC builds) due to the way the Virtual Machine works, but we still have an advantage: Stack variables do not have the overhead of being known to the Garbage Collector. Even with certian data types like strings, and arrays where the contents are stored somewhere in an unmanaged heap (not the GC's heap), their memory is still reclaimed faster as they are freed immediately upon reclaiming stack space.

Avoid short lived instances

Allocating a lot of short lived struct instances should be avoided where possible. Short lived struct instances are defined as any struct instance that won't survive past the first generation. These are generally stored in local variables and will be found inside of tight loops. It's important to keep in mind that Object Instances are also backed by a struct instance, so creating many object instances that live for a relatively short period of time will also add unnessisary work to the Garbage Collector.

Consider the following:

// Original Code - 200 Allocations
function Vector2(_x, _y) constructor {
    x = _x;
    y = _y;
}

repeat (100) {
    var point = new Vector2(
        irandom(room_width),
        irandom(room_height)
    );

    instance_create_layer(point.x, point.y, layer, oFirefly);
}

// OPTIMIZED CODE - 100 Allocations
repeat (100) {
    var xx = irandom(room_width);
    var yy = irandom(room_height);

    instance_create_layer(point.x, point.y, layer, oFirefly);
}

In the above code we unroll the short lived Vector2 struct instance into local variables. If the the lifetime of oFirefly is long enough to get past generation 0, then this code is as optimized as we can make it.

Avoid heap fragmentation

TODO Expand on this

Pool instances

There are occationally cases where we simply need a lot of short lived struct instances. One optimization we can use in this case is something called Instance Pooling, where we allocate the expected number of struct instances we need up front and then modify their fields as needed. This allows the instances to pass generation 0 into later generations, hence reducing their overhead and the amount of time the Garbage Collector spends running each frame.

One pattern for instance pooling is to use a fixed array to hold references to all the struct instances we wish to pool. Our firefly example from earlier is actually a perfect example for us to further optimize using this method.

// Original Code - 100 Allocations per alarm

/// Object oFirefly : Create
alarm[0] = room_speed;      // Live for 1 second

/// Object oFirefly : Alarm 0
instance_destroy();

/// Object oFirefly : Draw
draw_sprite(sFirefly, 0, x, y);


/// Object oFireflySpawner : Create
alarm[0] = room_speed;

/// Object oFireflySpawner : Alarm 0
repeat (100) {
    var xx = irandom(room_width);
    var yy = irandom(room_height);

    instance_create_layer(point.x, point.y, layer, oFirefly);
}

alarm[0] = room_speed;

/// OPTIMIZED CODE - 0 Allocations per alarm

function Vector2(_x, _y) constructor {
    x = _x;
    y = _y;
}

/// Object oFireflySpawner : Create
fireflies = array_create(100);
for (var i = 0, isize = array_length(fireflies); i < isize; ++i)
    fireflies[i] = new Vector2(0, 0);

alarm[0] = room_speed;

/// Object oFireflySpawner : Alarm 0
for (var i = 0, isize = array_length(fireflies); i < isize; ++i) {
    var firefly = fireflies[i];
    firefly.x = irandom(room_width);
    firefly.y = irandom(room_height);
}

alarm[0] = room_speed;

/// Object oFireflySpawner : Draw
for (var i = 0, isize = array_length(fireflies); i < isize; ++i) {
    var firefly = fireflies[i];
    draw_sprite(sFirefly, 0, firefly.x, firefly.y);
}

In the above example we removed the object oFirefly and rolled its code directly into oFireflySpawner. In order to track the x and y we allocate 100 Vector2 struct instances and store them in fireflies. Then, instead of creating 100 new oFirefly instances every second we instead modify the x and y fields. The result of this is that we have long living struct instances, allocated once instead of every second.

There's more to consider here, we could have rewritten this code in a variety of ways. For instance, why not create 100 object instances and store those in the fireflies array instead? Absolutely a valid solution! Object Instances are partially backed by struct instances after all. The primary reason is that we only need the x and y variables of the instance, we can reduce memory footprint by only tracking these with Vector2; with 100s of fireflies, this will add up.

Use Enum Arrays

This is a technique that was populate in 2.2.5 and earlier, and has surprising use in 2.3 on. A enum array is an array where we reference fields by indexing that array via enum, rather than using a struct. Because arrays are cleaned up outside of the Garbage Collector we can use them to replace their usage.

// Struct Version
function Vector2(_x, _y) constructor {
    x = _x;
    y = _y;
}

var a = new Vector2(1, 2);
var b = a.x + a.y;

// Enum Array Version
enum Vector2 {
    x,
    y,

    sizeof
}

var a = array_create(Vector2.sizeof);
    a[Vector2.x] = 1;
    a[Vector2.y] = 2;
var b = a[Vector2.x] + a[Vector2.y];

We could actually rewrite our firefly example from earlier to use this technique and we would achieve 0 lifetime GC allocations.

Manually drive the Garbage Collector

In the case that you cannot utilize any of the above techniques to reduce the number of short lived struct instances being used then you could disable the GC all together for the duration of the performance critical section before re-enabling it and performing a manual collection. This will only be useful if the performance critical section executes over the course of a few frames. Remember that the GC only executes between frames, so disabling it for the duration of a tight loop before re-enabling it will not yield any useful benefit. Additionally, manually requesting a collection while the GC is enabled is not generally recommended as it can prematurely move short lived struct instances into an older generations, which will cause the game to use more memory than it strictly needs for longer. Generally, this approach should be used for things like for asyncronous loading screens, where an animation is playing to inform the player that the game has not crashed and is doing work.

gc_enable(enabled)

Enables or disables the Garbage Collector. When enabled the GC will tick between frames. This includes the Sweep & Mark pass, as well as its Collection pass.

Parameters

  • enabled bool: Specified if the Garbage Collector should tick between frames.

Returns

  • Nothing

Example

// IMPROPER USAGE - Disabling for a single loop inside of a single frame
function world_generate() {
    gc_enable(false);

    for (var i = 0; i < 100000; ++i) {
        var entity = {
            index: i
        };

        // ...
    }

    gc_enable(true);
}

// PROPER USAGE - Disabling for several frames before re-enabling
/// Object oLoadScreen : Create

isLoaded = false;

gc_enable(false);

/// Object oLoadScreen : Step
isLoaded = do_loading();

if (isLoaded) {
    gc_enable(true);
    room_goto_next();
}

/// Object oLoadScreen : Draw
draw_self();

gc_is_enabled()

Returns if the Garbage Collector is enabled or not.

Parameters

  • None

Returns

  • bool: True if enabled, false if disabled.

Example

function do_expensive_operation() {
    var isEnabled = gc_is_enabled();
    gc_enable(false);

    // ...
    
    gc_enable(isEnabled);
}

gc_collect()

Ticks the Garbage Collector, regardless of if it is enabled or not.

Parameters

  • None

Returns

  • Nothing

Example

// IMPROPER USAGE - Collecting prematurely
function world_generate() {
   var list = [];

   for (var i = 0; i < 100000; ++i) {
       var entity = {
           index: i
       };

       list[array_length(list)] = entity;
   }

   // ...

   // BAD! Local variable list still holds references to 100, 000 entity struct instances
   gc_collect();
}

world_generate();

// GOOD 
function world_generate() {
   var list = [];

   for (var i = 0; i < 100000; ++i) {
       var entity = {
           index: i
       };

       list[array_length(list)] = entity;
   }

   // ...
}

// PROPER USAGE - Local variable list released, all ref counts are 0
world_generate();
gc_collect();
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment