TH16 can crash when a large number of enemies and bullets/lasers are destroyed, ruining some score runs. Notably, TH17 1.00a can also crash, but the bug was finally fixed in 1.00b.
Motivated by wanting to understand this crash and perhaps repair it for TH16, I made my best effort to look into the changes between TH17 1.00a and 1.00b. I have not yet figured out the true cause of the crash, but my efforts to discover the cause have forced me to understand even more about how ANMs work.
(Edit 2020/10/07: The reason for the crash is now known!)
Thanks to this, I have now mapped out a significant portion of AnmManager
, the single largest struct ever allocated by the game.
We'll start with TH16, as this is the code I'm most familiar with.
// TH 16
struct AnmManager __packed
{
/* 0x0000000 */ char __UNKNOWN[216];
// During AnmManager's on_tick functions, this counts the number of ANMs that have been
// encountered so far. I don't think it's used. I only mention it because it's touched
// by code that's otherwise pretty important.
/* 0x00000d8 */ int32_t useless_count;
// These are used by AnmManager's on_tick handlers.
/* 0x00000dc */ struct ZunListNode<AnmVm>* world_list_head;
/* 0x00000e0 */ struct ZunListNode<AnmVm>* world_list_tail;
/* 0x00000e4 */ struct ZunListNode<AnmVm>* ui_list_head;
/* 0x00000e8 */ struct ZunListNode<AnmVm>* ui_list_tail;
// This is discussed soon.
/* 0x00000ec */ struct AnmFastVm fast_array[8191]; // 8191 = 0x1fff
// Leftovers from LoLK.
// When PointDevice hits a checkpoint, all anm vms meaningful to gameplay are replicated into
// a backup set of vms, so they can be reloaded and/or saved to an autosave file.
/* 0x0c27ad8 */ int32_t __lolk_next_snapshot_fast_id;
/* 0x0c27adc */ char __UNKNOWN[4];
/* 0x0c27ae0 */ AnmVmList __lolk_vm_snapshot_list_head;
/* 0x0c27af0 */ struct AnmFastVm __lolk_fast_snapshot_array[8191];
/* 0x184f4dc */ struct ZunListNode<AnmFastVm> freelist_head;
/* 0x184f4ec */ int32_t __UNKNOWN_OR_UNUSED;
// An array of pointers to structures representing parsed .anm files.
// Everything has a special place in here; e.g. front.anm goes in index 05,
// and the current player character's anm goes in index 09.
/* 0x184f4f0 */ struct AnmLoaded* preloaded[31]; // parsed anm files
// Most of this stuff seems to only be used in low level rendering code.
// I doubt that any of it has changed meaningfully from TH16 to TH17.
/* 0x184f56c */ struct D3DMatrix __matrix_184f56c__used_during_rendering;
/* 0x184f5ac */ struct AnmVm __vm_184f5ac; // unknown purpose
// Some fields that don't appear to have changed between TH16 and TH17.
// They're used during rendering and mostly consist of two large D3D vertex buffers.
// (actually, I have mapped this out, at least for TH17; see addendum)
/* 0x184fba8 */ char __UNKNOWN[0x420088];
// These are used by AnmManager's on_draw callbacks prior to TH17 1.00b.
// Each one is only used for a single, 32-bit field: 'next_in_layer' at offset 0x5a8
/* 0x1c6fc30 */ struct AnmVm layer_list_dummy_heads[43];
/* 0x1c7fd84 */ int last_id_discriminator; // 19 bit integer, discussed soon
/* 0x1c7fd88 */ char __UNKNOWN[8];
}; /* 0x1c7fd90 total size */
// a very common type
template <typename T>
struct ZunListNode {
T* entry;
struct ZunListNode<T>* next;
struct ZunListNode<T>* prev;
// sometimes this is copied between list nodes. All I can say is that
// the ZunListNode type is used literally *everywhere* and I've never once
// seen this field used for anything meaningful, so don't worry about it.
struct ZunListNode<T>* __seldom_used;
}
In TH17 1.00a, AnmManager
has a single difference from TH16, which is the addition of an AnmVm
at offset 0xd8
(right before the on_tick
lists). Its purpose is unknown, and it was all zeros when I looked at it.
In TH17 1.00b, AnmManager
is 0x106c4
bytes larger than in TH17 1.00a. The breakdown is as follows:
0x100ac
bytes come from the addition of a 4 byte field at offsetAnmVm+0x5a8
to all ANM VMs. To my knowledge, this field is always zero. It may have been part of a failed attempt to fix the crash present in TH16 and TH17 1.00a.- The leftover
AnmFastVm
array from LoLK has been removed. - "Fast ids" (more on this soon) are now one bit larger, causing
fast_array
to grow to slightly more than double its size. Together with the above bullet, this leads to a net addition of a singleAnmFastVm
, which accounts for the remaining0x618
bytes.
// TH 17 1.00b
struct AnmManager __packed
{
/* 00000000 */ char __UNKNOWN[216];
// This is even more useless now. With the changes in 1.00b, it is incremented on draw,
// and will ultimately count the product of (num_layers * num_vms). THE MORE YOU KNOW
/* 000000d8 */ int32_t useless_counter;
/* 000000dc */ struct AnmVm __anm_vm_dc; // present in 1.00a, unknown purpose
// Beginning in 1.00b, these are used by BOTH the on_tick and on_draw callbacks.
/* 000006dc */ struct ZunListNode<AnmVm>* world_list_head;
/* 000006e0 */ struct ZunListNode<AnmVm>* world_list_tail;
/* 000006e4 */ struct ZunListNode<AnmVm>* ui_list_head;
/* 000006e8 */ struct ZunListNode<AnmVm>* ui_list_tail;
// Bigger now, as mentioned above.
/* 000006ec */ struct AnmFastVm fast_array[16383]; // 16383 = 0x3fff
// Most likely, the other leftover LoLK fields are still here; but it's impossible to tell,
// since any code that would have used them was stripped out.
/* 018600d4 */ char __UNKNOWN[24];
// blah blah, unchanged
/* 018601ec */ struct ZunList<AnmFastVm> freelist_head; // only used for its 'next' field
/* 018608fc */ int __UNKNOWN_OR_UNUSED;
/* 81860100 */ struct zAnmLoaded* preloaded[31];
/* 0186017c */ struct D3DMatrix __matrix_186017c__used_during_rendering;
/* 018601bc */ struct AnmVm __vm_18601bc;
/* 018607bc */ char __UNKNOWN[0x420088];
// these are unused now
/* 01c80844 */ struct AnmVm layer_list_dummy_heads[43];
/* 01c90a44 */ int32_t last_id_discriminator; // now 18 bits
/* 01c90a48 */ char __UNKNOWN[8];
}; /* 01c90a50 total size */
(the following section applies equally to TH16 and TH17)
On AnmManager
, there is a giant array of 8191 preallocated VMs (16383 in TH17 1.00b) that I have named fast_array
. These arrays hold structures that are just slightly larger than a VM:
struct AnmFastVm {
/* TH16 TH17 */
/* 0x000 0x000 */ struct AnmVm vm;
/* 0x5fc 0x600 */ struct ZunListNode<AnmFastVm> freelist_node;
/* 0x600 0x610 */ int32_t is_alive;
/* 0x610 0x614 */ int32_t fast_id; // index of this AnmFastVm in fast_array
} /* 0x614 0x618 total size*/
Now, this is not just an optimization to avoid some malloc
s and free
s! Crucially, this preallocated array exists to optimize lookups of ANM VMs from their IDs; many game entities store ANM IDs instead of pointers to prevent usage after free when an Anm VM runs to completion and deletes itself, and this array allows these checks to be performed with O(1) memory lookups for the vast majority of gameplay.
Here's how it works:
- ANM IDs are split into two parts: the fast id and the discriminator.
- The lower 13 bits (14 bits in TH17 1.00b) are the fast id, copied from
AnmFastVm.fast_id
. As long as this value is in the range0 <= fast_id <= 0x1ffe
(0x3ffe
in TH17 1.00b), the game knows exactly where the VM is (infast_array
), and can further know for a fact that the memory has not been freed. (so it is safe to double-check itsid
field) - The next 19 bits (18 bits in TH17 1.00b) are the discriminator. This is a nonzero integer that increases (with wraparound) every time a VM is created, and its purpose is to guarantee that the ID field of a preallocated VM will change if it is destroyed and then reused by something else. This is not just here for doomsday scenarios; after all, the (cache-efficient) FIFO nature of the freelist means that the most recently deleted VMs are the ones most likely to be reused by something else!
- The lower 13 bits (14 bits in TH17 1.00b) are the fast id, copied from
Normally, the freelist provides instant access to the next unused AnmFastId
, making "allocation" fast as well. But, if somehow all 0x1fff
(or 0x3fff
) of these VMs manage to get used and the game needs to create even more, they will use the slow path: When the freelist is empty, new VMs will be individually malloc
ed on the spot (as AnmVm
s, not AnmFastVm
s) and given a fast_id
of 0x1fff
(0x3fff
in TH17 1.00b). When the game needs to look up such a VM from its ID, it will search all living VMs by scanning the entire world_list
and ui_list
.
There is another change in TH17 1.00b.
- As detailed here,
AnmManager
has twoon_tick
callbacks (one forworld_list
and one forui_list
) and a ton ofon_draw
callbacks (one for every layer). - Prior to TH17 v1.00b, each
on_tick
function was responsible for repopulating some of the lists stored for each layer atAnmManager.layer_list_dummy_heads
. Then, theon_draw
callbacks would each simply iterate over one of those lists. - As of TH17 v1.00b, those lists are no longer populated or used. Every single
on_draw
callback now iterates over both theon_world
andon_tick
lists, rendering all of the VMs that have the desired layer.- Note it performs the same layer mappings as mentioned in the above linked document when checking the
layer
of each VM (though this time it doesn't update thelayer
field in-place). This means that, for instance, if a VM in theworld_list
has layer 36 (which is normally a UI layer), it will be rendered by theon_draw
function for layer 24, and not by the one for layer 36. - (In other words, everything is still rendered at the same priority as before the patch; when the
on_draw
func for a world layer iterates over theui
list, simply none of the VMs will be found to have a matching layer. Theon_draw
funcs just search both lists out of laziness basically)
- Note it performs the same layer mappings as mentioned in the above linked document when checking the
Working together with priw8 and sp0thss, I am now fairly confident that the cause for this crash is understood: As described in my other gist, AnmManager
's on_tick functions build the layer lists for drawing. More specifically, each VM is ticked, and one of two things happens based on the return value of the ANM parser:
- If the parser returns 0, the VM is prepended to its layer list.
- Otherwise, the VM and all of its children are destroyed (destructor is called, memory is potentially freed).
So basically, the scenario is as follows:
- A VM that is a child runs during
on_tick
and gets added to its layer list. - Later in this same function on the same frame, the parent runs and is deleted, which also deletes the child.
- During
on_draw
, the (now potentially deallocated) child is still in the layer list, and game segfaults due to a use-after-free.
Specifically, the problematic scripts appear to be not the bullet cancel script as you might imagine, but rather, the enemy death scripts! Many of these death scripts have children that outlive parents, and if you modify them so that the parent lives longer, then the crashes stop. (there may be other problematic scripts too that we don't know about)
Why does it only show up when there are many, many bullets being cancelled? I believe this is because it heavily stresses the slow path. Basically, even though fast-path VMs would have their destructors called in step 2, accessing them doesn't produce a memory access exception because, as far as the system is concerned, they're still allocated. And even once we hit the fast path limit, there may additionally be some effects such as some slow path VMs sharing memory pages with other allocations, allowing the memory to still be used after the VM is freed.
Why do the changes in 1.00b fix the crash? Basically, the layer lists are singly-linked, while the world/ui lists are doubly-linked. The destructor for VMs removes a VM from the world/ui list, but not from the layer list (because you can't remove from a singly-linked list!). Therefore, in step 3 of the above scenario, v1.00b does not encounter the deallocated child VM because it's using the world/ui lists.
It seems to me that what ZUN should have done to fix the issue is to simply change the layer lists to be doubly-linked so that they can be removed. I believe the current solution in 1.00b has some pretty bad implications for performance, but there's not much we can do about it now.
There's an additional bug (at least in TH15)! If you quit the game with many sprites on the screen, then when vms are being all destroyed, if a parent is destroyed before the first child in its child list, then the game will crash when it destroys the child, as the child tries to write to a linked list node on the parent when removing itself from its list of siblings.
The bug here would appear to be that ZUN forgot to have parent vms remove themselves from these lists. And once again, the crash doesn't normally happen because there usually isn't enough vms for memory deallocation to occur.
I imagine this bug still exists in TH17 v1.00b, and will be easy to run into once the bullet_cap
patch supports this game.