@apple1417
Last active May 10, 2023 23:49
Crash Course in Unreal Engine Introspection

Intro

Unreal objects are highly introspective. Knowing only a few offsets on base classes, you can follow this introspection and convert raw property names into the actual offsets needed to reconstruct a pointer.

This guide assumes the following:

  • You have a working dumper, which gives you object names and their addresses
  • You have a pointer/signature to GNames - any tool which can give you object names will have found it, you just need one which actually exposes it too
  • You have a pointer to some static base object such as GEngine or GWorld, which you can use as a starting point to get to the fields you want - just find them in your dumps then pointer scan
  • The base fields you need to know have completely unknown offsets, which you need to reverse engineer too. This isn't ever really the case, especially if you have the source code of your dumper, but assuming it is will help for when one does move.

Names

The first concept you need to understand is names. Names are strings which are expected to hold mostly constant data. All names are stored in a big string table, GNames, and name fields on objects actually just hold an index into it, allowing for very efficient compares. You may see them referred to as FNames in Unreal documentation, or name fields in UnrealScript.

struct FName {
    int32_t index;
    int32_t number;
};

Given its simplicity, this struct is very unlikely to change between UE versions.

If you autoguess offset types in Cheat Engine, names will generally appear as a roughly 5-digit hex value (the index), followed by a decimal value which is typically (but not always) zero (the number).
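If you've copied the raw bytes of a suspected name field out of a memory viewer, a minimal sketch like this can decode them. It assumes the two-int32 layout above and little-endian memory; parse_fname and the example bytes are made up for illustration.

```python
import struct
from typing import NamedTuple


class FName(NamedTuple):
    index: int
    number: int


def parse_fname(raw: bytes) -> FName:
    """Decode an FName from 8 raw bytes: two little-endian int32s."""
    index, number = struct.unpack_from("<ii", raw)
    return FName(index, number)


# Hypothetical bytes, as you might copy them out of a memory viewer.
example = parse_fname(bytes.fromhex("3a2b0100" "00000000"))
print(hex(example.index), example.number)
```

A 5-digit hex index with a number of zero, like this one, is the pattern to look for.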

So how do you convert names back into their actual string? This is what you need that GNames pointer for. The structure of this has changed several times.

UE3

template <typename T>
struct TArray {
    T* data;
    int32_t size;
    int32_t max;
};

struct FNameEntry {
    bool is_wide : 1;
    int32_t index : 31;

    union {
        char ansi[];
        wchar_t wide[];
    };
};

TArray<FNameEntry*> GNames;

For clarity, if you read this initial is_wide/index bitfield as a single int32, is_wide is always bit 0, and index is always bits 1-31. The actual Unreal code handles it a little differently to ensure this is always the case, regardless of compiler. In practice, you can get away with assuming all strings are wide, so you can mostly ignore this.
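If you do want to split that int32 yourself, a small sketch of the bit layout just described might look like this (split_entry_header is a name I've made up):

```python
from typing import Tuple


def split_entry_header(header: int) -> Tuple[bool, int]:
    """Split the first int32 of an FNameEntry into (is_wide, index)."""
    header &= 0xFFFFFFFF            # treat a signed read as unsigned
    is_wide = bool(header & 1)      # bit 0
    index = header >> 1             # bits 1-31
    return is_wide, index
```

The index you get back should match the index you used to look the entry up, which makes for a handy sanity check.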

You should notice all the FNameEntrys are allocated in a single block, so all the pointers should be quite close to each other. All the strings are of their minimal size (with a null terminator), so each entry starts immediately after the last (allowing for 4-byte alignment). You will probably find there's some extra padding in the actual structs, but it should be quite easy to pick out where the strings actually begin.

To access an entry, simply index the TArray - *GNames.data[idx].
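As a rough sketch of that lookup, here it is run against a tiny fake block of "process memory". The 64-bit pointer size and STRING_OFFSET are assumptions (you'd need to work out the real padding for your game), and the helper names are my own.

```python
import struct

POINTER_SIZE = 8       # assumption: 64-bit process
STRING_OFFSET = 0x10   # assumption: where the string starts inside FNameEntry


def read_ptr(mem, addr):
    return struct.unpack_from("<Q", mem, addr)[0]


def read_i32(mem, addr):
    return struct.unpack_from("<i", mem, addr)[0]


def read_cstring(mem, addr, wide):
    """Read a null-terminated string, 1 byte per char (ANSI) or 2 (UTF-16)."""
    width = 2 if wide else 1
    end = addr
    while mem[end:end + width] != b"\0" * width:
        end += width
    return mem[addr:end].decode("utf-16-le" if wide else "ascii")


def get_name_ue3(mem, gnames_addr, idx):
    data = read_ptr(mem, gnames_addr)                  # TArray.data
    size = read_i32(mem, gnames_addr + POINTER_SIZE)   # TArray.size
    if not 0 <= idx < size:
        raise IndexError(f"name index {idx} out of range")
    entry = read_ptr(mem, data + idx * POINTER_SIZE)   # GNames.data[idx]
    is_wide = bool(read_i32(mem, entry) & 1)           # bit 0 of the bitfield
    return read_cstring(mem, entry + STRING_OFFSET, is_wide)


# Fake memory holding two entries: "None" (ANSI) and "Hi" (wide).
mem = bytearray(0x100)
struct.pack_into("<Qii", mem, 0x00, 0x20, 2, 2)   # GNames TArray at 0x00
struct.pack_into("<QQ", mem, 0x20, 0x40, 0x60)    # pointer array at 0x20
struct.pack_into("<i", mem, 0x40, 0 << 1 | 0)     # entry 0: index 0, ANSI
mem[0x50:0x54] = b"None"
struct.pack_into("<i", mem, 0x60, 1 << 1 | 1)     # entry 1: index 1, wide
mem[0x70:0x74] = "Hi".encode("utf-16-le")
```

In a real tool, mem would be replaced by whatever you use to read the game's memory.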

UE4

template <typename ElementType, int32 MaxTotalElements, int32 ElementsPerChunk>
struct TStaticIndirectArrayThreadSafeRead {
    ElementType** objects[(MaxTotalElements + ElementsPerChunk - 1) / ElementsPerChunk];
    int32_t count;
    int32_t chunk_count;
};

struct FNameEntry {
    bool is_wide : 1;
    int32_t index : 31;
    FNameEntry* next;

    union {
        char ansi[];
        wchar_t wide[];
    };
};

TStaticIndirectArrayThreadSafeRead<FNameEntry, 0x400000, 0x4000> GNames;

This may look a little complex, but simplifying it down, objects is just a 0x100-element array, of pointers to 0x4000 element arrays, of pointers to FNameEntrys. You can index through these using *GNames.objects[idx / 0x4000][idx % 0x4000].
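The index arithmetic can be sketched like so, assuming 64-bit pointers; the function names are hypothetical.

```python
from typing import Tuple

ELEMENTS_PER_CHUNK = 0x4000
POINTER_SIZE = 8  # assumption: 64-bit process


def split_name_index(idx: int) -> Tuple[int, int]:
    """Return (chunk number, position within that chunk) for a name index."""
    return divmod(idx, ELEMENTS_PER_CHUNK)


def chunk_pointer_address(gnames_addr: int, idx: int) -> int:
    """Address holding the pointer to the chunk that contains idx."""
    chunk, _ = split_name_index(idx)
    return gnames_addr + chunk * POINTER_SIZE


def entry_pointer_address(chunk_addr: int, idx: int) -> int:
    """Address holding the FNameEntry pointer, given the chunk's address."""
    _, within = split_name_index(idx)
    return chunk_addr + within * POINTER_SIZE
```

So you read the pointer at chunk_pointer_address, pass the result in as chunk_addr, then read the pointer at entry_pointer_address to arrive at the FNameEntry.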

The FNameEntrys are basically the same as in the previous version.

UE4.23 - FNamePool

Note I have no direct experience with this version, so this section is a bit more shaky than the others.

struct FName {
    uint16_t name_offset;
    uint16_t chunk_offset;
    int32_t number;
};

struct FNamePool {
    uint8_t padding[0x10];
    uint16_t* data[];
};

struct name_metadata {
    uint16_t : 6;
    uint16_t size : 10;
};

FNamePool GNames;

So this version changed things up a lot - so much that structs don't really explain it that well.

The first big change is that the FName structs themselves don't really store a single int32 index anymore. In fact, they do away with indexes altogether, and just store offsets. The lowest 16 bits are the name offset, and the next 16 bits are the chunk offset. Presumably precalculating these makes lookups more efficient.

Now you'll notice I never actually defined FNameEntry for this version. It isn't really needed anymore. Each chunk is made up entirely of tightly packed strings. Rather than using a null terminator as a separator, the strings have a leading 2-byte metadata value separating them. You could think of this as a struct of a uint16 and a (w)char array, but that may lead to misleading assumptions about indexing.

To look up an entry, you just follow the offsets - GNames.data[chunk_offset][name_offset]. Note that unlike before, the two index operations are now looking through two different data types: the chunk offset is indexing through 4/8-byte values (pointers), while the name offset is only indexing through 2-byte values. The address you arrive at is the metadata value for the following name. Most notably for us, the uppermost 10 bits of the metadata contain the size of the name. Presumably one of the other bits is is_wide, I don't know which. Read the metadata value, work out the size, then increment the name offset and read that many more characters.
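A minimal sketch of that lookup, again over a fake block of memory. It assumes 64-bit pointers and that the name is stored with 1-byte characters (I'm ignoring is_wide here); read_pool_name is a made-up helper.

```python
import struct

POOL_HEADER_SIZE = 0x10   # the padding before the chunk pointers, per the struct above
POINTER_SIZE = 8          # assumption: 64-bit process


def read_pool_name(mem, gnames_addr, chunk_offset, name_offset):
    chunk_ptr_addr = gnames_addr + POOL_HEADER_SIZE + chunk_offset * POINTER_SIZE
    chunk = struct.unpack_from("<Q", mem, chunk_ptr_addr)[0]
    entry = chunk + name_offset * 2                  # name offsets count 2-byte units
    metadata = struct.unpack_from("<H", mem, entry)[0]
    size = metadata >> 6                             # uppermost 10 bits
    raw = mem[entry + 2:entry + 2 + size]            # assumes 1-byte characters
    return raw.decode("ascii")


# Fake pool with one chunk at 0x40 holding "None" then "ByteProperty".
mem = bytearray(0x80)
struct.pack_into("<Q", mem, 0x10, 0x40)          # data[0] -> chunk at 0x40
struct.pack_into("<H", mem, 0x40, 4 << 6)        # metadata: size 4
mem[0x42:0x46] = b"None"
struct.pack_into("<H", mem, 0x46, 12 << 6)       # metadata: size 12
mem[0x48:0x54] = b"ByteProperty"
```

Note the second name sits 6 bytes into the chunk, so its name offset is 3, not 6.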

Using GNames

So with all this, you should be able to convert a name index into its string. The first few entries are hardcoded, if you want to test a few:

  • 0 -> None
  • 1 -> ByteProperty
  • 2 -> IntProperty

In games using the FNamePool version, the indexes won't actually be 0-2, but those strings should still be the first entries.

There are three other important things to know about GNames:

  • Names can be added at runtime, meaning some later strings may change their index based on what order things were loaded.
  • Once a name is in GNames, its index will never change - it's fine to cache them (and I'd recommend it if you're running in a separate process).
  • If you're in a separate process, there's no efficient way to get the index of a name from its string. If you're injected into the game process, you can find and call FName::Init.
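Since indexes never change once assigned, caching can be as simple as a dictionary in front of whatever slow read you're doing. A sketch, where NameCache and slow_read are hypothetical stand-ins:

```python
from typing import Callable, Dict, List


class NameCache:
    """Remembers every index -> string lookup, since indexes never change."""

    def __init__(self, read_name: Callable[[int], str]) -> None:
        self._read_name = read_name
        self._names: Dict[int, str] = {}

    def get(self, index: int) -> str:
        if index not in self._names:
            self._names[index] = self._read_name(index)
        return self._names[index]


# Stand-in for a slow cross-process read, recording how often it is called.
reads: List[int] = []


def slow_read(index: int) -> str:
    reads.append(index)
    return ["None", "ByteProperty", "IntProperty"][index]


cache = NameCache(slow_read)
first = cache.get(1)
second = cache.get(1)   # served from the cache, no second read
```

The cache is only valid for a single run of the game, since later names may land on different indexes next time.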

There's still one more thing about names you may need to know about: the number. If you scroll through the object dumps, you'll notice a lot of objects named something like PlayerController_12, most commonly on very temporary objects. Since it'd be very inefficient to allocate a new name for every temporary object, the number field is used to add an underscore and a numeric suffix. If it's 0, there's no suffix, otherwise it's the number minus one.

name_string = GNames.get(name.index)
if name.number != 0:
    name_string += "_" + str(name.number - 1)

Typically, object fields will not use numeric suffixes, so you may be able to get away with ignoring them.

Object Structure

Now let's go over the data structures that let us actually do the introspection. It's easiest to start by just looking at the structs.

class UObject {
    FName name;
    UObject* outer;
    UClass* class_;
};

class UField : public UObject {
    UField* next;
};

class UStruct : public UField {
    UStruct* super_field;
    UField* children;
    UProperty* property_link;
};

class UClass : public UStruct {};

class UProperty : public UField {
    int32_t element_size;
    int32_t offset_internal;
    UProperty* property_link_next;
};

Note that these are only the fields you might be interested in, these structs are definitely very unlike what you'll actually find laid out in memory.

Your Unreal object dumper will show you all subclasses of UObject, so you can recover most of the actual offsets just by autoguessing fields in Cheat Engine, looking for pointers, then seeing what those addresses correspond to in your dumps. More on what exactly you'd expect to see later.

As previously mentioned, FNames tend to show up as a roughly 5-digit hex value, generally followed by decimal 0. UObject.name will appear quite close to the start of the object, typically right next to one of the other listed pointers. You can then look up the name in GNames and compare against what your dumper tells you to confirm that you've found the right offset.

The fields on UProperty are a little trickier. element_size typically appears near the start of the property, near the property flags, which should appear as a hex int32 bitmap. You can try looking up properties of known sizes to confirm - an IntProperty will always be 4 bytes and a NameProperty will always be 8. offset_internal typically shows up a little later in the object, nearer to property_link_next, and will typically be a roughly 3-digit hex value. If you've already pointer scanned an offset, use that to confirm you're reading the right thing, otherwise confirming it is mostly just a matter of trying it and seeing if it makes sense.

Now once you've worked out all the fields, there are a number of useful linked lists you can follow:

  • UObject.outer.outer - Outer objects. These don't actually affect anything, but they're used to get the full object "path name", which is handy for debugging. The exact path logic is as follows:

    def get_path_name(obj: UObject) -> str:
        if obj.outer is None:
            return obj.name
        separator = "."
        if obj.outer.class_.name != "Package" and obj.outer.outer.class_.name == "Package":
            separator = ":"
        return get_path_name(obj.outer) + separator + obj.name

    The separator logic of course isn't particularly important, so you can choose to ignore it.

  • UStruct.super_field.super_field - Chain from most to least derived struct. Most common for classes, ending in the UObject class, but sometimes structs have inheritance too (e.g. Plane2D inherits Vector2D).

  • UStruct.property_link.property_link_next.property_link_next - All properties on a struct, including on the less derived structs. Properties are the actual values stored in memory after the start of the struct, these are what you want to get the offsets of.

  • UStruct.children.next.next - All fields on this struct in particular. Fields include properties, but also other things such as inner structs, functions, enums, and constants. This chain does not include the less derived structs, so you probably need to combine it with the super field one.

So to iterate through properties you have two choices - property_link, or super_field + children. I would recommend the former, as it tends to be quicker, but you may find the latter easier to find (especially since the offsets are smaller) or to work with.
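If you go the super_field + children route, the walk might look something like this sketch. The dataclasses are stand-ins for values you'd read out of memory, and telling properties apart by a class name ending in "Property" is my own assumption.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Field:
    name: str
    class_name: str                      # stands in for field.class_.name
    next: Optional["Field"] = None


@dataclass
class Struct:
    name: str
    super_field: Optional["Struct"] = None
    children: Optional[Field] = None


def iter_properties(start: Optional[Struct]) -> Iterator[Field]:
    current = start
    while current is not None:           # most to least derived struct
        field = current.children
        while field is not None:         # fields declared on this struct only
            if field.class_name.endswith("Property"):
                yield field              # skip functions, enums, etc.
            field = field.next
        current = current.super_field


# Hypothetical two-level hierarchy for illustration.
base = Struct("Object", children=Field("Outer", "ObjectProperty"))
engine = Struct(
    "Engine",
    super_field=base,
    children=Field("GameInstance", "ObjectProperty", Field("Tick", "Function")),
)
names = [f.name for f in iter_properties(engine)]
```

Here names ends up holding the properties of the most derived struct first, with the function filtered out.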

Now finally, how do you find the specific property you're interested in? Just compare against the name. The name field on the property object is the name of the property. So to bring it all together, let's say you have a pointer to GEngine, and you want to find the GameInstance field:

prop = get_gengine().class_.property_link
while prop:
    if prop.name == "GameInstance":
        break
    prop = prop.property_link_next

Of course you'll need to add some error checking/retrying on failure.
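One possible shape for that error checking, sketched against mock objects rather than real memory. The Property dataclass, the iteration cap, and the offsets in the example chain are all made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Property:
    name: str
    offset_internal: int
    property_link_next: Optional["Property"] = None


def find_property_offset(property_link: Optional[Property], wanted: str) -> int:
    prop = property_link
    seen = 0
    while prop is not None:
        if prop.name == wanted:
            return prop.offset_internal
        prop = prop.property_link_next
        seen += 1
        if seen > 100_000:   # a bad read can turn the list into a loop
            raise RuntimeError("property list looks corrupt")
    raise KeyError(wanted)   # reached the end without finding it


# Hypothetical chain and offsets, purely for illustration.
chain = Property(
    "TinyFont", 0x30,
    Property("GameInstance", 0xD28, Property("GameViewport", 0x780)),
)
```

A KeyError right after launch may just mean the class hasn't loaded yet, so retrying a little later is often reasonable.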

Interpreting Properties

So you can get from an object to the property objects you're interested in. What next?

As you may have already guessed, UProperty.offset_internal is the offset from the start of the object to that property's data. For a lot of property types, the data stored here is pretty self evident - a DoubleProperty stores a double at that offset, a NameProperty stores an FName, etc. There are four main special cases you probably want to know.

ObjectProperty

Object properties might seem like another one of those simple types - the value at the offset is a pointer to another object. However, object properties have an extra useful field.

class UObjectProperty : public UProperty {
    UClass* property_class;
};

This field, unsurprisingly, holds the class which this property accepts. In practice, it lets you jump directly to the next class, to start working out offsets on it, without having to wait for any objects to be constructed.

There is one caveat to this - it only holds the least derived class the property accepts. Sometimes you want to get one of the more derived classes instead. One common example of this is localplayer.PlayerController - this field holds an instance of a PlayerController, but you probably want to read fields off of a game specific subclass of PlayerController instead.

You have two real options here:

  • Find another pointer path which is restricted to the more derived class. This is preferred, but not always possible.
  • Wait for the property to get populated with an instance of the more derived class, then read the class off of it.

You could also try iterating through GObjects, the global array of all Unreal objects, looking for the specific class object, but this can easily involve tens of millions of string comparisons. If you're injected into the game process, you could try to find and call StaticFindObject to optimize a little, but it's still probably worse than one of the other two options.

StructProperty

Struct properties consist of a blob of data holding the struct contents. How do you parse it? You have to start by reading an extra field off of the property again.

class UScriptStruct : public UStruct {};

class UStructProperty : public UProperty {
    UScriptStruct* struct_;
};

After reading the struct off of the property, you can then parse through its inner properties in the exact same way to get their offsets. Note that these offsets are relative to the start of the struct - the same struct can be used in a number of different places. You'll have to add the offset of the struct to the offset of its inner properties.
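The offset addition can be sketched as a walk down a path of property names. Prop and its struct_props dictionary are stand-ins for reading UStructProperty.struct_ and its property list; the offsets are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Prop:
    offset_internal: int
    # Stands in for the properties of UStructProperty.struct_, keyed by name.
    struct_props: Dict[str, "Prop"] = field(default_factory=dict)


def resolve_offset(props: Dict[str, Prop], path: List[str]) -> int:
    """Sum the offsets along a path such as ["Location", "Y"]."""
    total = 0
    for part in path:
        prop = props[part]
        total += prop.offset_internal   # relative to the enclosing struct
        props = prop.struct_props       # descend into the inner struct
    return total


# Hypothetical layout: a Vector struct embedded at 0x1A0 in some object.
vector = {"X": Prop(0x0), "Y": Prop(0x4), "Z": Prop(0x8)}
actor = {"Location": Prop(0x1A0, vector)}
```

With this layout, resolving ["Location", "Y"] gives 0x1A0 + 0x4.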

ArrayProperty

Array properties bring back a template we briefly mentioned during parsing GNames.

template <typename T>
struct TArray {
    T* data;
    int32_t size;
    int32_t max;
};

Given its simplicity, this struct is very unlikely to change between UE versions.

Note that you only need to care about max if you're writing to the array. You can otherwise ignore it if you're only reading from it, but it's handy to use to find these arrays while browsing memory, since it will always be >= size, and will generally be a power of two.
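When browsing memory, a rough plausibility check along these lines can help filter candidates. It assumes a 64-bit layout (8-byte pointer followed by two int32s), and it's only a heuristic of my own, so expect false positives.

```python
import struct


def looks_like_tarray(raw: bytes) -> bool:
    """Cheap plausibility check on 16 bytes that might be a TArray."""
    data, size, max_ = struct.unpack_from("<Qii", raw)
    if size < 0 or max_ < size:
        return False            # max is always >= size
    if max_ > 0 and data == 0:
        return False            # an allocated array needs a data pointer
    return True
```

I've left out the power-of-two check since that's only generally true, not always.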

The offset you read off of the array property will point to the start of a TArray. But we still need to know the templated type: how do we know where to look for each element? We need to read another value off the property again.

class UArrayProperty : public UProperty {
    UProperty* inner;
};

This time, we're not interested in the offset on the property (it should be 0), instead we want to read the element_size field. The ith element of an array will be at offset i * element_size within the data.

It's common for structs to be nested inside arrays (and vice versa). In this case, remember that each struct is a separate array element, so a field on the struct is at offset (i * element_size) + struct_offset within the data.
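That arithmetic as a small sketch, with some basic sanity checks thrown in (element_field_offset is a hypothetical helper):

```python
def element_field_offset(index: int, element_size: int, struct_offset: int = 0) -> int:
    """Offset within TArray.data of a field on the index-th element."""
    if index < 0:
        raise IndexError("negative array index")
    if not 0 <= struct_offset < element_size:
        raise ValueError("field offset falls outside the element")
    return index * element_size + struct_offset
```

Add the result to the data pointer you read from the TArray to get the final address. You'd also want to check the index against size before reading.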

StrProperty

String properties hold an arbitrary, generally user provided string. They are essentially just a specialization of an ArrayProperty (though don't actually inherit from it). They don't have any special fields on their class you need to be aware of.

class UStrProperty : public UProperty {};

using FString = TArray<wchar_t>;

These strings should be null terminated, so you can get away with ignoring the size field.
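Once you've read some bytes from the data pointer, decoding them might look like this sketch. It assumes a 2-byte wchar_t (as on Windows), and decode_fstring is a name I've made up.

```python
def decode_fstring(buf: bytes) -> str:
    """Decode bytes read from FString.data, stopping at the first null."""
    usable = len(buf) - (len(buf) % 2)    # drop a dangling odd byte
    text = buf[:usable].decode("utf-16-le", errors="replace")
    return text.split("\0", 1)[0]
```

If you do read the size field, it tells you exactly how many characters to fetch, which saves you from guessing a buffer length.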
