@Dreaming381
Last active June 25, 2024 07:34
Continuous Feedback for the ECS Team

Latios Framework Unity ECS Wishlist

Last Updated: 2024-6-25

As the original author and primary developer of the Latios Framework for Unity’s ECS, I regularly run into bugs, inadequate functionalities, and pitfalls within the ECS ecosystem. This has resulted in lots of “hacks” in the framework to patch up problems. This is a living document describing the issues and hacks. My hope is that the Unity ECS team will find this to be a valuable reference to help improve the quality of their packages.

This document is organized into four categories, ordered from greatest to least severity: Ship Stoppers, Feature Blockers and Hindrances, Hacks and Ugliness, and Annoyances.

This document does not include all possible features the Latios Framework may eventually implement if no official solution is provided. If you would like to learn about such features, it is best to ask me.

Ship Stoppers

These are high-severity items that are preventing the Latios Framework from functioning correctly, with no plausible workarounds.

As of Entities 1.3.0-exp.1, there are no immediate severe items. However, there are two features whose future removal has been hinted at. If they are removed without an alternative, they will cause big problems for the Latios Framework, most likely resulting in a total fork or a move to another technology base.

ENTITY_STORE_V1

Entities 1.2 breaks determinism in a really bad way. Entity IDs are no longer deterministic, meaning the only way to order a list of entities in a deterministic way is to know both which chunks they belong to AND know the indices of those chunks relative to some EntityQuery.
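To make the ordering requirement concrete, here is a minimal sketch in Python. It is a language-neutral model, not Unity API: the `CollectedEntity` record, its field names, and the two sample "runs" are all hypothetical. It assumes the chunk layout is identical between runs while entity IDs and collection order are not, and shows that sorting by (chunk index, index in chunk) gives the same order in both runs, whereas sorting by entity ID does not.

```python
from typing import NamedTuple


class CollectedEntity(NamedTuple):
    entity_id: int       # hypothetical ID; assumed to differ from run to run
    chunk_index: int     # index of the chunk relative to one fixed query
    index_in_chunk: int  # slot of the entity inside that chunk


def deterministic_order(collected):
    """Order entities by where they live, not by their ID."""
    return sorted(collected, key=lambda e: (e.chunk_index, e.index_in_chunk))


def positions(ordered):
    """Strip the IDs so two runs can be compared by layout alone."""
    return [(e.chunk_index, e.index_in_chunk) for e in ordered]


# Two simulated runs: same layout, different IDs, different collection order.
run_a = [CollectedEntity(17, 1, 0), CollectedEntity(4, 0, 2), CollectedEntity(9, 0, 1)]
run_b = [CollectedEntity(3, 0, 1), CollectedEntity(88, 1, 0), CollectedEntity(51, 0, 2)]

same_by_position = positions(deterministic_order(run_a)) == positions(deterministic_order(run_b))
same_by_id = positions(sorted(run_a)) == positions(sorted(run_b))
```

The catch described in this section is that the position key is only available when the collecting code knows the chunk index and the index in chunk, which many parallel collection paths do not.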

Besides creating a bunch of confusion around whether chunk order determinism even matters (because now it is really hard to preserve) and what the point of the sortKey in ECB is, this new change introduces a major problem in the Latios Framework’s development, which is debugging.

Kinemation relies heavily on chunk components and caching of relationships. When a bug happens, it is crucial to be able to replay the simulation up to the bug to identify the source of the problem. Many of the algorithms Kinemation uses don’t have access to the chunk index and index in chunk for a list of entities collected in parallel. There is no way to order them deterministically in 1.2. That means that chunk order is not preserved, and whether or not two entities lie in the same chunk may change run to run. Thus, if the bug was dependent on two entities being in the same or different chunks, the bug is only reproducible by chance. That’s really, really bad.

Full determinism per architecture was one of Unity’s major competitive advantages over other solutions. And now it is being thrown away. I’m willing to compromise if it makes Game Object/ECS Unification amazing. But you’ll have to forgive me if I am a bit skeptical. I believe there may be other ways to solve the problem. What I ask is that the rules be well-defined regarding the expectations of determinism for maximum correctness that package developers should adhere to. And I ask that based on the rule defined, additional runtime and debugging tools be invested in to support the new rule and accommodate its shortcomings when it comes to chunk-level operations.

IAspect

The ECS learning curve is steep. And as a package developer aiming for maximum performance, I need some pretty complex data layouts, and I need to provide an intuitive way for users to interact with that data. IAspect addressed this problem. If it goes away, I need a replacement. I don’t care if I as the package author have to write a lot more code. But it has to be easy for the users, and fit into the existing ECS approaches everything else uses.

In general, I’m a bit frustrated with the inextensibility of codegen. IJobEntity can’t be taught new things to iterate, it requires the support of the ISystem source generator that rewrites the entire method, and consequently it is nearly impossible to add new SystemAPI-like extensions. But I will discuss more about these in the following sections.

If you need an example of the kind of complexity I am trying to abstract, my framework has a file called OptimizedSkeletonAspect.cs. Try to make sense of the buffer rotation mechanism that avoids massive buffer copies every frame.

Feature Blockers and Hindrances

These are items that prevent specific new features or optimizations of the Latios Framework from being developed, or that create unnecessary friction in developing them. They may also have undesirable effects on usage of the framework and reduce the quality of the overall solution.

Audio Editor Soft-lock

I’ve been encountering a strange error: when I call AudioClip.GetData() in a baker or baking system, then create a DSPGraph at runtime, and then trigger a domain reload from code changes, Unity ends up soft-locked. I will continue to investigate to see if I can remove something from the formula (I already know that creating a DSPGraph without calling AudioClip.GetData() avoids the soft-lock). But I would really appreciate this being looked into.

Linear Blend Skinning R32G32_FLOAT

I have encountered a situation in which the bone weights using shader graph’s Linear Blend Skinning node get bound with R32G32_FLOAT, and this causes the models to be scaled incorrectly. While I know Unity’s Entities Graphics has moved away from this node, I still support it because there are various scenarios where it outperforms the compute shader skinning alternative.

SystemAPI Extensibility

The fact that we can’t use SystemAPI in static methods makes it extremely difficult to build extensions and common patterns. For example, I have a static method Physics.BuildCollisionLayer() that needs to schedule 5 jobs in sequence. While there are several variants, one variant requires the first job to perform chunk iteration. Securing the type handles for that job is extremely problematic. The user has to manually cache and update a struct containing those handles, because this method can’t rely on SystemAPI. That’s a lot of unnecessary boilerplate pushed directly onto the user.

Codegen Accessibility

Source generators are an incredibly powerful tool. I used to complain about poor documentation, but I have seen some effort recently to address this.

However, not every problem is solved. Currently there’s no way to replace IAspect with a custom solution and have it work correctly with IJobEntity. There’s no way to access the IJobEntity EntityQuery. Additionally, there’s no way to recreate the allocation-free behavior of idiomatic foreach manually. And there’s no way to create our own aggregate type handles that can be automatically cached SystemAPI-style.

Subscene Imports

Subscene import workflows have significant usability issues. Because they occur in a separate Unity process, they do not use Burst, cannot easily be debugged, have limited reporting capabilities of memory leaks and the like, and many engine features are not well tested when accessed in this mode (it took 3 years for the audio crash bug to be fixed).

The Latios Framework pushes the boundaries of what can be baked, with new and exciting high-level features. But that only works when baking itself works, which has been a constant pain point.

Ending Writes to Graphics Buffers

A huge optimization I made with Kinemation’s renderer is writing to Graphics Buffers and dispatching compute shaders inside the culling loop instead of before it. This way I take culling results into account and do significantly less work. This applies to skinning, material properties, blend shapes, other mesh deformations, and whatever else. Only problem is that now I have to complete culling jobs inside the culling callbacks so that I can end GraphicsBuffer writes and dispatch the compute shaders. It would be awesome if BatchRendererGroup could get an additional callback when the jobs need to be completed so that I could do these compute shader dispatches as late as possible. SRP shenanigans are a big chunk of my frame time and the worker threads are starved.

Large Burst Jobs with Lots of Functions Difficult to Navigate

It would be awesome if I could see an outline of all functions a Burst job compiled and jump between them. Right now it is still difficult to understand what is happening in critical sections of code in massive jobs.

Can’t Add and Remove Components in a Single Structural Change

I have to move an entity, or more often an array of entities, twice if I have both a set of components to add and a set of components to remove.

Adding/Removing Tag Components Causes Fragmentation

If a zero-sized component is added or removed on all entities in the chunk, the chunk’s archetype is converted in-place, which is a great optimization. However, when there are only a couple of entities in the chunk because the source archetype represents a temporary state, this in-place conversion leaves lots of chunks with only a small number of entities each, causing fragmentation. It would be awesome if there were an additional check: if another chunk of the destination archetype can accommodate every entity in the existing chunk, move the entities to that chunk rather than performing the in-place conversion.
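The proposed check can be sketched as a small Python model. This is not how Entities implements structural changes: the `Chunk` class, the `change_tag_on_whole_chunk` function, and the assumption that source and destination chunks share the same capacity (plausible for zero-sized components, but still an assumption) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    archetype: frozenset  # set of component type names
    count: int            # entities currently stored
    capacity: int         # maximum entities the chunk can hold


def change_tag_on_whole_chunk(chunks, src_index, new_archetype):
    """Apply a tag add/remove to every entity in chunks[src_index].

    Returns "merged" if the entities were moved into an existing chunk of
    the destination archetype, or "converted" if no chunk had room and the
    source chunk's archetype was changed in place.
    """
    src = chunks[src_index]
    for dst in chunks:
        if dst is src or dst.archetype != new_archetype:
            continue
        if dst.capacity - dst.count >= src.count:
            dst.count += src.count   # move every entity into the roomier chunk
            del chunks[src_index]    # the source chunk is now empty and freed
            return "merged"
    src.archetype = new_archetype    # fall back to the in-place conversion
    return "converted"
```

With a nearly empty source chunk and a destination chunk that has room, the model merges instead of leaving two sparse chunks behind. When no destination chunk has room, it behaves like the current in-place conversion.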

Most of the time, I find myself using sub-optimal structural change sequences just to avoid this edge case.

Optional Containers in Jobs

Sometimes, I really want to specify that a container in a job is meant to be allocated in the job via Allocator.Temp, without having to use the nuclear attribute [NativeDisableContainerSafetyRestriction]. Other times, I might have a job that takes a variable number of DynamicComponentTypeHandles, and I never know what to populate the unused slots with. Again, disabling container safety is really bad because then if the user messes up job dependencies elsewhere, the issue may go unnoticed.

DynamicBuffer Allocations

I believe Dynamic Buffers should be allocated from a custom ECS-managed allocator and not using Allocator.Persistent. The allocations are often small and would be better suited with a pool. This is starting to become a performance problem for me during initialization due to all the consecutive small allocations in my ICleanupBufferElement types.

Inside-out Baking Struggles

There’s currently not a clean way to do the operation of “If this authoring instance is referenced in a list by any other authoring instance, add this runtime component”. You can only do this if you make the restriction that the other authoring instance with the list is an ancestor in the hierarchy. Incremental baking is hard, and I respect that this is not an easy problem to fix. But I will still bring it up here, since it is a limitation a lot of people run into.

Hacks and Ugliness

These are some of the other issues I have run into with the Latios Framework that required explicit workarounds that were far from ideal.

Bootstraps and Baking Customizations

We have ICustomBootstrap for setting up systems at runtime. Why can’t we do the same thing in the Editor? I ended up extending ECS to do that, but I do it by accessing some internal Action after the Editor World is created and then try to replace it. And then I also have a hack to rebuild the EditorWorld as a menu option when a buggy editor system goes haywire and the full Editor state is corrupted. Unfortunately, this hack isn’t bullet-proof and sometimes causes the wrong world to run an update or two, which then fails and throws errors in the console.

Then there’s baking. I have a custom Skinned Mesh Rendering solution. Why can’t I turn off the built-in Entities Graphics Skinned Mesh Renderer baking without turning off the entire baking of Entities Graphics? Once again, I hacked this by using a custom baker list mechanism that seemed to be created for tests. I do this at startup to create a custom bootstrap callback, and then for each baking world, I have a system in OnCreate assign a RateManager to one of the first ComponentSystemGroups baking uses, and then in that callback I disable systems I don’t want and then inject systems with the DisableAutoCreation attribute. Why do systems have that attribute? It is because those systems are for an optional feature that users may or may not want. Why do I use RateManager? It is the only way to ensure the already included ComponentSystemGroups have had their OnCreate() called when I inject the systems, because otherwise I can’t add systems to them.

And while we are on this topic, I would greatly appreciate a flag in UpdateBefore/After attributes to suppress warnings about the systems being in the wrong groups. Such systems might just not be installed at all. A user may have replaced it with a custom version or something. Bonus points if they can be suppressed externally.

Personally, I think the whole bottom-up automatic injection design of systems is problematic. It makes it difficult for users to optimize system ordering for better worker thread occupancy, unless they want to decorate their systems with false dependencies. It becomes impossible to know just by looking at the code what the actual order of systems is if there’s a bug where some data is getting changed in the wrong place. And it makes it really hard to copy a system into a different project. Also, how do you define a system to run more than once in a frame?

A top-down approach solves all these problems, and the Latios Framework has the mechanisms in-place to support this. Unfortunately, this conflicts with a lot of existing paradigms. I don’t know the right answer.

Lastly, the whole ICustomBootstrap thing does not play well with embedded samples inside of packages. Bootstraps should be settings assets that can be swapped in the Editor. This feature is planned for a future Latios Framework version, but I wish I didn’t have to be the one to do it.

Collections in Components

Why are collections married to singletons? Why are there even singletons? Do you truly only want one of something, or do you just want to know which entity is the entity? The Latios Framework solves these use cases independently with blackboard entities and collection components. The latter has similar problems as managed structs, except this time user API is fully Burst-compatible. But if you are from Unity and want to do something more official, please reach out to me!

BlobAssetStore

Currently the Latios Framework has this SmartBlobber mechanism for creating blob assets in baking systems based on a “request” protocol. For each blob type, the user has to register the type so that a generic system can properly ref-count and store blobs in the BlobAssetStore (deduplicating in the process).

I currently face two problems that I have hacked around. First, adding concrete types to BlobAssetStore is not Burst-compatible. I have to use internal APIs to precompute the type hash prior to the job. Second, I would much rather add UnsafeUntypedBlobAssetReference blobs directly so that I don’t need generics. Honestly, I think the BlobAssetStore should use Burst’s type hashes instead of System.Type.GetHashCode and expose that as API for working with UnsafeUntypedBlobAssetReference.

Smart Bakers Alternative?

The Latios Framework Smart Blobbers are a powerful concept. They allow baking systems to generate blob assets without necessarily knowing or caring how those blob assets will be used. User bakers can request blob assets to be created. Baking systems create the blob assets, then pass the blobs back to the user to do with as they please. The issue is how to pass those blobs back to the user without making the user write a custom baking system, which is error prone. The solution I came up with is to create a generic baking system and a “bake item”. The bake item is a stateful IComponentData which does the original baking, and then later receives a callback with a reference to EntityManager and the primary entity to resolve any blob asset requests and assign them to components. This works, but it involves generic systems, and it is still somewhat unsafe. Ideally, there would be some way to have additional baker callbacks dispatched by a baking system. And inside these baker callbacks, the baker is only allowed to change or remove components it added. I’m open to ideas for improvements and/or alternatives!

Transforms

My complaints regarding transforms have been well-addressed, albeit at a snail’s pace. I still criticize the choice of LocalToWorld being a float4x4 instead of float3x4, but I also recognize that is a breaking change to rectify. What would not be a breaking change is fixing the change filter race condition when updating the child hierarchy. This would also improve performance, as it would result in fewer subtrees requiring the full matrix update due to an adjacent entity in a chunk poisoning the change version on a different thread.

WorldUpdateAllocator in Baking Systems

WorldUpdateAllocator doesn’t get rewound in baking worlds. Therefore, we have to use TempJob allocations everywhere when baking.

IJobEntityChunkBeginEnd with Default Methods

IJobEntityChunkBeginEnd doesn’t support a derived interface that uses default interface methods, because the source generators generate code that directly calls the methods rather than use a generic static invoker.

No NativeArray.DisposeJob for CollectionHelper

Why does this not exist?

Why can’t I Get Interfaces in Bakers?

In MonoBehaviours, you can do GetComponent<ISomeInterface>(). In bakers, this isn’t possible. Why?

Most of the time, I want a baker to check if some interface exists on the same Game Object, and if so, early out so that another Baker that processes the interface can work unhindered.

BlobStrings

I started using FixedString and BlobArray<byte> in blobs because I couldn’t log BlobStrings in Burst-compiled code. There are a lot of missing APIs and features for BlobStrings. Make them better so that I can be more efficient with my data.

Nonzero Results of UnsafeUtility.MemCmp

I have a task where I have a set of M byte arrays and a separate set of N byte arrays. For each array in M, I need to find an array in N that starts with all the bytes of the array from M. Currently, I’m using UnsafeUtility.MemCmp in O(n^2) fashion. But I believe that sorting M and N by raw byte values would lead to a faster algorithm. Can the nonzero results of MemCmp be used for this kind of sorting? Is there a better approach?
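One possible shape for the sorted approach, sketched in Python rather than with Unity APIs. It assumes a comparison that orders byte arrays lexicographically by unsigned byte value, with a shorter array sorting before any longer array it is a prefix of. Python's `bytes` comparison behaves this way, and C's `memcmp` plus a length tie-break would too, but whether the sign of `UnsafeUtility.MemCmp` can be relied on like this is exactly the open question above. The function name `find_prefix_matches` is hypothetical. In this sketch only N needs to be sorted, and each array from M is located with a binary search.

```python
from bisect import bisect_left


def find_prefix_matches(m_arrays, n_arrays):
    """For each array in m_arrays, return the index into n_arrays of one
    array that starts with it, or -1 if there is none."""
    # Sort indices of N by raw byte value, keeping the original indices.
    order = sorted(range(len(n_arrays)), key=lambda i: n_arrays[i])
    sorted_n = [n_arrays[i] for i in order]

    result = []
    for m in m_arrays:
        # In lexicographic order, every array that starts with m is >= m,
        # and the first array >= m starts with m whenever any array does.
        pos = bisect_left(sorted_n, m)
        if pos < len(sorted_n) and sorted_n[pos].startswith(m):
            result.append(order[pos])
        else:
            result.append(-1)
    return result
```

Under those assumptions the cost is one sort of N plus one binary search per array in M, instead of comparing every pair.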

Adding/Removing ComponentTypeSet containing chunk components to a NativeArray<Entity>

This used to trigger an error. It was finally patched, but the solution was to process each entity one-by-one. Performance is awful.

Auto-Load Subscenes Synchronously?

I feel like I shouldn’t need to write custom code to do this. Some subscenes are critical to be loaded before systems should start running. The player falling through the floor is a common complaint I’ve seen.

Procedural Meshes in Baking

A really common use case is to procedurally generate meshes for Mesh Renderers. While the algorithms work in bakers fine, getting the Mesh Renderer Baker to accept this and not bake a null mesh or something would be great. Currently, I have replaced the MeshRenderer baker with a custom version that first checks whether a subclass of a specific other MonoBehaviour is present. If there is, it leaves it up to that other MonoBehaviour to do custom baking instead, providing the custom mesh and list of materials to use for the renderer. You can see this in action in LSSS, as all the capsules are generated procedurally at bake time.

Entities Graphics Baking is Buggy

Entities Graphics baking got a rewrite in 1.2, and it is a lot better than it used to be. However, it still has a bug where it adds components to entities in a baking system, which it doesn’t know how to correctly revert.

Burst Generic Jobs Can’t Be Scheduled in Burst

Psyshock uses generic jobs in Physics.FindPairs() using a pattern that allows Burst to detect and compile the jobs both in the Editor and in builds without having to explicitly register the generic types with attributes. Unfortunately, the ILPP can’t pick up on it and patch these jobs to be Burst-schedulable. There should not be a discrepancy!

Currently I am relying on reflection to find and call the EarlyJobInit() methods myself for specific generic types.

Experimental Skinned Mesh Rendering

The whole Skinned Mesh Rendering solution in Entities Graphics is problematic. It generates GC allocations every frame, it doesn’t scale, and even the public API types of SkinMatrix and BlendShapeWeight fundamentally prevent more efficient algorithms like the ones Kinemation uses. I’ve been told numerous times that the skinned mesh rendering design is “experimental”. If that’s the case, why is it in the released version of Entities Graphics without any guard flags?

I’m only asking this because I have a sliver of hope that Entities Graphics may adopt a design closer to Kinemation in which case I can delegate some features of Kinemation to the official package. While it has been announced a new design is being worked on, I know nothing about what that looks like or if it will make my life easier or harder.

LODs and LOD Crossfade

I ended up completely rewriting the LOD system in Entities Graphics because the existing system was bad, doing a bunch of random lookups and wasting chunk memory. Also, the existing system doesn’t support LOD Crossfade. I implemented it, though a bugfix to URP hasn’t been backported to URP 14 (2022 LTS) yet.

Default Groups

I frequently run into issues where default root ComponentSystemGroups accidentally get added to other groups if I don’t explicitly remove them from the list. Since these are systems that Unity will manually create, they should have a [DisableAutoCreation] attribute. At least now because they are partial, I am able to fix this with asmref.

TypeManager.GetAllSystems [DisableAutoCreation] Bug

If you try to get all systems, the systems with the DisableAutoCreation attribute don’t get added to the list even if you specify All like the XML documentation suggests. I have to use reflection for this now, which sucks.

Why Do I Need a GraphicsBuffer to Get Blend Shapes?

Is there a more performant way to get the raw blend shapes data (the deltas, not the animated parameters) than queueing up a bunch of async readbacks and then batch-completing them inside a baking system?

Why is there No Burst-Compatible Way to Read Audio Clips?

I want to bake audio clip samples in blob assets. The current API doesn’t offer any NativeArray API, and it is slow. Also, if I could get the raw compressed bytes and compression codec of audio clips, I could do my own decompression at runtime without having to do my own compression. That would be awesome!

Copying Shared Components Type-Agnostically

One of the features of blackboard entities in the Latios Framework is that they merge components of blackboard config entities whenever a subscene containing them loads. This allows the user to spread config authoring data across multiple GameObjects. However, doing this merging at runtime is surprisingly difficult. While it is easy to get the ComponentType list to copy from one entity to another, it is significantly more difficult to actually copy those types. For unmanaged components, we have the tools now. But for managed components, especially shared components, it is problematic. Currently, the Latios Framework uses reflection, but I would love for there to be a proper EntityManager.AddComponentFromOtherEntity(Entity src, Entity dst, ComponentType ct) API so that I can Burst-compile this whole thing.

BufferAccessor Missing Flexibility

You can get a read-only pointer to components in a chunk, even if the ComponentTypeHandle is declared with write access. You cannot do the same for a BufferAccessor.

Incremental Baking Discards Chunk Components

I had a bug where queries weren't being matched because of this. The bug only happened in an editor system while the subscene was open. It was really annoying.

A LinkedEntityGroup Regression

LinkedEntityGroup internal capacity was changed from 1 to 0. However, it is still added to solo prefab entities, causing heap allocations every time you instantiate the prefab. This was a measurable performance regression in one of my projects, and I had to write a baking system to address it.

Annoyances

These are little things in the API I think should be improved, but don’t have a major impact on the Latios Framework.

Missing Ref APIs

I have large chunk components. Reading/Writing by ref is way faster. I have extensions to do this, but official support would be better.

Similarly, I’ve also noticed ref gaps for EntityManager.

Idiomatic Foreach is Insufficient

The biggest issue I have with idiomatic foreach is that it is really clunky for large queries. With Entities.ForEach, you could put each argument (type and variable name) on a different line. That doesn’t really work well with idiomatic foreach. I recognize this is a hard problem, and I don’t have a proposed solution yet.

But at the very least, make it so that we can have Entity first in the tuple. It is difficult to articulate why, but the Entity being at the end annoys me and most others I talk to.

Lookups in IJobEntity

Codegen already injects the ComponentTypeHandles into IJobEntity. Can we have an [Inject] attribute for lookups and Time to have codegen do the same? That would reduce a bunch of boilerplate.

Unity Systems Outside Unity Namespace

I keep finding these, and they always catch me off guard in custom bootstraps. Put them where they belong.

Iterating Just Entities in a Query

This isn’t possible in idiomatic foreach. I have to have some additional dummy read component around, or deep copy the entity array.

NativeStream Woes

NativeStream doesn't respect alignment, gets its counts messed up when writing piecewise but reading in bulk or vice-versa, can't store writes 4kB or greater, can't defer allocation with a schedule-time known allocation size, etc.

Unmanaged Shared Component Indices are Not Indices

This was an incident that caught me off guard. It turns out these indices are always negative, and contain metadata packed inside them. I suspect some of this data wasn’t supposed to reach the public API surface, but it does.
