Last Updated: 2024-06-25
As the original author and primary developer of the Latios Framework for Unity’s ECS, I regularly run into bugs, inadequate functionalities, and pitfalls within the ECS ecosystem. This has resulted in lots of “hacks” in the framework to patch up problems. This is a living document describing the issues and hacks. My hope is that the Unity ECS team will find this to be a valuable reference to help improve the quality of their packages.
This document is organized into four categories: Ship Stoppers, Feature Blockers and Hindrances, Hacks and Ugliness, and Annoyances, ordered from greatest to least severity.
This document does not include all possible features the Latios Framework may eventually implement if no official solution is provided. If you would like to learn about such features, it is best to ask me.
These are high-severity items that prevent the Latios Framework from functioning correctly, with no plausible workarounds.
As of Entities 1.3.0-exp.1, there are no immediate severe items. However, there are two features whose future removal has been hinted at. If they are removed without an alternative, they will cause big problems for the Latios Framework, most likely resulting in a total fork or a move to another technology base.
Entities 1.2 breaks determinism in a really bad way. Entity IDs are no longer deterministic, meaning the only way to order a list of entities deterministically is to know both which chunks they belong to AND the indices of those chunks relative to some EntityQuery.
Besides creating a bunch of confusion around whether chunk order determinism even matters (because now it is really hard to preserve) and what the point of the sortKey in ECB is, this change introduces a major problem for the Latios Framework’s development: debugging.
Kinemation relies heavily on chunk components and caching of relationships. When a bug happens, it is crucial to be able to replay the simulation up to the bug to identify the source of the problem. Many of the algorithms Kinemation uses don’t have access to the chunk index and index in chunk for a list of entities collected in parallel. There is no way to order them deterministically in 1.2. That means that chunk order is not preserved, and whether or not two entities lie in the same chunk may change run to run. Thus, if the bug was dependent on two entities being in the same or different chunks, the bug is only reproducible by chance. That’s really, really bad.
Full determinism per architecture was one of Unity’s major competitive advantages over other solutions. And now it is being thrown away. I’m willing to compromise if it makes Game Object/ECS Unification amazing. But you’ll have to forgive me if I am a bit skeptical. I believe there may be other ways to solve the problem. What I ask is that the rules be well-defined regarding the expectations of determinism for maximum correctness that package developers should adhere to. And I ask that based on the rule defined, additional runtime and debugging tools be invested in to support the new rule and accommodate its shortcomings when it comes to chunk-level operations.
The ECS learning curve is steep. As a package developer aiming for maximum performance, I need some pretty complex data layouts, and I need to provide an intuitive way for users to interact with that data. IAspect addressed this problem. If it goes away, I need a replacement. I don’t care if I as the package author have to write a lot more code. But it has to be easy for the users and fit into the existing ECS approaches everything else uses.
In general, I’m a bit frustrated with the inextensibility of codegen: the fact that IJobEntity can’t be taught new things to iterate, that it requires the support of the ISystem source generator that rewrites the entire method, and that this consequently makes it nearly impossible to add new SystemAPI-like extensions. But I will discuss these more in the following sections.
If you need an example of the kind of complexity I am trying to abstract, my framework has a file called OptimizedSkeletonAspect.cs. Try to make sense of the buffer rotation mechanism that avoids massive buffer copies every frame.
These are items which are preventing specific new features or optimizations of the Latios Framework from being developed, or are creating unnecessary friction in their development. They may also be creating undesirable effects on usage of the framework and reducing the quality of the overall solution.
I’ve been encountering a strange error where, after calling AudioClip.GetData() in a baker or baking system, then creating a DSPGraph at runtime, and then triggering a domain reload from code changes, I end up with Unity soft-locked. I will continue to investigate to see if I can remove something from the formula (I already know that creating a DSPGraph without calling AudioClip.GetData() avoids the soft-lock). But I would really appreciate this being looked into.
I have encountered a situation in which the bone weights used by Shader Graph’s Linear Blend Skinning node get bound with R32G32_FLOAT, which causes the models to be scaled incorrectly. While I know Unity’s Entities Graphics has moved away from this node, I still support it because there are various scenarios where it outperforms the compute shader skinning alternative.
The fact that we can’t use SystemAPI in static methods makes it extremely difficult to build extensions and common patterns. For example, I have a static method Physics.BuildCollisionLayer() that needs to schedule 5 jobs in sequence. While there are several variants, one variant requires the first job to perform chunk iteration. Securing the necessary type handles is extremely problematic. The user has to manually cache and update a struct containing those handles, because this method can’t rely on SystemAPI. That’s a lot of unnecessary boilerplate burdened directly on the user.
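To illustrate the boilerplate, here is a sketch of the kind of struct the user ends up owning today. The struct and component names are hypothetical stand-ins, but ComponentTypeHandle, EntityTypeHandle, and SystemState are the real Entities APIs:

```csharp
// Sketch only: the user must declare, construct, and update this struct
// themselves, because a static method like Physics.BuildCollisionLayer()
// cannot use SystemAPI to fetch handles. MyCollider is hypothetical.
public struct BuildCollisionLayerTypeHandles
{
    public ComponentTypeHandle<MyCollider>     collider;
    public ComponentTypeHandle<LocalTransform> transform;
    public EntityTypeHandle                    entity;

    public BuildCollisionLayerTypeHandles(ref SystemState state)
    {
        collider  = state.GetComponentTypeHandle<MyCollider>(true);
        transform = state.GetComponentTypeHandle<LocalTransform>(true);
        entity    = state.GetEntityTypeHandle();
    }

    // Must be called every update before passing the struct to the static method.
    public void Update(ref SystemState state)
    {
        collider.Update(ref state);
        transform.Update(ref state);
        entity.Update(ref state);
    }
}
```

If SystemAPI worked in static methods, all of this could be generated at the call site instead.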
Source generators are an incredibly powerful tool. I used to complain about poor documentation, but I have seen some effort recently to address this.
However, not every problem is solved. Currently there’s no way to replace IAspect with a custom solution and have it work correctly with IJobEntity. There’s no way to access the EntityQuery of an IJobEntity. Additionally, there’s no way to manually recreate the allocation-free behavior of idiomatic foreach. And there’s no way to create our own aggregate type handles that can be automatically cached SystemAPI-style.
Subscene import workflows have significant usability issues. Because they occur in a separate Unity process, they do not use Burst, cannot easily be debugged, have limited reporting capabilities for memory leaks and the like, and many engine features are not well tested when accessed in this mode (it took 3 years for the audio crash bug to be fixed).
The Latios Framework pushes the boundaries of what can be baked, with new and exciting high-level features. But that only works when baking itself works, which has been a constant pain point.
A huge optimization I made with Kinemation’s renderer is writing to GraphicsBuffers and dispatching compute shaders inside the culling loop instead of before it. This way I take culling results into account and do significantly less work. This applies to skinning, material properties, blend shapes, other mesh deformations, and whatever else. The only problem is that now I have to complete culling jobs inside the culling callbacks so that I can end GraphicsBuffer writes and dispatch the compute shaders. It would be awesome if BatchRendererGroup could get an additional callback at the point when the jobs need to be completed, so that I could do these compute shader dispatches as late as possible. SRP shenanigans are a big chunk of my frame time, and the worker threads are starved.
It would be awesome if I could see an outline of all the functions compiled into a Burst job and jump between them. Right now it is still difficult to understand what is happening in critical sections of massive jobs.
I have to move an entity, or more often an array of entities, twice if I have both a set of components to add and a set of components to remove.
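As a sketch of the problem (the tag components are hypothetical, but the EntityManager and ComponentTypeSet APIs are real), the add and the remove are two separate structural changes, so every entity in the array migrates chunks twice:

```csharp
// Two archetype changes where one would suffice. TagA/TagB/TagC are
// hypothetical components; 'entities' is a NativeArray<Entity>.
var toAdd    = new ComponentTypeSet(ComponentType.ReadWrite<TagA>(),
                                    ComponentType.ReadWrite<TagB>());
var toRemove = new ComponentTypeSet(ComponentType.ReadWrite<TagC>());

entityManager.AddComponent(entities, toAdd);       // structural change #1
entityManager.RemoveComponent(entities, toRemove); // structural change #2
// Desired: a single call that computes the final archetype and moves once.
```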
If a zero-sized component is added or removed on all entities in a chunk, the chunk’s archetype is converted in place, which is a great optimization. However, when there are only a couple of entities in the chunk because the source archetype represents a temporary state, this in-place conversion leaves lots of chunks with only a small number of entities each, causing fragmentation. It would be awesome if, as an additional check, when another chunk can accommodate all of the existing chunk’s entities, the entities moved to that chunk rather than performing the in-place conversion.
Most of the time, I find myself using sub-optimal structural change sequences just to avoid this edge case.
Sometimes, I really want to specify that a container in a job is meant to be allocated in the job via Allocator.Temp, without having to use the nuclear attribute [NativeDisableContainerSafetyRestriction]. Other times, I might have a job that takes a variable number of DynamicComponentTypeHandles, and I never know what to populate the unused slots with. Again, disabling container safety is really bad, because then if the user messes up job dependencies elsewhere, the issue may go unnoticed.
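For context, here is a sketch of the pattern I mean, using the real attribute on a hypothetical chunk job. The scratch container is created lazily inside Execute with Allocator.Temp, so the safety system cannot reason about it at schedule time:

```csharp
[BurstCompile]
struct ExampleChunkJob : IJobChunk  // hypothetical job
{
    // The field is default at schedule time and allocated on first use,
    // so the only way to satisfy the safety system is to disable it entirely.
    [NativeDisableContainerSafetyRestriction]
    NativeArray<float> scratch;

    public void Execute(in ArchetypeChunk chunk, int unfilteredChunkIndex,
                        bool useEnabledMask, in v128 chunkEnabledMask)
    {
        if (!scratch.IsCreated)
            scratch = new NativeArray<float>(128, Allocator.Temp);
        // ... per-chunk work using scratch on this worker thread ...
    }
}
```

An attribute expressing “this container is job-temp allocated” could convey that intent without throwing away all other safety checks.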
I believe Dynamic Buffers should be allocated from a custom ECS-managed allocator and not using Allocator.Persistent. The allocations are often small and would be better suited to a pool. This is starting to become a performance problem for me during initialization due to all the consecutive small allocations in my ICleanupBufferElement types.
There’s currently not a clean way to do the operation of “If this authoring instance is referenced in a list by any other authoring instance, add this runtime component”. You can only do this if you make the restriction that the other authoring instance with the list is an ancestor in the hierarchy. Incremental baking is hard, and I respect that this is not an easy problem to fix. But I will still bring it up here, since it is a limitation a lot of people run into.
These are some of the other issues I have run into with the Latios Framework that required explicit workarounds that were far from ideal.
We have ICustomBootstrap for setting up systems at runtime. Why can’t we do the same thing in the Editor? I ended up extending ECS to do that, but I do it by accessing an internal Action after the Editor World is created and then trying to replace it. I also have a hack to rebuild the EditorWorld via a menu option for when a buggy editor system goes haywire and the full Editor state is corrupted. Unfortunately, this hack isn’t bullet-proof and sometimes causes the wrong world to run an update or two, which then fails and throws errors in the console.
Then there’s baking. I have a custom Skinned Mesh Rendering solution. Why can’t I turn off the built-in Entities Graphics Skinned Mesh Renderer baking without turning off the entire baking of Entities Graphics? Once again, I hacked this using a custom baker list mechanism that seems to have been created for tests. At startup I create a custom bootstrap callback, and then for each baking world I have a system in OnCreate assign a RateManager to one of the first ComponentSystemGroups baking uses; in that callback I disable the systems I don’t want and inject the systems marked with the DisableAutoCreation attribute. Why do those systems have that attribute? Because they belong to an optional feature that users may or may not want. Why do I use a RateManager? It is the only way to ensure the already-included ComponentSystemGroups have had their OnCreate() called when I inject the systems, because otherwise I can’t add systems to them.
And while we are on this topic, I would greatly appreciate a flag on the UpdateBefore/After attributes to suppress warnings about the referenced systems being in the wrong groups. Such systems might just not be installed at all, or a user may have replaced one with a custom version. Bonus points if the warnings can be suppressed externally.
Personally, I think the whole bottom-up automatic injection design of systems is problematic. It makes it difficult for users to optimize system ordering for better worker thread occupancy, unless they want to decorate their systems with false dependencies. It becomes impossible to know just by looking at the code what the actual order of systems is if there’s a bug where some data is getting changed in the wrong place. And it makes it really hard to copy a system into a different project. Also, how do you define a system to run more than once in a frame?
A top-down approach solves all these problems, and the Latios Framework has the mechanisms in-place to support this. Unfortunately, this conflicts with a lot of existing paradigms. I don’t know the right answer.
Lastly, the whole ICustomBootstrap thing does not play well with samples embedded inside packages. Bootstraps should be settings assets that can be swapped in the Editor. This feature is planned for a future Latios Framework version, but I wish I didn’t have to be the one to do it.
Why are collections married to singletons? Why are there even singletons? Do you truly only want one of something, or do you just want to know which entity is the entity? The Latios Framework solves these use cases independently with blackboard entities and collection components. The latter has similar problems to managed structs, except this time the user API is fully Burst-compatible. But if you are from Unity and want to do something more official, please reach out to me!
Currently the Latios Framework has this SmartBlobber mechanism for creating blob assets in baking systems based on a “request” protocol. For each blob type, the user has to register the type so that a generic system can properly ref-count and store blobs in the BlobAssetStore (deduplicating in the process).
I currently face two problems that I have hacked around. First, adding concrete types to the BlobAssetStore is not Burst-compatible; I have to use internal APIs to precompute the type hash prior to the job. Second, I would much rather add UnsafeUntypedBlobAssetReference blobs directly so that I don’t need generics. Honestly, I think the BlobAssetStore should use Burst’s type hashes instead of System.Type.GetHashCode and expose that as API for working with UnsafeUntypedBlobAssetReference.
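As a sketch of the suggestion: BurstRuntime.GetHashCode64 is a real Burst API that works from Burst-compiled code, while the BlobAssetStore overload shown in the comment is hypothetical.

```csharp
// Burst's stable type hash is computable inside Burst-compiled code,
// unlike System.Type.GetHashCode(). MyBlobType is a hypothetical blob struct.
long typeHash = Unity.Burst.BurstRuntime.GetHashCode64<MyBlobType>();

// A hypothetical Burst-friendly BlobAssetStore API could then accept the
// hash and an untyped reference directly, with no generics involved:
// blobAssetStore.TryAdd(typeHash, untypedBlobAssetReference);
```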
The Latios Framework Smart Blobbers are a powerful concept. They allow baking systems to generate blob assets without necessarily knowing or caring how those blob assets will be used. User bakers can request blob assets to be created. Baking systems create the blob assets, then pass the blobs back to the user to do what they please. The issue is how to pass those blobs back to the user without making the user write a custom baking system, which is error-prone. The solution I came up with is to create a generic baking system and a “bake item”. The bake item is a stateful IComponentData which does the original baking, and then later receives a callback with a reference to the EntityManager and the primary entity to resolve any blob asset requests and assign them to components. This works, but it involves generic systems, and it is still somewhat unsafe. Ideally, there would be some way to have additional baker callbacks dispatched by a baking system, where inside these callbacks the baker is only allowed to change or remove components it added. I’m open to ideas for improvements and/or alternatives!
My complaints regarding transforms have been well-addressed, albeit at a snail’s pace. I still criticize the choice of LocalToWorld being a float4x4 instead of a float3x4, but I also recognize that rectifying it would be a breaking change. What would not be a breaking change is fixing the change filter race condition when updating the child hierarchy. This would also improve performance, as it would result in fewer subtrees requiring the full matrix update due to an adjacent entity in a chunk poisoning the change version on a different thread.
WorldUpdateAllocator doesn’t get rewound in baking worlds. Therefore, we have to use TempJob allocations everywhere when baking.
IJobEntityChunkBeginEnd doesn’t support a derived interface that uses default interface methods, because the source generators generate code that directly calls the methods rather than using a generic static invoker.
Why does this not exist?
In MonoBehaviours, you can do GetComponent<ISomeInterface>(). In bakers, this isn’t possible. Why?
Most of the time, I want a baker to check if some interface exists on the same Game Object, and if so, early out so that another Baker that processes the interface can work unhindered.
I started using FixedString and BlobArray<byte> in blobs because I couldn’t log BlobStrings in Burst-compiled code. There are a lot of missing APIs and features for BlobStrings. Make them better so that I can be more efficient with my data.
I have a task where I have M arrays of bytes and a separate set of N arrays of bytes. For each array in M, I need to find an array in N that starts with all the bytes of the array from M. Currently, I’m using UnsafeUtility.MemCmp in O(n^2) fashion. But I believe that sorting M and N by raw byte values would lead to a faster algorithm. Can MemCmp be used for this kind of sorting? Is there a better approach?
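To answer my own question in part: a MemCmp-style comparison can drive such a sort, as long as ties over the common length break shorter-first. Here is a plain C# sketch (outside Unity’s collections, to keep it self-contained): sort the N arrays lexicographically once, then for each M array binary-search for the first element greater than or equal to it; that element is the only sorted position where a prefix match can begin.

```csharp
using System;

static class PrefixSearch
{
    // Lexicographic comparison: byte-by-byte over the common length (what
    // MemCmp over min(length) would compute), then shorter-first on ties.
    public static int Compare(byte[] a, byte[] b)
    {
        int min = Math.Min(a.Length, b.Length);
        for (int i = 0; i < min; i++)
        {
            int d = a[i].CompareTo(b[i]);
            if (d != 0) return d;
        }
        return a.Length.CompareTo(b.Length);
    }

    static bool StartsWith(byte[] candidate, byte[] prefix)
    {
        if (candidate.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (candidate[i] != prefix[i]) return false;
        return true;
    }

    // sortedN must be sorted with Compare. Returns a match or null.
    public static byte[] FindMatch(byte[][] sortedN, byte[] prefix)
    {
        int lo = 0, hi = sortedN.Length;
        while (lo < hi)  // binary search for the first element >= prefix
        {
            int mid = (lo + hi) / 2;
            if (Compare(sortedN[mid], prefix) < 0) lo = mid + 1;
            else hi = mid;
        }
        // Any array starting with 'prefix' compares >= 'prefix', and the
        // smallest such array is exactly the first element >= 'prefix'.
        return (lo < sortedN.Length && StartsWith(sortedN[lo], prefix))
            ? sortedN[lo] : null;
    }
}
```

This turns the O(M·N) scan into O((M+N) log N) comparisons, and the same comparator could be built from UnsafeUtility.MemCmp plus a length tiebreak inside a Burst job.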
This used to trigger an error. It was finally patched, but the solution was to process each entity one-by-one. Performance is awful.
I feel like I shouldn’t need to write custom code to do this. Some subscenes are critical to be loaded before systems should start running. The player falling through the floor is a common complaint I’ve seen.
A really common use case is to procedurally generate meshes for Mesh Renderers. While the algorithms work fine in bakers, getting the Mesh Renderer baker to accept this and not bake a null mesh or something would be great. Currently, I replaced the MeshRenderer baker with a custom version which checks that no subclass of some other MonoBehaviour is present before continuing. If there is one, it leaves it up to that other MonoBehaviour to do custom baking instead, providing the custom mesh and list of materials to use for the renderer. You can see this in action in LSSS, as all the capsules are generated procedurally at bake time.
Entities Graphics baking got a rewrite in 1.2, and it is a lot better than it used to be. However, it still has a bug where it adds components to entities in a baking system, which it doesn’t know how to correctly revert.
Psyshock uses generic jobs in Physics.FindPairs() using a pattern that allows Burst to detect and compile the jobs both in the Editor and in builds without having to explicitly register the generic types with attributes. Unfortunately, the ILPP can’t pick up on it and patch these jobs to be Burst-schedulable. There should not be a discrepancy!
Currently I am relying on reflection to find and call the EarlyJobInit() methods myself for specific generic types.
The whole Skinned Mesh Rendering solution in Entities Graphics is problematic. It generates GC every frame, it doesn’t scale, and even the public API types of SkinMatrix and BlendShapeWeight fundamentally prevent more efficient algorithms like the ones Kinemation uses. I’ve been told numerous times that the skinned mesh rendering design is “experimental”. If that’s the case, why is it in the released version of Entities Graphics without any guard flags?
I’m only asking this because I have a sliver of hope that Entities Graphics may adopt a design closer to Kinemation, in which case I can delegate some features of Kinemation to the official package. While it has been announced that a new design is being worked on, I know nothing about what it looks like or whether it will make my life easier or harder.
I ended up completely rewriting the LOD system in Entities Graphics because the existing system was bad, doing a bunch of random lookups and wasting chunk memory. It also doesn’t support LOD Crossfade. I implemented that, though a bugfix to URP hasn’t been backported to URP 14 (2022 LTS) yet.
I frequently run into issues where the default root ComponentSystemGroups accidentally get added to other groups if I don’t explicitly remove them from the list. Since these are systems that Unity will manually create, they should have a [DisableAutoCreation] attribute. At least now that they are partial, I am able to fix this with an asmref.
If you try to get all systems, the systems with DisableAutoCreation don’t get added to the list even if you specify All like the XML documentation suggests. I have to use reflection for this now, which sucks.
Is there a more performant way to get the raw blend shapes data (the deltas, not the animated parameters) than queueing up a bunch of async readbacks and then batch-completing them inside a baking system?
I want to bake audio clip samples into blob assets. The current API doesn’t offer any NativeArray variant, and it is slow. Also, if I could get the raw compressed bytes and compression codec of audio clips, I could do my own decompression at runtime without having to do my own compression. That would be awesome!
One of the features of blackboard entities in the Latios Framework is that they merge components of blackboard config entities whenever a subscene containing them loads. This allows the user to spread config authoring data across multiple GameObjects. However, doing this merging at runtime is surprisingly difficult. While it is easy to get the ComponentType list to copy from one entity to another, it is significantly more difficult to actually copy those types. For unmanaged components, we have the tools now. But for managed components, especially shared components, it is problematic. Currently, the Latios Framework uses reflection, but I would love for there to be a proper EntityManager.AddComponentFromOtherEntity(Entity src, Entity dst, ComponentType ct) API so that I can Burst-compile this whole thing.
You can get a read-only pointer to components in a chunk, even if the ComponentTypeHandle is declared with write access. You cannot do the same for a BufferAccessor.
I had a bug where queries weren't being matched because of this. The bug only happened in an editor system while the subscene was open. It was really annoying.
LinkedEntityGroup’s internal capacity was changed from 1 to 0; however, it is still added to solo prefab entities, causing heap allocations every time you instantiate the prefab. This was a measurable performance regression in one of my projects, and I had to write a baking system to address it.
These are little things in the API I think should be improved, but don’t have a major impact on the Latios Framework.
I have large chunk components. Reading/Writing by ref is way faster. I have extensions to do this, but official support would be better.
Similarly, I’ve also noticed ref gaps for EntityManager.
The biggest issue I have with idiomatic foreach is that it is really clunky for large queries. With Entities.ForEach, you could put each argument (type and variable name) on a different line. That doesn’t really work well with idiomatic foreach. I recognize this is a hard problem, and I don’t have a proposed solution yet.
But at the very least, make it so that we can have Entity first in the tuple. It is difficult to articulate why, but the Entity being at the end annoys me and most others I talk to.
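For a concrete picture of the clunkiness (SystemAPI.Query, RefRW/RefRO, WithEntityAccess, and LocalTransform are real APIs; the other component names are hypothetical):

```csharp
// With Entities.ForEach, each parameter carried its type and name together:
//     .ForEach((ref LocalTransform transform,
//               in Velocity velocity,
//               in Mass mass) => { ... })
// With idiomatic foreach, the names and the types live in two separate
// lists, and Entity arrives last in the deconstruction:
foreach (var (transform, velocity, mass, entity) in
         SystemAPI.Query<RefRW<LocalTransform>,
                         RefRO<Velocity>,
                         RefRO<Mass>>()
                  .WithEntityAccess())
{
    // Matching a name to its type means counting positions in both lists.
}
```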
Codegen already injects the ComponentTypeHandles into IJobEntity. Can we have an [Inject] attribute for lookups and Time to have codegen do the same? That would reduce a bunch of boilerplate.
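A sketch of what I am imagining ([Inject] is not a real Entities attribute, and the job and components are hypothetical; the rest mirrors what codegen already does for type handles):

```csharp
[BurstCompile]
partial struct ApplyDamageJob : IJobEntity  // hypothetical job and components
{
    [Inject] ComponentLookup<Health> healthLookup; // codegen would create and
    [Inject] TimeData time;                        // update these on schedule

    void Execute(in DamageeEvent damageEvent)
    {
        // ... use healthLookup and time.DeltaTime without any manual
        // caching or Update(ref state) boilerplate in the system ...
    }
}
```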
I keep finding these, and they always catch me off guard in custom bootstraps. Put them where they belong.
This isn’t possible in idiomatic foreach. I have to have some additional dummy read component around, or deep copy the entity array.
NativeStream doesn’t respect alignment, gets its counts messed up when writing piecewise but reading in bulk (or vice-versa), can’t store writes of 4 kB or greater, can’t defer allocation with a schedule-time-known allocation size, etc.
This was an incident that caught me off guard. It turns out these indices are always negative, and contain metadata packed inside them. I suspect some of this data wasn’t supposed to reach the public API surface, but it does.