Last Updated: 2024-02-22
As the original author and primary developer of the Latios Framework for Unity’s ECS, I regularly run into bugs, inadequate functionalities, and pitfalls within the ECS ecosystem. This has resulted in lots of “hacks” in the framework to patch up problems. This is a living document describing the issues and hacks. My hope is that the Unity ECS team will find this to be a valuable reference to help improve the quality of their packages.
This document is organized into four categories, from greatest to least severity: Ship Stoppers, Feature Blockers and Hindrances, Hacks and Ugliness, and Annoyances.
This document does not include all possible features the Latios Framework may eventually implement if no official solution is provided. If you would like to learn about such features, it is best to ask me.
These are high-severity items that are preventing the Latios Framework from functioning correctly, with no plausible workarounds.
Entities 1.2 breaks determinism in a really bad way. Entity IDs are no longer deterministic, meaning the only way to order a list of entities deterministically is to know both which chunks they belong to AND the indices of those chunks relative to some EntityQuery.
Besides creating a bunch of confusion around whether chunk order determinism even matters (because now it is really hard to preserve) and what the point of the sortKey in ECB is, this change introduces a major problem for the Latios Framework’s development: debugging.
Kinemation relies heavily on chunk components and caching of relationships. When a bug happens, it is crucial to be able to replay the simulation up to the bug to identify the source of the problem. Many of the algorithms Kinemation uses don’t have access to the chunk index and index in chunk for a list of entities collected in parallel. There is no way to order them deterministically in 1.2. That means that chunk order is not preserved, and whether or not two entities lie in the same chunk may change run to run. Thus if the bug was dependent on two entities being in the same or different chunks, the bug is only reproducible by chance. That’s really, really bad.
Full determinism per architecture was one of Unity’s major competitive advantages over other solutions. And now it is being thrown away. Sure, there’s the argument that streaming breaks it, but streaming could also pre-allocate the entities a subscene needs and assign them all a shared component to the loading subscene until the subscene is loaded and the chunks are swapped. If that swap step could also be manually triggered, then subscene streaming could even support lock-step.
Unity is going to have to offer significantly more value to compensate for this regression, otherwise I will probably stay at 1.1.
These are items that prevent, or create unnecessary friction for, the development of specific new features or optimizations of the Latios Framework. They may also be creating undesirable effects on usage of the framework and reducing the quality of the overall solution.
Aspect lookups are not only difficult to discover, but using them in IJobEntity requires a ton of boilerplate compared to ComponentLookup. There are no SystemAPI methods for getting auto-cached handles or lookups.
For performance reasons, the Latios Framework is starting to have really complicated sets of components. A common example is that the Latios Framework triple-buffers animation data so that things like motion vectors and inertial blending can be easily evaluated. Yet rather than copy current to previous and previous to two-ago, these buffers rely on control components to rotate the roles. This behavior should be somewhat abstracted from the user, and IAspect solves this case beautifully. Unfortunately, users struggle to acquire such aspects from random entities in jobs.
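As a rough sketch of the friction (type and member names here are assumptions, including the hypothetical AnimationAspect and its generated nested Lookup type), compare the one-call ComponentLookup path against what an aspect lookup demands:

```csharp
// Sketch only: contrast between ComponentLookup and an aspect lookup.
// "AnimationAspect" is a hypothetical aspect; the nested Lookup type and
// its members are assumptions based on current codegen output.
partial struct MySystem : ISystem
{
    ComponentLookup<LocalToWorld> m_ltw;   // discoverable, one call to create
    AnimationAspect.Lookup        m_anim;  // generated type, no SystemAPI helper

    public void OnCreate(ref SystemState state)
    {
        m_ltw  = state.GetComponentLookup<LocalToWorld>(true);
        m_anim = new AnimationAspect.Lookup(ref state);
    }

    public void OnUpdate(ref SystemState state)
    {
        m_ltw.Update(ref state);
        m_anim.Update(ref state);
        // Both must then be manually passed into any IJobEntity that needs them.
    }
}
```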
The fact that we can’t use SystemAPI in static methods makes it extremely difficult to build extensions and common patterns. For example, I have a static method Physics.BuildCollisionLayer() that needs to schedule 5 jobs in sequence. While there are several variants, one variant requires the first job to perform chunk iteration. Securing such type handles is extremely problematic. The user has to manually cache and update a struct containing those handles, because this method can’t rely on SystemAPI. That’s a lot of unnecessary boilerplate burdened directly on the user.
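To illustrate the burden, here is a minimal sketch of the kind of handle-caching struct such a method forces onto callers (component names are placeholders; the real BuildCollisionLayer requirements differ):

```csharp
// Sketch of user-side boilerplate when a static utility cannot use SystemAPI:
// the caller owns a struct of type handles and must remember to Update() it
// every frame before invoking the utility.
struct BuildLayerTypeHandles
{
    public ComponentTypeHandle<LocalToWorld> localToWorld;  // placeholder component
    public EntityTypeHandle                  entity;

    public BuildLayerTypeHandles(ref SystemState state)
    {
        localToWorld = state.GetComponentTypeHandle<LocalToWorld>(true);
        entity       = state.GetEntityTypeHandle();
    }

    // Must be called in OnUpdate before the utility schedules its jobs.
    public void Update(ref SystemState state)
    {
        localToWorld.Update(ref state);
        entity.Update(ref state);
    }
}
```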
Source generators are an incredibly powerful tool. The sad part is that up until recently, the Latios Framework never used them. Why? Because there’s no documentation on how to use them to solve ECS-specific problems. Sure, in the manual you can find a page on how to set up source generators using older Roslyn, but a more modern tutorial would have solved a lot of problems the Latios Framework faced a lot sooner. Unity now supports Roslyn 4.0 and incremental source generators, and the Latios Framework is using exactly that to add new IComponentData to partial structs implementing specific interfaces. The workflow is surprisingly good once you understand it. A little documentation on how to do some common things to mitigate the use of runtime reflection and generics would go a long way.
However, not every problem is solved. The IAspect and SystemAPI issues could almost be solved by users if there was a way to add additional OnCreateForCompiler methods to systems. This could probably be done by decorating custom methods that should run at that time point with an attribute and having ILPP pick up on them.
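A sketch of what that could look like (the attribute and the ILPP behavior are purely hypothetical):

```csharp
// Hypothetical: ILPP would discover methods with this attribute and invoke
// them from the generated OnCreateForCompiler.
[System.AttributeUsage(System.AttributeTargets.Method)]
public class OnCreateForCompilerExtensionAttribute : System.Attribute { }

public partial struct MySystem : ISystem
{
    ComponentLookup<LocalToWorld> m_ltwLookup;

    [OnCreateForCompilerExtension]  // hypothetical attribute
    void CacheLookups(ref SystemState state)
    {
        // User- or generator-authored caching, no SystemAPI needed.
        m_ltwLookup = state.GetComponentLookup<LocalToWorld>(true);
    }
}
```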
Subscene import workflows have significant usability issues. Because they occur in a separate Unity process, they do not use Burst, cannot easily be debugged, have limited reporting of memory leaks and the like, and many engine features are not well tested when accessed in this mode (it took 3 years for the audio crash bug to be fixed).
The Latios Framework pushes the boundaries of what can be baked, with new and exciting high-level features. But that only works when baking itself works, which has been a constant pain point.
In MonoBehaviours, you can do GetComponent<ISomeInterface>(). In bakers, this isn’t possible. Why?
Most of the time, I want a baker to check if some interface exists on the same Game Object, and if so, early out so that another Baker that processes the interface can work unhindered.
I started using FixedString and BlobArray<byte> in blobs because I couldn’t log BlobStrings in Burst-compiled code. There are a lot of missing APIs and features for BlobStrings. Make them better so that I can be more efficient with my data.
I have a task where I have M arrays of bytes and a separate set of N arrays of bytes. For each array in M, I need to find an array in N that starts with all the bytes of the array from M. Currently, I’m using UnsafeUtility.MemCmp in O(n^2) fashion. I believe that sorting M and N by raw byte values would lead to a faster algorithm, but can MemCmp be used for this kind of sorting? Is there a better approach?
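For what it’s worth, a MemCmp-style byte comparison (compare the common length, then tie-break on length) does yield a lexicographic order, and with N sorted that way, every array starting with a given prefix forms a contiguous range beginning at the prefix’s lower bound. A plain C# sketch of that idea:

```csharp
// Sketch: sort N lexicographically once, then binary-search each M prefix.
// Roughly O((M + N) log N) comparisons instead of O(M * N).
static int CompareLex(byte[] a, byte[] b)
{
    int n = Math.Min(a.Length, b.Length);
    for (int i = 0; i < n; i++)
        if (a[i] != b[i]) return a[i] - b[i];  // MemCmp-style byte compare
    return a.Length - b.Length;                // shorter sorts first on ties
}

static bool StartsWith(byte[] a, byte[] prefix)
{
    if (a.Length < prefix.Length) return false;
    for (int i = 0; i < prefix.Length; i++)
        if (a[i] != prefix[i]) return false;
    return true;
}

// sortedN must be sorted with CompareLex. Returns an index into sortedN of
// an array starting with prefix, or -1 if none exists.
static int FindWithPrefix(byte[][] sortedN, byte[] prefix)
{
    int lo = 0, hi = sortedN.Length;
    while (lo < hi)  // lower bound: first element >= prefix
    {
        int mid = (lo + hi) / 2;
        if (CompareLex(sortedN[mid], prefix) < 0) lo = mid + 1;
        else hi = mid;
    }
    return (lo < sortedN.Length && StartsWith(sortedN[lo], prefix)) ? lo : -1;
}
```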
A huge optimization I made with Kinemation’s renderer is writing to GraphicsBuffers and dispatching compute shaders inside the culling loop instead of before it. This way I take culling results into account and do significantly less work. This applies to skinning, material properties, blend shapes, other mesh deformations, and whatever else. The only problem is that now I have to complete culling jobs inside the culling callbacks so that I can end GraphicsBuffer writes and dispatch the compute shaders. It would be awesome if BatchRendererGroup could get an additional callback for when the jobs need to be completed so that I could do these compute shader dispatches as late as possible. SRP shenanigans are a big chunk of my frame time and the worker threads are starved.
It would be awesome if I could see an outline of all functions a Burst job compiled and jump between them. Right now it is still difficult to understand what is happening in critical sections of code in massive jobs.
Why does this not exist?
This triggers an error if the entity array creates a batch. It is really annoying because then I have to handle chunk components separately, which is an extra structural change.
I have to move an entity, or more often an array of entities, twice if I have both a set of components to add and a set of components to remove.
If a zero-sized component is added or removed on all entities in a chunk, the chunk’s archetype is converted in-place, which is a great optimization. However, when there are only a couple of entities in the chunk because the source archetype represents a temporary state, this in-place conversion leaves lots of chunks with only a small number of entities each, causing fragmentation. It would be awesome if, as an additional check, when another chunk can accommodate all of the existing chunk’s entities, the entities were moved to that chunk rather than converted in-place.
Most of the time, I find myself using sub-optimal structural change sequences just to avoid this edge case.
I have a job where there is a loop, and inside that loop is a callstack where each stack frame has logic that needs to perform allocations. The allocations add up fast, but usually have very short lifecycles and are mutually exclusive to each other. I would love special allocators that can be rewound directly in the job to recycle the memory efficiently.
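What I have in mind is roughly a mark-and-rewind bump allocator. A minimal self-contained sketch (not a real Unity API; a production version would need safety checks and a real memory source):

```csharp
// Sketch: a rewindable bump allocator usable inside a job. Each stack frame
// records a mark, allocates scratch memory freely, and rewinds on exit,
// recycling all of it in O(1).
unsafe struct BumpAllocator
{
    byte* m_buffer;
    int   m_capacity;
    int   m_offset;

    public BumpAllocator(byte* buffer, int capacity)
    {
        m_buffer   = buffer;
        m_capacity = capacity;
        m_offset   = 0;
    }

    public int  Mark()             => m_offset;
    public void RewindTo(int mark) => m_offset = mark;  // O(1) bulk free

    public byte* Allocate(int bytes, int alignment = 16)
    {
        m_offset = (m_offset + alignment - 1) & ~(alignment - 1);
        if (m_offset + bytes > m_capacity)
            throw new System.InvalidOperationException("Out of scratch memory");
        byte* ptr = m_buffer + m_offset;
        m_offset += bytes;
        return ptr;
    }
}
```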
WorldUpdateAllocator doesn’t get rewound in baking worlds. Therefore, we have to use TempJob allocations everywhere when baking.
Sometimes, I really want to specify that a container in a job is meant to be allocated in the job via Allocator.Temp, without having to use the nuclear attribute [NativeDisableContainerSafetyRestriction]. Other times, I might have a job that takes a variable number of DynamicComponentTypeHandles, and I never know what to populate the unused slots with. Again, disabling container safety is really bad, because then if the user messes up job dependencies elsewhere, the issue may go unnoticed.
IJobEntityChunkBeginEnd doesn’t support a derived interface that uses default interface methods, because the source generators generate code that directly calls the methods rather than use a generic static invoker.
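For context, this is the shape of the pattern that fails today (the method signatures are my recollection of the current interface and may not match exactly):

```csharp
// Sketch: a derived interface providing default implementations. The
// IJobEntity source generator emits direct calls to OnChunkBegin/OnChunkEnd,
// bypassing these defaults instead of dispatching through a generic invoker.
public interface IAutoChunkProfiling : IJobEntityChunkBeginEnd
{
    bool IJobEntityChunkBeginEnd.OnChunkBegin(
        in ArchetypeChunk chunk, int unfilteredChunkIndex,
        bool useEnabledMask, in v128 chunkEnabledMask)
    {
        // Shared per-chunk setup would go here.
        return true;  // process the chunk
    }

    void IJobEntityChunkBeginEnd.OnChunkEnd(
        in ArchetypeChunk chunk, int unfilteredChunkIndex,
        bool useEnabledMask, in v128 chunkEnabledMask, bool chunkWasExecuted)
    {
        // Shared per-chunk teardown would go here.
    }
}
```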
I’m listing this here as this is a potential item on my todo list. I shouldn’t have to beat you to it. But you left such a mess with LODs, from back when you tried to do hierarchical LODs and then abandoned them, that now you don’t even want to touch it again.
I believe Dynamic Buffers should be allocated from a custom ECS-managed allocator and not using Allocator.Persistent. The allocations are often small and would be better suited to a pool. This is starting to become a performance problem for me during initialization due to all the consecutive small allocations in my ICleanupBufferElement types.
These are some of the other issues I have run into with the Latios Framework that required explicit workarounds that were far from ideal.
We have ICustomBootstrap for setting up systems at runtime. Why can’t we do the same thing in the Editor? I ended up extending ECS to do that, but I do it by accessing some internal Action after the Editor World is created and then trying to replace it. And then I also have a hack to rebuild the EditorWorld as a menu option when a buggy editor system goes haywire and the full Editor state is corrupted. Unfortunately, this hack isn’t bullet-proof and sometimes causes the wrong world to run an update or two, which then fails and throws errors in the console.
Then there’s baking. I have a custom Skinned Mesh Rendering solution. Why can’t I turn off the built-in Entities Graphics Skinned Mesh Renderer baking without turning off the entire baking of Entities Graphics? Once again, I hacked this by using a custom baker list mechanism that seemed to be created for tests. I do this at startup to create a custom bootstrap callback, and then for each baking world, I have a system in OnCreate assign a RateManager to one of the first ComponentSystemGroups baking uses, and then in that callback I disable systems I don’t want and inject systems with the DisableAutoCreation attribute. Why do those systems have that attribute? It is because they are for an optional feature that users may or may not want. Why do I use a RateManager? It is the only way to ensure the already included ComponentSystemGroups have had their OnCreate() called when I inject the systems, because otherwise I can’t add systems to them.
And while we are on this topic, I would greatly appreciate a flag in the UpdateBefore/After attributes to suppress warnings about the systems being in the wrong groups. Such systems might just not be installed at all. A user may have replaced one with a custom version or something. Bonus points if the warnings can be suppressed externally.
Personally, I think the whole bottom-up automatic injection design of systems is problematic. It makes it difficult for users to optimize system ordering for better worker thread occupancy, unless they want to decorate their systems with false dependencies. It becomes impossible to know just by looking at the code what the actual order of systems is if there’s a bug where some data is getting changed in the wrong place. And it makes it really hard to copy a system into a different project. Also, how do you define a system to run more than once in a frame?
A top-down approach solves all these problems, and the Latios Framework has the mechanisms in-place to support this. Unfortunately, this conflicts with a lot of existing paradigms. I don’t know the right answer.
Lastly, the whole ICustomBootstrap thing does not play well with embedded samples inside of packages. Bootstraps should be settings assets that can be swapped in the Editor. This feature is planned for a future Latios Framework version, but I wish I didn’t have to be the one to do it.
Why are collections married to singletons? Why are there even singletons? Do you truly only want one of something, or do you just want to know which entity is the entity? The Latios Framework solves these use cases independently with blackboard entities and collection components. The latter has similar problems as managed structs, except this time user API is fully Burst-compatible. But if you are from Unity and want to do something more official, please reach out to me!
Currently the Latios Framework has this SmartBlobber mechanism for creating blob assets in baking systems based on a “request” protocol. For each blob type, the user has to register the type so that a generic system can properly ref-count and store blobs in the BlobAssetStore (deduplicating in the process).
I currently face two problems that I have hacked around. First, adding concrete types to BlobAssetStore is not Burst-compatible. I have to use internal APIs to precompute the type hash prior to the job. Second, I would much rather add UnsafeUntypedBlobAssetReference blobs directly so that I don’t need generics. Honestly, I think the BlobAssetStore should use Burst’s type hashes instead of System.Type.GetHashCode and expose that as API for working with UnsafeUntypedBlobAssetReference.
The Latios Framework Smart Blobbers are a powerful concept. They allow baking systems to generate blob assets without necessarily knowing nor caring how those blob assets will be used. User bakers can request blob assets to be created. Baking systems create the blob assets, then pass the blobs back to the user to do what they please. The issue is how to pass those blobs back to the user without making the user write a custom baking system, which is error-prone. The solution I came up with is to create a generic baking system and a “bake item”. The bake item is a stateful IComponentData which does the original baking, and then later receives a callback with a reference to EntityManager and the primary entity to resolve any blob asset requests and assign them to components. This works, but it involves generic systems, and it is still somewhat unsafe. Ideally, there would be some way to have additional baker callbacks dispatched by a baking system. And inside these baker callbacks, the baker would only be allowed to change or remove components it added. I’m open to ideas for improvements and/or alternatives!
Transforms used to be a high severity item. But Pre.65 addressed much of the mess. What we have today is a heavily streamlined and simplified version of Transforms V1’s execution model. And while I believe better can be done (I wrote my own QVVS transform system that I really like while V2 was in chaos), design-wise I think we are finally on the right track again.
But there are some lingering issues with Unity’s implementation that have plagued every iteration of their Transforms. Fortunately, they are easy to fix without any API change, but like come on. Just fix them!
First, LocalToWorld does not need to be a float4x4. A float3x4 is sufficient, and will have better rendering performance. Additionally, ParentSystem and LocalToWorldSystem are both non-deterministic for no good reason. Both can be made deterministic and faster. And the Child buffer has a default capacity of 8. Why does it need that much chunk space?
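On the float3x4 point, the conversion is trivial because an affine transform’s bottom row is always (0, 0, 0, 1); dropping it saves 16 bytes per entity and memory bandwidth on the way to rendering. A sketch using Unity.Mathematics types:

```csharp
// Sketch: a float3x4 carries the full affine transform
// (stored as 4 float3 columns), omitting the constant bottom row.
static float3x4 ToFloat3x4(in float4x4 m)
{
    return new float3x4(m.c0.xyz, m.c1.xyz, m.c2.xyz, m.c3.xyz);
}

static float3 TransformPoint(in float3x4 m, float3 p)
{
    // Rotation/scale from the first three columns, translation from the last.
    return m.c0 * p.x + m.c1 * p.y + m.c2 * p.z + m.c3;
}
```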
An optional feature of the Latios Framework is a custom scene manager that automatically destroys runtime-created entities. This scene manager is focused on actual scenes, not subscenes. And it is designed for such scenes to be swapped synchronously. It is really important that the subscenes set to auto-load load synchronously, as there are first-frame-of-scene flags that a lot of gameplay features like to use when using real scenes and the scene manager. Right now, my solution for this involves fetching all the entities with RequestSceneLoaded, adding the BlockOnStreamIn flag, and then iterating through the ResolvedSectionEntity buffer and adding the same flag to all those entities. That last part seems unnecessary and wrong. Am I missing something?
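For reference, the workaround looks roughly like this (simplified; assumes main-thread EntityManager access and that every section entity carries its own RequestSceneLoaded):

```csharp
// Sketch of the current workaround: set BlockOnStreamIn on the scene entity,
// then mirror it onto every resolved section entity.
static void BlockOnAllSections(EntityManager em, Entity sceneEntity)
{
    var request = em.GetComponentData<RequestSceneLoaded>(sceneEntity);
    request.LoadFlags |= SceneLoadFlags.BlockOnStreamIn;
    em.SetComponentData(sceneEntity, request);

    if (!em.HasComponent<ResolvedSectionEntity>(sceneEntity))
        return;

    var sections = em.GetBuffer<ResolvedSectionEntity>(sceneEntity)
                     .ToNativeArray(Allocator.Temp);
    foreach (var section in sections)
    {
        var sectionRequest = em.GetComponentData<RequestSceneLoaded>(section.SectionEntity);
        sectionRequest.LoadFlags |= SceneLoadFlags.BlockOnStreamIn;
        em.SetComponentData(section.SectionEntity, sectionRequest);
    }
}
```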
A really common use case is to procedurally generate meshes for Mesh Renderers. While the algorithms work fine in bakers, getting the Mesh Renderer baker to accept this and not bake a null mesh or something would be great. Currently, I replaced the MeshRenderer baker with a custom version which checks that there is not a subclass of some other MonoBehaviour before continuing. If there is, it leaves it up to that other MonoBehaviour to do custom baking instead, providing the custom mesh and list of materials to use for the renderer. You can see this in action in LSSS, as all the capsules are generated procedurally at bake time.
Entities Graphics baking is just buggy in general. The baking system adds components that don’t get removed by reversion. RenderMesh works on lists of materials, causing lots of defensive reallocations to avoid accidental sharing, only for deduplication to consider just the first material. And if any of the materials are transparent, all the opaque materials on the same mesh get rendered with split batches.
The range MMI mechanism at runtime is actually an excellent and powerful feature. But the baking makes a total mess of it. I personally rewrote the entire baking stack to fix all of this, and everything is amazing again.
Psyshock uses generic jobs in Physics.FindPairs() using a pattern that allows Burst to detect and compile the jobs both in the Editor and in builds without having to explicitly register the generic types with attributes. Unfortunately, the ILPP can’t pick up on it and patch these jobs to be Burst-schedulable. There should not be a discrepancy!
Currently I am relying on reflection to find and call the EarlyJobInit() methods myself for specific generic types.
The whole Skinned Mesh Rendering solution in Entities Graphics is problematic. It generates GC every frame, it doesn’t scale, and even the public API types of SkinMatrix and BlendShapeWeight fundamentally prevent more efficient algorithms like the ones Kinemation uses. I’ve been told numerous times that the skinned mesh rendering design is “experimental”. If that’s the case, why is it in the released version of Entities Graphics without any guard flags?
I’m only asking this because I have a sliver of hope that Entities Graphics may adopt a design closer to Kinemation in which case I can delegate some features of Kinemation to the official package.
I frequently run into issues where default groups accidentally get added to other groups if I don’t explicitly remove them from the list. Since these are systems that Unity will manually create, they should have a [DisableAutoCreation] attribute. At least now that they are partial, I am able to fix this with asmref.
If you try to get all systems, the systems with the DisableAutoCreation attribute don’t get added to the list even if you specify All like the XML documentation suggests. I have to use reflection for this now, which sucks.
Is there a more performant way to get the raw blend shapes data (the deltas, not the animated parameters) than queueing up a bunch of async readbacks and then batch-completing them inside a baking system?
I want to bake audio clip samples into blob assets. The current API doesn’t offer any NativeArray methods and is slow. Also, if I could get the raw compressed bytes and compression codec of audio clips, I could do my own decompression at runtime without having to do my own compression. That would be awesome!
One of the features of blackboard entities in the Latios Framework is that they merge components of blackboard config entities whenever a subscene containing them loads. This allows the user to spread config authoring data across multiple GameObjects. However, doing this merging at runtime is surprisingly difficult. While it is easy to get the ComponentType list to copy from one entity to another, it is significantly more difficult to actually copy those types. For unmanaged components, we have the tools now. But for managed components, and especially shared components, it is problematic. Currently, the Latios Framework uses reflection, but I would love for there to be a proper EntityManager.AddComponentFromOtherEntity(Entity src, Entity dst, ComponentType ct) API so that I can Burst-compile this whole thing.
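A hypothetical shape for that API (nothing like this exists today; the point is a by-ComponentType copy that covers managed and shared components without generics):

```csharp
// Hypothetical extension illustrating the requested API surface. The actual
// copy would have to live inside the entities package, where the managed and
// shared component stores are accessible without reflection.
public static class EntityManagerCopyExtensions
{
    public static void AddComponentFromOtherEntity(
        this EntityManager em, Entity src, Entity dst, ComponentType ct)
    {
        em.AddComponent(dst, ct);
        // Hypothetical internal call that copies the data for any category
        // of component (unmanaged, managed, shared, buffer) by type index:
        // em.CopyComponentDataRaw(src, dst, ct);
    }
}
```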
You can get a read-only pointer to components in a chunk, even if the ComponentTypeHandle is declared with write access. You cannot do the same for a BufferAccessor.
I had a bug where queries weren't being matched because of this. The bug only happened in an editor system while the subscene was open. It was really annoying.
These are little things in the API I think should be improved, but don’t have a major impact on the Latios Framework.
These are a mess. There are lots of combinations that aren’t supported correctly, and lots of other cases where the dependencies aren’t brought in correctly. WithPresent fixed a lot of issues, but not everything respects it. And there are still dependency problems, especially with WithAny on enabled states.
I have large chunk components. Reading/Writing by ref is way faster. I have extensions to do this, but official support would be better.
Similarly, I’ve also noticed ref gaps for EntityManager.
The biggest issue I have with idiomatic foreach is that it is really clunky for large queries. With Entities.ForEach, you could put each argument (type and variable name) on a different line. That doesn’t really work well with idiomatic foreach. I recognize this is a hard problem, and I don’t have a proposed solution yet.
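To make the complaint concrete, here is roughly what a wide query looks like today (component names other than LocalTransform are placeholders): the variable names live in one tuple and the types in a separate list, so they drift apart as the query grows.

```csharp
// Illustration only: a large idiomatic foreach. Matching the fourth name in
// the tuple to the fourth type in the query is left to the reader.
foreach (var (transform, velocity, mass, health, entity) in
         SystemAPI.Query<RefRW<LocalTransform>,
                         RefRO<Velocity>,
                         RefRO<Mass>,
                         RefRW<Health>>()
                  .WithEntityAccess())
{
    // ...
}
```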
But at the very least, make it so that we can have Entity first in the tuple. It is difficult to articulate why, but the Entity being at the end annoys me and most others I talk to.
As an alternate workflow to idiomatic foreach, I have thought about defining an IAspect for each foreach loop. Is there a way to tell an IAspect to be hidden in the inspectors?
Other things I want are to be able to iterate chunks without fetching a NativeArray<ArchetypeChunk> and to provide a custom query to idiomatic foreach (or extract the query from it).
Codegen already injects the ComponentTypeHandles into IJobEntity. Can we have an [Inject] attribute for lookups and Time to have codegen do the same? That would reduce a bunch of boilerplate.
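Something along these lines (the attribute, the field injection, and the component names are purely hypothetical):

```csharp
// Hypothetical: codegen would populate these fields when the job is
// scheduled, just as it already does for the Execute parameters' handles.
[System.AttributeUsage(System.AttributeTargets.Field)]
public class InjectAttribute : System.Attribute { }

public partial struct ApplyDamageJob : IJobEntity
{
    [Inject] public ComponentLookup<LocalToWorld> ltwLookup;  // hypothetical
    [Inject] public float deltaTime;                          // hypothetical Time injection

    void Execute(Entity entity, ref Health health)  // Health is a placeholder
    {
        // ...
    }
}
```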
I keep finding these, and they always catch me off guard in custom bootstraps. Put them where they belong.
This isn’t possible in idiomatic foreach. I have to have some additional dummy read component around, or deep copy the entity array.
We have to allocate a NativeArray<ArchetypeChunk> if we want to do this. Can we get a proper enumerator?
NativeStream doesn’t respect alignment, gets its counts messed up when writing piecewise but reading in bulk (or vice-versa), can’t store writes of 4 kB or greater, can’t defer allocation with a schedule-time-known allocation size, etc.