@MerlinVR
Created January 20, 2020 07:46
Blit scripts v2 early revision

Blit Component Specification v2

Proposed blit component for avatars and worlds to use in VRChat

Canny post made requesting this to be implemented: https://vrchat.canny.io/feature-requests/p/graphicsblit-scripts

This is an update to the original blit script request. There were concerns that the original request was underspecified and that people would ask for more functionality after it was implemented, so this version has been expanded to cover the majority of the functionality that shader developers expressed interest in after the original basic blit scripts were published.

These scripts have been tested and confirmed to be fully functional in Unity 2018.4, 2019.3, and 2020.1.0a.

If you have any questions, requests, or feedback, please open an issue on GitHub or contact me through Discord at Merlin#0001.

Proposed features

These features are implemented in the current example scripts and have example scenes that cover basic use cases of each of these features.

Each major change in functionality or features has a #define at the top of BlitComponent and BlitController. It will probably be easier to read the code if you choose a set of defines and remove the code that is inactive for that set.

Core functionality

These are things that should be considered required in some form in order for this to be useful to people.

Blit to target render texture

Allow specification of a target render texture, and optionally a source texture to reduce the number of materials needed for copying.
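As a sketch of how this is intended to be wired up, using the BlitComponent and BlitController defined later in this gist (the texture, material, and class names here are placeholders, and this assumes a renderer already exists on the object):

using UnityEngine;

// Minimal setup sketch for a single blit. The BlitComponent reads its
// material from the renderer on the same object, and a BlitController is
// what actually drives the updates each frame.
public class BlitSetupExample : MonoBehaviour
{
    public RenderTexture stateTexture;   // target the blit writes into
    public RenderTexture previousState;  // optional source, bound as the main texture
    public Material blitMaterial;        // material running the blit shader
    public BlitController controller;    // controller that will run this blit

    void Start()
    {
        BlitComponent blit = gameObject.AddComponent<BlitComponent>();
        blit.targetTexture = stateTexture;
        blit.sourceTexture = previousState;
        GetComponent<Renderer>().sharedMaterial = blitMaterial;
        controller.blitComponents = new BlitComponent[] { blit };
    }
}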

Blit component transform information

This passes the transforms of the object that the blit script is attached to into the shader. In this implementation you are provided with _BlitToWorldMatrix, _BlitToLocalMatrix, and _BlitWorldPosition. This is necessary for many interactive applications.

Animated material properties

This is why the BlitComponent requires a renderer on the object it is attached to. It takes advantage of the fact that Unity implements material property animations as a MaterialPropertyBlock set on the renderer.

If you animate material properties on the attached renderer, the blit component will copy those properties to the blit invocation. There are two implemented mechanisms for this, switched between using the GRAPHICS_BLIT_USE_COMMAND_BUFFER define.

The default behaviour is to have GRAPHICS_BLIT_USE_COMMAND_BUFFER enabled since it is much easier and less error-prone than the alternative. With this enabled, the blit functionality is emulated using a command buffer that calls DrawRenderer(), because you cannot pass a property block directly into a Graphics.Blit() call.

With GRAPHICS_BLIT_USE_COMMAND_BUFFER disabled, this uses Graphics.Blit() internally, and the user must specify a list of properties that are animated and need to be copied, since Unity provides no way to enumerate the properties set on a property block. This has the additional downside that, while testing in the editor, the animated properties will be set on the material asset and will persist after exiting play mode.
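For illustration, the core of the command buffer path amounts to something like this (a simplified sketch, not the exact code below):

using UnityEngine;
using UnityEngine.Rendering;

// Simplified sketch of why command buffer mode needs no property list:
// Unity's animation system writes animated material properties into a
// MaterialPropertyBlock on the renderer, and CommandBuffer.DrawRenderer()
// applies the renderer's current property block when it draws, so the
// animated values reach the blit material without enumerating them.
static class PropertyBlockBlitSketch
{
    public static void BlitViaRenderer(CommandBuffer cb, Renderer renderer,
                                       Material blitMaterial, RenderTexture target)
    {
        cb.SetRenderTarget(target);
        cb.DrawRenderer(renderer, blitMaterial); // property block rides along
    }
}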

Extra functionality

These are things that are optimizations for specific use cases or quality of life features.

Reference transform array

This allows the user to specify a list of transforms that get passed into the shader with the float4x4 _BlitToWorldMatrixArray[] and float4x4 _BlitToLocalMatrixArray[] uniforms. This functionality can be handled by having a blit component for each transform that needs to be written, but it can be cumbersome to do this and has more overhead.

Cube and 3D texture support

If GRAPHICS_BLIT_SUPPORT_CUBE_3D is enabled, you can use Cube and 3D render textures for the target render texture. If one of these types of render textures is used, the x component of the _BlitPassInfo uniform will specify the current slice that is being rendered to.

Skinned mesh and generic renderer capture

This is supported when GRAPHICS_BLIT_USE_COMMAND_BUFFER is enabled. The mesh particle emission example uses this on a skinned mesh to record vertex positions and motion. To work properly on skinned meshes, this uses the forceMatrixRecalculationPerRender flag, which only exists in Unity 2018.3 and newer and is required for the skinned mesh to update in time for the blit. Without the flag, the skinned mesh data will be from the last frame, which isn't acceptable for many use cases.

Rationale

There are a number of reasons that we originally proposed this script, and there are some new reasons that have only become clear after we have been able to use Custom Render Textures. We still think that this is a reasonable and useful request.

Custom Render Textures

Custom Render Textures (CRTs) are cool in theory, but in practice they are a black box that executes logic from an asset file. CRTs don't serve any particular purpose in an environment where you can script, since the same behavior and feature set could be implemented in a few hundred lines and would be much less liable to break.

The original blit script request was written when VRChat was on Unity 5.6. While we did not make it clear at the time (and should have, since CRTs were incorrectly touted by Unity as a magical in-engine solution that would solve everything), we were well aware that Custom Render Textures existed in the next Unity version (2017.4), and the original request was made despite that, for two major reasons:

The first reason was that CRT's are a niche Unity black box asset. If things break with them, the best that VRC could do is attempt to work around the issues and make a bug report with Unity that may or may not be fixed in some number of years.

The second and more important reason was that it was already clear that CRTs expected you to be able to script in order to set any kind of external data on them. Since they were treated as assets, there was no performant way to pass data into the materials they used. For instance, without the use of a Camera component, you couldn't tell them about the location of your hands for basic GPU particles, or whether the system was turned on.

Custom Render Textures after a year

The first reason ended up being a real issue once we actually got to Unity 2017.4: we quickly realized that CRTs placed on avatars would stop updating under various conditions, starting with https://vrchat.canny.io/bug-reports/p/custom-render-textures-bugged-on-avatars, then in later patches https://vrchat.canny.io/bug-reports/p/custom-render-textures-break-when-opening-menus. This made them not particularly useful on avatars since they required the avatar to be cleared from cache to work.

"Luckily they work correctly in worlds," you might say. But they have fundamental design flaws that are at odds with using them effectively for anything other than simple single-CRT examples or purely feed-forward systems. I wasted two days attempting to get Custom Render Textures to function for my fluid simulation world, but ended up reimplementing the system in a few hours with cameras, since CRTs constantly broke the system in unexpected ways.

Cyclic dependencies

The root of their design flaws is that they cannot handle cyclic dependencies at all. CRTs boast the ability to automagically figure out the correct update order for a series of CRTs that depend on each other's results. The catch is that they fail catastrophically on cyclic dependencies. If there is any cyclic dependency in updates, Unity will spam errors in the editor and will shuffle the update order of the CRTs seemingly randomly in game. This is a massive issue since the main use of them, in my opinion, is to handle simulations. Simulations are very frequently recurrent and have cyclic dependencies on the previous frame's update results, and anything remotely complex requires more than one render texture pass, so the double buffered flag does not help.

I hoped that using OnDemand mode and manually calling update on them would allow you to specify a manual update order, but all that calling update on a CRT does is tell Unity that the update needs to run at some point in the current frame, in an order determined automatically by Unity's dependency finding.

No proper handling for repeated updates

The other issue is more of a problem on the performance side. While CRTs provide a way to run n updates per frame, they double buffer every update. This is the other main reason I didn't use them for my fluid simulation. It may not be immediately clear why this is an issue, or that it is entirely avoidable, if you're not familiar with how the GPU works or how simulations are usually handled on the GPU. Unless you are using Unordered Access Views (UAVs), you cannot modify data in a texture at the same time that you read from it. In Unity, UAVs are implemented using a RenderTexture with the enableRandomWrite flag enabled, bound with Graphics.SetRandomWriteTarget() or one of the equivalent functions for command lists. To work around the limitation, older game engines without access to compute shaders, and Unity traditionally, copy the updated contents of a render texture into a second one so that you can read from the second one while you're writing to the main render texture. This is referred to as double buffering.
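For reference, the UAV path mentioned above looks roughly like this in Unity; it is shown only for contrast with double buffering, since this path isn't usable in VRChat:

using UnityEngine;

// Sketch of Unity's UAV binding. A render texture created with
// enableRandomWrite can be written at arbitrary locations from a pixel
// shader once it is bound with Graphics.SetRandomWriteTarget().
static class UavSketch
{
    public static RenderTexture CreateUavTexture(int width, int height)
    {
        var rt = new RenderTexture(width, height, 0, RenderTextureFormat.ARGBFloat);
        rt.enableRandomWrite = true; // required for UAV binding
        rt.Create();
        return rt;
    }

    public static void Bind(RenderTexture rt)
    {
        // Slot 0 is taken by the render target; the shader declares a
        // matching RWTexture2D at the same register (u1 here).
        Graphics.SetRandomWriteTarget(1, rt);
    }
}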

This is perfectly fine if you want to run one update on each buffer, where before or after it gets updated, it copies to a second buffer for the next update to use. The issue is when you want to do multiple updates. For the fluid simulation I needed to do 10-20 diffusion updates for some parts of it. The problem is the way that CRTs currently handle sequential updates of the same buffer: if I want to do 20 updates of the same buffer in sequence, Unity needs to copy the buffer 20 times as well. Those 20 copies are completely unnecessary. Unity's documentation implies that this may be fixed at some point, but it has not been fixed as of 2020.1 from what I can tell.

The reason those copies are unnecessary is that you can do what is sometimes referred to as ping-pong double buffering when you have a sequence of updates. This means that you swap the target and source render textures on each update. So if I wanted to update a buffer 4 times, using Unity's CRT update handling the sequence would look like this:

Update -> copy -> update -> copy -> update -> copy -> update -> copy

Using ping-ponging, with A and B being the two buffers that exist for double buffering, the sequence instead looks like this:

Update A to B -> update B to A -> update A to B -> update B to A

What this is doing is using the two double-buffering buffers by swapping them between each update, so you always get the last pass's data without needing to copy it around redundantly. For the fluid simulation specifically, this was a massive issue since the updates are entirely limited by bandwidth; the actual calculations are so simple that they are not measurable. So for no particular reason, Unity makes updating the fluid simulation take nearly twice as long on the GPU by copying the buffers around redundantly.
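A minimal sketch of the ping-pong pattern described above, using plain Graphics.Blit():

using UnityEngine;

// Ping-pong double buffering: the two buffers alternate as source and target
// each pass, so every update reads the previous pass's results without any
// intermediate copy.
static class PingPongSketch
{
    public static void RunUpdates(Material updateMaterial,
                                  ref RenderTexture a, ref RenderTexture b,
                                  int updateCount)
    {
        for (int i = 0; i < updateCount; ++i)
        {
            Graphics.Blit(a, b, updateMaterial); // read A, write B
            RenderTexture temp = a;              // swap roles for the next pass
            a = b;
            b = temp;
        }
        // After the loop, 'a' always holds the most recent state.
    }
}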

Of course, if we had compute shaders and read-write texture binding, we could do many of these updates in place for slightly better performance and half the VRAM usage. But that is not what this canny is for, since the blit component is intended for use on avatars as well as worlds.

Other random issues

There are other issues with them that probably can't trivially be fixed from VRC's side and I won't go into detail here, so message me on Discord if you want more details.

The Camera component

Currently, the Unity Camera component is the most controllable and robust method we have for storing data on the GPU between frames. However, VRC has left it in a weird, one-of-a-kind filtering limbo where only your friends are able to see the camera component update, and they have no way to turn it off without blocking your avatar. The camera component is not really intended for running simulations in shaders, but it has been adapted for that by the community, similar to how older, more limited engines and rendering APIs used fake cameras for post processing or things like shallow water equation simulations. I'd argue that camera components should still have a place in the performance settings whenever we get them, since they have important use cases that the blit components don't cover, but the blit component is a narrower use case that can be much more heavily optimized and deserves a place of its own with that in mind.

The friends-only camera filtering is also not super convenient: there is a caching issue which means that if you friend someone you want to show camera-dependent content to, and they have already loaded your avatar with the cameras on it, they will need to restart their game for the cameras to work.

Most camera-based systems are also necessarily built on using the UiMenu layer for culling objects that need to be visible only to the camera. Since I made the original GPU particles using cameras on avatars, people have used the UiMenu layer to prevent the cameras that handle updates for their simulations from rendering objects unnecessarily. In VRC, all objects under your avatar hierarchy are normally moved onto the PlayerLocal layer if the avatar is loaded locally, or onto the Player layer if the avatar is loaded remotely. UiMenu is the one exception to this; objects on that layer do not get reassigned to the Player or PlayerLocal layer.

This exception was not only about performance, but also a measure to prevent crashing. Prior to Unity 2017, if you executed a grab pass on a texture format with 128-bit depth (ARGB Int or ARGB Float), the game would instantly hard crash. This was particularly bad because portals, which often had grab passes, would cull into the view of an update camera and crash you. This was fixed in the version of 2017 that VRC currently uses, so the crashing issue isn't a problem now, though performance is still a big concern. Unity has optimizations to cull layers that the camera isn't viewing, but if UiMenu were removed from avatars, the cameras would need to cull against every avatar, which could take upwards of 0.4ms extra per camera on the CPU. Even with UiMenu, cameras still have a flat overhead somewhere on the order of 0.3-0.4ms that can mostly be avoided with the blit scripts.

I bring this up here because it's undocumented functionality that people depend on, knowingly or unknowingly. A number of my systems would need to be entirely refactored if UiMenu disappeared from avatars. QuantumHero's GPU particle asset, the most popular use of avatar cameras to my knowledge, has a YouTube tutorial video about how to configure it. At the time of writing, that video alone has 13,000 views. As with most tutorial videos of its nature, you can probably take that as a rough estimate of the number of people who have attempted to set it up, not counting people who have cloned avatars using the system, who probably vastly outnumber the views on the video. If UiMenu were removed without handling for patching old avatars, it's possible that tens of thousands of people would be affected.

This isn't just supposition about "what if VRC did that". UiMenu was nearly removed a few months ago in an attempt to remove self avatar stations; this is the relevant canny: https://vrchat.canny.io/feature-requests/p/allow-sitting-in-your-own-station/. Luckily we had the opportunity to tell the people removing the self stations about the unintended consequences of removing UiMenu without special handling before the change was pushed, so they were able to remove self stations properly without relying on brittle layer checks. But they told us at the time that UiMenu is still liable to be removed in the future, so blit scripts are a viable alternative to put in place before that potential removal. This is an opportunity to not shoot first and ask questions later: the blit scripts could be implemented many months or years before UiMenu needs to be removed from avatars. By then, the benefits of using them over cameras should have moved most people to blit-script-based systems, and people who haven't switched could be pointed to up-to-date systems instead of being left in the cold like many of these removal patches have done in the past.

Filtering

Since cameras currently only work for friends, a massive point, and the biggest point for some people, of having the blit scripts on avatars is that they could be filtered via the performance and safety systems instead of being gated behind friends. They would probably fall under Shaders in the current safety system categories, and would be a category of their own in the performance system.

While it's great that we still have cameras to work with, I've seen firsthand how having them gated behind friends has stifled creativity within the community. There are numerous shader developers I've taught to use cameras for running simulations and logic, but most of them have stopped or gotten bored because they can't easily show off to strangers, and showing things to their friends group gets old quickly. Saying things along the lines of "but they could just add them as a friend" discounts the much more important interactions where someone is showing something to one person and other people come to observe. Due to the avatar caching bug, this interaction is nearly nonexistent: for it to happen, you need to friend the person and they need to rejoin. That is too much effort for many people, and it requires the catalyst of already having some people in the room friended. You can't just walk into a room and show how your work has paid off.

Udon

Worlds

The original proposal was made more than a year ago, expecting Udon to be something far off in the future. Since Udon is being released for worlds soon and will at some point have Graphics.Blit(), what is the point of having this for worlds? I'd argue that being able to run 50+ blits in time on the order of tenths of a millisecond would be good, as opposed to the multiple milliseconds where Udon performance is at the moment. Of course, once major issues have been resolved and features have been implemented, I'm sure there's some room for improvement in Udon performance.

But given that performance will improve over time, the original point of blit scripts for worlds will be satisfied by having Graphics.Blit for Udon. The original graphics blit proposal was made expecting Udon to be a long way off, and was just a stop-gap for worlds until Udon arrived. For me, the end game for worlds was always to have the capability to execute compute shader kernels and bind the relevant resources via Udon. That is largely a separate thing from the blit scripts, so I'll leave it at that with the canny: https://vrchat.canny.io/vrchat-udon-closed-alpha-feedback/p/add-nodes-for-compute-shaders

Avatars

Avatars are a different story. There are rumors that Udon will be supported on avatars in some capacity in the avatar 2.0 update that will be released at some nebulous point in the future. This section assumes that avatar 2.0 does support Udon, and that it miraculously exposes the Graphics.Blit function and the material parameter setters that would be needed. If avatar 2.0 doesn't support Udon, then the blit scripts should be part of the avatar 2.0 patch in my opinion, or put in even earlier since they are independent of most systems.

Provided that Udon is supported on avatars, regardless of how it functions, I'd argue that the blit scripts should be separate from a performance standpoint, since they primarily operate on the GPU and any Udon equivalent would likely have somewhat more CPU overhead than having the scripts as part of the client in IL2CPP-optimized code. If Udon does support Graphics.Blit on avatars, it would still be nice to have that regardless of whether the blit scripts get implemented, since there may be some case where using Udon for blitting is necessary, though I've done my best with v2 to cover most of the use cases that blit would be able to.

This is the speculation corner, where I speculate on ways that Udon might work on avatars and on potential issues that should be taken into consideration if it actually works in those ways.

Since it's difficult to measure the time a given Udon script may take, and since the Udon VM already has an execution timeout, it's possible that Udon scripts for avatars could have a much, much lower timeout than world scripts. If you want to only allow blit script functionality through Udon for avatars, or any functionality really, and have a timeout, I'd argue that you should have some minimum number of instructions that are guaranteed to execute every frame regardless of the timeout, for things that are critical to working correctly. For instance, many of my blit scripts rely on using local coordinate spaces for storing data as an optimization. The entire coordinate space is shifted every frame to follow my avatar and the attached mesh that displays its contents. If script execution randomly times out on some low-end systems after a couple of function calls, that will randomly break my system and cause it to jitter around. Similarly, if some Udon script moves an object to your hand position every frame, then that obviously needs to run every frame or the object will become disconnected.

If Udon on avatars were to run on a separate execution thread, then obviously you'd need to make sure blits are synced, or make sure that they work from an async context. That raises the question of what happens if the blit update runs out of step with the frame, skips a frame, runs multiple updates in a single frame, etc.

Anyway, I can only speculate, and there's not much point in it until Udon is confirmed or denied for avatar 2.0. But regardless of how Udon would work on avatars, I think there is a case to be made for allowing blit scripts separately from it on performance grounds.

Performance

This script allows much lower CPU overhead for using render texture loops to run stateful logic in shaders for avatars and worlds. It is approximately 10x as fast as using Camera components on the CPU side, since a blit does not need to run culling and a number of other things. This brings each pass from ~0.3ms down to ~0.03ms. The command list version has a small flat overhead (~0.05ms on my system) for starting the command list, so I would recommend batching all blit controller updates that aren't dependent on camera renders into a single command list invocation.

Considerations for the performance system

Some basic metrics have been included at the bottom of BlitController.cs for consideration, with rough recommendations for their corresponding effects on performance. Of course, these metrics don't account for the cost of the shaders being executed. See the definitions of the metrics in the code for more detail.

The metrics (some are not applicable depending on which defines are enabled):

  • Total blit count
  • Blit GPU bandwidth
  • Animated parameter count (only relevant if not in command buffer mode)
  • Reference transform count (only relevant if reference transforms are enabled)

All of these metrics return ints or floats and come with some rough performance measurements I took, so if you decide to extend the performance system into a proper scoring system in the future, instead of rating by the single worst-performing thing, it should be easy to convert them to scores. The example scenes can also be modified for quick performance tests.
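As a purely hypothetical illustration of folding the metrics into one combined score (the divisors are placeholders derived loosely from the rough numbers in the metric comments, not tested recommendations):

// Hypothetical scoring sketch; weights are placeholders, not recommendations.
static float GetBlitScore(BlitController controller)
{
    float score = 0f;
    score += controller.GetBlitCount() / 3f;          // "medium" suggested around 3 blits
    score += controller.GetBlitGPUBandwidth() / 40f;  // ~40MB suggested for very poor
    score += controller.GetTransformCount() / 50f;    // ~50 transforms suggested for poor
    return score; // e.g. >= 1 could map to medium, >= 2 to poor
}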

Relative performance of common applications

The performance system still doesn't and can't tell you that someone has a grab pass in their shader, because Unity doesn't expose that info and you have no easy way to change that without engine source access. Grab passes are a larger hit on performance than most of the things you'd see directly using the blit scripts. For instance, playing the game at 4K results in each grab pass taking 0.7ms. This is obviously worse in VR with supersampling, and it is multiplied by mirrors and any cameras in the world, so it quickly gets out of hand when multiple people with grab passes are sitting around. My unoptimized, basic GPU particle sample implementation included in this repository takes 0.32ms to update 1 million particles. With properly optimized implementations like the ones I usually use, this time is halved to ~0.16ms since everything describing a particle can fit into 1 pixel instead of 2. Of course, rendering the actual 1-million-particle mesh takes the vast majority of the time in this case; it can take >4ms if you cover your face with it, depending on the scale of the particles. But that's not really relevant to blit performance, since it mostly falls under the triangle count in the performance system.

From a performance standpoint, the majority of practical applications of blit scripts would be much less performance-heavy than GPU particles. Many applications just need blit to record small amounts of data, like finger positions for custom markers, or to have a couple of entities follow your player around. These kinds of things take on the order of microseconds to execute on the GPU.

VRChat-specific considerations

Update manager

If you use the command buffer update mode, it'd probably be a good idea to have the blit controllers register with a manager that dispatches all of their updates through a single command list, since there is a small overhead from executing a command list, as noted in the performance section. If you do this, make sure to provide a public function that allows worlds to execute a command list maintained by the blit controller at specific times. A specific use case of this would be executing a blit script directly after a camera in the world has rendered a frame, via the OnPostRender event on an Udon behaviour, so that updates are not a frame behind. Implementing this kind of thing is pretty dependent on how VRC has things set up, and it should only take a couple of hours to do, so I don't have an example implementation of it.
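That said, a rough sketch of the shape such a manager could take (purely illustrative; the class name and registration flow are assumptions, not VRC API):

using System.Collections.Generic;
using UnityEngine;

// Rough sketch of the manager described above. A real implementation would
// have each controller append its blits to one shared CommandBuffer and call
// Graphics.ExecuteCommandBuffer() once, paying the execution overhead a
// single time per frame.
public class BlitUpdateManager : MonoBehaviour
{
    private static readonly List<BlitController> controllers = new List<BlitController>();

    public static void Register(BlitController controller) { controllers.Add(controller); }
    public static void Unregister(BlitController controller) { controllers.Remove(controller); }

    void LateUpdate()
    {
        // Illustrative only: this still executes one buffer per controller.
        foreach (BlitController controller in controllers)
            controller.ExecuteUpdate();
    }
}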

Example scenes

todo

1-Blit Update

2-GPU Particles

3-Reaction Diffusion

4-Mesh Particles

5-3D Textures

/**
* MIT License
*
* Copyright (c) 2020 Merlin
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#define GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
#define GRAPHICS_BLIT_USE_COMMAND_BUFFER
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
#define GRAPHICS_BLIT_SUPPORT_CUBE_3D // Only supported if you're using GRAPHICS_BLIT_USE_COMMAND_BUFFER
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
#define GRAPHICS_BLIT_SUPPORT_MIPMAP_OUTPUT
#endif
#endif
using System;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Rendering;
/// <summary>
/// Blit component which handles a single render texture blit update.
/// This component requires a mesh renderer to provide an easy way to animate properties on the blit material.
/// Defines are provided for enabling extended functionality or alternate functionality that would be nice, but is not necessary.
///
/// GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY: Allows the user to specify a list of transform references to pass into the blit material.
/// This isn't needed, but is a minor optimization and quality of life thing if people want multiple transforms to get passed into a material.
/// If users want to pass in multiple transforms without this, they can just run a bunch of blits that record each of their transforms.
///
/// GRAPHICS_BLIT_USE_COMMAND_BUFFER: Instead of using Graphics.Blit(), uses CommandBuffer.DrawRenderer().
/// This has the benefit that the user no longer needs to specify animated properties
/// since they can be passed directly into the render call via the property block copied from the animated mesh renderer.
///
/// GRAPHICS_BLIT_SUPPORT_CUBE_3D: Minor change to support blitting Cube maps and 3D textures. This requires GRAPHICS_BLIT_USE_COMMAND_BUFFER to be enabled.
/// This more or less makes the script reach feature parity with Custom Render Textures while still allowing much more control over rendering updates,
/// and presumably without breaking in-game like Custom Render Textures do.
/// </summary>
#if !GRAPHICS_BLIT_USE_COMMAND_BUFFER // If we are using command buffer mode, we can pass in any type of renderer and render it directly to the blit target.
[RequireComponent(typeof(MeshRenderer))]
#endif
public class BlitComponent : MonoBehaviour
{
public RenderTexture sourceTexture = null;
public RenderTexture targetTexture = null;
#if GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
public Transform[] referenceTransforms;
#endif
#if !GRAPHICS_BLIT_USE_COMMAND_BUFFER
public string[] animatedFloatProperties;
public string[] animatedIntProperties;
public string[] animatedColorProperties;
public string[] animatedVectorProperties;
private int[] animatedFloatPropertyIds;
private int[] animatedIntPropertyIds;
private int[] animatedColorPropertyIds;
private int[] animatedVectorPropertyIds;
#endif
private int toWorldMatrixParam = -1;
private int toLocalMatrixParam = -1;
private int worldPositionParam = -1;
private int blitInfoParam = -1;
private Renderer meshRenderer = null;
private Material blitMaterial = null;
private Vector4 blitInfoData;
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
private static Mesh blitMesh = null;
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
private RenderTargetIdentifier[] targetTextureIdentifiers;
private Vector4 blitPassInfoData;
private int blitPassInfoParam = -1;
#else
private RenderTargetIdentifier targetTextureIdentifier;
#endif
private int mainTexParam = -1;
#endif
private MaterialPropertyBlock propertyBlock;
#if GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
private Matrix4x4[] frameToWorldTransforms;
private Matrix4x4[] frameToLocalTransforms;
private int toWorldMatrixArrayParam = -1;
private int toLocalMatrixArrayParam = -1;
private static readonly int MaximumPropertyArrayLength = 1023;
#endif
private bool initHasRun = false;
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
// When the component is created, if there's no renderer on the object, make one.
// Only used in command buffer mode because any renderer will do.
private void Reset()
{
if (GetComponent<Renderer>() == null)
{
MeshRenderer newRenderer = gameObject.AddComponent<MeshRenderer>();
newRenderer.lightProbeUsage = LightProbeUsage.Off;
newRenderer.reflectionProbeUsage = ReflectionProbeUsage.Off;
}
}
#endif
public void Start()
{
if (initHasRun)
return;
if (targetTexture == null)
{
Debug.LogWarningFormat("No target render texture specified in Blit Component {0}, blitting will not execute!", gameObject.name);
}
meshRenderer = GetComponent<Renderer>();
if (meshRenderer == null)
{
Debug.LogWarningFormat("No renderer found on {0}, blitting will not execute!", gameObject.name);
}
propertyBlock = new MaterialPropertyBlock();
toWorldMatrixParam = Shader.PropertyToID("_BlitToWorldMatrix");
toLocalMatrixParam = Shader.PropertyToID("_BlitToLocalMatrix");
worldPositionParam = Shader.PropertyToID("_BlitWorldPosition");
blitInfoParam = Shader.PropertyToID("_BlitInfo");
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
blitPassInfoParam = Shader.PropertyToID("_BlitPassInfo");
#endif
#if GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
if (referenceTransforms.Length > MaximumPropertyArrayLength)
{
Debug.LogWarningFormat("Reference transform count for blit component is above maximum allowed shader array length ({0}), truncating to {0} transforms.", MaximumPropertyArrayLength);
Array.Resize(ref referenceTransforms, MaximumPropertyArrayLength);
}
// It would be nice to flatten these arrays to only unique, and non-null transform references from the referenceTransforms list.
// However doing this would break any shaders that rely on a specific ordering of transforms.
// Flattening the array would take a few minutes to implement, so I'll leave it to VRC to put in if they decide to do so.
// If you decide to flatten the reference transforms array into only unique, non-null references, please communicate that this is happening CLEARLY in the editor so that people don't get confused why something is not working.
frameToWorldTransforms = new Matrix4x4[referenceTransforms.Length];
frameToLocalTransforms = new Matrix4x4[referenceTransforms.Length];
toWorldMatrixArrayParam = Shader.PropertyToID("_BlitToWorldMatrixArray");
toLocalMatrixArrayParam = Shader.PropertyToID("_BlitToLocalMatrixArray");
#endif
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
// The one downside of using command buffers with texture identifiers is that it's not possible to animate the target texture reference since we build the identifiers once at Start()
// Though animating the target would have pretty niche use cases and people could just toggle different controllers if they want the same functionality for some reason.
if (targetTexture != null)
{
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
if (targetTexture.dimension == TextureDimension.Cube)
{
targetTextureIdentifiers = new RenderTargetIdentifier[6];
for (int i = 0; i < 6; ++i)
{
targetTextureIdentifiers[i] = new RenderTargetIdentifier(targetTexture, 0, (CubemapFace)i, 0);
}
mainTexParam = Shader.PropertyToID("_MainTexCube");
blitInfoData = new Vector4(targetTexture.width, targetTexture.height, 6, 1);
}
else if (targetTexture.dimension == TextureDimension.Tex3D)
{
targetTextureIdentifiers = new RenderTargetIdentifier[targetTexture.volumeDepth];
for (int i = 0; i < targetTexture.volumeDepth; ++i)
{
targetTextureIdentifiers[i] = new RenderTargetIdentifier(targetTexture, 0, CubemapFace.Unknown, i);
}
mainTexParam = Shader.PropertyToID("_MainTex3D");
blitInfoData = new Vector4(targetTexture.width, targetTexture.height, targetTexture.volumeDepth, 2);
}
else
{
targetTextureIdentifiers = new RenderTargetIdentifier[1];
targetTextureIdentifiers[0] = new RenderTargetIdentifier(targetTexture);
mainTexParam = Shader.PropertyToID("_MainTex");
blitInfoData = new Vector4(targetTexture.width, targetTexture.height, 0, 0);
}
#else
targetTextureIdentifier = new RenderTargetIdentifier(targetTexture);
mainTexParam = Shader.PropertyToID("_MainTex");
blitInfoData = new Vector4(targetTexture.width, targetTexture.height, 0, 0);
#endif
}
if (meshRenderer != null)
{
// Automatically handle creating a quad mesh for the renderer that fits the requirements of blitting if the person is using a MeshRenderer with no valid filter set.
if (meshRenderer is MeshRenderer)
{
MeshFilter filter = GetComponent<MeshFilter>();
if (filter == null)
{
filter = gameObject.AddComponent<MeshFilter>();
}
if (filter.sharedMesh == null)
{
filter.sharedMesh = GetBlitMesh();
}
}
#if UNITY_2018_3_OR_NEWER
else if (meshRenderer is SkinnedMeshRenderer)
{
// If forceMatrixRecalculationPerRender is disabled, the data we capture will be 1 frame behind since it's using skinning data from rendering which is not acceptable for many use cases.
(meshRenderer as SkinnedMeshRenderer).forceMatrixRecalculationPerRender = true;
}
#endif
meshRenderer.enabled = false;
}
#else
animatedFloatPropertyIds = BuildPropertyIDList(animatedFloatProperties);
animatedIntPropertyIds = BuildPropertyIDList(animatedIntProperties);
animatedColorPropertyIds = BuildPropertyIDList(animatedColorProperties);
animatedVectorPropertyIds = BuildPropertyIDList(animatedVectorProperties);
#endif
initHasRun = true;
}
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
public void RunBlit(CommandBuffer blitCommandBuffer)
#else
public void RunBlit()
#endif
{
// Allow people to animate the material reference
blitMaterial = meshRenderer.sharedMaterial;
if (
#if !GRAPHICS_BLIT_USE_COMMAND_BUFFER // This check can be skipped in command buffer mode since we depend on the render texture identifiers
targetTexture != null &&
#endif
blitMaterial != null)
{
// Get the current property block settings from the renderer.
// This allows animators to directly affect the blitting materials by copying the property block set by the animator.
meshRenderer.GetPropertyBlock(propertyBlock);
#if GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
if (referenceTransforms.Length > 0)
{
for (int i = 0; i < referenceTransforms.Length; ++i)
{
Transform refTransform = referenceTransforms[i];
if (refTransform != null)
{
frameToWorldTransforms[i] = refTransform.localToWorldMatrix;
frameToLocalTransforms[i] = refTransform.worldToLocalMatrix;
}
}
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
// Setting this property on the property block when we can is less error-prone.
// If you set the property directly on a shared material, it will lock the size of that array to the first array to get set.
// This means that if something goes and sets a 0 size array on a material that's shared with another blit that has a reference transform array, those transforms won't be set properly.
// Using the property block avoids this issue.
propertyBlock.SetMatrixArray(toWorldMatrixArrayParam, frameToWorldTransforms);
propertyBlock.SetMatrixArray(toLocalMatrixArrayParam, frameToLocalTransforms);
#else
blitMaterial.SetMatrixArray(toWorldMatrixArrayParam, frameToWorldTransforms);
blitMaterial.SetMatrixArray(toLocalMatrixArrayParam, frameToLocalTransforms);
#endif
}
#endif
Transform componentTransform = transform;
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
if (sourceTexture != null)
propertyBlock.SetTexture(mainTexParam, sourceTexture);
propertyBlock.SetMatrix(toWorldMatrixParam, componentTransform.localToWorldMatrix);
propertyBlock.SetMatrix(toLocalMatrixParam, componentTransform.worldToLocalMatrix);
propertyBlock.SetVector(worldPositionParam, componentTransform.position);
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
propertyBlock.SetVector(blitInfoParam, blitInfoData);
meshRenderer.SetPropertyBlock(propertyBlock);
for (int i = 0; i < targetTextureIdentifiers.Length; ++i)
{
// Needed for volume textures since the rendering command will only receive the last set property block for that renderer.
blitPassInfoData.x = i;
blitCommandBuffer.SetGlobalVector(blitPassInfoParam, blitPassInfoData);
blitCommandBuffer.SetRenderTarget(targetTextureIdentifiers[i]);
blitCommandBuffer.DrawRenderer(meshRenderer, blitMaterial);
}
#else
propertyBlock.SetVector(blitInfoParam, blitInfoData);
meshRenderer.SetPropertyBlock(propertyBlock);
blitCommandBuffer.SetRenderTarget(targetTextureIdentifier);
blitCommandBuffer.DrawRenderer(meshRenderer, blitMaterial);
#endif
#else
blitMaterial.SetMatrix(toWorldMatrixParam, componentTransform.localToWorldMatrix);
blitMaterial.SetMatrix(toLocalMatrixParam, componentTransform.worldToLocalMatrix);
blitMaterial.SetVector(worldPositionParam, componentTransform.position);
// Copy property block params over to the material for the blit
// Note that using this method in the editor will set the material asset's properties to the most recent animated shader property values, and those changes will persist after exiting play mode.
foreach (int id in animatedFloatPropertyIds) blitMaterial.SetFloat(id, propertyBlock.GetFloat(id));
foreach (int id in animatedIntPropertyIds) blitMaterial.SetInt(id, propertyBlock.GetInt(id));
foreach (int id in animatedColorPropertyIds) blitMaterial.SetColor(id, propertyBlock.GetColor(id));
foreach (int id in animatedVectorPropertyIds) blitMaterial.SetVector(id, propertyBlock.GetVector(id));
// Perform the blit
Graphics.Blit(sourceTexture, targetTexture, blitMaterial);
#endif
}
}
// Used for the controller to check validity
public Renderer GetRenderer()
{
return meshRenderer;
}
#if !GRAPHICS_BLIT_USE_COMMAND_BUFFER
// Builds a list of shader property IDs from the property names so that setting the properties can use the faster int-based getters and setters
private int[] BuildPropertyIDList(string[] paramNames)
{
HashSet<int> animatedPropertyList = new HashSet<int>(); // Use set to remove duplicate property references
foreach (string paramName in paramNames)
{
if (paramName.Length > 0)
animatedPropertyList.Add(Shader.PropertyToID(paramName));
}
int[] propertyIds = new int[animatedPropertyList.Count];
int targetIdx = 0;
foreach (int propertyId in animatedPropertyList)
{
propertyIds[targetIdx++] = propertyId;
}
return propertyIds;
}
#endif
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
// Generates a quad to use for blitting manually
// This quad has the same vertex layout as the one that Unity uses internally for regular Blit() so this is interchangeable with normal blitting from the shader perspective
private Mesh GetBlitMesh()
{
if (blitMesh != null)
return blitMesh;
Mesh newMesh = new Mesh();
newMesh.vertices = new Vector3[] {
new Vector3(-1, -1, 0),
new Vector3(1, 1, 0),
new Vector3(1, -1, 0),
new Vector3(-1, 1, 0)
};
newMesh.uv = new Vector2[] {
new Vector2(0, 0),
new Vector2(1, 1),
new Vector2(1, 0),
new Vector2(0, 1)
};
int[] indices = new int[] { 0, 1, 2, 1, 0, 3 };
newMesh.SetIndices(indices, MeshTopology.Triangles, 0);
newMesh.name = "VRC BlitComponent Blit Mesh";
blitMesh = newMesh;
return blitMesh;
}
#endif
#if GRAPHICS_BLIT_SUPPORT_MIPMAP_OUTPUT
int GetMipmapCount(int width, int height, int depth)
{
// Mip count for a texture is 1 + floor(log2(largest dimension)).
return 1 + Mathf.FloorToInt(Mathf.Log(Mathf.Max(width, Mathf.Max(height, depth)), 2));
}
#endif
// Used in the example performance metrics in BlitController.cs. These are not needed if you don't use those.
public int GetBlitCount()
{
if (targetTexture != null && meshRenderer != null)
{
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
if (targetTexture.dimension == TextureDimension.Cube)
return 6;
else if (targetTexture.dimension == TextureDimension.Tex3D)
return targetTexture.volumeDepth;
else
#endif
return 1;
}
return 0;
}
public int GetSliceCount()
{
int blitSliceCount = 1;
#if GRAPHICS_BLIT_SUPPORT_CUBE_3D
if (targetTexture != null)
{
if (targetTexture.dimension == TextureDimension.Cube)
blitSliceCount = 6;
else if (targetTexture.dimension == TextureDimension.Tex3D)
blitSliceCount = targetTexture.volumeDepth;
}
else
{
blitSliceCount = 0;
}
#endif
return blitSliceCount;
}
}
/**
* MIT License
*
* Copyright (c) 2020 Merlin
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/
#define GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
#define GRAPHICS_BLIT_USE_COMMAND_BUFFER
using UnityEngine;
using UnityEngine.Rendering;
/// <summary>
/// Top level control for batching blit updates. <para/>
/// This component is necessary for enforcing update order on blits since many things that require blits need to run in a specific order to work properly.
/// When using the command buffer mode, this component also handles batching all related blits into 1 command buffer for optimization sake.
///
/// If implemented in game, you may want to implement a manager that batches all BlitController executions into a single command list since command lists have a small overhead to execute.
/// </summary>
// Very high default execution order to hopefully run after IK and basically everything similar to how Rendering runs after everything.
// Change this to be higher if it still runs before the IK update, unless VRC is doing something really weird like running IK in OnWillRenderObject, OnPreCull, or OnPreRender.
// The DefaultExecutionOrder attribute is basically undocumented, but is used in official Unity components so I'd hope it is reliable. Specifically, the NavMeshComponents GitHub repository uses it.
[DefaultExecutionOrder(200)]
public class BlitController : MonoBehaviour
{
public BlitComponent[] blitComponents;
/// <summary>
/// Delay between blit updates measured in seconds. Used if you want something to run at low frequency for optimization.
/// </summary>
public float updateDelay = 0f;
private float lastUpdateTime = 0f;
private bool attachedToCamera = false;
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
private CommandBuffer blitCommandBuffer = null;
#endif
void Start()
{
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
blitCommandBuffer = new CommandBuffer();
blitCommandBuffer.name = "VRC Blit Command";
#endif
int validComponentCount = 0;
// Call start to init in case the components are disabled and count the number of valid components
foreach (BlitComponent blitComponent in blitComponents)
{
if (blitComponent != null)
{
blitComponent.Start();
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
if (blitComponent.targetTexture != null && blitComponent.GetRenderer() != null)
#endif
validComponentCount++;
}
}
// Compress the blit component array if invalid blit components were found
if (validComponentCount != blitComponents.Length)
{
BlitComponent[] newBlitComponentRefs = new BlitComponent[validComponentCount];
int targetIdx = 0;
for (int i = 0; i < blitComponents.Length; ++i)
{
if (blitComponents[i] != null
&& blitComponents[i].GetRenderer() != null
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
&& blitComponents[i].targetTexture != null
#endif
)
newBlitComponentRefs[targetIdx++] = blitComponents[i];
}
blitComponents = newBlitComponentRefs;
}
attachedToCamera = GetComponent<Camera>() != null;
}
// Allow the blit controller to be executed after a camera render pass if there is a camera component on the same object as the blit controller.
// This can be used to capture data from a camera in the world or on the avatar and post process that data or act on it.
void OnPostRender()
{
ExecuteUpdate();
}
// Late update to have this run after IK.
// This should probably be disabled for the mirror and shadow clones; it will make things update faster than they should if it gets executed on the clones.
void LateUpdate ()
{
if (!attachedToCamera)
{
ExecuteUpdate();
}
}
public void ExecuteUpdate()
{
float time = Time.unscaledTime;
if (time - lastUpdateTime > updateDelay)
{
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
blitCommandBuffer.Clear();
blitCommandBuffer.SetViewProjectionMatrices(Matrix4x4.identity, Matrix4x4.identity);
#endif
foreach (BlitComponent blitComponent in blitComponents)
{
// We can skip the null check here since we already removed all null references from the array
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
blitComponent.RunBlit(blitCommandBuffer);
#else
blitComponent.RunBlit();
#endif
}
#if GRAPHICS_BLIT_USE_COMMAND_BUFFER
Graphics.ExecuteCommandBuffer(blitCommandBuffer);
#endif
lastUpdateTime = time;
}
}
// Example metrics that could be used for performance rating
// Nothing past this point is actually needed and is just here as an example.
// ------------------------------------------------------------------------------
#if GRAPHICS_BLIT_PROVIDE_TRANSFORM_ARRAY
// In most cases this is only going to be used for a couple of transforms
// Doing the maximum number of transforms in 1 blit component (1023) is about 0.5ms on my CPU
// So realistically you might want poor performance rating to start somewhere around 50 transforms which would be 0.025ms CPU time
public int GetTransformCount()
{
int transformCount = 0;
foreach (BlitComponent blitComponent in blitComponents)
{
if (blitComponent != null)
{
transformCount += blitComponent.referenceTransforms.Length;
}
}
return transformCount;
}
#else
public int GetTransformCount()
{
return 0;
}
#endif
// Just the number of individual blits executed.
// 1000 blits takes around 4.5ms on my CPU for both the blit and command list approaches, so some example performance numbers could be medium starting at 3, and poor starting at 6 or so.
public int GetBlitCount()
{
int blitCount = 0;
foreach (BlitComponent blitComponent in blitComponents)
{
if (blitComponent != null)
blitCount += blitComponent.GetBlitCount();
}
return blitCount;
}
// Utility function for GetBlitGPUBandwidth() since I can't find an equivalent in Unity.
private static int GetRenderTextureBitDepth(RenderTextureFormat format)
{
switch (format)
{
case RenderTextureFormat.R8:
return 8;
case RenderTextureFormat.RHalf:
case RenderTextureFormat.R16:
case RenderTextureFormat.RG16:
case RenderTextureFormat.ARGB4444:
case RenderTextureFormat.ARGB1555:
case RenderTextureFormat.RGB565:
return 16;
// Weird texture formats that are not always supported, I'm not sure if they actually have these bit depths or if they just throw out bits to fit into a power of 2
case RenderTextureFormat.BGR101010_XR:
return 30;
case RenderTextureFormat.BGRA10101010_XR:
return 40;
case RenderTextureFormat.ARGB32:
case RenderTextureFormat.Depth:
case RenderTextureFormat.Shadowmap:
case RenderTextureFormat.RFloat:
case RenderTextureFormat.RInt:
case RenderTextureFormat.RGHalf:
case RenderTextureFormat.Default:
case RenderTextureFormat.RG32:
case RenderTextureFormat.RGB111110Float:
case RenderTextureFormat.ARGB2101010:
case RenderTextureFormat.BGRA32:
return 32;
case RenderTextureFormat.RGFloat:
case RenderTextureFormat.RGInt:
case RenderTextureFormat.RGBAUShort:
case RenderTextureFormat.ARGB64:
case RenderTextureFormat.ARGBHalf:
case RenderTextureFormat.DefaultHDR:
return 64;
case RenderTextureFormat.ARGBFloat:
case RenderTextureFormat.ARGBInt:
return 128;
default:
return 64; // In case Unity adds some new format in the future, fall back to a middle ground value
}
}
// This is a weird metric, but it shows how much memory in MB the GPU has to write through the ROPs, assuming every blit draws a quad that covers the entire target.
// Not all blits will use full-target quads, but they usually will.
// A value of 2048MB takes around 4.5ms on my GTX 1080 Ti, so you might want very poor to begin somewhere around 40MB, which would be around 0.1ms GPU time.
// Note that this metric does not take into account the cost of the shader, since that's not feasible; it measures an approximate lower bound on the GPU time the blit can take.
// It is also not a linear predictor of performance, as there are other costs associated with executing the shader.
public float GetBlitGPUBandwidth()
{
float totalBandwidth = 0f;
foreach (BlitComponent blitComponent in blitComponents)
{
if (blitComponent != null && blitComponent.targetTexture != null && blitComponent.GetRenderer() != null)
{
float renderTextureBytesPerPixel = GetRenderTextureBitDepth(blitComponent.targetTexture.format) / 8f;
int blitSliceCount = blitComponent.GetSliceCount();
totalBandwidth += renderTextureBytesPerPixel *
blitComponent.targetTexture.width *
blitComponent.targetTexture.height *
blitSliceCount *
blitComponent.targetTexture.antiAliasing;
}
}
return totalBandwidth / (1024f * 1024f);
}
#if !GRAPHICS_BLIT_USE_COMMAND_BUFFER
// This metric only makes sense if you decide against using the command buffer method for whatever reason.
// Doesn't count the reduced property ID lists since we can't know if the component has been initialized yet.
public int GetAnimatedParameterCount()
{
int paramCount = 0;
foreach (BlitComponent blitComponent in blitComponents)
{
if (blitComponent != null)
{
paramCount += blitComponent.animatedFloatProperties.Length;
paramCount += blitComponent.animatedIntProperties.Length;
paramCount += blitComponent.animatedColorProperties.Length;
paramCount += blitComponent.animatedVectorProperties.Length;
}
}
return paramCount;
}
#endif
}