@viridia
Last active September 23, 2023 03:08
Bevy Guise / Design Notes

Bevy / Guise Design Notes

Guiding principles:

(This list is not comprehensive)

  • Artist-friendly workflow - this means that non-technical artists can create UIs without writing code. It also means that UI designs which are highly art- and animation-intensive can be supported.
  • Separation of Concerns - an artist should be able to re-style a widget without needing to modify the source code of the widget. This means that style must be separable from logic.

These principles dictate an asset-centric, rather than a code-centric, approach to UI authoring. The primary creative output of the artist will be UI assets; the primary creative output of the programmer will be modular components ("presenters") that can be referenced from those assets.

  • Allow (but don't require) manual authoring - this means that the asset file format needs to be readable and relatively compact; but it should also allow serialization. However, this goal is secondary to the goal of making the asset format easy to manipulate in an interactive editor, so if there's a conflict between those goals (hopefully there won't be), then editability takes precedence.

Non-requirements

  • No requirement for round-trip serialization - making an asset format that can be both serialized and deserialized losslessly puts a lot of technical constraints on the implementation, and can impact runtime efficiency. Instead, interactive designer apps will use a different, more flexible in-memory representation while editing, one that is optimized for editing rather than runtime performance.
  • Embedding templates in Rust via macro - This idea is very popular because it's how React, Svelte, Solid and other modern frameworks work, as well as Dioxus and some of the Rust UI packages. While this may be doable in principle, it runs against the principle of "artist-friendly". I'm not opposed to the idea, but it's a hard problem and I don't really want to spend much time thinking about it.
    • Templates encoded as assets have a much different set of design challenges from templates embedded in code. Examples are the specification of formal template parameters, the importing of Rust symbols, and other things you get "for free" in Rust code.

High-Level Architecture

Guise UI components implement the Model-View-Presenter design pattern. This pattern consists of three elements:

  • A Model, which is responsible for managing the state of the widget. Models can be "local" (that is, private to the component) or "shared" (meaning that the data is shared, and accessed by multiple widgets). Models may or may not support some form of data binding (TBD).
  • A View, which represents the visible elements. These are Bevy Entities + Components which are arranged in a Parent/Child hierarchy, and are often generated by a View Template.
  • A Presenter (also known as a "Supervising Controller"), which is a Rust object that handles input events to the component and also updates the view in response to changes in state. Presenters are defined by implementing the Presenter trait; they can be instantiated via reflection, and can be referenced from within a view template.

Note that in Bevy, the model, the view, and the presenter may all be separate components attached to the same Entity.

View Templates are assets which define a hierarchy of view components. View templates can contain:

  • specifications for 2D or 3D elements.
  • references to Style assets, as well as inline styles.
  • references to Presenters.
  • references to other templates ("call") with parameters.
  • references to other assets, such as images, sounds, or localized text strings.
  • conditional logic (if/then, for/each).

It's important to note that while view templates can contain a limited degree of logic (interpolation expressions and such), they cannot contain arbitrary Rust code - this is not JSX. Instead, complex interaction and rendering logic lives in the presenter, which is outside of the template. The reason for this restriction is to make it possible to write artist-friendly interactive UI editors, which are generally not able to deal with arbitrary code blocks.

Note on Artist Workflow - A limitation is that while artists can create new UI appearances by authoring templates, any significant new behavior can only be created by writing Rust code - but this would be true for virtually any imaginable templating system that does not involve creating an embedded scripting language.

If that's the case, then how can an artist work? Assume the extreme case, which is an artist with no coding skills collaborating with a programmer with no art or design skills. In this case, there are two possible workflows, which are "code first" and "art first":

  • In the "code first" workflow, the programmer first creates a set of modular presenters for common widgets such as buttons, dialogs and so on, and checks these components in to source control. In order to test these presenters, the programmer may create temporary "programmer art" which is intended to be replaced by the artist. The artist can then author templates that are assemblages of these components, along with the styles and visual layouts.
  • In the "art first" workflow, the artist creates the layouts and styles, but uses temporary placeholder widgets that have no interactive functionality; these are then replaced by the programmer to provide actual functionality.

Actual practice will more likely be a combination of these two, with the most common, reusable widgets being created first, and the more specialized bespoke widgets being created last. Even in the case of a single creator with both art and programming skills, it's likely that they will often be working in one role or the other (authoring layouts or writing code), and there will likely be a switching cost going between modes (such as having to restart the layout editor to gain access to the new components).

Relationship between Templates, Styles and ECS entities

In Bevy, the visual appearance of a UI node (or any entity for that matter) is a consequence of which components have been inserted into the entity. For example, displaying a background color on a 2D element requires adding a BackgroundColor component.

Most of the time, creators who are working on UI layouts don't want to bother with this level of detail - they generally want to create "widgets" which are pre-packaged bundles of multiple ECS components. An artist typically will want to specify "background_color: #ff0" without caring how background color is implemented internally.

This presents something of a difficulty for implementing style assets, because the implementation of a style property such as "background_color" is very different from the implementation of "margin_left". In addition, a 2D entity which displays a background image must have a background color as well, otherwise the image appears completely black. So there's a coupling between style properties that the styling system needs to know about.

Suppose we have a widget that has a border color sometimes, but not other times, depending on dynamic states. This means that the code will need to add and remove the BorderColor component based on the current style. This logic is specific to 2D elements, and does not necessarily apply to other kinds of elements.

Thus, somewhere in the system there has to be code that understands how to map artist-friendly style properties into ECS structures. The question is, where should this logic live? It can't live in the Style objects, because any given Style is only a part of the whole - multiple Styles are composed to define a widget's appearance. It can't be built into the templating language either, because this would require the template language to have special knowledge of 2D scene graphs, making it difficult to write templates for other kinds of scene graphs.

One obvious choice is the presenter component, which is written in Rust code by the game developer. To make life easier for the implementer of the presenter, we could supply a utility library for styling 2D entities, so for the vast majority of 2D widgets this would be a single function call.
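As a rough illustration of what such a utility function might look like, here is a minimal sketch that uses plain structs in place of real Bevy components so that it compiles on its own. The names ComputedStyle, UiNode and apply_style_2d are hypothetical, and the choice of white as the fallback background color for images is an assumption made for this example.

```rust
/// Stand-in for a Bevy color; NOT the real `bevy::render::color::Color`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Color(pub f32, pub f32, pub f32, pub f32);

/// Hypothetical result of composing every style attached to an element.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ComputedStyle {
    pub background_color: Option<Color>,
    pub background_image: Option<String>,
    pub border_color: Option<Color>,
}

/// Hypothetical stand-in for the components present on one UI entity.
/// `None` models "component not inserted".
#[derive(Debug, Default, PartialEq)]
pub struct UiNode {
    pub background_color: Option<Color>,
    pub image: Option<String>,
    pub border_color: Option<Color>,
}

/// Maps artist-facing style properties onto the entity's "components".
pub fn apply_style_2d(style: &ComputedStyle, node: &mut UiNode) {
    // Coupling rule from the text: an image without a background color
    // renders black, so fall back to white (an assumption of this sketch).
    node.background_color = match (&style.background_image, style.background_color) {
        (_, Some(color)) => Some(color),
        (Some(_), None) => Some(Color(1.0, 1.0, 1.0, 1.0)),
        (None, None) => None,
    };
    node.image = style.background_image.clone();
    // Assigning `None` models removing the BorderColor component.
    node.border_color = style.border_color;
}
```

A presenter for an ordinary 2D widget would then call apply_style_2d once whenever its composed style changes, and never touch individual components directly.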

However, because the presenter is a Component, we need to create the widget Entity first (otherwise we'd be calling methods on a Component that was detached from the entity). At the same time, we would prefer not to introduce a 1-frame delay between the time that the entity was created and the time that its other components (background color and so on) were added. This suggests that component creation may have to happen in an exclusive, synchronous system.

Obviously, this is an area that still requires a lot of thought.

In any case, if it is true that the presenter is responsible for populating the UI widget entity with components, then templates need not ever directly contain references to Rust types other than presenters, and only presenters need to be registered with the reflection system. This significantly simplifies the design of the templating language.

File Format

Note: Sections marked with the word Bikeshed are invitations to suggest alternate solutions and syntax choices.

Overall file structure

A UI asset consists of a set of top-level items, which can have arbitrary numbers of children. Four types of top-level items are to be supported:

  • groups
  • font families
  • styles
  • templates

Groups

"Groups" are simply nested collections of named items, and are used to organize items hierarchically. Groups can contain groups, as well as templates, styles and font families.

The purpose of groups is to allow "bundles" of related styles or templates to be referenced with a single name. For example, a "slider" widget may have several styles: one each for the thumb component, the track component, and the slider as a whole. These can be organized into a group:

slider: {
  base: Style { ... }
  thumb: Style { ... }
  track: Style { ... }
}

The syntax of a group is simply an object literal, that is, a pair of braces {} containing named items.

Each child of a group creates a separate labeled asset, where the label is the hierarchical path to that child. In the above case, three labeled assets would be created:

  • slider/base
  • slider/thumb
  • slider/track

Labeled assets can be referenced from other assets using both file-relative and label-relative paths, using the same syntax as the JSON Pointer standard. So for example, to create an asset reference from the base style to the track style, you can use an asset path of #./track. This ability to reference styles via relative paths is important for theming, since a "theme" is simply a collection of styles that has been structured in a particular way.
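The labeling rule can be sketched as a small recursive walk. This is only an illustration: the Item enum and the collect_labels function are hypothetical names, not part of any existing loader.

```rust
/// Hypothetical in-memory form of an item in a UI asset file.
#[derive(Debug)]
pub enum Item {
    Group(Vec<(String, Item)>),
    Style,
    Template,
    FontFamily,
}

/// Collects the label of every non-group item reachable from `item`,
/// joining the names of enclosing groups with '/'.
pub fn collect_labels(name: &str, item: &Item, out: &mut Vec<String>) {
    match item {
        Item::Group(children) => {
            for (child_name, child) in children {
                let path = if name.is_empty() {
                    child_name.clone()
                } else {
                    format!("{}/{}", name, child_name)
                };
                collect_labels(&path, child, out);
            }
        }
        // Styles, templates and font families each become one labeled asset.
        _ => out.push(name.to_string()),
    }
}
```

Calling collect_labels("slider", ...) on the group above yields the three labels listed earlier, in declaration order.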

Font families

A font-family resource represents a collection of font file assets, along with a set of mappings which determine which text styles map to which font files. It is similar in structure to the CSS @font-face rule. See Issue #9725 for more detail.

Styles

Style objects are sparse maps of style attributes. They are strongly typed, but can also have dynamic elements. Internally, style objects are designed to be composable, meaning that multiple styles can be merged efficiently.

Style objects can be separate assets, or they can be inline, meaning that they are defined as part of a view template. Note that unlike CSS, there's no performance penalty for inline styles; they are just styles that happen to be anonymous.

Also unlike CSS, there's no cascade order or prioritization scheme; an array of styles attached to an element are simply applied in the order that they occur in the array.
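To make the "sparse map applied in array order" idea concrete, here is a minimal sketch. It assumes a tiny Style struct with only three properties, each stored as an Option; the real set of properties and their types would be much larger.

```rust
/// Hypothetical sparse style: `None` means "this style does not set it".
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Style {
    pub width: Option<f32>,
    pub background_color: Option<String>,
    pub flex_direction: Option<String>,
}

impl Style {
    /// Overlays `other` on `self`; any property set in `other` wins.
    pub fn merge(&mut self, other: &Style) {
        if other.width.is_some() {
            self.width = other.width;
        }
        if other.background_color.is_some() {
            self.background_color = other.background_color.clone();
        }
        if other.flex_direction.is_some() {
            self.flex_direction = other.flex_direction.clone();
        }
    }
}

/// Composes styles strictly in array order: no cascade, no priorities.
pub fn compose(styles: &[Style]) -> Style {
    let mut result = Style::default();
    for style in styles {
        result.merge(style);
    }
    result
}
```

Because later entries simply overwrite earlier ones, an inline style placed last in the array overrides a referenced style asset without any specificity rules.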

Styles support a limited form of CSS-style rule matching. Whereas CSS selector syntax supports many different types of matching patterns (attributes, pseudo-classes, sibling selectors, etc.), Style Asset selectors support only three types:

  • "classname" selectors
  • "parent" selectors
  • "either" (comma operator) selectors.

The syntax for these will be described subsequently. The restriction on selector syntax is intended to keep the implementation simple, and to avoid the complex maintenance issues around CSS selector matching.

This also means that selectors can only affect the element that the style is attached to. A parent element cannot have a style which changes the appearance of its children.

Styles also allow for variables ($background_color) and functions (lighten($color, 0.3)). Styles can set variables which can then be referenced by that style or by other styles on the same element.

Style Syntax

A Style object uses the same meta-syntax as other objects: a class name, Style, followed by a list of properties:

obs_splitter: Style {
    display: "flex",
    width: 7,
    align_items: "center",
    justify_items: "center",
    justify_content: "center",
    background_color: "#202020"
}

Property names must all be valid Rust identifiers.

String and number values are automatically coerced to the most appropriate type for each property. In the above example, the "flex" value is automatically converted to bevy::ui::Display::Flex; the number 7 is converted to bevy::ui::Val::Px(7.0). The "background_color" property is converted to a Color object.
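The coercion step might look roughly like the following sketch. The Raw, Display and Val types here are local stand-ins (not the real Bevy types), the function names are hypothetical, and the color parser deliberately handles only six-digit hex strings, with colors represented as plain RGB bytes.

```rust
/// A raw value as it might appear in the asset file (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub enum Raw {
    Str(String),
    Num(f32),
}

/// Stand-in for `bevy::ui::Display`; not the real type.
#[derive(Debug, PartialEq)]
pub enum Display {
    Flex,
    Grid,
}

/// Stand-in for `bevy::ui::Val`; only the pixel variant is modeled.
#[derive(Debug, PartialEq)]
pub enum Val {
    Px(f32),
}

pub fn coerce_display(raw: &Raw) -> Result<Display, String> {
    match raw {
        Raw::Str(s) => match s.as_str() {
            "flex" => Ok(Display::Flex),
            "grid" => Ok(Display::Grid),
            other => Err(format!("unknown display value `{}`", other)),
        },
        Raw::Num(n) => Err(format!("display expects a string, got {}", n)),
    }
}

/// Bare numbers are interpreted as pixels.
pub fn coerce_length(raw: &Raw) -> Result<Val, String> {
    match raw {
        Raw::Num(n) => Ok(Val::Px(*n)),
        Raw::Str(s) => Err(format!("length expects a number, got `{}`", s)),
    }
}

/// Parses "#rrggbb" into RGB bytes (three-digit forms are not handled here).
pub fn coerce_color(raw: &Raw) -> Result<[u8; 3], String> {
    let text = match raw {
        Raw::Str(s) => s.as_str(),
        Raw::Num(n) => return Err(format!("color expects a string, got {}", n)),
    };
    let hex = text
        .strip_prefix('#')
        .ok_or_else(|| format!("color `{}` must start with '#'", text))?;
    if hex.len() != 6 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("color `{}` must have six hex digits", text));
    }
    // Safe: the check above guarantees six ASCII hex digits.
    let channel = |i: usize| u8::from_str_radix(&hex[i..i + 2], 16).unwrap();
    Ok([channel(0), channel(2), channel(4)])
}
```

Keeping one coercion function per target type means each property only needs to declare which coercion it uses, rather than parsing values itself.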

Bikeshed: Some file formats use a leading '.' to indicate that we're setting an object property. Also, it may be that we can avoid quoting a lot of these properties.

obs_splitter: Style {
    .display: flex,
    .width: 7,
    .align_items: center,
    .justify_items: center,
    .justify_content: center,
    .background_color: #202020
}

Asset References - The syntax $(...path...) is used to designate a reference to another asset. Note the use of parens; this is distinct from ${...expression...} which is used to reference the contents of a variable or parameter. (The fact that these two expressions look similar is intentional, because conceptually they are doing almost the same things - one is dereferencing the name of an asset, the other is dereferencing the name of a variable or parameter.)

obs_splitter: Style {
    background_image: $("../images/accept.png")
}

The path will usually be a quoted string. If it is a relative path, it is resolved relative to the current labeled asset. The $() syntax can contain multiple arguments, which are considered to be a chain of relative path elements. These are resolved using an algorithm similar to the node.js path.resolve() function:

obs_splitter: Style {
    background_image: $(${theme_base}, "../splitter_thumb.png")
}
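A simplified version of that resolution algorithm is sketched below. It assumes that paths use '/' as the separator, that the base is the directory containing the current asset, and that the result is returned without a leading slash; the function name resolve_path is hypothetical, and the sketch does not attempt to handle '#' labels.

```rust
/// Resolves a chain of path elements against `base`, left to right.
/// An element starting with '/' discards everything accumulated so far.
pub fn resolve_path(base: &str, parts: &[&str]) -> String {
    let mut all = vec![base];
    all.extend_from_slice(parts);

    let mut segments: Vec<&str> = Vec::new();
    for part in all {
        if part.starts_with('/') {
            segments.clear();
        }
        for segment in part.split('/') {
            match segment {
                // Empty segments and "." leave the current position unchanged.
                "" | "." => {}
                // ".." steps up one level (and is a no-op at the root).
                ".." => {
                    segments.pop();
                }
                other => segments.push(other),
            }
        }
    }
    segments.join("/")
}
```

For example, with a base of "ui/widgets" and a theme_base variable holding "themes/dark", the chain ("themes/dark", "../splitter_thumb.png") resolves to "ui/widgets/themes/splitter_thumb.png".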

Selectors - A style can contain a number of selector expressions. Selectors are specified in the "selectors" section, and are encoded as map keys (an idea taken from vanilla-extract):

obs_splitter: Style {
    display: "flex",
    width: 7,
    
    selectors: {
        "&.hover", {
          background_color: "#999",
        },
        "&.pressed", {
          background_color: "#aaa",
        },
        "&.disabled, &.focused", {
          background_color: "#777",
        }
    }
}

As mentioned, selectors only support classname, parent (direct parent, not ancestor) and "either" matching.

Bikeshed: It would be possible to add pseudo-classes like ":hover" and ":focus-within" if Bevy widgets gain such an ability. For now, however, the presenter will need to manually add a "hover" class to the element.

Bikeshed: It's currently envisioned that class names are strings, but they could be interned symbols as well to speed up matching.

Note on motivations:

The reason for having selectors (rather than having the presenter directly manipulate the style properties) is to maintain the separation between artist and coder roles: an artist should be able to use a common "Button" presenter but change how the various button states appear. Note that in some cases, such as for moving a slider thumb, direct manipulation of style attributes is the correct approach.

The reason for supporting parent selectors is because often a widget is composed of multiple elements, and it's convenient for the presenter to be able to add dynamic classes to the widget's root element without having to style every piece individually.

Writing an evaluator for this simplified matching structure is very easy and efficient as long as widgets have access to their parents.
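A sketch of such an evaluator follows. It works on selectors that have already been parsed into an enum, so it makes no claims about the textual selector syntax; the Selector and Element types and the selector_matches function are hypothetical names for this illustration.

```rust
/// The three supported selector kinds, in already-parsed form.
#[derive(Debug, Clone, PartialEq)]
pub enum Selector {
    /// The element itself has this class.
    Class(String),
    /// The element's direct parent (not any ancestor) has this class.
    ParentClass(String),
    /// Comma operator: matches if any alternative matches.
    Either(Vec<Selector>),
}

/// Minimal stand-in for a UI element that can see its parent.
#[derive(Debug)]
pub struct Element<'a> {
    pub classes: Vec<String>,
    pub parent: Option<&'a Element<'a>>,
}

/// Returns true if `selector` matches `element`.
pub fn selector_matches(selector: &Selector, element: &Element<'_>) -> bool {
    match selector {
        Selector::Class(name) => element.classes.iter().any(|c| c == name),
        Selector::ParentClass(name) => match element.parent {
            Some(parent) => parent.classes.iter().any(|c| c == name),
            None => false,
        },
        Selector::Either(alternatives) => alternatives
            .iter()
            .any(|alt| selector_matches(alt, element)),
    }
}
```

Each check touches at most the element and its direct parent, so the cost of matching is proportional to the number of selectors and class names, not to the depth of the tree.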

It's also possible to pre-optimize styles by determining, for each style property, which attributes never change and which are dynamic. The algorithm for doing this is relatively straightforward.

Variables - Styles can define variables, and can reference them within attributes. Variables go in their own section:

obs_splitter: Style {
    background_image: ${bg_image},
    display: "flex",
    width: 7,
    
    vars: {
        bg_image: $("../images/accept.png")
    }
}

Variables and selectors are often used together:

obs_splitter: Style {
    background_image: ${bg_image},
    display: "flex",
    width: 7,
    
    vars: {
        bg_image: $("../images/accept.png")
    },
    
    selectors: {
        "&.hover", {
            vars: {
                bg_image: $("../images/accept_hover.png")
            }
        },
        "&.pressed", {
            vars: {
                bg_image: $("../images/accept_pressed.png")
            }
        },
        "&.disabled, &.focused", {
            vars: {
                bg_image: $("../images/accept_disabled.png")
            }
        }
    }
}

This example shows a fairly common pattern in Game UIs, where the different button states are rendered with different artwork.

Note that variables are dynamically typed, that is, the type of a variable isn't checked until you actually try to use it. The reason for this is to avoid the nuisance of having to put type annotations on all the variable definitions.
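The following sketch illustrates both points: variable scopes from matching selectors overriding the base scope, and the type check being deferred until the variable is used. The Value enum, the Vars alias and both functions are hypothetical, and the rule that later scopes take precedence is an assumption consistent with styles being applied in order.

```rust
use std::collections::HashMap;

/// Hypothetical dynamically typed variable value.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Number(f32),
    Text(String),
    Asset(String),
}

/// One `vars` section.
pub type Vars = HashMap<String, Value>;

/// Looks up `name`, searching later scopes first so that the `vars` of a
/// matching selector override those of the base style.
pub fn lookup<'a>(scopes: &[&'a Vars], name: &str) -> Option<&'a Value> {
    scopes.iter().rev().copied().find_map(|scope| scope.get(name))
}

/// The type check happens here, at the point of use, rather than where
/// the variable was defined.
pub fn expect_asset(scopes: &[&Vars], name: &str) -> Result<String, String> {
    match lookup(scopes, name) {
        Some(Value::Asset(path)) => Ok(path.clone()),
        Some(other) => Err(format!(
            "variable `{}` is not an asset reference: {:?}",
            name, other
        )),
        None => Err(format!("undefined variable `{}`", name)),
    }
}
```

A property such as background_image would call expect_asset, while width would call an analogous helper that expects a number, so a mistyped variable only surfaces as an error on the property that consumes it.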

Prototype Implementation

View Templates

A view template is a tree of nodes or elements, of which there are two basic types:

  • "built-in" nodes:
    • Element - creates an Entity, possibly with Components and parent/child relationships.
    • If (or perhaps Cond) - conditionally renders its children.
    • Each (or For) - renders an array of nodes given an iterable data source.
    • Fragment - contains multiple nodes which are flattened
  • Calls to other templates (Button, Slider, Panel, MyGameInventoryPanel, etc.)

Open Issue: We want the names of nodes to be relatively compact, however template calls are asset paths which can be arbitrarily long. One solution is some sort of "Use" declaration that can reference asset paths but give it a short name. (See this example from Makepad.)

Open Issue: In JSX, there's a syntactic distinction between "components", which begin with an upper case letter, and "native elements" which start with lower case. Do we want to do something similar here?

The syntax for a view template is slightly more elaborate than for a Style, because it also includes the specification for the template's formal parameters:

VSplitter: Template(
  id: string?,
) Element {
    id: ${id},
    style: [
      $("#obs_splitter/base"),
      Style {
        flex_direction: "column",
      }
    ],
    presenter: "guise.ui.VerticalSplitDragger",
    children: [
        Element {
            style: $("#obs_splitter/thumb")
        }
    ]
}

The ? indicates an optional parameter. You can also use = to indicate a parameter with a default value:

Button: Template(
  primary = false,
) Element {
    classes: [primary ? "primary" : None],
    style: [$("#obs_splitter/base")],
    children: [
        Element {
            style: $("#obs_splitter/thumb")
        }
    ]
}

Note that even though the parameter list looks like a function, parameters are not ordered.
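One way to picture unordered, named parameter binding is the sketch below. The Arg and Param types and the bind_params function are hypothetical, as is the "label" parameter used in the usage note; the sketch assumes that an unknown argument name is an error.

```rust
use std::collections::HashMap;

/// Hypothetical argument value.
#[derive(Debug, Clone, PartialEq)]
pub enum Arg {
    Bool(bool),
    Text(String),
}

/// Hypothetical description of one formal template parameter.
pub struct Param {
    pub name: &'static str,
    /// Declared with `?`.
    pub optional: bool,
    /// Declared with `= value`.
    pub default: Option<Arg>,
}

/// Binds named arguments to formal parameters. Order never matters
/// because arguments are matched purely by name.
pub fn bind_params(
    formals: &[Param],
    actuals: &HashMap<&str, Arg>,
) -> Result<HashMap<String, Option<Arg>>, String> {
    // Reject arguments that do not name any formal parameter.
    for key in actuals.keys() {
        if !formals.iter().any(|p| p.name == *key) {
            return Err(format!("unknown parameter `{}`", key));
        }
    }
    let mut bound = HashMap::new();
    for param in formals {
        let value = match (actuals.get(param.name), &param.default) {
            (Some(arg), _) => Some(arg.clone()),
            (None, Some(default)) => Some(default.clone()),
            (None, None) if param.optional => None,
            (None, None) => {
                return Err(format!("missing required parameter `{}`", param.name))
            }
        };
        bound.insert(param.name.to_string(), value);
    }
    Ok(bound)
}
```

With formals id?, primary = false and a required label, passing only label binds id to nothing, primary to false, and label to the supplied value.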

Immediately following the Template parameter list is a single child element. If there is a need for a template to return multiple top-level elements, then Fragment can be used, which is an element that automatically inlines its children.

Elements, like styles, are defined using a syntax similar to Rust object literals. The Element type represents a raw Bevy UI node with no built-in behavior. It has a number of standard properties:

  • classes - a list of class names, which can contain conditional expressions. Class names are used as inputs to style asset selectors.
  • style - a list of style objects which can either be asset references or inline styles.
  • children - a list of child elements.
  • presenter - the reflection name of a Presenter to attach to this element. If no presenter is specified, then the default presenter is used, which gives you a passive layout element kind of like a "div".

The children of an element can be any of the following:

  • An Element (which is kind of like a NodeBundle)
  • Another type of Bevy Bundle such as a Gizmo
  • Another Template
  • A text string (quoted - we're not doing weird whitespace stripping like XML).
  • An interpolation variable ${varname}.
  • A conditional primitive using <If> or <Each>.

A more complex example:

ListBox: Template(
  rows: Element[],
) Element {
    style: [$("../obsidian#listbox")],
    children: [
        Each {
            iter: ${rows},
            var: "i",
            body: Element {
               style: [$("../obsidian#listrow")],
               presenter: "guise.ui.ListRow",
               children:
                   ${i}
            }
        }
    ]
}

Open Issue: One thing that's missing here is how the presenter communicates its state to the template. Rust code that invokes a template can specify template parameters, and these parameters can be reactive so that the template re-renders when the value of these parameters changes. However, in the proposed architecture, templates call presenters and not the other way around, so the presenter cannot alter the template's input props. Instead, we'll need a different mechanism or syntax to allow the template access to state variables exported by the presenter, perhaps using a different prefix such as % or @.

Bikeshed: It's possible we could add some sugar to make for/each loops more compact; however for now this somewhat clunky syntax has the advantage that it can be serialized without any special exceptions to the grammar.

Execution Model for View Templates

When a view template is instantiated, it creates a tree of Bevy entities whose structure mirrors that of the template. Some template elements, such as Fragment, If or Each can generate zero, one or many children. These entities contain Bevy ECS components that allow them to be rendered via the Bevy graphics layer. They also contain ViewElement components intended to track their position and status in the template hierarchy.

A ViewElement contains an Arc which points to the template node that created it; it also contains shallow copies of the styles and presenters associated with that template node. This allows the ViewElement to detect whether or not it needs to regenerate its children. This can occur either because of a hot reload of the asset, or because of a change to application state. The model is kind of a hybrid between React's VDOM and Solid's method of updating the DOM.

A special marker component ViewUpdate is added to any view element that needs its visual components to be recomputed. This marker is added by various event handlers, and then removed when the computation is complete. An ECS query can be used to find all view elements that need to be updated. (Note: there might be more than one type of marker, for example updating a style is a much lighter weight operation than re-building a list.)

When building the UI node tree, view template nodes always return exactly one TemplateOutput object, which can either be empty, contain a single UI node entity id, or contain a list of TemplateOutputs. This list is flattened before attaching the node entities to their parents, but the original non-flat list is cached in the ViewElement. So for example if an "Each" node produces a variable number of children each render, the overall "shape" of the template output remains the same as the template input. Because output array indices are stable, this can be used to 'patch' the node tree instead of regenerating it every time.

#[derive(Debug, PartialEq, Clone)]
pub(crate) enum TemplateOutput {
    // Means that nothing was rendered. This can represent either an initial state
    // before the first render, or a conditional render operation.
    Empty,

    // Template rendered a single node
    Node(Entity),

    // Template rendered a fragment or a list of nodes.
    Fragment(Box<[TemplateOutput]>),
}

(See complete code at https://github.com/viridia/panoply/blob/feature/json-style-assets-v2/src/guise/view.rs#L21)
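The flattening step described above can be sketched as a depth-first walk. To keep the example compilable without Bevy, Entity is replaced by a plain integer alias here; the flatten method is an illustration and not necessarily what the linked code does.

```rust
/// Stand-in for Bevy's `Entity`, so the sketch compiles without Bevy.
pub type Entity = u32;

#[derive(Debug, PartialEq, Clone)]
pub enum TemplateOutput {
    Empty,
    Node(Entity),
    Fragment(Box<[TemplateOutput]>),
}

impl TemplateOutput {
    /// Appends every entity in this output to `out` in depth-first order.
    /// The nested structure itself is left untouched, so it can still be
    /// cached and compared against the next render.
    pub fn flatten(&self, out: &mut Vec<Entity>) {
        match self {
            TemplateOutput::Empty => {}
            TemplateOutput::Node(entity) => out.push(*entity),
            TemplateOutput::Fragment(items) => {
                for item in items.iter() {
                    item.flatten(out);
                }
            }
        }
    }
}
```

Note that an Empty slot contributes no entity but still occupies an index in its parent Fragment, which is what keeps the indices of the cached output stable between renders.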

The goal is to preserve the state of the view hierarchy as much as possible when the state of the application changes, but also to allow the view hierarchy to radically change shape (such as going to a different major mode or opening a modal dialog) when called for.

Patching and Diffing

One approach to re-rendering is to use a VDOM, similar to React: create a complete copy of the UI graph, and then compare the current graph with the previous one. However, this approach has a high amount of overhead and caused a certain amount of dissatisfaction in the web developer world for that reason.

Instead, we can predict whether or not a UI node has changed by looking at its inputs. For many kinds of changes, it is fairly easy to simply overwrite the properties in question - things like background color, margin width and so on are polled by the ECS rendering system every frame, so there's no need to generate an update event. Adding or removing a class from a node requires re-composing the styles, but it does not require regenerating the entity graph.

There are a few operations which have the potential to dramatically change the shape of the graph:

  • conditional logic ("if").
  • changing the length or content of an array used in a for/each node.
  • changing the presenter id (this is only likely to happen in a hot reload scenario).
  • changing the parameters of a template.
  • changing the visible state of a presenter.

For these kinds of changes, we can predict fairly well whether a node can be patched in place, or whether it needs to be replaced by a different entity. If an entity is replaced, then all of that entity's children should be disposed, and a brand new child hierarchy created. Also, the entity's parent will need to update its list of child entity ids by calling replace_children().

Doing this logic is relatively straightforward if we visit the entire tree every time an update happens, since a lot of the data involved is inherited (e.g. template parameters) from upper levels of the hierarchy. However, for performance reasons we may not want to do this - we'd rather be able to visit just the portion of the tree that is impacted by a state change. To do this, we need several things:

  • We need individual view element nodes to cache their inputs, that is, maintain a shallow copy of the data inherited from their parents, so that we can regenerate those nodes without having to visit the parents.
  • We need some way to mark nodes that need to be visited, possibly using a marker component (as discussed previously).
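These two requirements can be illustrated with a small, Bevy-free sketch. Here a flat list of nodes stands in for view element entities, a list of indices stands in for a query over a marker component, and a string stands in for the cached inputs; ViewTree, ViewNode and their methods are all hypothetical.

```rust
/// Hypothetical view element with a cached copy of its inputs.
#[derive(Debug)]
pub struct ViewNode {
    cached_input: String,
    rendered: String,
}

#[derive(Debug, Default)]
pub struct ViewTree {
    nodes: Vec<ViewNode>,
    /// Stand-in for an ECS query over a marker component.
    dirty: Vec<usize>,
}

impl ViewTree {
    /// Adds a node and marks it for its first render.
    pub fn add(&mut self, input: &str) -> usize {
        self.nodes.push(ViewNode {
            cached_input: input.to_string(),
            rendered: String::new(),
        });
        let id = self.nodes.len() - 1;
        self.dirty.push(id);
        id
    }

    /// Updates the cached input; marks the node only on a real change.
    pub fn set_input(&mut self, id: usize, input: &str) {
        let node = &mut self.nodes[id];
        if node.cached_input != input {
            node.cached_input = input.to_string();
            if !self.dirty.contains(&id) {
                self.dirty.push(id);
            }
        }
    }

    /// Re-renders only the marked nodes, using nothing but their cached
    /// inputs, and returns the ids that were visited.
    pub fn update(&mut self) -> Vec<usize> {
        let dirty = std::mem::take(&mut self.dirty);
        for &id in &dirty {
            let node = &mut self.nodes[id];
            node.rendered = format!("<{}>", node.cached_input);
        }
        dirty
    }

    pub fn rendered(&self, id: usize) -> &str {
        &self.nodes[id].rendered
    }
}
```

Because each node carries its own copy of the inherited data, update never needs to walk up to a parent, and unchanged nodes are never visited at all.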

Notes on Reactivity

Reactivity is out of scope for this design proposal; it's really a matter of how the presenters interact with the rest of the game world state. Many different reactive schemes could be made to work with the template assets proposed here. However, a few notes are in order:

Any reactive system is going to produce some kind of event or signal that says when derived values need to be recomputed. The components of the UI graph are one kind of derived value. How granular these updates are is a design choice; finer-grained systems are more performant but are more challenging to implement.

For an ECS-based, retained mode UI, these derivation events will likely be processed in a deferred fashion, that is, changing a reactive variable isn't going to trigger an immediate re-render of the UI, but rather schedule an update to be processed as part of the normal ECS schedule.
