The Problem Of UI (User Interfaces)

Links

Note: a lot of programmers talk about UI without mentioning the user even once, as if it were entirely a programming problem. I wonder what we’re leaving off the table when we do that.

Evaluation Axes

  • input handling, and latency,
  • composition,
  • dataflow,
  • layout,
  • painting,
  • styling,
  • extension,
  • resource demands: power draw, CPU cycles

Caveats/blind spots of homegrown solutions

Accessibility is a big issue. How do you make your UI accessible? Usually platform vendors provide APIs to enumerate/navigate/query UI elements, extracting some metadata for screen readers and the like.

Multi-viewport UIs: support for multiple windows, multiple displays/monitors with mixed DPI.

Power-draw.

Collaboration: support for more than one editing client for multi-user collaboration on the same data.

Responsive layouts and scalable/zoomable UIs help users adapt a UI to viewing distances or screen sizes.

Internationalized input methods (“IME” on Windows).

Adaptability to large teams (adding new controls, new assets, new kinds of elements) without contention.

Multi-touch and touch interfaces raise many questions: they allow multiple items to be interacted with at a time (contrary to the mouse+keyboard pair); should this be constrained somehow (to only similar items? to only similar items supporting similar interactions? what if more than one person is touching the screen?); they also expect either bigger items or more lenient hotzones.

Hi-DPI requires a great many more pixels to fill. Also, what should the internal units be?
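
One common answer to the units question is to lay out in device-independent units and convert to device pixels at paint time. A minimal sketch, assuming a per-display scale factor (all names are illustrative); the rounding policy matters, since two adjacent edges rounded separately can land one pixel apart:

```python
def to_device_pixels(logical, scale):
    """Convert a length in device-independent units to whole device
    pixels. Rounding must be applied consistently, or adjacent edges
    computed separately can produce visible one-pixel seams."""
    return round(logical * scale)

def hairline_px(scale):
    """A 1-logical-unit hairline must stay visible even when the scale
    factor would round it down to zero pixels."""
    return max(1, to_device_pixels(1.0, scale))
```
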

Plugin support: this exposes lots of problematic scenarios when plugins are allowed to hook into the underlying event loop and desktop APIs. For instance, imagine dealing with non-DPI-aware plugins within a DPI-aware process on Windows.

GPU-based rendering:

  • how does it affect latency?
  • does it fit the type of graphics shown in UI?

Multiple interactions in one: very often UIs hit a difficulty when we want to select multiple elements and change them all in parallel, i.e. increment/change more than one element at a time.

Jump-to-ui-element

Ideas

Collision

Consider each quad representing the bounding box of a UI control. The Minkowski sum of that quad and a round region of radius ‘tr’, the tolerance radius, depends on the input device (mouse, touch, etc.) and represents the collision region between that input device’s center point and the control. When n > 1 controls match, the ambiguity must be resolved. @idea{separating axes test}
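
For an axis-aligned box, the Minkowski sum with a disc of radius tr is a rounded rectangle, so the hit test reduces to a distance-to-box check. A sketch (the names and the nearest-centre disambiguation policy are assumptions, not from the source; z-order or focus history are alternatives):

```python
import math

def hit(px, py, x0, y0, x1, y1, tr):
    """True if point (px, py) lies inside the Minkowski sum of the
    axis-aligned box [x0,x1]x[y0,y1] and a disc of radius tr, i.e. if
    the point is within distance tr of the box."""
    dx = max(x0 - px, 0.0, px - x1)
    dy = max(y0 - py, 0.0, py - y1)
    return math.hypot(dx, dy) <= tr

def pick(px, py, controls, tr):
    """Among all controls whose expanded boxes contain the point,
    resolve ambiguity by choosing the nearest box centre."""
    hits = [c for c in controls if hit(px, py, *c["box"], tr)]
    if not hits:
        return None
    def center_dist(c):
        x0, y0, x1, y1 = c["box"]
        return math.hypot(px - (x0 + x1) / 2, py - (y0 + y1) / 2)
    return min(hits, key=center_dist)
```
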

Window Management

In a UI where moving windows or elements is offered to users, it’s nice to satisfy the need to pack those elements tightly. Two options:

  • snap (magnetic)
  • bump+friction (solids) { allows for putting elements next to each other without them touching each other }
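
The magnetic option can be sketched as clamping a dragged edge to the nearest candidate edge whenever it comes within a threshold (function names and the 8-pixel threshold are illustrative assumptions):

```python
SNAP_DIST = 8  # pixels; assumed threshold

def snap_edge(moving_edge, other_edges, snap_dist=SNAP_DIST):
    """Return the nearest candidate edge coordinate if the dragged edge
    is within snap_dist of it, else the unmodified coordinate."""
    if not other_edges:
        return moving_edge
    nearest = min(other_edges, key=lambda e: abs(e - moving_edge))
    return nearest if abs(nearest - moving_edge) <= snap_dist else moving_edge
```
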

Layout algorithm (Flutter)

layout transformation :: min/max width, min/max height, fixed size elements, flexible size elements -> sizes per element

Constraints go in, the tree is traversed, and sizes of traversed elements come out. The tree has row nodes and column nodes. Go through fixed-size elements first, taking the dimension that matches the node type (row => heights, column => widths), then apply the remaining size in the chosen dimension to the flexible elements.
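
The pass above can be sketched for a single row node, assuming fixed children consume their size first and the remainder is split among flexible children by flex factor (a simplification of Flutter's actual constraint model; the names are illustrative):

```python
def layout_row(max_width, children):
    """children: list of dicts; a fixed child has a 'width', a flexible
    child has a 'flex' factor. Returns the resolved width per child:
    constraints go down (max_width), sizes come back up."""
    fixed = sum(c["width"] for c in children if "width" in c)
    total_flex = sum(c["flex"] for c in children if "flex" in c)
    remaining = max(max_width - fixed, 0)
    sizes = []
    for c in children:
        if "width" in c:
            sizes.append(c["width"])                      # fixed first
        else:
            sizes.append(remaining * c["flex"] / total_flex)
    return sizes
```
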

Comments

Sometimes people get confused between these two opposites:

  • the product is the user interface,
  • the user interface surfaces, or allows the user to interact with, a deeper model or simulation

What remains true in both points of view is that data has to be shared across multiple systems, and that data ultimately has to be seen and interacted with by the user; that part belongs to the user interface.

Some surface level aspects of user interfaces (what people call design):

  • structure, sequence
  • 2d/3d layout
  • chrome, animations

The logic of interactions is difficult to express.

Example1: Firefox Web Render


@url: https://hacks.mozilla.org/2017/10/the-whole-web-at-maximum-fps-how-webrender-gets-rid-of-jank/ @title: The Whole Web At Maximum Speed

GPU based rendering.

Transformations used:

  • Page transformed into a “stacking context” tree (compositing tree?)
  • Early culling of display items to remove those not visible in the viewport.
  • Compositing tree turned into a render-task tree, after having optimized it (reducing the number of intermediary textures).
  • Batching draw calls (maximally).
  • Assist pixel shaders by allowing early Z-culling:
    • an opaque pass where opaque objects are drawn front to back,
    • translucent shapes are drawn back to front.
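
The two-pass ordering in the last bullet can be sketched as a sort over display items (field names are illustrative; here 'z' is depth, with larger values nearer the viewer). Opaque items go front to back so early Z-culling rejects hidden pixels; translucent items go back to front so blending composes correctly:

```python
def draw_order(items):
    """items: dicts with 'z' (larger = nearer the viewer) and 'opaque'.
    Returns the GPU submission order: opaque pass front-to-back first,
    then translucent pass back-to-front."""
    opaque = sorted((i for i in items if i["opaque"]),
                    key=lambda i: -i["z"])     # front to back
    translucent = sorted((i for i in items if not i["opaque"]),
                         key=lambda i: i["z"])  # back to front
    return opaque + translucent
```
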

Pathfinder project: rendering font glyphs on GPU as well @url: https://github.com/pcwalton/pathfinder

Links

Reading notes for @url: https://www.youtube.com/watch?v=rX13p0Ndzzk

Docking

Confusing when it's more than one level deep. A user coming back to their UI asks: "How did I put my layout together?"

Window Management

  • Ability to move windows
  • Sizing vertically, horizontally, diagonally

However it gets tricky to lay things out. Solution: Snapping extremities to boundaries of windows, with visible guide lines. Snapping extremities to the parent container also makes sense.

Auto-anchoring feature. Whenever two windows are snapped together, moving the separator between them resizes each side. The windows are "connected." Pressing down for a while and then moving lets you let go of that restriction.

Resizing containers

Internal windows are moved and resized using a physics (spring-mass) system.
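
A minimal sketch of such a system: one damped-spring step pulling a window edge toward its target position. The constants are illustrative, chosen near critical damping (damping ≈ 2·√stiffness) so the motion settles without bouncing:

```python
def spring_step(x, v, target, dt, stiffness=200.0, damping=28.0):
    """One explicit-Euler step of a damped spring pulling position x
    toward target; returns (new_x, new_v). With damping near
    2*sqrt(stiffness) the motion is approximately critically damped."""
    accel = stiffness * (target - x) - damping * v
    v += accel * dt
    x += v * dt
    return x, v
```
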

We'll first describe the generic display timeline, introducing some vocabulary. The display reads content from what's called a framebuffer, a buffer of pixels to be shown. It has its own timeline:

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        
 . . . . . . . . . . . . . o . . . . . . . . . . . . . . . o . . .        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)

This is the case for a traditional non-freesync/g-sync display.

In the ideal case, we provide new content to the screen each frame. A frame is a group of pixels that are consistent as a sample in time:

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f 0        time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . -----). . . . . . . . . . . . . -----).        swap front-buffer and back-buffer: depends on data being ready.
 a------------------------ ~ ~ ~ b------------------------ ~ ~ ~ a        front-buffer
 a-------------------------------b-------------------------------a        screen as seen by user
  • swapping front-buffer and back-buffer needs to happen during the swap period to prevent tearing (content changing while the scan-out period is active, where the screen shows two frames at a single time)
  • freesync/g-sync and other adaptive-sync tech allow delivering content at arbitrary points, not only during the vblank period

Note that the presentation time (beginning of the scan-out) of a frame is often implicit to our programs, which usually can't know what the real, effective presentation time is for their user.

Animations feel correct when their calculation time approximates the presentation time within some accuracy. (That accuracy depends on people's ability to detect jank in the speed of moving elements.)

Missing a frame looks like this (a repeated frame, noticeable by users):

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . -----).        swap front-buffer and back-buffer:
 a-------------------------------------------------------- ~ ~ ~ b        front-buffer
 a---------------------------------------------------------------b        screen

Unsynchronized front-buffer swap looks like this: (tearing, noticeable when the content changes quite a bit in the horizontal direction)

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . -----).        swap front-buffer and back-buffer:
 a---------------------------------------b------------------------        front-buffer
 a-------------------------------a-------b------------------------        screen: shows tearing at the line scanned out at time 4

Lowering UI latency

What's acceptable user latency? Can I make the argument that the lower the better? (I.e. that the simulated world the UI shows is more usable, "feels more real", the lower the latency?)

Components throughout the system add latency in unknown ways. And system builders add components between us and the scan-out in ways that aren't entirely transparent. (Examples: desktop compositors, GPU API buffering, in-screen or transport latency and buffering.)

An unacceptable demand on latency is one that cannot be achieved by the system: we can't expect latency that's smaller than what the screen speed allows.

Where is latency hiding? Can I simulate the feeling of laggy displays?

  • mouse cursor trailing behind the actual mouse position
  • clicking the wrong thing:
    • mouse position as input trails the actual screen position shown by cursor.
    • screen disagrees with simulation

How can we reduce input latency?

Our goal is to transform:

user inputs, autonomous processes -> frame content on the screen for the next presentation time

Given the following steps:

  • transport user inputs to program
  • map input to user interface
  • user interface interprets input into data model changes and new graphical content.
  • render: new graphical content results in frame buffer content
  • transport frame to screen

Some parts of the graphical content depend on user inputs, some other parts do not. Does this vary frame to frame? (Imagine resizing panels in a user interface.)

We define the input lag as follows: Input lag :: time(scan-out) - time(user inputs transport)

Reading user inputs as close as possible to the presentation time is what low latency means.

How can we push time(user inputs transport) closer to time(scan-out)?

General optimization: by reducing the durations spent mapping inputs to the user interface, rendering the UI, etc. This is our baseline; it gives us the lowest expectable latency.

Another angle of attack is to reduce the "slop", the time wasted waiting for scan-out to start, when the frame is already ready. If that time exists, then we can technically wait before reading and processing user input until the very last moment where rendering the frame would not result in a missed frame.

I.e. if there are "compressible" parts in the data pipeline leading to new buffer contents, we can compress them by waiting upfront just the right amount of time that guarantees the frame will still be shown.
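
One way to sketch that wait, assuming we know the next vblank time and have an estimate of the input-to-frame work (the function names, and the 2 ms safety margin, are hypothetical):

```python
import time

def frame_loop_once(now, next_vblank, est_work, sample_input, render,
                    margin=0.002):
    """One iteration of a low-latency frame loop: wait out the slop,
    sample input as late as the work estimate allows, then render.
    sample_input() and render(inputs) are supplied by the caller; all
    times are seconds on the same clock."""
    latest = next_vblank - est_work - margin
    if latest > now:
        time.sleep(latest - now)  # this wait is the "slop" being compressed
    inputs = sample_input()
    return render(inputs)
```
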

How to estimate the time it takes to process an input? What's the probability distribution, and does it have an average? What's the risk of estimating badly? What should we do when we estimated badly?

Our estimator can, at worst, conclude that it does not matter, because rendering will take so long that we're in a regime where low latency isn't achievable at all. In that regime I'd hypothesize that slow but consistent times are enough, i.e. relying on reading inputs at every frame start.
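
A conservative estimator could take a high percentile of a sliding window of observed frame times, so that over-runs (missed frames) stay rare by construction. A sketch with illustrative defaults (window and percentile are assumptions):

```python
from collections import deque

class WorkEstimator:
    """Estimate per-frame processing time as a high percentile of a
    sliding window of observed durations. A percentile, rather than a
    mean, keeps bad estimates (and thus missed frames) rare."""
    def __init__(self, window=120, percentile=0.95):
        self.samples = deque(maxlen=window)
        self.percentile = percentile

    def observe(self, duration):
        self.samples.append(duration)

    def estimate(self, fallback=1 / 60):
        if not self.samples:
            return fallback  # no data yet: assume a full frame of work
        ordered = sorted(self.samples)
        idx = min(int(len(ordered) * self.percentile), len(ordered) - 1)
        return ordered[idx]
```
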

If we render too fast, the estimator deals with that by using the extra capacity to produce lower latency. Normally it's the vblank synchronization that provides back-pressure and avoids rendering too many frames. Note how disabling vblank effectively disables that back-pressure, rendering more frames in exchange for lower input latency.

Strategies when missing a frame?

If it's possible, it might be acceptable to allow tearing when we just barely, unexpectedly missed the frame boundary. There's tearing and then there's tearing: maybe tearing within the first lines of the screen isn't so bad/visible.

How sophisticated should the predictor be?

Can we do something like branch prediction in CPUs, that is, collect statistics about disparate events so as to predict clicks in various parts of the screen? Think about the difference between moving the mouse around (only a few elements change state and need to be redrawn, due to hover) and clicking the button selecting a tab in a panel, which redraws the full section.

If missing a prediction isn't so bad, then there's no need to get sophisticated.

UIAutomation + Accessibility module

  • describing a UI as a tree
  • element types:
    • content: textual, visual
    • container:
      • regions tagged with some purpose
      • to assist with navigation
  • focus:
    • there is a concept of focus connected with non-spatial control devices such as the keyboard

Any node may have 0..n child elements.

Why a tree? Alternatives?

Linear sequences: focus-management conflict. Groups tagged with ids? (I.e. the tree exists somewhere; the ids may be nonsensical.)

zzstructures

APIs for defining trees:

  • serial form: s-exps / xml / json / iff
  • host language literal equivalent
  • element based:
    • construction of fragments
    • children list manipulation
    • re-rooting
  • iterator based (zipper?)
  • infix/postfix
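
The element-based style can be sketched as a minimal node type supporting fragment construction and children-list manipulation, plus the depth-first walk an accessibility API would perform (all names hypothetical):

```python
class Element:
    """Minimal element-based tree API: construct fragments, then splice
    them into a parent's children list."""
    def __init__(self, role, *children, **attrs):
        self.role = role
        self.attrs = attrs            # metadata, e.g. labels for screen readers
        self.children = list(children)

    def append(self, child):
        self.children.append(child)
        return self                   # chainable, for fragment construction

def depth_first(node):
    """Enumerate the tree the way an accessibility client would walk it."""
    yield node
    for c in node.children:
        yield from depth_first(c)
```

Usage: `Element("window", Element("toolbar", Element("button", label="Open")), Element("content"))` builds a fragment in one expression; note how easily shared subtrees would turn the tree into a graph, one of the problems listed below.
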

Some Problems:

  • compactness
  • can lead to creation of graphs

Examples:

  • DOM API

Metadata:

  • WAI-ARIA proposes some tags to inform about roles of UI elements
  • Microsoft's UIAutomation also has control types

General Information about Accessibility on Multiple Platforms

http://corsi.dei.polimi.it/accessibility/download/22-OS%20and%20app%20accessibility.pdf. On Linux, ATK appears to be a thing.
