Last active May 3, 2024 22:38
The Problem Of UI (User Interfaces)


Note: a lot of programmers talk about UI without mentioning the user even once, as if it were entirely a programming problem. I wonder what we’re leaving off the table when we do that.

Evaluation Axes

  • input handling, and latency,
  • composition,
  • dataflow,
  • layout,
  • painting,
  • styling,
  • extension,
  • resource demands: power draw, CPU cycles

Caveats/blind spots of homegrown solutions

Accessibility is a big issue: how do you make your UI accessible? Usually platform vendors provide APIs to enumerate/navigate/query UI elements, extracting some metadata for screen readers and the like.

Multi-viewport UIs: support for multiple windows, multiple displays/monitors with mixed DPI.

Power draw. (When little of the screen is changing, power draw should be commensurate with the updated areas. In other words: partial rendering, partial presentation.)

Collaboration: support for more than one editing client for multi-user collaboration on the same data.

Responsive layouts and scalable/zoomable UIs help users adapt a UI to viewing distances or screen sizes.

Internationalized input methods (“IME” on Windows).

Adaptability to large teams (adding new controls, new assets, new kinds) without contention.

Multi-touch and touch interfaces raise many questions. They allow multiple items to be interacted with at a time (contrary to the mouse-keyboard pair); should this be constrained somehow (to only similar items? to only similar items supporting similar interactions? what if more than one person is touching the screen?). They also expect either bigger items or more lenient hotzones.

Hi-DPI requires a great many more pixels to fill. Also, what should the internal units be?

Plugin support: this exposes lots of problematic scenarios when plugins are allowed to hook into the underlying event loop and desktop APIs. For instance, imagine dealing with non-DPI-aware plugins within a DPI-aware process on Windows.

GPU-based rendering:

  • how does it affect latency?
  • does it fit the type of graphics shown in UI?

Multiple interactions in one: very often UIs hit a difficulty when we want to select multiple elements and change them all in parallel, i.e. increment/change more than one element at a time.




Consider each quad representing the bounding box of a UI control. The Minkowski sum of that quad and a round region of radius ‘tr’ (the tolerance radius) depends on the input device (mouse, touch, etc.) and represents the collision box between that input device’s center point and the control. In case n>1 controls match, the ambiguity must be resolved. @idea{separating axes test}
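As a minimal sketch of that collision box (all names here are illustrative, not from any particular codebase): testing a point against the Minkowski sum of an axis-aligned quad and a disk of radius tr is equivalent to testing whether the distance from the point to the quad is at most tr.

```c
#include <stdbool.h>

typedef struct { float x0, y0, x1, y1; } Quad;   /* control bounding box */

static float clampf(float v, float lo, float hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* true iff (px,py) lies within the quad inflated by a disk of radius tr,
   i.e. within the Minkowski sum of the quad and the tolerance disk */
bool hit_test(Quad q, float px, float py, float tr) {
    float cx = clampf(px, q.x0, q.x1);   /* closest point on the quad */
    float cy = clampf(py, q.y0, q.y1);
    float dx = px - cx, dy = py - cy;
    return dx * dx + dy * dy <= tr * tr;
}
```

With a per-device tr (small for mouse, larger for touch), every control whose hit_test returns true is a candidate, and ties among n>1 matches still need a resolution policy (e.g. smallest distance).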

Window Management

In a UI where moving windows or elements is offered to users, it’s nice to satisfy the need to pack those elements tightly. Two options:

  • snap (magnetic)
  • bump+friction (solids) { allows for putting elements next to each other without them touching each other }
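The snap option can be sketched in a few lines (hypothetical function, not from any toolkit): while dragging, an edge that comes within some snap distance of a target edge jumps onto it; otherwise it is left alone.

```c
#include <math.h>

/* snap a dragged edge coordinate onto a nearby target edge */
float snap_edge(float dragged, float target, float snap_dist) {
    return fabsf(dragged - target) <= snap_dist ? target : dragged;
}
```

The bump+friction option instead treats edges as solid: the dragged edge is clamped so it can approach but never cross the target, which lets users place elements next to each other without them touching.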

Layout algorithm (Flutter)

layout transformation :: min/max width, min/max height, fixed size elements, flexible size elements -> sizes per element

Constraints go in, traverse the tree, and sizes of the traversed elements come out. The tree has row nodes and column nodes. Go through fixed-size elements first, taking the dimension that matches the node type (row => heights, column => widths), then apply the remaining size in the chosen dimension to flexible elements.
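This pass can be sketched for a single row node as follows (a simplification with illustrative names; Flutter's real protocol passes min/max constraints in both dimensions down the whole widget tree):

```c
typedef struct {
    int   is_flexible;    /* 0: fixed size, 1: flexible */
    float fixed_width;    /* used when !is_flexible */
    float out_width;      /* filled in by the layout pass */
} Child;

/* constraints go in (max_width), sizes per element come out */
void layout_row(Child *children, int n, float max_width) {
    float used = 0.0f;
    int flex_count = 0;
    for (int i = 0; i < n; i++) {          /* fixed-size elements first */
        if (children[i].is_flexible) { flex_count++; continue; }
        children[i].out_width = children[i].fixed_width;
        used += children[i].fixed_width;
    }
    float leftover = max_width - used;     /* remaining space in the row */
    if (leftover < 0.0f) leftover = 0.0f;
    for (int i = 0; i < n; i++)            /* then flexible elements */
        if (children[i].is_flexible)
            children[i].out_width = flex_count ? leftover / flex_count : 0.0f;
}
```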


Sometimes people get confused between these two opposite points of view:

  • the product is the user interface,
  • the user interface surfaces, or allows the user to interact with, a deeper model or simulation

What remains true in both points of view is that data has to be shared across multiple systems, and that data ultimately has to be seen and interacted with by the user, and so belongs to the user interface.

Some surface level aspects of user interfaces (what people call design):

  • structure, sequence
  • 2d/3d layout
  • chrome, animations

The logic of interactions is difficult to express.

Example1: Firefox Web Render

@url: @title: The Whole Web At Maximum Speed

GPU based rendering.

Transformations used:

  • Page transformed into “stacking context” tree (Compositing tree?)
  • Early culling of display items to remove those that are not shown in the viewport.
  • Compositing tree turned into a render task tree, after optimizing it (reducing the number of intermediary textures)
  • Batching draw calls (maximal)
  • Assist pixel shaders by allowing Early Z-culling
    • Opaque pass where opaque objects are drawn front to back
    • Translucent shapes are drawn back to front

Pathfinder project: rendering font glyphs on GPU as well @url:


Reading notes for @url:


Confusing when it's more than one level deep. User going back to their UI: "How did I put my layout together?"

Window Management

  • Ability to move windows
  • Sizing vertically, horizontally, diagonally

However it gets tricky to lay things out. Solution: snapping extremities to the boundaries of other windows, with visible guide lines. Snapping extremities to the parent container also makes sense.

Auto-anchoring feature: whenever two windows are snapped together, moving the separator between them resizes each side. The windows are "connected." Pressing down for a while and then moving lets you release that restriction.

Resizing containers

Internal windows are moved and resized using a physics (spring-mass) system.
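One step of such a spring-mass system might look like this (a sketch with a unit mass and illustrative constants; the damping below is chosen near critical so edges settle without oscillating):

```c
typedef struct { float pos, vel; } Spring;   /* e.g. one window edge */

/* semi-implicit Euler step pulling pos toward target */
void spring_step(Spring *s, float target, float stiffness, float damping, float dt) {
    float force = stiffness * (target - s->pos) - damping * s->vel;
    s->vel += force * dt;    /* unit mass: acceleration == force */
    s->pos += s->vel * dt;
}
```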

We'll first describe the generic display time line, introducing some vocabulary. The display reads content from what's called a framebuffer, a buffer of pixels to be shown. It has its own timeline:

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .        
 . . . . . . . . . . . . . o . . . . . . . . . . . . . . . o . . .        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)

This is the case for a traditional non-freesync/g-sync display.

In the ideal case, we provide new content to the screen each frame. A frame is a group of pixels that are consistent as a sample in time:

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f 0        time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . -----). . . . . . . . . . . . . -----).        swap front-buffer and back-buffer: depends on data being ready.
 a------------------------ ~ ~ ~ b------------------------ ~ ~ ~ a        front-buffer
 a-------------------------------b-------------------------------a        screen as seen by user
  • swapping the front-buffer and back-buffer needs to happen during the vblank period to prevent tearing (content changing while scan-out is active, making the screen show two frames at a single time)
  • freesync/g-sync and other adaptive-sync tech allow delivering content at arbitrary points, not only during the vblank period

Note that the presentation time (the beginning of scan-out) of a frame is often implicit to our programs, which usually can't know the real effective presentation time for their user.

Animations feel correct when their calculation time approximates the presentation time within some accuracy. (It depends on people's ability to detect jank in the speed of moving elements.)

Missing a frame looks like this (a repeated frame, noticeable by users):

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . -----).        swap front-buffer and back-buffer:
 a-------------------------------------------------------- ~ ~ ~ b        front-buffer
 a---------------------------------------------------------------b        screen

Unsynchronized front-buffer swap looks like this: (tearing, noticeable when the content changes quite a bit in the horizontal direction)

 0 1 2 3 4 5 6 7 8 9 a b c d e f 0 1 2 3 4 5 6 7 8 9 a b c d e f          time ->
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 . . . . . . . . . . . . . o----). . . . . . . . . . . . . o----).        vblank signal
 -------------------------)      -------------------------)               scan-out from front-buffer (presentation)
 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . -----).        swap front-buffer and back-buffer:
 a---------------------------------------b------------------------        front-buffer
 a-------------------------------a-------b------------------------        screen: shows tearing at the line scanned out at time 4

Lowering UI latency

What's acceptable user latency? Can I make the argument that the lower the better, i.e. that the simulated world the UI shows is more usable, "feels more real," the lower the latency?

Components throughout the system add latency in unknown ways, and system builders add components between us and the scan-out in ways that aren't entirely transparent. (Examples: desktop compositors, GPU API buffering, in-screen or transport latency and buffering.)

An unacceptable demand on latency is one that cannot be achieved by the system: we can't expect latency that's smaller than what the screen speed allows.

Where is latency hiding? Can I simulate the feeling of laggy displays?

  • mouse cursor trailing behind the actual mouse position
  • clicking the wrong thing:
    • mouse position as input trails the actual screen position shown by cursor.
    • screen disagrees with simulation

How can we reduce input latency?

Our goal is to transform:

user inputs, autonomous processes -> frame content on the screen for the next presentation time

Given the following steps:

  • transport user inputs to program
  • map input to user interface
  • user interface interprets input into data model changes and new graphical content.
  • render: new graphical content results in frame buffer content
  • transport frame to screen

Some parts of the graphical content depend on user inputs, other parts do not, and this varies frame to frame. (Imagine resizing panels in a user interface.)

We define the input lag as follows: Input lag :: time(scan-out) - time(user inputs transport)

Reading user inputs as close as possible to the presentation time is what low latency means.

How can we push time(user inputs transport) closer to time(scan-out)?

General optimization: by reducing the durations spent mapping inputs to the user interface, rendering the UI, etc. This is our baseline; it gives us the lowest expectable latency.

Another angle of attack is to reduce the "slop", the time wasted waiting for scan-out to start, when the frame is already ready. If that time exists, then we can technically wait before reading and processing user input until the very last moment where rendering the frame would not result in a missed frame.

I.e. if there are "compressible" parts in the data pipeline leading to new buffer contents, we can compress them by waiting upfront just the right amount of time that guarantees the frame will still be shown.
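The bookkeeping for this can be sketched as follows, assuming a deliberately conservative estimator (worst render time observed so far, plus a safety margin); all names here are hypothetical:

```c
typedef struct {
    double worst_render;   /* worst render duration observed, in seconds */
    double margin;         /* safety margin, e.g. 0.002 s */
} PaceEstimator;

/* fold a measured render duration into the estimate */
void pace_observe(PaceEstimator *e, double render_seconds) {
    if (render_seconds > e->worst_render)
        e->worst_render = render_seconds;
}

/* latest time (same clock as deadline) at which input sampling plus
   rendering can still start without missing the next scan-out */
double pace_latest_start(const PaceEstimator *e, double deadline) {
    return deadline - (e->worst_render + e->margin);
}
```

A frame loop would sleep until pace_latest_start(&e, next_vblank), then read inputs, render, and feed the measured render duration back through pace_observe.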

How to estimate the time it takes to process an input? What's the probability distribution, and does it have an average? What's the risk of estimating badly? What should we do when we estimated badly?

At worst, our estimator can conclude that it does not matter, because rendering takes so long that we're in a regime where low latency isn't achievable at all. In that regime I'd hypothesize that slow but consistent times are enough, i.e. relying on reading inputs at every frame start.

If we render too fast, the estimator deals with it by using the extra capacity to produce lower latency. Normally it's the vblank synchronization that provides the back-pressure that avoids rendering too many frames. Note how disabling vblank effectively removes that back-pressure: more frames get rendered in exchange for lower input latency.

Strategies when missing a frame?

If possible, it might be acceptable to allow tearing when we just barely, unexpectedly missed the frame boundary. There's tearing and tearing: maybe tearing within the first lines of the screen isn't so bad/visible.

How sophisticated should the predictor be?

Can we do something like branch prediction in CPUs, that is, collect statistics about disparate events so as to predict clicks in various parts of the screen? Think about the difference between moving the mouse around (only a few elements change state and need to be redrawn, due to hover) and clicking the button that selects a tab in a panel, which redraws the full section.

If missing a prediction isn't so bad, then there's no need to get sophisticated.

Resources we care about:

  • CPU usage (time taken from computation, battery life)
  • Memory


A user interface takes the available input devices and interprets their continuous and discrete actions to trigger data transformations, computation, and communication.

A user interface presents itself and the application to output devices (displays).


For ergonomics, the position of the elements of a UI is stable in the coordinate system of that UI. Their position generally shifts only as the result of a user input. Exceptions: timeline editors, graphs.

Display elements generally do not occlude each other, except in windowing systems. Most apps today follow a tiling arrangement instead.

Modern displays are framebuffer based and therefore can be seen as local caches. Going further, graphics processing units (i.e. display accelerators) go beyond that and can store bitmap elements, textures and GPU programs.

Although modern GPUs can re-render most UIs within one display frame, to preserve resources (CPU resources, for computing/battery life) a UI can implement:

  • just-needed rendering: rather than rendering at the display frame rate (144 Hz, 60 Hz), the UI is rendered only when the "cache" is out of date
  • partial rendering: only render what has changed

This applies to CPU-bound computations. Computations done on the GPU would save on CPU computations. Implications however for battery life depend on whether the computation is more efficient on the GPU than on the CPU and whether the entire GPU can go back to IDLE quickly enough.

Just-needed rendering:

  • UI is called when input devices receive inputs,
  • UI is called on spontaneous state changes: timers, network or when a frame ceases to be valid.
  • UI elements define the validity of a rendered frame: for animation, there is a certain time-to-live attached to the rendered frame. For corner cases such as layout, where the UI needs to converge to a stable state, elements may opt to mark more than one frame as outdated. On these conditions the UI will trigger a re-render.

Partial rendering strategies:

  • by subregion (keep track of "dirty regions" and re-render only those)
  • overlays (dirty or fast-updating regions are rendered separately and blitted on top of unchanged areas). Goal: preserve resources. Failure mode: complete re-render.
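The subregion strategy can be sketched with a single accumulated dirty rect (hypothetical names; real systems typically keep a list of rects, since a single union over-approximates the changed area):

```c
typedef struct { int x0, y0, x1, y1; } Rect;   /* x1/y1 exclusive */

static Rect dirty = {0, 0, 0, 0};              /* empty: nothing to redraw */

static int rect_empty(Rect r) { return r.x0 >= r.x1 || r.y0 >= r.y1; }

/* grow the dirty region by union with a newly invalidated rect */
void invalidate(Rect r) {
    if (rect_empty(dirty)) { dirty = r; return; }
    if (r.x0 < dirty.x0) dirty.x0 = r.x0;
    if (r.y0 < dirty.y0) dirty.y0 = r.y0;
    if (r.x1 > dirty.x1) dirty.x1 = r.x1;
    if (r.y1 > dirty.y1) dirty.y1 = r.y1;
}

/* returns 1 and the region to re-render if anything changed, else 0
   (the "just-needed" part: no dirt, no rendering) */
int take_dirty(Rect *out) {
    if (rect_empty(dirty)) return 0;
    *out = dirty;
    dirty = (Rect){0, 0, 0, 0};
    return 1;
}
```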

Quality checklist:

  • Can I mark and copy text? Is it any text, or just specific things?
  • Can I enlarge the font or the entire view without breaking the app?
  • Can I resize the window without breaking the app?
  • Can I use the app with just the keyboard, or just the mouse (with an OS-provided on-screen keyboard)?
  • Does it work with a screen reader?
  • Does it play nicely with other OS accessibility features (high-contrast mode or DPI settings)?
  • Does it support localisation?
  • Does it have legible and high-quality text rendering at various sizes?
  • Does it have standard OS chrome (window icons, menu bar)?

Technical quality checklist:

  • Good efficiency (resources unused when the application is idling, minimal data retention)
  • Styling
  • Layout
  • Custom UI elements, canvas for custom drawing
  • Scalable UI elements that nevertheless keep sharp edges
  • Good platform sympathy: DPI settings, accessibility,


@author Mikko Mononen @url:

@comment { Mikko Mononen is referring to "RectCut" as defined in @url: by Martin Cohen.

Martin Cohen can also be found at @url:

The idea is to take axis-aligned bounding boxes (rects) and carry the layout as an input rect that gets mutated via rectangle-producing functions that reserve space within it:

    Rect layout = { 0, 0, 180, 16};
    Rect r1 = cut_left(&layout, 16); // carves out a rectangle of width 16 on the left
    Rect r2 = cut_left(&layout, 12);  // carves out another one
    Rect r3 = cut_right(&layout, 32); // carves out another one, this time on the right side.
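A possible implementation of those cut functions, assuming Rect stores {x, y, w, h} (Martin Cohen's write-up works with min/max coordinates, but the idea is identical): each call carves a piece off the layout rect and shrinks it in place.

```c
typedef struct { float x, y, w, h; } Rect;

/* carve a rect of width a off the left side of *layout */
Rect cut_left(Rect *layout, float a) {
    Rect r = { layout->x, layout->y, a, layout->h };
    layout->x += a;
    layout->w -= a;
    return r;
}

/* carve a rect of width a off the right side of *layout */
Rect cut_right(Rect *layout, float a) {
    Rect r = { layout->x + layout->w - a, layout->y, a, layout->h };
    layout->w -= a;
    return r;
}
```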


Inspired by @FlohOfWoe's Sokol, I've been tinkering with IMGUI layout based on C's struct initialization. The idea I'm exploring is rectCut + flexbox, where in the common case you can just get a slice of a rect, but you can also create small flexbox-like layouts to go with it.

// Menu Layout
MIbox itemSearchCont = { .layout.dir = MI_ROW, .layout.spacing = 4, };
MIbox itemSearchIcon = { .content = miMeasureIcon() };
MIbox itemSearchInput = {};
miBoxAddChildren(&itemSearchCont, &itemSearchIcon, &itemSearchInput);

MIitem item1 = { .text = "Item 1", .detail = "Alt+Shift+Space"};
MIitem item2 = { .flags = MI_ITEM_CHECKED, .icon = ICON_EMOJI_PEOPLE, .text="People", .detail = "Alt+P" };
MIitem item3 = { .flags= MI_ITEM_SUBMENU, .icon = ICON_EMAIL, .text="Email" };

MIbox menuCont = { .layout.dir = MI_COL, .layout.spacing = 4, .layout.pack = MI_START, .layout.pad.x = 6, .layout.pad.y = 6 };
MIbox itemBox1 = { .content = miMeasureItem(item1) };
MIbox itemBox2 = { .content = miMeasureItem(item2) };
MIbox itemBox3 = { .content = miMeasureItem(item3) };
miBoxAddChildren(&menuCont, &itemSearchCont, &itemBox1, &itemBox2, &itemBox3);

miBoxMoveTo(&menuCont, (MIpoint) {.x=500,.y=200});

// Menu logic
miIcon(itemSearchIcon.rect, ICON_SEARCH);
if (miChanged(miInput(itemSearchInput.rect, (MIinput){.text=text, .maxText=sizeof(text)}))) {
    printf("Search: %s\n", text);
}
if (miPressed(miItem(itemBox1.rect, item1))) {
    printf("Item1 pressed\n");
}
miItem(itemBox2.rect, item2);
miItem(itemBox3.rect, item3);

It's all one pass; you build the layout as you go. I have a few nasty-to-implement cases. Menus are one of those: variable-width content, each row having items aligned to both ends.

Another example of how the rectCut works together with the flexbox:

// Search box
char const* searchText = "Search";

MIbox searchCont = { .layout.dir = MI_ROW, .layout.spacing = 4 };
MIbox button = { .content = miMeasureButton((MIbutton){.label=searchText}) };
MIbox input = { .content = miMeasureInput((MIinput){}), .grow = 1 };
miBoxAddChildren(&searchCont, &input, &button);
miRectCutAndLayout(&windowCont, &searchCont, windowLayout);

miButton(button.rect, (MIbutton){.label=searchText, .variant=MI_FILLED});
miInput(input.rect, (MIinput){.text=text, .maxText=sizeof(text)});

// Slider
MIbox slider = { .content = miMeasureSlider((MIslider){}) };
miRectCutAndLayout(&windowCont, &slider, windowLayout);

miSlider(slider.rect, (MIslider){.value=&sliderValue, .vmin=0, .vmax=100});

When adding custom layouts to your layout, it kinda needs to do the measuring twice. The flexbox-like layout is super simple, so most time is likely spent measuring text.

UIAutomation + Accessibility module

  • describing a UI as a tree
  • element types:
    • content: textual, visual
    • container:
      • regions tagged with some purpose
      • to assist with navigation
  • focus:
    • there is a concept of focus connected with non-spatial control devices such as the keyboard

Any node may have 0..n children elements

Why a tree? Alternatives?

Linear sequences: focus management conflicts. Groups tagged with ids? (I.e. the tree exists somewhere; the ids may be nonsensical.)


APIs for defining trees:

  • serial form: s-exps / xml / json / iff
  • host language literal equivalent
  • element based:
    • construction of fragments
    • children list manipulation
    • re-rooting
  • iterator based (zipper?)
  • infix/postfix

Some Problems:

  • compactness
  • can lead to creation of graphs
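An element-based style from the list above could look like this minimal sketch (hypothetical names; note that nothing stops a caller from appending the same node under two parents, which is exactly how trees silently become graphs):

```c
#include <stdlib.h>

typedef struct Node {
    const char   *role;        /* e.g. "button", "region" */
    struct Node **children;
    int           child_count;
} Node;

/* construct a fragment */
Node *node_new(const char *role) {
    Node *n = calloc(1, sizeof *n);
    n->role = role;
    return n;
}

/* children list manipulation: append one child */
void node_append(Node *parent, Node *child) {
    parent->children = realloc(parent->children,
                               (parent->child_count + 1) * sizeof *parent->children);
    parent->children[parent->child_count++] = child;
}
```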




  • WAI-ARIA proposes some tags to inform about roles of UI elements
  • Microsoft's UIAutomation also has control types

General information about accessibility on multiple platforms: on Linux, ATK appears to be the standard.

uucidl commented Feb 17, 2023 link is broken. Working URL:

Thanks, I fixed the link
