
@chengmu
Created October 24, 2013 09:40
GPU_notes

#GPU & Compositing in Blink and Chrome

Summarized from Chromium documents and other resources.

##Background

###Stacking Contexts

A positioned element (relative, absolute, fixed) with a z-index groups its layers into an isolated layer.

A stacking context 'flattens' the element's subtree, so nothing outside of the subtree can paint between elements of the subtree.

In other words: the rest of the DOM tree can treat the stacking context as an atomic conceptual layer for painting.

#####Standards

  • CSS2.1 9.9 Layered presentation
  • CSS2.1 Appendix E. Elaborate description of Stacking Contexts

###The GPU Process

It lets the renderer process issue commands to the GPU.

The GPU process exists primarily for security reasons.

Restricted by its sandbox, the Renderer process (where WebKit and the compositor live) cannot directly issue calls to the 3D APIs provided by the OS (we use Direct3D on Windows, OpenGL everywhere else). For that reason we use a separate process to do the rendering. The GPU process is specifically designed to provide access to the system's 3D APIs from within the Renderer sandbox or the even more restrictive Native Client "jail".

Currently Chrome uses a single GPU process per browser instance, serving requests from all the renderer processes and any plugin processes. The GPU process, while single threaded, can multiplex between multiple command buffers, each one of which is associated with its own rendering context.

Implementation of hardware-accelerated compositing in Chrome

Traditionally, browsers relied entirely on the CPU to render web page content.

With hardware acceleration, the GPU takes part in compositing the contents of a web page.

###How Browsers Work

  • repaint: expensive, needs to recalculate pixel information such as colors
  • redraw: cheap

##Compositing in WebKit/Blink

How WebKit Renders Pages

RenderObjects are stored in a parallel tree structure, called the Render Tree. A RenderObject knows how to present (paint) the contents of the Node on a display surface. It does so by issuing the necessary draw calls to a GraphicsContext. A GraphicsContext is ultimately responsible for writing the pixels into a bitmap that gets displayed to the screen. In Chrome, the GraphicsContext wraps Skia, our 2D drawing library, and most GraphicsContext calls become calls to an SkCanvas or SkPlatformCanvas (see this document for more on how Chrome uses Skia).

RenderLayers exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc.

Notice that there isn't a one-to-one correspondence between RenderObjects and RenderLayers. A particular RenderObject is associated either with the RenderLayer that was created for it, if there is one, or with the RenderLayer of the first ancestor that has one.

RenderLayers form a tree hierarchy as well. The root node is the RenderLayer corresponding to the root element in the page, and the descendants of every node are layers visually contained within the parent layer. The children of each RenderLayer are kept in two lists, both sorted in ascending order: the negZOrderList, containing child layers with negative z-indices (and hence layers that go below the current layer), and the posZOrderList, containing child layers with positive z-indices (layers that go above the current layer).

  • The DOM tree, which is our fundamental retained model
  • The RenderObject tree, which has a 1:1 mapping to the DOM tree’s visible nodes. RenderObjects know how to paint their corresponding DOM nodes.
  • The RenderLayer tree, made up of RenderLayers that map to a RenderObject on the RenderObject tree. The mapping is many-to-one, as each RenderObject is either associated with its own RenderLayer or the RenderLayer of its first ancestor that has one. The RenderLayer tree preserves z-ordering amongst layers.

Many Trees

####Two Render Paths: Software or Hardware

####Software implementation

In the software path, the page is rendered by sequentially painting all the RenderLayers, from back to front. The RenderLayer hierarchy is traversed recursively starting from the root, and the bulk of the work is done in RenderLayer::paintLayer(), which performs the following basic steps (the list of steps is simplified here for clarity):

  1. Determines whether the layer intersects the damage rect for an early out.
  2. Recursively paints the layers below this one by calling paintLayer() for the layers in the negZOrderList.
  3. Asks RenderObjects associated with this RenderLayer to paint themselves. This is done by recursing down the RenderObject tree starting with the RenderObject which created the layer. Traversal stops whenever a RenderObject associated with a different RenderLayer is found.
  4. Recursively paints the layers above this one by calling paintLayer() for the layers in the posZOrderList.

In this mode RenderObjects paint themselves into the destination bitmap by issuing draw calls into a single shared GraphicsContext (implemented in Chrome via Skia).
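The back-to-front traversal can be sketched in a few lines of Python. This is only an illustration of the ordering, not WebKit's code: the `Layer` class, `intersects` helper and `paint_layer` function are hypothetical names, "painting" is reduced to recording the layer name, and z-index 0 is assumed to sort with the positive list.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


@dataclass
class Layer:
    """Hypothetical stand-in for a RenderLayer."""
    name: str
    bounds: Rect
    z_index: int = 0
    children: List["Layer"] = field(default_factory=list)

    def neg_z_order_list(self) -> List["Layer"]:
        # Children painted below this layer, ascending z-index.
        return sorted((c for c in self.children if c.z_index < 0),
                      key=lambda c: c.z_index)

    def pos_z_order_list(self) -> List["Layer"]:
        # Children painted above this layer, ascending z-index.
        # Assumption: z-index 0 is grouped with the positive list.
        return sorted((c for c in self.children if c.z_index >= 0),
                      key=lambda c: c.z_index)


def paint_layer(layer: Layer, damage: Rect, painted: List[str]) -> None:
    # Step 1: early out if the layer does not touch the damage rect.
    if not intersects(layer.bounds, damage):
        return
    # Step 2: layers below this one.
    for child in layer.neg_z_order_list():
        paint_layer(child, damage, painted)
    # Step 3: the layer's own contents (here we just record the name).
    painted.append(layer.name)
    # Step 4: layers above this one.
    for child in layer.pos_z_order_list():
        paint_layer(child, damage, painted)


root = Layer("root", (0, 0, 800, 600), children=[
    Layer("menu", (0, 0, 800, 50), z_index=2),
    Layer("backdrop", (0, 0, 800, 600), z_index=-1),
    Layer("sidebar", (700, 100, 100, 400), z_index=1),
])
order: List[str] = []
paint_layer(root, (0, 0, 400, 300), order)
print(order)  # ['backdrop', 'root', 'menu']
```

The sidebar is skipped because it lies outside the damage rect. Everything else is painted into the same shared list, which plays the role of the single destination bitmap.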

###Hardware Implementation: Compositor & GPU

####Compositor

  • Some (but not all) of the RenderLayers get their own backing surface, into which they paint instead of drawing directly into the common bitmap for the page
  • Layers with their own backing surfaces are called compositing layers

Example

The compositor is responsible for applying the necessary transformations (as specified by the layer's CSS transform properties) to each compositing layer’s bitmap before compositing it. Further, since painting of the layers is decoupled from compositing, invalidating one of these layers only results in repainting the contents of that layer alone and recompositing.

In contrast, with the software path, invalidating any layer requires repainting all layers (at least the overlapping portions of them) below and above it which unnecessarily taxes the CPU.

What is Compositing?

In the context of rendering websites: the use of multiple backing stores to cache and group chunks of the render tree.

Benefits

  • Avoids unnecessary repainting
    • when components have their own backing stores, nothing needs repainting while they animate
  • Makes some features more efficient or practical
    • scrolling, 3D CSS, opacity, filters, WebGL, hardware video decoding

Tasks of Compositing

  1. Determine how to group contents into backing stores
  2. Paint the contents of each composited layer
  3. Draw the composited layers to make a final image

New Tree! With the introduction of compositing, we add an additional conceptual tree: the GraphicsLayer tree. Each RenderLayer either has its own GraphicsLayer (if it is a compositing layer) or uses the GraphicsLayer of its first ancestor that has one. This is similar to RenderObject’s relationship with RenderLayers. Each GraphicsLayer has a GraphicsContext for the associated RenderLayers to draw into.
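The "own GraphicsLayer or first ancestor's" rule is a plain walk up the parent chain. A minimal sketch, assuming a hypothetical `RenderLayer` class with a `parent` pointer and an optional `graphics_layer` field (these names are for illustration and are not WebKit's API):

```python
from typing import Optional


class RenderLayer:
    """Hypothetical stand-in: a layer that may or may not be composited."""

    def __init__(self, name: str, parent: Optional["RenderLayer"] = None,
                 is_compositing: bool = False) -> None:
        self.name = name
        self.parent = parent
        # Only compositing layers own a GraphicsLayer (a string here).
        self.graphics_layer: Optional[str] = (
            f"GraphicsLayer({name})" if is_compositing else None
        )


def enclosing_graphics_layer(layer: RenderLayer) -> Optional[str]:
    """Return the layer's own GraphicsLayer, else the nearest ancestor's."""
    node: Optional[RenderLayer] = layer
    while node is not None:
        if node.graphics_layer is not None:
            return node.graphics_layer
        node = node.parent
    return None


root = RenderLayer("root", is_compositing=True)
content = RenderLayer("content", parent=root)
video = RenderLayer("video", parent=content, is_compositing=True)
caption = RenderLayer("caption", parent=video)

print(enclosing_graphics_layer(content))  # GraphicsLayer(root)
print(enclosing_graphics_layer(caption))  # GraphicsLayer(video)
```

The same lookup shape applies one level down, where a RenderObject finds its RenderLayer.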

Code related to the compositor lives inside WebCore, behind the USE(ACCELERATED_COMPOSITING) guards.

####Here Comes the GPU!!

With the addition of the accelerated compositor, in order to eliminate costly memory transfers, the final rendering of the browser's tab area is handled directly by the GPU. (Code for it lives behind the ACCELERATED_COMPOSITING compile-time flag.) The Compositor library is essentially using the GPU to composite rectangular areas of the page (i.e. all those compositing layers) into a single bitmap, which is the final page image.

#####Benefits of GPU

  • eliminating unnecessary (and very slow) copies of large data, especially copies from video memory to system memory.

  • In most cases, the GPU can achieve far better efficiency than the CPU (both in terms of speed and power draw) in drawing and compositing operations that involve large numbers of pixels as the hardware is designed specifically for these types of workloads.

  • Utilizing the GPU for these operations also provides parallelism between the CPU and GPU, which can operate at the same time to create an efficient graphics pipeline.

####When Will This Happen?

  • when the --force-compositing-mode flag is turned on
    • by default in Chrome on Android and ChromeOS
  • Safari on the Mac (and most likely iOS) follows the hardware accelerated path and makes heavy use of Apple's proprietary CoreAnimation API.
  • at least one of the page’s RenderLayers requires hardware acceleration
####Candidates for Optimizations

In the current WebKit implementation, the following conditions are some of those that cause a RenderLayer to get its own compositing layer (see the CompositingReasons enum in RenderLayer.h for a longer list):

  • Opacity, transforms, filters, reflections: significantly easier to apply to the composited layer when drawing

    • Layer has 3D or perspective transform CSS properties
    • Layer uses a CSS animation for its opacity or uses an animated webkit transform
    • Layer uses accelerated CSS filters
    • Layer with a composited descendant has information that needs to be in the composited layer tree, such as a clip or reflection
  • Scrolling, fixed-position: cases where compositing a subtree of content greatly reduces the number of costly repaints

  • Content that is rendered separately: compositing on the GPU can remove the need for read-back of pixels, for example with WebGL, hardware-decoded video, some plugins

    • Layer is used by a <video> element using accelerated video decoding
    • Layer is used by a <canvas> element with a 3D context or accelerated 2D context
    • Layer is used for a composited plugin
  • Ideally shouldn't, but it DOES

    • Layer has a sibling with a lower z-index which has a compositing layer (in other words the layer is rendered on top of a composited layer)
    • Composited descendant may need a composited parent, to correctly propagate transform, preserve-3d, or clipping information in the composited tree.
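One way to picture the decision is as a set of boolean reasons, where any single reason is enough to promote the layer. The sketch below is an assumption-heavy illustration: the `LayerInfo` fields are made-up names mirroring the bullets above, not the members of the real CompositingReasons enum.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class LayerInfo:
    """Hypothetical flags, one per condition listed in these notes."""
    has_3d_transform: bool = False
    animates_opacity_or_transform: bool = False
    uses_accelerated_filters: bool = False
    composited_descendant_needs_clip: bool = False
    is_accelerated_video: bool = False
    is_accelerated_canvas: bool = False
    is_composited_plugin: bool = False
    overlaps_composited_sibling_below: bool = False


def compositing_reasons(info: LayerInfo) -> List[str]:
    """Names of every flag that is set, in declaration order."""
    return [f.name for f in fields(info) if getattr(info, f.name)]


def needs_compositing_layer(info: LayerInfo) -> bool:
    # A single reason is sufficient.
    return bool(compositing_reasons(info))


plain_div = LayerInfo()
spinner = LayerInfo(has_3d_transform=True)
tooltip = LayerInfo(overlaps_composited_sibling_below=True)

print(needs_compositing_layer(plain_div))  # False
print(compositing_reasons(spinner))        # ['has_3d_transform']
print(compositing_reasons(tooltip))        # ['overlaps_composited_sibling_below']
```

The `tooltip` case is the "ideally shouldn't" one: it asked for nothing special, but it has to be composited because it sits on top of something that is.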

##Debug mode in Chrome

###Flags

chrome://flags

  • --force-compositing-mode: pages that don't "require" compositing will still use it

  • --show-composited-layer-borders: visualize borders (and tiles) on composited layers

  • --show-paint-rects: visualize what layers required repainting

  • --show-property-changed-rects: visualize what layers required redrawing without repainting

###Test

Poster Circle: animations disable overlap testing and conservatively composite. Try adding a stacking context that does not overlap anything: it still gets composited!

MapsGL: HTML controls and popups easily overlaid on top of WebGL content.

Android apps page: see composited layers come and go while transition animations are playing. Notice that clipping elements and 3D elements usually become layers.

###Summary

Now we know roughly how to draw a page using the compositor: the page is divided up into layers, layers are rasterized into textures, textures are uploaded to the GPU, and the compositor tells the GPU to put all the textures together into the final screen image.

####Glossary

  • bitmap: a buffer of pixel values in memory (main memory or the GPU’s video RAM)
  • texture: a bitmap meant to be applied to a 3D model on the GPU
  • painting: in our terms, the phase of rendering where RenderObjects make calls into the GraphicsContext API to make a visual representation of themselves
  • rasterization: in our terms, the phase of rendering where the bitmaps backing up RenderLayers are filled. This can occur immediately as GraphicsContext calls are made by the RenderObjects, or it can occur later if we’re using SkPicture record for painting and SkPicture playback for rasterization.
  • compositing: in our terms, the phase of rendering that combines RenderLayers’ textures into a final screen image
  • drawing: in our terms, the phase of rendering that actually puts pixels onto the screen (i.e. puts the final screen image onto the screen).

Using the --show-composited-layer-borders flag will display borders around layers, and uses colors to display information about the layers, or tiles within layers:

  • Green - The border around the outside of a composited layer.
  • Dark Blue - The border around the outside of a "render surface". Surfaces are textures used as intermediate targets while drawing the frame.
  • Purple - The border around the outside of a surface's reflection.
  • Cyan - The border around a tile within a tiled composited layer. Large composited layers are broken up into tiles to avoid using large textures.
  • Red - The border around a composited layer, or a tile within one, for which the texture is not valid or present. Red can indicate a compositor bug, where the texture is lost, but typically indicates the compositor has reached its memory limits, and the red layers/tiles were unable to fit within those limits.

##More Stuff

###Optimize rendering

60FPS

The goal is to do all of this 60 times a second so that animation, scrolling, and other page interactions are smooth.

####Damage

WebKit keeps track of what parts of the screen need to be updated. The result is a damage rectangle whose coordinates indicate the part of the page that needs to be repainted. When painting, we traverse the RenderLayer tree and only repaint the parts of each RenderLayer that intersect with the damage rect, skipping the layer entirely if it doesn’t overlap with the damage rect. This prevents us from having to repaint the entire page every time any part of it changes, an obvious performance win.
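A small sketch of the damage bookkeeping, assuming the damage rect is simply the bounding box of all invalidated rectangles (the notes do not say how the rects are combined) and that rectangles are `(x, y, width, height)` tuples. The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle that contains both a and b."""
    x0 = min(a[0], b[0])
    y0 = min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Overlapping region of a and b, or None if they do not overlap."""
    x0 = max(a[0], b[0])
    y0 = max(a[1], b[1])
    x1 = min(a[0] + a[2], b[0] + b[2])
    y1 = min(a[1] + a[3], b[1] + b[3])
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1 - x0, y1 - y0)


def damage_rect(invalidated: List[Rect]) -> Rect:
    """Accumulate invalidations into one damage rect (needs >= 1 rect)."""
    result = invalidated[0]
    for rect in invalidated[1:]:
        result = union(result, rect)
    return result


def regions_to_repaint(layers: Dict[str, Rect], damage: Rect) -> Dict[str, Rect]:
    """For each layer touching the damage rect, the part to repaint."""
    out: Dict[str, Rect] = {}
    for name, bounds in layers.items():
        clipped = intersection(bounds, damage)
        if clipped is not None:  # layers outside the damage are skipped
            out[name] = clipped
    return out


damage = damage_rect([(10, 10, 20, 20), (50, 40, 10, 10)])
layers = {"header": (0, 0, 800, 30), "footer": (0, 500, 800, 100)}
print(damage)                              # (10, 10, 50, 40)
print(regions_to_repaint(layers, damage))  # {'header': (10, 10, 50, 20)}
```

Only a 50x20 strip of the header is repainted, and the footer is skipped entirely.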

####Tiling

#####Texture-size problem

  • software: bitmap, easy to change a subregion
  • hardware: the bitmap is a texture, which can't be changed partially

#####Solution: only paint and upload what’s currently visible

The texture never needs to be larger than the viewport.

Each layer is split up into tiles (currently of a fixed size, 256x256 pixels).

We determine which parts of the layer are needed on the GPU and only paint and upload those tiles.

Texture streaming: saves the GPU from stalling during a long upload.
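Finding the tiles that cover the viewport is integer division by the tile size. A minimal sketch, assuming 256x256 tiles addressed as `(column, row)` and a viewport given in layer coordinates; `visible_tiles` is a hypothetical helper, not a Chromium function.

```python
from typing import List, Tuple

TILE_SIZE = 256  # fixed tile edge in pixels, per the notes


def visible_tiles(layer_width: int, layer_height: int,
                  viewport: Tuple[int, int, int, int]) -> List[Tuple[int, int]]:
    """(column, row) of every tile that overlaps the viewport."""
    vx, vy, vw, vh = viewport
    # Clamp the viewport to the layer so we never ask for tiles outside it.
    x0, y0 = max(vx, 0), max(vy, 0)
    x1, y1 = min(vx + vw, layer_width), min(vy + vh, layer_height)
    if x1 <= x0 or y1 <= y0:
        return []
    first_col, last_col = x0 // TILE_SIZE, (x1 - 1) // TILE_SIZE
    first_row, last_row = y0 // TILE_SIZE, (y1 - 1) // TILE_SIZE
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 1024x4096 layer has 4 * 16 = 64 tiles in total.
tiles = visible_tiles(1024, 4096, (0, 300, 1024, 600))
print(len(tiles))           # 12
print(tiles[0], tiles[-1])  # (0, 1) (3, 3)
```

With a 1024x600 viewport only 12 of the 64 tiles are painted and uploaded, no matter how tall the layer is.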

####Case Study: Scroll

#####Software: repaint the gap

#####Hardware (highlights the compositor's power)

  • very small scroll amount: the textures are already on the GPU, so only drawing is needed, no painting

  • large scroll amount: only the newly exposed tiles need to be rasterized and uploaded, no other tiles are affected

  • prepainting: paint tiles ahead of time
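The two scroll cases fall out of a set difference between the tiles the new viewport needs and the tiles already uploaded. A sketch under the same assumptions as the tiling notes (256x256 tiles, vertical scrolling only); `rows_in_view` and `tiles_to_upload` are hypothetical names.

```python
from typing import Set, Tuple

TILE_SIZE = 256
Tile = Tuple[int, int]  # (column, row)


def rows_in_view(scroll_y: int, viewport_height: int) -> range:
    """Tile rows overlapped by a viewport scrolled to scroll_y."""
    first = scroll_y // TILE_SIZE
    last = (scroll_y + viewport_height - 1) // TILE_SIZE
    return range(first, last + 1)


def tiles_in_view(columns: int, scroll_y: int, viewport_height: int) -> Set[Tile]:
    return {(col, row)
            for row in rows_in_view(scroll_y, viewport_height)
            for col in range(columns)}


def tiles_to_upload(uploaded: Set[Tile], columns: int,
                    scroll_y: int, viewport_height: int) -> Set[Tile]:
    """Tiles that must be rasterized and uploaded after a scroll."""
    return tiles_in_view(columns, scroll_y, viewport_height) - uploaded


COLUMNS = 4      # a 1024px-wide layer
VIEWPORT_H = 600

uploaded = tiles_in_view(COLUMNS, 0, VIEWPORT_H)  # rows 0..2 at scroll 0

small = tiles_to_upload(uploaded, COLUMNS, 10, VIEWPORT_H)
large = tiles_to_upload(uploaded, COLUMNS, 700, VIEWPORT_H)

print(len(small))                          # 0: just redraw existing textures
print(sorted({row for _, row in large}))   # [3, 4, 5]: only the new rows
```

Prepainting would amount to calling `tiles_in_view` with a viewport taller than the real one, so that the tiles are already in `uploaded` before the scroll reaches them.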

###Threaded Compositor

##Tool

Frame Viewer

##Reference
