@jnewman12
Created December 8, 2016 17:57
Slides for How the Browser works

How The Browser Works


SWBAT

  • Make better decisions when developing for the web, and understand the reasoning behind common development best practices.

Browsers we will be talking about

  • There are many browsers out there, and each behaves a little differently. We are only going to focus on a few of them.
  • There are five major browsers used on desktop today: Chrome, Internet Explorer, Firefox, Safari and Opera.
  • On mobile, the main browsers are Android Browser, iPhone, Opera Mini and Opera Mobile, UC Browser, the Nokia S40/S60 browsers and Chrome. All of these, except the Opera browsers, are based on WebKit.

DNS Lookup

[Figures: eight image slides, "first" through "eighth", stepping through the DNS lookup]


The browser's main functionality

  • The main function of a browser is to present the web resource you choose, by requesting it from the server and displaying it in the browser window.
  • The resource is usually an HTML document, but may also be a PDF, image, or some other type of content.
  • The location of the resource is specified by the user using a URI (Uniform Resource Identifier).
  • The way the browser interprets and displays HTML files is specified in the HTML and CSS specifications.
  • These specifications are maintained by the W3C (World Wide Web Consortium) organization, which is the standards organization for the web.
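
To make the URI bullet concrete, here is a minimal sketch that splits an address into the parts a browser works with. The address itself is a made-up example, not one from these slides.

```python
from urllib.parse import urlsplit

# Hypothetical address, chosen only for illustration.
uri = "https://example.com:8080/docs/index.html?lang=en#intro"
parts = urlsplit(uri)

print(parts.scheme)    # protocol used to request the resource
print(parts.hostname)  # server name, resolved through a DNS lookup
print(parts.port)      # port on that server
print(parts.path)      # which resource to ask the server for
print(parts.query)     # extra parameters sent with the request
print(parts.fragment)  # position inside the document, never sent to the server
```

The scheme and host tell the networking layer where to connect, while the path and query identify the resource to request.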

The browser's high level structure

  • The user interface: this includes the address bar, back/forward buttons, bookmarking menu, etc.
  • The browser engine: marshals actions between the UI and the rendering engine.
  • The rendering engine: responsible for displaying requested content. For example, if the requested content is HTML, the rendering engine parses HTML and CSS, and displays the parsed content on the screen.
  • Networking: for network calls such as HTTP requests, using different implementations for different platforms behind a platform-independent interface.

Browser structure (continued)

  • UI backend: used for drawing basic widgets like combo boxes and windows. This backend exposes a generic interface that is not platform specific. Underneath it uses operating system user interface methods.
  • JavaScript interpreter: used to parse and execute JavaScript code.
  • Data storage: a persistence layer. The browser may need to save all sorts of data locally, such as cookies. Browsers also support storage mechanisms such as localStorage, IndexedDB, WebSQL and FileSystem.

[Figure: browser layers]

  • It is important to note that browsers such as Chrome run multiple instances of the rendering engine: one for each tab.
  • Each tab runs in a separate process.

The rendering engine

  • The responsibility of the rendering engine is, well, rendering: displaying the requested content on the browser screen.
  • By default the rendering engine can display HTML and XML documents and images.

Main Flow

  • The rendering engine will start getting the contents of the requested document from the networking layer.
  • This will usually be done in 8kB chunks.
  • After that, the flow looks like this:

[Figure: browser flow]
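
The chunked hand-off from the networking layer can be sketched as below. This is only an illustration under stated assumptions: an in-memory stream stands in for the network, Python's `html.parser` stands in for the rendering engine's parser, and `TagCounter` and `parse_in_chunks` are hypothetical names.

```python
import io
from html.parser import HTMLParser

CHUNK_SIZE = 8 * 1024  # 8 kB, the chunk size mentioned in the slide


class TagCounter(HTMLParser):
    """Counts start tags as they arrive, without waiting for the whole document."""

    def __init__(self):
        super().__init__()
        self.start_tags = 0

    def handle_starttag(self, tag, attrs):
        self.start_tags += 1


def parse_in_chunks(stream, parser, chunk_size=CHUNK_SIZE):
    """Feed the parser one chunk at a time and return the number of chunks read."""
    chunks = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.feed(chunk)  # the parser buffers any tag cut off at a chunk boundary
        chunks += 1
    parser.close()
    return chunks


# Stand-in for the network. StringIO counts characters rather than bytes,
# which is close enough for this ASCII-only sketch.
document = "<html><body>" + "<p>hello</p>" * 2000 + "</body></html>"
parser = TagCounter()
chunks_read = parse_in_chunks(io.StringIO(document), parser)
print(chunks_read, parser.start_tags)
```

The point is that parsing can begin as soon as the first chunk arrives, rather than after the whole document has been downloaded.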


Main flow (continued)

  • The rendering engine will start parsing the HTML document and convert elements to DOM nodes in a tree called the "content tree".
  • The engine will parse the style data, both in external CSS files and in style elements.
  • Styling information together with visual instructions in the HTML will be used to create another tree: the render tree.

Webkit flow

[Figure: WebKit browser flow]


Gecko flow

[Figure: Gecko browser flow]


HTML Parser

  • The job of the HTML parser is to parse the HTML markup into a parse tree.
  • The vocabulary and syntax of HTML are defined in specifications created by the W3C organization.

The DOM

  • The output tree (the "parse tree") is a tree of DOM element and attribute nodes.
  • DOM is short for Document Object Model.
  • It is the object representation of the HTML document and the interface that HTML elements expose to the outside world, such as JavaScript.
  • The root of the tree is the "Document" object.
  • The DOM has an almost one-to-one relation to the markup.

  • HTML like this:
<html>
  <body>
    <p>
      Hello World
    </p>
    <div> <img src="example.png"/></div>
  </body>
</html>

  • turns into this:

[Figure: DOM tree]
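
As a rough sketch of the markup-to-tree step, the snippet below builds a small tree from the same markup using Python's `html.parser`. The `Node` and `TreeBuilder` classes are hypothetical helpers, not a real DOM implementation; a real HTML5 parser would also insert implied elements such as `head`.

```python
from html.parser import HTMLParser

VOID_ELEMENTS = {"img", "br", "hr", "input", "meta", "link"}  # never have children


class Node:
    def __init__(self, name, attrs=None, text=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.document = Node("#document")      # root of the tree
        self.open_elements = [self.document]   # stack of elements still open

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.open_elements[-1].children.append(node)
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].name == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text between tags
            self.open_elements[-1].children.append(Node("#text", text=data.strip()))


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label = node.text if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


markup = """<html>
  <body>
    <p>
      Hello World
    </p>
    <div> <img src="example.png"/></div>
  </body>
</html>"""

builder = TreeBuilder()
builder.feed(markup)
builder.close()
print("\n".join(render(builder.document)))
```

Each element in the markup ends up as one node, which is the "almost one-to-one" relation described above.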


The parsing algorithm

  • The parsing algorithm is described in detail by the HTML5 specification.
  • The algorithm consists of two stages: tokenization and tree construction.
  • Tokenization is the lexical analysis, parsing the input into tokens.
  • Among HTML tokens are start tags, end tags, attribute names and attribute values.
  • The tokenizer recognizes the token, gives it to the tree constructor, and consumes the next character for recognizing the next token, and so on until the end of the input.

HTML parsing flow (taken from HTML5 spec)

[Figure: HTML parsing flow]
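
The tokenization stage can be sketched as a small state machine. This is a heavily simplified illustration, not the algorithm from the specification: it only knows three states, ignores attributes, comments and character references, and the state names are informal.

```python
def tokenize(html):
    """Return a list of ("start", name), ("end", name) and ("char", c) tokens."""
    tokens = []
    state = "data"
    name = ""
    is_end_tag = False
    for ch in html:
        if state == "data":
            if ch == "<":
                state = "tag_open"           # a tag may be starting
            else:
                tokens.append(("char", ch))  # plain text, one character at a time
        elif state == "tag_open":
            if ch == "/":
                is_end_tag = True            # "</" introduces an end tag
            elif ch.isalpha():
                name = ch.lower()
                state = "tag_name"
            else:
                # Not a tag after all: emit the consumed characters as text.
                consumed = "</" if is_end_tag else "<"
                tokens.extend(("char", c) for c in consumed + ch)
                is_end_tag = False
                state = "data"
        elif state == "tag_name":
            if ch == ">":
                # The token is complete: hand it over and go back to reading text.
                tokens.append(("end" if is_end_tag else "start", name))
                name, is_end_tag, state = "", False, "data"
            else:
                name += ch.lower()
    return tokens


for token in tokenize("<p>Hi</p>"):
    print(token)
```

In a real browser each token is passed to the tree constructor as soon as it is recognized; collecting them in a list here just keeps the sketch short.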


Browsers' error tolerance

  • You never get an "Invalid Syntax" error on an HTML page. Browsers fix any invalid content and go on.
  • Take this HTML for example:
<html>
  <mytag>
  </mytag>
  <div>
  <p>
  </div>
    Really lousy HTML
  </p>
</html>

Malformed HTML Examples

  • A table nested directly inside another table, like this:
<table>
    <table>
        <tr><td>inner table</td></tr>
    </table>
    <tr><td>outer table</td></tr>
</table>

WebKit would treat it as two sibling tables, something like this:

<table>
    <tr><td>outer table</td></tr>
</table>
<table>
    <tr><td>inner table</td></tr>
</table>

CSS Parsing

  • Each CSS file is parsed into a StyleSheet object, and each StyleSheet object contains CSS rules.
  • The CSS rule objects contain selector and declaration objects and other objects corresponding to CSS grammar.

[Figure: CSS parsing]
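
The object structure described above can be sketched like this. It is a naive illustration that splits on braces and semicolons; real engines use a grammar-driven parser and also handle comments, at-rules and much more. The class names are hypothetical and only mirror the terms used in the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Declaration:
    prop: str
    value: str


@dataclass
class Rule:
    selector: str
    declarations: list


@dataclass
class StyleSheet:
    rules: list = field(default_factory=list)


def parse_css(text):
    """Parse 'selector {prop:value;...}' blocks into a StyleSheet object."""
    sheet = StyleSheet()
    for block in text.split("}"):
        if "{" not in block:
            continue  # trailing whitespace after the last rule
        selector, body = block.split("{", 1)
        declarations = []
        for item in body.split(";"):
            if ":" in item:
                prop, value = item.split(":", 1)
                declarations.append(Declaration(prop.strip(), value.strip()))
        sheet.rules.append(Rule(selector.strip(), declarations))
    return sheet


sheet = parse_css("div {margin:5px;color:black}\n.err {color:red}")
for rule in sheet.rules:
    print(rule.selector, [(d.prop, d.value) for d in rule.declarations])
```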


The order of processing scripts and style sheets

  • Authors expect scripts to be parsed and executed immediately when the parser reaches a <script> tag. Parsing of the document halts until the script has been executed.
  • Authors can add the "defer" attribute to a script, in which case it will not halt document parsing and will execute after the document is parsed.
  • HTML5 adds an option to mark the script as asynchronous ("async"), so it is fetched without blocking the parser and executed as soon as it is available.

Style sheets (continued)

  • Style sheets on the other hand have a different model.
  • Firefox blocks all scripts when there is a style sheet that is still being loaded and parsed.
  • WebKit blocks scripts only when they try to access certain style properties that may be affected by unloaded style sheets.

Render tree construction

  • While the DOM tree is being constructed, the browser constructs another tree, the render tree.
  • This tree is of visual elements in the order in which they will be displayed. It is the visual representation of the document.
  • The purpose of this tree is to enable painting the contents in their correct order.

Style computation

  • Building the render tree requires calculating the visual properties of each render object.
  • This is done by calculating the style properties of each element.
  • The style includes style sheets of various origins, inline style elements and visual properties in the HTML (like the "bgcolor" attribute).
  • The latter are translated to matching CSS style properties.

Style computation brings up a few difficulties:

  • Style data is a very large construct holding numerous style properties, which can cause memory problems.
  • Finding the matching rules for each element can cause performance issues if it's not optimized. Traversing the entire rule list for each element to find matches is a heavy task.
  • Applying the rules involves quite complex cascade rules that define the hierarchy of the rules.

An example

  • Suppose we have this HTML
<html>
  <body>
    <div class="err" id="div1">
      <p>
        this is a <span class="big"> big error </span>
        this is also a
        <span class="big"> very  big  error</span> error
      </p>
    </div>
    <div class="err" id="div2">another error</div>
  </body>
</html>

An example continued

  • and these styles
div {margin:5px;color:black}
.err {color:red}
.big {margin-top:3px}
div span {margin-bottom:4px}
#div1 {color:blue}
#div2 {color:green}
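
To see which of these rules match which elements, here is a minimal sketch of the naive approach: test every rule against every element. It only supports tag, class, id and descendant selectors, and the `Element` class and function names are hypothetical. The rules are numbered 1 to 6 in the order they appear above.

```python
class Element:
    def __init__(self, tag, elem_id=None, classes=(), parent=None):
        self.tag = tag
        self.elem_id = elem_id
        self.classes = set(classes)
        self.parent = parent


def matches_simple(el, simple):
    """Match a single tag, .class or #id selector against one element."""
    if simple.startswith("#"):
        return el.elem_id == simple[1:]
    if simple.startswith("."):
        return simple[1:] in el.classes
    return el.tag == simple


def matches(el, selector):
    """Match a selector such as 'div span', working from right to left."""
    parts = selector.split()
    if not matches_simple(el, parts[-1]):
        return False
    ancestor = el.parent
    for simple in reversed(parts[:-1]):
        # Walk up the tree until some ancestor matches this part.
        while ancestor is not None and not matches_simple(ancestor, simple):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


SELECTORS = ["div", ".err", ".big", "div span", "#div1", "#div2"]  # rules 1..6


def matched_rules(el):
    """Return the numbers of all rules whose selector matches the element."""
    return [n for n, sel in enumerate(SELECTORS, start=1) if matches(el, sel)]


# The interesting part of the example document (html and body omitted).
div1 = Element("div", elem_id="div1", classes=["err"])
p = Element("p", parent=div1)
span = Element("span", classes=["big"], parent=p)
div2 = Element("div", elem_id="div2", classes=["err"])

print(matched_rules(div1), matched_rules(p), matched_rules(span), matched_rules(div2))
```

Checking every rule for every element is exactly the heavy traversal mentioned earlier, which is why real engines index and share this work.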

The result

  • The resulting rule tree will look like this (the nodes are marked with the node name: the number of the rule they point at):

[Figure: rule tree]


Applying these rules

  • The style object has properties corresponding to every visual attribute (all CSS attributes but more generic).
  • If a property is not defined by any of the matched rules, some properties can be inherited from the parent element's style object.
  • Other properties have default values.
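
The lookup order described above (matched rules, then inheritance, then defaults) can be sketched as follows. The property list, the set of inherited properties and the default values are a tiny illustrative subset chosen for this example; actual defaults depend on the property and the browser.

```python
# Assumed subset for illustration only.
INHERITED = {"color", "font-size"}
DEFAULTS = {"color": "black", "font-size": "16px", "margin-top": "0"}


def computed_value(prop, matched, parent_style):
    """Resolve one property: matched rules first, then inheritance, then default."""
    if prop in matched:
        return matched[prop]
    if prop in INHERITED and parent_style is not None:
        return parent_style[prop]
    return DEFAULTS[prop]


def compute_style(matched, parent_style=None):
    """Build a style object with a value for every known property."""
    return {prop: computed_value(prop, matched, parent_style) for prop in DEFAULTS}


parent = compute_style({"color": "red", "margin-top": "5px"})
child = compute_style({}, parent_style=parent)
print(parent)
print(child)
```

The child has no matched rules, so it takes the parent's color (inherited) but falls back to the default margin (not inherited).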

StyleSheet Order

  • A declaration for a style property can appear in several style sheets, and several times inside a style sheet. This means the order of applying the rules is very important.
  • This is called the "cascade" order. According to the CSS2 spec, the cascade order is (from low to high):
    • Browser declarations
    • User normal declarations
    • Author normal declarations
    • Author important declarations
    • User important declarations
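
A minimal sketch of picking a winner with this order is shown below. The declaration dictionaries are a hypothetical representation. Ties are broken here by source order only; real CSS also compares selector specificity before falling back to source order.

```python
# Cascade order from the list above, lowest priority first.
CASCADE_ORDER = [
    ("browser", False),
    ("user", False),
    ("author", False),
    ("author", True),
    ("user", True),
]


def cascade_rank(declaration):
    """Position of a declaration's origin/importance pair in the cascade order."""
    return CASCADE_ORDER.index((declaration["origin"], declaration["important"]))


def winning_declaration(declarations):
    """Return the declaration that wins the cascade.

    sorted() is stable, so among equally ranked declarations the one that
    appears later in the input stays last and therefore wins.
    """
    return sorted(declarations, key=cascade_rank)[-1]


color_declarations = [
    {"origin": "browser", "important": False, "value": "black"},
    {"origin": "author", "important": False, "value": "red"},
    {"origin": "user", "important": True, "value": "blue"},
    {"origin": "author", "important": True, "value": "green"},
]

print(winning_declaration(color_declarations)["value"])
```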

The Layout Process

  • The layout usually has the following pattern:
  • The parent renderer determines its own width.
  • The parent goes over its children and:
    • Places the child renderer (sets its x and y).
    • Calls the child's layout if needed (the child is dirty, we are in a global layout, or for some other reason), which calculates the child's height.
  • The parent uses the children's cumulative heights, plus the heights of margins and padding, to set its own height. This height is then used by the parent renderer's parent.
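
The pattern above can be sketched as a recursive function. This is a simplified illustration under several assumptions: the `Renderer` class is hypothetical, margins and padding are the same on all sides, margins do not collapse, every child is laid out every time (no dirty check), and positions are relative to the parent.

```python
class Renderer:
    def __init__(self, children=(), content_height=0, margin=0, padding=0):
        self.children = list(children)
        self.content_height = content_height  # intrinsic height, used by leaves
        self.margin = margin
        self.padding = padding
        self.x = self.y = self.width = self.height = 0

    def layout(self, available_width):
        # 1. The parent determines its own width.
        self.width = available_width - 2 * self.margin
        inner_width = self.width - 2 * self.padding

        # 2. The parent places each child and asks it to lay itself out,
        #    which calculates the child's height.
        cursor_y = self.padding
        for child in self.children:
            child.x = self.padding + child.margin
            child.y = cursor_y + child.margin
            child.layout(inner_width)
            cursor_y += child.height + 2 * child.margin

        # 3. The parent's height is the accumulated child heights plus
        #    margins and padding (plus its own content, for leaves).
        self.height = cursor_y + self.content_height + self.padding


first = Renderer(content_height=20, margin=5)
second = Renderer(content_height=30, margin=5)
root = Renderer(children=[first, second], padding=10)
root.layout(200)

print(root.width, root.height)
print(first.x, first.y, first.width, first.height)
print(second.x, second.y, second.width, second.height)
```

Widths flow down from parent to child, while heights flow back up from child to parent.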

Width Calculation

  • The renderer's width is calculated using the container block's width, the renderer's style "width" property, the margins and borders.
  • Something like this:
<div style="width: 30%"></div>
  • Would be calculated by WebKit as follows (class RenderBox, method calcWidth):

  • The container width is the maximum of the container's availableWidth and 0.
  • The availableWidth in this case is the contentWidth which is calculated as: clientWidth() - paddingLeft() - paddingRight()
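
As a worked example of the arithmetic, the sketch below resolves a percentage width against the container width. The function names and the numbers are hypothetical and are not WebKit's API; the sketch also stops before borders and padding of the element itself are added.

```python
def container_width(client_width, padding_left, padding_right):
    """Content width of the container, never less than zero."""
    available_width = client_width - padding_left - padding_right
    return max(available_width, 0)


def resolve_percentage_width(percent, client_width, padding_left, padding_right):
    """Turn a percentage 'width' into pixels relative to the container width."""
    return container_width(client_width, padding_left, padding_right) * percent / 100


# Hypothetical container: 1000px wide with 20px of padding on each side.
print(container_width(1000, 20, 20))
print(resolve_percentage_width(30, 1000, 20, 20))
```

With these numbers the container offers 960px, so a width of 30% resolves to 288px.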

Painting

  • In the painting stage, the render tree is traversed and the renderer's "paint()" method is called to display content on the screen.
  • Painting uses the UI infrastructure component.
  • CSS2 defines the order of the painting process. This is actually the order in which the elements are stacked in the stacking contexts.
  • This order affects painting since the stacks are painted from back to front.

Painting stack order

  • The stacking order of a block renderer is:
    • background color
    • background image
    • border
    • children
    • outline

CSS Box Model

  • The CSS box model describes the rectangular boxes that are generated for elements in the document tree and laid out according to the visual formatting model.
  • Each box has a content area (e.g. text, an image, etc.) and optional surrounding padding, border, and margin areas.

[Figure: box model]
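
The nesting of the content, padding, border and margin areas can be sketched with a little arithmetic. The `Box` class is a hypothetical helper that assumes the same padding, border and margin on all four sides.

```python
from dataclasses import dataclass


@dataclass
class Box:
    content_width: float
    content_height: float
    padding: float = 0
    border: float = 0
    margin: float = 0

    def border_box_size(self):
        """Size of content plus padding plus border."""
        extra = 2 * (self.padding + self.border)
        return (self.content_width + extra, self.content_height + extra)

    def margin_box_size(self):
        """Total space the box occupies, including its margin."""
        width, height = self.border_box_size()
        return (width + 2 * self.margin, height + 2 * self.margin)


# Hypothetical values, chosen only for illustration.
box = Box(content_width=100, content_height=50, padding=10, border=2, margin=5)
print(box.border_box_size())
print(box.margin_box_size())
```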


  • Each node generates 0..n such boxes.
  • All elements have a "display" property that determines the type of box that will be generated. Examples:

    • block: generates a block box.
    • inline: generates one or more inline boxes.
    • none: no box is generated.

The End

