- Make better decisions and know the justifications behind best practices when developing for the web.
- There are many browsers out there, and each behaves in its own way. We are only going to focus on a couple of them.
- There are five major browsers used on desktop today: Chrome, Internet Explorer, Firefox, Safari and Opera.
- On mobile, the main browsers are Android Browser, iPhone, Opera Mini and Opera Mobile, UC Browser, the Nokia S40/S60 browsers and Chrome, all of which, except for the Opera browsers, are based on WebKit.
- The main function of a browser is to present the web resource you choose, by requesting it from the server and displaying it in the browser window.
- The resource is usually an HTML document, but may also be a PDF, image, or some other type of content.
- The location of the resource is specified by the user using a URI (Uniform Resource Identifier).
- The way the browser interprets and displays HTML files is specified in the HTML and CSS specifications.
- These specifications are maintained by the W3C (World Wide Web Consortium) organization, which is the standards organization for the web.
- The user interface: this includes the address bar, back/forward button, bookmarking menu, etc.
- The browser engine: marshals actions between the UI and the rendering engine
- The rendering engine: responsible for displaying requested content. For example, if the requested content is HTML, the rendering engine parses HTML and CSS, and displays the parsed content on the screen.
- Networking: for network calls such as HTTP requests, using different implementations for different platforms behind a platform-independent interface.
- UI backend: used for drawing basic widgets like combo boxes and windows. This backend exposes a generic interface that is not platform specific. Underneath it uses operating system user interface methods.
- JavaScript interpreter: used to parse and execute JavaScript code.
- Data storage: a persistence layer. The browser may need to save all sorts of data locally, such as cookies. Browsers also support storage mechanisms such as localStorage, IndexedDB, WebSQL and FileSystem.
- It is important to note that browsers such as Chrome run multiple instances of the rendering engine: one for each tab.
- Each tab runs in a separate process.
- The responsibility of the rendering engine is, well, rendering: displaying the requested content on the browser screen.
- By default the rendering engine can display HTML and XML documents and images.
- The rendering engine will start getting the contents of the requested document from the networking layer.
- This will usually be done in 8kB chunks.
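To make the chunked hand-off concrete, here is a minimal sketch of reading a document in 8 kB pieces. The in-memory stream and the 20,000-byte payload are made up for illustration; a real engine receives its chunks from the networking layer, not from a file-like object.

```python
import io

CHUNK_SIZE = 8 * 1024  # 8 kB, the chunk size mentioned in the notes


def read_in_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Hypothetical 20,000-byte document standing in for a network response.
document = io.BytesIO(b"x" * 20000)
sizes = [len(chunk) for chunk in read_in_chunks(document)]
print(sizes)  # [8192, 8192, 3616]
```

The last chunk is simply whatever is left over, so the consumer has to cope with chunks of any size.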
- After that, the flow looks like this:
- The rendering engine will start parsing the HTML document and convert elements to DOM nodes in a tree called the "content tree".
- The engine will parse the style data, both in external CSS files and in style elements.
- Styling information together with visual instructions in the HTML will be used to create another tree: the render tree.
- The job of the HTML parser is to parse the HTML markup into a parse tree.
- The vocabulary and syntax of HTML are defined in specifications created by the W3C organization.
- The output tree (the "parse tree") is a tree of DOM element and attribute nodes.
- DOM is short for Document Object Model.
- It is the object representation of the HTML document and the interface that exposes HTML elements to the outside world, such as JavaScript.
- The root of the tree is the "Document" object.
- The DOM has an almost one-to-one relation to the markup.
- HTML like this:
<html>
<body>
<p>
Hello World
</p>
<div> <img src="example.png"/></div>
</body>
</html>
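- turns into a DOM tree in which the html element contains body, which in turn contains a p element (holding the text "Hello World") and a div element containing an img element.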
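The near one-to-one mapping from markup to tree can be sketched with Python's standard html.parser module. This is only an illustration: the Node and TreeBuilder classes are hypothetical helpers, and the parser is far simpler than a browser's HTML parser.

```python
from html.parser import HTMLParser

# Elements that never have children, so they are not pushed on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simple tree of element and text nodes."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <img ... /> produces a leaf node.
        self.stack[-1].children.append(Node(tag))

    def handle_endtag(self, tag):
        if len(self.stack) > 1 and self.stack[-1].name == tag:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():  # ignore whitespace-only text between tags
            self.stack[-1].children.append(Node("#text", data.strip()))


def dump(node, depth=0):
    """Return the tree as a list of indented lines."""
    label = node.name if node.text is None else f'{node.name} "{node.text}"'
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


markup = """<html>
<body>
<p>
Hello World
</p>
<div> <img src="example.png"/></div>
</body>
</html>"""

builder = TreeBuilder()
builder.feed(markup)
builder.close()
tree_lines = dump(builder.root)
print("\n".join(tree_lines))
```

Each level of indentation in the printed output corresponds to one level of nesting in the markup, with the document node at the root.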
- The parsing algorithm is described in detail by the HTML5 specification.
- The algorithm consists of two stages: tokenization and tree construction
- Tokenization is the lexical analysis, parsing the input into tokens.
- Among HTML tokens are start tags, end tags, attribute names and attribute values.
- The tokenizer recognizes the token, gives it to the tree constructor, and consumes the next character for recognizing the next token, and so on until the end of the input.
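As a rough illustration of tokenization, the sketch below splits markup into start tags (with their attribute names and values), end tags and text. It uses regular expressions for brevity, which is an assumption made here; the algorithm in the specification is a character-by-character state machine and handles many cases this sketch ignores.

```python
import re

# A tag (optionally closing) or a run of text up to the next "<".
TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>|([^<]+)")
# Only double-quoted attribute values are recognized in this sketch.
ATTR_RE = re.compile(r'([a-zA-Z-]+)\s*=\s*"([^"]*)"')


def tokenize(markup):
    """Yield tokens one at a time, as a tree constructor would receive them."""
    for match in TOKEN_RE.finditer(markup):
        closing, name, rest, text = match.groups()
        if text is not None:
            if text.strip():
                yield ("text", text.strip())
        elif closing:
            yield ("end_tag", name)
        else:
            yield ("start_tag", name, dict(ATTR_RE.findall(rest)))


tokens = list(tokenize('<p class="a">Hi</p>'))
print(tokens)
# [('start_tag', 'p', {'class': 'a'}), ('text', 'Hi'), ('end_tag', 'p')]
```

Because tokenize is a generator, each token is handed over before the next one is recognized, which mirrors the hand-off between tokenizer and tree constructor described above.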
- You never get an "Invalid Syntax" error on an HTML page. Browsers fix any invalid content and go on.
- Take this HTML for example:
<html>
<mytag>
</mytag>
<div>
<p>
</div>
Really lousy HTML
</p>
</html>
- Error handling also covers nested tables. A table looking like this:
<table>
<table>
<tr><td>inner table</td></tr>
</table>
<tr><td>outer table</td></tr>
</table>
- is restructured by the browser's error handling (WebKit, for example) into two sibling tables:
<table>
<tr><td>outer table</td></tr>
</table>
<table>
<tr><td>inner table</td></tr>
</table>
- Each CSS file is parsed into a StyleSheet object, and each StyleSheet object contains CSS rules.
- The CSS rule objects contain selector and declaration objects and other objects corresponding to CSS grammar.
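The object structure described above might look roughly like the following sketch. The class names Declaration, Rule and StyleSheet and the parse_css function are illustrative, not an engine's actual types, and the parser only understands plain "selector { property: value }" rules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Declaration:
    prop: str
    value: str


@dataclass
class Rule:
    selectors: list
    declarations: list


@dataclass
class StyleSheet:
    rules: list = field(default_factory=list)


# One rule: everything before "{" is the selector list, the rest is the body.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_css(text):
    """Parse a very small subset of CSS into a StyleSheet object."""
    sheet = StyleSheet()
    for selector_text, body in RULE_RE.findall(text):
        selectors = [s.strip() for s in selector_text.split(",") if s.strip()]
        declarations = []
        for part in body.split(";"):
            if ":" in part:
                prop, value = part.split(":", 1)
                declarations.append(Declaration(prop.strip(), value.strip()))
        sheet.rules.append(Rule(selectors, declarations))
    return sheet


sheet = parse_css("div {margin:5px;color:black} .err, .big {color:red}")
for rule in sheet.rules:
    print(rule.selectors, [(d.prop, d.value) for d in rule.declarations])
```

Comments, at-rules and nested blocks are deliberately left out to keep the structure visible.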
- Authors expect scripts to be parsed and executed immediately when the parser reaches a <script> tag.
- Authors can add the "defer" attribute to a script, in which case it will not halt document parsing and will execute after the document is parsed.
- HTML5 adds an option to mark the script as asynchronous (the "async" attribute) so it will be parsed and executed by a different thread.
- Style sheets on the other hand have a different model.
- Firefox blocks all scripts when there is a style sheet that is still being loaded and parsed.
- WebKit blocks scripts only when they try to access certain style properties that may be affected by unloaded style sheets
- While the DOM tree is being constructed, the browser constructs another tree, the render tree.
- This tree consists of visual elements in the order in which they will be displayed. It is the visual representation of the document.
- The purpose of this tree is to enable painting the contents in their correct order.
- Building the render tree requires calculating the visual properties of each render object.
- This is done by calculating the style properties of each element.
- The style includes style sheets of various origins, inline style elements and visual properties in the HTML (like the "bgcolor" property).
- The latter is translated to matching CSS style properties.
- Style data is a very large construct holding numerous style properties, which can cause memory problems.
- Finding the matching rules for each element can cause performance issues if it's not optimized. Traversing the entire rule list for each element to find matches is a heavy task.
- Applying the rules involves quite complex cascade rules that define the hierarchy of the rules.
- Suppose we have this HTML
<html>
<body>
<div class="err" id="div1">
<p>
this is a <span class="big"> big error </span>
this is also a
<span class="big"> very big error</span> error
</p>
</div>
<div class="err" id="div2">another error</div>
</body>
</html>
- and these styles
div {margin:5px;color:black}
.err {color:red}
.big {margin-top:3px}
div span {margin-bottom:4px}
#div1 {color:blue}
#div2 {color:green}
- The resulting rule tree (figure not reproduced here) marks each node with the node name and the number of the rule it points at.
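Before any such tree can be built, the engine has to know which of the numbered rules match each element. The sketch below works that out for the example above, assuming a hypothetical Element class and a matcher that only understands tag, class and id selectors plus the descendant combinator. It traverses the whole rule list for each element, which is exactly the naive approach the notes warn about.

```python
class Element:
    def __init__(self, tag, classes=(), element_id=None, parent=None):
        self.tag = tag
        self.classes = set(classes)
        self.element_id = element_id
        self.parent = parent


# Selectors of the six rules above, in source order (rule 1 is "div").
SELECTORS = ["div", ".err", ".big", "div span", "#div1", "#div2"]


def matches_simple(element, selector):
    """Match a single tag, .class or #id selector against one element."""
    if selector.startswith("#"):
        return element.element_id == selector[1:]
    if selector.startswith("."):
        return selector[1:] in element.classes
    return element.tag == selector


def matches(element, selector):
    """Match a selector made of simple selectors joined by descendant combinators."""
    parts = selector.split()
    if not matches_simple(element, parts[-1]):
        return False
    ancestor = element.parent
    for part in reversed(parts[:-1]):
        # Walk up until some ancestor satisfies this part of the selector.
        while ancestor is not None and not matches_simple(ancestor, part):
            ancestor = ancestor.parent
        if ancestor is None:
            return False
        ancestor = ancestor.parent
    return True


def matching_rules(element):
    """Return the 1-based numbers of all rules whose selector matches."""
    return [n for n, sel in enumerate(SELECTORS, start=1) if matches(element, sel)]


div1 = Element("div", classes=["err"], element_id="div1")
p = Element("p", parent=div1)
span = Element("span", classes=["big"], parent=p)
div2 = Element("div", classes=["err"], element_id="div2")

print(matching_rules(div1))  # [1, 2, 5]
print(matching_rules(span))  # [3, 4]
print(matching_rules(div2))  # [1, 2, 6]
```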
- The style object has properties corresponding to every visual attribute (all CSS attributes but more generic).
- If a property is not defined by any of the matched rules, then some properties can be inherited from the parent element's style object.
- Other properties have default values.
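The lookup order described above (declared value, then inherited value, then default) can be sketched as follows. The INHERITED set and DEFAULTS table are small illustrative stand-ins, not the real lists from the CSS specification, and StyledElement is a hypothetical class.

```python
# Illustrative subsets only; the values here are assumptions for the example.
INHERITED = {"color", "font-size"}
DEFAULTS = {"color": "black", "font-size": "16px", "margin": "0"}


class StyledElement:
    def __init__(self, declared=None, parent=None):
        self.declared = declared or {}  # values set by matched rules
        self.parent = parent


def computed_value(element, prop):
    """Resolve a property: declared value, then inheritance, then default."""
    if prop in element.declared:
        return element.declared[prop]
    if prop in INHERITED and element.parent is not None:
        return computed_value(element.parent, prop)
    return DEFAULTS[prop]


div = StyledElement({"color": "red", "margin": "5px"})
span = StyledElement(parent=div)

print(computed_value(span, "color"))   # red (inherited from the div)
print(computed_value(span, "margin"))  # 0 (margin is not inherited)
```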
- A declaration for a style property can appear in several style sheets, and several times inside a style sheet. This means the order of applying the rules is very important.
- This is called the "cascade" order. According to the CSS2 spec, the cascade order is (from low to high):
- Browser declarations
- User normal declarations
- Author normal declarations
- Author important declarations
- User important declarations
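The five levels above can be turned into a small ranking function. In this sketch, declarations are hypothetical (origin, important, value) tuples for a single property, and ties within a level are broken by source order only; specificity is left out to keep the example short.

```python
# Cascade levels from lowest to highest priority, as listed above.
CASCADE_LEVELS = [
    ("browser", False),
    ("user", False),
    ("author", False),
    ("author", True),
    ("user", True),
]


def cascade_rank(origin, important):
    """Return the position of a declaration in the cascade (higher wins)."""
    return CASCADE_LEVELS.index((origin, important))


def winning_value(declarations):
    """Pick the winning value from (origin, important, value) tuples.

    The tuples are given in source order; a later declaration wins a tie.
    """
    index, winner = max(
        enumerate(declarations),
        key=lambda item: (cascade_rank(item[1][0], item[1][1]), item[0]),
    )
    return winner[2]


color_declarations = [
    ("author", False, "red"),
    ("user", True, "blue"),
    ("author", True, "green"),
    ("browser", False, "black"),
]
print(winning_value(color_declarations))  # blue
```

The user's important declaration wins even though the author's important declaration comes later, because origin and importance are compared before source order.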
- The layout usually has the following pattern:
- Parent renderer determines its own width.
- Parent goes over children and:
- Places the child renderer (sets its x and y).
- Calls the child's layout if needed (the child is dirty, we are in a global layout, or for some other reason), which calculates the child's height.
- Parent uses the children's accumulated heights and the heights of margins and padding to set its own height, which will in turn be used by the parent renderer's parent.
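The pattern above is naturally recursive. Here is a minimal sketch with a hypothetical Renderer class that stacks children vertically; it handles padding only and ignores margins, borders, dirty bits and inline content, so it is a simplification of the steps rather than how any engine implements them.

```python
class Renderer:
    def __init__(self, children=(), leaf_height=0, padding=0):
        self.children = list(children)
        self.leaf_height = leaf_height  # intrinsic height when there are no children
        self.padding = padding
        self.x = self.y = self.width = self.height = 0

    def layout(self, available_width):
        # 1. The renderer determines its own width first.
        self.width = available_width
        if not self.children:
            self.height = self.leaf_height
            return
        cursor = self.padding
        for child in self.children:
            # 2a. Place the child (x and y relative to this renderer).
            child.x = self.padding
            child.y = cursor
            # 2b. Lay out the child, which calculates its height.
            child.layout(self.width - 2 * self.padding)
            cursor += child.height
        # 3. Own height is the accumulated child heights plus padding.
        self.height = cursor + self.padding


grandchild = Renderer(leaf_height=30)
first = Renderer(leaf_height=20)
second = Renderer(children=[grandchild], padding=5)
root = Renderer(children=[first, second], padding=10)
root.layout(800)

print(root.height, second.y, second.height, grandchild.width)  # 80 30 40 770
```

Width flows down the tree (each child gets the parent's width minus padding) while height flows back up, which is why a parent cannot know its height until its children have been laid out.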
- The renderer's width is calculated using the container block's width, the renderer's style "width" property, the margins and borders.
- Something like this:
<div style="width: 30%"></div>
- Would be calculated by WebKit as follows (class RenderBox, method calcWidth):
- The container width is the maximum of the container's availableWidth and 0.
- The availableWidth in this case is the contentWidth which is calculated as: clientWidth() - paddingLeft() - paddingRight()
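As a worked example of the steps just listed, the sketch below resolves a percentage width against the container. The function names and the sample numbers are made up for illustration, and this is not WebKit's code: the real calculation also takes margins, borders and other constraints into account.

```python
def content_width(client_width, padding_left, padding_right):
    """contentWidth = clientWidth() - paddingLeft() - paddingRight()."""
    return client_width - padding_left - padding_right


def resolve_percentage_width(percent, client_width, padding_left=0, padding_right=0):
    """Resolve a percentage 'width' against the container's available width."""
    available_width = content_width(client_width, padding_left, padding_right)
    # The container width is never allowed to go below zero.
    container_width = max(available_width, 0)
    return container_width * percent / 100


# Hypothetical container: 1000px wide with 20px padding on each side.
width = resolve_percentage_width(30, client_width=1000, padding_left=20, padding_right=20)
print(width)  # 288.0
```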
- In the painting stage, the render tree is traversed and the renderer's "paint()" method is called to display content on the screen.
- Painting uses the UI infrastructure component.
- CSS2 defines the order of the painting process. This is actually the order in which the elements are stacked in the stacking contexts.
- This order affects painting since the stacks are painted from back to front.
- The stacking order of a block renderer is:
- background color
- background image
- border
- children
- outline
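The back-to-front order of a block renderer can be sketched as a recursive traversal that records what would be drawn. Renderers are plain dictionaries here, which is an assumption for illustration; a real engine calls each renderer's paint() method and deals with stacking contexts, which this sketch ignores.

```python
def paint(renderer, ops=None):
    """Record paint operations for a block renderer, back to front."""
    if ops is None:
        ops = []
    name = renderer["name"]
    ops.append(f"{name}: background color")
    ops.append(f"{name}: background image")
    ops.append(f"{name}: border")
    for child in renderer.get("children", []):
        paint(child, ops)  # children are painted on top of the border
    ops.append(f"{name}: outline")
    return ops


render_tree = {"name": "div", "children": [{"name": "p"}]}
paint_ops = paint(render_tree)
for op in paint_ops:
    print(op)
```

Note that the parent's outline is recorded last, after all of its children have been painted.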
- The CSS box model describes the rectangular boxes that are generated for elements in the document tree and laid out according to the visual formatting model.
- Each box has a content area (e.g. text, an image, etc.) and optional surrounding padding, border, and margin areas.
- Each node generates 0..n such boxes.
- All elements have a "display" property that determines the type of box that will be generated. Examples:
- block: generates a block box.
- inline: generates one or more inline boxes.
- none: no box is generated.
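The mapping from "display" to generated boxes can be sketched as a small function. The line_fragments parameter is a hypothetical stand-in for the number of lines an inline element is split across, and only the three values listed above are handled.

```python
def generate_boxes(display, line_fragments=1):
    """Return the boxes generated for an element with the given display value."""
    if display == "none":
        return []  # the element generates no box at all
    if display == "block":
        return ["block box"]
    if display == "inline":
        # An inline element may be split into several boxes, one per line.
        return ["inline box"] * max(line_fragments, 1)
    raise ValueError(f"unsupported display value: {display}")


print(generate_boxes("block"))                     # ['block box']
print(generate_boxes("inline", line_fragments=2))  # ['inline box', 'inline box']
print(generate_boxes("none"))                      # []
```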