Explaining the principles of browser rendering in a simple way

Foreword

The browser kernel is the core program that makes a browser run. It has two parts: the rendering engine and the JS engine. Rendering engines differ between browsers. The common kernels today fall into four families: Trident (IE), Gecko (Firefox), Blink (Chrome, Opera), and WebKit (Safari). The most familiar of these is probably WebKit, and since Blink was forked from WebKit, the WebKit family dominates the current browser world.
In this article, we take WebKit as an example and analyze the rendering process of modern browsers in depth.

Page load process

Before looking at rendering, here is a brief overview of how a page is loaded. It makes the later rendering steps easier to follow.

The main points are as follows:

The browser obtains the IP address for the domain name from a DNS server
The browser sends an HTTP request to the machine at that IP
The server receives and processes the HTTP request and returns a response
The browser receives the returned content

For example, enter https://rendc.com/forum in the browser. DNS resolution maps rendc.com to the IP 45.130.228.232 (the IP may differ by time and place), and the browser then sends an HTTP request to that IP.
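As a rough illustration of the request step, the sketch below builds the text of a minimal HTTP/1.1 GET request for this URL. The function name buildHttpRequest and the chosen headers are assumptions made for this example. A real browser sends many more headers and, for an https URL, wraps the exchange in TLS.

```javascript
// Hypothetical helper: builds the text of a minimal HTTP/1.1 GET request.
// Real browsers add many more headers; this is only an illustration.
function buildHttpRequest(urlString) {
  const url = new URL(urlString); // URL parser built into browsers and Node.js
  const path = url.pathname + url.search;
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${url.host}`,
    'Accept: text/html',
    '', // the blank line ends the header section
    '',
  ].join('\r\n');
}

const requestText = buildHttpRequest('https://rendc.com/forum');
console.log(requestText);
```

The Host header carries the domain name because the connection itself is made to the IP address, and one machine may serve several sites.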

The server receives the HTTP request, processes it (it may serve different content to different users), and returns an HTTP response. The returned content is as follows:

In fact, it is just a string in HTML format. Browsers can parse it correctly only because it follows the HTML format defined by the W3C standard. The next step is the browser's rendering process.

Browser rendering process

The browser rendering process is roughly divided into the following three parts:

1) The browser parses three things:

The first is HTML/SVG/XHTML. The HTML string describes the structure of a page, and the browser parses it into a DOM tree.
The second is CSS. Parsing CSS produces a CSS rule tree, which is similar in structure to the DOM.
The third is JavaScript. Once a script file is loaded, it manipulates the DOM tree and the CSS rule tree through the DOM API and the CSSOM API.

2) After parsing, the browser engine constructs the rendering tree from the DOM tree and the CSS rule tree.

The rendering tree is not the same as the DOM tree: it includes only the nodes that need to be displayed and their style information.
The CSS rule tree is used for matching: each CSS rule is attached to the matching element (that is, each Frame) on the rendering tree.
Then the position of each Frame is calculated. This step is called layout, or reflow.

3) Finally, the browser draws the page by calling the native GUI API of the operating system.

Next, we elaborate on the important steps above.

Build the DOM

Browsers follow a set of steps to convert an HTML file into a DOM tree. At a high level, there are three steps:

The browser reads the raw bytes of HTML from disk or the network and converts them into a string according to the file's specified encoding (such as UTF-8).
Data travels over the network as bytes. When the browser receives them, it decodes them into a string, which is the code we wrote.
The string is converted into Tokens, such as <html> and <body>. Each Token records whether it is a "start tag", an "end tag", or "text", along with other information.
You may wonder at this point how the relationship between nodes is maintained.
That is exactly what marking Tokens as "start tag" and "end tag" is for: whatever lies between a start tag and its matching end tag is a child of that element. For example, the "title" Token appears between the "head" start tag and the "head" end tag, so the "title" node must be a child node of "head".
The above figure shows the relationship between nodes. For example, the "Hello" Token is located between the "title" start tag and the "title" end tag, which indicates that "Hello" is a child node of "title". Similarly, the "title" Token is a child node of the "head" Token.
Node objects are generated and the DOM is built.
In fact, the browser does not wait for all Tokens to be produced before generating node objects. It consumes Tokens while they are being generated: as soon as a Token is produced, it is consumed to create a node object. Note: Tokens marked as end tags do not create node objects.
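The tokenizing and node-building steps can be sketched in a few lines. This is a deliberately tiny model, not how WebKit implements it: it assumes tags without attributes, no comments, no void elements, and well-formed input. The names tokenize and buildDom and the node shape are invented for this example.

```javascript
// Minimal sketch: no attributes, comments, void elements, or error recovery.
function* tokenize(html) {
  const pattern = /<\/([a-zA-Z0-9]+)>|<([a-zA-Z0-9]+)>|([^<]+)/g;
  for (const match of html.matchAll(pattern)) {
    if (match[1]) yield { type: 'endTag', name: match[1] };
    else if (match[2]) yield { type: 'startTag', name: match[2] };
    else if (match[3].trim()) yield { type: 'text', data: match[3].trim() };
  }
}

function buildDom(html) {
  const root = { name: '#document', children: [] };
  const openElements = [root]; // elements whose end tag has not arrived yet
  // The generator is lazy, so each Token is consumed as soon as it is produced.
  for (const token of tokenize(html)) {
    const parent = openElements[openElements.length - 1];
    if (token.type === 'startTag') {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      openElements.push(node);
    } else if (token.type === 'text') {
      parent.children.push({ name: '#text', data: token.data });
    } else {
      openElements.pop(); // an end tag creates no node; it only closes the current one
    }
  }
  return root;
}

const dom = buildDom('<html><head><title>Hello</title></head><body></body></html>');
const titleNode = dom.children[0].children[0].children[0];
console.log(titleNode.name, titleNode.children[0].data); // title Hello
```

The stack of open elements is what turns the flat Token stream into parent and child relationships: a new node is always attached to the element on top of the stack.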

Next, take the following piece of HTML as an example:

<html>
<head>
    <title>Web page parsing</title>
</head>
<body>
    <div>
        <h1>Web page parsing</h1>
        <p>This is an example Web page. </p>
    </div>
</body>
</html>

The above HTML will be parsed like this:

Build CSSOM

The DOM captures the content of the page, but the browser also needs to know how the page is displayed, so CSSOM needs to be built.

Building the CSSOM is very similar to building the DOM. When the browser receives a piece of CSS, it first identifies the Tokens, then builds nodes and generates the CSSOM. During this process the browser determines the style of each node, which is resource intensive, because a style can be set on the node directly or obtained through inheritance. The browser therefore has to recurse through the CSSOM tree to determine what a specific element looks like.
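To make the inheritance part concrete, here is a small sketch that resolves a property by walking from a node up through its ancestors. The node shape, the INHERITED set, and the function computedValue are assumptions for illustration. A real engine also handles the cascade, specificity, and initial values, all of which are omitted here.

```javascript
// Assumed subset of properties that CSS inherits by default.
const INHERITED = new Set(['color', 'font-size', 'font-family']);

// Hypothetical node shape: { name, parent, declared }, where `declared`
// holds the declarations that matched this node directly.
function computedValue(node, property) {
  for (let current = node; current; current = current.parent) {
    if (property in current.declared) return current.declared[property];
    if (!INHERITED.has(property)) return undefined; // non-inherited: look no further
  }
  return undefined; // nothing found up to the root
}

const bodyStyleNode = { name: 'body', parent: null, declared: { color: 'black', 'font-size': '16px' } };
const divStyleNode = { name: 'div', parent: bodyStyleNode, declared: { border: '1px solid red' } };
const pStyleNode = { name: 'p', parent: divStyleNode, declared: { color: 'blue' } };

console.log(computedValue(pStyleNode, 'color'));     // blue (own declaration wins)
console.log(computedValue(pStyleNode, 'font-size')); // 16px (inherited from body)
console.log(computedValue(pStyleNode, 'border'));    // undefined (border is not inherited)
```

Every lookup may climb several levels, which hints at why determining the style of every node is expensive on a large tree.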

Note: Matching CSS to HTML elements is fairly complex and has a performance cost. Therefore, keep the DOM tree small and prefer id and class selectors in CSS where possible.

Build the render tree

After we generate the DOM tree and CSSOM tree, we need to combine the two trees into a rendering tree.

This is not simply a matter of merging the two. The render tree includes only the nodes that need to be displayed, together with their style information. A node with display: none, for example, does not appear in the render tree.
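A minimal sketch of that filtering step is shown below. It assumes each DOM-like node already carries its computed style. The node shape and the function buildRenderTree are hypothetical and only illustrate the idea.

```javascript
// Hypothetical input: DOM-like nodes that already carry their computed style.
function buildRenderTree(node) {
  if (node.style.display === 'none') return null; // the node and its whole subtree are skipped
  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { name: node.name, style: node.style, children };
}

const styledDom = {
  name: 'body',
  style: { display: 'block' },
  children: [
    { name: 'h1', style: { display: 'block' }, children: [] },
    {
      name: 'div',
      style: { display: 'none' },
      children: [{ name: 'p', style: { display: 'block' }, children: [] }],
    },
  ],
};

const renderTree = buildRenderTree(styledDom);
console.log(renderTree.children.map((child) => child.name)); // [ 'h1' ]
```

Note that the p element disappears as well, even though its own display value is block, because its parent is not rendered.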

A question may arise: what does the browser do when it encounters a JS file during rendering?

When the browser encounters a <script> tag during rendering, it stops rendering and executes the JS code. The browser has a GUI rendering thread and a JS engine thread, and to prevent unpredictable rendering results the two are mutually exclusive. Loading, parsing, and executing JavaScript all block DOM construction: when the HTML parser encounters JavaScript while building the DOM, it suspends DOM construction and hands control to the JavaScript engine. Once the JavaScript engine finishes running, the browser resumes DOM construction from where it left off.

If you want the first screen to render faster, you should not load JS files before the first screen is rendered, which is why it is recommended to put script tags at the bottom of the body. Nowadays the script tag no longer has to be at the bottom, because you can add the defer or async attribute to it (the difference between the two is explained below).

JS files do not just block DOM construction; they also cause the CSSOM to block DOM construction.

Originally, DOM and CSSOM construction did not affect each other. Once JavaScript is introduced, however, the CSSOM also begins to block DOM construction: DOM construction resumes only after the CSSOM has been built.

What's going on?

This is because JavaScript can change not only the DOM but also styles, which means it can change the CSSOM. An incomplete CSSOM cannot be used, so if JavaScript wants to access and change the CSSOM, the complete CSSOM must be available when the script runs. As a result, if the browser has not finished downloading and building the CSSOM when a script is about to run, it delays script execution and DOM construction until the CSSOM is ready. In other words, in this case the browser first downloads and builds the CSSOM, then executes the JavaScript, and finally continues building the DOM.
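The ordering can be expressed as a toy timeline. The model below is an assumption-laden simplification with a single parser-blocking script and times in arbitrary ticks. The function blockingScriptTimeline and its parameters are invented for this sketch.

```javascript
// Toy model of one parser-blocking script. All times are hypothetical ticks.
function blockingScriptTimeline({ scriptReachedAt, scriptLoadedAt, cssomReadyAt, scriptRunTime }) {
  // The script may run only once the parser has reached it, the file has
  // loaded, and any pending stylesheet has become a complete CSSOM.
  const scriptStartsAt = Math.max(scriptReachedAt, scriptLoadedAt, cssomReadyAt);
  // DOM construction stays paused until the script has finished.
  const domResumesAt = scriptStartsAt + scriptRunTime;
  return { scriptStartsAt, domResumesAt };
}

const slowCss = blockingScriptTimeline({ scriptReachedAt: 10, scriptLoadedAt: 15, cssomReadyAt: 40, scriptRunTime: 5 });
const fastCss = blockingScriptTimeline({ scriptReachedAt: 10, scriptLoadedAt: 15, cssomReadyAt: 5, scriptRunTime: 5 });

console.log(slowCss); // { scriptStartsAt: 40, domResumesAt: 45 }
console.log(fastCss); // { scriptStartsAt: 15, domResumesAt: 20 }
```

With the slow stylesheet, the script is loaded at tick 15 but cannot start until tick 40, so the CSSOM ends up delaying DOM construction by way of the script.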

Layout and drawing

After the browser generates the rendering tree, it performs layout (also called reflow) based on it. At this stage the browser works out the exact position and size of each node on the page. This behavior is often also referred to as "auto-reflow".

The output of the layout process is a "box model" that captures the exact position and size of each element within the viewport, with all relative measurements converted to absolute pixels on the screen.
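As a simplified illustration of converting relative measurements to pixels, the sketch below resolves only widths, given as a percentage of the parent or as a pixel value. The box description, the function resolveWidths, and the 1000px viewport are assumptions. Real layout also resolves heights, positions, margins, and much more.

```javascript
// Hypothetical box description: width is a percentage of the parent or a pixel value.
function resolveWidths(node, parentWidthPx) {
  const widthPx = node.width.endsWith('%')
    ? (parentWidthPx * parseFloat(node.width)) / 100
    : parseFloat(node.width);
  return {
    name: node.name,
    widthPx,
    children: (node.children || []).map((child) => resolveWidths(child, widthPx)),
  };
}

const viewportWidthPx = 1000; // assumed viewport width
const layoutBoxes = resolveWidths(
  {
    name: 'body',
    width: '100%',
    children: [{ name: 'div', width: '50%', children: [{ name: 'p', width: '50%' }] }],
  },
  viewportWidthPx
);

console.log(layoutBoxes.children[0].widthPx);             // 500
console.log(layoutBoxes.children[0].children[0].widthPx); // 250
```

Because each child is resolved against its parent's pixel width, a change near the top of the tree ripples down to every descendant, which is one reason layout is costly.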

Once layout is complete, the browser issues "Paint Setup" and "Paint" events, which convert the render tree into pixels on the screen.

Above we have detailed the important steps in the browser workflow. Next we discuss a few related issues.

A few additional notes

1. What is the role of async and defer? What's the difference?

Next, let's compare the difference between defer and async attributes:

The blue line represents JavaScript loading; the red line represents JavaScript execution; the green line represents HTML parsing.

1) Case 1

<script src="script.js"></script>

Without defer or async, the browser loads and executes the script as soon as it reads the tag, without waiting for the document elements that follow it.

2) Case 2

<script async src="script.js"></script>

(asynchronous download)

The async attribute means the referenced JavaScript is executed asynchronously. Unlike defer, an async script starts executing as soon as it has loaded, whether that happens during HTML parsing or after DOMContentLoaded has fired. Note that JavaScript loaded this way still blocks the load event. In other words, an async script may run before or after DOMContentLoaded fires, but it always runs before load fires.

3) Case 3

<script defer src="script.js"></script>

(delayed execution)

The defer attribute indicates delayed execution of the referenced JavaScript: HTML parsing does not stop while the JavaScript is loading, and the two proceed in parallel. Once the whole document has been parsed and the deferred scripts have loaded (the order of these two events does not matter), all deferred scripts are executed, and then the DOMContentLoaded event fires.

Compared with an ordinary script, defer differs in two ways: loading the JavaScript file does not block HTML parsing, and execution is postponed until HTML parsing is complete.
When loading multiple JS scripts, async scripts execute in whatever order they finish loading, while defer scripts execute in the order they appear in the document.
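The ordering difference between several async and several defer scripts can be modelled with a toy function. The script list, the loadedAt times, and the function executionOrder are hypothetical. The model ignores how async and defer scripts interleave with each other and only shows the order within each group.

```javascript
// Toy model: each script has a kind ('async' or 'defer') and a hypothetical
// download-finish time. The array order stands for document order.
function executionOrder(scripts) {
  const asyncOrder = scripts
    .filter((script) => script.kind === 'async')
    .sort((a, b) => a.loadedAt - b.loadedAt) // whichever finishes downloading first runs first
    .map((script) => script.src);
  const deferOrder = scripts
    .filter((script) => script.kind === 'defer') // document order is kept
    .map((script) => script.src);
  return { asyncOrder, deferOrder };
}

const order = executionOrder([
  { src: 'a.js', kind: 'async', loadedAt: 30 },
  { src: 'b.js', kind: 'async', loadedAt: 10 },
  { src: 'c.js', kind: 'defer', loadedAt: 50 },
  { src: 'd.js', kind: 'defer', loadedAt: 20 },
]);

console.log(order.asyncOrder); // [ 'b.js', 'a.js' ]
console.log(order.deferOrder); // [ 'c.js', 'd.js' ]
```

This is why defer is usually the safer choice when scripts depend on each other, while async suits independent scripts.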

2. Why are DOM operations slow?

Think of the DOM and JavaScript as two islands connected by a toll bridge. - "High Performance JavaScript"

JS is fast, and modifying DOM objects in JS is also fast. In the world of JS, everything is simple and fast. But DOM manipulation is not a solo dance for JS; it is a collaboration between two modules.

The DOM belongs to the rendering engine, while JS belongs to the JS engine. When we use JS to manipulate the DOM, the JS engine and the rendering engine are essentially communicating across a border. This "cross-border communication" is not simple to implement: it relies on a bridge interface that acts as the "bridge" (as shown in the figure below).

Crossing the "bridge" has a toll, and the cost is not negligible. Every time we manipulate the DOM (whether to modify it or just to read a value), we cross the "bridge". The more often we cross, the more noticeable the performance problems become. So the advice to "reduce DOM manipulation" is well founded.
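The effect of crossing less often can be shown by counting crossings. The sketch below uses a stand-in object instead of a real DOM element, so it runs anywhere. The helper createCountingElement and the idea of counting each property read or write as one crossing are assumptions made for illustration, not a measurement of real engine cost.

```javascript
// Stand-in for a DOM element: every read or write of innerHTML counts as one crossing.
function createCountingElement() {
  let crossings = 0;
  let html = '';
  return {
    get innerHTML() { crossings += 1; return html; },
    set innerHTML(value) { crossings += 1; html = value; },
    get crossings() { return crossings; },
  };
}

// Naive: reads and writes the element on every iteration.
function appendNaively(element, labels) {
  for (const label of labels) element.innerHTML += `<li>${label}</li>`;
}

// Batched: builds the string in plain JS, then crosses once.
function appendBatched(element, labels) {
  element.innerHTML = labels.map((label) => `<li>${label}</li>`).join('');
}

const listLabels = ['a', 'b', 'c'];
const naiveElement = createCountingElement();
const batchedElement = createCountingElement();
appendNaively(naiveElement, listLabels);
appendBatched(batchedElement, listLabels);

console.log(naiveElement.crossings, batchedElement.crossings); // 6 1
```

Both versions produce the same markup. The batched one simply does the work on the JS side and pays the toll once.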

3. Do you really understand reflow and redraw?

The rendering process is basically as follows (the four steps in yellow in the figure below): 1. Calculate CSS styles 2. Build the render tree 3. Layout: determine position coordinates and size 4. Paint

Note: The process above contains many connecting lines. They indicate that dynamically modifying DOM properties or CSS properties with JavaScript leads to re-layout. Some changes do not cause re-layout, however (the arrows pointing upward in the figure above), such as modifying a CSS rule that does not match any element.

It is important to mention two concepts here: Reflow and Repaint.

Redraw (repaint): When a modification to the DOM changes an element's style but not its geometric properties (for example, changing the color or background color), the browser does not need to recalculate the element's geometry and directly draws the new style for the element (skipping the reflow step shown above).
Reflow: When a modification to the DOM changes the geometric size of an element (for example, modifying its width or height, or hiding it), the browser needs to recalculate the element's geometric properties (the geometry and position of other elements are affected as well) and then draw the result. This process is reflow (also called rearrangement).

When a web page is generated, it is rendered at least once, and it keeps being re-rendered while the user interacts with it. Re-rendering means either reflow plus redraw, or redraw alone.
A reflow always triggers a redraw, but a redraw does not necessarily involve a reflow. Both happen frequently when we style nodes and can affect performance considerably. Reflow costs much more than redraw, and changing a child node is likely to cause a series of reflows in its parent nodes.
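The relationship between the two can be written down as a small classifier. The property lists below are assumed and deliberately incomplete, and the function workTriggeredBy is invented for this sketch. Which properties trigger what in practice depends on the engine.

```javascript
// Assumed, deliberately incomplete property lists for illustration only.
const GEOMETRY_PROPERTIES = new Set(['width', 'height', 'margin', 'padding', 'border-width', 'display', 'top', 'left']);
const PAINT_ONLY_PROPERTIES = new Set(['color', 'background-color', 'visibility', 'outline-color']);

function workTriggeredBy(changedProperties) {
  // Any geometry change forces layout, and layout is always followed by painting.
  if (changedProperties.some((name) => GEOMETRY_PROPERTIES.has(name))) return ['reflow', 'repaint'];
  // Purely visual changes skip layout and go straight to painting.
  if (changedProperties.some((name) => PAINT_ONLY_PROPERTIES.has(name))) return ['repaint'];
  return []; // properties unknown to this sketch
}

console.log(workTriggeredBy(['color']));          // [ 'repaint' ]
console.log(workTriggeredBy(['color', 'width'])); // [ 'reflow', 'repaint' ]
```

The function never returns reflow without repaint, which mirrors the rule that a reflow always leads to a redraw while a redraw can happen on its own.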

1) Common properties and methods that cause reflow

Any operation that changes the geometry of an element (its position or size) triggers a reflow:

Adding or removing visible DOM elements
Changing an element's size: margins, padding, borders, width, and height
Changing content, such as the user entering text in an input box
Resizing the browser window (when the resize event occurs)
Reading the offsetWidth and offsetHeight properties (the browser must flush any pending layout to return an up-to-date value)
Setting the value of the style attribute

2) Common properties and methods that cause repainting

3) How to reduce reflow and redraw

Use transform instead of top

Replace display: none with visibility: hidden, because the latter only causes a redraw while the former causes a reflow (it changes the layout)

Don't read a node's property value inside a loop on every iteration, as the following code does.

// Anti-pattern: every iteration queries the DOM and reads offsetTop, which forces layout
for (let i = 0; i < 1000; i++) {
    console.log(document.querySelector('.test').offsetTop)
}
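A possible fix is to read the value once and reuse it. The sketch below uses a stand-in object so that it runs without a browser. The helper createMeasuredElement and its layoutReads counter are assumptions for illustration. In a page, the element would come from document.querySelector('.test') instead.

```javascript
// Stand-in for an element whose offsetTop getter forces layout.
// It counts how often the value is read.
function createMeasuredElement(top) {
  let layoutReads = 0;
  return {
    get offsetTop() { layoutReads += 1; return top; },
    get layoutReads() { return layoutReads; },
  };
}

const measuredElement = createMeasuredElement(120);
const cachedTop = measuredElement.offsetTop; // read once, outside the loop

let totalOffset = 0;
for (let i = 0; i < 1000; i++) {
  totalOffset += cachedTop; // uses the cached number, no further element access
}

console.log(totalOffset, measuredElement.layoutReads); // 120000 1
```

Caching is only valid while nothing changes the element's position in between. If the loop itself modifies layout, the value has to be read again.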

Avoid table layouts; a small change may cause the entire table to be laid out again
Choose the animation speed with care: the faster the animation, the more reflows it causes. You can also use requestAnimationFrame
CSS selectors are matched from right to left, so avoid selectors with too many levels of nesting
Promote a node that is frequently redrawn or reflowed to its own layer. A layer prevents the rendering of this node from affecting other nodes. For the video tag, for example, the browser automatically turns the node into a layer.

Performance optimization strategy

Based on the browser rendering principles described above and the order in which the DOM and CSSOM are constructed and initialized, we can optimize page rendering and improve page performance.

JS optimization: add the defer or async attribute to <script> tags to control how scripts are downloaded and executed, so that downloading them does not block document parsing.
- defer attribute: downloads the script file in parallel with document parsing and executes the script after parsing is complete.
- async attribute: an attribute added in HTML5 that downloads the script file asynchronously and interprets and executes the code as soon as the download finishes.
CSS optimization: set the rel attribute of a <link> tag to preload. This lets you declare in your HTML which resources the page will need very soon after loading, so the browser can fetch them early, which optimizes the load order and improves rendering performance.

Summary

In summary, we draw the following conclusions:

Browser Workflow: Build DOM -> Build CSSOM -> Build Render Tree -> Layout -> Draw.
The CSSOM blocks rendering: only once the CSSOM has been built does the browser move on to building the rendering tree.
Normally the DOM and CSSOM are built in parallel. When the browser encounters a script tag without the defer or async attribute, DOM construction pauses. If the browser has not yet finished downloading and building the CSSOM at that point, it must wait for the CSSOM to be built before executing the JS, because JavaScript can modify the CSSOM. Only then does DOM construction resume.
