MEMEPh. ideas that are worth sharing...

Explaining the principles of browser rendering in a simple way

Foreword


The browser kernel is the core program that makes the browser run. It has two parts: the rendering engine and the JS engine. Rendering engines differ between browsers. The common browser kernels on the market fall into four types: Trident (IE), Gecko (Firefox), Blink (Chrome, Opera), and WebKit (Safari). The most familiar is probably WebKit, which, together with its fork Blink, dominates the current browser world.
In this article, we take WebKit as an example to analyze the rendering process of modern browsers in depth.

 

Page load process


Before introducing the browser rendering process, we briefly introduce the page loading process, which will help you better understand the subsequent rendering process.

The main points are as follows:

For example, enter https://rendc.com/forum in the browser. DNS resolution maps rendc.com to an IP address, here 45.130.228.232 (the IP may differ depending on time and location). The browser then sends an HTTP request to that IP.

The server receives the HTTP request, performs its computation (for example, serving different content to different users), and returns an HTTP response. The returned content is as follows:

 

In fact, it is just a string in HTML format, because HTML is the format that browsers can parse correctly, as required by the W3C standard. The next step is the browser's rendering process.

 

Browser rendering process


The browser rendering process is roughly divided into the following three parts:

1) The browser parses three things:

 

2) After parsing is complete, the browser engine constructs the Rendering Tree from the DOM Tree and the CSS Rule Tree.

3) Finally, the browser paints by calling the operating system's native GUI API.

Next, we elaborate on the important steps involved.

 

Build the DOM


Browsers follow a set of steps to convert an HTML file into a DOM tree. Macroscopically, it can be divided into several steps:

Next, let's take an example, assuming that there is a piece of HTML text:

<html>
<head>
    <title>Web page parsing</title>
</head>
<body>
    <div>
        <h1>Web page parsing</h1>
        <p>This is an example Web page. </p>
    </div>
</body>
</html>

The above HTML will be parsed like this:
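As a rough illustration of the token-to-node-to-tree idea, here is a minimal sketch. It is not how a real browser parses HTML (the actual parsing algorithm handles attributes, malformed markup, and much more), and the function names tokenize and buildTree are made up for this example.

```javascript
// Toy illustration only. `tokenize` and `buildTree` are hypothetical names.

// Step 1: split the markup into start-tag, end-tag, and text tokens.
function tokenize(html) {
  const tokens = [];
  const pattern = /<\/([a-z0-9]+)>|<([a-z0-9]+)>|([^<]+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1]) tokens.push({ type: "end", name: match[1] });
    else if (match[2]) tokens.push({ type: "start", name: match[2] });
    else if (match[3].trim()) tokens.push({ type: "text", value: match[3].trim() });
  }
  return tokens;
}

// Step 2: turn tokens into nodes, using a stack of open elements to
// attach each new node to its parent.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const open = [root];
  for (const token of tokens) {
    const parent = open[open.length - 1];
    if (token.type === "start") {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      open.push(node);
    } else if (token.type === "end") {
      open.pop();
    } else {
      parent.children.push({ name: "#text", value: token.value });
    }
  }
  return root;
}

const html = "<html><head><title>Web page parsing</title></head>" +
  "<body><div><h1>Web page parsing</h1>" +
  "<p>This is an example Web page.</p></div></body></html>";
const dom = buildTree(tokenize(html));
console.log(dom.children[0].children.map((node) => node.name)); // head, body
```

The stack of open elements is what lets each node find its parent: a start tag pushes a node, and an end tag pops it.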

 

Build CSSOM


The DOM captures the content of the page, but the browser also needs to know how the page is displayed, so CSSOM needs to be built.

The process of building the CSSOM is very similar to the process of building the DOM. When the browser receives a piece of CSS, it first identifies tokens, then builds nodes, and finally generates the CSSOM. During this process the browser determines the style of each node, which is very resource intensive, because a node's style can be set directly on it or inherited from an ancestor. The browser must therefore recurse through the CSSOM tree to determine what a specific element looks like.
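The inheritance lookup described above can be sketched as follows. This is a simplified model under stated assumptions: the node shape, the INHERITED set, and the computedStyle function are invented for illustration, and a real engine also handles the cascade, specificity, and initial values.

```javascript
// Hypothetical, simplified model: each node stores only the styles declared
// directly on it; inheritable properties fall back to the nearest ancestor.
const INHERITED = new Set(["color", "font-size"]);

function computedStyle(node, property) {
  if (property in node.declared) return node.declared[property];
  if (INHERITED.has(property) && node.parent) {
    return computedStyle(node.parent, property); // recurse up the tree
  }
  return undefined; // a real engine would use the property's initial value
}

const body = { parent: null, declared: { color: "black", "font-size": "16px" } };
const div = { parent: body, declared: { border: "1px solid red" } };
const p = { parent: div, declared: { color: "blue" } };

console.log(computedStyle(p, "color"));     // blue (declared on p itself)
console.log(computedStyle(p, "font-size")); // 16px (inherited from body)
console.log(computedStyle(p, "border"));    // undefined (border is not inherited)
```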
 

Note: Matching CSS rules to HTML elements is fairly complex and has a performance cost. Therefore, keep the DOM tree small and prefer id and class selectors in CSS wherever possible.

Build the render tree

After we generate the DOM tree and CSSOM tree, we need to combine the two trees into a rendering tree.

 

This is not simply a matter of merging the two. The render tree includes only the nodes that need to be displayed, together with their style information. If a node has display: none, it is not included in the render tree.
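A minimal sketch of that filtering step is shown below. It assumes styles have already been resolved and attached to each node; the node shape and the buildRenderTree name are hypothetical.

```javascript
// Hypothetical node shape: { name, style, children }. Styles are assumed to
// be resolved already; a real engine does much more work here.
function buildRenderTree(node) {
  // A display: none node is skipped together with its whole subtree.
  if (node.style && node.style.display === "none") return null;
  const children = (node.children || [])
    .map((child) => buildRenderTree(child))
    .filter((child) => child !== null);
  return { name: node.name, style: node.style || {}, children };
}

const styledDom = {
  name: "body",
  style: { display: "block" },
  children: [
    { name: "h1", style: { display: "block", color: "blue" }, children: [] },
    {
      name: "p",
      style: { display: "none" },
      children: [{ name: "span", style: { display: "inline" }, children: [] }],
    },
  ],
};

const renderTree = buildRenderTree(styledDom);
console.log(renderTree.children.map((node) => node.name)); // only h1 remains
```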

A question arises: what does the browser do if it encounters a JS file during rendering?

During rendering, if the browser encounters a <script> tag, it stops rendering and executes the JS code. The browser has a GUI rendering thread and a JS engine thread, and to prevent unpredictable rendering results these two threads are mutually exclusive. The loading, parsing, and execution of JavaScript block DOM construction: when the HTML parser encounters JavaScript while building the DOM, it suspends DOM construction and hands control to the JavaScript engine. Once the JavaScript engine finishes running, the browser resumes DOM construction from where it left off.

If you want the first screen to render faster, avoid loading JS files while the first screen is rendering, which is why it is recommended to put script tags at the bottom of the body tag. Nowadays the script tag does not have to be at the bottom, because you can add the defer or async attribute to it (the difference between the two is explained below).

JS files don't just block DOM construction; they also cause the CSSOM to block DOM construction.

Originally, DOM construction and CSSOM construction did not affect each other; each minded its own business. Once JavaScript is introduced, however, the CSSOM also starts to block DOM construction: DOM construction resumes only after the CSSOM has been built.

What's going on?

This is because JavaScript can change not only the DOM but also styles, which means it can change the CSSOM. An incomplete CSSOM cannot be used, so if JavaScript wants to access and change the CSSOM, the complete CSSOM must be available when the script executes. As a result, if the browser has not finished downloading and building the CSSOM when we want to run a script, it delays script execution and DOM construction until the CSSOM is ready. In this case, the browser downloads and builds the CSSOM first, then executes the JavaScript, and finally continues building the DOM.
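The ordering can be made concrete with a small timeline model. This is only a sketch: the function name scriptTimeline, its parameters, and all millisecond values are made up, and it covers a single parser-blocking script preceded by one stylesheet.

```javascript
// Simplified timeline model with made-up millisecond values.
// A parser-blocking script runs only after it has downloaded AND the
// stylesheet that precedes it has been downloaded and built into the CSSOM.
function scriptTimeline({ scriptFoundAt, scriptDownloadMs, cssomReadyAt, scriptRunMs }) {
  const scriptLoadedAt = scriptFoundAt + scriptDownloadMs;
  const scriptStartsAt = Math.max(scriptLoadedAt, cssomReadyAt);
  const domResumesAt = scriptStartsAt + scriptRunMs;
  return { scriptStartsAt, domResumesAt };
}

// Slow stylesheet: the script is ready at 30 ms but must wait until 80 ms.
const slowCss = scriptTimeline({
  scriptFoundAt: 10, scriptDownloadMs: 20, cssomReadyAt: 80, scriptRunMs: 5,
});
// Fast stylesheet: the CSSOM is ready first, so the script runs once loaded.
const fastCss = scriptTimeline({
  scriptFoundAt: 10, scriptDownloadMs: 20, cssomReadyAt: 15, scriptRunMs: 5,
});

console.log(slowCss.domResumesAt); // 85
console.log(fastCss.domResumesAt); // 35
```

In the first case the slow stylesheet, not the script itself, determines when DOM construction can resume.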

 

Layout and drawing


After the browser generates the render tree, it performs layout based on that tree. At this stage the browser must work out the exact position and size of each node on the page. This step is also commonly called "reflow".

The output of the layout process is a "box model" that precisely captures the exact position and size of each element within the viewport, with all relative measurements converted to absolute pixels on the screen.
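To illustrate the conversion of relative measurements into absolute pixels, here is a minimal sketch that resolves only widths. The box shape, the resolveWidths function, and the 1000px viewport are assumptions for this example; real layout also computes positions, heights, margins, and more.

```javascript
// Hypothetical box shape: { name, width, children }, where width is either a
// number of pixels or a percentage string relative to the parent's width.
function resolveWidths(box, parentWidthPx) {
  const widthPx = typeof box.width === "string"
    ? (parseFloat(box.width) / 100) * parentWidthPx
    : box.width;
  return {
    name: box.name,
    widthPx,
    children: (box.children || []).map((child) => resolveWidths(child, widthPx)),
  };
}

const viewportWidthPx = 1000; // assumed viewport width
const boxes = resolveWidths({
  name: "body",
  width: "100%",
  children: [{
    name: "div",
    width: "50%",
    children: [
      { name: "p", width: "50%" },
      { name: "img", width: 100 },
    ],
  }],
}, viewportWidthPx);

console.log(boxes.children[0].widthPx);             // 500
console.log(boxes.children[0].children[0].widthPx); // 250
```

Because each percentage depends on the parent's resolved width, a change near the top of the tree forces everything below it to be recomputed.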

Immediately after layout is complete, the browser fires the "Paint Setup" and "Paint" events, which convert the render tree into pixels on the screen.

Above we have detailed the important steps in the browser workflow. Next, we discuss a few related issues:

 

A few additional notes


1. What is the role of async and defer? What's the difference?

Next, let's compare the difference between defer and async attributes:

 

The blue line represents JavaScript loading; the red line represents JavaScript execution; the green line represents HTML parsing.

1) Case 1

<script src="script.js"></script>

Without defer or async, the browser loads and executes the script as soon as it reads the tag, without waiting for the document elements that follow it.

2) Case 2

 <script async src="script.js"></script>

 (asynchronous download)

The async attribute means the referenced JavaScript is executed asynchronously. Unlike defer, an async script starts executing as soon as it has loaded, whether HTML parsing is still in progress or DOMContentLoaded has already fired. Note that JavaScript loaded this way still blocks the load event. In other words, an async script may execute before or after DOMContentLoaded fires, but always before load fires.

3) Case 3 

<script defer src="script.js"></script>

(delayed execution )

The defer attribute means execution of the referenced JavaScript is delayed: HTML parsing does not stop while the JavaScript loads, and the two proceed in parallel. Once the entire document has been parsed and the deferred scripts have loaded (the order of these two events doesn't matter), all deferred scripts are executed, and then the DOMContentLoaded event is fired.

Compared with an ordinary script, defer differs in two ways: loading the JavaScript file does not block HTML parsing, and execution is postponed until HTML parsing is complete.
When loading multiple JS scripts, async scripts execute out of order (whichever finishes loading first runs first), while defer scripts execute in the order they appear in the document.
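The ordering difference between multiple async and defer scripts can be sketched with a small model. The function name executionOrder, the file names, and the download times are invented for illustration, and the model assumes all downloads start at the same moment.

```javascript
// Simplified model with made-up download times (ms). All downloads are
// assumed to start at time 0.
function executionOrder(scripts, mode) {
  if (mode === "defer") {
    // defer: run in document order, whichever download finishes first
    return scripts.map((script) => script.src);
  }
  // async: run in the order the downloads finish
  return [...scripts]
    .sort((a, b) => a.downloadMs - b.downloadMs)
    .map((script) => script.src);
}

// Scripts listed in the order they appear in the document.
const scripts = [
  { src: "a.js", downloadMs: 300 },
  { src: "b.js", downloadMs: 100 },
  { src: "c.js", downloadMs: 200 },
];

console.log(executionOrder(scripts, "defer").join(", ")); // a.js, b.js, c.js
console.log(executionOrder(scripts, "async").join(", ")); // b.js, c.js, a.js
```

This is why scripts that depend on one another are usually marked defer rather than async.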

2. Why are DOM operations slow?

Think of the DOM and JavaScript as two islands connected by a toll bridge. - "High Performance JavaScript"

JS is fast, and modifying DOM objects in JS is also fast. In the world of JS, everything is simple and fast. But DOM manipulation is not a solo dance by JS; it is a collaboration between two modules.

The DOM belongs to the rendering engine, while JS belongs to the JS engine. When we use JS to manipulate the DOM, we are essentially performing "cross-border communication" between the JS engine and the rendering engine. This communication is not simple to implement; it relies on a bridging interface that acts as the "bridge" (as shown in the figure below).

 

There is a toll for crossing the "bridge", and the cost is not negligible. Every time we manipulate the DOM (whether to modify it or just to read its values), we cross the "bridge". The more often we cross it, the more noticeable the performance problems become. So the advice to "reduce DOM manipulation" is not groundless.
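One way to picture the toll is to count the crossings. The sketch below uses FakeElement, a stand-in invented for this example so that it can run outside a browser; it is not a real DOM API, and it measures only the number of accesses, not their actual cost.

```javascript
// `FakeElement` is a made-up stand-in that counts every read or write of
// innerHTML as one trip across the "bridge".
class FakeElement {
  constructor() {
    this.crossings = 0;
    this._html = "";
  }
  get innerHTML() {
    this.crossings++;
    return this._html;
  }
  set innerHTML(value) {
    this.crossings++;
    this._html = value;
  }
}

// Slow pattern: every iteration reads and writes the element.
function appendOneByOne(el, items) {
  for (const item of items) el.innerHTML += `<li>${item}</li>`;
}

// Faster pattern: build the string in JS, then touch the element once.
function appendBatched(el, items) {
  el.innerHTML = items.map((item) => `<li>${item}</li>`).join("");
}

const items = ["a", "b", "c"];
const slow = new FakeElement();
const fast = new FakeElement();
appendOneByOne(slow, items);
appendBatched(fast, items);

console.log(slow.crossings); // 6 (one read and one write per item)
console.log(fast.crossings); // 1
```

Both elements end up with the same content; only the number of crossings differs.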

3. Do you really understand reflow and repaint?

The rendering process is basically as follows (the four steps in yellow in the figure below):

1) Calculate CSS styles
2) Build the Render Tree
3) Layout: determine coordinates and sizes
4) Start painting

Note: There are many connecting lines in the above process. They indicate that when JavaScript dynamically modifies DOM properties or CSS properties, a re-layout follows. Some changes do not cause a re-layout, however; these are the arrows pointing to the sky in the above figure, for example when the modified CSS rule does not match any element.

It is important to mention two concepts here: one is Reflow and the other is Repaint.

We know that a web page is rendered at least once when it is generated, and that it keeps being re-rendered while the user interacts with it. Re-rendering means either reflow plus repaint, or repaint alone.
A reflow always triggers a repaint, but a repaint does not necessarily cause a reflow. Repaints and reflows happen frequently when we style nodes, and they can affect performance significantly. A reflow costs much more than a repaint, and changing child nodes of a parent node is likely to trigger a series of reflows of that parent.

1) Common properties and methods that cause reflow

Any operation that changes an element's geometry (its position or size) triggers a reflow.

2) Common properties and methods that cause repainting

3) How to reduce reflow and repaint

Use transform instead of top

Replace display: none with visibility: hidden, because the latter only causes a repaint, while the former causes a reflow (it changes the layout)

Don't read a node's layout properties repeatedly inside a loop.

// Anti-pattern: the node is queried and offsetTop is read on every iteration
for (let i = 0; i < 1000; i++) {
    console.log(document.querySelector('.test').offsetTop)
}
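Why repeated reads in a loop hurt can be sketched with a toy model. FakeNode below is an invented stand-in, not a browser API. It assumes a simplified rule: a style write marks the layout as dirty, and a geometry read while the layout is dirty forces a reflow.

```javascript
// `FakeNode` is a made-up stand-in used to count forced reflows.
class FakeNode {
  constructor() {
    this.dirty = false;
    this.reflows = 0;
    this.top = 0;
  }
  get offsetTop() {
    if (this.dirty) {
      this.reflows++; // layout must be recomputed before answering
      this.dirty = false;
    }
    return this.top;
  }
  setTop(px) {
    this.top = px;
    this.dirty = true;
  }
}

// Interleaving reads and writes forces a reflow on every iteration
// after the first.
function interleaved(node, steps) {
  for (let i = 0; i < steps; i++) node.setTop(node.offsetTop + 1);
}

// Reading once, computing in JS, and writing once forces no reflow here.
function batched(node, steps) {
  const start = node.offsetTop;
  node.setTop(start + steps);
}

const thrashing = new FakeNode();
const calm = new FakeNode();
interleaved(thrashing, 100);
batched(calm, 100);

console.log(thrashing.reflows); // 99
console.log(calm.reflows);      // 0
```

Both nodes end at the same position; caching the value read from the node outside the loop is what removes the forced reflows.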

 

Performance optimization strategy


Based on the browser rendering principles described above and the order in which the DOM and CSSOM are constructed, we can optimize page rendering and improve page performance.

 

Summary


In summary, we draw the following conclusions: