MEMEPh. ideas that are worth sharing...

What happens from URL input to page presentation

Foreword


What happens behind the scenes between typing a URL into the browser and seeing the web page presented on screen? What steps does the request go through?

In general, the process can be divided into the stages covered in the sections below.

 

1. What is a URL?


A URL (Uniform Resource Locator) is used to locate a resource on the Internet and is commonly known as a web address.
For example, http://www.memeph.com.cn/html follows these grammar rules:

scheme://host.domain:port/path/filename
The parts are explained as follows:
scheme - defines the type of Internet service. Common schemes are http, https, ftp, and file; http is the most common, and https is http over an encrypted connection.
host - defines the domain host (the default host for http is www)
domain - defines the Internet domain name, such as w3school.com.cn
port - defines the port number on the host (the default port number for http is 80)
path - defines the path on the server (if omitted, the document must be in the root directory of the website)
filename - defines the name of the document/resource
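
As a rough illustration, Python's standard urllib.parse module can split the example URL into these parts. The DEFAULT_PORTS table and the describe_url helper are hypothetical names introduced for this sketch, and note that the standard library reports host and domain together as a single hostname.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table for this sketch: default ports of common schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url):
    """Split a URL into scheme, hostname, port, and path."""
    parts = urlsplit(url)
    # When the URL carries no explicit port, fall back to the scheme default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "port": port,
        "path": parts.path,
    }


if __name__ == "__main__":
    print(describe_url("http://www.memeph.com.cn/html"))
```

Because the example URL has no explicit port, the sketch falls back to 80, the default for http.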

 

2. Domain Name Resolution (DNS)


After the URL is entered, the browser must first resolve the domain name, because a server cannot be reached by its domain name directly, only by its IP address. This raises a question: a computer can be assigned an IP address as well as a hostname and a domain name, for example www.hackr.jp. So why not simply use the IP address from the beginning and save the trouble of resolution? Let's first look at what an IP address is.

 

1. IP address

An IP address (Internet Protocol address) is the unified address format provided by the IP protocol. It assigns a logical address to each network and each host on the Internet, thereby hiding the differences between physical addresses. An IPv4 address is a 32-bit binary number, usually written as four decimal numbers separated by dots; for example, 127.0.0.1 is the loopback address of the local machine.
A domain name is like a mask worn by an IP address. Its role is to make a server's address easier to remember and share. Users typically reach other computers by hostname or domain name rather than directly by IP address, because an alphanumeric name fits human memory better than a string of numbers. Computers, however, handle numbers far more easily than names. DNS was created to bridge this gap.
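
To make the "32-bit binary number" concrete, here is a small sketch using Python's standard ipaddress module. The helper name ipv4_to_bits is an assumption made for this example.

```python
import ipaddress


def ipv4_to_bits(address):
    """Return the 32-bit binary string behind a dotted-decimal IPv4 address."""
    value = int(ipaddress.IPv4Address(address))
    # Pad with leading zeros so that all 32 bits are visible.
    return format(value, "032b")


if __name__ == "__main__":
    print(ipv4_to_bits("127.0.0.1"))
```

Each of the four decimal numbers occupies eight of the 32 bits, which is why each one ranges from 0 to 255.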

 

2. What is domain name resolution

The DNS protocol provides a service for looking up IP addresses from domain names, and for the reverse lookup of domain names from IP addresses. DNS is served by servers on the network; setting up domain name resolution simply means adding a record on a DNS server.

e.g. rendc.com 45.130.228.232 (the server's public IP address) 80 (the server's port number)
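
From a program's point of view, the lookup can be sketched with Python's standard socket module, which hands the question to the operating system's resolver. The resolve helper is a hypothetical name, and the sketch only asks for IPv4 addresses.

```python
import socket


def resolve(hostname):
    """Ask the operating system's resolver for the IPv4 addresses of a host."""
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) pair.
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in results})
```

Calling resolve with a domain name returns whatever addresses the configured DNS servers report at that moment, so the result depends on the network it runs on.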

 

3. How does the browser query the IP corresponding to the URL through the domain name?

 

4. Summary

The browser sends the domain name to a DNS server; the DNS server looks up the IP address corresponding to the domain name and returns it to the browser. The browser then addresses the request to that IP address, includes the request parameters, and sends it to the corresponding server. Next comes the stage of sending an HTTP request to the server, which is divided into three parts: the TCP three-way handshake, the HTTP request and response, and closing the TCP connection.

 

3. TCP three-way handshake


Before the client sends data, it initiates a TCP three-way handshake to synchronize the sequence numbers and acknowledgment numbers of the client and the server, and to exchange TCP window size information.

1. The process of TCP three-way handshake is as follows:
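
The exchange can be sketched as a small simulation. This is not a real network stack: the function name and the dictionary layout are assumptions made for illustration, and sequence number wrap-around is ignored.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged while opening a TCP connection."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    # 2. Server -> client: SYN+ACK. It acknowledges client_isn + 1 and
    #    announces the server's own initial sequence number.
    syn_ack = {
        "from": "server",
        "flags": "SYN+ACK",
        "seq": server_isn,
        "ack": syn["seq"] + 1,
    }
    # 3. Client -> server: ACK acknowledging server_isn + 1.
    ack = {
        "from": "client",
        "flags": "ACK",
        "seq": syn_ack["ack"],
        "ack": syn_ack["seq"] + 1,
    }
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

After the third segment, both sides know the other's initial sequence number and have had their own acknowledged, which is the synchronization described above.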

 

2. Why a three-way handshake is needed

According to Xie Xiren's "Computer Network", the purpose of the three-way handshake is "to prevent an invalidated connection request segment from suddenly reaching the server and causing an error."

 

4. Send the HTTP request


After the TCP three-way handshake is completed, the client starts sending the HTTP request message.
The request message consists of three parts: the request line, the request headers, and the request body.

1. The request line contains the request method, URL, and protocol version

POST /chapter17/user.html HTTP/1.1

In the line above, "POST" is the request method, "/chapter17/user.html" is the URL, and "HTTP/1.1" is the protocol and its version. HTTP/1.1 remains a widely used version.

2. The request headers contain additional information about the request. They consist of name/value pairs, one pair per line, with the name and value separated by a colon ":".

The request headers tell the server about the client's request, including useful information about the client environment and the request body. For example: Host gives the host name and is what makes virtual hosts possible; Connection, added in HTTP/1.1, controls persistent connections (keep-alive), so that one connection can send multiple requests; User-Agent identifies the request sender and is used for compatibility handling and customization.

3. The request body carries the request data, which can include multiple request parameters. It is separated from the headers by a blank line (carriage return and line feed). Not all requests have a body.

name=tom&password=1234&realName=tomson

The body above carries three request parameters: name, password, and realName.
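
Putting the three parts together, the following sketch assembles a raw request message as text. The build_request helper and the Host value www.example.com are assumptions for illustration; the request line and parameters are the ones from the examples above.

```python
from urllib.parse import urlencode


def build_request(method, path, headers, params):
    """Assemble request line, headers, and body into one HTTP/1.1 message."""
    body = urlencode(params)
    all_headers = dict(headers)
    all_headers["Content-Type"] = "application/x-www-form-urlencoded"
    all_headers["Content-Length"] = str(len(body.encode("ascii")))
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line (CRLF CRLF) separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


if __name__ == "__main__":
    message = build_request(
        "POST",
        "/chapter17/user.html",
        {"Host": "www.example.com", "Connection": "keep-alive"},
        {"name": "tom", "password": "1234", "realName": "tomson"},
    )
    print(message)
```

In practice the browser or an HTTP library builds this message; writing it by hand is only meant to show where each part sits.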

 

5. The server processes the request and returns an HTTP message


1. Server

A server is a high-performance computer in a network environment. It listens for service requests submitted by other computers (clients) on the network and provides the corresponding services, such as web pages, file downloads, mail, and video. A client, by contrast, is mainly used to browse web pages, watch videos, listen to music, and so on; the two play completely different roles. Each server runs the application that handles requests: the web server. Common web server products include Apache, Nginx, IIS, and Lighttpd.
The web server plays a management and control role. Based on its configuration files, it delegates the requests sent by different users to the programs on the server that handle them (such as CGI scripts, JSP scripts, servlets, ASP scripts, server-side JavaScript, or some other server-side technology), and then returns the result of that background processing as the response.
 

2. MVC background processing stage

There are many frameworks for back-end development, but most of them are still built around the MVC design pattern.
MVC is a design pattern that divides an application into three core components: model, view, and controller. Each handles its own task, separating input, processing, and output.
 

1. View

The view is the interface presented to the user, the outer shell of the program.

2. Model

The model is mainly responsible for data interaction. Of the three parts of MVC, the model carries the most processing work. One model can provide data for multiple views.

3. Controller

The controller selects data from the model layer according to the instructions the user enters through the view layer, then performs the corresponding operations on it to produce the final result. The controller acts as a manager: it receives requests from the view, decides which model component to call to process each request, and then decides which view should display the data the model returns.


These three layers are closely linked yet independent of each other: changes inside one layer do not affect the others. Each layer exposes an interface for the layer above to call.


What happens at this stage? In short, the request sent by the browser first reaches the controller, which performs logical processing and request dispatch and then calls the model. The model fetches data from stores such as Redis and MySQL. Once the data is obtained, the page is rendered and returned to the client as a response message, and finally the browser's rendering engine renders the page for the user.
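
The flow can be sketched in a few lines of plain Python. Everything here is hypothetical and framework-free: the class names, the /users/<id> route, and the in-memory dictionary that stands in for Redis or MySQL are assumptions chosen to keep the example self-contained.

```python
class UserModel:
    """Model: owns the data. A dict stands in for Redis/MySQL in this sketch."""

    def __init__(self):
        self._users = {1: "tom", 2: "alice"}

    def find(self, user_id):
        return self._users.get(user_id)


def user_view(name):
    """View: turns data into the page shown to the user."""
    return f"<h1>Hello, {name}</h1>"


def not_found_view():
    """View used when no matching user or route exists."""
    return "<h1>User not found</h1>"


class UserController:
    """Controller: receives the request, calls the model, picks a view."""

    def __init__(self, model):
        self.model = model

    def handle(self, path):
        # Hypothetical route format: /users/<id>
        prefix = "/users/"
        user_id = path[len(prefix):]
        if not path.startswith(prefix) or not user_id.isdecimal():
            return 404, not_found_view()
        name = self.model.find(int(user_id))
        if name is None:
            return 404, not_found_view()
        return 200, user_view(name)


if __name__ == "__main__":
    controller = UserController(UserModel())
    print(controller.handle("/users/1"))
```

The controller never touches the stored data directly and the model knows nothing about HTML, which is the separation of input, processing, and output described above.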

 

3. HTTP response message

The response message consists of three parts: the response line (status line), the response headers, and the response body.

(1) The response line contains: protocol version, status code, status code description

The status code rules are as follows:
1xx: Informational - the request has been received and processing continues.
2xx: Success - the request has been successfully received, understood, and accepted.
3xx: Redirection - further action must be taken to complete the request.
4xx: Client Error - the request has a syntax error or cannot be fulfilled.
5xx: Server Error - the server failed to fulfill a valid request.

(2) The response headers contain additional information about the response message, consisting of name/value pairs

(3) The response body contains the returned data and is separated from the headers by a blank line (carriage return and line feed). Not all responses have a body
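
As a minimal sketch of this structure, the following code splits a raw response into its three parts and maps the status code to its class. The function names are assumptions for this example, and real responses (chunked bodies, repeated headers) need a proper HTTP library.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, int(code), reason, headers, body


def status_class(code):
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


if __name__ == "__main__":
    raw = "HTTP/1.1 404 Not Found\r\nContent-Type: text/html\r\n\r\n<h1>Oops</h1>"
    version, code, reason, headers, body = parse_response(raw)
    print(version, code, reason, status_class(code), headers, body)
```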

 

6. The browser parses and renders the page


Once the browser has received the HTML in the response, the rendering stage begins.

Parsing and rendering the page is divided into the following five steps:

 

1. Parse the HTML to build the DOM tree

 

2. Parse the CSS to generate the CSS rule tree

 

3. Combine the DOM tree and the CSS rule tree to generate the render tree

 

4. Calculate the layout information of each node from the render tree

 

5. Paint the page according to the calculated information
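
The first three steps can be imitated with a toy sketch. It is nothing like a real rendering engine: it assumes well-nested HTML without void elements, represents the CSS rules as an already-parsed dictionary keyed by tag name, and uses hypothetical names throughout. Layout and painting are left out.

```python
from html.parser import HTMLParser


class Node:
    """One element of the toy DOM tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class DomBuilder(HTMLParser):
    """Step 1: build a DOM tree from HTML (assumes well-nested tags)."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def parse_html(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def build_render_tree(node, css_rules):
    """Step 3: attach styles to DOM nodes and drop nodes with display: none."""
    style = css_rules.get(node.tag, {})
    if style.get("display") == "none":
        return None
    children = []
    for child in node.children:
        rendered = build_render_tree(child, css_rules)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node.tag, "style": style, "text": node.text, "children": children}


if __name__ == "__main__":
    dom = parse_html("<html><body><h1>Title</h1><div>hidden</div></body></html>")
    # Step 2 stand-in: CSS rules as a dict, as if the stylesheet were parsed.
    rules = {"h1": {"color": "red"}, "div": {"display": "none"}}
    print(build_render_tree(dom, rules))
```

The hidden div is present in the DOM tree but absent from the render tree, which illustrates why the two trees are kept separate.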

 

7. Disconnect

When the data transfer is complete, the TCP connection needs to be closed, which TCP does with a four-step exchange known as the four-way handshake (four waves).
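
The four steps can be sketched in the same simulated style. This is a simplification under stated assumptions: the client is assumed to close first, the function name and dictionary layout are made up for illustration, and real FIN segments normally carry an ACK flag as well.

```python
def four_way_teardown(client_seq, server_seq):
    """Return the four segments exchanged while closing a TCP connection."""
    # 1. Client -> server: FIN, the client has no more data to send.
    fin_from_client = {"from": "client", "flags": "FIN", "seq": client_seq}
    # 2. Server -> client: ACK of the client's FIN.
    ack_from_server = {"from": "server", "flags": "ACK", "ack": client_seq + 1}
    # 3. Server -> client: FIN, sent once the server has finished sending too.
    fin_from_server = {"from": "server", "flags": "FIN", "seq": server_seq}
    # 4. Client -> server: ACK of the server's FIN.
    ack_from_client = {"from": "client", "flags": "ACK", "ack": server_seq + 1}
    return [fin_from_client, ack_from_server, fin_from_server, ack_from_client]


if __name__ == "__main__":
    for segment in four_way_teardown(client_seq=500, server_seq=900):
        print(segment)
```

Closing takes four segments rather than three because each direction of the connection is shut down separately: the server may still have data to send after acknowledging the client's FIN.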