What You Must Know About the HTTP Protocol
Introduction
HTTP stands for HyperText Transfer Protocol. It is an application-layer protocol for transferring hypertext from a World Wide Web server to a local browser. HTTP runs on top of TCP/IP and carries data such as HTML files, image files, and query results. It is not concerned with how data packets are transmitted; it mainly specifies the format of the messages exchanged between the client and the server. It uses port 80 by default.
1. Characteristics of HTTP
1. Simple and fast: When a client requests a service from the server, it only needs to transmit the request method and the path. Commonly used request methods are GET, HEAD, PUT, DELETE, and POST. Each method specifies a different type of interaction between the client and the server. Because the protocol is simple, an HTTP server can be a small program, and communication is fast.
2. Flexible : HTTP allows the transmission of data objects of any type.
3. Connectionless: Connectionless here means that each connection handles only one request. After the server has processed the client's request and the client has received the response, the connection is closed. This saves transmission time. (This describes the original design; HTTP/1.1 keeps connections open by default, as section 6 explains.)
4. Stateless: **The HTTP protocol is stateless: the protocol itself does not save any communication state between a request and a response, and there are no dependencies between any two requests.** Intuitively, each request is independent and has no direct connection with the requests before or after it. The protocol itself retains no information about previous request or response messages. HTTP was deliberately designed to be this simple so that it can handle large numbers of transactions quickly and remain scalable.
2. HTTP messages
There are two kinds of HTTP messages: request messages and response messages. A request message consists of four parts: the request line, the request headers, a blank line, and the request body. A response message also consists of four parts: the status line, the response headers, a blank line, and the response body. The parts of the request message and their functions are described in detail below.
1. The request line, which is used to describe the request type, the resource to be accessed, and the HTTP version used.
POST /chapter17/user.html HTTP/1.1
In the line above, "POST" is the request method, "/chapter17/user.html" is the URI, and "HTTP/1.1" is the protocol name and version. HTTP/1.1 is a widely used version.
2. The request headers consist of name/value pairs, one pair per line, with the name and the value separated by a colon ":".
The request headers tell the server about the client's request. They contain a lot of useful information about the client environment and the request body. For example:
Host: the host name being requested, which lets one server host several virtual hosts;
Connection: controls whether the connection stays open; with keep-alive (the default in HTTP/1.1), one persistent connection can carry multiple requests;
User-Agent: identifies the software sending the request; servers use it for compatibility handling and customized responses.
3. A blank line follows the last request header. This line is very important: it marks the end of the request headers, and whatever comes after it is the request body.
4. The request body, which can carry the data for multiple request parameters:
name=tom&password=1234&realName=tomson
The above code carries three request parameters: name, password, and realName.
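To make the four-part layout concrete, here is a minimal Python sketch that assembles the request described above and splits it back into its parts. The helper names `build_request` and `parse_request` and the host `example.com` are assumptions made for illustration; a real client or server would rely on an HTTP library rather than string handling like this.

```python
from urllib.parse import parse_qs


def build_request(method, uri, headers, body=""):
    """Assemble a raw request: request line, headers, blank line, body."""
    lines = [f"{method} {uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # The empty line (two consecutive CRLFs) marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(raw):
    """Split a raw request back into its parts."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, uri, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, uri, version, headers, body


body = "name=tom&password=1234&realName=tomson"
raw = build_request(
    "POST",
    "/chapter17/user.html",
    {
        "Host": "example.com",  # hypothetical host, not from the article
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(body)),  # body is ASCII, so chars == bytes
    },
    body,
)

method, uri, version, headers, parsed_body = parse_request(raw)
params = parse_qs(parsed_body)  # each value is a list, e.g. {'name': ['tom'], ...}
print(method, uri, version)
print(headers)
print(params)
```

The parser only works because of the blank line: everything before the first empty line is treated as the request line and headers, and everything after it as the body.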
3. HTTP request methods
- GET requests the specified resource; the response returns it in the entity body.
- HEAD is similar to GET, except that the response contains no body; it is used to retrieve only the headers.
- POST submits data to the specified resource for processing (such as submitting a form or uploading a file). The data is carried in the request body.
- PUT replaces the contents of the specified document with data sent from the client to the server.
- DELETE requests that the server delete the specified resource.
4. The difference between GET and POST
- Going back or reloading in the browser is harmless for GET, whereas for POST the request is submitted again.
- GET requests are actively cached by the browser, while POST requests are not unless caching is set up manually.
- GET request parameters are preserved in full in the browser history, while POST parameters are not.
- Parameters sent in the URL of a GET request are limited in length in practice, because browsers and servers cap URL length, while a POST body has no such limit.
- GET parameters are passed in the URL, while POST parameters are placed in the request body.
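The last point can be illustrated with Python's standard `urllib`. This is a small sketch that only constructs the two request objects and never sends them; the host `example.com` is a placeholder assumed for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"name": "tom", "password": "1234", "realName": "tomson"}
encoded = urlencode(params)  # "name=tom&password=1234&realName=tomson"

# GET: the parameters travel in the query string of the URL.
get_req = Request(
    "http://example.com/chapter17/user.html?" + encoded,  # placeholder host
    method="GET",
)

# POST: the same parameters travel in the request body instead.
post_req = Request(
    "http://example.com/chapter17/user.html",
    data=encoded.encode("ascii"),
    method="POST",
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Because the GET parameters are part of the URL, they end up wherever URLs end up (history, bookmarks, server logs), which is why the password in this example is better sent with POST.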
5. HTTP status codes
A status code consists of three digits. The first digit defines the category of the response, and there are five categories:
- 1xx: Informational -- the request has been received and processing continues
- 2xx: Success -- the request has been successfully received, understood, and accepted
- 3xx: Redirection -- further action must be taken to complete the request
- 4xx: Client error -- the request has a syntax error or cannot be fulfilled
- 5xx: Server error -- the server failed to fulfill a legitimate request
For example, two error status codes that are commonly seen:
403 Forbidden // Access to the requested page is forbidden
404 Not Found // The requested resource does not exist, for example because a wrong URL was entered
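The first-digit rule is easy to express in code. The sketch below maps a status code to its category and looks up the standard reason phrase with Python's `http.HTTPStatus`; the `describe` function and the `CATEGORIES` table are names invented for this illustration.

```python
from http import HTTPStatus

# Category names keyed by the first digit of the status code.
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return the code, its reason phrase, and its category."""
    category = CATEGORIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not every number in a range is a registered status code.
        phrase = "(no standard reason phrase)"
    return f"{code} {phrase}: {category}"


for code in (200, 403, 404, 500):
    print(describe(code))
```

A client that meets an unfamiliar code can still react sensibly by looking at the first digit alone, which is what `code // 100` does here.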
6. Persistent connection
1. Why do you need persistent connections
In the initial version of the HTTP protocol, the TCP connection was closed after every HTTP exchange. At the time this was not a big problem, because the transfers were all small text documents. However, as HTTP became popular, documents containing many images became more and more common. For example, when a browser loads an HTML page containing multiple images, it requests not only the HTML page itself but also the other resources the page contains. Each of those requests then causes a TCP connection to be set up and torn down unnecessarily, which increases the communication overhead.
2. Features of persistent connection
To solve this TCP connection problem, HTTP/1.1 and some HTTP/1.0 implementations introduced persistent connections (HTTP Persistent Connections, also known as HTTP keep-alive or HTTP connection reuse). The defining feature of a persistent connection is that the TCP connection stays open as long as neither end explicitly asks to close it.
The advantage of persistent connections is that they cut the extra overhead of repeatedly establishing and closing TCP connections and reduce the load on the server. The time saved on that overhead also lets HTTP requests and responses complete earlier, so Web pages display faster.
In HTTP/1.1, all connections are persistent by default, but this was never standardized in HTTP/1.0. Some HTTP/1.0 servers implement persistent connections through non-standard means, so not every server can be assumed to support them. Naturally, the client also needs to support persistent connections, not just the server.
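As a rough illustration, the following Python sketch starts a throwaway local server with the standard `http.server` module and sends three requests over a single `http.client.HTTPConnection`. The handler, the paths, and the `seen_ports` list are all made up for the example; recording the client's port on the server side is just one simple way to check that the same TCP connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_ports = []  # client port observed by the server for each request


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # makes http.server keep connections open

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # On a kept-alive connection the client needs Content-Length
        # to know where this response ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
results = []
for path in ("/index.html", "/a.png", "/b.png"):
    conn.request("GET", path)
    resp = conn.getresponse()
    results.append((resp.status, resp.read().decode("ascii")))
conn.close()

server.shutdown()
server.server_close()

print(results)
print("distinct client ports:", len(set(seen_ports)))
```

If the connection had been reopened for each request, the server would normally see a different client port each time; with keep-alive it sees the same one for all three.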
7. Pipelining
Persistent connections make it possible for most requests to be pipelined. Previously, after sending a request, the client had to wait for the response before sending the next request. With pipelining, the next request can be sent immediately without waiting for the response.
This makes it possible to send multiple requests back to back instead of waiting for each response in turn. In layman's terms, the requests are sent as one batch and the responses come back as one batch. Pipelining requires a persistent connection.
Suppose a browser requests an HTML page containing 10 images. Using a persistent connection lets the requests finish faster than opening a separate connection for each one, and pipelining is faster still. The more requests there are, the more pronounced the time difference becomes. Without pipelining, the client sends request A on the TCP connection, waits for the server's response, then sends request B, and so on. With pipelining, the browser can send all ten requests without waiting, but the server still answers in order: it responds to request A first, and then to request B once A is complete.
So with a persistent connection alone, the messages on a connection flow like this:
request1->response1->request2->response2->request3->response3
Pipeline sending becomes something like this:
request1->request2->request3->response1->response2->response3
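The second ordering can be tried out locally. The sketch below, which assumes a throwaway server built with Python's `http.server`, writes three GET requests to one raw socket in a single `sendall` call and only then starts reading. The handler, the paths, and the simplified `read_response` helper (it only understands `Content-Length`) are assumptions made for the example, not a general HTTP client.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode("ascii")  # echo the path so order is visible
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def read_response(reader):
    """Read one response: status line, headers, then Content-Length bytes."""
    status_line = reader.readline().decode("ascii").strip()
    length = 0
    while True:
        line = reader.readline().decode("ascii").strip()
        if not line:  # blank line: end of the headers
            break
        name, _, value = line.partition(":")
        if name.lower() == "content-length":
            length = int(value)
    return status_line, reader.read(length).decode("ascii")


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

paths = ["/request1", "/request2", "/request3"]
with socket.create_connection(server.server_address) as sock:
    # Send all requests back to back, before reading any response.
    pipelined = "".join(
        f"GET {path} HTTP/1.1\r\nHost: localhost\r\n\r\n" for path in paths
    )
    sock.sendall(pipelined.encode("ascii"))
    reader = sock.makefile("rb")
    responses = [read_response(reader) for _ in paths]
    reader.close()

server.shutdown()
server.server_close()

for status_line, body in responses:
    print(status_line, body)
```

The responses come back in the same order the requests were sent, which matches the request1, request2, request3 then response1, response2, response3 pattern shown above.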