Mastering HTTP
History of HTTP
HTTP (Hypertext Transfer Protocol) is an application-layer protocol used to transfer hypertext (such as web pages) over the internet.
Version 0.9 (Early)
Used only for simple document browsing, with extremely limited functionality.
Version 1.0 (Official)
Introduced request headers and response headers, supporting more data types (e.g., images, CSS, etc.).
Supported different HTTP methods: GET, POST, HEAD.
Responses included status codes (e.g., 200, 404, etc.).
The TCP connection was closed after every request/response, so each new request required a fresh TCP connection, incurring significant connection-setup overhead (high latency).
HTTP/1.1
Persistent Connection: By default, the TCP connection remains open for multiple requests, reducing connection setup overhead.
Connection: keep-alive | close
Pipelining (rarely used in practice): allows the client to send multiple requests without waiting for the previous responses, but the server must still respond in order, so its practical benefit is limited and it still suffers from head-of-line blocking.
Head-of-Line Blocking: within the same TCP connection, requests must be processed sequentially. If the request at the head of the line gets stuck, later requests must wait, even if the server has already finished processing them.
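Persistent connections can be observed directly with Python's standard library. This sketch spins up a throwaway local server (the handler and paths are made up for illustration); because both sides speak HTTP/1.1 with a correct Content-Length, http.client sends both requests over the same TCP connection instead of reconnecting.

```python
# Sketch: HTTP/1.1 keep-alive in practice. A throwaway local server answers
# two requests that arrive over the *same* TCP connection, which http.client
# keeps open by default under HTTP/1.1.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
for path in ("/a", "/b"):          # two requests, one TCP connection
    conn.request("GET", path)
    resp = conn.getresponse()
    data = resp.read()             # drain the body before reusing the connection
    print(path, resp.status, data)
conn.close()
server.shutdown()
```

Note that the client must fully read each response body before issuing the next request on the same connection; otherwise the connection cannot be reused.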
Added more HTTP methods (e.g., PUT, DELETE, etc.).
Introduced the Host header, allowing multiple websites to be hosted on a single server.
Supported chunked transfer encoding.
Due to severe head-of-line blocking in pipelining, complicated implementation, and poor compatibility, most HTTP/1.1 implementations actually did not enable pipelining. Instead, they simulated concurrency by opening multiple TCP connections:
For example, when a web page loads, it establishes multiple TCP connections (typically 6–8), and sends requests sequentially on each connection.
HTTP/2
Binary Protocol: No longer text-based, but a binary format that is more efficient, compact, and faster to parse.
Multiplexing: Allows multiple requests and responses to be sent concurrently over a single TCP connection, eliminating HTTP/1.1's head-of-line blocking at the application layer (blocking at the TCP layer remains; see HTTP/3).
Header Compression (HPACK): Compresses HTTP headers, reducing redundant data transfer.
Server Push: The server can proactively push resources to the client without the client explicitly requesting them.
Stream Prioritization: Allows setting priorities for different requests to optimize resource loading order.
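The binary framing that underpins these features is simple to sketch. Per the HTTP/2 specification, every frame begins with a fixed 9-byte header: a 24-bit payload length, an 8-bit type, 8-bit flags, and a 31-bit stream identifier (the high bit is reserved). A minimal encoder/decoder, for illustration:

```python
# Sketch of HTTP/2's binary framing layer: a fixed 9-byte frame header
# (24-bit length, 8-bit type, 8-bit flags, 31-bit stream id) followed by
# the payload.
import struct

def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    header = struct.pack(">I", len(payload))[1:]      # low 3 bytes = 24-bit length
    header += struct.pack(">BB", frame_type, flags)
    header += struct.pack(">I", stream_id & 0x7FFFFFFF)
    return header + payload

def unpack_frame(data: bytes):
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id, data[9:9 + length]

# A DATA frame (type 0x0) carrying the END_STREAM flag (0x1) on stream 1:
frame = pack_frame(0x0, 0x1, 1, b"hello")
print(unpack_frame(frame))  # -> (5, 0, 1, 1, b'hello')
```

Because every frame carries its stream id, frames from different requests can be interleaved freely on one connection and reassembled per stream, which is exactly what makes multiplexing possible.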
Except for Server Push, all features are enabled by default. To enable Server Push in nginx:
location = /index.html {
http2_push /style.css;
http2_push /script.js;
}
However, in practice, since browsers already have efficient preloading mechanisms, the actual benefit of Server Push is limited, so it should be used with caution.
HTTP/3
HTTP/2 still runs on top of TCP, a byte-stream-oriented protocol that emphasizes reliability and ordering: a single lost packet stalls every stream multiplexed on that connection until it is retransmitted. So although the application layer solved head-of-line blocking, the transport layer did not.
To solve TCP-level head-of-line blocking in HTTP/2, HTTP/3 is based on the QUIC protocol, which in turn is built on UDP instead of TCP.
Based on QUIC (Quick UDP Internet Connections): QUIC was proposed by Google and standardized by the IETF. It runs on UDP and integrates TLS encryption, multiplexing, fast connection establishment, and more.
Fully solves head-of-line blocking: Each stream is transmitted independently. If a packet is lost on one stream, it does not affect other streams.
Faster connection establishment: Because QUIC has built-in TLS, the handshake is faster (typically 1-RTT or 0-RTT).
Built-in encryption: Encryption is mandatory in QUIC, so HTTP/3 traffic is always encrypted, offering higher security.
Improved mobile network performance: Better adapts to high-packet-loss, high-latency network environments (e.g., mobile networks, Wi-Fi handoffs).
HTTPS
HTTPS = HTTP + TLS/SSL (Transport Layer Security / Secure Sockets Layer) (encryption layer)
Brief description of how it works (the simplified RSA key-exchange flow):
Client initiates a request (Client Hello).
Server responds (Server Hello) and sends its certificate (containing the public key).
Client verifies the server certificate and extracts the public key.
Client generates a random secret (the "pre-master secret", from which the symmetric session key is derived), encrypts it with the server's public key, and sends it to the server.
Server decrypts the symmetric key using its private key.
All subsequent communication between the two parties is encrypted with this symmetric key.
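The steps above can be sketched as a toy. The tiny textbook-RSA numbers and the XOR "cipher" below are purely illustrative (real TLS uses vetted algorithms and much larger keys); the point is the hybrid scheme: the symmetric key travels under the server's public key, then cheap symmetric crypto protects all the data.

```python
# Toy illustration of the hybrid encryption behind the TLS handshake.
# Textbook-RSA with tiny numbers + an XOR keystream: NOT real crypto.
import hashlib

# Server's toy RSA key pair (p=61, q=53 -> n=3233, e=17, d=2753).
n, e, d = 3233, 17, 2753

# Step 4: client picks a symmetric key and encrypts it with the public key.
session_key = 42
encrypted_key = pow(session_key, e, n)

# Step 5: server recovers the key with its private exponent.
recovered_key = pow(encrypted_key, d, n)
assert recovered_key == session_key

# Step 6: both sides derive a keystream from the shared key; XOR encrypts.
def keystream(key: int, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]

msg = b"GET /index.html HTTP/1.1"
cipher = bytes(a ^ b for a, b in zip(msg, keystream(session_key, len(msg))))
plain = bytes(a ^ b for a, b in zip(cipher, keystream(recovered_key, len(cipher))))
print(plain)  # -> b'GET /index.html HTTP/1.1'
```

The asymmetric step is slow but only runs once per handshake; everything afterwards uses the fast symmetric key, which is why TLS combines the two.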
TCP Sticky Packets
"Sticky Packet" is not an error or problem inherent in the TCP protocol itself. Instead, it is a phenomenon that developers may encounter when using TCP for data communication, because TCP is a byte-stream-oriented protocol.
Sender sends sequentially:
Packet 1: "Hello"
Packet 2: "World"
Receiver may receive:
Case 1: Receives "HelloWorld" all at once (the two packets are stuck together).
Case 2: Receives "Hello" and "World" separately as expected (no sticky packet).
TCP is a byte-stream protocol; it does not preserve application-layer message boundaries. TCP only guarantees reliable, in-order delivery of data, but it doesn't care whether the data you send at the application layer is a "single message" or "multiple messages."
Since TCP does not help us distinguish message boundaries, we must define the format and boundaries of messages ourselves at the application layer. HTTP does exactly this:
Request line + headers + blank line (\r\n\r\n): marks the end of the HTTP header.
Content-Length or Transfer-Encoding: specifies the length or transfer method of the message body, so the receiver can accurately know where the message body ends.
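Another common framing scheme is a length prefix. The sketch below (names and the 4-byte prefix size are illustrative choices) prefixes each message with its length, so the receiver can split an arbitrary byte stream back into messages no matter how TCP chunked it:

```python
# Sketch of length-prefixed framing: the application-layer fix for the
# sticky-packet phenomenon. Each message is prefixed with a 4-byte
# big-endian length so boundaries survive TCP's byte stream.
import struct

def encode(msg: bytes) -> bytes:
    return struct.pack(">I", len(msg)) + msg

def decode(buffer: bytes):
    """Return (complete messages, leftover incomplete bytes)."""
    msgs = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break                          # message not fully arrived yet
        msgs.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return msgs, buffer

# "Hello" and "World" arrive stuck together, split at an arbitrary point:
stream = encode(b"Hello") + encode(b"World")
chunk1, chunk2 = stream[:7], stream[7:]    # simulate arbitrary TCP chunks
msgs, rest = decode(chunk1)                # [] -- first message incomplete
msgs, rest = decode(rest + chunk2)
print(msgs)  # -> [b'Hello', b'World']
```

The receiver simply appends every chunk it reads to a buffer and runs the decoder over it; partial messages stay in the buffer until the rest arrives.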
What is \r\n? How is it different from a regular newline?
\r\n is CRLF: a carriage return (\r, 0x0D) followed by a line feed (\n, 0x0A). Unix text files use a bare line feed (\n) as the newline, while Windows uses \r\n; HTTP explicitly mandates CRLF as its line terminator, which is why the header section ends with \r\n\r\n.
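A quick byte-level look at CRLF in an HTTP message (the request bytes are a made-up minimal example):

```python
# CRLF (b"\r\n") separates header lines; a blank line (b"\r\n\r\n")
# terminates the header block and everything after it is the body.
raw = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\nbody"
head, body = raw.split(b"\r\n\r\n", 1)
lines = head.split(b"\r\n")
print(lines)  # -> [b'GET / HTTP/1.1', b'Host: example.com']
print(body)   # -> b'body'
```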
Three-Way Handshake & Four-Way Teardown
See this blog post