What This Article Is About

HTTP is the protocol behind every web request, every API call, every "fetch" in your browser. The original version (HTTP/0.9) dates from 1991; version 1.1 followed in 1997, version 2 in 2015, and version 3 in 2022. Each step solved a specific bottleneck that had become limiting.

You don't need to memorize wire formats to write web apps. But understanding what each HTTP version does and doesn't fix explains a lot of weird performance behavior, why your CDN setup matters, why mobile feels slow on some sites, and why the front-end optimization tricks of 2010 are anti-patterns today.

This is a complete walk through the three versions: what they do, what they fix, what they leave broken, and how to think about them as a developer.

What Stays the Same Across All Versions

Before diving into the differences, the surprising thing is how much hasn't changed:

Methods: GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS. Same in all versions.
Status codes: 200, 301, 404, 500. Same.
Headers: Content-Type, Authorization, Cookie. Same names and meanings.
URLs: identical.
Request/response model: the application sees the same logical thing.

What changes is the wire format and the transport. The semantics of HTTP are stable. The plumbing under it gets better.

HTTP/1.1: The Original (1997)

HTTP/1.1 is plain text, request-response. The browser opens a TCP connection to the server, sends a request, gets a response, repeat.

GET /index.html HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0
Accept: text/html

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234

<html>...</html>

Notable features:

Persistent connections: by default the connection stays open after a response, so the next request can reuse it. Saves the TCP handshake on subsequent requests.
Chunked transfer encoding: server can stream a response without knowing the total size in advance.
Pipelining: spec said you could send multiple requests without waiting. In practice, never reliably implemented and effectively dead.
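
As an illustration of the chunked transfer encoding mentioned above, here is a minimal Python sketch of how a receiver could reassemble a chunked body. The function name decode_chunked and the sample bytes are assumptions made for this example, and the sketch ignores trailers and malformed input.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hex, terminated by CRLF.
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # a zero-length chunk marks the end of the body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


# Hypothetical wire bytes: two chunks (5 and 7 bytes), then the terminator.
wire = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
decoded = decode_chunked(wire)
```

The server never had to announce a Content-Length: it simply kept sending sized chunks until it was done.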

The Problem: Head-of-Line Blocking

HTTP/1.1's biggest flaw: requests on the same connection are serialized. Request 1 must finish before request 2 starts.

If request 1 is a slow database query that takes 5 seconds, requests 2, 3, 4 on the same connection wait 5 seconds, even if they would each take 50 milliseconds.

The browser's workaround: open multiple parallel TCP connections (typically 6 per origin domain). Now you can have up to 6 requests in flight at once. But each connection has its own TCP handshake and TLS handshake (expensive), and 6 is not enough for a complex page that needs 100+ resources.
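
The arithmetic behind that example can be sketched in a few lines of Python. This is a toy model, not a browser scheduler: the function names are made up for illustration, durations are in milliseconds, and handshake costs are ignored.

```python
import heapq


def finish_times_serial(durations_ms):
    """One connection: each request starts when the previous one ends."""
    clock, finished = 0, []
    for duration in durations_ms:
        clock += duration
        finished.append(clock)
    return finished


def finish_times_parallel(durations_ms, connections=6):
    """Each request goes, in order, to whichever connection frees up first."""
    free_at = [0] * connections
    heapq.heapify(free_at)
    finished = []
    for duration in durations_ms:
        start = heapq.heappop(free_at)
        end = start + duration
        heapq.heappush(free_at, end)
        finished.append(end)
    return finished


# One slow 5-second request followed by three 50 ms requests.
durations_ms = [5000, 50, 50, 50]
serial = finish_times_serial(durations_ms)
parallel = finish_times_parallel(durations_ms)
```

On a single connection the three fast requests finish after 5050, 5100, and 5150 ms. With six connections each finishes after 50 ms, which is why browsers open several.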

The HTTP/1 Era of Front-End Hacks

To work around HTTP/1.1's connection limit, web developers spent a decade inventing workarounds:

Sprite sheets: combine 50 small images into one big image, use CSS to show the right portion. One HTTP request instead of 50.
Domain sharding: serve images from img1.example.com and img2.example.com so the browser opens 12 parallel connections instead of 6.
JS/CSS bundling: concatenate all your JavaScript into one big file. One request instead of many.
Inlining: data-URL embed images and CSS directly in HTML to avoid extra requests.
Cookie-free domains: serve static assets from a domain without cookies, so you don't waste bandwidth sending cookies on every static request.

Most of these became unnecessary or counterproductive in HTTP/2. Some optimization tools and bundlers were designed for HTTP/1.1 and now actively hurt performance. The "many small files" pattern is fine on HTTP/2 and often preferable, because each file can be cached independently.

HTTP/2: Multiplexing (2015)

HTTP/2 keeps the same semantics (methods, status codes, headers) but completely changes the wire format. It is binary, framed, and multiplexed.

Multiplexing: many requests and responses share a single TCP connection. Each request gets a stream ID. Frames from different streams interleave, then get reassembled at the destination.
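
The interleave-and-reassemble idea can be sketched as follows. This is a conceptual toy, not the real HTTP/2 frame format: frames here are plain (stream_id, chunk) tuples, the frame size of 4 bytes is arbitrary, and all names are assumptions for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def to_frames(stream_id, payload, frame_size=4):
    """Cut one response into small frames tagged with its stream ID."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Receiver side: group frames by stream ID, preserving their order."""
    streams = defaultdict(bytes)
    for stream_id, chunk in wire:
        streams[stream_id] += chunk
    return dict(streams)


# Client-initiated HTTP/2 streams use odd IDs.
responses = {1: b"<html>index</html>", 3: b"body{color:red}", 5: b"console.log(1)"}
wire = interleave(*(to_frames(sid, data) for sid, data in responses.items()))
rebuilt = reassemble(wire)
```

The wire carries frames for streams 1, 3, 5, 1, 3, 5, and so on, yet each response comes out intact on the other side.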

HTTP/1.1 vs HTTP/2 Connection Behavior

HTTP/1.1: 6 parallel TCP connections
  Conn 1: req → resp
  Conn 2: req → resp
  Conn 3: req → resp
  Conn 4: req → resp
  Conn 5: req → resp
  Conn 6: req → resp

HTTP/2: 1 TCP connection, many streams
  Stream 1
  Stream 2
  Stream 3
  Stream 4 ...

Other improvements:

Binary framing: faster to parse than text. Less ambiguity, no whitespace edge cases.
HPACK header compression: repeated headers (cookies, user-agent, accept-language) are compressed using a shared dictionary. After the first request, the same headers cost almost nothing.
Stream priorities: client can hint that some requests matter more (CSS first, then images). Servers can use this to schedule sending.
Server push: server can preemptively send resources the client hasn't asked for yet. Sounded great, never worked well in practice. Chrome disabled it in 2022, and although HTTP/3 still defines push in its spec, browsers effectively don't use it.
Required encryption (in browsers): the spec allows plain HTTP/2, but every browser only supports it over TLS. So in practice, HTTP/2 is always encrypted.

HTTP/2 made many HTTP/1 workarounds counterproductive. Concatenating JS bundles became less important. Domain sharding became actively harmful (it forces multiple connections, defeating multiplexing). Sprite sheets are no longer needed.

The TCP Problem That HTTP/2 Doesn't Fix

HTTP/2 still runs over TCP. TCP itself has its own head-of-line blocking. If a TCP packet is lost, the kernel waits for retransmission before delivering subsequent packets, even if those subsequent packets are for different HTTP/2 streams.

HTTP/2 says streams are independent. TCP says "I'll deliver bytes in order, end of story." When TCP enforces order on bytes that belong to multiple HTTP/2 streams, you have HTTP/2 streams blocking each other through TCP, even though the protocol says they shouldn't.

On a clean network this is invisible. On a lossy network (mobile, especially) it's a real performance hit. A single lost packet stalls all HTTP/2 streams sharing the connection.

So HTTP/2 fixed application-level head-of-line blocking but transport-level head-of-line blocking remained. The next version had to swap out TCP entirely.

HTTP/3: New Transport (2022)

HTTP/3 has the same semantics as HTTP/2. The wire-level multiplexing model is conceptually similar. The big change: transport. HTTP/3 replaces TCP+TLS with QUIC, a new transport protocol built on UDP.

QUIC includes:

Connection establishment (faster than TCP+TLS).
Reliability (retransmissions).
Ordering within a stream (but not across streams).
Congestion control.
Built-in encryption (TLS 1.3, integrated with the handshake).
Independent streams (a packet loss on stream 1 doesn't delay stream 2).
Connection migration (survives switching between WiFi and cellular).

Because ordering is tracked per stream rather than per connection, a loss only stalls the stream that actually lost data. The "real" multiplexing dream is finally delivered.
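
The difference between connection-wide ordering (TCP) and per-stream ordering (QUIC) can be shown with a small simulation. It is a deliberately simplified model: one packet carries data for exactly one stream, and the names Packet, delivered_over_tcp, and delivered_over_quic are invented for this sketch.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    seq: int     # position in the connection-wide send order
    stream: int  # which request/response the data belongs to


def delivered_over_tcp(packets, lost):
    """TCP hands data to the application only up to the first gap."""
    delivered = []
    for packet in sorted(packets, key=lambda p: p.seq):
        if packet.seq in lost:
            break  # everything after the gap waits for the retransmission
        delivered.append(packet)
    return delivered


def delivered_over_quic(packets, lost):
    """QUIC orders each stream separately, so a gap stalls only its own stream."""
    delivered, stalled = [], set()
    for packet in sorted(packets, key=lambda p: p.seq):
        if packet.seq in lost:
            stalled.add(packet.stream)
        elif packet.stream not in stalled:
            delivered.append(packet)
    return delivered


packets = [Packet(0, 1), Packet(1, 2), Packet(2, 1), Packet(3, 2), Packet(4, 3)]
lost = {0}  # the very first packet, belonging to stream 1, is dropped

tcp = delivered_over_tcp(packets, lost)
quic = delivered_over_quic(packets, lost)
```

With the first packet lost, the TCP model delivers nothing until the retransmission arrives, while the QUIC model still delivers everything for streams 2 and 3.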

Why UDP?

QUIC needed to ship without OS upgrades. TCP is implemented in the kernel. Changing TCP behavior or shipping a new TCP-like protocol requires every router, firewall, and operating system on the internet to update. That would take 20+ years.

UDP is "send a packet, hope it arrives". Routers and firewalls already pass UDP through. So QUIC builds reliability and ordering on top of UDP, in user space, which means it ships in browsers and servers without kernel changes. New congestion control algorithms, new features, all evolve as fast as Chrome and the major server vendors can ship them.

Connection Setup Latency

This matters more than people think, especially on mobile (high latency, frequent reconnects).

Old way (HTTPS over TCP, TLS 1.2):

1 RTT: TCP handshake (SYN, SYN-ACK, ACK).
2 RTT: TLS 1.2 handshake.
Total before first byte of HTTP request: 3 RTT.

HTTP/2 over TLS 1.3:

1 RTT: TCP handshake.
1 RTT: TLS 1.3 handshake.
Total: 2 RTT before first request.

HTTP/3 (QUIC):

1 RTT: combined QUIC + TLS handshake (TLS is integrated).
Total: 1 RTT for fresh connections.

HTTP/3 with 0-RTT resumption:

0 RTT: client sends application data in the very first packet, with cryptographic state from a previous session.
Total: 0 RTT (request data is in the same packet that opens the connection).

On a 100ms RTT link, going from 3 RTT to 0 RTT means the request goes out 300ms earlier, and so does the response. That's a meaningful difference for interactive use.
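
The round-trip counts above translate directly into milliseconds. A small sketch of the arithmetic, where the dictionary simply restates the totals listed in this section and the function name is an assumption:

```python
# Round trips spent on connection setup before the first request can be sent.
SETUP_RTTS = {
    "HTTPS over TCP, TLS 1.2": 3,
    "HTTP/2 over TLS 1.3": 2,
    "HTTP/3 (QUIC)": 1,
    "HTTP/3 with 0-RTT resumption": 0,
}


def setup_delay_ms(rtt_ms):
    """Setup delay for each option on a link with the given round-trip time."""
    return {name: rtts * rtt_ms for name, rtts in SETUP_RTTS.items()}


delays = setup_delay_ms(100)
saved = delays["HTTPS over TCP, TLS 1.2"] - delays["HTTP/3 with 0-RTT resumption"]
```

Try it with 20 ms (good broadband) and 300 ms (a poor mobile link) to see how much the gap widens as latency grows.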

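0-RTT comes with a security caveat: replay attacks. An attacker who captures the first packet can send it again, so servers should only act on 0-RTT requests that are safe to repeat (idempotent ones) and should defer or reject anything that changes state, such as a POST, until the handshake completes. HTTP defines the 425 Too Early status code for exactly this rejection.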
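
A server-side guard for early data might look roughly like this. It is a sketch under assumptions: the function name and the handshake_complete flag are hypothetical, real servers expose this state in their own ways, and the list of accepted methods is a deliberately conservative choice (PUT and DELETE are idempotent but still change state, so they are held back here).

```python
# Methods that do not change server state and are harmless if replayed.
REPLAY_SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}

TOO_EARLY = 425  # HTTP status "Too Early", defined in RFC 8470


def early_data_status(method, handshake_complete):
    """Return None to process the request, or 425 to make the client retry."""
    if handshake_complete:
        return None  # normal request, no replay risk from 0-RTT
    if method.upper() in REPLAY_SAFE_METHODS:
        return None  # safe to answer even if an attacker replays it
    return TOO_EARLY


get_in_early_data = early_data_status("GET", handshake_complete=False)
post_in_early_data = early_data_status("POST", handshake_complete=False)
post_after_handshake = early_data_status("POST", handshake_complete=True)
```

A client that receives 425 is expected to retry the request once the handshake has finished.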

Connection Migration

You're downloading a large file on WiFi. You walk out of range, your phone switches to cellular. With TCP, the connection breaks, the download fails, you start over.

QUIC connections survive this. The connection is identified by a connection ID embedded in QUIC packets, not by the source IP and port. When your IP changes, packets with the same connection ID continue to flow. The server keeps the connection state alive.
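
The lookup difference can be sketched with a toy server that files incoming packets under a key. Everything here is illustrative: packets are plain dictionaries, the addresses come from documentation ranges, and real TCP keys on the full source/destination 4-tuple rather than just the source side.

```python
from collections import defaultdict


class ToyServer:
    """Accumulates received data per connection, using a pluggable key."""

    def __init__(self, key_fn):
        self.key_fn = key_fn
        self.received = defaultdict(bytes)

    def on_packet(self, packet):
        self.received[self.key_fn(packet)] += packet["data"]


def by_address(packet):
    """TCP-style: the connection is identified by where packets come from."""
    return (packet["src_ip"], packet["src_port"])


def by_connection_id(packet):
    """QUIC-style: the connection is identified by an ID inside the packet."""
    return packet["conn_id"]


# The same download, first over WiFi, then over cellular with a new address.
wifi = {"src_ip": "192.0.2.10", "src_port": 50000, "conn_id": "c1", "data": b"part1-"}
cell = {"src_ip": "198.51.100.7", "src_port": 41000, "conn_id": "c1", "data": b"part2"}

tcp_like = ToyServer(by_address)
quic_like = ToyServer(by_connection_id)
for packet in (wifi, cell):
    tcp_like.on_packet(packet)
    quic_like.on_packet(packet)
```

The address-keyed server ends up with two unrelated fragments, while the ID-keyed server sees one continuous transfer.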

For mobile users, this is a big deal. For desktop users on stable networks, it doesn't matter much.

Side-by-Side Comparison

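Feature                | HTTP/1.1               | HTTP/2                          | HTTP/3
-----------------------+------------------------+---------------------------------+--------------------------
Released               | 1997                   | 2015                            | 2022
Transport              | TCP                    | TCP                             | QUIC (over UDP)
Format                 | Text                   | Binary, framed                  | Binary, framed
Multiplexing           | None (1 req at a time) | Yes                             | Yes (independent streams)
Header compression     | None                   | HPACK                           | QPACK
Connection setup       | 3+ RTT (with TLS 1.2)  | 2-3 RTT                         | 1 RTT (0 RTT resumed)
App-level HoL blocking | Yes                    | No                              | No
Transport-level HoL    | Yes                    | Yes                             | No
Encryption             | Optional (TLS)         | Effectively required (browsers) | Mandatory
Network migration      | No                     | No                              | Yes
Server push            | No                     | Yes (deprecated)                | In spec, rarely supported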

The Front-End Optimization Implications

If you're optimizing a website, your strategy depends on what HTTP version is in play.

For HTTP/1.1 (still common in old infrastructure): bundle aggressively. Sprite sheets help. Domain sharding might help.

For HTTP/2 (the modern default): stop bundling so aggressively. Many small files cache better and ship in parallel. Don't shard domains. Don't inline aggressively (you defeat caching). Sprite sheets are obsolete.

For HTTP/3: same advice as HTTP/2, plus you benefit on lossy networks. Mobile users especially.

The shift in philosophy: in HTTP/1, requests were expensive, so combine them. In HTTP/2 and HTTP/3, requests are cheap, so split them up for better caching and incremental delivery.
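
The caching argument is easy to quantify. The sketch below compares how many kilobytes a returning visitor re-downloads after a deploy that touches one file. The file names and sizes are invented for illustration, and compression and request overhead are ignored.

```python
def kb_redownloaded(file_sizes_kb, changed, bundled):
    """Kilobytes a returning visitor must fetch again after a deploy."""
    if bundled:
        # Any change invalidates the whole bundle in the browser cache.
        return sum(file_sizes_kb.values()) if changed else 0
    # Split files: only the files that actually changed are fetched again.
    return sum(file_sizes_kb[name] for name in changed)


# Hypothetical site assets.
sizes_kb = {"vendor.js": 300, "app.js": 80, "checkout.js": 40, "styles.css": 30}
changed = {"checkout.js"}

bundled_cost = kb_redownloaded(sizes_kb, changed, bundled=True)
split_cost = kb_redownloaded(sizes_kb, changed, bundled=False)
```

In this made-up case a one-file change costs 450 KB with a single bundle and 40 KB with separate files, which is the trade HTTP/2 and HTTP/3 make affordable.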

HPACK and QPACK (Header Compression)

HTTP/1 sends every header in plain text on every request. A typical request might be 800 bytes of headers (cookies, user-agent, accept-language, referer, etc.) plus a small body. Most of this is repeated identically on every request.

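HPACK (HTTP/2) and QPACK (HTTP/3) maintain a shared dictionary on both sides: a static table of common header fields plus a dynamic table built up over the life of the connection. Headers already in a table are sent as tiny indices. The first request still carries most headers in full, but subsequent requests are nearly free for headers.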

Rough numbers: a typical HTTP/1 request has 500 to 1500 bytes of headers. The same request in HTTP/2 after warm-up: 50 to 200 bytes. On mobile with cellular bandwidth costs, this matters.
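
The shared-dictionary idea can be illustrated with a toy encoder. To be clear, this is not HPACK or QPACK: there is no static table, no Huffman coding, and no eviction, and the byte costs in wire_size are assumptions chosen only to show the trend. All names are invented for the example.

```python
class ToyHeaderTable:
    """Both peers keep one of these and update it in lockstep."""

    def __init__(self):
        self.entries = []  # (name, value) pairs seen so far on this connection

    def encode(self, headers):
        encoded = []
        for pair in headers:
            if pair in self.entries:
                encoded.append(("idx", self.entries.index(pair)))
            else:
                self.entries.append(pair)
                encoded.append(("lit", pair))
        return encoded

    def decode(self, encoded):
        headers = []
        for kind, value in encoded:
            if kind == "idx":
                headers.append(self.entries[value])
            else:
                self.entries.append(value)
                headers.append(value)
        return headers


def wire_size(encoded):
    """Assumed cost: 1 byte per index, name + value + 2 bytes per literal."""
    return sum(1 if kind == "idx" else len(value[0]) + len(value[1]) + 2
               for kind, value in encoded)


headers = [
    ("user-agent", "Mozilla/5.0"),
    ("accept-language", "en-US,en;q=0.9"),
    ("cookie", "session=abc123"),
]
sender, receiver = ToyHeaderTable(), ToyHeaderTable()
first = sender.encode(headers)
second = sender.encode(headers)
first_decoded = receiver.decode(first)
second_decoded = receiver.decode(second)
```

The first request sends every header as a literal. The second sends three one-byte indices, and the receiver still reconstructs the full header list.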

Should You Use HTTP/3?

For public-facing websites: yes if your CDN supports it. Cloudflare, Fastly, CloudFront all do. The performance gain is real on mobile and lossy networks. There's no real downside if your CDN handles fallback to HTTP/2 when HTTP/3 fails (corporate firewalls blocking UDP, etc.).

For internal services: HTTP/2 is usually enough. The QUIC benefits matter when networks are unreliable. Inside a data center, networks are reliable. The complexity of HTTP/3 isn't worth the marginal gain.

For APIs: if the clients are servers (server-to-server), HTTP/2 over TCP is mature and fine. If the clients are mobile apps, HTTP/3 is increasingly worth it.

Operational Considerations

Load balancers and proxies: not all support HTTP/2 or HTTP/3 end-to-end. Many setups have HTTPS at the edge, then plain HTTP/1 internally. This works fine but means internal services don't benefit from HTTP/2 multiplexing.

Connection coalescing: HTTP/2 lets a browser reuse one connection for multiple hostnames if the TLS certificate covers all of them and they resolve to the same IP address. Improves performance, can confuse debugging.
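
The coalescing decision can be approximated as a two-part check. This sketch assumes the commonly applied rule that the certificate must cover the new hostname and that the new hostname must resolve to an address the existing connection already uses. The function names are hypothetical, and real browsers apply further checks.

```python
def cert_covers(host, san):
    """Does one certificate name (exact or single-label wildcard) match host?"""
    if san.startswith("*."):
        # "*.example.com" matches "img.example.com" but not "example.com".
        return host.split(".", 1)[1:] == [san[2:]]
    return host == san


def can_coalesce(new_host, new_host_ips, connection_ips, cert_sans):
    """Can a request for new_host reuse an already open HTTP/2 connection?"""
    name_ok = any(cert_covers(new_host, san) for san in cert_sans)
    ip_ok = bool(set(new_host_ips) & set(connection_ips))
    return name_ok and ip_ok


# Hypothetical certificate names and documentation-range addresses.
sans = ["example.com", "*.example.com"]
same_ip = can_coalesce("img.example.com", ["203.0.113.5"], ["203.0.113.5"], sans)
other_ip = can_coalesce("img.example.com", ["203.0.113.9"], ["203.0.113.5"], sans)
other_name = can_coalesce("cdn.example.net", ["203.0.113.5"], ["203.0.113.5"], sans)
```

This is why a request to one subdomain can show up on a connection that was opened for another, which is the debugging surprise mentioned above.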

Observability: tools that parse HTTP/1 plain text won't work on HTTP/2 or HTTP/3 binary. You need versions that decode the framed format. Wireshark handles all three. Some older logging proxies don't.

UDP and corporate firewalls: some networks block all UDP except DNS. HTTP/3 fails. Browsers fall back to HTTP/2 over TCP. So HTTP/3 is best-effort: it's an optimization, not a hard requirement.

QUIC CPU cost: QUIC runs in user space and encrypts every packet itself, so it typically costs more CPU per byte than TLS-over-TCP, which benefits from decades of kernel tuning and NIC offloads. At scale, this matters. CDNs have invested heavily in optimizing QUIC implementations.

0-RTT replay attacks: 0-RTT data can be replayed by an attacker. The server must reject 0-RTT for non-idempotent requests (POST that modifies state). Frameworks usually handle this, but be aware.

Edge Cases and Gotchas

WebSocket on HTTP/2: HTTP/2 originally didn't support WebSocket upgrades. RFC 8441 fixed this with the "extended CONNECT" method, but support is uneven.

Connection limits: HTTP/2 caps the number of concurrent streams per connection. Each side advertises its limit in the SETTINGS_MAX_CONCURRENT_STREAMS setting, and servers commonly set it to around 100. If you hit the cap, additional requests queue.

Long-running streams and timeouts: HTTP/2 streams that stay open a long time (server-sent events, long-polling) can run into idle timeouts. Tune these.

Server-Sent Events on HTTP/2: works fine, and you avoid the HTTP/1 6-connection limit problem.

Mixing protocol versions: a load balancer might speak HTTP/2 to clients but HTTP/1 to backends. This is normal and fine for most apps. But some features might not work end-to-end: server push cannot cross an HTTP/1 hop, and streaming responses such as server-sent events depend on every proxy in the chain passing data through without buffering it.

0-RTT and replay-safe headers: if your app accepts 0-RTT, ensure the request handler tolerates the same request being processed twice (because of replay). For idempotent GETs, no problem. For non-idempotent operations, reject early replay.

How to Tell What Version a Site Uses

In Chrome DevTools: Network tab, right-click any column header, enable "Protocol". Now each request shows http/1.1, h2, or h3.

From the command line: curl --http3 -I https://example.com tells you whether HTTP/3 works. This requires a curl build with HTTP/3 support; check the feature list printed by curl --version.

Most large public sites speak HTTP/2 to all clients and HTTP/3 to clients that ask. Some smaller sites and most internal services still default to HTTP/1.1.

The One Thing to Remember

Each HTTP version solved a specific bottleneck. HTTP/1.1's bottleneck was head-of-line blocking at the application level. HTTP/2 fixed that with multiplexing but kept TCP's head-of-line blocking. HTTP/3 swapped TCP for QUIC to fix that too, plus added 0-RTT and connection migration. The actual semantics (methods, headers, status codes) haven't changed since 1997. As a developer, you mostly don't care which version your stack uses; just enable the latest your CDN supports, drop the old HTTP/1-era bundling/sharding hacks, and trust that the protocol does its job. The pattern is consistent: every version trades implementation complexity for less round-trip latency.