- Client sends all requests without waiting for the server responses.
- This saves at least one round trip, since the server no longer needs to wait for the client to receive a response before reading the next request.
- The client makes fewer OS calls to send the requests, since it can concatenate them into a single write.
- The server can process all requests at the same time, reducing the total latency of all requests and responses combined.
- Even if the server does not support pipelining, a client that sends all requests back-to-back still saves up to N half round trips, where N is the number of requests.
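As a minimal sketch of the client side, assuming Python and a throwaway local `http.server` instance (the paths `/a` and `/b` are made up for illustration), pipelining just means writing every request before reading any response:

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive is required for pipelining

    def do_GET(self):
        body = f"hello from {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Pipelining: send both requests in a single write, then read everything.
sock = socket.create_connection(("127.0.0.1", port))
sock.sendall(
    b"GET /a HTTP/1.1\r\nHost: localhost\r\n\r\n"
    b"GET /b HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n"
)
data = b""
while chunk := sock.recv(4096):
    data += chunk
sock.close()
server.shutdown()
print(data.decode())
```

The two responses arrive back-to-back on the same connection, in request order, after a single round trip's worth of waiting.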
- Pipelining is poorly supported by servers, proxies, and intermediaries such as antivirus software. So even if the server has a solid implementation, the client may still end up receiving a corrupted response. This is the main reason Firefox never enabled pipelining by default.
- Servers cannot process the requests in parallel by default. Per the HTTP spec, responses must be sent in the order the requests were received, and the requests themselves cannot always be processed concurrently: the first request may create resource A and the second update resource A, so the two cannot run in parallel. You therefore don't get all of the performance benefits you may see in benchmarks.
- Servers can be DDoSed if an attacker finds a request that takes significantly longer to process than others. The attacker sends the requests so that the first one is slow while the rest are fast. Since the server must send the responses in order, it has to keep the results of the remaining requests in memory until the first one finishes. This is only an issue if the server processes the requests in parallel, and it is known as head-of-line blocking (HOL blocking) at the application level.
- Servers can be DDoSed regardless, as processing multiple requests in parallel will always use more resources than processing them sequentially.
- The server cannot send a response of unknown, unbounded length. An endless data stream, for example, is not supported: the server would have to close the connection to end it, and that cannot be done while other responses are still queued behind it.
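The application-level HOL blocking above can be sketched as a small simulation, assuming Python; the handler, paths, and delays are hypothetical stand-ins for real request processing:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle(path, delay):
    # Hypothetical handler: the delay simulates request processing time.
    time.sleep(delay)
    return f"response to {path}"

# The attacker places one slow request first, then cheap fast ones.
pipeline = [("/slow", 0.2), ("/fast1", 0.0), ("/fast2", 0.0)]

with ThreadPoolExecutor() as pool:
    # The server processes all pipelined requests in parallel...
    futures = [pool.submit(handle, p, d) for p, d in pipeline]
    # ...but must send responses in request order, so the already-finished
    # results for /fast1 and /fast2 sit buffered in memory until /slow
    # completes. That buffering is the head-of-line blocking cost.
    responses = [f.result() for f in futures]

print(responses)
```

The fast responses finish almost immediately, yet nothing can be sent until the slow one at the head of the line completes.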
Note: HTTP/2 solves these pipelining drawbacks by multiplexing streams over a single connection.
HOL blocking also exists at the TCP level: if a packet is lost, it must be retransmitted, and all packets behind it are blocked until it arrives. This is not as bad as HOL blocking at the application level, which is what makes server-side pipelining unviable. HTTP/3 runs on top of UDP (via QUIC) and addresses this issue.