@alyssaq
Last active February 13, 2024 07:59

HTTP Fundamentals 2

HTTP Connections

How does traffic flow through the Internet? What happens in the network layers in an HTTP transaction?

Network Layers

HTTP = application layer protocol (OSI 7) - allows applications to communicate over the network. E.g. a web browser talking to a web server (Apache).
The HTTP specification does not say how messages move across the network and reach the server. That's where the lower-layer protocols come into play.

TCP (Transmission Control Protocol, reliable) = transport layer protocol (OSI 4). Most HTTP traffic travels over TCP. When the user types a URL in the browser, it opens a TCP socket to the default port 80 and just starts writing data into the socket. The TCP layer accepts the data and ensures it gets delivered to the server, resending anything lost in transit. TCP also provides flow control, which ensures data is not sent faster than the receiver can process it. UDP (User Datagram Protocol, unreliable) does not guarantee delivery, needs no handshaking and is better suited to time-sensitive apps where dropped packets are preferable to delayed packets.
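
The contrast above can be sketched with Python's standard socket module. This is a loopback toy, not real Internet traffic: the echo server stands in for a remote host, and the UDP datagram is sent to the "discard" port (9) purely to show that no handshake or delivery guarantee is involved.

```python
import socket
import threading

# --- tiny TCP echo server on an ephemeral loopback port ---
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def echo_once():
    conn, _ = server.accept()        # completes the 3-way handshake
    conn.sendall(conn.recv(1024))    # echo back whatever arrives
    conn.close()

threading.Thread(target=echo_once, daemon=True).start()

# --- TCP client: connect() performs the handshake, then data flows ---
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello over TCP")
reply = client.recv(1024)
client.close()
server.close()
assert reply == b"hello over TCP"    # TCP delivered the stream intact

# --- UDP by contrast: no connection, no handshake, no guarantee ---
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"fire and forget", ("127.0.0.1", 9))  # succeeds even if nothing listens
udp.close()
```

Note that `SOCK_STREAM` vs `SOCK_DGRAM` is the whole difference at the API level; reliability, ordering and flow control all come from choosing TCP.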

IP (Internet Protocol) = network layer (OSI 3). While TCP takes care of error detection, flow control and overall reliability, IP is responsible for breaking data into packets (datagrams) and moving them through the switches, routers, gateways and other network devices around the world. The destination? An IP address.
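
To make "packets addressed to an IP address" concrete, here is an illustrative sketch that builds a 20-byte IPv4 header by hand (per RFC 791), including the header checksum. The addresses and field values are made-up examples; real stacks build this for you.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words (RFC 791)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_ipv4_header(src: str, dst: str, payload_len: int) -> bytes:
    version_ihl = (4 << 4) | 5               # IPv4, 5 x 32-bit words = 20 bytes
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                                   # DSCP/ECN
        20 + payload_len,                    # total length
        0,                                   # identification
        0,                                   # flags + fragment offset
        64,                                  # TTL
        6,                                   # protocol: 6 = TCP
        0,                                   # checksum placeholder
        bytes(map(int, src.split("."))),     # source IP address
        bytes(map(int, dst.split("."))),     # destination IP address
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]

# Example addresses are arbitrary; payload_len would be the TCP segment size.
header = build_ipv4_header("192.168.0.2", "203.0.113.7", payload_len=40)
assert len(header) == 20
assert header[0] == 0x45                     # version 4, header length 5 words
assert ipv4_checksum(header) == 0            # a valid header checksums to zero
```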

Ethernet and IEEE 802.11 (WiFi) for LANs = data link layer (OSI 2). Time for the IP packets to travel across a wire, wireless network or satellite link. This layer deals with the actual transfer of frames between devices on the same LAN. It doesn't care about the final destination; like a traffic cop, it specifies how to detect and recover from collisions.

Below is a traffic flow diagram showing the protocols and protocol data units at each layer.

TCP handshake

Before HTTP messages can start to flow, TCP messages are exchanged to establish a connection between the client and server - handshaking.

Below is a Wireshark capture of my browser making a GET request to google.com.
The first three TCP messages, the 3-way TCP handshake, ensure that both the client (me) and the server (google) agree on how to communicate. Once that completes, my HTTP GET message is sent. My HTTP request message is also wrapped in TCP, IPv4 and Ethernet headers - that's the layered communications stack.
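
The same sequence can be sketched in Python: `connect()` is where the kernel performs the SYN / SYN-ACK / ACK handshake, and only afterwards is the HTTP GET written into the socket. A tiny loopback server with a canned response stands in for google.com here, so the snippet is self-contained.

```python
import socket
import threading

RESPONSE = (b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: 2\r\n"
            b"Connection: close\r\n\r\n"
            b"ok")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    conn, _ = server.accept()
    conn.recv(4096)                  # read the request (contents ignored here)
    conn.sendall(RESPONSE)
    conn.close()

threading.Thread(target=serve_once, daemon=True).start()

# The 3-way handshake happens inside create_connection/connect():
sock = socket.create_connection(("127.0.0.1", port))

# Only now does the HTTP request flow over the established connection.
request = (b"GET / HTTP/1.1\r\n"
           b"Host: 127.0.0.1\r\n"
           b"Connection: close\r\n\r\n")
sock.sendall(request)

reply = b""
while chunk := sock.recv(4096):
    reply += chunk
sock.close()
server.close()

assert reply.startswith(b"HTTP/1.1 200")
```

Everything below the `sendall()` call - TCP segmentation, IPv4 packets, Ethernet frames - is handled by the lower layers, exactly as in the capture above.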

Parallel Connections

User agents, aka web browsers, can open multiple, parallel connections to a server. Despite the HTTP/1.1 specification stating that a "single-user client SHOULD NOT maintain more than 2 connections with any server or proxy", browsers have long since exceeded this guideline. The table in the section below shows the number of connections per server supported by current browsers for HTTP/1.1.

To achieve more parallel downloads, you can obtain resources from different servers with multiple domain names. So, even if a browser is limited to 4 parallel connections, you can have 4 parallel requests to images.mydomain.com and another 4 parallel requests to bigimages.mydomain.com. Ultimately, your DNS records could point all those requests to the same physical server.
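
This "domain sharding" trick can be sketched as a simple round-robin assignment. The hostnames follow the example above and the image paths are made up; the point is only that a per-host connection limit of 4 is respected while 8 resources download in parallel.

```python
from itertools import cycle

# Two shards, as in the example above; DNS could point both at one server.
shards = ["images.mydomain.com", "bigimages.mydomain.com"]

# Eight hypothetical images to fetch for one page.
images = [f"/img/photo{i}.jpg" for i in range(8)]

# Round-robin each resource onto a shard.
assignments = list(zip(cycle(shards), images))
urls = [f"http://{host}{path}" for host, path in assignments]

# With a 4-connections-per-host browser limit, each shard stays within it,
# yet all 8 requests can be in flight at once.
per_shard = {h: sum(1 for host, _ in assignments if host == h) for h in shards}
assert set(per_shard.values()) == {4}
```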

Persistent Connections

With the numerous requests per page, the overhead generated by TCP handshakes and the in-memory data structures required for each socket connection, persistent connections became the default connection type in HTTP/1.1. A persistent connection stays open after the completion of one request-response transaction, leaving an already-open socket for subsequent requests. It also avoids repeatedly paying the cost of TCP slow start, the strategy that ramps up the send rate gradually to avoid sending more data than the network can handle, so persistent connections perform better over time. Overall, they reduce memory and CPU usage, network congestion and latency, and improve the response time of a page. However, servers can only serve a finite number of connections before being overwhelmed, and can be configured to refuse persistent connections. This is also done for security: it helps prevent denial-of-service attacks in which someone scripts thousands of open, unused connections. Most servers are configured to close a persistent connection if it is idle for some time (e.g. 30 seconds). User agents can also close connections after a period of idle time.
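
Keep-alive reuse can be demonstrated end to end with Python's standard `http.server` and `http.client` over loopback. The server and its responses are illustrative stand-ins; note that `conn.sock` is a CPython implementation detail used here only to show that both requests rode the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"        # keep-alive is the default in 1.1

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # lets the connection persist
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):        # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/first")
first = conn.getresponse(); first.read()
sock_before = conn.sock                  # socket after request 1 (CPython detail)

conn.request("GET", "/second")           # reuses the same open socket
second = conn.getresponse(); second.read()

assert first.status == 200 and second.status == 200
assert conn.sock is sock_before          # one TCP connection served both requests
conn.close()
server.shutdown()
```

Only one handshake was paid for two transactions; a server that refused persistence would instead answer with `Connection: close` and force a new socket each time.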

Shared servers that host hundreds of websites tend to be configured to refuse persistent connections. Every HTTP response would then include "Connection: close" in its headers to inform the client that it is not allowed to make a second request on the same connection.

The HTTP specification also allows for pipelined connections, but they are not as widely supported as parallel and persistent connections. With pipelining, a user agent can send multiple HTTP requests on a single connection without waiting for the first response. Pipelining allows more efficient packing of requests into packets and can reduce latency.
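
Pipelining can be sketched with a raw socket: both GETs are written back-to-back before any response is read, and the server answers them in order on the same connection. The tiny loopback server below is an illustrative stand-in, not how a real origin server is written.

```python
import socket
import threading

RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def serve():
    conn, _ = server.accept()
    data, answered = b"", 0
    while answered < 2:
        data += conn.recv(4096)
        # Each request ends with a blank line; answer them in arrival order.
        while b"\r\n\r\n" in data and answered < 2:
            _, data = data.split(b"\r\n\r\n", 1)
            conn.sendall(RESPONSE)
            answered += 1
    conn.close()

threading.Thread(target=serve, daemon=True).start()

sock = socket.create_connection(("127.0.0.1", port))
request = b"GET /%d HTTP/1.1\r\nHost: x\r\n\r\n"

# The pipelined part: both requests go out before reading any response.
sock.sendall((request % 1) + (request % 2))

replies = b""
while len(replies) < 2 * len(RESPONSE):
    replies += sock.recv(4096)
sock.close()
server.close()

assert replies.count(b"HTTP/1.1 200") == 2   # both responses, one connection
```

HTTP/1.1 requires responses to come back in request order, which is why pipelining suffers head-of-line blocking and saw poor real-world adoption.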

Number of parallel connections supported by browsers for HTTP/1.1

Reference: Browserscope, Steve Souders

| Browser      | HTTP/1.1 |
| ------------ | -------- |
| IE 6,7       | 2        |
| IE 8,9       | 6        |
| IE 10        | 8        |
| Firefox 2    | 2        |
| Firefox 4+   | 6        |
| Safari 3,4   | 4        |
| Chrome 1,2   | 6        |
| Chrome 3     | 4        |
| Chrome 4+    | 6        |
| Android 2.2  | 4        |
| Android 2.3  | 8        |
| Android 4    | 6        |
| iPhone 2     | 4        |
| iPhone 3     | 6        |
| iPhone 4     | 4        |
| iPhone 5     | 6        |
| Opera 10.51+ | 8        |