@mathie
Last active December 4, 2022 17:02
How the Internet Works

footer: © 2015 Graeme Mathieson. CC BY-SA 4.0.
slidenumbers: true

Type “google.com” into your browser and hit enter

What happens next?

^ I haven’t done any interviewing for a while but I went through a period of growth in one of the companies I worked for where we were feverishly expanding the development team, so we had to be a little more systematic in our approach to interviewing. Instead of just having an open conversation with candidates to see where it led (which is what I’d previously done in such situations), I wound up preparing a ‘standard’ set of questions. It took a few goes, but eventually I settled on a favourite question for the technical portion of the interview:

^ When I pull up my favourite Internet browser, type “google.com” into the address bar, and press return, what happens?

^ I reckon it’s a doozy of a technical question. There’s so much breadth and depth in that answer. We could talk for hours on how the browser decides whether you’ve entered something which can be turned into a valid URL, or whether it’s intended to be a search term. From there, we can look at URL construction, then deconstruction, to figure out exactly what resource we’re looking for. Then it’s on to name resolution, to figure out who we should be talking to.

^ And then it gets really interesting. We start an HTTP conversation, which is encapsulated in a TCP session which is, in turn, encapsulated in a sequence of IP packets, which are, in turn, encapsulated in packets at the data link layer (some mixture of Ethernet and/or wireless protocols), which – finally! – causes some bits to fly through the air, travel along a copper wire, or become flashes of light through fibre optic. Now our request emerges – fully formed again, if it survived the journey – at the data centre. The HTTP request is serviced (on which entire bookshelves have been written) and the response follows the same perilous journey back to the browser.

^ The story isn’t over, though. Once the browser has received the response, it still has to interpret it, create an internal representation of the document, apply visual style, and render it in a window. And then there’s the client side JavaScript code to be executed. It’s an exciting story: one of daring journeys, lost packets, unanswerable questions, and the teamwork of many disjoint routers, distributed across the Internet.


How the Internet works

Graeme Mathieson

Email me: [mathie@woss.name]

Tweet me: [@mathie]

^ I’m Graeme. I’m a software developer. I’ve spent most of the past decade building web apps with Ruby on Rails, and lately I’ve been dabbling in iOS development in Swift.


google.com ⏎

^ So, we’ve launched our favourite web browser. Our computer has read some bits from the spinning rust, loaded it into main memory, and started executing those bits on the CPU. The browser window opens, ready to accept our bidding. We depress switches on the USB keyboard attached to the computer, which are interpreted as UTF-8 characters, and start to appear in the browser address bar. We hit “enter”. What happens next?


Is it a URL?

  • Yep. OK, cool, my work here is done.
  • Kinda. Well, let’s turn it into a well formed URL.
  • Nope. OK, I’m gonna assume you meant to search for something. Let’s turn it into a well formed URL for a web search.

^ First of all, we have to figure out if the string we’ve entered is a valid URL. If it is, we’re good to go. If not, we’ve got a little bit more work to do. A fully formed URL contains several components, but as human beings, we’re lazy, and we tend to omit some of them for brevity, so the computer has to figure out what we really mean. And modern browsers have combined the search input field and the address bar into a single input field. So we need to figure out whether the user intended to enter a (partial) URL, or a search term. In our case, “google.com” looks like it’s a bit of a URL, so we just need to figure out how to complete it.
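
^ Here’s a rough sketch of that decision in Python. The regular expressions, the function names and the search.example search URL are all my own inventions for illustration; real browsers use far more elaborate rules than this.

```python
import re
from urllib.parse import quote_plus

# Hypothetical heuristics, not what any real browser does.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")
HOSTLIKE_RE = re.compile(r"^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+(:\d+)?(/.*)?$")


def classify(text):
    """Guess whether the input is a full URL, a partial URL or a search term."""
    text = text.strip()
    if SCHEME_RE.match(text):
        return "url"
    if " " not in text and HOSTLIKE_RE.match(text):
        return "partial-url"
    return "search"


def to_url(text, search_template="https://search.example/?q={}"):
    """Turn whatever was typed into a well formed URL."""
    text = text.strip()
    kind = classify(text)
    if kind == "url":
        return text
    if kind == "partial-url":
        # Fill in the scheme, and the root path if none was given.
        return "http://" + text + ("" if "/" in text else "/")
    return search_template.format(quote_plus(text))


print(to_url("google.com"))
print(to_url("how does dns work"))
```

^ The sketch fills in plain http for a partial URL; deciding whether to upgrade that to https is a separate step.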


HTTP Strict Transport Security

Does this site prefer HTTPS?

  • Strict-Transport-Security header from a previous request?
  • In the browser’s list of HSTS preloads?

^ Secure communication is generally for the win. With HTTP, we specify that a secure connection is required through the scheme in the URL, choosing https instead of http. However, in this case, we’ve omitted the scheme entirely, so how does the browser know whether to prefer a secure connection or not? That’s where HTTP Strict Transport Security (HSTS) comes in. If the browser has previously requested a page from this particular domain over https, then the response may have included a Strict-Transport-Security header, telling the browser that future requests should default to https. If we’ve never made a request to this particular domain, then the browser checks its built-in list of well-known domains that prefer https.


HTTP Strict Transport Security

Does this site prefer HTTPS?

  • Yep. OK, set the URL scheme to https.
  • Nope. Fine then. If you don’t care for security…

^ “google.com” is in the HSTS preload list, so we’ll default to using https for the scheme.
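
^ As a minimal sketch of that check, assuming a tiny stand-in for the preload list and an in-memory dictionary instead of the browser’s real HSTS storage (both made up for this example, as are the function names). It only understands the max-age directive and ignores includeSubDomains.

```python
import time

# Illustrative stand-ins, not the real preload list or browser storage.
HSTS_PRELOAD = {"google.com"}
hsts_seen = {}  # hostname -> expiry time (seconds since the epoch)


def remember_hsts(hostname, header_value, now=None):
    """Record a Strict-Transport-Security header seen on an https response."""
    now = time.time() if now is None else now
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            hsts_seen[hostname] = now + int(value.strip('"'))


def choose_scheme(hostname, now=None):
    """Pick the scheme to use when the user did not type one."""
    now = time.time() if now is None else now
    if hostname in HSTS_PRELOAD:
        return "https"
    if hsts_seen.get(hostname, 0) > now:
        return "https"
    return "http"


print(choose_scheme("google.com"))
```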


^ Now we’ve managed to construct a well formed URL and we can continue to make the request.


Browser cache

Is the URL in the browser cache?

  • Yep. Let’s check it’s still valid.
  • Nope. Well, we’re going to have to fetch it.

^ Web browsers keep a cache of previous requests. HTTP has well-defined semantics for deciding when a particular URL’s response can be cached, how that cache is expired, and whether the validity of cached content needs to be verified.


Browser cache

Is the cached content still valid?

Expires

Cache-Control: max-age

  • Yep. Awesome. We might skip a network request!
  • Nope. OK, let’s check in with the server.

^ The Expires header specifies an absolute timestamp for when a response expires. If we haven’t yet reached that point in time, then the response is still considered fresh. If we’ve passed that point in time, then the response might still be valid, but we’ll need to double check. The maximum age (Cache-Control: max-age) specifies a relative time (in seconds) for which a response should be considered fresh, and it takes precedence over Expires when both are present. If we’ve reached that time, then the response definitely needs to be double checked for validity.
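
^ A simplified freshness check might look something like this. The is_fresh function and its arguments are hypothetical, and it glosses over details a real cache handles (the Age and Date headers, heuristic freshness, no-cache and friends).

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, stored_at, now):
    """Return True if a cached response can be used without asking the server."""
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        # max-age is relative to when the response was stored, and wins over Expires.
        return now - stored_at < timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        # Expires is an absolute HTTP date.
        return now < parsedate_to_datetime(headers["Expires"])
    # Neither header: this sketch plays it safe and asks the server.
    return False


stored = datetime(2015, 10, 21, 7, 0, tzinfo=timezone.utc)
print(is_fresh({"Cache-Control": "max-age=60"}, stored, stored + timedelta(seconds=30)))
```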


Browser cache

Should the cached content be revalidated?

Cache-Control: must-revalidate

  • Yep. OK, let’s check in with the server.
  • Nope. Awesome. Skip to rendering!

^ If the server specified that we must revalidate the response, then we’ll do an HTTP request to check whether the content is still valid. We can take a bit of a shortcut here, and just ask the server if the content has changed, by making a conditional request. We can base this on the timestamp of the version of the page we have cached (Last-Modified, sent back as If-Modified-Since) and/or on a tag associated with the page (ETag, sent back as If-None-Match). If nothing has changed, the server answers with 304 Not Modified and no body. (The tag is an opaque type, so we can’t take any meaning from it, but it is typically a hash of the content of the page.)
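
^ To make the shortcut concrete, here’s a small sketch of building a conditional request from the headers stored alongside a cached response. The function names are mine, and the example ETag and date are made up.

```python
def revalidation_headers(cached_headers):
    """Build the request headers for a conditional GET from a cached response."""
    headers = {}
    if "ETag" in cached_headers:
        headers["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["Last-Modified"]
    return headers


def body_after_revalidation(status, cached_body, response_body):
    """A 304 Not Modified carries no body, so we reuse the cached copy."""
    return cached_body if status == 304 else response_body


cached = {"ETag": '"abc123"', "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT"}
print(revalidation_headers(cached))
```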


Parse the URL

  • Scheme: “https”
  • Authority: “google.com”
  • Path: “/”

^ At this point, we need to know the scheme (protocol) that we’re attempting to use, and we need to know the authority. Well, really, we just need to know the hostname for now. The authority consists of the user information — a username and, potentially, a password — the hostname, and a port number. In this case, the user information and the port number have been omitted.
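
^ Python’s standard library can do this split for us, which makes for a handy illustration. The connection_target helper and the DEFAULT_PORTS table are my own additions, showing how the omitted port falls back to the scheme’s default.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes we care about here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url):
    """Split a URL into the pieces we need before we can connect."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


parts = urlsplit("https://google.com/")
print(parts.scheme, parts.netloc, parts.path)   # scheme, authority, path
print(parts.username, parts.password, parts.port)  # all omitted, so all None
print(connection_target("https://google.com/"))
```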


DNS Lookup: Browser cache

Is the hostname in the browser’s cache?

  • Yep. Awesome, let’s use that IP address.
  • Nope. OK, we’re going to have to do this the hard way.

^ Modern browsers often maintain their own cache of mappings from hostname to IP address. If they can serve the request from their own cache, it saves the time of going through the system resolver (i.e. a system call or two).


DNS Lookup: OS resolver

Is the hostname in the operating system’s cache?

  • Yep. Job done. We’ll use that IP address.
  • Nope. OK, we’re really going to have to look it up.

^ The operating system maintains a global cache of mappings from hostname to IP address. This is shared across all processes for all users on the local computer. If the request can be served from the cache, it will be, and nothing else needs to happen. If not, we’re going to need to dig in a bit further.


Name Service Switch

  • Check /etc/hosts
  • Try multicast DNS
  • Perform a DNS lookup

^ The name service switch is a level of indirection inside the operating system’s name service resolution that allows the system administrator to specify a set of resolution mechanisms. In the olden days, this allowed us to insert additional name resolution systems (e.g. NIS/YP) but these days, it’s pretty much just the static /etc/hosts file, perhaps multicast DNS for names on the local network, then falling back on DNS.
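
^ The “try each mechanism in order” idea can be sketched like so. parse_hosts and resolve are hypothetical helpers, the hosts file contents are invented, and the DNS step is faked with a dictionary so the example runs without a network.

```python
def parse_hosts(text):
    """Parse /etc/hosts-style text into a hostname -> address mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


def resolve(hostname, sources):
    """Try each lookup function in order and return the first answer."""
    for lookup in sources:
        address = lookup(hostname.lower())
        if address is not None:
            return address
    return None


hosts = parse_hosts("127.0.0.1 localhost  # loopback\n192.0.2.10 intranet.example intranet\n")
fake_dns = {"google.com": "192.0.2.1"}.get  # stand-in for a real DNS lookup
print(resolve("localhost", [hosts.get, fake_dns]))
print(resolve("google.com", [hosts.get, fake_dns]))
```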


DNS Lookup

Get the IP address of a name server

  • From DHCP
  • Statically configured

^ The IP address of one or more name servers is already known by the operating system. Since you need to have a name server in order to convert hostnames to IP addresses, there needs to be a way to “bootstrap” the process. Typically the IP address of the name servers is supplied by DHCP, or through some other static configuration. Most often, the name server will be somewhere on the local network — at home, it’s probably your DSL router. Often, several name servers are specified.


DNS Record Types

  • A and AAAA are address records: mappings from name to IP address.
  • PTR is a reverse mapping from IP address to name
  • NS is a pointer to a name server.
  • Other record types: SOA, CNAME, MX, TXT.

^ DNS has several different types of records, each of which performs a different function. A records (and their corresponding IPv6 counterpart AAAA records) map from a host name to an IP address. That’s what we’re looking for here, though we’ll bump into some other record types along the way. CNAME records are a mapping from an alias to a canonical name. The SOA provides some information about a domain, including which name server is its primary authority, and some information to control how records in that domain can be cached.


Send the DNS request

New in Apple iOS 9 & El Capitan

  • Send out an AAAA request; and
  • Send out an A request, in parallel.

^ A shiny new feature in Apple’s iOS 9 and Mac OS X El Capitan that I spotted a couple of weeks ago. It will send out DNS requests for an IPv6 AAAA record and an IPv4 A record, both at the same time. The process is the same, and whichever responds first (more or less) wins.
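
^ For the curious, this is roughly what those two questions look like on the wire. It’s a minimal sketch of the RFC 1035 query format; build_query is my own helper, the query IDs are arbitrary, and actually sending the bytes (typically over UDP to port 53) is left out.

```python
import struct

# Numeric record type codes used in the question section.
QTYPE = {"A": 1, "AAAA": 28}


def build_query(hostname, record_type, query_id):
    """Encode a single-question DNS query in wire format."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is a series of length-prefixed labels, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Question: the record type, then class 1 (IN, the Internet).
    return header + qname + struct.pack("!HH", QTYPE[record_type], 1)


# One query of each type, ready to be sent out side by side.
queries = [build_query("google.com", "AAAA", 1), build_query("google.com", "A", 2)]
print([q.hex() for q in queries])
```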


Recursive DNS request

Is the record in the name server’s cache?

  • Yep. Is it still valid? (TTL) If so, return the record. Job done.
  • Nope. OK, we’ll need to look it up.

^ Most recursive name servers keep a cache of records they’ve recently looked up. DNS is designed so that records are cached with well-defined semantics. Each resource record has a Time To Live (in seconds) associated with it, which the cache counts down from the moment it stores the record. As long as the remaining time to live is greater than 0, the record is considered to still be valid and is returned as an answer. Depending on the situation (essentially, the stability of the record in question) a TTL can be anything from a few minutes to a few weeks.
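
^ A toy version of such a cache, to show the TTL counting down. The DnsCache class is hypothetical and the address comes from a range reserved for documentation; a real name server’s cache is a good deal more involved.

```python
import time


class DnsCache:
    """A toy record cache that honours each record's time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # (name, record type) -> (addresses, expiry time)

    def put(self, name, record_type, addresses, ttl):
        self._records[(name, record_type)] = (addresses, self._clock() + ttl)

    def get(self, name, record_type):
        """Return (addresses, remaining TTL), or None if missing or expired."""
        entry = self._records.get((name, record_type))
        if entry is None:
            return None
        addresses, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._records[(name, record_type)]  # expired, so forget it
            return None
        return addresses, int(remaining)


cache = DnsCache()
cache.put("google.com", "A", ["192.0.2.1"], ttl=300)
print(cache.get("google.com", "A"))
```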


Upstream DNS server

Is our local DNS server configured to have one or more upstream servers?

  • Yep. OK, let’s pass the request off to an upstream and let it figure out the answer.
  • Nope. Damn. We’re going to have to do the hard work ourselves.

^ DNS servers are most often configured in a hierarchy. Chances are you’ll have a DNS server somewhere on your local network. At home, your DSL router will probably act as a recursive DNS server. It will be configured to send DNS requests that it can’t answer up to your ISP’s customer-facing DNS resolver. Your ISP’s DNS servers may well themselves be configured with upstream servers in a hierarchy of their own. Eventually, we get to the top of the hierarchy, with the root DNS servers.


Root DNS Servers

  • 13 well-known IP addresses of root servers.
  • Really, they’re hundreds of machines distributed globally.
  • Authoritative for the root zone.

^ The IP addresses of the root domain name servers are well-known, and are distributed by default with most domain name server implementations. We need to know their IP addresses in order to bootstrap the domain name system: it’s not possible to resolve a name without knowing the IP address of a DNS server that can answer the question. While they consist of only 13 IP addresses, really there are hundreds (500+ last time I looked) of machines around the globe. A technique called anycast routing means that disparate systems can share the same IP address, and when you send a packet to that IP address, it’ll be routed to a machine that’s (more or less) close to you in network terms, which usually means geographically close too.


DNS Authority

Root servers are authoritative for the root zone.

Know the canonical answer for who serves each TLD: “.com”, “.net”, “.uk”, etc.

^ The root servers are authoritative for the root zone. This means that they know the canonical answer for the list of name servers (and, through glue records, the IP addresses of those name servers) who are authoritative for each top level domain, including country-specific top level domains.


What’s the A record for “google.com”?

^ So we’ve got to the stage where the browser has determined it doesn’t have the IP address for google.com in its own cache. The operating system resolver doesn’t have the IP address either, so we’ve moved on to doing a DNS lookup. Our local caching name server (our DSL router) doesn’t have the answer, so it has asked the ISP’s DNS server. It doesn’t know the answer, and doesn’t have another upstream to punt the request to, so it’s going to figure out the answer for itself (and cache the result).


Root servers

What’s the A record for “google.com”?

  • No idea, but here’s the list of name servers for “.com”.
  • Oh, and have the IP addresses of those name servers, too.

^ The root name servers cannot hold DNS information for every host on the Internet. That would be crazy. So when you ask the root servers for a specific record, they have no idea what the answer is, but they can help point you in the right direction. So instead of a direct answer to the question, they’ll say, “I dunno” but they’ll help out by giving you a list of hostnames for name servers that serve the “.com” domain — because they do authoritatively know the answer to that. If there’s enough space in the response, they’ll also hand out IP addresses of those name servers, if they happen to know them (and they typically do).


Authoritative servers for “.com”

What’s the A record for “google.com”?

  • No idea, but here’s a list of name servers for “google.com”.
  • Oh, and have the IP addresses of those name servers, too.

^ The authoritative DNS servers for the “.com” zone don’t even hold all the DNS records for .com. With the sheer number of domains in each TLD zone, and the rate of change of the records in the zone, this would be infeasible. However, thanks to the database of domain names maintained by each TLD registry — extracted from the whois database — the zone does have details of the authoritative name servers for each domain name within the zone. So this time, instead of a direct answer to the question, we get back a list of name servers — and, if possible, their IP addresses — for the “google.com” domain.


Authoritative servers for “google.com”

What’s the A record for “google.com”?

  • Hey, I know this! Here’s a list of IP addresses!

^ At last, we’ve got an answer! The authoritative servers for google.com know the list of IP addresses that we should talk to. They’ll return one or more addresses. If there’s more than one, they’ll often return the list in a different order each time. This provides a form of load balancing, in that hopefully clients will spread their requests fairly evenly across all the advertised IP addresses.
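
^ To tie the last few slides together, here’s a toy simulation of the walk from the root, to “.com”, to “google.com”. Everything in it is invented for illustration: the server names, the addresses (all from 192.0.2.0/24, which is reserved for documentation) and the ask function, which stands in for sending a real query over the network.

```python
# Toy data: a fake root server, a fake ".com" server and a fake google.com server.
SERVERS = {
    "192.0.2.1": {"referrals": {"com": [("a.gtld.example", "192.0.2.2")]}},
    "192.0.2.2": {"referrals": {"google.com": [("ns1.google.example", "192.0.2.3")]}},
    "192.0.2.3": {"answers": {"google.com": ["192.0.2.80", "192.0.2.81"]}},
}


def ask(server_ip, name):
    """Pretend to query a server: returns a (kind, data) pair."""
    zone = SERVERS[server_ip]
    if name in zone.get("answers", {}):
        return "answer", zone["answers"][name]
    # No answer here: refer to the most specific delegated zone we know about.
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in zone.get("referrals", {}):
            return "referral", zone["referrals"][suffix]
    return "nxdomain", []


def resolve_iteratively(name, root_ip="192.0.2.1"):
    """Follow referrals down from the root until somebody answers."""
    server_ip, path = root_ip, []
    while True:
        path.append(server_ip)
        kind, data = ask(server_ip, name)
        if kind == "answer":
            return data, path
        if kind != "referral":
            return [], path
        _, server_ip = data[0]  # follow the first name server, using its glue address


addresses, path = resolve_iteratively("google.com")
print(addresses, "via", path)
```

^ A real resolver would also cache every referral and answer along the way, so the next lookup for anything under “.com” can skip the root servers entirely.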
