stroxler/how_dns_works.org

## how_dns_works.org

      
    Web: DNS

A quick introduction to DNS

DNS at a glance

In one sentence, I’d say that DNS is a chain-of-trust system for configuring
  domain resolution in a trustworthy way.
The vocab isn’t completely precise but I’ll use this terminology:

  a top-level domain (TLD) is something like “com” or “org”, a 1-part domain
  a second-level domain (SLD) is something like “pets.com” or “asu.edu”
  a subdomain is something like “cats.pets.com” or “stats.math.asu.edu”,
    they can nest arbitrarily deep.
  a domain is any domain, but nearly always an SLD or subdomain (it is technically
    possible to use a TLD as a domain - see “ai.” which you can navigate to); this
    just refers to a full domain used in any given query.
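To make the vocabulary concrete, here is a minimal sketch that labels a domain using the terms above. The function name `describe_domain` is made up for illustration, and it assumes every TLD is a single label, so multi-part public suffixes like “co.uk” are deliberately not handled.

```python
def describe_domain(name: str) -> dict:
    """Label a domain as a TLD, SLD, or subdomain (naive: 1-label TLDs only)."""
    # Drop the optional trailing root dot and any empty labels.
    labels = [part for part in name.lower().rstrip(".").split(".") if part]
    if not labels:
        raise ValueError("empty domain name")
    if len(labels) == 1:
        kind = "TLD"
    elif len(labels) == 2:
        kind = "SLD"
    else:
        kind = "subdomain"
    return {
        "kind": kind,
        "tld": labels[-1],
        # The SLD is the last two labels, if there are at least two.
        "sld": ".".join(labels[-2:]) if len(labels) >= 2 else None,
        "depth": len(labels),
    }


if __name__ == "__main__":
    for example in ["com", "pets.com", "stats.math.asu.edu", "ai."]:
        print(example, describe_domain(example))
```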

Let’s unpack that a bit

By a chain-of-trust system I mean that DNS starts with a tiny set
  of root nodes (13 of them) that are hardcoded as reliable, and grows
  outward through human agreements based on either

  centralization (e.g. the top-level domains are globally known and the root servers
    just stay up to date)
  or deals between trusted parties (e.g. the relationship between
    registrars like Route53, who manage domain purchases and the top-level
    authoritative name nodes, and the registries, who manage the TLD
    servers themselves).

By “configuring domain resolution” I want to be clear it’s not just a
  simple mapping of domain to IP.

  We often naively think of DNS as just mapping domains to IP addresses;
    the A/AAAA, CNAME, and ALIAS records mostly do this (ALIAS is a
    provider-specific extension rather than a standard record type).
  But DNS can be used for other things, and has hooks to customize behavior:
    
      NS records let you set up your own DNS servers for subdomains; this
        can be especially handy if you need to give control of a subdomain to
        some party separate from who manages the SLD (e.g. *.aa.stitchfix.com)
      TXT records can store arbitrary data
      MX records allow email specifically to be routed differently from
        general traffic, by specifying a CNAME-like alias for just SMTP
      SPF policies (these days published as TXT records) help SMTP servers
        filter out bogus traffic
      There are tens of other kinds of records. Also, some record types are
        abused for other purposes, like using dummy CNAMEs to prove ownership.
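As a rough illustration of how different record types sit side by side under the same names, here is a toy in-memory record table with a lookup that follows CNAMEs. All names and addresses are invented for the example (the IPs come from the reserved documentation range), and real servers do considerably more than this.

```python
# Toy record store keyed by (name, record type). All values are illustrative.
RECORDS = {
    ("www.pets.com.", "CNAME"): ["lb.hosting.example."],
    ("lb.hosting.example.", "A"): ["192.0.2.20"],
    ("pets.com.", "MX"): ["10 mail.pets.com."],
    ("pets.com.", "TXT"): ["v=spf1 mx -all"],
    ("aa.pets.com.", "NS"): ["ns1.thirdparty.example."],
}


def lookup(name: str, rtype: str, max_hops: int = 8) -> list:
    """Return records of the given type, following CNAME aliases if needed."""
    for _ in range(max_hops):
        if (name, rtype) in RECORDS:
            return RECORDS[(name, rtype)]
        alias = RECORDS.get((name, "CNAME"))
        if alias is None:
            return []  # nothing of this type, and no alias to follow
        name = alias[0]
    raise RuntimeError("CNAME chain too long")


if __name__ == "__main__":
    print(lookup("www.pets.com.", "A"))   # resolved through the CNAME
    print(lookup("pets.com.", "MX"))      # mail is routed separately
    print(lookup("pets.com.", "TXT"))     # arbitrary data, here an SPF policy
```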
    
  
By “a trustworthy way” I mean that it’s intended for reasonable trust:

  The root nodes and top-level nodes are globally known and trusted
  The registrars are trusted to declare authoritative name nodes for
    SLDs by the registries; the registrars and registries coordinate so that
    duplicates are not possible (presumably this means there’s a brief lag
    when buying a domain name to confirm).
  Whoever buys the domain and the registrar then have some kind of authentication
    rule, and the domain owner can extend trust to third parties on their own
    via things like CNAME pointers or even letting other entities fully control
    subtrees via NS records on subdomains.

Some illuminating DNS Trivia

The individual DNS servers listen on port 53, mostly over UDP with TCP as a fallback
  for large responses; it’s a compact binary protocol with a slightly arcane format.
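To give a feel for the wire format, which is binary with length-prefixed labels, here is a minimal sketch that builds a single-question query packet by hand. The function name and the fixed query ID are arbitrary choices for the example; nothing is sent over the network.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a DNS query packet for one question (qtype 1 = A record)."""
    # 12-byte header: ID, flags (0x0100 = recursion desired), then the
    # question / answer / authority / additional counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The name is encoded as length-prefixed labels ending with a zero byte,
    # which is the root domain.
    qname = b""
    for label in [part for part in name.rstrip(".").split(".") if part]:
        encoded = label.encode("ascii")
        if len(encoded) > 63:
            raise ValueError("DNS labels are limited to 63 bytes")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"

    # Question trailer: record type and class (1 = IN, the internet class).
    return header + qname + struct.pack("!HH", qtype, 1)


if __name__ == "__main__":
    print(build_query("pets.com").hex())
```

Sending these bytes in a UDP datagram to port 53 of a resolver would get back a response in the same format.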
Domains form a tree, and there is technically a root domain of “.”. All domains
  technically end in “.” (so the full domain is “google.com.”).

  Most of our tools hide or otherwise automatically handle this for us, so URLs almost
    never use the dot.
  But it is there; for example, the domain “ai.” is only accessible if you include
    the dot because most tools don’t expect a bare TLD.
  Also, DNS tools themselves like Route53 often require the trailing “.”, which I
    think is related to there being a special hook in the config file format
    where you can specify a default suffix and any domain lacking the trailing
    “.” gets that suffix automatically.
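Here is a small sketch of that default-suffix behavior, assuming zone-file-style rules where the default suffix is called the origin and “@” stands for the origin itself. The helper name `qualify` is made up for illustration.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file-style name into a fully qualified domain name."""
    if not origin.endswith("."):
        raise ValueError("the origin itself must be fully qualified")
    if name == "@":
        return origin             # shorthand for the origin itself
    if name.endswith("."):
        return name               # already fully qualified, leave it alone
    return f"{name}.{origin}"     # relative name: append the default suffix


if __name__ == "__main__":
    print(qualify("www", "pets.com."))            # www.pets.com.
    print(qualify("www.pets.com.", "pets.com."))  # www.pets.com.
    # The classic mistake: forgetting the trailing dot doubles the suffix.
    print(qualify("www.pets.com", "pets.com."))   # www.pets.com.pets.com.
```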

All DNS entries come with TTL values that prevent DNS from doing too much work
  while ensuring that changes do eventually propagate. Caching can potentially happen
  in many places but generally is on the client computer and the recursive DNS node.
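A minimal sketch of TTL-based caching, roughly what a client or recursive node does with each answer. The class name and the injectable clock are choices made for this example so that expiry can be demonstrated without actually waiting.

```python
import time


class TtlCache:
    """Cache answers until their TTL runs out, then forget them."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the next lookup goes back upstream.
            del self._entries[name]
            return None
        return value


if __name__ == "__main__":
    fake_now = [0.0]
    cache = TtlCache(clock=lambda: fake_now[0])
    cache.put("pets.com.", "192.0.2.10", ttl_seconds=300)
    print(cache.get("pets.com."))  # still cached
    fake_now[0] = 301.0
    print(cache.get("pets.com."))  # expired, so None
```

A short TTL means changes propagate quickly at the cost of more queries; a long TTL is the opposite trade.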
Technically there are two types of DNS queries - in a recursive query the client
  asks a server (the recursive node, typically at your ISP) to do the whole resolution
  and return a final answer, whereas in an iterative query the server just answers with
  what it knows, often a referral to the next server down the tree. The recursive node
  normally makes iterative queries as it traverses the DNS tree and any CNAMEs, caching
  results along the way.
  My impression is you don’t really need to know about this as a user or domain owner.
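As a toy illustration of walking the DNS tree one query at a time, the sketch below simulates three servers in memory. Each single query returns either an answer or a referral, and `resolve` plays the part of the recursive node following referrals on the client's behalf. The server names and data are invented.

```python
# Each toy server either answers directly or refers us to another server.
SERVERS = {
    "root": {"referrals": {"com.": "com-tld"}},
    "com-tld": {"referrals": {"pets.com.": "pets-ns"}},
    "pets-ns": {"answers": {"cats.pets.com.": "192.0.2.7"}},
}


def query(server: str, name: str):
    """One iterative query: returns ('answer' | 'referral' | 'nxdomain', value)."""
    data = SERVERS[server]
    if name in data.get("answers", {}):
        return "answer", data["answers"][name]
    for suffix, next_server in data.get("referrals", {}).items():
        if name == suffix or name.endswith("." + suffix):
            return "referral", next_server
    return "nxdomain", None


def resolve(name: str, start: str = "root", max_steps: int = 10):
    """Follow referrals from the root; returns (answer or None, servers asked)."""
    server, path = start, [start]
    for _ in range(max_steps):
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "nxdomain":
            return None, path
        server = value
        path.append(server)
    raise RuntimeError("too many referrals")


if __name__ == "__main__":
    print(resolve("cats.pets.com."))
```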
The DNS nodes

On the client

The OS can have some hostnames preconfigured / hardcoded.
It will also generally cache anything it has previously resolved with a suitable TTL.
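On Unix-like systems the preconfigured hostnames usually live in a hosts file (commonly `/etc/hosts`). Here is a minimal sketch of parsing that format from a string; the sample contents are invented, and the first mapping for a name wins, which is an assumption of this sketch.

```python
def parse_hosts(text: str) -> dict:
    """Parse hosts-file text into a {hostname: address} mapping."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first entry seen for each name.
            mapping.setdefault(name.lower(), address)
    return mapping


SAMPLE = """
127.0.0.1   localhost
# internal build machines
192.0.2.5   build.internal ci   # 'ci' is a short alias
"""

if __name__ == "__main__":
    print(parse_hosts(SAMPLE))
```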
The recursive DNS server

Normally, outside of a custom networking environment, the recursive server will be run by your ISP.
It’s possible to set up your own recursive DNS, and in fact this is commonly done in some contexts (e.g. corporate networks with local names, and sometimes inside a VPN + VPC there will be local DNS).
The recursive server doesn’t do anything except follow a rules table for asking other servers, and store a cache based on TTL directives.
It is sort of trusted by the clients using it:

  it is not trusted by the rest of the network at all
  the clients do trust it if they are using unencrypted traffic
  but with SSL, a malicious DNS server that redirects traffic still would not
    be able to present a valid certificate for the domain, so the client would reject the connection.
    
      if a CA itself were attacked in this way, it could cause problems down the road
    
  
Root server

There are 13 IP addresses hardcoded as root DNS servers since the dawn of the internet. These servers have to be kept up to date with information about the TLD servers.
Originally there were only 13 servers, but these days anycast routing means each of those 13 addresses is served by many machines (well over a thousand in total); the goal is to always route traffic to a physically nearby server.
You can see the root servers at http://root-servers.org
They are trusted, although again if one were compromised and tried to redirect traffic SSL would detect it
  as long as the CAs didn’t start issuing certificates based on the malicious data.
TLD server

There are many of these but it’s still a bounded, regulated set and the root servers have to know about these.
For example, there’s a `com.` TLD server that knows about all two-part `.com` addresses such as `pets.com` and `facebook.com`.
At http://iana.org/domains/root/db you can see the TLD servers - who owns them and each host name + IP address.
I believe that TLD servers only store the authoritative name nodes
  for second-level-domains (SLDs - e.g. pets.com, asu.edu). Everything at the
  subdomain level is handled either by those authoritative name nodes or by other
  DNS servers specified via NS records.
They are trusted, although again if one were compromised and tried to redirect traffic SSL would detect it
  as long as the CAs didn’t start issuing certificates based on the malicious data.
The authoritative name server(s)

An authoritative name server, or just name server, holds the actual records for some domain, typically a two-part domain like asu.edu.
The top-level authoritative name servers (for two-part SLDs) are typically managed by the registrars,
  e.g. Route53 or GoDaddy, and the record of who to talk to is stored in the TLD servers.
But it’s possible to set up arbitrary name servers underneath that level, which can be handy

  if you need to run your own for some reason (e.g. if you generate subdomains on the fly
    as we did at terminal.com)
  or, if you need to give control of a subdomain to a separate authoritative name node
    (e.g. if you want to proxy control of a subdomain to some other AWS account’s route53; sometimes
    this may be needed for trusted third parties like marketing agencies, or if your organization
    has multiple cloud accounts and different teams own different subdomains).
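One way to picture delegation: for any given name, the most specific delegated zone wins. The sketch below picks the responsible zone by longest matching suffix. The zone names and owners are invented for the example.

```python
def responsible_zone(name: str, zones):
    """Return the most specific zone that contains the name, or None."""
    candidates = [z for z in zones if name == z or name.endswith("." + z)]
    # All candidates are suffixes of the same name, so longest = deepest.
    return max(candidates, key=len, default=None)


# Hypothetical setup: a subdomain delegated to a different account via NS.
ZONE_OWNERS = {
    "example.com.": "main account",
    "agency.example.com.": "marketing agency's account",
}

if __name__ == "__main__":
    for host in ["www.example.com.", "promo.agency.example.com."]:
        zone = responsible_zone(host, ZONE_OWNERS)
        print(host, "->", zone, "/", ZONE_OWNERS[zone])
```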

There are many kinds of records you can store under a domain, as I discussed above.
SSL

How does SSL work?

SSL (these days formally called TLS) involves asymmetric keys from both authorities and service
  certs, and then symmetric keys for traffic encryption.
Here’s how the process works:

  Any certificate authority (CA) has a public-private key pair
  A client operating system is configured to know about a specific, smallish
    number of CAs, and has the public keys hardcoded.
  Any service allowing SSL (for example an https server) will have on
    it a private key and a cert.
  When a client connects, the service will send the cert:
    
      the cert should contain a signature (including an expiration date) from
        the CA, made with their private key.
      The client OS will verify the signature with the public key and check the date;
        it will reject if there are issues.
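      If all is well, the client OS will generate a one-time symmetric key and
        encrypt it using the public key from the cert. This completes the “handshake”.
        (This is the classic RSA-style exchange; newer TLS versions instead derive the
        key via a Diffie-Hellman exchange, but the overall shape is the same.)
      At this point both the client and server have the symmetric key, which
        they will use to communicate.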
    
  
This general approach of a handshake with asymmetric keys and then symmetric
  key encryption is common; for example SSH does not use SSL but does still use
  that general approach.
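From the client side, most of the handshake is handled by a library. Here is a minimal sketch using Python's standard `ssl` module, whose default context verifies the server's cert against the system's trusted CAs and checks that the hostname matches. The helper names are made up, and `fetch_peer_cert` needs network access so it is only defined here, not called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """A client context that trusts the system's preconfigured CAs."""
    # The default context requires a valid cert chain and a hostname match.
    return ssl.create_default_context()


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, complete the handshake, and return the verified server cert."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname is what the cert gets checked against.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    ctx = make_client_context()
    print(ctx.verify_mode, ctx.check_hostname)
```

If verification fails (unknown CA, expired cert, wrong hostname), `wrap_socket` raises an `ssl.SSLError` subclass instead of returning a connection.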
How does a computer decide about trusted CAs

Computers come with a standard set of CAs preconfigured, and probably get
  new ones sometimes in OS updates. These are the “trusted” CAs that really matter.
It’s also possible to configure extra CAs manually or via IT tools; this is handy
  if, for example, you want to encrypt traffic for domains internal to a private
  network. On macOS, the Keychain Access app would be used for manual configuration.
How do CAs decide to issue a certificate?

There can be an involved process for “extra verified” certs, which institutions
  like banks will go through (and then you get the extra-green bar in browsers).
But normally, the trust is bootstrapped from DNS.
CNAME validation

The standard approach uses dummy CNAME records:

  you ask a CA for a certificate, and enter the set of domains for which it should be valid
  for each domain, the CA generates a CNAME record on a subdomain (one more layer nested
    beyond whatever domain the cert should be valid for) with random strings in the name and value.
  you update your DNS provider (e.g. Route53, GoDaddy) to include that CNAME
  once the CA sees the update, they know you do in fact control the domain and they
    generate the cert (which will be valid for some time window)

In many cases they will automatically renew the cert as long as your CNAME records
  remain in place; for example Amazon’s ACM will do this for certs issued by Amazon;
  for that to work the cert has to be hosted by the same provider who issued it
  (as is the case with ACM-provided certs hosted via ALBs or cloudfront domains).
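The CNAME validation flow can be sketched as a small simulation. The record naming scheme and the `example-ca.test` target are invented for illustration; real CAs each have their own formats. The DNS lookup is faked with a plain dict.

```python
import secrets


def issue_challenge(domain: str):
    """CA side: invent a random CNAME (name, value) the requester must publish."""
    base = domain.rstrip(".")
    name = f"_{secrets.token_hex(16)}.{base}."
    value = f"_{secrets.token_hex(16)}.validation.example-ca.test."
    return name, value


def is_validated(challenge, dns_records: dict) -> bool:
    """CA side: check whether the expected CNAME is visible in DNS."""
    name, value = challenge
    return dns_records.get(name) == value


if __name__ == "__main__":
    challenge = issue_challenge("pets.com")
    dns = {}                                 # stand-in for the public DNS
    print(is_validated(challenge, dns))      # False: nothing published yet
    dns[challenge[0]] = challenge[1]         # the domain owner adds the CNAME
    print(is_validated(challenge, dns))      # True: control is demonstrated
```

Because the strings are random, only someone who can edit the domain's DNS could have published the record, which is the whole point.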
certbot and letsencrypt.org

There is a second approach developed by letsencrypt.org based on using unencrypted
  http traffic but adding random routes to prove you control the http endpoint, which
  is mostly automated via a tool called certbot. This might be easier to use if you
  were hosting a site on raw hardware because it can auto-renew via a cron job without
  you needing to talk to your DNS provider.
There are instructions on how this works; certbot will basically futz with your
  reverse proxy config (e.g. nginx) to accomplish what it needs for validating the
  domain.
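A rough sketch of the HTTP side of this: the server answers requests under the well-known ACME challenge path with the expected response for each pending token. The handler below is a plain function rather than a real web server, the token and response values are placeholders, and the actual response format is simplified away.

```python
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def handle_request(path: str, pending: dict):
    """Return (status, body) for a plain-HTTP request during validation."""
    if path.startswith(CHALLENGE_PREFIX):
        token = path[len(CHALLENGE_PREFIX):]
        if token in pending:
            # Serving the expected body proves control of this endpoint.
            return 200, pending[token]
    return 404, ""


if __name__ == "__main__":
    pending = {"token-abc": "token-abc.placeholder-response"}
    print(handle_request("/.well-known/acme-challenge/token-abc", pending))
    print(handle_request("/.well-known/acme-challenge/unknown", pending))
```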
In the case of a CNAME, what domain does the certificate need to work for?

Specifically, let’s say I have an ALB and point a subdomain at it using
  some registrar.
The cert will need to be valid for the subdomain used to get there; it’s not
  enough for the cert to be valid just on the “target” domain.
Note that this means you’d need to give the cert to any third-party service
  hosting a subdomain if they don’t already control enough (via an NS entry)
  to generate their own certs; either way you are basically telling the world -
  either via DNS proxying or by sharing your cert’s private key - that you
  fully trust this third party from the point of view of things like same-origin
  security.
Note that you might well want the cert to be valid for both in some cases,
  e.g. if I were hosting a website I wanted exposed at stroxler.net but I
  also wanted to point steventroxler.net at it via a CNAME. This is possible;
  the CA would just validate that you control both domains.
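A simplified sketch of the check a client makes here: the hostname the user typed has to match one of the names listed in the cert, regardless of where any CNAME points. The function name is made up, and the wildcard handling (a leading `*.` matching exactly one label) is a simplification of the real rules.

```python
def cert_covers(hostname: str, cert_names) -> bool:
    """Check whether a hostname matches any name listed in a cert."""
    host = hostname.rstrip(".").lower()
    for pattern in cert_names:
        pattern = pattern.rstrip(".").lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # A wildcard stands in for exactly one left-most label.
            first_label, _, rest = host.partition(".")
            if first_label and rest == pattern[2:]:
                return True
    return False


if __name__ == "__main__":
    names = ["stroxler.net", "steventroxler.net"]
    print(cert_covers("steventroxler.net", names))         # listed, so valid
    print(cert_covers("steventroxler.net", ["stroxler.net"]))  # target only
```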