namishelex01/DNS

## DNS
Introduction

> Distributed database used by TCP/IP applications to map between hostnames and IP addresses and to provide email routing information

> Why DNS? An application must convert a hostname to an IP address before it can ask TCP to open a connection or UDP to send a datagram

> Access to DNS is through a "resolver", which is part of the application, not the OS

> UNIX hosts have two library functions
	- gethostbyname(3)	: Hostname -> IP addr
	- gethostbyaddr(3)	: IP addr -> Hostname
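
As a rough illustration, Python's `socket` module exposes wrappers named after these two C library functions. The helper names `name_to_addr` and `addr_to_name` are made up for this sketch, and the results depend entirely on the local resolver configuration.

```python
import socket


def name_to_addr(hostname: str) -> str:
    """Hostname -> IPv4 address via the system resolver (cf. gethostbyname(3))."""
    return socket.gethostbyname(hostname)


def addr_to_name(ip_addr: str) -> str:
    """IPv4 address -> primary hostname via the system resolver (cf. gethostbyaddr(3))."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip_addr)
    return hostname
```

A dotted-quad string passed to `name_to_addr` is returned unchanged, so `name_to_addr("127.0.0.1")` generates no DNS traffic; any real hostname triggers a lookup.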

> Most commonly used implementation of DNS - BIND {Berkeley Internet Name Domain Server}

DNS Basics

> DNS name space is hierarchical

> Every node has a label (up to 63 characters)

> Root of tree is a special node with a null label

> Domain name of any node is the list of labels, starting at that node and working up to the root, using a period (dot) to separate labels

> Fully Qualified Domain Name (FQDN) : name ends with a period

> Top-level domains divided into 3 areas :
	- arpa : special domain used for address to name mappings
	- generic domains : 3-character domains, also called organizational domains
	- country/geographical domains : 2-character domains based on country codes

> The only generic domains restricted to US are .gov & .mil

> No single entity manages every label in the tree. Instead, one entity maintains portion of the tree and delegates responsibility to others for specific zones

> Zone is a sub-tree that is administered separately.

> A common zone is a second-level domain. Many second level then divide their zone into smaller zones.

> Once the authority for a zone is delegated, it is up to the person responsible for the zone to provide multiple name servers for that zone

> Whenever a new system is installed in a zone, the DNS admin for the zone allocates a name and IP addr for the new system and enters these into the name server's database

> Admin must provide a primary name server for that zone and one or more secondary name servers.

> Both servers must be independent and redundant servers so that availability of name service for the zone isn't affected by a single point of failure.

> The primary loads all the information for the zone from disk files, while secondaries obtain all the information from the primary. When a secondary obtains the information from its primary, it's called a zone transfer

> When a new host is added to a zone, the admin adds the info(name & IP) to a disk file on the system running primary.

> The primary is then notified to re-read its config. The secondaries query the primary on a regular basis (often every 3 hours), and if the primary contains newer data, the secondary obtains the new data using a zone transfer

> Every name server must know the IP address of each root server. The root servers must know the name and IP address of each authoritative name server for all second-level domains.

> When a name server receives information about a mapping, it caches that information so that a later query for the same mapping can use the cached result and not result in additional queries to other servers.

DNS Message Format

The message has a fixed 12-byte header followed by four variable-length fields.

The identification is set by the client and returned by the server. It lets the client match responses to requests.

The 16-bit flags field is divided into several subfields. We'll start at the leftmost bit and describe each one.

- QR is a 1-bit field: 0 means the message is a query, 1 means it's a response.
- opcode is a 4-bit field. The normal value is 0 (a standard query). Other values are 1 (an inverse query) and 2 (server status request).
- AA is a 1-bit flag that means "authoritative answer." The name server is authoritative for the domain in the question section.
- TC is a 1-bit field that means "truncated." With UDP this means the total size of the reply exceeded 512 bytes, and only the first 512 bytes of the reply were returned.
- RD is a 1-bit field that means "recursion desired." This bit can be set in a query and is then returned in the response. This flag tells the name server to handle the query itself, called a recursive query. If the bit is not set, and the requested name server doesn't have an authoritative answer, the requested name server returns a list of other name servers to contact for the answer. This is called an iterative query. We'll see examples of both types of queries in later examples.
- RA is a 1-bit field that means "recursion available." This bit is set to 1 in the response if the server supports recursion. We'll see in our examples that most name servers provide recursion, except for some root servers.
- There is a 3-bit field that must be 0.
- rcode is a 4-bit field with the return code. The common values are 0 (no error) and 3 (name error). A name error is returned only from an authoritative name server and means the domain name specified in the query does not exist.

The next four 16-bit fields specify the number of entries in the four variable-length fields that complete the record. For a query, the number of questions is normally 1 and the other three counts are 0. Similarly, for a reply the number of answers is at least 1, and the remaining two counts can be 0 or nonzero.
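
The header layout described above can be sketched with Python's `struct` module. This is a minimal illustration, not a full DNS library; the function names `pack_header` and `unpack_header` are invented here, and the bit positions follow the left-to-right order of the flag fields listed above.

```python
import struct


def pack_header(ident, qr=0, opcode=0, aa=0, tc=0, rd=0, ra=0, rcode=0,
                qdcount=0, ancount=0, nscount=0, arcount=0):
    """Build the fixed 12-byte header: six big-endian 16-bit fields."""
    flags = ((qr << 15) | (opcode << 11) | (aa << 10) | (tc << 9)
             | (rd << 8) | (ra << 7) | rcode)  # bits 6..4 stay zero
    return struct.pack("!6H", ident, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data):
    """Split the first 12 bytes of a message into its header fields."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "qr": (flags >> 15) & 1,
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 1,
        "tc": (flags >> 9) & 1,
        "rd": (flags >> 8) & 1,
        "ra": (flags >> 7) & 1,
        "rcode": flags & 0xF,
        "counts": (qd, an, ns, ar),
    }
```

For a typical query, `pack_header(0x1234, rd=1, qdcount=1)` sets only the recursion-desired bit and a question count of 1.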

Question Portion of DNS Query Message
The format of each question in the question section is shown in Figure 14.5. There is normally just one question.
The query name is the name being looked up. It is a sequence of one or more labels. Each label begins with a 1-byte
count that specifies the number of bytes that follow. The name is terminated with a byte of 0, which is a label with a
length of 0, which is the label of the root. Each count byte must be in the range of 0 to 63, since labels are limited to 63 bytes. (We'll see later in this section that a count byte with the two high-order bits turned on, values 192 to 255, is
used with a compression scheme.) Unlike many other message formats that we've encountered, this field is allowed to
end on a boundary other than a 32-bit boundary. No padding is used. Figure 14.6 shows how the domain name
gemini.tuc.noao.edu is stored.
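
The label encoding just described can be sketched in a few lines. `encode_name` is a hypothetical helper name, and the sketch assumes ASCII-only labels.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as count-prefixed labels ending with a zero byte."""
    if name in ("", "."):
        return b"\x00"  # the root: a single label of length 0
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label {label!r} must be 1 to 63 bytes long")
        out.append(len(raw))  # 1-byte count
        out += raw            # the label itself
    out.append(0)             # terminating root label
    return bytes(out)
```

With this sketch, `encode_name("gemini.tuc.noao.edu")` yields 21 bytes: the counts 6, 3, 4, and 3, each followed by its label, then the final 0.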

Each question has a query type and each response (called a resource record, which we talk about below) has a type.
There are about 20 different values, some of which are now obsolete. Figure 14.7 shows some of these values. The
query type is a superset of the type: two of the values we show can be used only in questions.


The most common query type is an A type, which means an IP address is desired for the query name. A PTR query
requests the names corresponding to an IP address. This is a pointer query that we describe in Section 14.5. We
describe the other query types in Section 14.6.
The query class is normally 1, meaning Internet address. (Some other non-IP values are also supported at some
locations.)
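
Putting the header and question together, a complete query message might be assembled as follows. This is a sketch under assumptions: `build_query` is an invented name, and the numeric type codes come from RFC 1035 rather than from these notes (only the A value of 1 and the class value of 1 are stated here).

```python
import struct

# Type codes as assigned in RFC 1035 (assumption: not all are listed in these notes).
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "PTR": 12, "HINFO": 13, "MX": 15}
CLASS_IN = 1  # "Internet" class


def build_query(ident, name, qtype="A", recursion_desired=True):
    """Return header + one question for `name`, ready to send in a UDP datagram."""
    flags = 0x0100 if recursion_desired else 0            # only the RD bit
    header = struct.pack("!6H", ident, flags, 1, 0, 0, 0)  # 1 question, no RRs
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!2H", QTYPE[qtype], CLASS_IN)
    return header + question
```

Note that the question is not padded: the 16-bit type and class follow the name's terminating zero byte directly, wherever it happens to fall.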

Resource Record Portion of DNS Response Message
The final three fields in the DNS message, the answers, authority, and additional information fields, share a common
format called a resource record or RR. Figure 14.8 shows the format of a resource record.


The domain name is the name to which the following resource data corresponds. It is in the same format as we
described earlier for the query name field (Figure 14.6).
The type specifies one of the RR type codes. These are the same as the query type values that we described earlier. The
class is normally 1 for Internet data.
The time-to-live field is the number of seconds that the RR can be cached by the client. RRs often have a TTL of 2
days.
The resource data length specifies the amount of resource data. The format of this data depends on the type. For a type
of 1 (an A record) the resource data is a 4-byte IP address.
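
A resource record can be decoded along these lines. The sketch assumes the RFC 1035 compression scheme, in which a count byte with the two high-order bits set starts a 2-byte pointer whose low 14 bits are an offset from the start of the message; the names `read_name` and `read_rr` are made up for illustration.

```python
import struct


def read_name(msg, offset):
    """Return (name, offset just past the name), following compression pointers."""
    labels = []
    resume = None  # where parsing continues after the first pointer
    hops = 0
    while True:
        length = msg[offset]
        if (length & 0xC0) == 0xC0:  # compression pointer
            if resume is None:
                resume = offset + 2
            offset = ((length & 0x3F) << 8) | msg[offset + 1]
            hops += 1
            if hops > 20:  # guard against pointer loops
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:  # root label ends the name
            break
        labels.append(msg[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels) + ".", (resume if resume is not None else offset)


def read_rr(msg, offset):
    """Decode one RR starting at `offset`; return (record dict, next offset)."""
    name, offset = read_name(msg, offset)
    rtype, rclass, ttl, rdlength = struct.unpack_from("!HHIH", msg, offset)
    offset += 10  # type (2) + class (2) + TTL (4) + data length (2)
    rdata = msg[offset:offset + rdlength]
    if rtype == 1 and rdlength == 4:  # A record: 4-byte IP address
        rdata = ".".join(str(b) for b in rdata)
    record = {"name": name, "type": rtype, "class": rclass, "ttl": ttl, "rdata": rdata}
    return record, offset + rdlength
```

Record types other than A are returned with their resource data as raw bytes in this sketch.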
Now that we've described the basic format of the DNS queries and responses, we'll see what is passed in the packets by
watching some exchanges using tcpdump.


Pointer Queries

First return to Figure 14.1 and examine the arpa top-level domain, and the in-addr domain beneath it. When an
organization joins the Internet and obtains authority for a portion of the DNS name space, such as noao.edu, they
also obtain authority for a portion of the in-addr.arpa name space corresponding to their IP address on the
Internet. In the case of noao.edu it is the class B network ID 140.252. The level of the DNS tree beneath in-addr.arpa
must be the first byte of the IP address (140 in this example), the next level is the next byte of the IP
address (252), and so on. But remember that names are written starting at the bottom of the DNS tree, working upward.
This means the DNS name for the host sun, with an IP address of 140.252.13.33, is 33.13.252.140.in-addr.arpa.
We have to write the 4 bytes of the IP address backward because authority is delegated based on network IDs: the first
byte of a class A address, the first and second bytes of a class B address, and the first, second, and third bytes of a class
C address. The first byte of the IP address must be immediately below the in-addr label, but FQDNs are written
from the bottom of the tree up. If FQDNs were written from the top down, then the DNS name for the IP address would
be arpa.in-addr.140.252.13.33, but the FQDN for the host would be edu.noao.tuc.sun.
If there was not a separate branch of the DNS tree for handling this address-to-name translation, there would be no way
to do the reverse translation other than starting at the root of the tree and trying every top-level domain. This could
literally take days or weeks, given the current size of the Internet. The in-addr.arpa solution is a clever one,
although the reversed bytes of the IP address and the special domain are confusing.
Having to worry about the in-addr.arpa domain and reversing the bytes of the IP address affects us only if we're
dealing directly with the DNS, using a program such as host, or watching the packets with tcpdump. From an
application's point of view, the normal resolver function (gethostbyaddr) takes an IP address and returns
information about the host. The reversal of the bytes and appending the domain in-addr.arpa are done
automatically by this resolver function.
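
The byte reversal that the resolver performs can be sketched as follows. `reverse_name` is a hypothetical helper; Python's standard `ipaddress` module offers the same result (without the trailing dot) through the `reverse_pointer` attribute.

```python
import ipaddress


def reverse_name(ip_addr: str) -> str:
    """Build the in-addr.arpa name used for a PTR query on an IPv4 address."""
    octets = ip_addr.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip_addr!r}")
    # Bytes are reversed so the name reads from the bottom of the tree up.
    return ".".join(reversed(octets)) + ".in-addr.arpa."


def reverse_name_stdlib(ip_addr: str) -> str:
    """Same result using the standard library."""
    return ipaddress.ip_address(ip_addr).reverse_pointer + "."
```

For the host sun, `reverse_name("140.252.13.33")` gives `33.13.252.140.in-addr.arpa.`, matching the name worked out above.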

Hostname Spoofing Check
When an IP datagram arrives at a host for a server, be it a UDP datagram or a TCP connection request segment, all
that's available to the server process is the client's IP address and port number (UDP or TCP). Some servers require the
client's IP address to have a pointer record in the DNS. We'll see an example of this, using anonymous FTP from an
unknown IP address, in Section 27.3.
Other servers, such as the Rlogin server (Chapter 26), not only require that the client's IP address have a pointer record,
but then ask the DNS for the IP addresses corresponding to the name returned in the PTR response, and require that one
of the returned addresses match the source IP address in the received datagram. This check is because entries in the
.rhosts file (Section 26.2) contain the hostname, not an IP address, so the server wants to verify that the hostname
really corresponds to the incoming IP address.
Some vendors automatically put this check into their resolver routines, specifically the function gethostbyaddr.
This makes the check available to any program using the resolver, instead of manually placing the check in each
application.
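
The two-step check described above (PTR lookup, then an address lookup on the returned name) might look like this in an application. It is a sketch only: `hostname_checks_out` and the two default helpers are invented names, and the lookups are passed in as parameters so the logic can be exercised without a network.

```python
import socket


def _system_reverse(ip_addr):
    """IP address -> hostname using the system resolver."""
    return socket.gethostbyaddr(ip_addr)[0]


def _system_forward(hostname):
    """Hostname -> list of IPv4 addresses using the system resolver."""
    return socket.gethostbyname_ex(hostname)[2]


def hostname_checks_out(client_ip, reverse=_system_reverse, forward=_system_forward):
    """Return the verified hostname for client_ip, or None if the check fails."""
    try:
        hostname = reverse(client_ip)   # pointer query
        addresses = forward(hostname)   # address query on the returned name
    except OSError:                     # lookup failures raise OSError subclasses
        return None
    # The name is trusted only if it maps back to the connecting address.
    return hostname if client_ip in addresses else None
```

A server would call `hostname_checks_out(peer_ip)` with the address taken from the incoming connection and treat `None` as "hostname not verified".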
We can see an example of this using the SunOS 4.1.3 resolver library. We have written a simple program that performs
a pointer query by calling the function gethostbyaddr. We have also set our /etc/resolv.conf file to use the
name server on the host noao.edu, which is across the SLIP link from the host sun. Figure 14.13 shows the
tcpdump output collected on the SLIP link when the function gethostbyaddr is called to fetch the name
corresponding to the IP address 140.252.1.29 (our host sun).


Resource Records

We've seen a few different types of resource records (RRs) so far: an IP address has a type of A, and PTR means a
pointer query. We've also seen that RRs are what a name server returns: answer RRs, authority RRs, and additional
information RRs. There are about 20 different types of resource records, some of which we'll now describe. Also, more
RR types are being added over time.
- A: An A record defines an IP address. It is stored as a 32-bit binary value.
- PTR: This is the pointer record used for pointer queries. The IP address is represented as a domain name (a sequence of labels) in the in-addr.arpa domain.
- CNAME: This stands for "canonical name." It is represented as a domain name (a sequence of labels). The domain name that has a canonical name is often called an alias. These are used by some FTP sites to provide an easy-to-remember alias for some other system.

	For example, the gated server (mentioned in Section 10.3) is available through anonymous FTP from the server gated.cornell.edu. But there is no system named gated; this is an alias for some other system. That other system is the canonical name for gated.cornell.edu:

		sun % host -t cname gated.cornell.edu
		gated.cornell.edu CNAME COMET.CIT.CORNELL.EDU

	Here we use the -t option to specify one particular query type.
- HINFO: Host information: two arbitrary character strings specifying the CPU and operating system. Not all sites provide HINFO records for all their systems, and the information provided may not be up to date.

		sun % host -t hinfo sun
		sun.tuc.noao.edu HINFO Sun-4/25 Sun4.1.3

- MX: Mail exchange records, which are used in the following scenarios: (1) A site that is not connected to the Internet can get an Internet-connected site to be its mail exchanger. The two sites then work out an alternative way to exchange any mail that arrives, often using the UUCP protocol. (2) MX records provide a way to deliver mail to an alternative host when the destination host is not available. (3) MX records allow organizations to provide virtual hosts that one can send mail to, such as cs.university.edu, even if a host with that name doesn't exist. (4) Organizations with firewall gateways can use MX records to limit connectivity to internal systems.

	Many sites that are not connected to the Internet have a UUCP link with an Internet-connected site such as UUNET. MX records are then provided so that electronic mail can be sent to the site using the standard user@host notation. For example, a fictitious domain foo.com might have the following MX records:

		sun % host -t mx foo.com
		foo.com MX relay1.UU.NET
		foo.com MX relay2.UU.NET

	MX records are used by mailers on hosts connected to the Internet. In this example the other mailers are told "if you have mail to send to user@foo.com, send the mail to relay1.uu.net or relay2.uu.net."

	MX records have 16-bit integers assigned to them, called preference values. If multiple MX records exist for a destination, they're used in order, starting with the smallest preference value.

	Another example of MX records handles the case when a host is down or unavailable. In that case the mailer uses the MX records only if it can't connect to the destination using TCP. In the case of the author's primary system, which is connected to the Internet by a SLIP connection, which is down most of the time, we have:

		sun % host -tv mx sun
		Query about sun for record types MX
		Trying sun within tuc.noao.edu ...
		Query done, 2 answers, authoritative status: no error
		sun.tuc.noao.edu 86400 IN MX 0 sun.tuc.noao.edu
		sun.tuc.noao.edu 86400 IN MX 10 noao.edu

	We also specified the -v option, to see the preference values. (This option also causes other fields to be output.) The second field, 86400, is the time-to-live value in seconds. This TTL is 24 hours (24 x 60 x 60). The third column, IN, is the class (Internet). We see that direct delivery to the host itself, the first MX record, has the lowest preference value of 0. If that doesn't work (i.e., the SLIP link is down), the next higher preference is used (10) and delivery is attempted to the host noao.edu. If that doesn't work, the sender will time out and retry at a later time.

	In Section 28.3 we show examples of SMTP mail delivery using MX records.
- NS: Name server record. These specify the authoritative name server for a domain. They are represented as domain names (a sequence of labels). We'll see examples of these records in the next section.
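
The MX preference rule described above (try exchangers in order, smallest preference value first) can be sketched as follows. The function names and the `try_delivery` callback are hypothetical; a real mailer does considerably more.

```python
def delivery_order(mx_records):
    """Sort (preference, exchanger) pairs so the smallest preference comes first."""
    return [host for _pref, host in sorted(mx_records, key=lambda rec: rec[0])]


def deliver(mx_records, try_delivery):
    """Attempt delivery to each exchanger in preference order.

    `try_delivery(host)` is a caller-supplied function returning True on success.
    Returns the host that accepted the mail, or None so the caller can retry later.
    """
    for host in delivery_order(mx_records):
        if try_delivery(host):
            return host
    return None
```

With the records shown for sun, `delivery_order([(10, "noao.edu"), (0, "sun.tuc.noao.edu")])` puts `sun.tuc.noao.edu` first and `noao.edu` second.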


Caching


To reduce the DNS traffic on the Internet, all name servers employ a cache. With the standard Unix implementation,
the cache is maintained in the server, not the resolver. Since the resolver is part of each application, and applications
come and go, putting the cache into the program that lives the entire time the system is up (the name server) makes
sense. This makes the cache available to any applications that use the server. Any other hosts at the site that use this
name server also share the server's cache.
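
A server-side cache of this kind could be modelled roughly as below. This is a toy sketch, not how BIND is implemented: the class name `RRCache` is invented, entries expire after the time-to-live carried in each resource record, and the clock is injectable so expiry can be demonstrated without waiting.

```python
import time


class RRCache:
    """Toy cache keyed by (name, record type) with per-entry TTL expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (expires_at, data)

    def put(self, name, rtype, data, ttl):
        """Store data for `ttl` seconds; names are compared case-insensitively."""
        self._entries[(name.lower(), rtype)] = (self._clock() + ttl, data)

    def get(self, name, rtype):
        """Return cached data, or None if it is absent or has expired."""
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return data
```

Every application on the host, and every other host using this server, would hit the same `RRCache` instance, which is why the cache lives in the server.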
In the scenario that we've used for our examples so far (Figure 14.9), we've run the clients on the host sun accessing the
name server across the SLIP link on the host noao.edu. We'll change that now and run the name server on the host
sun. In this way if we monitor the DNS traffic on the SLIP link using tcpdump, we'll only see queries that can't be
handled by the server out of its cache.
By default, the resolver looks for a name server on the local host (UDP port 53 or TCP port 53). We delete the
nameserver directive from our resolver file, leaving only the domain directive:
sun % cat /etc/resolv.conf
domain tuc.noao.edu
The absence of a nameserver directive in this file causes the resolver to use the name server on the local host.


UDP/TCP

When the resolver issues a query and the response comes back with the TC bit set ("truncated") it means the size of the
response exceeded 512 bytes, so only the first 512 bytes were returned by the server. The resolver normally issues the
request again, using TCP. This allows more than 512 bytes to be returned. (Recall our discussion of the maximum UDP
datagram size in Section 11.10.) Since TCP breaks up a stream of user data into what it calls segments, it can transfer
any amount of user data, using multiple segments.
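
The retry-over-TCP behaviour can be sketched like this. The transport functions `send_udp` and `send_tcp` are placeholders supplied by the caller, and the function names are assumptions; the only protocol detail used is the position of the TC bit in the flags field.

```python
import struct

TC_BIT = 0x0200  # "truncated" flag within the 16-bit flags field


def is_truncated(response: bytes) -> bool:
    """Check the TC bit in the header of a DNS response."""
    (flags,) = struct.unpack_from("!H", response, 2)  # flags follow the 2-byte ID
    return bool(flags & TC_BIT)


def query_with_tcp_fallback(query, send_udp, send_tcp):
    """Send over UDP first; reissue the same query over TCP if the reply was cut off."""
    response = send_udp(query)
    if is_truncated(response):
        response = send_tcp(query)
    return response
```

The caller decides how `send_udp` and `send_tcp` actually move bytes, which keeps the fallback decision separate from socket handling.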
Also, when a secondary name server for a domain starts up it performs a zone transfer from the primary name server
for the domain. We also said that the secondary queries the primary on a regular basis (often every 3 hours) to see if the primary has had its tables updated, and if so, a zone transfer is performed. Zone transfers are done using TCP, since
there is much more data to transfer than a single query or response.
Since the DNS primarily uses UDP, both the resolver and the name server must perform their own timeout and
retransmission. Also, unlike many other Internet applications that use UDP (TFTP, BOOTP, and SNMP), which
operate mostly on local area networks, DNS queries and responses often traverse wide area networks. The packet loss
rate and variability in round-trip times are normally higher on a WAN than a LAN, increasing the importance of a good
retransmission and timeout algorithm for DNS clients.
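
One simple retransmission strategy is to double the timeout after each lost datagram. The sketch below is only an illustration: the starting timeout, the retry count, and the `send_and_wait` callback (assumed to raise `TimeoutError` when no reply arrives in time) are all assumptions, not values taken from any particular resolver.

```python
def query_with_retries(send_and_wait, query, first_timeout=2.0, max_tries=4):
    """Retransmit `query` with exponential backoff until a reply arrives.

    `send_and_wait(query, timeout)` is a caller-supplied function that sends the
    datagram and returns the reply, or raises TimeoutError after `timeout` seconds.
    """
    timeout = first_timeout
    for _attempt in range(max_tries):
        try:
            return send_and_wait(query, timeout)
        except TimeoutError:
            timeout *= 2  # wait longer before the next retransmission
    raise TimeoutError(f"no reply after {max_tries} attempts")
```

A fixed doubling schedule ignores measured round-trip times; an adaptive algorithm would suit WAN paths better but is beyond this sketch.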


Workflow

The following 11 steps take place, assuming none of the information is already cached by the client or server:
1. The client starts and calls its resolver function to convert the hostname that we typed into an IP address. A query
of type A is sent to a root server.
2. The root server's response contains the name servers for the server's domain.
3. The client's resolver reissues the query of type A to the server's name server. This query normally has the
recursion-desired flag set.
4. The response comes back with the IP address of the server host.
5. The Rlogin client establishes a TCP connection with the Rlogin server. (Chapter 18 provides all the details of
this step.) Three packets are exchanged between the client and server TCP modules.
6. The Rlogin server receives the connection from the client and calls its resolver to obtain the name of the client
host, given the IP address that the server receives from its TCP. This is a PTR query issued to a root name
server. This root server can be different from the root server used by the client in step 1.
7. The root server's response contains the name servers for the client's in-addr.arpa domain.
8. The server's resolver reissues the PTR query to the client's name server.
9. The PTR response contains the FQDN of the client host.
10. The server's resolver issues a query of type A to the client's name server, asking for the IP addresses
corresponding to the name returned in the previous step. This may be done automatically by the server's
gethostbyaddr function, as we described in Section 14.5, otherwise the Rlogin server does this step
explicitly. Also, the client's name server is often the same as the client's in-addr.arpa name server, but this
isn't required.
11. The response from the client's name server contains the A records for the client host. The Rlogin server compares the A records with the IP address from the client's TCP connection request.

Caching can reduce the number of packets exchanged in this sequence.
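
The referral pattern in steps 1 to 4 and 6 to 9 can be mimicked with a toy in-memory model. Everything here is illustrative: the server labels `root` and `noao-ns`, the table contents, and the function names are invented, and no real DNS traffic is involved.

```python
# Toy "servers": each either answers a (name, type) question or refers the
# resolver to another server responsible for an enclosing zone.
SERVERS = {
    "root": {
        "referrals": {"noao.edu.": "noao-ns", "252.140.in-addr.arpa.": "noao-ns"},
    },
    "noao-ns": {
        "records": {
            ("sun.tuc.noao.edu.", "A"): ["140.252.13.33"],
            ("33.13.252.140.in-addr.arpa.", "PTR"): ["sun.tuc.noao.edu."],
        },
    },
}


def ask(server, name, rtype):
    """Return ('answer', data), ('referral', next_server) or ('name-error', None)."""
    data = SERVERS[server]
    answer = data.get("records", {}).get((name, rtype))
    if answer is not None:
        return "answer", answer
    for zone, next_server in data.get("referrals", {}).items():
        if name == zone or name.endswith("." + zone):
            return "referral", next_server
    return "name-error", None


def resolve_iteratively(name, rtype, start="root", max_hops=5):
    """Follow referrals from the root until some server answers."""
    server, trail = start, []
    for _ in range(max_hops):
        kind, value = ask(server, name, rtype)
        trail.append((server, kind))
        if kind == "answer":
            return value, trail
        if kind == "name-error":
            return None, trail
        server = value  # reissue the same query to the referred server
    return None, trail
```

Resolving `sun.tuc.noao.edu.` for type A visits `root` (referral) and then `noao-ns` (answer); the PTR lookup for `33.13.252.140.in-addr.arpa.` takes the same two hops.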