The following paper was originally published in the
Proceedings of the Fifth USENIX UNIX Security Symposium
              Salt Lake City, Utah, June 1995.
      DNS and BIND Security Issues
                      Paul Vixie
            Internet Software Consortium
                                                   <paul@vix.com>
                                                          2 May, 1995
                                                            Abstract
          Efforts are underway to add security to the DNS protocol. We have observed that if BIND would just
          do what the DNS specifications say it should do, stop crashing, and start checking its inputs, then most
          of the existing security holes in DNS as practiced would go away. To be sure, attackers would still
          have a pretty easy time co-opting DNS in their break-in attempts. Our aim has been to get BIND to
          the point where its only vulnerabilities are due to the DNS protocol, and not to the implementation.
          This paper describes our progress to date.
1. Introduction

Many were the reasons for starting work on BIND again a few years back.
The BIND server and resolver are critical to the daily activities of
millions of Internet users, yet they have each been infested with bugs
from their first day of use. We have made some good progress on plugging
the memory leaks and core dumps that BIND is famous for, and along the
way we have found a lot of ways to make BIND more secure.

Many of the classic security breaches in the history of computers and
computer networking have had to do not with fundamental algorithm or
protocol flaws, but with implementation errors. Sometimes those errors
take the form of ignorant or “security unaware” programming, such as
collecting potentially unbounded streams of data from the network using
functions which do not know the length of their destination buffers, or
the use of predictable magic cookies since the programmer’s goal is to
prevent accidental data errors rather than intentional ones. Other
times, a code branch rarely or never taken in normal use is found to
have “security fatal” bugs or even deliberate back doors or loopholes.

While we do not intend to demean the efforts of those involved in
upgrading the Internet protocols to make security a more realistic goal,
we have observed that if BIND would just do what the DNS specifications
say it should do, stop crashing, and start checking its inputs, then
most of the existing security holes in DNS as practiced would go away.
To be sure, attackers would still have a pretty easy time co-opting DNS
in their break-in attempts. Our aim has been to get BIND to the point
where its only vulnerabilities are due to the DNS protocol, and not to
the implementation.

2. Why Is DNS Security Important?

Let’s say that a security conscious user always uses a DES
challenge/response device when connecting to hosts outside the local
network, but when connecting locally, she figures that it is safe to
send her password in clear text since she knows[1] that outsiders cannot
sniff on her private network. Further assume that hers is one of the
many installations which do not restrict outbound TCP connections, on
the assumption that firewalls are only necessary to keep people out[2].
If her name server is able to receive UDP packets on port 53 from
outside her local network, then this security conscious user is in for a
potentially rough ride.

    [1] We’ll assume that she is correct.
    [2] An assumption with which we do not agree.

Before we begin, we’d like to emphasize that the examples are not drawn
from theoretical studies, but rather from the tcpdump command running on
real networks.
Folks over on the Dark Side have tools to exploit these weaknesses, and
they are real, right here, right now. We learned of these weaknesses by
studying some successful attacks, not just by a careful examination of
the protocol and the BIND source code.

2.1. Misdirected Destination

A user asks her telnet client to connect to host1. Her client asks the
name server for the address of host1, receives a corrupt answer, and
then initiates a TCP connection to the telnet server at that address.
This address does not correspond to her intended host, but it displays
the usual greeting, and she types her usual login and password. The
connection drops, she tries it again, all is well, she chalks it up to a
gremlin in the network and forgets all about it. But there is a gremlin
in her network, and that gremlin just harvested her password.

2.2. Misdirected Source

If that same user depends on name based authentication when inside what
she considers to be the safe confines of her internal network, she’s in
for another hellride. Anyone on any interior host can almost trivially
bypass name based authentication, causing this user’s hosts to believe
that “they” are “her” and therefore allowing them to log in with her
access rights and privileges. Any host which is allowed to accept
incoming connections from outside the local network could be fooled in
this same way, but by an outside host.

3. How Did That Happen?

Clearly, the above activities were not design goals of the DNS protocol
or of the BIND implementation of that protocol. Let’s look at how they
could occur.

3.1. Misdirected Destination

It could be as simple as a forged response sent directly to her
resolver. Even after 25 years of experience, the Internet still has no
production routers which disallow packets with impossible source
addresses. So if you can route packets to someone, you can make those
packets look as though they came from a close and trusted host – even if
they originated outside that host’s network. If an attacker can predict
the time that a query will be sent, he need only flood the resolver with
bogus replies and hope that his bogons arrive earlier than the real
answer. Predicting the UDP port used by the resolver for any given query
might require that a novice attacker spend several minutes thinking
about it, but many attackers will consider that time well spent.

This would not have worked in our example, since we’re assuming a
one-way firewall. Her resolver isn’t reachable by packets from outside
her net – but her name server is. If that name server can be corrupted,
even for an instant, then an attacker can redirect telnet sessions
(containing passwords), electronic mail (containing proprietary
information), or even other DNS queries (thus using one name server to
help corrupt others.) Every one of those things has been seen in action
– we’re not just being paranoid.

3.2. Misdirected Source

On late model BSD-derived systems, name based authentication usually
takes the form of files containing lists of host names or addresses,
possibly including a user name to be matched against the remote
(“incoming”) user name[3]. A convention is upheld whereby certain TCP
port numbers[4] are able to be bound only by processes executing with
so-called “super user” privileges[5]. This rather brittle chain of
causality permits the BSD ruserok() library call to assume that the
remote user name given in the data stream is “authentic” from the point
of view of the remote host and its administrators. Users are not allowed
to claim, when they use the rsh or rdist or rlogin commands, that they
are somebody they’re not – at least on well run, trustworthy multiuser
hosts.

    [3] E.g., hosts.equiv, hosts.lpd, ~/.rhosts
    [4] Those from 512 to 1023.
    [5] This convention is of course meaningless on single-user hosts.

BSD’s security took a giant step forward back in 1989 or so, when the
callers of ruserok() were encouraged to do more than blindly assume that
the result of gethostbyaddr(getpeername(remote)) was accurate. It used
to be that whatever DNS gave as the name corresponding to the source
address of a connection was used directly as the search key when
scanning ~/.rhosts and its brethren. After someone noticed that the name
server being asked for this information was the one belonging to the
connection’s initiator, the convention changed: Now, after calling
gethostbyaddr(), the result is passed back through gethostbyname() to
see if the addresses and names all match. The name server for
gethostbyname() will be, barring corruption, authoritative for any given
host name in ~/.rhosts (et al.) Someone who can make their address
appear to map to one of your hosts will have to take some extra steps to
also make your host appear to have one of his addresses.

(SunOS put this check into gethostbyaddr() – an error that will live in
infamy, since not every caller of that function wants to get an “error”
return status when the
forward and reverse lookups yield asymmetric results. The proper place
for this mapping logic is in those applications and library calls which
intend to use the data for some kind of authentication – it is not a
naming issue per se, and does not belong in the resolver.)

As effective as that extra gethostbyname() call has been, its goal was
to keep attackers from just editing their IN-ADDR.ARPA zones and zooming
on in. No thought was given to whether the name servers could be
corrupted. So while an attacker has a little more work to do now than in
the Old Days, it is still trivially easy to pollute the caches of the
set of servers who will be asked for the gethostbyaddr() and
gethostbyname() answers, or to flood the resolvers with bogus responses
at the time that they are predicted to be waiting for the answers.

If an attacker can reach the victim’s host, they can probably make their
host name seem to be almost any arbitrary string when viewed by the
victim’s rlogind. And, if they can also break “super user” on the source
host (or if that host is their own office workstation), they can make
the victim see any arbitrary remote user name. If this attacker knows
any of the contents of your ~/.rhosts files or your hosts.equiv file –
and these are eminently guessable – then they are in.

4. Protocol View of Weaknesses

One way of looking at these weaknesses is from an operational point of
view, which given the current state of the art, tells us: name based
authentication is inherently insecure. Sessions (whether TELNET, NFS, or
whatever) should require something stronger than trying to determine a
host’s name and then looking for that name in some statically configured
list. ([RFC1510] and [RFC1760] are each cause for optimism.)

From the bottom, though, these weaknesses all come with particular sets
of details and can be described in terms of DNS protocol elements. As
implementors we are more interested in this view than in the more
political questions of Global Internet Authentication. So let’s have a
look at the packets, shall we? After that we’ll take a look at the ways
they can be perverted.

We do not intend to present an exhaustive description of DNS –
[RFC1034] and [RFC1035] already fill that need. Our goal in this section
is to present enough information about DNS that someone unfamiliar with
its details can still understand the security ramifications of some of
DNS’s design choices. If this report disagrees with [RFC1034] or
[RFC1035] in any detail, it is most likely that the report is wrong.

4.1. DNS Datagram Formats

DNS queries and responses use a common format, though not all protocol
elements are used all the time. The simplest case, described here, uses
IP/UDP where each datagram contains one DNS query or response. DNS’s use
of IP/TCP is beyond the scope of this report other than as it affects
zone transfers, which we will discuss shortly.

Header Section: Describes the other sections, has flags including RD
    (recursion desired) and AA (authoritative answer), and most
    important for our discussion, has a 16 bit “query ID.”
Query Section: Contains the name, class, and type of the resource
    record set (“RRset”) being queried for. DNS permits multiple queries
    in this section but this has never been tried and is not well
    specified.
Answer Section: Always empty in queries. Contains the RRset matching
    the query, or is empty if the name doesn’t exist, if no data matched
    the query, or if a nonrecursive query results in a referral.
Authority Section: Always empty in queries. Can be empty in responses.
    If nonempty, it contains the NS and SOA RRs for the enclosing zone.
    This is sometimes called “referral data.”
Additional Data Section: Always empty in queries. Can be empty in
    responses. If the answer or authority section contains any RRs whose
    data fields contain RRnames, the RRsets for those RRnames appear
    here.

4.2. Servers and Resolvers

The client in DNS is called a “resolver.” The server is called,
appropriately enough, a “name server.” Resolvers have some static
configuration information, consisting of a domain “search list” and a
list of name server addresses. Theoretically, a resolver can also be
configured with a static map of domains to name server addresses,
allowing queries to be forwarded directly to appropriate name servers
for some set of locally known domains. BIND does not implement this last
part yet. The resolver’s list of name server addresses had better
include at least one recursive name server, or the DNS name space is
going to look pretty small.

4.3. Recursion

To “recurse” on a query means that when a query comes in for an RRset
not known to the server receiving it, that server will forward it to
some name server more likely to know the answer. In some cases, the
forwarding server will know the name server list for the exact domain or
parent domain of the query. More often, a grandparent
domain’s servers are known, or no servers are known and the query is
sent all the way to the root name servers (which are co-operated by the
InterNIC and a worldwide cadre of volunteers.) There is a flag in the
query called RD which, if set, specifies that recursion is desired; if
clear, a name server will answer queries for unknown RRsets with an
appropriate error (“name unknown” or “no data,” depending.)

Sending nonrecursive queries is a fine way to find out what a name
server already knows, since a recursive query will get you an answer
even if the name server had to go searching for it at the time of your
query.

4.4. Referrals

If a name server receives a query for a <name,class,type> tuple that it
knows it has delegated, it answers with what’s called a “referral.” A
referral response has an empty answer section but a nonempty authority
section; the intent of this message is to tell another server “the name
you asked for exists, but I don’t have the answer, go try these other
servers.” Bogus referrals are a fine way to pollute a cache indirectly –
if you can snoop on a forwarded query and then inject a referral
response, you can make the forwarding server effectively believe that
you are the delegated server for an entire subtree of the DNS name
space. This is actually the easiest way to pollute a cache since there’s
no guessing involved: You know the source address, source UDP port, and
query ID by inspection. You even know the query name. The only trick is
in breaking into a host on a network backbone so that you can actually
see the queries being forwarded to the root servers. This has been
done[6], but not often.

    [6] No, we’re not going to name names.

4.5. Authority: Masters and Slaves

To be “authoritative” means that a name server has an entire “zone”
loaded, either via a “master file” that was created by the name server
administrator, or via a “zone transfer,” which is a TCP session with
another name server. The former kind of server is called the “master”
and the latter is a “slave.” Slaves generally do their zone transfers
from the master, but sometimes firewalls are interposed and it becomes
necessary to have slaves pull their data from other slaves, which are
themselves stationed at the border, perhaps even on the firewall itself.

Masters and slaves will set the AA flag on any response whose answer
section contains only RRsets from authoritative zones. The AA flag will
be clear if any RRset in the answer section came from the “cache,” which
is what we call the portion of the DNS name space that is outside all of
a server’s zones of authority. If a server has no zones of authority,
then all of its answers will be nonauthoritative since all it has is a
cache. This kind of server is sometimes called a “caching only” or
“forwarding” server.

4.6. Forwarding -vs- Recursion

When a name server receives a query for data it doesn’t have, it can
either send back an error response (if it is authoritative for the
name’s zone, it knows that either the name or data doesn’t exist), send
back a referral (if running in “nonrecursive mode” as the root servers
all do, or if the RD flag is clear in the query), or it can forward the
query. This last possibility is of interest to us in our security study,
because of what will happen when some response finally comes back.
Forwarding is not a three-party transaction – a forwarded query results
in a response to the forwarder who must then complete the original
transaction by forwarding the response back to the originator.

BIND takes its forwarding duties one step further, as an optimization
attempt: It caches all the RRsets in the forwarded response. This
promiscuity is the source of most of BIND’s bad reputation in both the
operations and the security fields. Other servers are free to put almost
anything into the response, even if it has nothing to do with the query.
As shown in [Bel95a], this has disastrous effects on security.

It is worth noting that the first query handled by a forwarding or
recursive name server for a given RRset is likely to result, ultimately,
in it forwarding back an answer obtained from an authoritative name
server – thus the AA flag will be set in the response, even though the
forwarder is not itself authoritative for the name. Subsequent queries
to the same name server for the same RRset will probably be satisfied
from the cache, and in that case the AA flag will not be set in the
response. You can see this in action using the ISI dig tool from the
BIND kit.

4.7. Forwarding -vs- Timeouts

When BIND’s resolver needs to forward a query, it chooses the next name
server address from its statically configured list, sends the query,
waits a short time for an answer, chooses the next name server address,
sends and waits, and so on. BIND’s timeouts are fairly short; it will
often send a query to name server #1, then to name server #2, then the
response will come in from name server #1, and the resolver will close
its socket such that when name server #2’s response comes in a second or
so later the kernel sends back an ICMP Port Unreachable
message. We wish there were a way to ask the kernel not to send these,
other than keeping the socket open longer (which would lead to resource
starvation among kernel protocol control blocks.) Lengthening the
timeout would lead to longer application-visible delays when a
statically configured name server goes off the air, but life is full of
hard choices.

4.8. Query IDs and UDP Ports

Each query sent out by a resolver will come from some UDP port on some
address of the resolver’s host, and its header will contain a unique (in
the context of the source address and port number) query ID. UDP port
numbers and DNS query IDs are both unsigned 16 bit quantities, giving a
range from 0 to 65535 for each. Port numbers could be conserved and
reused by the resolver, but BIND currently opens a new socket for each
query, and kernels tend to use an LRU mechanism when assigning port
numbers to new sockets. The tuple <address,port,queryID> forms a unique
identifier that servers can use to keep track of queries in progress.
Resolvers should verify that the query ID of the response matches that
of their query.

4.9. Delegations, Zones, Domains, and Subdomains

Strictly speaking, every DNS name is a domain. All domains except the
root are also “subdomains.” Any time a subdomain is delegated to some
other master name server, a “zone cut” is said to exist. A zone consists
of all names from a zone cut downward to either terminal names
(sometimes called “leaf domains”) or other, deeper zone cuts.

The most common case of a zone begins at a subdomain and has no zone
cuts beneath it. The most famous zone is the root (“.”) which has no
terminal names, just delegations.

There are two views of a delegation: The parent zone, which has some NS
RRs at the cut, and the child zone, which has a superset of those NS RRs
and also an SOA RR. When we say “superset” we mean that a child will
have at least the NS RRs known by its parent, and perhaps some
additional NS RRs that the parent does not know about.

4.10. Lame Delegations

If a delegation NS RR names a host which is not authoritative for the
zone, then that host when queried nonrecursively for names in that zone
will answer with a delegation to a higher (that is, closer to the root)
authority. This is an error condition as perceived by the server that
forwarded a nonrecursive query – if a name server is listed in an NS RR,
it is supposed to have the zone. It is reasonable to declare failure at
this point, though perhaps a bit severe.

BINDs from version 4.9 have syslog’ed the condition and gone on to try
the other delegated servers. The syslog volume generated by this
condition is the cause of more than half the questions we see about BIND
from new name server administrators. The only way to fix the condition
is to get someone to edit the delegation to remove the nonauthoritative
name server, or to get someone to make the name server authoritative.
Either way it’s not something the detecting server’s administrator can
do anything about directly; we hope that the continued syslog volume
will lead to more hate mail being sent to the administrators of broken
zones, thus ultimately leading to a decline in the number of broken
zones. We have been accused of optimism in this matter.

4.11. Glue

When transmitting a zone via a TCP “zone transfer,” the general rule is
to send only the RRsets whose names lie within the zone being
transferred, which is to say starting from the initial zone cut, and
proceeding downward (away from the root) to include all names which are
not further delegated. There is an exception to this, called “glue.” Any
address records (A RRs) which are referred to by an NS RR inside the
zone (at the initial cut or any downward cuts) must be included, even if
they lie beneath one of the downward zone cuts.

If this information is not included in the zone transfer, then referral
responses won’t be able to include those addresses in their additional
data sections. In the absence of that additional data, the name servers
will not be reachable except by servers who have the zone – and that’s
not very useful. It is important that a server only send (or accept)
relevant glue during zone transfers, since otherwise this becomes an
easy way for your cache to become polluted.

5. What We Have Fixed

BINDs from version 4.9 have plugged a lot of holes with respect to
earlier versions. An incomplete list follows:

5.1. Cache Tagging

BIND now maintains for each cached RR a “credibility” level showing
whether the data came from a zone, an authoritative answer, an authority
section, or additional data section. When a more credible RRset comes
in, the old one is completely wiped out. Older BINDs blindly aggregated
data from all sources, paying no attention to the maxim that some
sources are better than others.
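
The credibility rule of section 5.1 can be illustrated with a minimal
Python sketch. This is not BIND's code or data structure: the names
Credibility, CredibilityCache, offer and lookup are hypothetical. Two
points are assumptions on our part. First, the ranking takes the order
in which section 5.1 lists the sources as most to least credible.
Second, an equally credible RRset is allowed to replace the one already
held.

```python
from enum import IntEnum


class Credibility(IntEnum):
    """Origin of a cached RRset, least to most credible (assumed order)."""
    ADDITIONAL = 1    # additional data section of a response
    AUTHORITY = 2     # authority section of a response
    AUTH_ANSWER = 3   # answer section of an authoritative answer
    ZONE = 4          # loaded from one of the server's own zones


class CredibilityCache:
    """Toy RRset cache keyed by (name, type); hypothetical, not BIND's."""

    def __init__(self):
        self._rrsets = {}

    def offer(self, name, rrtype, rdata, cred):
        """Store an RRset unless a more credible one is already held.

        Returns True if the offered RRset was stored.
        """
        key = (name.lower(), rrtype)   # DNS names compare case-insensitively
        held = self._rrsets.get(key)
        if held is not None and held[0] > cred:
            return False               # keep the more credible data
        # Replace the old RRset outright; never merge the two.
        self._rrsets[key] = (cred, frozenset(rdata))
        return True

    def lookup(self, name, rrtype):
        """Return the cached rdata set, or None if nothing is cached."""
        held = self._rrsets.get((name.lower(), rrtype))
        return None if held is None else held[1]


cache = CredibilityCache()
# Unsolicited additional data is accepted while nothing better is known.
cache.offer("host1.example.com", "A", ["192.0.2.1"], Credibility.ADDITIONAL)
# An authoritative answer wipes out the less credible RRset.
cache.offer("host1.example.com", "A", ["192.0.2.9"], Credibility.AUTH_ANSWER)
# Later additional data can no longer displace it.
cache.offer("host1.example.com", "A", ["203.0.113.5"], Credibility.ADDITIONAL)
```

The point of replacing rather than merging is that a spurious address
from a low-credibility source never ends up alongside the addresses
supplied by a better source.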
Each RR also has the address of the name server that sent it to us. This
can be seen in a cache dump when you’re looking at some bad data and
wondering how it got to you.

5.2. Additional Data Promiscuity

We accelerate the TTL decline for data which arrived as additional data. We
are considering not caching it at all other than as necessary for
forwarding the response – see below.

5.3. Irrelevant Answers

We check the response to ensure that all RRsets in each section have names
and types that make sense in the context of the query and answer sections.
Including spurious additional data won’t automatically pollute a cache any
more; as of BIND 4.9.3 it is necessary that the answer section contain a
CNAME RR to introduce an arbitrary name, after which it’s business as usual
for cache polluters. This is the best we can do without a protocol change.

5.4. Nonmatching Answers

Believe it or not, older BINDs did not check that the answer name matched
the query name. Now, within the limits of CNAMEs and wildcard answers, BIND
will insist that a response answers the right question. This error was
particularly pernicious with respect to some of the name ↔ address symmetry
checking, since the answer’s RRname sets the name in the resolver’s
response structure, which meant that callers of gethostbyname() could end
up comparing a foreign name to another foreign name.

5.5. Logging

Many of the detectable conditions indicating a probable break-in attempt
were in the past either not detected, or treated as protocol errors (which
is to say, silently worked around). BIND now fairly shrieks whenever it has
even the slightest cause for alarm, which is a mixed blessing since the
volume of its complaints is so high that most name server administrators
pay no attention.

The syslog data is of greatest interest during the post mortem analysis of
a break-in attempt. The log of unsolicited responses, for example, can show
attempts at cache pollution during the early stages – before the attackers
switched to whatever technology actually got them in, or set off your
alarms, or whatever. Be aware while examining these logs that some systems
(most notably SunOS) cannot cause packets to come from a particular address
if they have more than one interface – so if you’re on the wrong side of a
multihomed SunOS name server, all of its responses will appear to be
“unsolicited.”

5.6. Glue

BINDs from version 4.9 restrict glue to just the A RRs under the delegation
point, whereas previous versions included all the A RRs referred to by a
zone’s NS RRs – even those above the zone. By “restrict” we mean that BIND
will be conservative both in what it generates and what it accepts. This
may fly in the face of the Robustness Principle¹ of [RFC1123], but the old
behaviour was just simply wrong.

¹ “Be liberal in what you accept, and conservative in what you send.”

6. What We Cannot Fix

We are counting on the IETF DNSSEC effort to bring us a DNS protocol
revision that authoritatively signs responses. With that in place we will
all stop worrying about attackers who spoof their source addresses, predict
our UDP port numbers and query ID numbers, and so on. Response data will be
objectively verifiable, independent of whether it is even a response to
some query we have sent. Until DNSSEC is finished and in wide use, there
are some things we’re just going to have to live with.

6.1. Query ID Prediction

With only 16 bits worth of query ID and 16 bits worth of UDP port number,
it’s hard not to be predictable. A determined attacker can try all the
numbers in a very short time and can use patterns derived from examination
of the freely available BIND source code. Even if we had a white noise
generator to help randomize our numbers, it’s just too easy to try them
all.

6.2. CNAME Indirection

As mentioned previously, a CNAME response allows a remote name server to
introduce a new name for an RRset of arbitrary type. Forwarders receiving
such a response should not cache those RRsets (as BIND currently does), but
even with that precaution it will be possible to use a CNAME response to
bypass the name/address symmetry checking.
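
To make the CNAME limitation concrete, here is a rough Python sketch of the
kind of answer-section check described in Section 5.4, including the CNAME
allowance that makes indirection possible. It is not BIND's code: the
names_answered function and the (owner, type, rdata) tuple layout are
assumptions made for illustration, wildcard answers are ignored, and the
sketch assumes each CNAME appears before the records at its target.

```python
def names_answered(qname, qtype, answers):
    """Return the RRs in `answers` that legitimately answer the question.

    `answers` is a list of (owner, rrtype, rdata) tuples in answer-section
    order. An RR is accepted only if its owner is the query name or the
    target of a CNAME that was itself accepted earlier in the list.
    """
    current = qname.lower().rstrip(".")
    accepted = []
    seen = {current}
    for owner, rrtype, rdata in answers:
        owner = owner.lower().rstrip(".")
        if owner != current:
            continue  # nonmatching answer: drop it
        if rrtype == "CNAME" and qtype != "CNAME":
            target = rdata.lower().rstrip(".")
            accepted.append((owner, rrtype, target))
            if target in seen:
                break  # CNAME loop, stop following the chain
            seen.add(target)
            current = target  # the remote server chose this new name
        elif rrtype == qtype:
            accepted.append((owner, rrtype, rdata))
    return accepted
```

A bare A RR for a foreign name is rejected by this check, but the same RR
is accepted once the response also carries a CNAME from the query name to
that foreign name. Since the responding server picks the CNAME target, the
check narrows the hole without closing it.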
7. What We Would Like To Fix

Every change to BIND has the potential to push the Internet into the final
abyss. We are therefore quite conservative about anything that looks like
it could have far reaching consequences, which is to say, just about
anything¹.

¹ A Usenet article once opined, “BIND is like a train wreck inside.”

7.1. Query Restarts

Some of the information needed to properly validate a DNS response is
expensive (in terms of bandwidth and delay) to obtain, and for that reason
it is inappropriate for every resolver to exhaustively validate every
response it receives. Recursive or forwarding name servers, on the other
hand, have (or should be able to obtain) all the information the DNS has to
offer, and it would be a good thing if the name server validated responses
before forwarding them to the client. BIND does not currently do this,
since it is not possible to edit responses in situ and we are uncomfortable
with the idea of BIND autonomously deciding that certain responses should
not be forwarded at all.

Our current plan for circumventing this problem is to restart all queries.
To “restart” means that upon receiving an answer from a forwarded query, a
name server will validate the response and insert “known good” data into
its cache, and then pretend that the original query had “just now” been
received. All the original RRsets would be looked up again, and if any are
still missing (either because no response has yet included them, or because
the responses that included them were invalid in some way), new queries
would be generated to bring in the missing data. Query restarts are the
only way to solve certain other problems currently being encountered by
BIND² – the security benefits will be a happy side effect.

² Out of zone CNAMEs, for example.

One interesting question we’re pondering about query restarts is whether to
preserve the AA flag, which as discussed earlier will tend to be set on
forwarded responses if those responses come from an authoritative server,
but will tend to be clear on responses satisfied from the forwarder’s
cache. We could maintain the current semantics with the hierarchical cache
described below, but it’s not clear that the AA flag on forwarded responses
really matters that much. DNSv2 will probably have an AD flag – authority
desired – to force forwarding in spite of any cache. The proposed AD flag
will probably have to bypass the query restart logic described here.

7.2. Hierarchical Cache

We would like to segment the cache such that additional data can be cached
for the duration of a query’s restarts, but not used to satisfy other
queries (either as answer data, authority data, or additional data).
Ideally, the only things we would ever cache would be the answer and
authority sections, and only those from authoritative answers (AA flag
set). BIND’s current cache design is not ready for this kind of overloading
– we’ve pushed it about as far as it will go just by adding the credibility
tags described earlier. What’s needed is a multilevel translucent cache
such that each lookup can specify a stack of caches to be searched, and
each cache can be managed by an appropriate purge policy.

7.3. Empty Nonterminal Names

One of the gaping holes in BIND’s new nonpromiscuous policy towards cache
data is that the credibility and zone tags are held in the RR, not in the
name. It is possible to determine, knowing only a name, whether that name
lies within any of a server’s zones of authority. BIND doesn’t do that
right now; it currently checks the RRs looking for any that have a zone
tag, and if none are found it assumes that the name is in the cache. This
is bad news in the case of empty nonterminal names – those names which have
no RRs and are only present to keep two dots from smashing into each other.

The ARPA domain was once empty other than for its IN-ADDR.ARPA subdomain,
and eventually someone accidentally fed a root server some NS RRs at that
name. That root server told the other root servers, and those root servers
told every name server on the Internet, and pretty soon nobody anywhere
could do address → name translations. We quickly added some NS RRs at the
ARPA domain and cold started the universe.

It would be better if BIND did not need data to be present at a name in
order to know that that name was inside a local zone of authority. Astute
readers will note that it’s really quite easy to add new names to someone
else’s authority zones – just keep in mind during your experiments that
these new names won’t appear in zone transfers, so you will have to infect
each authoritative name server manually.

7.4. Unified Zone Cut View

Right now the answer you’ll get for an NS query for a domain will depend on
who you ask. If you ask a server of the parent zone, you will get the
delegation information from “above” the zone cut. If you ask a server of
the zone itself, you will get the actual authority data (an NS RRset and an
SOA). We believe it would be better
in most cases to have the server for the parent zone use its delegation
data only as hints, and that it should go out and ask the servers named
therein for their view of the real delegation data. This would prevent most
of the current instances of lame delegation, since the lameness would be
detected by the server for the parent zone where it can most likely be
fixed by the local name server administrator. The lame data can be elided
from delegation responses, thus preventing other servers from following it
and having each other server syslog the lameness information to their
local, helpless, name server administrator. Naturally we would extend the
logic so that the zone servers validate their own delegation information
and likewise elide lame information from their responses.

This unification would put a stop to the unpleasant question, “how can both
the parent and child zones answer authoritatively if they are allowed to
answer differently?” We may implement a stopgap whereby parents stop
setting the AA flag on referral responses – since the child is really the
authority. Unfortunately, last time we changed the way we handed out
referrals, some major clients could not handle it and we had to back out to
older, broken behaviour. Keeping track of client sensitivities has become a
first order task for us.

What we’re wrestling with on the unification theory is whether the root
servers should try to verify their delegation data. With millions of zones
delegated, it could take quite a while for each root server to get this
done at startup time, so if we do it, it’ll have to come after we make the
cache persistent.

8. DNSSEC – The IETF DNS Security WG

As we’ve mentioned several times in this paper, there is presently work
underway to add security to DNS. The current model is something like a “web
of trust,” using public key technology. A new KEY RR holds the public key
and is added to the delegation data. This key is sufficient to validate
signed answers but not to actually sign them. Signing is done by the
authoritative servers, and the SIG RR is used to carry the signature of any
given RRset.

Once DNSSEC is widely implemented, it will be possible to determine from
examination of a DNS response whether its contents are authentic. This
sounds simple but it has deep reaching consequences in both the protocol
and the implementation – which is why it’s taken more than a year to choose
a security model and design a solution. We expect it to be another year
before DNSSEC is in wide use on the leading edge, and at least a year after
that before its use is commonplace on the Internet.

9. Which BIND Version Plugs Which Hole?

Always assume that you need the latest BIND you can lay your hands on. Our
RCS libraries have the whole sordid story, and from them we could derive a
table of Versions -vs- Vulnerabilities. You can bet that the upper class of
attackers can do this as well. Deriving that table would be a lot of work
and publishing it might do more harm (giving folks the false idea that they
don’t need to upgrade their BIND) than good (letting folks see how bad
things really are). When we took over BIND, the latest version was UCB
4.8.3. Our first release was DECWRL 4.9, which contained quite a few
security related changes. Our current release as of this writing is ISC
4.9.3¹, and it also contains quite a few security related changes.

¹ See http://www.isc.org/isc/.

References

[Bel95a]  Steven M. Bellovin. Using the Domain Name System for System
          Break-ins. In Proceedings of the Fifth Usenix UNIX Security
          Symposium, Salt Lake City, UT. AT&T Bell Laboratories, 1995.

[RFC1034] Paul V. Mockapetris (ISI). RFC 1034 – Domain Names – Concepts
          and Facilities, IETF, 1987.

[RFC1035] Paul V. Mockapetris (ISI). RFC 1035 – Domain Names –
          Implementation and Specification, IETF, 1987.

[RFC1123] R. Braden, Editor. RFC 1123 – Requirements for Internet Hosts –
          Application and Support, IETF, 1989.

[RFC1510] John T. Kohl, et al. RFC 1510 – The Kerberos Network
          Authentication Service (V5), IETF, 1993.

[RFC1760] N. Haller. RFC 1760 – The S/KEY One-Time Password System, IETF,
          1995.