Although the Domain Name System (DNS) was intended to be a scalable, distributed means of performing name-to-IP mappings, its flexible design has allowed it to grow far beyond its original goals. While most people are familiar with it through Web browsing, many systems depend on fast and consistent DNS performance. Mail servers, Web proxy servers, and content distribution networks (CDNs) must all resolve hundreds or even thousands of DNS names in short periods of time, and a failure in DNS may cause a service failure, rather than just delays.
The server-side infrastructure of DNS consists of hierarchically organized name servers, with central authorities providing ``root'' servers and other delegated organizations handling ``top-level'' servers, such as ``.com'' and ``.edu''. Domain name owners are responsible for providing servers that handle queries for their names. While DNS users can manually query each level of the hierarchy in turn until the complete name has been resolved, most systems delegate this task to local nameserver machines. This approach has performance advantages (e.g., caching replies, consolidating requests) as well as management benefits (e.g., fewer machines to update with new software or root server lists).
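The division of labor described above can be illustrated with a toy model. The following is a minimal sketch, not an actual resolver: the hierarchy is a set of in-memory tables, no network traffic is generated, and every server name, domain name, and address in it (as well as the `resolve_iteratively` function and `LocalNameserver` class) is a hypothetical chosen for illustration. It shows a client-side walk of the hierarchy and a local nameserver that caches the result so that repeated lookups cost no upstream queries.

```python
# Toy model of the DNS hierarchy. Each table stands in for one level of
# servers; all names and addresses are made up for illustration.
ROOT = {"com": "tld-com", "edu": "tld-edu"}
TLD = {
    "tld-com": {"example.com": "auth-example-com"},
    "tld-edu": {"example.edu": "auth-example-edu"},
}
AUTHORITATIVE = {
    "auth-example-com": {"www.example.com": "192.0.2.10"},
    "auth-example-edu": {"www.example.edu": "192.0.2.20"},
}


def resolve_iteratively(name):
    """Query each level in turn; return (address, number of queries)."""
    labels = name.split(".")
    tld_server = ROOT[labels[-1]]                          # query 1: root server
    auth_server = TLD[tld_server][".".join(labels[-2:])]   # query 2: top-level server
    address = AUTHORITATIVE[auth_server][name]             # query 3: domain owner's server
    return address, 3


class LocalNameserver:
    """Resolves names on behalf of clients and caches the replies.

    Simplification: cached entries never expire, whereas real DNS
    replies carry a time-to-live.
    """

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def lookup(self, name):
        if name not in self.cache:
            address, queries = resolve_iteratively(name)
            self.upstream_queries += queries
            self.cache[name] = address
        return self.cache[name]
```

In this sketch, the first lookup of a name costs three upstream queries and every later lookup of the same name is answered from the cache, which is the performance advantage the local nameserver provides to all clients that share it.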
With local nameserver cache hit rates approaching 90% [9,24], the performance impact of local nameservers can
eclipse that of the server-side DNS infrastructure. However, local
nameserver performance and reliability have not been well studied, and
since local nameservers handle all DNS lookups for clients, their failure can disable
other systems. Our experiences with building the CoDeeN content
distribution network, running on over 100 PlanetLab
nodes [23], motivated us to investigate this issue, since all
CoDeeN nodes use the local nameservers at their hosting sites.