Next: Background & Analysis Up: CoDNS: Improving DNS Performance Previous: CoDNS: Improving DNS Performance

Introduction


The Domain Name System (DNS) [15] has become a ubiquitous part of everyday computing due to its effectiveness, human-friendliness, and scalability. It provides a distributed lookup service primarily used to convert human-readable machine names to Internet Protocol (IP) addresses. Its use has permeated much of computing through the World Wide Web's near-complete dependence on it. Thanks in part to its redundant design, aggressive caching, and flexibility, most people, including researchers, now take it for granted.

Most DNS research focuses on "server-side" problems, which arise on the systems that translate names belonging to the group that runs them. Such problems include understanding name hierarchy misconfiguration [5,9] and devising more scalable distribution infrastructure [4,10,18]. However, due to increasing memory sizes and DNS's high cacheability, "client-side" DNS hit rates are approaching 90% [9,24], so fewer requests depend on server-side performance. The client-side components are responsible for contacting the appropriate servers, if necessary, to resolve any name presented by the user. This infrastructure, which has received less attention, is our focus: we seek to understand client-side behavior in order to improve overall DNS performance and reliability.

Using PlanetLab [16], a wide-area distributed testbed, we locally monitor the client-side DNS infrastructure of 150 sites around the world, generating a large-scale examination of client-side DNS performance. We find that client-side failures are widespread and frequent, and that their effects degrade DNS performance and reliability. The most common problems we observe are intermittent failures to receive any response from the local nameservers, but these are generally hidden by the internal redundancy in DNS deployments. However, the cost of such redundancy is additional delay, and we find that the delays induced through such failures often dominate the time spent waiting on DNS lookups.

To address these client-side problems, we have developed CoDNS, a lightweight, cooperative DNS lookup service that can be independently and incrementally deployed to augment existing nameservers. CoDNS uses an insurance-like model of operation: groups of mutually trusting nodes agree to resolve each other's queries when their local infrastructure is failing. We find that the group size does not need to be large to provide substantial benefits: groups of size 2 provide roughly half the maximum possible benefit, and groups of size 10 achieve almost all of the possible benefit. Using locality-enhancement techniques and proximity optimizations, CoDNS achieves low-latency, low-overhead name resolution, even in the presence of local DNS delays and failures.
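The insurance-like model can be illustrated with a minimal Python sketch. It is not the CoDNS implementation: the function name `cooperative_lookup`, the `initial_delay` and `timeout` parameters, and the representation of resolvers as plain callables are all assumptions made for illustration. The sketch tries the local resolver first and, if no answer arrives within a short delay (or the local lookup fails), forwards the query to the peers in the group and takes the first successful answer.

```python
import concurrent.futures as cf
import time


def cooperative_lookup(name, local_resolve, peer_resolvers,
                       initial_delay=0.2, timeout=2.0):
    """Resolve `name` locally, falling back to peers if the local path is slow.

    All names and parameters here are hypothetical. Resolvers are callables
    that take a name and either return an address or raise an exception.
    Returns (address, source), where source is "local" or "peer<i>".
    """
    pool = cf.ThreadPoolExecutor(max_workers=1 + len(peer_resolvers))
    try:
        # Start the local lookup and give it a short head start.
        pending = {pool.submit(local_resolve, name): "local"}
        done, _ = cf.wait(pending, timeout=initial_delay)
        for fut in done:
            if fut.exception() is None:
                return fut.result(), "local"

        # Local lookup is slow or has failed: ask every peer in the group.
        for i, peer in enumerate(peer_resolvers):
            pending[pool.submit(peer, name)] = f"peer{i}"

        # Take the first successful answer from any outstanding lookup,
        # including the local one if it eventually completes.
        remaining = set(pending) - done
        deadline = time.monotonic() + timeout
        while remaining:
            left = deadline - time.monotonic()
            if left <= 0:
                break
            finished, remaining = cf.wait(
                remaining, timeout=left, return_when=cf.FIRST_COMPLETED)
            for fut in finished:
                if fut.exception() is None:
                    return fut.result(), pending[fut]
        raise LookupError(f"no resolver answered for {name!r}")
    finally:
        # Do not block on lookups that are still outstanding.
        pool.shutdown(wait=False)
```

In this sketch the local resolver keeps its head start, so peers see extra traffic only when the local infrastructure is slow or failing. The choice of delay and of which peers to contact is left open here.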

CoDNS has been serving live traffic on PlanetLab since October 2003, providing many benefits over standard DNS. CoDNS reduces average lookup latency by 27-82%, greatly reduces slow lookups, and improves DNS availability by an extra "9", from 99% to over 99.9%. Its service is more reliable and consistent than any individual node's. Additionally, CoDNS has salvaged "unusable" nodes, which had such poor local DNS infrastructure that they were unfit for normal use. Applications using CoDNS often have faster and more predictable start times, improving availability.



KyoungSoo Park 2004-10-02