We have built a prototype of CoDNS and have been running it on all PlanetLab nodes for roughly eight months. During that time, we have been directing the CoDeeN CDN [23] to use CoDNS for its name lookups.
CoDNS consists of a stand-alone daemon running on each node, accessible via UDP for remote queries and via loopback TCP for locally-originated name lookups. The daemon is event-driven and is implemented as a non-blocking master process and many (blocking) slave processes. The master process receives name lookup requests from local clients and remote peers and passes each one to an idle slave. A slave process resolves the name by calling gethostbyname() and sends the result back to the master. The master then returns the final result to either a local client or a remote peer, depending on where the request originated. Queries for the same hostname are coalesced into one query and answered together when it resolves. When assigning idle slaves, locally-originated requests are given preference over remote queries to ensure good performance for local users.
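The master's bookkeeping described above (coalescing duplicate lookups and preferring local requests when a slave becomes idle) can be illustrated with a minimal sketch. This is not the CoDNS code: the class name `Master`, its methods, and the two-queue layout are hypothetical, and the slaves are modeled only as a counter rather than as real processes calling gethostbyname().

```python
from collections import deque


class Master:
    """Toy model of the master's bookkeeping (hypothetical, not CoDNS itself)."""

    def __init__(self, num_slaves):
        self.idle_slaves = num_slaves
        self.waiters = {}            # hostname -> requesters awaiting the answer
        self.local_queue = deque()   # hostnames from local clients awaiting a slave
        self.remote_queue = deque()  # hostnames from remote peers awaiting a slave
        self.dispatched = []         # order in which lookups reached a slave

    def submit(self, hostname, requester, is_local):
        # Coalesce: a lookup for this hostname is already pending.
        if hostname in self.waiters:
            self.waiters[hostname].append(requester)
            return
        self.waiters[hostname] = [requester]
        queue = self.local_queue if is_local else self.remote_queue
        queue.append(hostname)
        self._dispatch()

    def _dispatch(self):
        # Hand queued lookups to idle slaves, draining local requests first.
        while self.idle_slaves > 0 and (self.local_queue or self.remote_queue):
            queue = self.local_queue or self.remote_queue
            self.idle_slaves -= 1
            self.dispatched.append(queue.popleft())

    def complete(self, hostname, address):
        # A slave finished: answer every coalesced requester, then reuse the slave.
        requesters = self.waiters.pop(hostname, [])
        self.idle_slaves += 1
        self._dispatch()
        return [(requester, address) for requester in requesters]


master = Master(num_slaves=1)
master.submit("a.example", "client1", is_local=True)   # takes the only slave
master.submit("b.example", "peer1", is_local=False)    # queued (remote)
master.submit("c.example", "client2", is_local=True)   # queued (local)
master.submit("a.example", "client3", is_local=True)   # coalesced with client1
answers = master.complete("a.example", "192.0.2.1")
```

In this sketch, the freed slave is given `c.example` before `b.example` even though the remote query arrived earlier, which mirrors the preference for local requests; both requesters for `a.example` receive the same answer from a single lookup.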
The master process records the arrival time of each request from local clients and sends a UDP name lookup query to a peer node if the response from the slave has not returned within a certain period. This delay serves as the threshold for deciding whether the local nameserver is slow. If neither the local nameserver nor the remote node has responded, CoDNS doubles the delay value before sending the next remote query to another peer. Whichever result arrives first is delivered to the client as the response to the name lookup. Peers may silently drop remote queries if they are overloaded, and remote queries that fail to resolve are also discarded. Slaves may add delay if they receive a locally-generated request that fails to resolve, in the hope that remote nodes may be able to resolve such names.
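The timing of the remote queries can be made concrete with a small sketch. It assumes that each doubled delay is measured from the previous remote query (the text leaves this open), and the function names, the 200 ms initial delay, and the latency values are hypothetical choices for illustration only. A latency of `None` stands for a nameserver or peer that never answers, for example a peer that drops the query.

```python
def remote_query_schedule(initial_delay_ms, max_remote_queries):
    """Times (ms after request arrival) at which remote queries would be sent
    if no answer has arrived yet. Assumes each wait starts at the previous query."""
    times = []
    elapsed = 0
    delay = initial_delay_ms
    for _ in range(max_remote_queries):
        elapsed += delay
        times.append(elapsed)
        delay *= 2  # double the delay before trying the next peer
    return times


def first_response(local_latency_ms, peer_latencies_ms, initial_delay_ms):
    """Return (source, time_ms, remote_queries_sent) for the earliest answer,
    or None if neither the local nameserver nor any peer ever answers."""
    schedule = remote_query_schedule(initial_delay_ms, len(peer_latencies_ms))
    candidates = []
    if local_latency_ms is not None:
        candidates.append((local_latency_ms, "local"))
    for index, (sent_at, latency) in enumerate(zip(schedule, peer_latencies_ms)):
        if latency is not None:
            candidates.append((sent_at + latency, f"peer{index}"))
    if not candidates:
        return None
    time_ms, source = min(candidates)
    # Only queries scheduled before the winning answer were actually sent.
    sent = sum(1 for sent_at in schedule if sent_at < time_ms)
    return source, time_ms, sent


schedule = remote_query_schedule(200, 3)
fast_local = first_response(100, [10], 200)
slow_local = first_response(1000, [80], 200)
dropped_first_peer = first_response(None, [None, 50], 200)
```

Under these assumptions, a local answer that arrives before the initial delay expires triggers no remote query at all, while a slow local nameserver is overtaken by the first peer that responds.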