In this paper, we proposed a new approach called informated
replication for designing distributed systems to survive Internet
epidemics that cause catastrophic damage. Informed replication uses a
model of correlated failures to exploit
software diversity, providing high reliability with low replication
overhead. Using host diversity characteristics derived from a
measurement study of hosts on the UCSD campus, we developed and
evaluated heuristics for determining the number and placement of
replicas that have a number of attractive features. Our heuristics
provide excellent reliability guarantees (over probability that user data survives attacks of single- and double-exploit pathogens), result in low degree of replication (less than 3 copies for single-exploit pathogens; less than 5 copies for double-exploit pathogens), limit the storage burden on each host in the system, and lend themselves to a fully distributed implementation. We then used this approach in the design and implementation of a cooperative backup system called the Phoenix Recovery Service. Based upon our evaluation results, we conclude that our approach is a viable and attractive method for surviving Internet catastrophes.