Check out the new USENIX Web site. next up previous
Next: Acknowledgements Up: Surviving Internet Catastrophes Previous: Recovering from a catastrophe


Conclusions

In this paper, we proposed a new approach called informated replication for designing distributed systems to survive Internet epidemics that cause catastrophic damage. Informed replication uses a model of correlated failures to exploit software diversity, providing high reliability with low replication overhead. Using host diversity characteristics derived from a measurement study of hosts on the UCSD campus, we developed and evaluated heuristics for determining the number and placement of replicas that have a number of attractive features. Our heuristics provide excellent reliability guarantees (over $0.99$ probability that user data survives attacks of single- and double-exploit pathogens), result in low degree of replication (less than 3 copies for single-exploit pathogens; less than 5 copies for double-exploit pathogens), limit the storage burden on each host in the system, and lend themselves to a fully distributed implementation. We then used this approach in the design and implementation of a cooperative backup system called the Phoenix Recovery Service. Based upon our evaluation results, we conclude that our approach is a viable and attractive method for surviving Internet catastrophes.



next up previous
Next: Acknowledgements Up: Surviving Internet Catastrophes Previous: Recovering from a catastrophe
Flavio Junqueira 2005-02-17