4th USENIX Conference on File and Storage Technologies
Pp. 281-294 of the Proceedings
TAPER: Tiered Approach for Eliminating Redundancy in Replica
Synchronization
Navendu Jain and Mike Dahlin, University of Texas at Austin; Renu Tewari, IBM Almaden Research Center
Abstract
We present TAPER, a scalable data replication protocol
that synchronizes a large collection of data across multiple
geographically distributed replica locations. TAPER
can be applied to a broad range of systems, such as software
distribution mirrors, content distribution networks,
backup and recovery, and federated file systems. TAPER
is designed to be bandwidth efficient, scalable and
content-based, and it does not require prior knowledge
of the replica state. To achieve these properties, TAPER
provides: i) four pluggable redundancy elimination
phases that balance the trade-off between bandwidth savings
and computation overheads, ii) a hierarchical hash-tree-based
directory pruning phase that quickly matches
identical data from the granularity of directory trees to
individual files, iii) a content-based similarity detection
technique using Bloom filters to identify similar files,
and iv) a combination of coarse-grained chunk matching
with finer-grained block matches to achieve bandwidth
efficiency. Through extensive experiments on various
datasets, we observe that in comparison with rsync, a
widely-used directory synchronization tool, TAPER reduces
bandwidth by 15% to 71%, performs faster matching,
and scales to a larger number of replicas.
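
The directory pruning phase in (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory tree layout (a dict for a directory, bytes for a file), the use of SHA-1, and the names node_hash and diff_paths are all assumptions made for illustration. Both replicas are held locally here, whereas a real protocol would exchange the hashes over the network and cache them rather than recompute them at each level.

```python
import hashlib

def node_hash(node):
    """Digest of a file (bytes) or a directory (dict of name -> node)."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"F" + node).hexdigest()
    h = hashlib.sha1(b"D")
    # A directory's hash covers its sorted child names and child hashes,
    # so any change below it changes the hash at this level too.
    for name in sorted(node):
        h.update(name.encode() + b"\0" + node_hash(node[name]).encode())
    return h.hexdigest()

def diff_paths(src, dst, prefix=""):
    """Return paths of files in src that dst lacks or holds differently."""
    if dst is not None and node_hash(src) == node_hash(dst):
        return []  # identical subtree: prune it without descending
    if isinstance(src, bytes):
        return [prefix]  # a single file that is missing or has changed
    dst_children = dst if isinstance(dst, dict) else {}
    out = []
    for name in sorted(src):
        path = f"{prefix}/{name}" if prefix else name
        out += diff_paths(src[name], dst_children.get(name), path)
    return out

# Hypothetical replicas: "a" is identical, "b/z.txt" differs, "c.txt" is new.
source = {"a": {"x.txt": b"1", "y.txt": b"2"}, "b": {"z.txt": b"3"}, "c.txt": b"4"}
target = {"a": {"x.txt": b"1", "y.txt": b"2"}, "b": {"z.txt": b"old"}}
print(diff_paths(source, target))  # ['b/z.txt', 'c.txt']
```

Only the files that survive this pruning would need to go on to the finer-grained matching phases described in (iii) and (iv).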
The Proceedings are published as a collective work, © 2005 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.