NFS over RDMA 
Brent Callaghan, Sun Microsystems
Network bandwidth is growing by orders of 
magnitude. Yet conventional processing of NFS 
traffic over gigabit networks gobbles CPU. Using 
RDMA protocols, we expect NFS to make full and 
efficient use of gigabit networks.
 
The Case for 
Massive Arrays of Idle Disks (MAID) 
Dennis Colarelli, Dirk Grunwald, and Michael Neufeld, 
University of Colorado, Boulder 
The declining costs of commodity disk drives are rapidly changing the economics of deploying large
amounts of on-line storage. Conventional mass storage systems typically use high-performance RAID
clusters as a disk cache, often with a file system interface. The disk cache is backed by tape libraries,
which serve as the final repository for data. In mass storage systems where performance is an issue,
tape may serve only as a deep archive for disaster-recovery purposes; in this case all data is stored
on the disk farm. If a high-availability system is required, the data is often duplicated on a separate
system, with a fail-over mechanism controlling access. 
This work explores an alternative design using massive arrays of idle disks, or MAID. We argue
that this storage organization provides storage densities matching or exceeding those of tape libraries
with performance similar to disk arrays. Moreover, we show that through a combination of effective
power management of individual drives and the use of cache or migration, this performance can be
achieved using a very small power envelope. 
We examine the issues critical to the performance, energy consumption, and practicality of several
classes of MAID systems. The potential of MAID to save energy costs with a relatively small
performance penalty is demonstrated in a comparison with a conventional RAID 0 storage array. 
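The power-management idea described above can be sketched as follows. This is an illustrative model only; the timeout and spin-up values, and all names, are assumptions for the sketch, not parameters from the paper.

```python
# Hypothetical sketch of MAID-style power management: each drive spins
# down after an idle timeout, and accessing a sleeping drive pays a
# spin-up penalty. A small set of frequently hit drives stays spinning
# while the bulk of the array idles. Parameters are illustrative.

IDLE_TIMEOUT = 60        # seconds of inactivity before a drive spins down
SPINUP_PENALTY = 10.0    # seconds to spin a sleeping drive back up

class Drive:
    def __init__(self):
        self.spinning = True
        self.last_access = 0.0

    def access(self, now):
        """Return the access latency; spin the drive up first if needed."""
        latency = SPINUP_PENALTY if not self.spinning else 0.0
        self.spinning = True
        self.last_access = now
        return latency

    def tick(self, now):
        """Spin down after the idle timeout to save power."""
        if self.spinning and now - self.last_access > IDLE_TIMEOUT:
            self.spinning = False

def spinning_count(drives):
    """Number of drives currently drawing full power."""
    return sum(d.spinning for d in drives)
```

Under a skewed workload, only the hot drives keep spinning, which is the source of the small power envelope the abstract claims.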
 
Cooperative Backup System 
Sameh Elnikety, Rice University; Mark Lillibridge, Compaq SRC; Mike Burrows, Microsoft Research; and  Willy Zwaenepoel,  Rice University 
This paper presents the design of a novel backup system built on top of a peer-to-peer architecture with
minimal supporting infrastructure. The system can be deployed for both large-scale and small-scale peer-to-peer
overlay networks. It allows computers connected to the Internet to back up their data cooperatively. Each
computer has a set of partner computers and distributes its backup data among those partners; in return,
it stores backup data for its partners, in such a way as to achieve both fault-tolerance and high reliability.
This form of cooperation poses several interesting technical challenges because these computers have
independent failure modes, do not trust each other, and are subject to third-party attacks.
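One way to distribute backup data among partners so that it survives independent failures is sketched below. The replication scheme and all names are our illustration, not the authors' protocol.

```python
# Minimal sketch (an assumed scheme, not the paper's): each computer
# splits its backup into blocks and replicates every block on k distinct
# partners, so the data survives any single partner failure.
import hashlib

def place_blocks(blocks, partners, k=2):
    """Map each block to k distinct partners, chosen by hashing the
    block's content so placement is deterministic and balanced."""
    placement = {}
    for b in blocks:
        h = int(hashlib.sha256(b).hexdigest(), 16)
        placement[b] = [partners[(h + i) % len(partners)] for i in range(k)]
    return placement

def recoverable(placement, failed):
    """The backup survives as long as every block has a live replica."""
    return all(any(p not in failed for p in ps)
               for ps in placement.values())
```

With k = 2 the backup tolerates any one partner failure; real deployments would also need the encryption and verification the trust model demands.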
 
Federated File Systems for Clusters with Remote Memory 
Communication 
Suresh Gopalakrishnan, Ashok Arumugam, and Liviu Iftode, 
Rutgers University 
We present the design, prototype implementation and 
initial evaluation of FedFS - a novel cluster file system 
architecture that provides a global file space by aggregating the local file systems of the cluster nodes into 
a loose federation. The federated file system (FedFS) 
is created ad-hoc for a distributed application that 
runs on the cluster, and its lifetime is limited by the 
lifetime of the distributed application. FedFS provides location-independent global file naming, load 
balancing, and file migration and replication. It relies on the local file systems to perform the file I/O 
operations.  
The local file systems retain their autonomy, in the 
sense that their structure and content do not change 
to support the federated file system. Other applications may run on the local file systems without realizing that the same file system is part of one or more
FedFS federations. If the distributed application permits, nodes
can dynamically join or leave the federation anytime, 
with no modifications required to the local file system 
organization. 
FedFS is implemented as an I/O library over VIA, 
which supports remote memory operations. The 
applicability and performance of the federated file 
system architecture is evaluated by building a distributed NFS file server. 
An Iterative Technique for Distilling a Workload's Important Performance Information 
Zachary Kurmas, Georgia Tech; Kimberly Keeton, HP Labs
 
Larger Disk Blocks or Not? 
Steve McCarthy, Mike Leis, and Steve Byan, Maxtor Corporation 
The recent annual compound growth rate of disk drive areal density has been 100%, a doubling of capacity every year. This growth rate is faster than Moore's Law: advances in disk technology have been outpacing advances in semiconductor technology. Part of the reason for this spectacular growth rate is that areal density is a two-dimensional problem. Succeeding product generations increase both the number of tracks per inch (TPI) radially and the number of linear bits per inch (BPI) circumferentially. However, both parameters are facing technical challenges that may slow the rate of capacity growth. In this paper, we will briefly examine some of the obstacles to increased BPI and propose an increase in sector size as an aid to surmounting them. 
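The two-dimensional growth claim is worth working through: areal density is the product of TPI and BPI, so doubling capacity every year requires only about 41% annual growth in each dimension. The numbers below are this arithmetic, not measurements from the paper.

```python
# Areal density = TPI * BPI, so a 2x annual capacity gain splits into
# sqrt(2) ~ 1.414x growth per axis (illustrative arithmetic).
tpi_growth = 2 ** 0.5    # radial track density growth per year
bpi_growth = 2 ** 0.5    # linear bit density growth per year
areal_growth = tpi_growth * bpi_growth
print(round(areal_growth, 3))  # -> 2.0, a doubling every year
```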
Lazy Parity Update: A Technique to Improve Write I/O Performance of Disk Array Tolerating Double Disk Failures 
Young Jin Nam, Dae-Woong Kim, Tae-Young Choe, and Chanik Park, Pohang University of Science and Engineering, Kyungbuk, Republic of Korea 
The Armada Framework for Parallel I/O on Computational Grids 
Ron Oldfield and David Kotz, Dartmouth College 
IBM Storage Tank:
A Distributed Storage System 
D. A. Pease, R. M. Rees, W. C. Hineman, D. L. Plantenberg,
R. A. Becker-Szendy, R. Ananthanarayanan, M. Sivan-Zimet,
C. J. Sullivan, IBM Almaden Research Center; R. C. Burns, Johns
Hopkins University; D. D. E. Long, University of California, Santa
Cruz 
IBM Storage Tank is a SAN-based distributed object storage system  for use in heterogeneous 
environments. It provides performance comparable to that of file  systems built on bus-attached, 
high-performance storage, as well as advanced storage and data  management functions. It is 
designed to be highly available and scalable. The Storage Tank  project has been underway at 
IBM's Almaden Research Center for several years.  
Storage Tank is designed to work with any Storage Area Network  architecture, as well as with any 
SAN storage hardware. (It currently runs on both Fibre Channel and  iSCSI SANs.) It is also 
designed to be portable to essentially any host system architecture.  
This paper provides a high-level overview of Storage Tank's design  and features. 
Data Placement Based on the Seek Time Analysis of a MEMS-based Storage Device 
Zachary N. J. Peterson, Scott A. Brandt, Darrell D. E. Long, University of California, Santa Cruz 
Reducing access times to secondary I/O devices has 
long been the focus of many systems researchers. 
With traditional disk drives, access time is the composition 
of transfer time, seek time and rotational latency, 
so many techniques to minimize these factors,
such as ordering I/O requests or intelligently 
placing data, have been developed. MEMS-based 
storage devices are seen by many as a replacement 
or an augmentation for modern disk drives, but algorithms 
for reducing access time for MEMS-based 
storage are still poorly understood. These devices, 
based on MicroElectroMechanical systems (MEMS), 
use thousands of active read/write heads working in 
parallel on a non-rotating magnetic substrate, eliminating 
rotational latency from the access time equation. 
This leaves seek time as the dominant variable. 
Therefore, new data layout techniques based 
on minimizing the unique seek time characteristics 
of a MEMS-based storage device can be developed. 
This paper begins to examine the access qualities of 
a MEMS-based storage device, and based on experimental 
simulation, develops an understanding of the 
seek time characteristics of such a device. These 
characteristics then allow us to identify equivalent 
regions in which to place data for improved access. 
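The notion of equivalent regions can be illustrated with a toy seek model. With no rotational latency and the two axes moving in parallel, seek time is roughly governed by the larger of the two axis displacements, so every target at the same max-axis distance from the heads is equivalent for placement. This model and its constant are assumptions for illustration, not the paper's measured device characteristics.

```python
# Toy MEMS seek model (illustrative, not the simulated device): the x and
# y actuators move simultaneously, so seek time tracks the slower axis.
def seek_time(src, dst, t_per_track=0.01):
    """Seek time between two (x, y) positions on the media sled."""
    dx = abs(dst[0] - src[0])
    dy = abs(dst[1] - src[1])
    return max(dx, dy) * t_per_track   # axes move in parallel

def equivalence_region(src, dst):
    """All targets with the same max-axis distance from src form one
    equivalent region: any of them costs the same seek time."""
    return max(abs(dst[0] - src[0]), abs(dst[1] - src[1]))
```

Placing related data anywhere within one such region is then equally good, which is the freedom the layout techniques exploit.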
Logistical Networking Research and the Network Storage Stack  
James S. Plank, Micah Beck, and Terry Moore, 
University of Tennessee 
Enhancing NFS Cross-Administrative Domain Access 
Joseph Spadavecchia and Erez Zadok, Stony Brook University 
The access model of exporting NFS volumes to clients
suffers from two problems. First, the server depends on
the client to specify the user credentials to use and has
no flexible mechanism to map or restrict the credentials
given by the client. Second, there is no mechanism to
hide data from users who do not have privileges to access
it. Although NFSv4 promises to fix the first problem using
universal identifiers, it does not provide a mechanism
for hiding data and is not expected to be in wide use for
a long time. 
We address these problems by a combination of two
solutions. First, range-mapping is a mechanism that allows
the NFS server to restrict and flexibly map the credentials
set by the client. Second, file-cloaking allows the
server to control the data a client is able to view or access
beyond normal Unix semantics. Our design is compatible
with all versions of NFS, including NFSv4. We have
implemented this work in Linux and made changes only
to the NFS server code; client-side NFS and the NFS protocol
remain unchanged. Our evaluation shows a minimal
average performance overhead and, in some cases,
an end-to-end performance improvement. 
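Range-mapping as described can be sketched as a server-side translation of client-supplied credentials. The rule format (client uid range mapped onto a server base) is our illustration of the idea, not the paper's actual syntax.

```python
# Sketch of range-mapping: the NFS server remaps or restricts the uid
# the client supplies, per export, instead of trusting it verbatim.
# Rule format is illustrative, not the paper's.
NOBODY = 65534  # conventional "nobody" uid for squashed credentials

def map_uid(uid, rules):
    """rules: list of (lo, hi, server_base). A client uid in [lo, hi]
    maps to server_base + (uid - lo); anything unmatched is squashed
    to NOBODY, restricting what unmapped clients can touch."""
    for lo, hi, base in rules:
        if lo <= uid <= hi:
            return base + (uid - lo)
    return NOBODY
```

Because the translation happens entirely on the server, this is consistent with the paper's claim that client-side NFS and the protocol stay unchanged.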
StorageAgent: An Agent-based Approach for Dynamic Resource Sharing in a Storage Service Provider (SSP) Infrastructure 
Sandeep Uttamchandani, IBM Almaden Research Center 
In an SSP infrastructure, the resources of the storage server, namely cache, memory, and CPU, are shared in an ad-hoc
manner among the clients. These resources play an important role in determining the overall throughput and
latency of data access. In this paper, we propose StorageAgent: a systematic, secure, and efficient approach for
distributing resources. Built on agent-based semantics for dynamic resource sharing, StorageAgent achieves the
following goals. First, available resources are utilized efficiently, as there are well-defined semantics for
lending and reclaiming resources. Second, security of data is ensured, as access to borrowed resources is controlled
solely by trusted agents. Third, fine-grained control and metering of resources used by individual clients is possible.
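The lend-and-reclaim semantics can be sketched as a small metered ledger. The interface below is our illustration of those semantics, not StorageAgent's actual API.

```python
# Sketch of well-defined lend/reclaim semantics (illustrative API): an
# agent meters how much of a resource each client has borrowed, refuses
# to oversubscribe, and can reclaim a client's loan in full.
class ResourceAgent:
    def __init__(self, capacity):
        self.free = capacity
        self.loans = {}                 # client -> amount borrowed

    def lend(self, client, amount):
        """Grant a loan only if it fits in the remaining capacity."""
        if amount > self.free:
            return False                # refuse rather than oversubscribe
        self.free -= amount
        self.loans[client] = self.loans.get(client, 0) + amount
        return True

    def reclaim(self, client):
        """Take back everything the client borrowed."""
        self.free += self.loans.pop(client, 0)
```

The per-client ledger is what makes the fine-grained metering in the third goal possible.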
Conquest: Better Performance Through a Disk/Persistent-RAM Hybrid File System 
An-I A. Wang, Peter Reiher, and Gerald J. Popek, University of California, Los Angeles; Geoffrey H. Kuenning, Harvey Mudd College 
Conquest is a disk/persistent-RAM hybrid file system that 
is incrementally deployable and realizes most of the benefits 
of cheaply abundant persistent RAM. Conquest consists 
of two specialized and simplified data paths for in-core 
and on-disk storage and outperforms popular disk-based 
file systems by 43% to 97%. 
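The two specialized data paths can be sketched by routing writes on file size: small files take the simplified in-core path in persistent RAM, large files take the on-disk path. The threshold and the in-memory stand-ins are assumptions for the sketch, not Conquest's actual structures.

```python
# Sketch of the dual-path idea (illustrative, not Conquest's design):
# small files live entirely in persistent RAM; only large files pay
# the cost of the disk path.
LARGE_FILE_THRESHOLD = 1 << 20   # 1 MiB cutoff, an assumed value

class ConquestLikeFS:
    def __init__(self):
        self.in_core = {}   # small files: persistent-RAM path
        self.on_disk = {}   # large files: disk path (dict stands in)

    def write(self, name, data):
        """Route each file to the path suited to its size."""
        if len(data) < LARGE_FILE_THRESHOLD:
            self.in_core[name] = data
        else:
            self.on_disk[name] = data

    def read(self, name):
        return self.in_core.get(name) or self.on_disk.get(name)
```

Keeping the common case (small files and metadata) entirely in RAM is what lets each path stay simple, which is where the reported speedups come from.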