T2 Solid-State Storage: Technology, Design, and Application UPDATED!
Richard Freitas and Larry Chiu, IBM Almaden Research Center
Most system designers dream of replacing slow, mechanical storage
(disk drives) with fast, non-volatile memory. The advent of inexpensive
solid-state disks (SSDs) based on flash memory technology and,
eventually, on storage class memory technology is bringing
this dream closer to reality.
This tutorial will briefly examine the leading solid-state memory
technologies and then focus on the impact the introduction of
such technologies will have on storage systems. It will include
a discussion of SSD design, storage system architecture, applications,
and performance assessment.
Richard Freitas is a Research Staff Member at the IBM Almaden Research Center.
Dr. Freitas received his PhD in EECS from the University of California
at Berkeley in 1976. He then joined IBM at the IBM T.J. Watson
Research Lab. He has held various management and research positions
in architecture and design for storage systems, servers, workstations,
and speech recognition hardware at the IBM Almaden Research Center
and the IBM T.J. Watson Research Center. His current interest lies
in exploring the use of emerging non-volatile solid-state memory
technology in storage systems for commercial and scientific computing.
Larry Chiu is Storage Research Manager and a Senior Technical Staff
Member at the IBM Almaden Research Center. He co-founded the SAN Volume Controller
product, a leading storage virtualization engine which has held
the fastest SPC-1 benchmark record for several years. In 2008,
he led a research team in the US and the UK that demonstrated a
one-million-IOPS storage system using solid-state disks. He is
currently working on expanding solid-state disk use cases in
enterprise systems and software. He has an MS in computer engineering from the
University of Southern California and another MS in
technology commercialization from the University of Texas at Austin.
T3 Storage and Network Deduplication Technologies NEW!
Michael Condict, NetApp
Economic and environmental concerns are currently motivating a push
across the computing industry to do more with less: less energy and
less money. Deduplication of data is one of the most effective tools
to accomplish this. Removing redundant copies of stored data reduces
hardware requirements, lowering capital expenses and using less power.
Avoiding sending the same data repeatedly across a network increases
the effective bandwidth of the link, reducing networking expenses.
This tutorial will provide a detailed look at the many ways
deduplication can be used to improve the efficiency of storage
and networking devices. It will consist of two parts.
The first part will introduce the basic concepts of deduplication and
compare it to the related technique of file compression. A taxonomy of
basic deduplication techniques will be covered, including the unit of
deduplication (file, block, or variable-length segment), the deduplication
scope (file system, storage system, or cluster), in-line vs. background
deduplication, trusted fingerprints, and several other design choices.
The relative merits of each will be analyzed.
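To make the taxonomy above concrete, here is a minimal sketch of one
point in the design space: fixed-size block deduplication with trusted
content-hash fingerprints. This example is not from the tutorial; the
class name, block size, and in-memory index are invented for
illustration (a real system would persist the index and handle
reference counting for deletes):

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size


class DedupStore:
    """Toy block-level deduplicating store keyed by content hash."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block data (stored once)
        self.files = {}   # filename -> ordered list of fingerprints

    def write(self, name, data):
        fps = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            # A "trusted fingerprint": the hash alone identifies the
            # block, with no byte-by-byte verification on a match.
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)  # store only if new
            fps.append(fp)
        self.files[name] = fps

    def read(self, name):
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def unique_bytes(self):
        return sum(len(b) for b in self.blocks.values())
```

Writing two files with identical contents stores each distinct block
once; only the per-file fingerprint lists grow.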
The second part will discuss advanced techniques, such as the use of
fingerprints other than a content hash to uniquely identify data,
techniques for deduplicating across a storage cluster, and the use of
deduplication within a client-side cache.
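The variable-length-segment approach mentioned above relies on
content-defined chunking: boundaries are chosen from the data itself,
so an edit shifts only nearby chunk boundaries rather than every later
one. The following is a simplified sketch of that idea, with all
parameter values invented for illustration; production systems use a
true sliding-window fingerprint such as a Rabin fingerprint rather
than this simple running hash:

```python
def chunk_boundaries(data, mask=0x3FF, min_len=64, max_len=4096):
    """Toy content-defined chunking: cut wherever a running hash of
    the bytes since the last cut matches a bit pattern, so boundaries
    depend on content rather than absolute offsets. min_len/max_len
    bound the chunk size."""
    cuts, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = (h * 31 + byte) & 0xFFFFFFFF  # simple polynomial hash
        length = i - start + 1
        if (length >= min_len and (h & mask) == 0) or length >= max_len:
            cuts.append(i + 1)   # cut after byte i
            start, h = i + 1, 0  # restart hash for the next chunk
    if start < len(data):
        cuts.append(len(data))   # final partial chunk
    return cuts
```

Each chunk would then be fingerprinted and deduplicated exactly as in
the block-based case, but with boundaries that survive insertions and
deletions in the data stream.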
Michael Condict received his BS in mathematics from Lehigh University
in 1976 and an MS in computer science from Cornell University in 1981.
He worked on the first Ada compiler while a research scientist at New
York University, investigated circuit-design languages at Bell Labs,
Murray Hill, and contributed to the Amoeba OS project under Andrew
Tanenbaum at the Free University, Amsterdam. Returning to industry,
he spent seven years at the Open Software Foundation Research Institute,
helping to design and build the Mach
micro-kernel-based version of OSF/1 and also OSF/AD, the version that
ran on several commercial massively parallel computing systems.
Following this, he joined several startups, including InfoLibria (Web
caching), Oryxa (component-based storage programming), and BladeLogic
(data-center automation), the last of which subsequently went public.
Currently he is a member of the Advanced Technologies Group at NetApp,
where his research interests include deduplication and the
innovative use of flash technology.
T4 Clustered and Parallel Storage System Technologies UPDATED!
Marc Unangst, Panasas
Cluster-based parallel storage technologies can now deliver
performance that scales from tens to hundreds of GB/s. This tutorial will examine
state-of-the-art high-performance file systems and the underlying
technologies employed to deliver scalable performance across a range of
scientific and industrial applications.
The tutorial has two main sections. In the first section, we will describe the
architecture of clustered, parallel storage systems, including the Parallel
NFS (pNFS) and Object Storage Device (OSD) standards. We will compare several
open-source and commercial parallel file systems, including Panasas, Lustre,
GPFS, and PVFS2. We will also discuss the impact of solid state disk technology
on large-scale storage systems. The second half of the tutorial will cover
performance, including what benchmarking tools are available, how to use
them to evaluate a storage system, and how to optimize application
I/O patterns to exploit the strengths and weaknesses of clustered,
parallel storage systems.
Marc Unangst is a Software Architect at Panasas, where he has been a
leading contributor to the design and implementation of the PanFS
distributed file system. He represents Panasas on the SPEC SFS
benchmark committee and he authored draft specification documents for
the POSIX High End Computing Extensions Working Group (HECEWG).
Previously, Marc was a staff programmer in the Parallel Data Lab at
Carnegie Mellon, where he worked on the Network-Attached Storage
Device (NASD) project. He holds a BS in electrical
and computer engineering from Carnegie Mellon.