T1 Clustered and Parallel Storage System Technologies UPDATED!
Brent Welch and Marc Unangst, Panasas
Cluster-based parallel storage technologies can now deliver aggregate
performance that scales from tens to hundreds of GB/sec. This tutorial will examine
current state-of-the-art high-performance file systems and the underlying
technologies employed to deliver scalable performance across a range of
scientific and industrial applications.
The tutorial has two main sections. The first section describes the
architecture of clustered, parallel storage systems and then compares
several open-source and commercial systems based on this framework,
including Panasas, Lustre, GPFS, and PVFS2. In addition, we describe the
Object Storage Device (OSD) and Parallel NFS (pNFS) standards. The second
section focuses on performance: which benchmarking tools are available,
how to use them to evaluate a storage system correctly, and how to tune
application I/O patterns to exploit the strengths, and work around the
weaknesses, of clustered, parallel storage systems.
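As a small taste of that last point, the sketch below contrasts many small writes with fewer large writes of the same total size. It is a minimal illustration only; the file location, transfer sizes, and total amount are assumptions chosen for the example, not settings drawn from the tutorial.

    # Minimal sketch: the same 64 MiB written two ways. Parallel file
    # systems generally favor large transfers over many small ones.
    # Path and sizes are illustrative assumptions only.
    import os
    import time

    PATH = os.environ.get("DEMO_PATH", "io_pattern_demo.dat")  # point at a parallel FS mount in practice
    TOTAL = 64 * 1024 * 1024  # 64 MiB, an arbitrary amount for the demo

    def timed_write(chunk):
        # Write TOTAL bytes in chunk-sized requests and return elapsed seconds.
        start = time.time()
        with open(PATH, "wb") as f:
            for _ in range(TOTAL // chunk):
                f.write(b"x" * chunk)
            f.flush()
            os.fsync(f.fileno())
        return time.time() - start

    if __name__ == "__main__":
        # Many small (4 KiB) writes: one request per 4 KiB, high per-request overhead.
        small = timed_write(4 * 1024)
        # Fewer large (1 MiB) writes: far fewer requests for the same data.
        large = timed_write(1024 * 1024)
        print("4 KiB writes: %.2f s" % small)
        print("1 MiB writes: %.2f s" % large)
        os.remove(PATH)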
Brent Welch is Director of Software Architecture at Panasas. Panasas
has developed a scalable, high-performance, object-based distributed
file system that is used in a variety of HPC environments, including
many of the Top500 supercomputers. He previously worked at
Xerox PARC and Sun Microsystems Laboratories. Brent has experience
building software systems from the device driver level up through
network servers, user applications, and graphical user interfaces.
While getting his PhD at UC Berkeley, he designed and built the
Sprite distributed file system. Brent participates in the IETF NFSv4
working group and is co-author of the pNFS Internet drafts that
specify parallel I/O extensions for NFSv4.1.
Marc Unangst is a Software Architect at Panasas, where he has been a
leading contributor to the design and implementation of the PanFS
distributed file system. He represents Panasas on the SPEC SFS
benchmark committee and authored draft specification documents for
the POSIX High End Computing Extensions Working Group (HECEWG).
Previously, Marc was a staff programmer in the Parallel Data Lab at
Carnegie Mellon, where he worked on the Network-Attached Storage
Device (NASD) project. He holds a Bachelor of Science in Electrical
& Computer Engineering from Carnegie Mellon.
T2 Security and Usability: What Do We Know? NEW!
Simson Garfinkel, Naval Postgraduate School
For years we've heard that security and usability are antagonistic:
secure systems aren't usable and usable systems aren't secure. New
research in the field of HCI-SEC reveals this myth for what it is. In
this tutorial we will review the past few years of research in
security and usability and see how to create systems that are both
usable and secure. We'll also discuss how to evaluate the usability of
a system in the lab and in the field, and how to obtain the necessary
legal approvals.
Simson L. Garfinkel is an Associate Professor at the Naval Postgraduate School in Monterey, CA, and a fellow at the Center for Research on Computation and Society at Harvard University. He is also the founder of Sandstorm Enterprises, a computer security firm that develops advanced computer forensic tools used by businesses and governments to audit their systems. Garfinkel has research interests in computer forensics, the emerging field of usability and security, information policy, and terrorism. He has actively researched and published in these areas for more than two decades. He writes a monthly column for CSO Magazine, for which he has been awarded four national journalism awards, and is the author or co-author of fourteen books on computing. He is perhaps best known for Database Nation: The Death of Privacy in the 21st Century and for Practical UNIX and Internet Security.
T3 Storage Class Memory, Technology, and Uses UPDATED!
Richard Freitas, Winfried Wilcke, Bülent Kurdi, and Geoffrey Burr, IBM Almaden Research Center
The dream of replacing the disk drive with solid-state, non-volatile
random access memory is finally becoming a reality. There are several
technologies under active research and development, such as advanced forms
of FLASH, Phase Change Memory, and Magnetic RAM. They are
collectively called Storage Class Memory (SCM). The advent of this
technology is likely to have a significant impact on the design of both
future storage and memory systems.
This tutorial will give a detailed overview of the SCM device
technologies being developed and how they will affect the design of
storage controllers and storage systems. The device overview will
emphasize technology paths to very high bit densities, which will enable
low-cost storage devices that ultimately become cost-competitive with
enterprise disks. The system discussion will include examples of very high
I/O rate systems built with solid-state storage devices.
But there is more to SCM than just its use in storage systems. SCM, by
definition, is fast enough to be used as (non-volatile) main memory,
complementing DRAM; we will lightly touch on how this will affect the
overall system architecture and software.
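To give a rough sense of where SCM sits in that hierarchy, the back-of-envelope sketch below compares access latencies for DRAM, SCM, flash, and disk. The numbers are round illustrative assumptions, not figures from the tutorial.

    # Back-of-envelope sketch of the memory/storage hierarchy.
    # All latencies are assumed, round-number values for illustration;
    # real devices vary widely.
    LATENCY_NS = {
        "DRAM":          100,        # ~100 ns
        "SCM (assumed)": 1_000,      # ~1 microsecond
        "Flash SSD":     100_000,    # ~100 microseconds
        "Disk":          5_000_000,  # ~5 milliseconds
    }

    for name, ns in LATENCY_NS.items():
        # Upper bound on random accesses per second from a single
        # outstanding request stream: 1e9 ns per second / latency.
        iops = 1e9 / ns
        print(f"{name:14s} ~{ns:>9,} ns -> ~{iops:,.0f} accesses/sec")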
In conclusion, we believe that SCM will have a major impact on the overall
memory/storage stack of future systems and will, eventually, affect
software as well.
Dr. Freitas is an IBM Research Staff Member at the IBM Almaden Research Center. He received his Ph.D. degree in EECS from the University of California at Berkeley in 1976. He then joined the IBM RISC computing group at the IBM Thomas J. Watson Research Center, where he worked on the IBM 801 project. He has held various management and research positions in architecture and design for storage systems, servers, workstations, and speech recognition hardware at the IBM Almaden Research Center and the IBM T.J. Watson Research Center. His current interests include exploring the use of emerging nonvolatile solid-state memory technology in storage systems for commercial and scientific computing.
Dr. Wilcke is Program Director at the IBM Almaden Research Center. He received a Ph.D. degree in nuclear physics in 1976 from the Johann Wolfgang Goethe Universität, Frankfurt, Germany, and worked at the University of Rochester, Lawrence Berkeley Laboratory, and Los Alamos on heavy-ion and muon-induced reactions. In 1983, he joined the IBM T.J. Watson Research Center in New York, where he managed Victor and Vulcan, the first two MIMD message-passing supercomputer projects of IBM Research, which were the precursors of the very successful IBM SP supercomputers. In 1991 he joined HaL Computer Systems, initially as Director of Architecture and later as CTO. Working with Sun Microsystems, his team created the 64-bit SPARC architecture. Later, he rejoined IBM Research in San Jose, California, where he launched the IBM IceCube project, which became the first funded spinoff venture of IBM Research. Recently, Dr. Wilcke became engaged in research on storage-class memories and future systems based on such memories. In addition to his industrial work, he has published more than 100 papers, has coauthored numerous patents, and is active in aviation.
Dr. Kurdi completed his Ph.D. studies at the Institute of Optics at the University of Rochester, where he investigated silicon-based integrated optics. He holds B.S. degrees in electrical engineering and mathematics with minors in physics and philosophy from the University of Dayton. In 1989 he joined the IBM Almaden Research Center, where he has worked on integrated optical devices for magneto-optical data storage, top surface imaging techniques for the fabrication of advanced magnetic write heads, and planarization processes for magnetic head slider fabrication. He is currently the manager of the nanoscale device integration group and has been coordinating several multifaceted efforts in the area of ultra-high-density NVM devices.
Geoffrey W. Burr received his B.S. in Electrical Engineering (EE)
and B.A. in Greek Classics from the State University of New York
at Buffalo. He received his M.S. and
Ph.D. in Electrical Engineering from the California Institute of
Technology. Since that time, Dr. Burr has worked
at the IBM Almaden Research Center, where
he is currently a Research Staff Member. After many years as an
experimentalist in volume holographic data storage and optical
information processing, he now focuses on nanophotonics, computational
lithography, numerical modeling for design optimization, phase change
memory, and other non-volatile memory. He is a Topical Editor for
Optics Letters.
T4 Web-Scale Data Management NEW!
Christopher Olston and Benjamin Reed, Yahoo! Research
A new breed of software systems is being developed to manage and process
Web-scale data sets on large clusters of commodity computers. A typical
software stack includes a distributed file system (e.g., GFS, HDFS), a
scalable data-parallel workflow system (e.g., MapReduce, Dryad), and a
declarative scripting language (e.g., Pig Latin, Hive). These technologies
are driven primarily by the needs of large Internet companies like Google,
Microsoft, and Yahoo!, but are also finding applications in the sciences,
journalism, and other domains.
In this tutorial we survey Web-scale data management technologies, with
special focus on open-source instances. We give concrete code examples
modeled after real-world use cases at companies like Yahoo!. These
technologies have not yet reached maturity; at the end of the tutorial,
we discuss some "in-the-works" and "wish-list" features in this space.
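For readers unfamiliar with the data-parallel style this stack supports, here is a minimal word count written as a Hadoop Streaming mapper and reducer in Python. It is a generic illustration, not one of the Yahoo!-derived examples used in the tutorial; the jar name and paths in the comments are assumptions.

    #!/usr/bin/env python
    # Minimal Hadoop Streaming word count: a generic illustration of the
    # map/reduce style. Example invocation (assumed jar name and paths):
    #   hadoop jar hadoop-streaming.jar \
    #     -input /data/in -output /data/out \
    #     -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
    #     -file wordcount.py
    import sys

    def mapper():
        # Emit one (word, 1) pair per token read from stdin.
        for line in sys.stdin:
            for word in line.split():
                print(f"{word}\t1")

    def reducer():
        # Input arrives grouped and sorted by key; sum counts per word.
        current, count = None, 0
        for line in sys.stdin:
            word, n = line.rstrip("\n").split("\t")
            if word != current:
                if current is not None:
                    print(f"{current}\t{count}")
                current, count = word, 0
            count += int(n)
        if current is not None:
            print(f"{current}\t{count}")

    if __name__ == "__main__":
        mapper() if sys.argv[1:] == ["map"] else reducer()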
Christopher Olston is a senior research scientist at Yahoo! Research,
working in the areas of data management and Web search. Olston is
occasionally seen behaving as a professor, having taught undergrad and grad
courses at Berkeley, Carnegie Mellon, and Stanford. He received his Ph.D. in
2003 from Stanford under fellowships from the university and the National
Science Foundation.
Benjamin Reed works on distributed computing platforms at Yahoo!
Research, where he is a research scientist. His projects include Pig and
ZooKeeper, both of which are sub-projects of Apache Hadoop. In the past he
has contributed to the Linux kernel and was made an OSGi Fellow for
his work on the OSGi Java framework. He received his PhD in 2000 from
the University of California, Santa Cruz.