OSDI '06 Abstract
Pp. 117–130 of the Proceedings
Flight Data Recorder: Monitoring Persistent-State Interactions to
Improve Systems Management
Chad Verbowski, Emre Kıcıman, Arunvijay Kumar, and Brad Daniels,
Microsoft Research; Shan Lu, University of Illinois at Urbana-Champaign;
Juhan Lee, Microsoft MSN; Yi-Min Wang, Microsoft Research; Roussi
Roussev, Florida Institute of Technology
Abstract
Mismanagement of the persistent state of a system (all the executable files,
configuration settings and other data that govern how a system functions)
causes reliability problems and security vulnerabilities, and drives up
operation costs. Recent research traces
persistent state interactions—how state is read, modified, etc.—to help
troubleshooting, change management and malware mitigation, but has been limited
by the difficulty of collecting, storing, and analyzing the 10s to 100s of
millions of daily events that occur on a single machine, much less the 1000s or
more machines in many computing environments.
We present the Flight Data Recorder (FDR), which enables always-on tracing,
storage and analysis of
persistent state interactions. FDR uses a domain-specific log format, tailored
to observed file system workloads and common systems management queries. Our lossless
log format compresses logs to only 0.5–0.9 bytes per interaction. In this log
format, 1000 machine-days of logs—over 25 billion events—can be analyzed in less
than 30 minutes. We report on our deployment of FDR to 207 production machines
at MSN, and show that a single centralized collection machine can potentially scale
to collecting and analyzing the complete records of persistent state
interactions from 4000+ machines. Furthermore, our tracing technology is
shipping as part of the Windows Vista OS.
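
The figures quoted above can be cross-checked with a little arithmetic. The following is a back-of-the-envelope sketch only, not part of FDR: the constant and function names are hypothetical, "over 25 billion events" is treated as a lower bound of exactly 25 billion, and "less than 30 minutes" is treated as exactly 30 minutes.

```python
# Back-of-the-envelope check of the numbers quoted in the abstract.
# All names here are illustrative; only the numeric inputs come from the text.

EVENTS_TOTAL = 25_000_000_000    # "over 25 billion events" (used as a lower bound)
MACHINE_DAYS = 1000              # "1000 machine-days of logs"
BYTES_PER_EVENT = (0.5, 0.9)     # "0.5-0.9 bytes per interaction"
ANALYSIS_SECONDS = 30 * 60       # "less than 30 minutes" (used as an upper bound)


def events_per_machine_day() -> float:
    """Average number of traced events one machine produces per day."""
    return EVENTS_TOTAL / MACHINE_DAYS


def log_bytes_per_machine_day() -> tuple[float, float]:
    """Compressed log size per machine per day, as a (low, high) range in bytes."""
    per_day = events_per_machine_day()
    low, high = BYTES_PER_EVENT
    return (low * per_day, high * per_day)


def min_scan_rate() -> float:
    """Events per second the analysis must sustain to finish within the bound."""
    return EVENTS_TOTAL / ANALYSIS_SECONDS


if __name__ == "__main__":
    low, high = log_bytes_per_machine_day()
    print(f"events per machine-day: {events_per_machine_day():,.0f}")
    print(f"log size per machine-day: {low / 1e6:.1f} to {high / 1e6:.1f} MB")
    print(f"required scan rate: {min_scan_rate():,.0f} events/s")
```

Under these assumptions, each machine averages about 25 million events per day, which sits inside the "10s to 100s of millions" range stated earlier, and its compressed log comes to roughly 12.5 to 22.5 MB per day. Scanning all 1000 machine-days within 30 minutes implies a sustained rate of about 14 million events per second.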
The Proceedings are published as a collective work, © 2006 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.