HotOS IX — Paper

HotOS IX Paper [HotOS IX Program Index]

Crash-Only Software

George Candea and Armando Fox

Appears in Proceedings of the 9th Workshop on Hot Topics in Operating Systems (HotOS-IX), Lihue, Hawaii, May 2003

Abstract

Crash-only programs crash safely and recover quickly. There is only one way to stop such software -- by crashing it -- and only one way to bring it up -- by initiating recovery. Crash-only systems are built from crash-only components, and the use of transparent component-level retries hides intra-system component crashes from end users. In this paper we advocate a crash-only design for Internet systems, showing that it can lead to more reliable, predictable code and faster, more effective recovery. We present ideas on how to build such crash-only Internet services, taking successful techniques to their logical extreme.

1. Occam's Razor and the Restart Potpourri

There are many reasons to restart software, and many ways to do it. Studies have shown that a main source of downtime in large scale software systems is caused by intermittent or transient bugs [12,20,19,1]. Most non-embedded systems have a variety of ways to stop; for example, an operating system can shut down cleanly, panic, hang, crash, lose power, etc.

When shutting down programs cleanly, unavailability consists of the time to shut down and the time to come back up; when crash-rebooting, unavailability consists only of the time to recover. Ironically, shutting down and reinitializing can sometimes take longer than recovering from a crash. Table 1 illustrates a casual comparison of reboot times; no important data was lost in either of the experiments.

System	Clean reboot	Crash reboot
RedHat 8 (with ext3fs)	104 sec	75 sec
JBoss 3.0 application server	47 sec	39 sec
Windows XP	61 sec	48 sec

Table 1. Duration of clean vs. crash reboots.

It is impractical to build a system that is guaranteed to never crash, even in the case of carrier class phone switches or high end mainframe systems. Since crashes are unavoidable, software must be at least as well prepared for a crash as it is for a clean shutdown. But then -- in the spirit of Occam's Razor -- if software is crash-safe, why support additional, non-crash mechanisms for shutting down? A frequent reason is the desire for higher performance.

For example, to avoid slow synchronous disk writes, many UNIX file systems cache metadata updates in memory. As a result, when a UNIX workstation crashes, the file system reaches an inconsistent state that takes a lengthy fsck to repair, an inconvenience that could have been avoided by shutting down cleanly. This captures the design tradeoff that improves steady state performance at the expense of shutdown and recovery performance. In the face of inevitable crashes, such a file system turns out to be brittle: a crash can lose data and, in some cases, the post-crash inconsistency cannot even be repaired. Not only do such performance tradeoffs impact robustness, but they also lead to complexity by introducing multiple ways to manipulate state, more code, and more APIs. The code becomes harder to maintain and offers the potential for more bugs---a fine tradeoff, if the goal is to build fast systems, but a bad idea if the goal is to build highly available systems. If the cost of such performance enhancements is dependability, perhaps it's time to reevaluate our design strategy.

In earlier work, we used recursive micro-reboots to improve the availability of a soft-state system that was trivially crash-safe [3]. In this paper we advocate a crash-only design (i.e., crash safety + fast recovery) for Internet systems, a class distinguished by the following properties: large scale, stringent high availability requirements, built from many heterogenous components, accessed over standard request-reply protocols such as HTTP, serving workloads that consist of large numbers of relatively short tasks that frame state updates, and subjected to rapid and perpetual evolution. We restrict our attention to single installations that reside inside one data center and do not span administrative domains.

In high level terms, a crash-only system is defined by the equations stop=crash and start=recover. In the rest of the paper, we describe the benefits of the crash-only design approach by analogy to physics, describe the internal properties of components in a crash-only system, the architectural properties governing the interaction of components, and a restart/retry architecture that exploits crash-only design, including our work to date on a prototype using J2EE.

2. Why Crash-Only Design ?

Mature engineering disciplines rely on macroscopic descriptive physical laws to build and understand the behavior of physical systems. These sets of laws, such as Newtonian mechanics, capture in simple form an observed physical invariant. Software, however, is an abstraction with no physical embodiment, so it obeys no physical laws. Computer scientists have tried to use prescriptive rules, such as formal models and invariant proofs, to reason about software. These rules, however, are often formulated relative to an abstract model of the software that does not completely describe the behavior of the running system (which includes hardware, an operating system, runtime libraries, etc.). As a result, the prescriptive models do not provide a complete description of how the implementation behaves in practice, because many physically possible states of the complete system do not correspond to any state in the abstract model.

With the crash-only property, we are trying to impose, from outside the software system, macroscopic behavior that coerces the system into a simpler, more predictable universe with fewer states and simpler invariants. Each crash-only component has a single idempotent ``power-off switch'' and a single idempotent ``power-on switch''; the switches for larger systems are built by wiring together their subsystems' switches in ways described by section 3. A component's power-off switch implementation is entirely external to the component, thus not invoking any of the component's code and not relying on correct internal behavior of the component. Examples of such switches include kill -9 sent to a UNIX process, or turning off the physical, or virtual, machine that is running some software inside it.

Keeping the power-off switch mechanism external to components makes it a high confidence ``component crasher.'' Consequently, every component in the system must be prepared to suddenly be deactivated. Power-off and power-on switches provide a very small repertoire of high-confidence, simple behaviors, leading to a small state space. Of course, the ``virtual shutdown'' of a virtual machine, even if invoked with kill -9, has a much larger state space than the physical power switch on the workstation, but it is still vastly simpler than the state space of a typical program hosted in the VM, and it does not vary for different hosted programs. Indeed, the fact that virtual machines are relatively small and simple compared to the programs they host has been successfully invoked as an argument for using VMs for inter-application isolation [26].

Recovery code deals with exceptional situations, and must run flawlessly. Unfortunately, exceptional situations are difficult to handle, occur seldom, and are not trivial to simulate during development; this often leads to unreliable recovery code. In crash-only systems, however, recovery code is exercised every time the system starts up, which should ultimately improve its reliability. This is particularly relevant given that the rate at which we reduce the number of bugs per Klines of code lags behind the rate at which the number of Klines per system increases, with the net result being that the number of bugs in an evolving system increases with time [7]. More bugs mean more failures, and systems that fail more often will need to recover more often.

Many of the benefits resulting from a crash-only design have been previously obtained in the data storage/retrieval world with the introduction of transactions. Our approach aims for a similar effect on the failure properties of Internet systems---crash-only design is in many ways a generalization of the transaction model. It is important to note that Internet applications do not have to use transactions in order to be crash-only; in fact, ACID semantics can sometimes lead to overkill. For example, session data accumulates information at the server over a series of user service requests, for use in subsequent operations. It is mostly single-reader/single-writer, thus not requiring ordering and concurrency control. The richness of a query language like SQL is unnecessary, and session state usually does not persist beyond a few minutes. These observations are leveraged by SSM [18], a crash-only hashtable-like session state store.

A crash-only system makes it affordable to transform every detected failure into component-level crashes; this leads to a simple fault model, and components only need to know how to recover from one type of failure. For example, [21] forced all unknown faults into node crashes, allowing the authors to improve the availability of a clusterized web server. Existing literature often assumes unrealistic fault models (e.g., that failures occur according to well-behaved tractable distributions); a crash-only design enables aggressive enforcement of such desirable fault models, thus increasing the impact of prior work. If we state invariants about the system's failure behavior and make such behavior predictable, we are effectively coercing reality into a small universe governed by well-understood laws.

Moreover, a system in which crash-recovery is cheap allows us to micro-reboot suspect components before they fail. By aggressively employing software rejuvenation [16] -- rebooting in order to stave off failure induced by resource exhaustion -- we can prevent failures altogether. The system can immediately trigger component-level rejuvenation whenever it notices fail-stutter behavior [2], a trough in offered workload, or based on predictive mathematical models of software aging [10].

Finally, if we admit that most failures can be recovered by micro-rebooting, crashing every suspicious component could shorten the fault detection and diagnosis time---a period that sometimes lasts longer than repair itself.

3. Properties of Crash-Only Software

In this section we describe a set of properties that we deem sufficient for a system to be crash-only. In some systems, some of these properties may not be necessary for crash-only behavior.

To make components crash-only, we require that all important non-volatile state be kept in dedicated state stores, that state stores provide applications with the right abstractions, and that state stores be crash-only. To make a system of interconnected components crash-only, it must be designed so that components can tolerate the crashes and temporary unavailability of their peers. This means we require strong modularity with relatively impermeable component boundaries, timeout-based communication and lease-based resource allocation, and self-describing requests that carry a time-to-live and information on whether they are idempotent. Many Internet systems today have some subset of these properties, but we do not know of any that combines all properties into a true crash-only system.

In section 4 we will show how crash-only components can be glued together into a robust Internet system based on a restart/retry architecture; in the rest of this section we describe in more detail the properties of crash-only systems. Some relate to intra-component state management, while others relate to inter-component interactions. We recognize that some of these sacrifice performance, but we strongly believe the time has come for robustness to reclaim its status as a first-class citizen.

3.1 Intra-Component Properties

In today's Internet applications there are a small number of types of state: transactional persistent state, single-reader/single-writer persistent state (e.g., user profiles, that almost never see concurrent access), expendable persistent state (server-side information that could be sacrificed for the sake of correctness or performance, such as clickstream data and access logs), session state (e.g., the result set of a previous search, subject to refinement), soft state (state that can be reconstructed at any time based on other data sources), and volatile state. While differentiated mostly by guaranteed lifetime, the requirements for these categories of state lead to qualitatively different implementations.

All important non-volatile state is managed by dedicated state stores, leaving applications with just program logic. Specialized state stores (e.g., relational and object-oriented databases, file system appliances, distributed data structures [14], non-transactional hashtables [15], session state stores [18], etc.) are much better suited to manage state than code written by developers with minimal training in systems programming. Applications become stateless clients of the state stores, which allows them to have simpler and faster recovery routines. A popular example of such separation can be found in three-tier Internet architectures, where the middle tier is largely stateless and relies on backend databases to store data.

These state stores must also be crash-only, otherwise the problem has just moved down one level. Many commercial off-the-shelf state stores available today are crash-safe (i.e., they can be crashed without loss of data), such as databases and the various network-attached storage devices, but most are not crash-only, because they recover slowly. Many products, however, offer tuning knobs that permit the administrator to trade performance for improved recovery time, such as making checkpoints more often in the Oracle DBMS [17]. An example of a pure crash-only state store is the Postgres database system [25], which uses non-overwriting storage and maintains all data in a single append-only log. Although it trades away some performance, Postgres achieves practically instantaneous recovery, because it only needs to mark the uncommitted transactions as aborted.

The abstractions and guarantees provided by state stores must be congruent with application requirements. This means that the state abstraction exported by the state store is not too powerful (e.g., offering a SQL interface with ACID semantics for storing and retrieving simple key-value tuples) and not too weak. A state abstraction that is too weak will require client components to do some amount of state management, such as implementing a customer record abstraction over an offered file system interface. Good state abstractions allow applications to operate at their ``natural'' semantic level. Offering the weakest state guarantees that satisfy the application allows us to exploit application semantics and build simpler, faster, more reliable state stores.

For example, Berkeley DB [22] is a storage system supporting B+tree, hash, and record abstractions. It can be accessed through four different interfaces, ranging from no concurrency control/no transactions/no disaster recovery to a multi-user, transactional API with logging, fine-grained locking, and support for data replication. Applications can use the abstraction that is right for their purposes and the underlying state store optimizes its operation to fit those requirements. Workload characteristics can also be leveraged by state stores; e.g., expecting a read-mostly workload allows a state store to utilize write-through caching, which can significantly improve recovery time and performance.

We do not advocate that every application have its own set of state stores. Instead, we believe Internet systems will standardize on a small number of state store types: ACID stores (e.g., databases for customer and transaction data), non-transactional persistent stores (e.g., DeStor [15], a crash-only system specialized in handling non-transactional persistent data, like user profiles), session state stores (e.g., SSM [18] for shopping carts), simple read-only stores (e.g., file system appliances for static HTML and images), and soft state stores (e.g., web caches). If we think carefully about the state abstractions required by each application component and use suitable state stores, we can make these components crash-only.

3.2 Inter-Component Properties

Subsystems that crash on their own, or that are explicitly crash-rebooted for recovery, will temporarily become unavailable to serve requests. For a crash-only system to gracefully tolerate such behavior, we need to decouple components from each other, from the resources they use, and from the requests they process.

Components have externally enforced boundaries that provide strong fault containment. The desired isolation can be achieved with virtual machines, isolation kernels [26], task-based intra-JVM isolation [24,8], OS processes, etc. Indeed, the Denali isolation kernel is designed for such encapsulation. Web hosting service providers often use multiple virtual machines on one physical machine to offer their clients individual web servers they can administer at will, without affecting other customers. The boundaries between components delineate distinct, individually recoverable stages in the processing of requests.

All interactions between components have a timeout. This includes explicit communication as well as RPC: if no response is received to a call within the allotted timeframe, the caller assumes the callee has failed and reports it to a recovery agent [5], which can crash-restart the callee if appropriate. Crash-restarting helps ensure the called component is in a known state; this is safe because the component is crash-safe and crash-restart is idempotent. Timeouts provide an orthogonal mechanism for turning all non-Byzantine failures, both at the component level and at the network level, into fail-stop events (i.e., the failed entity either provides results or is stopped), even though the components are not necessarily fail-stop. Such behavior is easier to accomodate, and containment of faults is improved.

Crash-only components recover quickly, thus recovery is very cheap. Under such circumstances, it becomes acceptable for the recovery manager to crash-recover suspect components even when it lacks the certainty that those components have indeed failed; the downtime risk of letting the components run may be higher than crash-rebooting healthy components.

All resources are leased, rather than permanently allocated, to ensure that resources are not coupled to the components using them. Resources include many types of persistent state, such as account profiles for a free e-mail provider: every time the user logs in, a 6-month lease is renewed; when the lease expires, all associated data can be purged from the system. It also includes CPU resources: if a computation is unable to renew its execution lease, it is terminated by a high confidence watchdog [9]. For example, in PHP, a server-side scripting language used for writing dynamic web pages, runaway scripts are killed and an error is returned to the web browser. Leases [11] give us the ability to reason about the conditions that hold true of the system's resources after a lease expires. Infinite timeouts or leases are not acceptable; the maximum-allowed timeout and lease are specified in an application-global policy. This way it is less likely that the system will hang or become blocked.

Requests are entirely self-describing, by making the state and context needed for their processing explicit. This allows a fresh instance of a rebooted component to pick up a request and continue from where the previous instance left off. Requests also carry information on whether they are idempotent, along with a time-to-live; both idempotency and TTL information can initially be set at the system boundary, such as in the web tier. For example, the TTL may be determined by load or service level agreements, and idempotency flags can be based on application-specific information (which can be derived, for instance, from URL substrings that determine the type of request). Many interesting operations in an Internet service are idempotent, or can easily be made idempotent by keeping track of sequence numbers or by wrapping requests in transactions; some large Internet services have already found it practical to do so [23]. Over the course of its lifetime, a request will split into multiple sub-operations, which may rejoin, in much the same way nested transactions do. Recovering from a failed idempotent sub-operation entails simply reissuing it; for non-idempotent operations, the system can either roll them back, apply compensating operations, or tolerate the inconsistency resulting from a retry. Such transparent recovery of the request stream can hide intra-system component failures from the end user.

4. A Restart/Retry Architecture

A component infers failure of a peer component either based on a raised exception or a timeout. When a component is reported failed, a recovery agent may crash-reboot it; the idempotency of crash-shutdown makes this an inexpensive way to ensure the component is indeed turned off before attempting recovery. Components waiting for an answer from the restarted component receive a RetryAfter(n) exception, indicating that the in-flight requests can be re-submitted after $n$ msec (the estimated time to recover); this exception is purely an optimization because, in its absence, components would have timeouts as a fallback mechanism. If the request is idempotent and its time-to-live allows it to be resubmitted, then the requesting component does so. Otherwise, a failure exception is propagated up the request chain until either a previous component decides to resubmit, or the client needs to be notified of the failure. The web front-end issues an HTTP/1.1 Retry-After directive to the client with an estimate of the time to recover, and retry-capable clients can resubmit the original HTTP request.

Figure 1. A simple restart/retry architecture.

In Figure 1 we show a simple restart/retry example, in which a request to view a shopping cart splits inside the application server into one subrequest to a stateful session EJB (Enterprise JavaBean) that communicates with a session state store and another subrequest to a stateless session EJB. Should the state store become unavailable, the application server either receives a RetryAfter exception or times out, at which time it can decide whether to resubmit the request to SSM or not. Within each of the subsystems shown in Figure 1, we can imagine each subrequest further splitting into finer grain subrequests submitted to the respective subsystems' components. We have implemented crash-restarting of EJBs in a J2EE application server; an EJB-level micro-reboot takes less than a second [5].

Timeout-based failure detection is supplemented with traditional heartbeats and progress counters. The counters---compact representations of a component's processing progress---are usually placed at state stores and in messaging facilities, where they can map state access and messaging activity into per-component progress. Many existing performance monitors can be transformed into progress monitors by augmenting them with request origin information. Components themselves can also implement progress counters that more accurately reflect application semantics, but they are less trustworthy, because they are inside the components.

The dynamics of loosely coupled systems can sometimes be surprising. For example, resubmitting requests to a component that is recovering can overload it and make it fail again; for this reason, the RetryAfter exceptions provide an estimated time-to-recover. This estimated value can be used to spread out request resubmissions, by varying the reported time-to-recover estimate across different requestors. A maximum limit on the number of retries is specified in the application-global policy, along with the lease durations and communication timeouts. These numbers can be dynamically estimated based on historical information collected by a recovery manager [5], or simply captured in a static description of each component, similar to deployment descriptors for EJBs. In the absence of such hints, a simple load balancing algorithm or exponential backoff can be used.

To prevent reboot cycles and other unstable conditions during recovery, it is possible to quiesce the system when a set of components is being crash-rebooted. This can be done at the communication/RPC layer, or for the system as a whole. In our prototype, we use a stall proxy [5] in front of the web tier to keep new requests from entering the system during the recovery process. Since Internet workloads are typically made of short running requests, the stall proxy transforms brief system unavailability into temporarily higher latency for clients. We are exploring modifications to the Java RMI layer that would allow finer grain request stalling.

5. Discussion

Building crash-only systems is not easy; the key to widespread adoption of our approach will require employing the right architectural models and having the right tools. With the recent success of component-based architectures (e.g., J2EE and .Net), and the emergence of the application server as an operating system for Internet applications, it is possible to provide many of the crash-only properties in the platform itself. This would allow all applications running on that platform to take advantage of the effort and become crash-only.

We are applying the principles described here to an open-source Java 2 Enterprise Edition (J2EE) application server. We are separating the individual J2EE services (naming, directory lookup, messaging, etc.) into well-isolated components, implementing requests as self-describing continuations, modifying the RMI layer to allow for timeout-based operation, modifying the EJB containers to implement lease-based resource allocation, and integrating non-transactional state stores like DeStor and SSM. A first step in this direction is described in [5].

We are focusing initially on applications whose workloads can be characterized as relatively short-running tasks that frame state updates. Substantially all Internet services fit this description, in part because the nature of HTTP has forced designers into this mold. As enterprise services and applications (e.g., workflow, customer management) become web-enabled, they adopt similar architectures. We expect there are many applications outside this domain that could not easily be cast this way, and for which deriving a crash-only design would be impractical or infeasible.

In order for the restart/retry architecture to be highly available and correct, most requests it serves must be idempotent. This requirement might be inappropriate for some applications. Our proposal does not handle Byzantine failures or data errors, but such behavior can be turned into fail-stop behavior using well-known orthogonal mechanisms, such as triple modular redundancy [13] or clever state replication [6].

In today's Internet systems, fast recovery is obtained by overprovisioning and counting on rapid failure detection to trigger failover. Such failover can sometimes successfully mask hours-long recovery times, but often detecting failures end-to-end takes longer than expected. Crash-only software is complementary to this approach and can help alleviate some of the complex and expensive management requirements for highly redundant hardware, because faster recovering software means less redundancy is required. In addition, a crash-only system can reintegrate recovered components faster, as well as better accommodate removed, added, or upgraded components.

We expect throughput to suffer in crash-only systems, but this concern is secondary to the high availability and predictability we expect in exchange. The first program written in a high-level language was certainly slower than its hand-coded assembly counterpart, yet it set the stage for software of a scale, functionality and robustness that had previously been unthinkable. These benefits drove compiler writers to significantly optimize the performance of programs written in high-level languages, making it hard to imagine today how we could program otherwise. We expect the benefits of crash-only software to similarly drive efforts that will erase, over time, the potential performance loss of such designs.

6. Conclusion

By using a crash-only approach to building software, we expect to obtain better reliability and higher availability in Internet systems. Application fault models can be simplified through the application of externally-enforced ``crash-only laws,'' thus encouraging simpler recovery routines which have higher chances of being correct. Writing crash-only components may be harder, but their simple failure behavior can make the assembly of such components into large systems easier.

The promise of a simple fault model makes stating invariants on failure behavior possible. A system whose component-level and system-level invariants can be enforced through crash-rebooting is more predictable, making recovery management more robust. It is our belief that applications and services with high availability requirements can significantly benefit from these properties.

Once we surround a crash-only system with a suitable recovery infrastructure, we obtain a recursively restartable system [4]. Transparent recovery based on component-level micro-reboots enables restart/retry architectures to hide intra-system failure from the end users, thus improving the perceived reliability of the service. We find it encouraging that our initial prototype [5] was able to complete 78% more client requests under faultload than a non-crash-only version of the system that did not employ micro-reboots for recovery.

References

[1]	T. Adams, R. Igou, R. Silliman, A. M. Neela, and E. Rocco. Sustainable infrastructures: How IT services can address the realities of unplanned downtime. Research Brief 97843a, Gartner Research, May 2001. Strategy, Trends & Tactics Series.
[2]	R. H. Arpaci-Dusseau and A. C. Arpaci-Dusseau. Fail-stutter fault tolerance. In Proc. 8th Workshop on Hot Topics in Operating Systems, Elmau/Oberbayern, Germany, 2001.
[3]	G. Candea, J. Cutler, and A. Fox. Improving availability with recursive micro-reboots: A soft-state system case study. Performance Evaluation Journal, Summer 2003. To appear.
[4]	G. Candea and A. Fox. Recursive restartability: Turning the reboot sledgehammer into a scalpel. In Proc. 8th Workshop on Hot Topics in Operating Systems, Elmau/Oberbayern, Germany, 2001.
[5]	G. Candea, P. Keyani, E. Kiciman, S. Zhang, and A. Fox. JAGR: An autonomous self-recovering application server. In Proc. 5th International Workshop on Active Middleware Services, Seattle, WA, June 2003.
[6]	M. Castro and B. Liskov. Practical Byzantine fault tolerance. In Proc. 3rd USENIX Symposium on Operating Systems Design and Implementation, New Orleans, LA, 1999.
[7]	A. Chou, J.-F. Yang, B. Chelf, S. Hallem, and D. Engler. An empirical study of operating systems errors. In Proc. 18th ACM Symposium on Operating Systems Principles, Lake Louise, Canada, 2001.
[8]	G. Czajkowski and L. Dayn'es. Multitasking without compromise: A virtual machine evolution. In Proc. Conference on Object Oriented Programming Systems Languages and Applications, Tampa Bay, FL, 2001.
[9]	C. Fetzer. Perfect failure detection in timed asynchronous systems. IEEE Transactions on Computers, 52(2):99-112, Feb. 2003.
[10]	S. Garg, A. V. Moorsel, K. Vaidyanathan, and K. S. Trivedi. A methodology for detection and estimation of software aging. In Proc. 9th International Symposium on Software Reliability Engineering, Paderborn, Germany, 1998.
[11]	C. G. Gray and D. R. Cheriton. Leases: An efficient fault-tolerant mechanism for distributed file cache consistency. In Proc. 12th ACM Symposium on Operating Systems Principles, Litchfield Park, AZ, 1989.
[12]	J. Gray. Why do computers stop and what can be done about it? In Proc. 5th Symposium on Reliability in Distributed Software and Database Systems, Los Angeles, CA, 1986.
[13]	J. Gray and A. Reuter. Transaction processing: concepts and techniques. Morgan Kaufmann, San Francisco, CA, 1993.
[14]	S. D. Gribble, E. A. Brewer, J. M. Hellerstein, and D. Culler. Scalable, distributed data structures for Internet service construction. In Proc. 4th USENIX Symposium on Operating Systems Design and Implementation, San Diego, CA, Oct. 2000.
[15]	A. C. Huang and A. Fox. Decoupled storage: State with stateless-like properties. Submitted to the 22nd Symposium on Reliable Distributed Systems, 2003.
[16]	Y. Huang, C. M. R. Kintala, N. Kolettis, and N. D. Fulton. Software rejuvenation: Analysis, module and applications. In Proc. 25th International Symposium on Fault-Tolerant Computing, Pasadena, CA, 1995.
[17]	T. Lahiri, A. Ganesh, R. Weiss, and A. Joshi. Fast-Start: Quick fault recovery in Oracle. In Proc. ACM International Conference on Management of Data, Santa Barbara, CA, 2001.
[18]	B. Ling and A. Fox. A self-tuning, self-protecting, self-healing session state management layer. In Proc. 5th International Workshop on Active Middleware Services, Seattle, WA, 2003.
[19]	B. Murphy and N. Davies. System reliability and availability drivers of Tru64 UNIX. In Proc. 29th International Symposium on Fault-Tolerant Computing, Madison, WI, 1999. Tutorial.
[20]	B. Murphy and T. Gent. Measuring system and software reliability using an automated data collection process. Quality and Reliability Engineering International, 11:341-353, 1995.
[21]	K. Nagaraja, R. Bianchini, R. P. Martin, and T. D. Nguyen. Using fault model enforcement to improve availability. In Proc. 2nd Workshop on Evaluating and Architecting System Dependability, San Jose, CA, 2002.
[22]	M. Olson, K. Bostic, and M. Seltzer. Berkeley DB. In Proceedings of the 1999 Summer USENIX Technical Conference, Monterey, CA, June 1999.
[23]	A. Pal. Personal communication. Yahoo!, Inc., 2002.
[24]	P. Soper, P. Donald, D. Lea, and M. Sabin. Application isolation API specification. Java Specification Request No. 121, https://jcp.org/en/jsr/detail?id=121, 2002.
[25]	M. Stonebraker. The design of the Postgres storage system. In Proc. 13th Conference on Very Large Databases, Brighton, England, 1987.
[26]	A. Whitaker, M. Shaw, and S. Gribble. Scale and performance in the Denali isolation kernel. In Proc. 5th USENIX Symposium on Operating Systems Design and Implementation, Boston, MA, 2002.

This paper was originally published in the Proceedings of HotOS IX: The 9th Workshop on Hot Topics in Operating Systems, May 18–21, 2003, Lihue, Hawaii, USA
Last changed: 26 Aug. 2003 aw

HotOS IX Program Index

HotOS IX Home

USENIX home