
Isolation with Flexibility:
A Resource Management Framework for Central Servers

David G. Sullivan, Margo I. Seltzer
Division of Engineering and Applied Sciences
Harvard University, Cambridge, MA 02138

{sullivan,margo}@eecs.harvard.edu



1. Introduction

In managing computational resources, an operating system must balance a variety of goals, including maximizing resource utilization, minimizing latency, and providing fairness. The relative importance of these goals for a particular system depends on the nature of the system and the ways in which it is used. For supercomputers running compute-intensive applications, the primary goal may be to maximize throughput, while for personal computers used to enhance a single user's productivity, the chief goal may be to maximize responsiveness.

In today's computing environments, users increasingly compete for the resources of server systems, whether to access central databases or to view content on virtually hosted Web sites. On such systems, fairness becomes a critical resource-management goal. Proportional-share mechanisms allow this goal to be met by providing resource principals (users, applications, threads, etc.) with guaranteed resource rights. For example, customers who pay Internet service providers to virtually host their Web sites can be given rights to shares of the hosting machine that are commensurate with the prices they pay. Service providers who can make such guarantees can offer larger resource shares to principals willing to pay a premium for better quality of service.

Although its full promise is yet to be realized, thin-client computing is another domain in which proportional-share resource management is desirable. Administrators of such systems are often forced to host one application per server to provide predictable levels of service [Sun98]. Proportional-share techniques enable the consolidation of multiple applications onto a single server by giving each application a dedicated share of the machine.

A system that supports proportional-share resource management must isolate resource principals from each other, so that a given principal's resource rights are protected from the activities of other principals. To provide such isolation, a system must necessarily impose limits on the flexibility with which resource allocations can be modified. Such limits work well when the resource needs of applications are well-known and unchanging, because a system administrator can assign the appropriate resource shares and leave the system to run. Unfortunately, these conditions frequently do not hold. Even if the applications' current resource needs are adequately understood, they will typically change over time. For example, as a Web site's working set of frequently accessed documents expands, the site may require an increasing share of the server's disk bandwidth in order to offer reasonable responsiveness. Moreover, it would be preferable if system administrators could be freed from the need to make detailed characterizations of applications' resource needs. Ideally, the applications themselves should be able to modify their own resource rights in response to their needs and the current state of the system.

In this paper, we present extensions to the lottery-scheduling resource management framework [Wal94, Wal95, Wal96] that allow resource principals to safely overcome the limits on flexible allocation that proportional-share frameworks impose for the sake of secure isolation. Our extended framework supports both absolute resource reservations (hard shares) and proportional allocations that change in size as resource principals enter and leave the competition for a resource (soft shares). It also introduces a system of access controls to protect the isolation properties that lottery scheduling provides. Finally, it offers mechanisms that let applications modify their own resource rights without compromising the rights of other resource principals. One of these mechanisms, called ticket exchanges, allows applications to coordinate their use of the system's resources by bartering over resource rights with each other. Our extended framework thereby provides isolation with increased flexibility: the flexibility to safely overcome the limits on resource allocation that standard proportional-share frameworks enforce.
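To make the notion of a soft share concrete, the following is a minimal sketch of ticket-based proportional allocation. It is an illustration only, not the VINO prototype's code: the function names `soft_shares` and `hold_lottery`, the principal names, and the ticket counts are all hypothetical. It shows how a principal's soft share shrinks when another principal joins the competition, which is exactly the behavior a hard share would rule out.

```python
import random

def soft_shares(tickets, active):
    """Return each active principal's fraction of a resource.

    Hypothetical helper: a soft share is a principal's tickets divided by
    the tickets held by all principals currently competing.
    """
    total = sum(tickets[p] for p in active)
    return {p: tickets[p] / total for p in active}

def hold_lottery(tickets, active, rng):
    """Pick one active principal with probability proportional to its tickets."""
    total = sum(tickets[p] for p in active)
    draw = rng.randrange(total)      # winning ticket number in [0, total)
    for p in active:
        draw -= tickets[p]
        if draw < 0:
            return p

# Assumed ticket holdings, chosen only for illustration.
tickets = {"a": 100, "b": 300, "c": 400}

print(soft_shares(tickets, ["a", "b"]))        # a holds 0.25, b holds 0.75
print(soft_shares(tickets, ["a", "b", "c"]))   # a's share drops once c competes
print(hold_lottery(tickets, ["a", "b"], random.Random(0)))
```

Over many lotteries, each principal wins in proportion to its tickets, so the shares computed by `soft_shares` are the expected long-run allocations.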

We have developed a prototype implementation of our framework in the VINO operating system [Sel96] and have used it, in conjunction with several proportional-share mechanisms, to manage CPU time, physical memory, and disk bandwidth. Our experiments demonstrate that the extended lottery-scheduling framework enables server applications to achieve improved performance under realistic usage scenarios.

This work makes several contributions. First, we extend the lottery-scheduling framework to securely manage multiple resources, providing both soft and hard resource shares. To our knowledge, our prototype is the first implementation of a proportional-share framework to support both types of shares for multiple resources. Second, we point out an important tension between the conflicting goals of secure isolation and flexible resource allocation, and we present mechanisms that allow for more flexible allocation while preserving secure isolation. Third, we illustrate the value of a system that can support dynamic adjustments to the resource allocations that applications receive.

In the next section, we review the original lottery-scheduling framework and describe how we extend it to securely support proportional sharing of multiple resources. In Section 3, we illustrate how lottery scheduling (like all proportional-share schemes) imposes both upper and lower limits on the resource allocations that clients can obtain, and we describe the mechanisms that we use to overcome both sets of limits while maintaining secure isolation. In Section 4, we describe our prototype implementation of the extended framework, including the scheduling mechanisms that we have chosen to employ. Section 5 presents experiments designed to evaluate the prototype and to test one of our mechanisms for flexibly adjusting resource rights. Finally, we discuss related work and summarize our conclusions.
