This paper has described a general, user-level strategy for building sandbox environments that has been tested on both Windows NT and Linux. It is interesting to observe that modern OSes provide sufficient support to permit implementation of quantitative constraints at the user level. Shared library support enables interception of system APIs; the monitoring infrastructure makes it possible to acquire almost all of the necessary information; the priority-based scheduling, debugger processes, and signal handling mechanisms allow adjustment of an application's CPU usage; the memory protection and memory-mapped file mechanisms permit control of the amount of physical memory available to an application. Finally, the socket interface gives direct control over network activities. Most resources in an operating system can be monitored and controlled using some combination of these techniques.
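As an illustration of the signal-based CPU control mentioned above, the following minimal sketch duty-cycles a target process with SIGSTOP/SIGCONT from a separate controller process. It is only a sketch of the general technique, not the paper's controller: the 10 ms control period, the command-line interface, and the fixed share are assumptions made here for brevity.

/* Minimal sketch: constrain a target process's CPU share from user level by
 * duty-cycling it with SIGSTOP/SIGCONT.  The 10 ms period and command-line
 * interface are assumptions of this sketch, not the paper's implementation. */
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static void throttle(pid_t target, double share)       /* share in (0, 1] */
{
    const useconds_t period_us = 10000;                 /* 10 ms control period */
    for (;;) {
        kill(target, SIGCONT);                          /* let the target run... */
        usleep((useconds_t)(share * period_us));
        kill(target, SIGSTOP);                          /* ...then suspend it    */
        usleep((useconds_t)((1.0 - share) * period_us));
    }
}

int main(int argc, char **argv)
{
    if (argc != 3)
        return 1;
    /* e.g. ./throttle 1234 0.4 limits process 1234 to roughly a 40% CPU share */
    throttle((pid_t)atoi(argv[1]), atof(argv[2]));
    return 0;
}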
In fact, user-level approaches provide more flexibility in deciding the granularity, the policies, and the monitoring and controlling mechanisms available for enforcing sandbox constraints. We demonstrate this extensibility by customizing our process-level sandbox implementation on Windows NT to limit resource usage at the level of thread and socket groups, instead of the default process granularity assumed in Section 3. The required modifications were simple, involving only changes to the progress expressions used in the monitoring code and some specialization of the controlling code.
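A minimal sketch of the kind of grouping abstraction this customization relies on is given below: resource usage is charged to a group rather than to the whole process, and the interposed APIs delay a caller whose group has exceeded its limit. The names (rc_group, rc_join_group, rc_charge) are hypothetical and are not the paper's API.

/* Sketch of per-group accounting: usage is charged to a group rather than to
 * the process.  Interposed system calls and the monitor would call rc_charge()
 * and delay the caller when its group is over limit.  Illustrative only. */
#include <pthread.h>
#include <stddef.h>

typedef struct rc_group {
    double          limit;   /* allowed usage per interval (CPU time, bytes, ...) */
    double          usage;   /* usage accumulated in the current interval         */
    pthread_mutex_t lock;
} rc_group;

static __thread rc_group *my_group;   /* group joined by the calling thread (GCC/Clang thread-local) */

void rc_group_init(rc_group *g, double limit)
{
    g->limit = limit;
    g->usage = 0.0;
    pthread_mutex_init(&g->lock, NULL);
}

void rc_join_group(rc_group *g) { my_group = g; }

/* Charge `amount` to the caller's group; returns nonzero if the group is now
 * over its limit, in which case the caller should be delayed or suspended. */
int rc_charge(double amount)
{
    int over = 0;
    if (my_group == NULL)
        return 0;                      /* thread not assigned to any group */
    pthread_mutex_lock(&my_group->lock);
    my_group->usage += amount;
    over = my_group->usage > my_group->limit;
    pthread_mutex_unlock(&my_group->lock);
    return over;
}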
Controlling CPU usage of thread groups
Figure 7 shows a snapshot of the system CPU usage (as measured by the NT Performance Monitor) for an application with two groups of threads, each of which is constrained to a total CPU share of 40%. The application itself consists of three threads which start at different times. Initially, the first thread starts as a member of thread group one and takes up a 40% CPU share. The second thread starts after ten seconds and joins thread group two. It also gets 40% of the CPU share, the total capacity of this thread group. After another ten seconds, the third thread joins thread group two. The allocation for the second thread group adjusts: the third thread gets a 30% CPU share and the second thread receives a 10% CPU share, keeping the total CPU share of this thread group at 40%. Note that the CPU share of the first thread group is not affected, and that we are able to control CPU usage of thread groups as accurately as at the process level. Currently, the resource allocation to threads in the same group is arbitrary and not controlled. However, one could set up a general hierarchical resource sharing structure, attesting to the extensibility of the user-level solution.
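Such hierarchical sharing could, for example, be expressed by nesting groups so that each group receives a fraction of its parent's allocation; a group's effective share is then the product of the fractions along its path to the root. The sketch below is purely illustrative and does not reflect the paper's implementation.

/* Illustrative only: hierarchical resource sharing as nested fractions.
 * A node's effective share of the machine is the product of the fractions
 * along its path to the root. */
#include <stddef.h>

typedef struct share_node {
    double             fraction;   /* share of the parent's allocation, in (0, 1] */
    struct share_node *parent;     /* NULL at the root */
} share_node;

double effective_share(const share_node *n)
{
    double s = 1.0;
    for (; n != NULL; n = n->parent)
        s *= n->fraction;
    return s;                       /* fraction of the total CPU */
}

For instance, a thread group given half of an application that is itself limited to 80% of the CPU would obtain an effective share of 0.5 * 0.8 = 40%.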
Controlling bandwidth of socket groups
Figure 8 shows the effect of restricting network bandwidth at the level of socket groups, where the total bandwidth of each socket group is constrained. The application used in the experiment consists of one server instance and three client instances. The server spawns a new thread for each client, using a new socket (connection). The communication follows a simple ping-pong pattern between the clients and the server.
Figure 8(a) shows the performance of the server threads when the bandwidth constraint is enforced at the process level. The total network bandwidth is restricted to 6MBps. The clients and server exchange 100,000 4KB-sized messages. The figure shows the number of messages sent by the server to each client as time progresses. The first client starts at about the same time as the server and gets the total bandwidth of 6MBps (as indicated by the slope). The second client starts after one minute, sharing the same network constraint. Therefore, the bandwidth becomes 3MBps each. The communication is kept at this rate for another minute until the third client joins, which makes all three of them transmit at a lower rate (2MBps). As a result, the first client takes more than 400 seconds to complete its transmission, due to the interference from the other two clients.
Figure 8(b) shows the case where the first client requires a better, guaranteed level of service. Two socket groups are used, with the network bandwidth of the first group constrained to 4MBps and that of the second group to 2MBps. The clients start at the same times as before. However, the performance of the first client is not influenced by the arrival of the other two clients; only the two later clients share the same bandwidth constraint. As a result, the first client takes only 200 seconds to finish its interactions.
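For concreteness, the sketch below shows one way an interposed send() could keep a socket group under its bandwidth limit: each call charges its bytes to the socket's group and sleeps just long enough to keep the group's average rate under the limit. It is written in Linux LD_PRELOAD style for brevity rather than as the paper's NT (Winsock interception) code, it omits locking, and the single 6MBps group and the group_of() lookup are assumptions that merely mirror the setup of Figure 8(a).

/* Sketch: interposed send() that rate-limits a socket group.  Group lookup
 * and the fixed 6 MB/s limit are assumptions of this sketch. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

struct sock_group {
    double bytes_per_sec;   /* bandwidth limit for the whole group */
    double bytes_sent;      /* bytes charged to the group so far   */
    struct timeval start;   /* time of the group's first send      */
};

/* Hypothetical lookup: here every socket falls into one 6 MB/s group. */
static struct sock_group the_group = { 6e6, 0.0, { 0, 0 } };
static struct sock_group *group_of(int sockfd) { (void)sockfd; return &the_group; }

/* Interposed send(): charge the bytes to the socket's group and sleep long
 * enough to keep the group's average rate under its limit. */
ssize_t send(int sockfd, const void *buf, size_t len, int flags)
{
    static ssize_t (*real_send)(int, const void *, size_t, int);
    struct sock_group *g = group_of(sockfd);
    struct timeval now;
    double elapsed, ahead;

    if (real_send == NULL)
        real_send = (ssize_t (*)(int, const void *, size_t, int))
                        dlsym(RTLD_NEXT, "send");

    gettimeofday(&now, NULL);
    if (g->start.tv_sec == 0)
        g->start = now;
    elapsed = (now.tv_sec - g->start.tv_sec) +
              (now.tv_usec - g->start.tv_usec) / 1e6;

    /* Time by which this group would run ahead of its allowed rate. */
    ahead = (g->bytes_sent + (double)len) / g->bytes_per_sec - elapsed;
    if (ahead > 0)
        usleep((useconds_t)(ahead * 1e6));

    g->bytes_sent += (double)len;
    return real_send(sockfd, buf, len, flags);
}

Compiled into a shared object and loaded with LD_PRELOAD, such a wrapper sees every send() issued by the sandboxed application; splitting the_group into one structure per socket group would yield the behavior of Figure 8(b).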
These experiments demonstrate that user-level sandboxing techniques can be used to create flexible, application-specific, predictable execution environments for application components at various granularities. As a large-scale application of such mechanisms, we have exploited these advantages in other work [CK00] to create a cluster-based testbed that can be used to model the execution behavior of distributed applications under various scenarios of dynamic and heterogeneous resource availability.