Next: 4 Performance Study Up: Measuring CPU Overhead for Previous: 2 Xen


3 Monitoring Framework


To implement a monitoring system that accounts for CPU usage by different guest VMs, we instrumented activity in the hypervisor CPU scheduler.

Let $Dom_0, Dom_1, ..., Dom_k$ be the virtual machines that share the host node, where $Dom_0$ is a privileged management domain (Domain0) that hosts the device drivers. Let $Dom_{idle}$ denote a special idle domain that ``executes'' on the CPU when there are no other runnable domains (i.e., every virtual machine is either blocked or idle). $Dom_{idle}$ is the analogue of the ``idle loop'' executed by an OS when there are no runnable processes.

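At any point in time, guest domain $Dom_i$ can be in one of three states: execution (currently using the CPU), runnable (on the run queue, waiting to be scheduled), or blocked (not on the run queue).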

For each domain $Dom_i$, we collect a sequence of data describing the timing of its state changes. From this data, it is relatively straightforward to compute the share of CPU that was allocated to $Dom_i$ over time.
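The computation of a CPU share from state-change timings can be illustrated with a minimal sketch. It is not the instrumentation code used in the hypervisor; the function name `cpu_share`, the `(timestamp, state)` record layout, and the state labels `"execution"`, `"runnable"`, and `"blocked"` are assumptions made for illustration only.

```python
def cpu_share(transitions, start, end):
    """Fraction of the window [start, end) a domain spent executing.

    transitions: list of (timestamp, state) pairs sorted by timestamp;
    each pair records that the domain entered `state` at `timestamp`.
    The record layout and state names are illustrative assumptions.
    """
    if end <= start:
        raise ValueError("window must have positive length")
    busy = 0.0
    # Pair each transition with the time of the next one; the last
    # state is assumed to persist until the end of the window.
    following = transitions[1:] + [(end, None)]
    for (t, state), (t_next, _) in zip(transitions, following):
        if state != "execution":
            continue
        # Clip the execution interval to the observation window.
        lo, hi = max(t, start), min(t_next, end)
        if hi > lo:
            busy += hi - lo
    return busy / (end - start)


# Hypothetical trace for one domain, timestamps in milliseconds.
trace = [(0, "blocked"), (10, "runnable"), (20, "execution"),
         (50, "blocked"), (80, "execution")]
share = cpu_share(trace, 0, 100)
```

In this hypothetical trace the domain executes during 20-50 ms and 80-100 ms, so `share` covers 50 ms of the 100 ms window.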

As mentioned in Section 2, to avoid the overhead of copying I/O data to and from the guest virtual machine, Xen implements the ``page-flipping'' technique, in which the memory page containing the I/O data is exchanged with an unused page provided by the guest OS. Thus, to account for the different I/O-related activities in $Dom_0$ (which ``hosts'' the unmodified device drivers), we observe the memory page exchanges between $Dom_0$ and $Dom_i$. We measure the number $N^{mp}_i$ of memory page exchanges performed over a time interval $T_i$ during which $Dom_0$ is in the execution state. We derive the CPU cost (CPU processing time) of one memory page exchange as $Cost_i^{mp}=T_i/N^{mp}_i$. Then, if there are $N_i^{Dom_i}$ memory page exchanges between $Dom_0$ and virtual machine $Dom_i$, $Dom_i$ is ``charged'' $N_{i}^{Dom_i} \times Cost_i^{mp}$ of Domain0's CPU processing time. In this way, we can partition the CPU time that Domain0 spends processing the I/O-related activities of different VMs sharing the same device driver, and ``charge'' it to the virtual machine that caused those activities. Within the monitoring system, we use a time interval of 100 ms to aggregate overall CPU usage across the different virtual machines.
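The charging rule can be sketched as follows. This is an illustration of the arithmetic only, not the monitoring system's code; the function name `charge_dom0_time` and the dictionary layout are hypothetical, and the sketch assumes that $N^{mp}_i$ is the total number of page exchanges by all guests during the interval.

```python
def charge_dom0_time(dom0_exec_time, exchanges_per_domain):
    """Split Domain0's execution time among guests by page-exchange count.

    dom0_exec_time: time T_i that Dom_0 spent in the execution state.
    exchanges_per_domain: maps a guest name to its number of page
        exchanges with Dom_0 in that interval (hypothetical layout).
    Returns a dict mapping each guest to the Dom_0 CPU time charged to it.
    """
    total = sum(exchanges_per_domain.values())  # N^{mp}_i (assumed total)
    if total == 0:
        # No page exchanges observed, so nothing can be attributed.
        return {dom: 0.0 for dom in exchanges_per_domain}
    cost_per_exchange = dom0_exec_time / total  # Cost_i^{mp} = T_i / N^{mp}_i
    return {dom: n * cost_per_exchange
            for dom, n in exchanges_per_domain.items()}


# Hypothetical interval: Dom_0 executes for 32 ms and performs
# 64 exchanges with Dom1 and 192 with Dom2.
charges = charge_dom0_time(32.0, {"Dom1": 64, "Dom2": 192})
```

With these made-up numbers the per-exchange cost is 32/256 = 0.125 ms, so Dom1 is charged 8 ms and Dom2 24 ms; the charges sum to the full Domain0 execution time for the interval.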


