Table 3: Percentage of total time spent idle for various configurations
transmitting data on PC-733.

Configuration                                                     Idle Time While Running nettest
PC-733                                                            86%
Optimized VM/PC-733                                               21.7%
Optimized VM/PC-733 without IRQ notification                      17.9%
Optimized VM/PC-733 without send combining and IRQ notification   2.0%
Version 2.0 VM/PC-733                                             0%

Figure 6 shows that VM/PC-733 can saturate a 100 Mbit link without
becoming CPU bound, whereas VM/PC-350 is CPU bound even with
optimizations. Natively, both PC-733 and PC-350 easily saturate a
100 Mbit link. The final experiments measure how heavily the CPU is
utilized in the different configurations.
We instrumented the system to obtain a precise measurement of idle
time. Normally, when a guest issues a halt (HLT) instruction, VMware
Workstation switches back to the VMApp, which then blocks on a
select() on all devices. Instead, we enabled an option whereby a
guest HLT instruction spins and halts the CPU in the VMM rather than
yielding control back to the host OS. Using the TSC register, we
measure idle time from the moment the guest issues a HLT instruction
until the next hardware interrupt occurs. This idle time represents
CPU cycles that are available to the guest OS for running other
computation. Note that not all of this idle time would be available
for other host OS computation: switching back to the VMApp on a guest
HLT instruction would incur a couple of world switches and some
system call overhead (e.g., the select() system call).
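
The accounting described above can be sketched roughly as follows. This is only an illustrative model, not the actual VMM instrumentation: the class and method names (IdleAccumulator, on_guest_hlt, on_hardware_interrupt) are hypothetical, and plain integers stand in for TSC readings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdleAccumulator:
    """Hypothetical model of idle-time accounting between HLT and interrupt.

    Timestamps are plain integers standing in for TSC cycle counts.
    """
    idle_cycles: int = 0
    halted_at: Optional[int] = None

    def on_guest_hlt(self, tsc: int) -> None:
        # Start of an idle interval: remember when the guest halted.
        if self.halted_at is None:
            self.halted_at = tsc

    def on_hardware_interrupt(self, tsc: int) -> None:
        # End of an idle interval: charge the elapsed cycles as idle.
        # Interrupts that arrive while the guest is running are ignored.
        if self.halted_at is not None:
            self.idle_cycles += tsc - self.halted_at
            self.halted_at = None

    def idle_percent(self, start_tsc: int, end_tsc: int) -> float:
        # Idle cycles as a percentage of the whole measurement window.
        total = end_tsc - start_tsc
        if total <= 0:
            raise ValueError("measurement window must be positive")
        return 100.0 * self.idle_cycles / total
```

In this model, a run would call on_guest_hlt and on_hardware_interrupt with the current counter value at each event, then call idle_percent over the measurement window to obtain a figure comparable in form to the VM rows of Table 3.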
For the native idle times, the standard profiler built into Linux
kernels was augmented to account for time spent executing user code
and in the kernel idle loop, and then the percentage of total ticks
spent in the idle loop was taken.
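
As a rough illustration of the native calculation, the idle percentage is the share of profiler ticks attributed to the idle loop. The function name, the bucket labels, and the tick counts used in the usage note are all assumptions made for this sketch; they are not taken from the Linux profiler or from the measurements reported here.

```python
from typing import Mapping


def idle_percentage(ticks: Mapping[str, int], idle_bucket: str = "idle") -> float:
    """Return the percentage of total profiler ticks spent in the idle loop.

    `ticks` maps a hypothetical bucket name (e.g. "user", "kernel", "idle")
    to the number of profiler ticks charged to that bucket.
    """
    total = sum(ticks.values())
    if total <= 0:
        raise ValueError("no profiler ticks recorded")
    # Buckets that never received a tick count as zero.
    return 100.0 * ticks.get(idle_bucket, 0) / total
```

For example, with made-up counts of 10 user ticks, 30 kernel ticks, and 60 idle ticks, idle_percentage returns 60.0.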
The idle times in Table 3 show that in VM/PC-733,
with a transmit size of 4KB, the guest has transitioned from being CPU
bound at 64 Mb/s to being I/O bound with 21.7% idle time. In
comparison, PC-733 has 86% idle time. At this point, nearly all
of the remaining overheads are either part of CPU virtualization or
part of the nature of the hosted architecture. The next section
discusses further optimizations both within and outside the scope of a
hosted architecture.
Beng-Hong Lim
2001-05-01