Yaoping Ruan and Vivek Pai
Department of Computer Science
Princeton University
{yruan,vivek}@cs.princeton.edu
For operating-system-intensive applications, the ability of designers
to understand system call performance behavior is essential to
achieving high performance. Conventional performance tools, such as
monitoring tools and profilers, collect and present their information
off-line or via out-of-band channels. We believe that making this
information first-class and exposing it to applications via
in-band channels on a per-call basis presents opportunities for
performance analysis and tuning not available via other mechanisms.
Furthermore, our approach provides direct feedback to applications on
time spent in the kernel, resource contention, and time spent blocked,
allowing them to immediately observe how their actions affect kernel
behavior. Not only does this approach provide greater
transparency into the workings of the kernel, but it also allows
applications to control how performance information is collected,
filtered, and correlated with application-level events.
To demonstrate the power of this approach, we show that our implementation, DeBox, obtains precise information about OS behavior at low cost, and that it can be used to debug and tune application performance on complex workloads. In particular, we focus on the industry-standard SPECweb99 benchmark running on the Flash Web Server. Using DeBox, we are able to diagnose a series of problematic interactions between the server and the OS. Addressing these issues, along with other optimization opportunities, yields an overall factor-of-four improvement in our SPECweb99 score, throughput gains on other benchmarks, and latency reductions ranging from a factor of 4 to a factor of 47.