Visualization Tool |
Overview of the Performance Explorer.
This section describes our visualization tool, called the Performance Explorer, which supports performance analysis through the interactive visualization of trace records that are generated by the VM extensions described in Section *. Figure * presents an overview of the Performance Explorer when run with a variant of SPECjbb2000 using one warehouse on one virtual processor. The figure comprises three parts (a, b, and c), which are further described below.
The Performance Explorer is based on two key concepts: trace record sets and metrics.
A trace record set is a set of trace records.
Given the trace files of an application's execution,
the Performance Explorer provides an initial trace record set containing
all trace records of all threads on all virtual processors.
It also provides filters to create subsets of a trace record set
(e.g., all trace records of the MainThread,
all trace records longer than 5 ms,
all trace records with more than 1000 L1 D-cache misses,
or all trace records ending in a Java method matching the regular expression ".*lock.*").
Part a of Figure *
illustrates the user interface for configuring these filters.
Furthermore, the Performance Explorer provides set operations
(union, intersection, difference) on trace record sets.
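The filters and set operations described above can be sketched as follows. This is a minimal illustration, not the Performance Explorer's implementation: the `TraceRecord` fields, the `filter_set` helper, the event name `L1D_MISS`, and the sample records are all hypothetical names and data chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical trace record; field names are assumptions for this sketch."""
    thread: str
    start_ms: float
    duration_ms: float
    method: str            # Java method in which the record ends
    counters: tuple = ()   # (event name, value) pairs, kept as a tuple so records are hashable

    def counter(self, name):
        # Return the value of a hardware counter, or 0 if it was not recorded.
        return dict(self.counters).get(name, 0)


def filter_set(records, predicate):
    """Create the subset of a trace record set that satisfies a predicate."""
    return frozenset(r for r in records if predicate(r))


# Made-up initial trace record set: all records of all threads.
all_records = frozenset({
    TraceRecord("MainThread", 0.0, 7.5, "Main.run", (("L1D_MISS", 1500),)),
    TraceRecord("MainThread", 7.5, 2.0, "Queue.lock", (("L1D_MISS", 200),)),
    TraceRecord("Warehouse-1", 9.5, 6.0, "Warehouse.unlockOrder", (("L1D_MISS", 3000),)),
    TraceRecord("Warehouse-1", 15.5, 1.0, "Warehouse.process", (("L1D_MISS", 50),)),
})

# The four example filters from the text.
main_thread = filter_set(all_records, lambda r: r.thread == "MainThread")
long_records = filter_set(all_records, lambda r: r.duration_ms > 5)
cache_heavy = filter_set(all_records, lambda r: r.counter("L1D_MISS") > 1000)
lock_methods = filter_set(
    all_records, lambda r: re.fullmatch(r".*lock.*", r.method) is not None
)

# Set operations on trace record sets.
long_or_lock = long_records | lock_methods   # union
long_main = main_thread & long_records       # intersection
short_main = main_thread - long_records      # difference
```

Modeling a trace record set as an immutable set of hashable records makes union, intersection, and difference available directly through the language's set operators.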
A metric extracts or computes a value from a trace record. The Performance Explorer provides a metric for each hardware counter gathered in the trace files, and a metric for the trace record duration. The user can define new metrics using arithmetic operations on existing metrics. For example, instructions per cycle (IPC) can be defined as a computed metric that divides the instructions completed (INST_CMPL) event value by the cycles (CYC) event value. Part b of Figure * shows a graph for just one metric. The horizontal axis is wall-clock time, and the vertical axis is defined by the metric, which in this case is IPC. The vertical line a quarter of the way in from the left side of the graph represents a marker trace record generated by manual instrumentation of the program. This specific marker shows when the warehouse application thread starts executing. To the right of the marker, the application enters a steady state in which the warehouse thread is created and executes. The region to the left of the marker represents the program's start-up, where its main thread dominates execution. Each line segment in the graph represents a trace record (in this zoomed-out view, most line segments appear as points). The length of a line segment represents the record's wall-clock duration. The color of a line segment (different shades of gray in this paper) indicates the corresponding Java thread. The user can zoom in and out and scroll the graph.
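One way to picture metrics is as functions from a trace record to a number, with computed metrics built by combining existing ones. The sketch below makes that assumption; the helper names `counter_metric` and `divide`, the dictionary representation of a record, and the sample values are hypothetical. Only the event names INST_CMPL and CYC come from the text.

```python
def counter_metric(event):
    """Metric that extracts one hardware counter value from a trace record."""
    return lambda record: record[event]


def divide(numerator, denominator):
    """Computed metric: the quotient of two existing metrics."""
    def metric(record):
        d = denominator(record)
        # Avoid division by zero for records in which no cycles were counted.
        return numerator(record) / d if d else float("nan")
    return metric


# IPC defined as INST_CMPL divided by CYC, as in the text.
ipc = divide(counter_metric("INST_CMPL"), counter_metric("CYC"))

# Made-up trace record, represented as a plain dictionary for this sketch.
record = {"INST_CMPL": 1000, "CYC": 2000, "duration_ms": 3.0}
print(ipc(record))  # prints 0.5
```

Because a computed metric has the same shape as a built-in one, it can itself be used as an operand of further arithmetic or as the vertical axis of a graph.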
Part c of Figure * displays a table of the trace records visualized in the graph, one trace record per line. This table presents all the attributes of a trace record, including the values of all metrics plotted in the graph. Selecting a range of trace records in the graph selects the same records in the table, and vice versa. This allows the user to select anomalous patterns in the graph, for example the drop in IPC before each garbage collection, and immediately see all the attributes (like method names) of the corresponding trace records in the trace record table. The user can get simple descriptive statistics (sum, minimum, maximum, average, standard deviation, and mean delta) over the selected trace records for all metrics. Finally, a selection of trace records can be named and saved as a new trace record set.
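The descriptive statistics over a selection can be illustrated with a short sketch. It rests on several assumptions: the function name `describe` and the sample IPC values are hypothetical, the population form of the standard deviation is chosen arbitrarily because the text does not say which form the tool uses, and mean delta is omitted because the text does not define it.

```python
import statistics


def describe(values):
    """Simple descriptive statistics for one metric over selected trace records."""
    values = list(values)
    return {
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "average": statistics.fmean(values),
        # Assumption: population standard deviation; the tool may use the sample form.
        "stddev": statistics.pstdev(values),
    }


# Made-up IPC values for three selected trace records.
selected_ipc = [0.5, 1.0, 1.5]
stats = describe(selected_ipc)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```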
In addition to time graphs and trace record tables, the Performance Explorer provides several other ways to visualize the trace data. Thread timelines are tables in which each column represents a thread, each row represents an interval in time, and the cells are colored based on the processor that is executing the thread. Processor timelines are tables in which each column represents a processor, and the cells are colored based on the thread that executes on the processor. Cells in these timelines visualize a given metric (such as IPC), and they are adorned with glyphs to show preemption or yielding of a thread at the end of a time slice. These timelines are helpful in analyzing scheduling effects. The Performance Explorer also provides scatter plots, in which each of the X and Y axes can be any metric, and all trace records of a trace record set are represented as points. Lastly, the Performance Explorer calculates the correlation between any two metrics over a trace record set.
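As an illustration of correlating two metrics over a trace record set, the sketch below computes the Pearson correlation coefficient. The text does not state which correlation measure the Performance Explorer uses, so Pearson is an assumption here, and the function name and sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long value lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a metric is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values of two metrics, one pair per trace record.
cache_misses = [100.0, 200.0, 300.0, 400.0]
ipc_values = [1.2, 1.0, 0.8, 0.6]
r = pearson(cache_misses, ipc_values)
print(round(r, 6))
```

In this made-up data IPC falls linearly as cache misses rise, so the coefficient is close to -1. With real trace data, a scatter plot of the same two metrics would show whether the relationship is actually linear.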
Because POWER4 processors can count a large number of HPM events, and a given event is available on only a limited number of hardware counters, the Performance Explorer provides functionality for exploring the available events and event groups.
Because of space considerations, subsequent figures only contain information from the Performance Explorer that is pertinent to the discussion at hand.