Figure gives the energy distribution for the
software components in both the interpreter and JIT modes.
For example, when jack executes in the interpreter mode, instruction accesses consume 60J and data accesses consume 232J. The corresponding energy numbers for the JIT mode are much lower, at 10J and 20J respectively.
These results are consistent with the better locality of
instruction accesses in the interpreter mode, as discussed earlier.
In the interpreter mode, almost all the energy is spent in interpretation;
GC and class loading were found to account for less than 2%
of the overall energy consumption.
Although execution takes the largest amount of energy in
the JIT mode, the dynamic compilation also consumes a significant amount of
energy. This is due to two main reasons [23].
First, there are abrupt changes in the working
set during dynamic compilation, as the code and data structures used by the compiler are different
from those used by the rest of the JVM. Thus, when execution moves to the code generation phase, we experience
poor locality in the cache (data and instruction) accesses, which in turn causes
more references to memory (both Imemory and Dmemory). Second, when code is installed after
dynamic compilation, it causes references to main memory.
We observe that, in the JIT mode, dynamic compilation consumes on average
24% of the overall energy across the benchmarks.
Figure
(d) breaks down the energy consumption
by the hardware components during dynamic compilation and shows
that Imemory and Dmemory are responsible for the bulk of the overhead.
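The compile-then-install behavior described above can be observed directly on a modern HotSpot JVM: running a program with the `-XX:+PrintCompilation` flag prints a line as each hot method is compiled and its code installed, which corresponds to the working-set shift and code-installation traffic discussed here. A minimal sketch (invocation thresholds and the flag's output format vary across JVM implementations):

```java
// HotLoop.java -- run with: java -XX:+PrintCompilation HotLoop
// Each compilation line printed by the JVM marks a point where the JIT
// generates and installs code, the step that causes the extra
// main-memory references discussed above.
public class HotLoop {
    // A small method invoked many times, so the JIT marks it hot.
    static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sum = 0;
        // Enough iterations to cross typical JIT invocation thresholds.
        for (int i = 0; i < 1_000_000; i++) {
            sum += square(i & 0xFF);
        }
        System.out.println(sum);
    }
}
```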
In the rest of this discussion, we focus only on the JIT mode, since
the energy consumption of the interpreter is dominated entirely by
interpretation. Figure gives the energy
breakdown of javac into the different software components under
different cache configurations.
We observe that, in contrast to class loading and garbage collection,
dynamic compilation and execution can take advantage of larger cache sizes.
Data from other experiments [22] show
that the energy consumption during loading is mainly dominated
by compulsory misses. Hence, the number of total misses during loading
is fairly constant across different cache configurations.
However, there are small variations in energy consumption
with changes in cache configuration, because the total energy depends
not only on the number of accesses but also on the per-access energy cost,
which is determined by the tag-matching hardware and
the capacitive load of the bit lines.
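A simple first-order model makes this point concrete (an illustrative model, not the exact one used in this study; $E_{tag}$, $E_{bit}$, and $E_{mem}$ denote assumed per-event costs of tag matching, bit-line switching, and a memory access):

```latex
E_{cache} = N_{access} \cdot (E_{tag} + E_{bit}) + N_{miss} \cdot E_{mem}
```

Even when $N_{miss}$ is fixed by compulsory misses, $E_{tag}$ grows with associativity and $E_{bit}$ with the number and length of the bit lines, so the total energy still varies across cache configurations.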
As can be observed in the class loading profile in Figure
(a),
most of the energy is consumed by the data memory.
It should be noted that some Java environments may run multiple
applications concurrently, in which case some of the class loading costs can
be amortized across the different applications [10].
We see from Figure that the garbage collector
consumes a very small fraction of the energy.
Its energy consumption due to data accesses is higher than that due to
instruction accesses: the garbage collector
code itself is very small (i.e., good Icache locality),
but the data it traverses has relatively poor locality.
In fact, our detailed analysis shows that
most of the energy expended in the data memory is a result of
cache misses. Further innovation in
improving the data locality of garbage collection would therefore be valuable from an energy perspective.
While the absolute energy consumed by the garbage collector is small compared to
overall execution in these experiments,
we believe that the need for more aggressive
garbage collection for limited memory embedded systems will make this component more important.
It must be noted that the energy consumed in the garbage collection
portion is also influenced by the choice of the algorithm and the size
of the heap. The size of the heap can influence the number of times
the garbage collector is invoked.
For example, when we varied the heap size from 24M to 8M,
the energy consumed by garbage collection increased eightfold when
executing mtrt (s100 dataset, JIT compilation mode).
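The heap-size effect can be reproduced on any standard JVM: the sketch below allocates a fixed volume of short-lived objects and reports how many collections the JVM ran, using the standard `java.lang.management` API. Running it with a smaller `-Xmx` (e.g. `-Xmx8m` versus `-Xmx24m`) typically yields a noticeably higher collection count. This is an illustrative sketch, not the instrumentation used in this study:

```java
// GcCount.java -- run with e.g.: java -Xmx8m GcCount  vs  java -Xmx24m GcCount
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCount {
    public static void main(String[] args) {
        // Allocate a fixed total volume of short-lived garbage:
        // a smaller heap forces the collector to run more often.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];          // becomes garbage immediately
            if (junk.length == 0) System.out.println(); // keep allocation live
        }
        long collections = 0;
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            collections += Math.max(0, gc.getCollectionCount()); // -1 if unavailable
        }
        System.out.println("GC invocations: " + collections);
    }
}
```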
The dataset of the application can also influence the energy consumed by
the garbage collector.
As an example, we found that for javac in the JIT mode, the GC is responsible for
nearly 14% of the total data misses with the s100 data set (compared to 7% with s10),
and thus contributes more significantly to the overall
energy. A more detailed analysis of these tradeoffs in
garbage collection energy consumption is
beyond the scope of this work and is an interesting area of research in itself.
The execution of compiled code consumes the largest share of the energy,
and Figure
shows the energy distribution across the different hardware and software
components in the JIT mode.
Overall, observing the trends shown in Figure
, it is
interesting to note that different applications in SPEC JVM98 exhibit different
energy behaviors.
For instance,
while mtrt consumes the maximum energy during the execution phase,
its energy consumption is smaller than that of compress during loading, garbage collection,
and dynamic compilation.
The energy consumption of the different software components is a function of the number of classes loaded, the size of
those classes, the number of methods compiled, the number of times a method is invoked after compilation,
the heap size (which determines the frequency of GC invocation), the data set size and
heap allocation behavior, and the memory access behavior during execution. Since the actual execution
of the compiled code is the dominant component, we need to focus on developing techniques
to reduce the energy consumed by this component.
Optimizations during the JIT compilation phase (e.g., [28,38]) can also potentially improve the
energy efficiency of the execution phase, sometimes at the cost of increasing energy consumption due
to dynamic compilation itself.
Finally, we would like to emphasize that the energy behavior in the different portions of the JVM is
also dependent on the dataset size. We observe from Figure that the shares of class
loading and dynamic compilation are comparatively smaller for the s100 dataset than for the s10 dataset.