Two Cache Hierarchy

**Figure 9:** On the x-axis is the size of the and caches. We plot the inter-cache traffic (left), and the average response time allowing unlimited bandwidth and free demotions (right), as a function of the caching algorithm and cache size.
$\begin{figure}\begin{center} {\small Two Cache Hierarchy on traces P3, P5, and ... ....25in} \epsfig{file=eps/P7-art.eps, height=2.1in, width=3.25in} }\end{figure}$

**Figure 10:** On the x-axis is the size of the and caches. We plot the inter-cache traffic (left), and the average response time allowing unlimited bandwidth and free demotions (right), as a function of the caching algorithm and cache size.
$\begin{figure}\begin{center} {\small Two Cache Hierarchy on traces Financial2, ... ...5in} \epsfig{file=eps/zipf-art.eps, height=2.1in, width=3.25in} }\end{figure}$

In Figure 8, we present detailed results for a first trace, P1, in a two cache (same size) hierarchy. We observed similar results with all traces given in Section IV-A, the results for some of which are shown in Figures 9 and 10.

Inter-cache Bandwidth Usage: PROMOTE x more efficient than DEMOTE. In Figure 8-10, we plot the total traffic between the

and

caches (demotions +

misses). Observe that as the cache sizes grow, the inter-cache traffic decreases as

produces more hits. For both LRU and ARC variants, DEMOTE generates more than double the traffic generated by PROMOTE. This is because DEMOTE causes a demotion for every

miss (after

is full), and also incurs more misses in

than PROMOTE. This is true for all traces and cache sizes, where, on an average, DEMOTE requires

% more inter-cache bandwidth than PROMOTE for the LRU variant, and about

% more for the ARC variant.

Aggregate Hit Ratio: PROMOTE same as DEMOTE. In Figure 8, we observe that both PROMOTE-LRU and PROMOTE-ARC achieve almost the same aggregate hit ratio as their DEMOTE counterparts. This was observed for all traces and cache sizes. We also confirm that plain LRU achieves the lowest aggregate hit ratio as the inclusive nature of the lower cache results in very few hits. Please note, however, that the aggregate hit ratio is not a reliable performance metric.

Table I: Number of

hits (out of 2000000 requests), at a cache size of

blocks. While aggregate hits were almost the same for both PROMOTE and DEMOTE, we observed that PROMOTE accumulates more hits in

	P1	P3	P5	P7	P9	P11	Financial1	Financial2	spc1	zipf
PROMOTE-ARC	709476	546440	377332	456983	570250	692042	532270	1204243	533414	845104
DEMOTE-ARC	408174	333751	280481	263244	469771	540978	506503	947009	390810	365909
PROMOTE-LRU	653880	446803	210473	222156	511271	639430	644893	1410101	291183	791667
DEMOTE-LRU	590276	140384	172625	125960	396135	667633	688946	1488752	155197	760877

For the same aggregate hit ratio, a higher number of hits in the highest cache leads to better average response times. In Table I, we compare the number of hits in

for a wide range of traces with two levels of cache of

blocks each. While all the traces exhibited similar behavior, we skip some traces in the table to keep it small. We observe that PROMOTE-LRU beats DEMOTE-LRU by

% on an average, and PROMOTE-ARC beats its DEMOTE contender by

%. This is primarily because PROMOTE probabilistically promotes high reuse pages to the higher cache, while DEMOTE forces all pages to flow through the higher cache, pushing out a great number of hits to the lower cache levels.

Average Response Time: PROMOTE beats DEMOTE. In Figure 8 we plot the average response time for the trace P1 at various cache sizes. When we assume an unlimited inter-cache bandwidth and free demotions (although DEMOTE has a rough model to attribute for demotion costs, we create the case most favorable to DEMOTE), PROMOTE-LRU still beats DEMOTE-LRU by up to

% and PROMOTE-ARC beats DEMOTE-ARC by up to

% across all cache sizes. For all other traces (some shown in Figures 9 and 10) PROMOTE achieves

%(LRU) and

%(ARC) better response times on an average.

In the lower right panel of Figure 8, we examine a limited bandwidth case, which is more realistic. We allow

blocks per second, which is

x that required if there were no hits or demotions (

). When we average the response time across all cache sizes, we observe that PROMOTE substantially outperforms DEMOTE by achieving lower response times,

ms (for ARC) and

ms (for LRU), as compared to DEMOTE with

ms (for ARC) and

ms (for LRU), respectively. In fact, both DEMOTE-LRU and DEMOTE-ARC consistently perform worse than even plain LRU. Surprisingly, for smaller cache sizes, DEMOTE variants perform even worse than no caching at all (i.e. worse than

)! This happens because, when bandwidth is the bottleneck and we use DEMOTE, the bandwidth saved due to one hit in cache

is consumed by one demotion due to a miss. When the number of misses is greater than the number of hits in the cache

(

% hit ratio), a no-caching policy will actually perform better.

Since DEMOTE is clearly worse in the limited bandwidth case, we consistently assume unlimited inter-cache bandwidth and free demotions for the remaining traces shown in Figures 9 and 10.