Check out the new USENIX Web site. next up previous
Next: Differing Cache Sizes Up: Results Previous: Very Close Bounds on

Two Cache Hierarchy

Figure 9: On the x-axis is the size of the $ C_1$ and $ C_2$ caches. We plot the inter-cache traffic (left), and the average response time allowing unlimited bandwidth and free demotions (right), as a function of the caching algorithm and cache size.
\begin{figure*}\begin{center}
{\small Two Cache Hierarchy on traces P3, P5, and ...
....25in}
\epsfig{file=eps/P7-art.eps, height=2.1in, width=3.25in}
}\end{figure*}

Figure 10: On the x-axis is the size of the $ C_1$ and $ C_2$ caches. We plot the inter-cache traffic (left), and the average response time allowing unlimited bandwidth and free demotions (right), as a function of the caching algorithm and cache size.
\begin{figure*}\begin{center}
{\small Two Cache Hierarchy on traces Financial2, ...
...5in}
\epsfig{file=eps/zipf-art.eps, height=2.1in, width=3.25in}
}\end{figure*}

In Figure 8, we present detailed results for a first trace, P1, in a two cache (same size) hierarchy. We observed similar results with all traces given in Section IV-A, the results for some of which are shown in Figures 9 and 10.

Inter-cache Bandwidth Usage: PROMOTE $ 2$x more efficient than DEMOTE. In Figure 8-10, we plot the total traffic between the $ C_1$ and $ C_2$ caches (demotions + $ C_1$ misses). Observe that as the cache sizes grow, the inter-cache traffic decreases as $ C_1$ produces more hits. For both LRU and ARC variants, DEMOTE generates more than double the traffic generated by PROMOTE. This is because DEMOTE causes a demotion for every $ C_1$ miss (after $ C_1$ is full), and also incurs more misses in $ C_1$ than PROMOTE. This is true for all traces and cache sizes, where, on an average, DEMOTE requires $ 101$% more inter-cache bandwidth than PROMOTE for the LRU variant, and about $ 121$% more for the ARC variant.

Aggregate Hit Ratio: PROMOTE same as DEMOTE. In Figure 8, we observe that both PROMOTE-LRU and PROMOTE-ARC achieve almost the same aggregate hit ratio as their DEMOTE counterparts. This was observed for all traces and cache sizes. We also confirm that plain LRU achieves the lowest aggregate hit ratio as the inclusive nature of the lower cache results in very few hits. Please note, however, that the aggregate hit ratio is not a reliable performance metric.

Hits in the Highest Cache: PROMOTE beats DEMOTE.


Table I: Number of $ C_1$ hits (out of 2000000 requests), at a cache size of $ 50K$ blocks. While aggregate hits were almost the same for both PROMOTE and DEMOTE, we observed that PROMOTE accumulates more hits in $ C_1$.
P1 P3 P5 P7 P9 P11 Financial1 Financial2 spc1 zipf
PROMOTE-ARC 709476 546440 377332 456983 570250 692042 532270 1204243 533414 845104
DEMOTE-ARC 408174 333751 280481 263244 469771 540978 506503 947009 390810 365909
PROMOTE-LRU 653880 446803 210473 222156 511271 639430 644893 1410101 291183 791667
DEMOTE-LRU 590276 140384 172625 125960 396135 667633 688946 1488752 155197 760877


For the same aggregate hit ratio, a higher number of hits in the highest cache leads to better average response times. In Table I, we compare the number of hits in $ C_1$ for a wide range of traces with two levels of cache of $ 50K$ blocks each. While all the traces exhibited similar behavior, we skip some traces in the table to keep it small. We observe that PROMOTE-LRU beats DEMOTE-LRU by $ 13.0$% on an average, and PROMOTE-ARC beats its DEMOTE contender by $ 37.5$%. This is primarily because PROMOTE probabilistically promotes high reuse pages to the higher cache, while DEMOTE forces all pages to flow through the higher cache, pushing out a great number of hits to the lower cache levels.

Average Response Time: PROMOTE beats DEMOTE. In Figure 8 we plot the average response time for the trace P1 at various cache sizes. When we assume an unlimited inter-cache bandwidth and free demotions (although DEMOTE has a rough model to attribute for demotion costs, we create the case most favorable to DEMOTE), PROMOTE-LRU still beats DEMOTE-LRU by up to $ 4$% and PROMOTE-ARC beats DEMOTE-ARC by up to $ 5$% across all cache sizes. For all other traces (some shown in Figures 9 and 10) PROMOTE achieves $ 0.3$%(LRU) and $ 1.5$%(ARC) better response times on an average.

In the lower right panel of Figure 8, we examine a limited bandwidth case, which is more realistic. We allow $ 300$ blocks per second, which is $ 1.5$x that required if there were no hits or demotions ( $ 1/t_m = 200$). When we average the response time across all cache sizes, we observe that PROMOTE substantially outperforms DEMOTE by achieving lower response times, $ 3.21$ms (for ARC) and $ 3.42$ms (for LRU), as compared to DEMOTE with $ 5.43$ms (for ARC) and $ 5.05$ms (for LRU), respectively. In fact, both DEMOTE-LRU and DEMOTE-ARC consistently perform worse than even plain LRU. Surprisingly, for smaller cache sizes, DEMOTE variants perform even worse than no caching at all (i.e. worse than $ t_m = 5 ms$)! This happens because, when bandwidth is the bottleneck and we use DEMOTE, the bandwidth saved due to one hit in cache $ C_1$ is consumed by one demotion due to a miss. When the number of misses is greater than the number of hits in the cache $ C_1$ ( $ < 50$% hit ratio), a no-caching policy will actually perform better.

Since DEMOTE is clearly worse in the limited bandwidth case, we consistently assume unlimited inter-cache bandwidth and free demotions for the remaining traces shown in Figures 9 and 10.


next up previous
Next: Differing Cache Sizes Up: Results Previous: Very Close Bounds on
root 2008-01-08