We present two major results:
Bounds for Optimal Performance: In the study of caching algorithms, it is invaluable to know the offline optimal performance that a multi-level cache can deliver. While Belady's MIN [4] (an offline algorithm) is considered optimal for single-level caches, it cannot be applied to multi-level cache scenarios, where, apart from the hit ratio, the location of hits is also extremely important. Our first contribution provides insight into the optimal offline performance of multi-level caches.
We provide policies called OPT-UB and OPT-LB that provably serve as upper and lower bounds for the optimal offline performance for multi-level caches along both average response time and inter-cache bandwidth usage metrics.
Through a series of experiments on a wide gamut of traces, cache sizes and configurations, we demonstrate that
OPT-UB and OPT-LB are very close bounds on the optimal average response time,
running on an average, within % and
% of each other for all the tested two-cache and three-cache hierarchies, respectively.
Even for more complex hierarchies, the bounds remain close at about
% of each other.
This novel result enables us to estimate for the first time, the
performance gap between the current state-of-the-art algorithms and the offline
optimal for multi-level caches.
PROMOTE Technique:
As another fundamental contribution to the field of caching, we propose a simple and significantly better alternative to the DEMOTE technique, called PROMOTE,
which provides exclusive caching without demotions, application hints, or any of the overheads inherent in DEMOTE.
PROMOTE uses a probabilistic filtering technique to ``promote" pages to higher caches on a read.
Not only do we show that PROMOTE is applicable to a wider range of algorithms and cache hierarchies, it is on an average, x more efficient than DEMOTE requiring only half the
inter-cache bandwidth between the various cache levels.
In a wide variety of experiments, while both techniques achieved the same aggregate hit ratio, PROMOTE provided % and
% more hits in
the highest cache than DEMOTE when the techniques were applied to LRU and ARC [25] algorithms, respectively, leading to better average response times
even when we allow DEMOTE unlimited inter-cache bandwidth and free demotions.
In limited bandwidth scenarios, PROMOTE convincingly outperforms
DEMOTE. For example, in a trace from a real-life scenario,
PROMOTE provided an average response time of
ms as compared to
ms for DEMOTE on a two-level hierarchy of ARC caches, and
ms as compared to
ms on a three-level cache hierarchy.