SARC

Next: System Implementation, Workload, and Up: SARC: Sequential Prefetching in Previous: Equalizing Marginal Utilities

SARC

We now weave together the above analysis and insights into a concrete, adaptive (self-tuning) algorithm that dynamically balances the cache space allocated to RANDOM and SEQ data under different conditions such as different number of sequential/random streams, different data layout strategies, different think times of sequential/random streams, different cache-sensitivities/footprints of random streams, and different I/O intensities. The complete algorithm is displayed in Figure 4. SARC is extremely simple to implement and has virtually no computational/space overhead over an LRU-based implementation.

$\begin{figure}\hrule\vspace*{1mm} {\small {\raggedright {\sc Initialization:}}} ... ...seqCounter} (x)$ = $1$ \\ 40:\>\> {\sf endif} \end{tabbing}\hrule\end{figure}$

**Figure 4:** Algorithm for Sequential Prefetching Adaptive Replacement Cache. This algorithm is completely self-contained, and can directly be used as a basis for an implementation. The algorithm starts from an empty cache.
$\begin{figure}\hrule\vspace*{1mm} {\small {\raggedright {\sc Cache Management Po... ...}$ = ${\sf seqListSize}$ \\ 73:\> {\sf endif} \end{tabbing}\hrule\end{figure}$

In our experiments, we have set $\Delta L$ to be $2\%$ of the cache space. Any other small fraction would work as well.

In the algorithm, we will need to determine if a hit in one of the lists was actually in its bottom (see lines 6 and 12). Of course, one could maintain separate fixed sized bottom lists for both the lists. However, with an eye towards simplification and computational efficiency, we now describe a time-based approximation to achieve nearly the same effect. We now describe how to determine if the a hit in SEQ lies in its bottom $\Delta L$ . Let $T_{\sf MRU}$ and $T_{\sf LRU}$ denote the time stamp of the MRU and LRU tracks, respectively, in SEQ. Let denote the size of SEQ in pages. Let $T_{\sf HIT}$ denote the time stamp of the hit track. Now, if

$\displaystyle (T_{\sf HIT} - T_{\sf LRU}) \leq \frac{\Delta L}{L} (T_{\sf MRU} - T_{\sf LRU}),$

then we say that a bottom hit has occurred. The same calculation can also be used for determining the bottom hits in RANDOM.

The algorithm uses the following constants: (lines 18 and 32) denotes the degree of synchronous and asynchronous read-ahead; (lines 18 and 32) denotes the number of disks in a RAID group; ${\sf triggerOffset}$ (line 49) is the offset from the end of a prefetched group of tracks and is used to demarcate the asynchronous trigger; LargeRatio (line 13) is a threshold, say, , to indicate that the ratio has become much larger than ; and FreeQThreshold (line 52) denotes the desired size of the FreeQ. Shark uses RAID arrays (either RAID-5 or RAID-10) at the back-end. Our machine (see Section IV-B) was configured with RAID-5 (6 data disks + parity disk + spare disk) meaning that . RAID allows for parallel reads since logical data is striped across the physical data disks. To leverage this, setting as a multiple of is meaningful. We set . Finally, we chose ${\sf triggerOffset}$ to be .

The algorithm is self-explanatory. We now briefly explain important portions of the algorithm in Figure 4 in a line-by-line fashion.

Lines 1-3 are used during the initialization phase only. The counter ${\sf seqMiss}$ tracks the number of sequential misses between two consecutive bottom hits in RANDOM, and is initialized to zero. The variable ${\sf desiredSeqListSize}$ is the desired size of SEQ, and is initially set to zero meaning adaptation is shut off. The adaptation will start only after SEQ is populated (see lines 69-73). The variable ${\sf adapt}$ determines the instantaneous magnitude and direction of the adaptation to ${\sf desiredSeqListSize}$ .

Lines 4-50 describe the cache management policy. The quantity ${\sf ratio}$ in line 4 is derived from (6). Lines 5-10 deal with the case when a track in RANDOM is hit. If the hit is in the bottom portion of the RANDOM list (line 6), then ${\sf seqMiss}$ is reset to zero (line 7) since we are interested in number of sequential misses between two successive hits in the bottom of RANDOM. Line 8, sets the variable

$\displaystyle {\sf adapt} = \frac{2 \cdot {\sf seqMiss} \cdot \Delta L}{L} - 1.$

Note that in the steady state the rate of new tracks being added to any cache is the same as the rate of old tracks being demoted from the cache. This rate is an upper bound on the rate at which the size of either SEQ or RANDOM can change while keeping the total number of tracks in the cache constant. Since ${\sf adapt}$ is used to change the ${\sf desiredSeqListSize}$ , it is therefore not allowed to exceed

or be less than

. Observe that by the test prescribed in (6), if ${\sf adapt}$ is greater than zero, then we would like to increase ${\sf desiredSeqListSize}$ , else we would like to decrease it. This increase or decrease is executed in line 70. Also, observe that when the inequality between the marginal utilities of SEQ and RANDOM is larger, the magnitude of ${\sf adapt}$ is larger, and, hence, a faster rate of adaptation is adopted, whereas when the two marginal utilities are nearly equal, ${\sf adapt}$ will be close to zero, and a slower rate of adaptation is adopted. Finally, observe that line 70 (which carries out the actual adaptation) is executed only when a track is actually evicted from one of the lists. In a steady state, tracks are evicted from the cache at the rate of cache misses. Hence, a larger (respectively, a smaller) rate of misses will effect a faster (respectively, a slower) rate of adaptation. Hence, SARC adapts not only the sizes of the two lists, but also the rate at which the sizes are adapted.

Lines 11-27 deal with the case when a track in SEQ is hit. If the hit is in the bottom portion of the SEQ list (line 12) and ${\sf ratio}$ has become large (line 13), in other words, no hit has been observed in the bottom of the RANDOM list while comparatively large number of sequential misses have been seen on the SEQ list, then set ${\sf adapt}$ to (line 14) meaning that increase ${\sf desiredSeqListSize}$ at the fastest rate possible. Now, if the hit track is an asynchronous trigger track (line 17), then asynchronously read-ahead the next sequential group of tracks (line 18). Lines 21-27 describe how the sequential detection mechanism in Section II-A is implemented.

Lines 28-40 deal with a cache miss. For a sequential miss (lines 29-31), synchronously read-ahead a sequential group of tracks (line 32). The remaining lines deal with sequential detection mechanism in Section II-A.

Lines 41-50 (i) read the missing tracks from a given range of tracks; (ii) places all tracks in the given range at the MRU position; and (iii) set the asynchronous trigger.

Lines 51-73 implement the cache replacement policy and carry out the adaptation. As is typical in multi-threaded systems, we assume that these lines run on a separate thread (line 51). If the size of the free queue drops below some predetermined threshold (line 52), then tracks are evicted from SEQ if it exceeds ${\sf desiredSeqListSize}$ and tracks are evicted from RANDOM otherwise. In either case, the evicted tracks are placed on the free queue. Observe that SARC becomes active (lines 60-64) only if the sizes of both the SEQ and the RANDOM list exceed $\Delta L$ . Otherwise, a simple LRU eviction (lines 54-58) is done. Whenever the utility of one of the two lists becomes so small when compared to the utility of the other list that its size eventually drops below $\Delta L$ , we do not want to waste even $\Delta L$ amount of cache, and revert back to LRU. Whenever both list sizes exceed $\Delta L$ , SARC takes over. Finally, lines 68-73 evict the LRU track from the desired list, and effect an adaption as already described above.

Our description of SARC is now complete.

Next: System Implementation, Workload, and Up: SARC: Sequential Prefetching in Previous: Equalizing Marginal Utilities

Binny Gill 2005-02-14