Figure 3: Cache Miss Rate. These charts compare the accumulative cache miss rate with prefetching (PF) against that without prefetching (No PF). The comparison is performed under three different cache sizes (S), measured by number of files.
Our first metric was the cache miss rate. The results for the three traces appears in Figure 3. Prefetching delivers a substantially better miss rate with all cache sizes. The results are worse for the second trace because it includes a recursive directory traversal (i.e., the UNIX find program) over a hierarchy that includes thousands of files that are never accessed again.
In addition, we monitored the cache behavior at a finer grain: how well our mechanism worked over each run of 500 accesses. At the end of each run, we checked to see whether prefetching had beaten no-prefetching by measuring the number of misses during that run. As shown in Table 1, prefetching won this comparison easily and consistently. This shows that our mechanism's superiority is steady and stable, and not simply a result of a few exceptionally fruitful prefetch sequences. The few losses due to bad prefetching guesses are more than offset by the many wins.
Both Figure 3 and Table 1 indicate that the increased intelligence of our mechanism is more effective in smaller caches. This was expected. Consider that, in the extreme case of an infinitely large cache, any file appearing in any access tree is still in the cache. There is no room for improvement from any prefetching algorithm that is based solely on information about the past.
Table 1: A Finer-grained Comparison. At the end of each run of 500 file accesses, we determined whether prefetching had beaten no-prefetching by measuring the number of cache misses during that run.
Table 2: Prefetching Accuracy and Overhead. The accuracy is the percentage of the predictions that were actually used. The overhead is the percentage of the file fetches due to bad guesses.
Figure 4: Effectiveness of the compatibility Function. This figure illustrates, for each trace, the split of pattern tree loads into proper ones and improper ones.
Figure 5: Mechanism Stability. This figure presents measurements of the prefetching mechanism over a wide range of parameter values. MATCH_THRESHOLD was varied from 0 to 1. PREFETCH_CAPACITY from 0 (no prefetching) to 50. The measurements were taken for Trace 3 and a cache size of 50.
We measured the accuracy of our prefetch decisions, defined as the percentage of file access predictions that were actually used. We also calculated one overhead of the mechanism, i.e., the percentage of the file fetches that were initiated due to bad prefetch decisions. This reflects the network bandwidth and server capacity that were wasted. Table 2 shows these results. (We shall address another overhead, the CPU cycles expended by the prefetcher, in Section 4.) Because of the LRU policy, bigger caches give better accuracy results because prefetched entries have a better chance to survive cache entry replacements. Similarly, bigger caches also give better overhead results because bad predictions have a better chance to have already existed in the cache; in such a case, no fetch is performed.
The compatibility function plays a critical role in our mechanism. The assumption underlying this function is that if the initial portions of two access trees bear a high compatibility, so will the two trees in their entirety. In order to test this assumption, we examined the number of loads, i.e., the total number of times we prefetched pattern trees. We further classified these loads as either proper or improper. A load became improper when a different pattern tree was selected by the prefetcher to take the place of the current one, or when the finished working tree did not match the pattern tree loaded. Otherwise, the load was proper. Our results in Figure 4 suggest that the compatibility function is effective.
Finally, we note that the algorithmic parameters should be stable: small variations in the settings must not produce large performance degradations. To illustrate the stability of the algorithmic parameters, we measure the mechanism over a wide range of parameter values. In Figure 5, we show the results for Trace 3 and a cache size of 50. Three measurements are presented: cache miss rate, predication accuracy and prefetch overhead. The results for the other two traces and cache sizes are similar.
The simulation demonstrates that our algorithm can accurately predict future file accesses based on past file usage. The major limitation of the simulation study is that it does not account for the relative timing of events due to the absence of such information in the traces used. As a result, prefetched files are assumed to appear in the cache instantaneously. It remains to be determined whether a real system will have the resources to exploit the information on future file accesses. The limitations of the simulation motivated us to conduct a full implementation and further evaluations.