
Sequential Prefetching

Sequentiality is a characteristic of workloads that reference consecutively numbered pages in ascending order without gaps. Sequential file accesses have been known at least since Multics [16]. Sequentiality arises naturally in video-on-demand, database scans, and copy, backup, and recovery operations, which may read a large number of files sequentially. Evidence of sequentiality abounds in database workloads; see, for example, [17,18,19,20,21]. The world of database and storage systems performance is largely dominated by benchmarks. The Transaction Processing Performance Council (TPC) benchmarks TPC-D [21,22] and TPC-H exhibit a significant amount of sequentiality. Similarly, SPC-1, the first benchmark from the more recent Storage Performance Council (SPC), is designed to be a mix of random and sequential workloads [23,24]. The importance of sequential access patterns is further underscored by the fact that the forthcoming SPC-2 benchmark will focus entirely on many concurrent sequential clients [25].

In contrast to sophisticated forecasting methods, detecting sequentiality is easy, requires very little history information, and can attain nearly 100% predictive accuracy. An important trend is that the sequential bandwidth of disks has been increasing at a respectable annual rate of 40%, while seek times have been improving at a meager annual rate of only 8%. This implies that the additional cost of read-ahead on a seek is becoming progressively smaller. For these reasons, all UNIX variants [26], most modern-day file systems [27,28], databases such as DB2 [29] and Oracle [30], and high-end storage controllers such as IBM Shark [31] and EMC Symmetrix all employ sequential detection and prefetching.
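To illustrate how little history sequential detection needs, the following is a minimal sketch of a single-stream detector. It is not the scheme used by any of the systems cited above; the class name `SequentialDetector` and the parameters `threshold` and `prefetch_degree` are hypothetical names chosen for this example. The detector remembers only the last page referenced and the length of the current run, and it proposes a read-ahead once the run is long enough.

```python
from typing import List, Optional


class SequentialDetector:
    """Illustrative single-stream sequential detector (hypothetical sketch)."""

    def __init__(self, threshold: int = 2, prefetch_degree: int = 4) -> None:
        # Assumption: a run of `threshold` consecutive pages counts as sequential.
        self.threshold = threshold
        # Assumption: a fixed number of pages is read ahead on each trigger.
        self.prefetch_degree = prefetch_degree
        # The only history kept: the last page seen and the current run length.
        self.last_page: Optional[int] = None
        self.run_length = 0

    def access(self, page: int) -> List[int]:
        """Record a reference to `page` and return the pages to prefetch."""
        if self.last_page is not None and page == self.last_page + 1:
            # Consecutive page in ascending order: the run continues.
            self.run_length += 1
        else:
            # A gap, a repeat, or a backward jump starts a new run.
            self.run_length = 1
        self.last_page = page

        if self.run_length >= self.threshold:
            # Propose the next `prefetch_degree` pages after the current one.
            return list(range(page + 1, page + 1 + self.prefetch_degree))
        return []
```

With `threshold=2` and `prefetch_degree=3`, the reference sequence 10, 11, 40 yields no prefetch for page 10, a prefetch of pages 12 to 14 after page 11, and no prefetch after the jump to page 40. A practical implementation would also have to track many concurrent streams and avoid re-reading pages that were already prefetched; both concerns are left out of this sketch.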


Binny Gill 2005-02-14