A well-known solution to this problem implemented in many UNIX systems, including Linux, is to prefetch a disk block before it is needed [12]. Linux assumes that most applications access a file in a sequential order. Hence, whenever an application reads a data block from the disk, the file system prefetches blocks , , ... for some small value of . If the access pattern is not sequential, prefetching is suspended until the application's disk accesses exhibit a sequential access pattern again. For most common applications, this simple sequential prefetching scheme seems to work well. However, sequential access is not necessarily the dominant access pattern for some other important applications, such as volumetric data visualization, multi-dimensional FFT, or digital video playbacks (e.g., fast forwards and rewinds); these are popular applications in the scientific visualization or multimedia world.
Different applications may exhibit different access patterns and thus a fixed prefetching policy is not adequate. We propose an Automatic Application-Specific File Prefetching mechanism (AASFP) that allows an application program to instruct the file system how to prefetch on its behalf. AASFP automatically generates the prefetch instructions from an application's source code. The prefetch instructions are instantiated as a program running inside either a local file system or a network file server. AASFP is an application of the idea of decoupled architecture [22], which was originally proposed to address the von Neumann bottleneck to bridging the CPU-disk performance gap.
AASFP transforms a given application program into two threads: a computation thread and a prefetch thread. The computation thread is the original program and contains both computational and disk access instructions; the prefetch thread contains all the instructions in the original program that are related to disk I/O. At run time, the prefetch thread is started earlier than the computation thread. Because the computation load in the prefetch thread is typically a small percentage of that of the original program, the prefetch thread could remain ahead of the computation thread throughout the application's entire lifetime. Consequently, most disk I/O accesses in the computation thread are serviced directly from the file system buffer, which is populated beforehand by the I/O accesses in the prefetch thread. In other words, the prefetch thread brings in exactly the data blocks as required by the computation thread before they are needed. Of course, it is not always possible for the prefetch thread to maintain sufficient advance over the computation thread. For example, the computation thread may need to access a disk address, but the generation of this address may depend on the user inputs or on the inputs from a file. In such cases, these two threads need to synchronize with each other. Fortunately, such tight synchronization is relatively infrequent in I/O-intensive programs. The key advantage of AASFP is that it eliminates the need for manual programming of file prefetch hints by generating the prefetch thread from the original program using compile-time analysis. In addition to being more accurate in what to prefetch, AASFP also provides more flexibility to the file system in deciding when to prefetch disk blocks via a large look-ahead window lead by the prefetch thread.
The rest of this paper is organized as follows. Section 2 reviews some of the related work. Section 3 describes the system architecture of AASFP. Section 4 shows how to generate the prefetch thread from a given program. Section 5 discusses the run-time system of AASFP. Section 6 evaluates the performance results of AASFP. Section 7 concludes this paper and points out potential future directions.