The source-to-source translator generates a prefetch thread from an application program by extracting the parts of the program that are related to disk I/O accesses and removing all the other computation. In addition, all the disk I/O calls in the prefetch thread are replaced by their corresponding prefetch calls to the prefetch library. The original program itself forms the computation thread. There is a one-to-one correspondence between the prefetch calls in the prefetch thread and the actual disk I/O calls in the computation thread. The prefetch thread is executed as a Linux thread, as supported by the pthread library [11]. All the prefetch calls are serviced by AASFP's run-time prefetch library, which generates the logical file block address associated with each prefetch call and inserts the prefetch requests into a user-level prefetch queue. When the user-level prefetch queue becomes full, the prefetch library makes a system call to transfer the prefetch requests to the application's kernel-level prefetch queue. However, these prefetch hints are completely non-binding (i.e., the kernel might ignore these hints if there are not enough resources for file prefetching).
A major innovation of AASFP is that the computation and the prefetch thread are automatically synchronized so that the kernel neither prefetches too far ahead nor falls significantly behind. That is, AASFP ensures that the portion of the target files prefetched are those that are to be accessed by the computation thread in the immediate future. To maintain synchronization between the prefetch and computation threads, AASFP marks each entry in the prefetch queue with the ID of the corresponding prefetch call. The kernel also maintains the ID of the current disk I/O call of the computation thread. These IDs are created and maintained by the run-time system without programmer intervention. When the ID of an entry in the prefetch queue is smaller than the ID of the most recently made disk I/O call, the computation thread has run ahead of the prefetch thread, and the kernel simply skips the expired prefetch queue entries. Therefore, even if the prefetch thread falls behind, it never prefetches unnecessary data. To prevent the prefetch thread from running too far ahead of the computation thread, the kernel attempts to maintain a prefetch depth of , based on the following: the run-time measurements of the average disk service time, which can be measured off-line beforehand; and the average amount of computation that the application performs between consecutive I/O calls.
The files that an application accesses can be classified into two categories: data files, which are expected to be prefetched by AASFP, and configuration files, which are retrieved in the beginning and used to steer the computation to generate access addresses, and thus do not need to be prefetched. It is not possible for the translator to distinguish between data and configuration files. For every data file, the application programmer can optionally provide an additional annotation of the form PREFETCH file *fp to indicate to the source-to-source translator that the file descriptor fp should be prefetched at run time. These files will be referred to as prefetchable files for the rest of the paper.