
Methodology

We have implemented a fully operational AASFP prototype under Linux 2.0.34. For the source-to-source translator, we did not modify GCC; instead we built a parser of our own, which currently accepts only programs written in ANSI C. The parser itself consists of 2.5K lines of code, and the I/O extraction part of about 3K lines. The modifications to the Linux kernel involve only about 500 lines of code, and the modifications to the device driver code fewer than 50 lines, so this work should be fairly easy to port to newer Linux kernels. To evaluate the prototype's performance, we ran one micro-benchmark and two real media applications on a 200-MHz Pentium Pro machine with 64 MB of memory.

The first real application is a volume visualization program based on the direct ray casting algorithm [27]. The volume data set used here has $256 \times 256 \times 256$ data points, each one byte in size. The data set is divided into equal-sized blocks, and a block is the basic unit of disk I/O. The block size can be tuned to find the best trade-off between disk transfer efficiency and computation-I/O parallelism. In this experiment, we view the data from different viewing directions and use different block sizes. Results for two block sizes are reported here: 4KB ( $16 \times 16 \times 16$) and 32KB ( $32 \times 32 \times 32$). For non-orthonormal viewing directions, the block access pattern is quite random, so this application is a good example of a workload for which the default Linux prefetching algorithm is of little help.
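To make the block arithmetic concrete, the following is a minimal sketch of how a $256 \times 256 \times 256$ one-byte volume might be partitioned into cubic blocks, and of which blocks a ray touches. The linear block ordering (x varying fastest), the unit-step ray sampling, and all function names are assumptions made for illustration; the text does not describe the program's actual file layout or traversal code.

```python
import math

VOLUME_DIM = 256  # data points per axis, one byte each (from the text)


def block_bytes(block_dim):
    """Size in bytes of one cubic block at one byte per data point."""
    return block_dim ** 3


def blocks_per_axis(block_dim):
    """Number of blocks along each axis of the volume."""
    return VOLUME_DIM // block_dim


def block_index(x, y, z, block_dim):
    """Linear index of the block holding point (x, y, z).

    Assumed ordering: x varies fastest, then y, then z.
    """
    n = blocks_per_axis(block_dim)
    bx, by, bz = x // block_dim, y // block_dim, z // block_dim
    return bx + n * (by + n * bz)


def blocks_along_ray(origin, direction, block_dim, steps=256):
    """Distinct blocks visited, in order, by a ray sampled at unit steps."""
    visited = []
    for t in range(steps):
        p = [math.floor(o + t * d) for o, d in zip(origin, direction)]
        if not all(0 <= c < VOLUME_DIM for c in p):
            break  # the ray has left the volume
        idx = block_index(p[0], p[1], p[2], block_dim)
        if not visited or visited[-1] != idx:
            visited.append(idx)
    return visited


if __name__ == "__main__":
    print(block_bytes(16), block_bytes(32))            # block sizes in bytes
    print(blocks_along_ray((0, 0, 0), (1, 0, 0), 16))  # axis-aligned ray
    print(blocks_along_ray((0, 0, 0), (1, 1, 1), 16))  # diagonal ray
```

Under this assumed layout, a ray along the x axis visits consecutive block indices, whereas a diagonal ray jumps 273 blocks at a time. Sequential read-ahead helps only in the first case.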

The second application is an out-of-core FFT program [20]. The original program uses four files for reading and writing. We modified it so that all reads and writes go to a single large file. We tested the FFT program with 256K and 512K complex points; the input file sizes are 2MB and 4MB, respectively. Each read/write unit is 4KB.
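The relation between the point counts, the file sizes, and the 4KB read/write unit can be checked with a short calculation. The sketch below assumes 8 bytes per complex point, a figure inferred from the stated sizes (2MB / 256K points) rather than given in the text; the constant and function names are illustrative only.

```python
KB = 1024
MB = 1024 * KB

# Assumption: 8 bytes per complex point, inferred from 2MB / 256K points.
BYTES_PER_POINT = 8
IO_UNIT = 4 * KB  # read/write unit stated in the text


def input_file_bytes(num_points):
    """Input file size in bytes for the given number of complex points."""
    return num_points * BYTES_PER_POINT


def io_units(num_points):
    """Number of 4KB read/write units needed to cover the input file."""
    return input_file_bytes(num_points) // IO_UNIT


if __name__ == "__main__":
    for points in (256 * KB, 512 * KB):
        print(points, input_file_bytes(points) // MB, io_units(points))
```

Under that assumption, the 256K-point input spans 512 read/write units and the 512K-point input spans 1024.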

Table 1 shows the characteristics of the applications used in this performance evaluation study. Vol Vis 1, Vol Vis 2, Vol Vis 3, and Vol Vis 4 are four variations of the volume visualization application, viewed from different angles with different block sizes. FFT 256K and FFT 512K are the out-of-core FFT program run with different input sizes. Forward 1, Backward 1, Forward 2, and Backward 2 are variations of a micro-benchmark that emulates the disk access behavior of a digital video player that supports fast forward and fast backward in addition to normal playback.
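As an illustration of the kind of access pattern such a micro-benchmark could generate, the sketch below lists the block numbers a hypothetical player would read during normal playback, fast forward, and fast backward. The function name, the `step` parameter, and the fixed-stride model are assumptions; the text does not specify how the micro-benchmark chooses which blocks to skip.

```python
def playback_offsets(num_blocks, step=1, backward=False):
    """Block numbers read by a hypothetical video player.

    step=1 models normal playback; step>1 models fast forward or fast
    backward by reading every step-th block (an assumed model).
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if backward:
        # Start at the last block and move toward the beginning.
        return list(range(num_blocks - 1, -1, -step))
    return list(range(0, num_blocks, step))


if __name__ == "__main__":
    print(playback_offsets(8))                         # normal playback
    print(playback_offsets(8, step=3))                 # fast forward
    print(playback_offsets(8, step=3, backward=True))  # fast backward
```

Both the strided and the backward patterns break the assumption of sequential forward access on which simple read-ahead relies.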


chuan-kai yang 2002-04-15