FAST '08 – Abstract
Pp. 159–174 of the Proceedings
Improving I/O Performance of Applications through Compiler-Directed Code
Restructuring
Mahmut Kandemir and Seung Woo Son, Pennsylvania State University; Mustafa Karakoy, Imperial College
Abstract
The ever-increasing complexity of large-scale applications and the
continuous growth in the size of the data they process make maximizing
the performance of such applications very challenging. In
particular, many applications from the domains of
astrophysics, medicine, biology, computational chemistry, and
materials science are extremely data intensive. Such applications
typically use a disk system to store and later retrieve their large
data sets, and consequently, their disk performance is a critical
concern. Unfortunately, while disk density has significantly improved
over the last couple of decades, disk access latencies have not. As a
result, I/O is increasingly becoming a bottleneck for data-intensive
applications and must be addressed at the software level to
extract the maximum performance from modern computer
architectures.
This paper presents a compiler-directed code restructuring scheme for
improving the I/O performance of data-intensive scientific
applications. The proposed approach improves I/O performance by
reducing the number of disk accesses through a new concept called disk
reuse maximization. In this context, disk reuse refers to reusing the
data in a given set of disks as much as possible before moving to
other disks. Our compiler-based approach restructures application
code, with the help of a polyhedral tool, such that disk reuse is
maximized to the extent allowed by intrinsic data dependencies in the
application code. The proposed optimization can be applied to each loop
nest individually or to the entire application code. The experiments
show that the loop-nest-based version of our approach yields average
I/O improvements of 9.0% and 2.7% over the original application
codes and the codes optimized using conventional schemes, respectively.
Further, applying our approach to the entire application code yields
average improvements of 15.0% and 13.5% over the original
application codes and the codes optimized using conventional schemes,
respectively. This paper also discusses how careful file layout
selection helps to improve our performance gains, and how our proposed
approach can be extended to work with parallel applications.
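
The idea of disk reuse can be illustrated with a small sketch. The code
below is not the paper's polyhedral algorithm; it is a toy model written
for this page, and every name and parameter in it (disk_of,
count_disk_switches, the 8x8 array, the round-robin striping, the stripe
size of 8 elements, the 4 disks) is an assumption chosen for
illustration. It also assumes that the loop iterations are fully
independent, so any reordering is legal; real code is constrained by its
data dependencies. The sketch counts how often consecutive accesses move
from one disk to another, first for a column-wise traversal of a
row-major array and then for the same accesses grouped by disk.

```python
# Toy model of disk reuse (illustrative assumptions, not the paper's method).

def disk_of(offset, stripe_elems, num_disks):
    """Disk holding the element at `offset`, assuming round-robin striping."""
    return (offset // stripe_elems) % num_disks


def count_disk_switches(offsets, stripe_elems, num_disks):
    """Number of consecutive access pairs that touch different disks."""
    disks = [disk_of(o, stripe_elems, num_disks) for o in offsets]
    return sum(1 for a, b in zip(disks, disks[1:]) if a != b)


def column_wise_order(rows, cols):
    """File offsets touched by a column-wise walk over a row-major array."""
    return [i * cols + j for j in range(cols) for i in range(rows)]


def disk_reuse_order(offsets, stripe_elems, num_disks):
    """Group accesses by disk using a stable sort.

    Valid only under the assumption that iterations are independent.
    """
    return sorted(offsets, key=lambda o: disk_of(o, stripe_elems, num_disks))


# Hypothetical configuration: each 8-element row is one stripe unit.
ROWS, COLS = 8, 8
STRIPE_ELEMS, NUM_DISKS = 8, 4

before = column_wise_order(ROWS, COLS)
after = disk_reuse_order(before, STRIPE_ELEMS, NUM_DISKS)

switches_before = count_disk_switches(before, STRIPE_ELEMS, NUM_DISKS)
switches_after = count_disk_switches(after, STRIPE_ELEMS, NUM_DISKS)

print("disk switches, original order: ", switches_before)
print("disk switches, disk-reuse order:", switches_after)
```

In this configuration the original order changes disks on every access
(63 switches over 64 accesses), while the grouped order changes disks
only 3 times. The switch count is only a rough proxy; it does not model
caching, seek times, or the I/O improvements reported above.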
The Proceedings are published as a collective work, © 2008 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.