Check out the new USENIX Web site. next up previous
Next: Inter-Procedural Dependency Analysis Up: Generation of Prefetch Thread Previous: Generation of Prefetch Thread

Intra-Procedural Dependency Analysis

The goal of intra-procedure dependency analysis is to identify, inside a procedure, all the variables and statements that disk access statements depend on. Let ${related\_set(x)}$ of a variable $x$ be the set of statements that are involved, directly or indirectly, in generating the value of $x$ in a procedure. We use a simple and conservative approach as follows to compute $related\_set(x)$:
  1. All the statements that directly update the variable $x$, i.e., those that define $x$, are included in $related\_set(x)$.
  2. Compute the set $A$ that contains all the variables used in the statements in $related\_set(x)$. Then for each variable $a \in A$, include $related\_set(a)$ into $related\_set(x)$.

Note that in a block structured language like C, special care should be taken to restrict the computation of $related\_set(x)$ in Step 1 within the scope of the declaration of the variable $x$. For example in the code fragment below, lines 4 and 5 should not be included in $related\_set$ of the variable $i$ on line 7.

1. void main(void){
2.      int i;
3.      i = 5;
4.      {   int i;
5.          i = 6;
6.      }
7.      lseek(fp,i,SEEK_SET);
8. }

The above algorithm for computing $related\_set(x)$ is conservative and simple to implement. However, it might include some redundant definitions of a variable which never reaches any disk I/O statement. Since the generated source code will go through a final compilation phase, we expect that the compiler could eliminate these redundant statements with a more detailed data flow analysis [1].

Given this algorithm to identify related set, we use the following algorithm to analyze a procedure and deduce the corresponding prefetch thread:

  1. Include the disk access calls in the original program that operate on prefetchable files in $PT$, the set of statements that are I/O related. For each variable $x$ that appears in the disk access statements, mark it as I/O related, compute $related\_set(x)$, and include the result into $PT$.

  2. For each flow-control statement, e.g., for, while, do-while, if-then-else, case, if there is at least one member of $PT$ inside its body, mark all the variables used in the boolean expression of the flow-control statement as I/O related. For each such variable $a$, compute $related\_set(a)$, and include it in $PT$. Repeat this step until $PT$ stops growing in terms of program size.

  3. Include into $PT$ the declaration statements of all variables that are marked as I/O related.

  4. Insert into $PT$ the necessary synchronization calls, add send_XXX and receive_XXX calls to transfer data between the threads, and rename shared global variables to avoid simultaneous updates from both threads.

  5. Finally, if a procedure does not contain any I/O related statements after the algorithm completes, then remove its statements from $PT$.

The above algorithm executes on each procedure iteratively until the resulting I/O-related variable set converges.


next up previous
Next: Inter-Procedural Dependency Analysis Up: Generation of Prefetch Thread Previous: Generation of Prefetch Thread
chuan-kai yang 2002-04-15