Reduction

Next: The Reduction Skeleton Up: Data Parallel Objects Previous: Data Parallel Objects

Reduction

An important aspect of data parallelism is the reduction operation where each thread contributes a value and the values are reduced using a function to obtain and return a reduced value to each of the threads. Reduction can be defined follows:
Let be the set of all elements of type . Let denote the set of all finite subsets of . Let be the set of all commutative and associative binary operators/functions defined over elements of type . Then a reduction operator, , is defined from . A reduction operation is the application of an operator to a set of elements of type and returns a result of type . Operationally, if is the set , then

is one way of applying the operator.

As an example, let be the int type, and be instantiated to a vector v of type v<int> with 10 elements . Let be the addition operator +. Since + is associative over the set of integers, it qualifies as a reduction operation and .

The reduction operation itself is a parallel operation that can be done with O(log N) complexity using a tree-style operation. There are three aspects to reduction:

The data types of the individual values contributed by each of the threads.
The computation pattern of the reduction operation.
The reduction operator or function that is used to reduce the values. This should be an associative function (i.e., f(f(x,y),z) = f(x, f(y,z))).

It may be noted that the reduction computation pattern is independent of the data types of the values contributed. It just depends on the amount of parallelism that is available in terms of the number of processors and threads. Tree reductions are effective in reducing the parallel complexity. However, doing the reduction operation this way in parallel may demand that the reduction operator not only be associative but also be commutative (i.e., f(x,y) = f(y,x)).

Since the computation pattern is independent of the data types of the values and the reduction operator, a computation pattern skeleton can be built at the time of rope creation. This skeleton can be used and reused for different reduction operations. The way Coir<Futures> defines the computation pattern, it has the following properties:

Reduction operations are rope-specific. Thus reduction operations belonging to different ropes are non interfering.
Two different reduction operations within the same rope are non-interfering. This is ensured by defining this skeleton to consist of two trees - a fan-in tree and a fan-out tree.

Next: The Reduction Skeleton Up: Data Parallel Objects Previous: Data Parallel Objects

Sundaresan Neelakantan
Thu May 15 16:11:49 PDT 1997