Next: 3.5 Reliable FastReplica Algorithm Up: 3 FastReplica Algorithm Previous: 3.3 Preliminary Performance Analysis


3.4 FastReplica in the Large

In this section, we generalize FastReplica in the small to the case where the file must be replicated to a set of nodes numbering in the hundreds or thousands. Let $k$ be the number of network connections chosen for concurrent transfers between a single node and multiple receiving nodes (i.e. $k$ limits the number of nodes in a group for the Multiple Unicast or FastReplica strategies). An appropriate value of $k$ can be determined experimentally via probing. Heterogeneous nodes may be capable of supporting different numbers of connections; let $k$ be the number of connections suitable for most of the nodes in the overall replication set. A natural way to scale FastReplica in the small to a large number of nodes is to apply it iteratively: partition the nodes into replication groups of $k$ nodes, and let each node that has received the file serve as the origin node for a further group. Schematically, this procedure is shown in Figure 6, where circles represent the nodes and boxes represent the replication groups. The arrows connecting one node with a set of other nodes indicate the origin node and the recipient nodes involved in communications at a particular iteration of the algorithm.
Figure 6: FastReplica in the large: iterative replication process.
At the first iteration, node $N_0$ replicates file $F$ to group $G^1_1$, consisting of $k$ nodes, using the FastReplica in the small algorithm. At the second iteration, each node $N_i^1$ $(1 \leq i \leq k)$ of group $G^1_1$ can serve as the origin node propagating file $F$ to another group $G^2_i$. Thus in two iterations, file $F$ can be replicated to $k \times k$ nodes; correspondingly, in three iterations, file $F$ can be replicated to $k \times k \times k$ nodes. The general FastReplica algorithm is based on this reasoning. Consider the problem of replicating file $F$ across nodes $N_1, ..., N_n$, and let ${n \over k} = m$. Then all the nodes are partitioned into $m$ groups:

\begin{displaymath}G^1, G^2, ..., G^m\end{displaymath}

where each group has $k$ nodes. Any number $m$ can be represented as
\begin{displaymath}
m = { {c_1} \times k^{i_1} + {c_2} \times k^{i_2} + ... + {c_j} \times k^{i_j} }
\end{displaymath} (8)

where $i_1 > i_2 > ... > i_j \geq 0$ and $0 < c_1, ..., c_j < k$. In effect, this is the $k$-ary representation of the number $m$. This representation defines the rules for constructing a tree structure similar to the one shown in Figure 6. In particular, the height of such a tree is ${i_1}+1$, which is the number of iterations in the general FastReplica algorithm. From this representation, the rules for constructing the corresponding distribution lists of nodes are straightforward; we omit the technical details of the distribution list construction to keep the description of the overall algorithm concise. If the targeted number $n$ of nodes for a file replication is not a multiple of $k$, i.e.

\begin{displaymath}n = k \times m + r \end{displaymath}

where $r < k$, then there is one ``incomplete'' group $\hat{G}$ with $r$ nodes in it. The best way to deal with this group is to arrange it as a leaf group in the shortest subtree. Let $G^{\prime} = \{N_1^{\prime}, ..., N_k^{\prime}\}$ be a replication group in the shortest subtree.
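The partitioning above, together with the iteration count given by Formula (8), can be sketched in Python. This is a minimal illustration; the function and key names are ours, not the paper's, and it assumes $n \geq k$:

```python
def replication_plan(n, k):
    """Split n target nodes into m = n // k complete groups of k nodes
    plus one incomplete group of r = n % k nodes, and compute the
    replication-tree height i_1 + 1 from the k-ary representation of m
    (Formula (8)). Assumes n >= k."""
    m, r = divmod(n, k)
    # k-ary digits of m, most significant first; i_1 is the exponent
    # of the leading digit, i.e. len(digits) - 1.
    digits = []
    q = m
    while q > 0:
        digits.append(q % k)
        q //= k
    digits.reverse()
    i1 = len(digits) - 1
    return {"complete_groups": m,
            "incomplete_group_size": r,
            "iterations": i1 + 1}
```

For instance, with $k = 10$ and $n = 1000$ this yields $m = 100$ complete groups, no incomplete group, and a tree of height 3.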
Figure 7: Communications between the nodes of regular replication group $G^{\prime }$ and incomplete replication group $\hat{G}$: special step.
The communications between groups $G^{\prime }$ and $\hat{G}$ follow a slightly different file exchange protocol. All the nodes in $G^{\prime }$ have already received all subfiles $F_1, ..., F_k$ comprising the entire original file $F$. Each node $N_i^{\prime}$ of group $G^{\prime }$ opens $r$ concurrent network connections to all $r$ nodes of group $\hat{G}$ and transfers its subfile $F_i$, as shown in Figure 7. In this way, at the end of this step, each node of group $\hat{G}$ has all subfiles $F_1, ..., F_k$ of the original file $F$. We will refer to this step as the special step.

Example. Let $k = 10$. How many algorithm iterations are required to replicate the original file to 1000 nodes? The 1000 nodes are partitioned into $m = 1000 / 10 = 100$ groups, and by Formula (8) above, $m$ has the representation

\begin{displaymath}100 = 1 \times 10^2\end{displaymath}

so the replication tree has height $i_1 + 1 = 3$. Thus, in three algorithm iterations ($10 \times 10 \times 10$), the original file can be replicated among all 1000 nodes. At each iteration, the replication process follows FastReplica in the small, i.e. the iteration consists of two steps, each transferring a $1 \over k$-th portion of the original file $F$.

Let Multiple Unicast follow a recursive replication tree similar to the one defined above for general FastReplica and shown in Figure 6, with the only difference being that communications between the origin nodes and the recipient nodes follow the Multiple Unicast schema, i.e. the origin node transfers the entire file $F$ to the corresponding recipient nodes by simultaneously using $k$ concurrent network connections. Thus, in three algorithm iterations, by using Multiple Unicast recursively, the original file can be replicated among 1000 nodes.
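The special step between a regular group and an incomplete group can be sketched as follows. The node identifiers and the returned transfer triples are hypothetical stand-ins for the actual concurrent network connections:

```python
def special_step_transfers(g_prime, g_hat):
    """Enumerate the transfers of the special step: node N'_i of the
    complete group g_prime sends its subfile F_i (i = 1..k) to every
    node of the incomplete group g_hat.

    Returns (sender, receiver, subfile_index) triples; once they all
    complete, every node in g_hat holds all k subfiles of file F.
    """
    return [(sender, receiver, i)
            for i, sender in enumerate(g_prime, start=1)
            for receiver in g_hat]
```

Each of the $k$ senders opens $r$ connections, so the step performs $k \times r$ subfile transfers in total, and every node of $\hat{G}$ receives each of the $k$ distinct subfiles exactly once.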