We now turn to the computation of the ALMI distribution tree. A session multicast tree is formed as a virtual Minimum Spanning Tree that connects all members. The minimum spanning tree calculation is performed at the session controller and results are communicated to all members in the form of a (parent, children) list. Link costs are representative of application specific performance metric which is computed by members in a distributed fashion and reported to the controller periodically. In our current implementation, we uses roundtrip delay, measured at ALMI layer, as the metric because latency is important to most applications and is also relatively easy to monitor. However, some applications may find other metrics such as available link bandwidth, more useful and better suited to match its performance measure. As an example, a bandwidth intensive application may prefer a high bandwidth, high delay link to a low delay, low bandwidth link to carry its traffic. Design and development of these type of tools to obtain more sophisticated measurements helps ALMI to provide more flexible services and these tools can be easily plugged in as a module to ALMI. Nevertheless such instrumentation in a wide area network is non-trivial and it is beyond the scope of this paper to discuss these mechanisms. In the rest of this paper, we will simply use delay as the default performance metric.
Neighbor monitoring graph
In order to obtain monitoring results, ALMI connects all group members
into a monitoring graph. Members send ping messages to measure round
trip delay to its neighbors in the graph. For small groups, it is
possible to create a mash and have message exchanges to
compute the best multicast tree. However, as group size grows, it
becomes unscalable to have large number of message exchanges since the
monitoring process is periodic and continuous through the whole
multicast session. To reduce control overhead, we limit the degree of
each node in the graph, i.e. the number of neighbors monitored by a
member, to be constant so as to reduce the number of message exchanges
to . The consequent spanner graph results in sub-optimal
multicast tree since it does not have a complete view of all possible
paths and its set of edges may not be a super set of all edges in
MST. Such sub-optimality is reduced, however, by occasionally purging
the currently known bad edges from the graph and updating it with
edges currently not in the graph. Over time, the graph converges to
include all edges in the optimal degree-bounded spanning
tree. Likewise, in a dynamic environment, the graph updates to trace
the better set of edges and to produce a more favorable multicast
tree.
Multicast tree and its stability
Once members start to report monitoring results to their session
controller, ALMI is able to improve the multicast tree from its
initial random tree.1 As
described above, an ALMI multicast tree is a degree-bounded optimal
spanning tree. Since most end hosts tend to be on access links rather
than at network core, it is desirable to confine the number of packet
copies traversing through access links to be small, i.e a small degree
bound. On the other hand, if servers use ALMI to construct a multicast
session and they have access to high speed network, the degree bound
can be correspondingly configured higher.
A more crucial issue is how to achieve stability of the multicast tree since a change of tree is associated with operational cost such as GRAFT, GRAFT ACK and re-initiation of the data connection. More over, data packet may be lost or duplicated during a tree transition, and recovery process can be expensive for it incurs additional delay and data buffering at the application. Therefore, our goal of improving the performance of multicast tree is only on a long term basis and any potential path oscillations are prevented. The controller calculates the overall performance gain of the new multicast tree and switches tree only if the overall gain exceeds a threshold. Both the frequency and threshold of switching tree are user configurable parameters.