Selecting the right provider link for each transfer is crucial to realizing the performance benefits of multihoming from the enterprise network's perspective. The right choice of ISP clearly depends on the time-varying performance of each provider link to each destination being accessed. However, network performance can vary over short timescales, and on some occasions very drastically [4,22]. A multihomed enterprise therefore needs effective mechanisms to monitor performance to most, if not all, destinations over each of its provider links.
There are two further issues in monitoring performance over provider links: what to monitor and how. In the enterprise case, one would ideally like to monitor the performance from every possible content provider over each ISP link. However, this may be infeasible in the case of a large enterprise which accesses content from many different sources. A simple solution to this problem is to monitor only the most important destinations on the basis of the volume of requests made from the enterprise (e.g., the top 100 most frequently accessed destinations). This would ensure that a significant fraction of all flows will experience good performance.
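The volume-based selection described above can be sketched as follows. This is a minimal illustration, not the implementation used in this paper: the function name `select_monitored_destinations` and the representation of the request log as a flat list of destination names are assumptions made for the example.

```python
from collections import Counter


def select_monitored_destinations(request_log, top_n=100):
    """Pick the top_n most frequently requested destinations.

    request_log is assumed (hypothetically) to be an iterable with one
    destination name per request issued by the enterprise.
    Returns the selected destinations and the fraction of all requests
    that they account for.
    """
    counts = Counter(request_log)
    top = counts.most_common(top_n)  # sorted by descending request count
    selected = [dest for dest, _ in top]
    covered = sum(n for _, n in top)
    total = sum(counts.values())
    coverage = covered / total if total else 0.0
    return selected, coverage
```

The returned coverage value gives a rough indication of what fraction of requests go to monitored destinations, which may help in deciding how large the monitored set needs to be.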
For the second question (i.e., how to monitor), two common approaches are active monitoring and passive monitoring. In active monitoring, the multihomed enterprise performs out-of-band measurements of performance to or from the destinations chosen for monitoring. These measurements could be simple probes, for example pings using ICMP ECHO_REQUEST packets or TCP SYN packets sent to the destinations. The measurements are taken over each provider at regular intervals.
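One round of active probing might look like the sketch below. It rests on several assumptions that are not taken from this paper: each provider link is selected by binding the probe to a provider-specific local source address, the probe times a full TCP connection setup as an approximation of a SYN-based ping, and the names `tcp_connect_rtt` and `probe_round` are hypothetical.

```python
import socket
import time


def tcp_connect_rtt(dest, port=80, source_ip=None, timeout=2.0):
    """Time a TCP connection setup to dest; return seconds, or None on failure.

    Assumption: binding to source_ip causes the probe to leave over the
    provider associated with that address.
    """
    source = (source_ip, 0) if source_ip else None
    start = time.monotonic()
    try:
        with socket.create_connection((dest, port), timeout=timeout,
                                      source_address=source):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections, and name resolution errors.
        return None


def probe_round(destinations, provider_source_ips, probe=tcp_connect_rtt):
    """Probe every monitored destination once over every provider.

    provider_source_ips maps a provider name to the (assumed) local address
    used for that provider. Returns {(destination, provider): sample}.
    """
    results = {}
    for dest in destinations:
        for provider, source_ip in provider_source_ips.items():
            results[(dest, provider)] = probe(dest, source_ip=source_ip)
    return results
```

Calling `probe_round` once per monitoring interval yields one sample per destination and provider. The `probe` parameter is injectable so that a different measurement, such as an ICMP-based ping, could be substituted.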
On the other hand, passive measurement mechanisms rely on observing the performance of ongoing transfers (i.e., in-band) to destinations, and using these observations as samples for estimating performance over the given provider. However, in order to ensure that there are enough samples over all providers, it may be necessary to explicitly direct some transfers over particular links. We describe our implementations of both passive and active monitoring in detail in Section 3.
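The passive approach, including the need to steer some transfers toward under-sampled links, can be illustrated with the sketch below. It is not the implementation described in Section 3. The class name `PassiveMonitor`, the `min_samples` threshold, the fixed-size sample window, and the use of the mean transfer time as the performance estimate are all assumptions chosen for the example.

```python
from collections import defaultdict, deque


class PassiveMonitor:
    """Estimate per-provider performance from observed (in-band) transfers."""

    def __init__(self, providers, min_samples=3, window=10):
        self.providers = list(providers)
        # At least one sample is needed before a mean can be computed.
        self.min_samples = max(1, min_samples)
        # Keep only the most recent `window` samples per (destination, provider).
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, dest, provider, transfer_time):
        """Store the observed duration of a completed transfer."""
        self.samples[(dest, provider)].append(transfer_time)

    def choose_provider(self, dest):
        """Pick the provider for the next transfer to dest."""
        # Explicitly direct the transfer over any provider that still lacks
        # enough samples for this destination.
        for provider in self.providers:
            if len(self.samples[(dest, provider)]) < self.min_samples:
                return provider

        # Otherwise use the provider with the lowest mean observed time.
        def mean_time(provider):
            observed = self.samples[(dest, provider)]
            return sum(observed) / len(observed)

        return min(self.providers, key=mean_time)
```

In this sketch, sampling and selection share the same decision point: a transfer is sent over a possibly worse link only when that link's estimate is missing, which bounds the number of transfers spent on collecting samples.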
An important parameter in monitoring performance is the time interval between measurements. A long interval between performance samples implies using stale information to estimate provider performance, which might result in a suboptimal choice of provider link for a particular destination. Using smaller intervals would address this, but could have a negative impact as well. In active monitoring, frequent measurements inflate the out-of-band measurement traffic, causing additional bandwidth and processing overhead; some destinations might also interpret this traffic as a security threat. In passive monitoring, frequent sampling may cause too many connections to be directed over suboptimal providers in an attempt to obtain performance samples. As such, a careful choice of the interval size is crucial.
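To make the overhead side of this trade-off concrete for active monitoring, the following back-of-the-envelope sketch estimates outbound probe traffic as a function of the interval. The function name, the 64-byte probe size, and the counting of outbound probe packets only are assumptions for illustration, not measurements from this paper.

```python
def active_probe_overhead_bps(n_destinations, n_providers, interval_s,
                              probe_bytes=64):
    """Approximate outbound probe traffic in bits per second.

    Assumes one probe of probe_bytes (a hypothetical size) per destination
    per provider in every interval; replies are not counted.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    bits_per_round = n_destinations * n_providers * probe_bytes * 8
    return bits_per_round / interval_s
```

Under these assumptions, monitoring 100 destinations over 3 providers costs 2560 bit/s with a 60-second interval and 153600 bit/s with a 1-second interval. The overhead grows in inverse proportion to the interval, while the staleness of the samples grows with it.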