Detour paths provide overlay networks with improved performance and resilience.
Finding good detour routes with methods that scale to millions of nodes is a
challenging problem. We propose a novel approach for decentralised discovery of
detour paths based on the observation that Internet paths that traverse
overlapping sets of autonomous systems may benefit from the same detour
nodes. We show how nodes can learn about overlap between Internet paths at the
level of autonomous systems and demonstrate how they can exploit detours that
other nodes have already found. Our approach is to cluster paths based on the
extent to which the autonomous systems traversed overlap and gossip potential
detours among nodes. We find that our centralised path clustering algorithm
correctly classified over of potential latency detours in a
-node
dataset drawn from PlanetLab. In our decentralised version, we detected
of potentially available detours with each node sampling data from only
of other nodes.
Internet routing is not well suited to satisfy the diverse quality-of-service (QoS) requirements of modern distributed Internet applications. For example, a content distribution system may prefer Internet paths that have high available bandwidth, whereas an audio conferencing application may rather use paths with low latency. Internet routing decisions made by interior and exterior gateway protocols are shared across all applications with no scope for individual requirements.
In contrast to traditional client/server systems, modern distributed applications can construct overlay networks that give them the flexibility to use multi-hop routing paths that are different from direct Internet paths. Overlay networks abstract away the direct IP connections between the nodes that constitute these applications. Instead of sending data directly to a target IP address, such as a server in the traditional model, current applications often explicitly measure the network and guide packets according to their own internal requirements. This enables them to choose custom paths with desirable QoS properties. In previous work, this detour routing [10] approach has been shown to improve the reliability of Internet paths [1,3] and to reduce their latency [7], among other desirable properties. Such improvements are achieved by exploiting potential path diversity beyond the default Internet path when suboptimal performance is caused by hops other than the first or last on the default path [3].
An open issue is how applications can discover detour paths that improve a given QoS measure in a scalable and efficient manner. A common approach is to perform end-to-end measurements of a set of paths to reveal improvements caused by combining paths [10]. However, without either careful selection of which end-to-end measurements to take, or measuring almost all network paths, it is difficult to capture the full extent of what detour paths may be available. Fundamentally, discovering detour paths at the application layer remains a hard problem because applications have only limited insight into routing decisions made by lower Internet layers. The cost of a large number of end-to-end measurements may outweigh any potential benefit of detour routing.
Previous work on overlay detour routing often treated the network itself as a black box. Measurements of QoS parameters were between individual nodes, and the underlying physical and contractual location of these nodes was ignored. Participants in these types of wide-area distributed applications are, however, part of autonomous systems. Considering nodes as part of their larger environment -- their autonomous systems and the routes among them -- is more natural when considering network-level behaviour, such as detour paths.
This paper proposes a new approach for discovering detour paths that works at the level of autonomous system (AS) paths. By analysing detour paths at the AS level, we exploit that detours are frequently a consequence of the intrinsic behaviour of BGP routing, for example, caused by the fact that certain peering relationships are not advertised externally, or because BGP does not react effectively to link failure or congestion. We obtain Internet AS paths through traceroute measurements of direct paths and translate the obtained IP addresses to AS numbers. We then decompose the AS paths into pairwise AS links and use these links to construct fingerprints of clusters of direct Internet paths that share similar detour nodes. These fingerprints are a representation of direct paths with detours and are disseminated throughout the system. This enables us to reduce the number of detour nodes that need to be stored and to discover previously unknown detours by establishing their similarity to known paths.
The main contributions of this paper are two-fold. First, we show how a
hierarchical clustering algorithm can be used to construct fingerprints of sets
of direct Internet paths with and without detours. Our experiments indicate
that our approach correctly classifies of paths with detours and
of paths without detours on a
-node PlanetLab dataset.
of paths classified as having detours experience a reduction in latency by
using the suggested detour node.
Second, we show how this process can be decentralised to construct an overlay
network that an application can use to discover better detour paths on demand.
In the overlay network, nodes perform decentralised clustering of AS paths and
gossip fingerprint information among themselves to reduce measurement
overheads. The decentralised execution of the clustering algorithm finds most
of the potentially available detours (), while only requiring each node
to collect a small number of measurements from
of other nodes.
The rest of the paper is organised as follows. In the next section, we discuss work on detour routing. In Section 3, we present the centralised version of our clustering algorithm for fingerprint generation of Internet paths, which is followed by a description of an overlay network for detour routing in Section 4. Section 5 evaluates our approach in terms of detection accuracy and detour path improvement and we conclude in Section 7.
Savage et al. [10] presented the main idea behind detour
routing: that sets of specially-tuned routers could tunnel traffic in more
intelligent ways than standard IP.
Instead of routers themselves becoming more attuned to application
traffic, distributed applications themselves have taken on the role of
determining packet flows; Savage's ``routers'' are overlay nodes.
In Andersen's RON [1], overlay nodes actively measured network
characteristics to each other so that they could rapidly circumvent network
failures.
RON does not scale due to its inter-node measurements, but it confirms
the benefits of detour routing.
Subsequent research further explored why direct IP routing was often poorer than alternative routes. Gummadi et al. [3] found that, if a detour could improve the reliability of a route at all, detouring via only a single random hop was sufficient in almost all cases. Madhyastha et al. [8] built iPlane, which combined continuous network measurements from many vantage points to return information about network characteristics. Like the method we propose, their approach builds a view of the network topology.
Other research most similar to ours has examined how to select good detour routes. Su et al. [11] minimise the overhead of detour discovery by taking advantage of on-going measurments of commercial content distribution networks. The work by Lee et al. [5] uses heuristics to perform strategic measurements of detour paths that may yield higher bandwidth. The PeerWise system [7,6] uses network coordinates to discover latency detours. In contrast to our approach, all of these techniques make assumptions about the properties of path costs. For example, PeerWise assumes that the cost measure (latency) is embedable in a virtual coordinate system and therefore cannot support non-embedable measures, such as bandwidth.
![]() |
We let the direct path between two hosts
and
be the
routing path across multiple autonomous
systems
given by regular Internet
routing.
Figure 1 illustrates a direct path from
host
to
.
We let this path have a cost
, which could be, for example,
the one-way communication latency, bandwidth or loss experienced by packets.
Detour routing exploits the fact that most measures of Internet path costs,
including latency, do not form pure metric spaces and contain triangle
inequality violations (TIVs) [13]. Therefore a detour path
that relays traffic via another node
may exhibit a lower overall
cost. The inequality differs based on the cost metric:
for latency,
![]() |
(1) |
![]() |
(2) |
![]() |
(3) |
Previous work has shown that the bulk of improvement is obtained by relaying through a single detour node, with diminishing returns when additional detour nodes are included [7]. Therefore we focus on single-hop detours only. In what follows, we will refer to a path as a TIV path if there is a detour node such that the corresponding detour path exhibits a lower cost than the direct path, and a no-TIV path if no such detour exists.
In many cases, TIVs are the result of routing policies chosen by network operators of ASs. For example, an AS that is a customer of a transit AS will not advertise routes to the transit AS externally to avoid attracting chargeable third-party traffic. Traffic flowing through a detour node in the customer AS will be treated as local traffic and hence benefit from the transit arrangement. Another example is a transit AS, such as Internet2 in the US, that has a policy preventing it from carrying certain classes of traffic. Here a detour node can enable the external traffic of an application to be treated in the same way as local traffic, therefore leading to better end-to-end performance.
We propose an approach to discover detours at the AS-level graph. We believe
this to be valid as the majority of detour routes exist at the AS level.
For example, in the all-pairs traceroute dataset between PlanetLab nodes
described in Section 5,
of all existing detour paths
include at least one different AS from the original direct path. By working
with AS paths instead of IP paths, we drastically reduce the size of the
data. We also benefit from the fact that AS paths are relatively static
compared to IP paths. However, we assume that nodes within the same AS
behave similarly in terms of externally visible network measurements.
Path clustering. Our method is based on the rationale that end-to-end paths traversing similar AS links will benefit from the same sets of detour nodes. To share potentially useful detours, we group paths that are similar into clusters and share potential detours within each cluster. To group sets of similar paths, we examine their shared links: a link is a particular pair of ASs along a path and a shared link is a link that is shared by two or more paths, as illustrated in Figure 2. We use the number of shared links between paths to determine whether they should be combined into a cluster, and the number of shared links between clusters to determine whether they should be merged into a larger cluster. Because knowing that certain paths do not have good detours is also useful, we form both clusters of TIV paths, called TIV clusters and of no-TIV paths, called no-TIV clusters.
![]() |
We now give a formal description of our algorithm. Each pair of nodes and
is represented by the set of AS links in the direct path
. In
Figure 2, the path
is associated with the set
. Let
and
be two pairs
of nodes with respective AS paths
and
, where
and
, correspond to AS links. We define the similarity
ratio
between the two AS paths by
The main idea behind our algorithm consists of finding clusters of resemblant
paths based on their similarity ratio. To this end, we introduce a similarity
threshold and we let two paths
and
be similar if
.
In this case, these two paths are merged to form a cluster with
fingerprint
.
As shown in Figure 2, for
, the paths
and
have a similarity ratio larger that
and are thus merged into one
cluster
.
In addition, as node
is known to be a detour node for path
it will
be included in the list of detours for the resulting TIV cluster. In practice,
shared links in clusters will only be stored once to obtain a more
space-efficient representation.
To perform the clustering of AS path data, we use a variant of
bottom-up hierarchical clustering.
Each individual AS path is initially assigned its own cluster.
Then, we iteratively merge clusters that have a high similarity ratio. At each step,
we define the similarity ratio between two clusters
and
, where
and
are AS paths, by
When TIV clusters are merged, we also merge their lists of detour nodes, which are then sorted in decreasing order of their usage frequency in detours. We may envisage other schemes for maintaining the lists of detour nodes to ensure load balancing between detours as discussed in Section 6.
The similarity threshold accounts for the trade-off between having
many small clusters and a small number of very large clusters.
When , each cluster corresponds to an individual AS path; when
all the AS paths belong to the same cluster.
To discover a detour path for a given direct path , we first find the
cluster that is most similar to it: if it is a TIV cluster, we test the two
most frequently used detour nodes in the corresponding detour list, say
and
, and return the one that yields the lower cost among the detour paths
and
. Otherwise, if it is a no-TIV cluster, no detour can be
provided. We observed that considering more than two candidate detours for a
given TIV cluster resulted in a marginal improvement but the increase from one
to two was substantial.
In the decentralised version of our path clustering approach, a set of
nodes cooperate to form an overlay network that is used to discover detour
paths between any two nodes. Each node performs measurements between a small subset
of all nodes in the network and attempts to discover detour paths within this
set of nodes. Measured paths are clustered according to the approach described
above. However, only TIV paths are clustered -- paths for which no detours
were found may still have detours outside of the set of considered nodes. This
means that such paths cannot be classified reliability as no-TIV paths. TIV
clusters are then exchanged between nodes using a gossiping algorithm. This
spreads information about them and means that nodes can benefit from each
others' measurements. In more detail, the nodes operate in four phases:
1. Measurement phase. Each node picks a set of
2. TIV clustering phase. Each node clusters its measured TIV paths into TIV clusters with associated detours, as described above in the centralised case.
3. TIV cluster exchange phase. After constructing its TIV clusters, a node begins exchanging them with other nodes using gossiping. In each gossip round, a node receives a set of TIV clusters with detours from another node and merges them with its own TIV clusters. TIV clusters are merged greedily by combining the most similar clusters up to the similarity threshold
4. Detour discovery phase. To discover a detour path for a given direct path, a node finds the TIV cluster in the merged cluster set that is most similar to the given path and returns the ordered set of candidate detour nodes associated with this TIV cluster. For a direct path that does not have a detour (no-TIV path), none of the returned candidate detour nodes will provide a cost improvement and the application should use the direct path instead.
In the above, we assume that detours are static, implying that nodes do not need to re-measure paths. In an actual deployment, nodes could periodically re-execute the above four phases to update their knowledge about detours as routing paths and costs change over time. We leave an investigation of such dynamic behaviour for future work.
Our evaluation has two aims. First, we want to investigate the accuracy and quality of discovered detours using our clustering approach with complete knowledge of all measurements. Second, we want to examine how the detection efficiency decreases in a decentralised setting, when nodes do not possess complete measurements but only measure a subset of nodes and exchange information about clusters through gossipping.
Collected dataset. To carry out our experiments, we collected a dataset of all-pairs traceroute measurements between
To obtain AS paths from traceroute measurements, we followed Mao et al.'s
method [9] with the exception of using Team Cymru's
service for AS lookups [12]. Team Cymru actively collects AS
numbers from BGP tables at more than vantage points. If an IP address
could not be resolved by this service, we performed a whois lookup. Out
of
encountered IP addresses, we resolved
using the Team Cymru
service,
using whois lookups and
were reserved
addresses that are unroutable on the public Internet.
We collected a total of paths that successfully reached their
destination host, which is
of all possible paths.
of the
collected paths can be improved through a detour path and
of the paths
can be improved by at least
ms. Due to the homogeneous nature of the
academic PlanetLab network [2], we believe that the
number of detours and the scope for latency improvement on detour paths is
lower than for a more representative subset of the Internet.
Centralised detour discovery. In our first experiment, we applied our centralised clustering algorithm to our dataset and investigated the accuracy of detour discovery and the quality of discovered detours.
We perform centralised clustering using all available paths. The best detour
node is associated with each TIV path in a TIV cluster. We varied the
similarity threshold (described in Section 3) from
to
at
increments and empirically determined the best detection accuracy
to be at
. With this similarity threshold, we obtained
clusters in
total, with
of them being no-TIV clusters. The higher proportion of
no-TIV clusters compared to the proportion of no-TIV paths (
) can be
explained by the fact that no-TIV paths are more heterogeneous and thus less
amenable to clustering.
Table 1 shows the correctness of identifying paths with and
without detours. With knowledge of all measurements, our clustering approach
manages to correctly classify of all TIV paths by matching them to TIV
clusters and
by matching them to no-TIV clusters. The accuracy of
misclassification of no-TIV paths is higher because, as mentioned before,
no-TIV clusters are more diverse and therefore easier to misclassify.
![]() |
In Figure 3, we plot the distribution of detour
improvements as the fraction of detour path latency over direct path
latency. We compare a brute force search for the best detour (BRUTE
FORCE), with our clustering approach (PATH CLUSTERING) and a random
strategy that picks the best detour node out of random choices
(RANDOM-10). In our clustering approach, we used the two most frequently
featured detour nodes per TIV cluster and picked the better one.
As shown in the graph, PATH CLUSTERING performs significantly better than
the RANDOM strategy. The quality of discovered detours remains high --
our approach reduces the latency of of TIV paths as discovered by the
BRUTE FORCE strategy, while having to store fewer detour nodes overall:
BRUTE FORCE associates a detour node per TIV path (
), whereas the
centralised clustering approach stores two detour nodes per cluster (
in
total).
Decentralised detour discovery. In our second experiment, we simulated the decentralised execution of our path clustering approach, as described in Section 4. Each node took a set of measurements between
![]() |
Figure 4 shows the latency reduction when using our
decentralised path clustering (DPC) approach for and different
numbers of rounds (
) of cluster exchange. This is compared to the
best-possible latency reduction as discovered by BRUTE FORCE searching
the global dataset (
), and the detours found by the original latency
measurements between
nodes (MEASUREMENTS ONLY). The intersection of
each curve with the right hand y-axis represents the proportion of potentially
detourable routes found. With the original measurement set and no clustering,
we find around
of possible detours. Applying clustering to this dataset
(DPC,
) increases our detour detection rate to around
. Each
successive round of cluster exchange (
shown) improves detour
detection, but with diminishing returns for each round. All of the curves show
a fairly similar shape, suggesting the quality of detours found is consistent
with those found by BRUTE FORCE.
Our current implementation relies on clusters' fingerprints that incorporate
complete information on the AS links of paths belonging to clusters. To reduce
storage overhead, we plan to interpret the similarity between paths in graph
theoretical terms. Starting from all AS paths, we can construct a
similarity graph that consists of the set of nodes corresponding to the
AS paths discovered. The existence of a link between two paths and
depends on their similarity ratio
. This perspective opens up
various avenues for future work to refine our approach both for cluster
identification and compact fingerprint construction [4].
We will also implement more elaborate schemes for detour selection that ensure load balancing between detour paths. To this end, we plan to construct detour lists that provide some diversity, as measured by our similarity ratio given by Eq. (4), in the detour paths suggested by our scheme in order to avoid overwhelming detour nodes and detour paths.
We are currently working on an implementation of our approach for detour discovery as an overlay network on PlanetLab. We want to investigate the benefits of detour routing when considering actual communication patterns used between nodes in a distributed application and study the impact of churn of detour nodes on performance. A PlanetLab deployment will also enable us to compare our approach in terms of detouring benefits and incurred costs to other systems, such as PeerWise [6] and iPlane [8].
We have presented a novel technique for discovering detour nodes to improve the end-to-end performance provided by standard Internet routing. Unlike previous work where the network is treated as a black box, we explore the AS path description of routes to enable collaborative discovery of detour nodes. To this end, we use a variant of a hierarchical clustering algorithm to partition paths into TIV and no-TIV clusters and exploit this partitioning to associate paths with detour nodes based on the similarity of their AS links. We evaluate the latency reduction when using discovered detours. Although our evaluation focused on latency, we believe that our detour mechanism can be applied to other measures, such as bandwidth and packet loss.
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.71)
Copyright © 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999,
Ross Moore,
Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -split 0 -show_section_numbers -local_icons -no_navigation -prefix detour-iptps09 -dir detour-iptps09 iptps09.tex
The translation was initiated by Peter Pietzuch on 2009-03-24