Since the goal of our work is to study the geographic properties of
Internet routing, much of our measurement work has focused on
gathering network path data using the traceroute
tool [8]. We are not interested in studying the dynamic
properties of Internet routing (e.g., how routes change over time), so
we only record a single snapshot of the network path between a given
pair of hosts. It is possible that some of the routes in our dataset
are backup paths due to failures at the time of our
measurement. However, we do not expect the aggregate statistics
reported in this paper to be affected by such failures since our
measurements were spread over a one-month period. We use
traceroute to determine the network path between 20 traceroute sources
and thousands of geographically distributed destination hosts.
Once we have gathered the traceroute data, we use the GeoTrack tool to
determine the location of the nodes along each network path where
possible. GeoTrack reports the location at the granularity of a
city. We then use an on-line latitude-longitude server [18] to
compute the geographic distance between the source and destination of
a traceroute as well as between each pair of adjacent routers along
the path. The latter enables us to compute the linearized
distance, which we define as the sum of the geographic distances
between successive pairs of routers along the path. So if the path
between A and D passes through B and C, then the linearized distance
of the path from A to D is the sum of the geographic distances between
A & B, B & C, and C & D.
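
As an illustration of the definition above, the following is a minimal sketch of the linearized distance computation. It assumes that each router has already been mapped to a (latitude, longitude) pair in degrees and that geographic distance is approximated by the haversine great-circle formula on a spherical Earth; the measurements described here obtain distances from an on-line latitude-longitude server instead, so the function names and the Earth radius constant are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def linearized_distance_km(path):
    """Sum of distances between successive locations along the path.

    `path` is an ordered list of (lat, lon) pairs: source, routers, destination.
    """
    return sum(great_circle_km(a, b) for a, b in zip(path, path[1:]))


# Hypothetical coordinates for a path A -> B -> C -> D.
path = [(0.0, 0.0), (10.0, 5.0), (5.0, 15.0), (0.0, 20.0)]
end_to_end = great_circle_km(path[0], path[-1])
linearized = linearized_distance_km(path)
```

Because each hop can only add distance relative to the direct route, `linearized` is never smaller than `end_to_end` (up to floating-point error); the two are equal only when every intermediate router lies on the great-circle route between the endpoints.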
As we discuss in Section 3.4.1, we are typically
able to determine the location of most but not all routers. We simply
skip the routers whose locations we are unable to determine. So in
the above example, if the location of C is unknown, then we compute
the linearized distance of the path from A to D as the sum of the
geographic distances between A & B and B & D. Clearly, skipping over
C would lead us to underestimate the linearized distance. However, as
noted in Section 3.4.1, most of the skipped nodes
are in the vicinity of either the source or the destination, so
the error introduced in the linearized distance computation is small.
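
The skipping rule can be sketched as follows. This is an illustrative sketch rather than the actual tooling: unknown locations are represented by `None`, the function name is hypothetical, and a planar Euclidean distance (`math.dist`) stands in for geographic distance so that the arithmetic is easy to check by hand.

```python
import math


def linearized_distance_skipping_unknown(path, distance=math.dist):
    """Linearized distance over the routers whose locations are known.

    `path` is an ordered list of locations, with None marking a router
    that could not be located. Unknown routers are dropped, so their
    neighbours on the path are treated as adjacent.
    """
    known = [loc for loc in path if loc is not None]
    return sum(distance(a, b) for a, b in zip(known, known[1:]))


# Hypothetical planar coordinates for the path A -> B -> C -> D.
A, B, C, D = (0, 0), (3, 0), (3, 4), (0, 4)

full = linearized_distance_skipping_unknown([A, B, C, D])        # 3 + 4 + 3
skipped = linearized_distance_skipping_unknown([A, B, None, D])  # 3 + 5
```

Here `full` is 10.0 and `skipped` is 8.0: replacing the hops B to C and C to D with the single hop B to D can only shorten the sum, by the triangle inequality, which is why skipping a router yields an underestimate.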
Figure 1: Locations of our traceroute sources in the U.S. Note that there were 17 hosts in 15 locations (two hosts each in Seattle and Berkeley).
Lakshminarayanan Subramanian
2002-04-14