Overview



Since the goal of our work is to study the geographic properties of Internet routing, much of our measurement work has focused on gathering network path data using the traceroute tool [8]. We are not interested in studying the dynamic properties of Internet routing (e.g., how routes change over time), so we record only a single snapshot of the network path between a given pair of hosts. It is possible that some of the routes in our dataset are backup paths that were in use because of failures at the time of our measurement. However, we do not expect the aggregate statistics reported in this paper to be affected by such failures since our measurements were spread over a month time period.

We use traceroute to determine the network path between 20 traceroute sources and thousands of geographically distributed destination hosts. Once we have gathered the traceroute data, we use the GeoTrack tool to determine the location of the nodes along each network path where possible. GeoTrack reports the location at the granularity of a city. We then use an on-line latitude-longitude server [18] to compute the geographic distance between the source and destination of a traceroute as well as between each pair of adjacent routers along the path.

The latter enables us to compute the linearized distance, which we define as the sum of the geographic distances between successive pairs of routers along the path. So if the path between A and D passes through B and C, then the linearized distance of the path from A to D is the sum of the geographic distances between A & B, B & C, and C & D. As we discuss in Section 3.4.1, we are typically able to determine the location of most but not all routers. We simply skip the routers whose locations we are unable to determine. So in the above example, if the location of C is unknown, then we compute the linearized distance of the path from A to D as the sum of the geographic distances between A & B and B & D. Clearly, skipping over C can lead us to underestimate the linearized distance, since the direct distance between B and D is never greater than the distance from B to D via C. However, as noted in Section 3.4.1, most of the skipped nodes are in the vicinity of either the source or the destination, so the error introduced in the linearized distance computation is small.
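The linearized distance computation described above, including the rule for skipping routers whose location is unknown, can be illustrated with a minimal Python sketch. This is not the code used in our measurements: it assumes that the geographic distance between two cities is the great-circle distance computed with the haversine formula on a spherical Earth, as a stand-in for the on-line latitude-longitude server [18]. The function names and the coordinates in the usage example are hypothetical and chosen only for illustration.

```python
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(a: Coord, b: Coord) -> float:
    """Haversine great-circle distance between two points, in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def linearized_distance_km(path: Sequence[Optional[Coord]]) -> float:
    """Sum the distances between successive located hops.

    Hops whose location is unknown are given as None and are skipped,
    so the hop before and the hop after them become adjacent.
    """
    located = [hop for hop in path if hop is not None]
    return sum(great_circle_km(p, q) for p, q in zip(located, located[1:]))


# Hypothetical city-level locations for a path A -> B -> C -> D.
A, B, C, D = (47.61, -122.33), (37.87, -122.27), (41.88, -87.63), (40.71, -74.01)

full = linearized_distance_km([A, B, C, D])        # A-B + B-C + C-D
skipped = linearized_distance_km([A, B, None, D])  # A-B + B-D, C unknown
end_to_end = great_circle_km(A, D)                 # source-to-destination distance
```

In this sketch `skipped` can never exceed `full`, which reflects the underestimate discussed above, and neither value can fall below `end_to_end`.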

**Figure 1:** Locations of our traceroute sources in the U.S. Note that there were 17 hosts in 15 locations (two hosts each in Seattle and Berkeley).


Lakshminarayanan Subramanian 2002-04-14