Analysis results
We conducted our measurement study for about three months, with nineteen
participating Web sites, as described in
Table 2. We classify these sites into two categories:
commercial (sites 1-3) and educational (sites 4-19). As
we show in Section 3.1, the client and local DNS
associations observed at these two categories of sites have very different
characteristics.
For ease of discussion, we use LDNS to denote a local DNS server.
A total of 4,253,157 unique client-LDNS associations were collected.
Table 3 presents
statistics from the DNS server and redirector logs for all
sites.
Table 2: Participating sites in the study
Site | Type                                | # of 1-pixel image hits | Duration
1    | att.com                             | 20,816,927              | 2 months
2,3  | Personal pages (commercial domain)  | 1,743                   | 3 months
4    | Research lab                        | 212,814                 | 3 months
5-7  | University sites                    | 4,367,076               | 3 months
8-19 | Personal pages (university domain)  | 26,563                  | 3 months
Table 3: DNS and HTTP log statistics for all sites
Type                                                                     | Count
Client-LDNS associations                                                 | 4,253,157
HTTP requests                                                            | 25,425,123
Unique client IPs                                                        | 3,234,449
Unique LDNS IPs                                                          | 157,633
Client-LDNS associations where client and LDNS have the same IP address  | 56,086
To study the proximity between the client and its local DNS server, we
use the following four metrics.
- AS clustering. Autonomous System (AS) clustering
observes whether a client is in the same AS as its local DNS server. An AS is
a set of networks under a single administrative control. A single AS might
cover an entire backbone or a large corporation spanning multiple
continents. Therefore, AS clustering is the most coarse-grained
metric we use.
- Network clustering. This metric observes whether a client is in
the same network-aware cluster (NAC) as its local DNS server,
where network clusters are identified by the network-aware
clustering technique [13] using prefix entries from a wide set of
BGP routing table snapshots.
Longest prefix matching maps each client to a network cluster,
which is identified by a network prefix.
All the clients within a network cluster are topologically
close together and, with high probability, belong to the same
administrative domain. Validation tests (in [13]) using nslookup and
traceroute show that the accuracy of network
clustering is above 90% across all the Web logs from the study by
Krishnamurthy and Wang.
Network clustering is much more fine-grained than AS
clustering [12].
For both AS and
network clustering, BGP prefixes and the association of IP CIDR blocks
to ASes were extracted from an extensive set of BGP tables collected
on May 27, 2001 from the sources listed by Krishnamurthy and
Wang [13] and Telstra Internet [5]. Together these tables contain
more than 440,000 unique routing entries.
- Traceroute divergence. This metric, used previously in [15],
is based on the length of divergent paths to the client and its local DNS server
from a probe point using traceroute. It is defined to be the
maximum number of disjoint network hops from a probe location to the
client and its LDNS.
- Round-trip time correlation. This metric,
used previously in both [15] and [4], refers to examining
the correlation between the message round-trip times from a probe point to
the client and its local DNS server.
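The two passive metrics above can be illustrated with a minimal sketch. It assumes a routing table given as (prefix, origin AS) pairs; the prefixes and AS numbers below are made-up placeholders (private address space and documentation AS numbers), not entries from the BGP tables used in the study, and the function names are hypothetical. A linear scan stands in for the trie a real implementation would likely use.

```python
import ipaddress

def build_table(entries):
    """Parse (prefix string, origin AS number) pairs into network objects."""
    return [(ipaddress.ip_network(prefix), asn) for prefix, asn in entries]

def longest_prefix_match(table, ip):
    """Return the (network, asn) entry with the longest prefix covering ip,
    or None if no entry covers it."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in table:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best

def clustering(table, client_ip, ldns_ip):
    """Return (same_network_cluster, same_as) for one client-LDNS association,
    or None if either address is not covered by any routing entry."""
    client = longest_prefix_match(table, client_ip)
    ldns = longest_prefix_match(table, ldns_ip)
    if client is None or ldns is None:
        return None
    return (client[0] == ldns[0], client[1] == ldns[1])

# Hypothetical routing entries, for illustration only.
TABLE = build_table([
    ("10.0.0.0/8", 64500),
    ("10.1.0.0/16", 64500),
    ("10.2.0.0/16", 64501),
])

print(clustering(TABLE, "10.1.2.3", "10.1.200.1"))  # same cluster, same AS
print(clustering(TABLE, "10.1.2.3", "10.3.0.1"))    # different cluster, same AS
print(clustering(TABLE, "10.1.2.3", "10.2.0.1"))    # different cluster and AS
```

The second case shows why AS clustering is the coarser metric: two addresses can share an origin AS while falling under different longest-matching prefixes.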
AS clustering, network clustering, and traceroute divergence are
topology-oriented metrics, while
round-trip time correlation is a performance-oriented metric. AS and
network clustering are passive, requiring no active probing. The other
two metrics rely on active probing and depend heavily on the probe
locations. To obtain an
exhaustive evaluation of proximity, we include all four metrics in our
study.
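The two probe-based metrics can be sketched in the same spirit. The code below is one reading of the definitions given above, not the exact procedure of [15] or [4]. It assumes each traceroute is available as a list of hop identifiers starting at the probe point, takes divergence to be the longer of the two path remainders after the last shared hop, and uses the Pearson coefficient as the correlation measure. The hop names and function names are hypothetical.

```python
import math

def traceroute_divergence(path_to_client, path_to_ldns):
    """Count the hops on each path after the shared leading segment and
    return the larger of the two counts."""
    common = 0
    for hop_c, hop_l in zip(path_to_client, path_to_ldns):
        if hop_c != hop_l:
            break
        common += 1
    return max(len(path_to_client) - common, len(path_to_ldns) - common)

def rtt_correlation(client_rtts, ldns_rtts):
    """Pearson correlation between paired RTT samples, where sample i holds
    the RTT to a client and the RTT to its LDNS from the same probe point."""
    n = len(client_rtts)
    if n != len(ldns_rtts) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_c = sum(client_rtts) / n
    mean_l = sum(ldns_rtts) / n
    cov = sum((c - mean_c) * (l - mean_l)
              for c, l in zip(client_rtts, ldns_rtts))
    var_c = sum((c - mean_c) ** 2 for c in client_rtts)
    var_l = sum((l - mean_l) ** 2 for l in ldns_rtts)
    if var_c == 0 or var_l == 0:
        raise ValueError("correlation is undefined for constant samples")
    return cov / math.sqrt(var_c * var_l)

# Hypothetical hop lists from one probe point, for illustration only.
to_client = ["probe", "r1", "r2", "c1", "client"]
to_ldns = ["probe", "r1", "r2", "d1", "d2", "ldns"]
print(traceroute_divergence(to_client, to_ldns))
```

Because both functions take measurements from a single probe point, running them from several probe locations and comparing the results is one way to see how strongly these metrics depend on where the probes are placed.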