Recent work has observed that in some peer-to-peer networks, query inter-arrivals exhibit burstiness on several time scales [10], making the Pareto distribution a good candidate for modeling these inter-arrival times. Therefore, in this section we compare CUP with PCX under Pareto inter-arrivals.
The Pareto distribution has two parameters associated with it: the shape parameter and the scale parameter . The cumulative distribution function of inter-arrival time durations is This distribution is heavy-tailed with unbounded variance when . For , the average number of query arrivals per time unit is equal to . For , the expectation of an inter-arrival duration is unbounded and therefore the average number of query arrivals per time unit is 0.
We ran experiments for a range of and values but can only present representative results here. Table 3 compares CUP with PCX for equal to 1.25 and 1.1 respectively for a network of 1024 nodes. We set the value of in each run so that the average rate of arrivals equals 1, 10, 100, and 1000 queries per second to match the rate of the Poisson experiments in previous sections.
As decreases toward 1, query interarrivals become more bursty. Queries arrive in more frequent and more intense bursts, followed by idle periods of varying lengths. If an idle period occasionally falls in the heavy-tail portion of the Pareto distribution (i.e., it is a very long idle period), then second chance CUP propagation cost could be unrecoverable, since the next query may arrive long after the cached entry has expired. However, CUP does well under bursty conditions because when it is able to refresh a cache before a burst of queries, it saves a large penalty which by far outweighs any unrecovered overhead that occurs during the occasional, very long idle period. Therefore, refreshing the cache in time provides greater benefits with increasing burstiness. The table results confirm this. In going from to , we see that the average query latency reduction CUP achieves generally improves and the IR increases for all query rates.