We describe the workloads we studied up front, so that the subsequent discussion can draw on concrete examples.
The first of our traces is the one collected by Douglis and Killian and used in their study [DK99]. This trace consists of the activity over a week in May 1998, polled every 30 seconds, of 100 users of a modem pool maintained by AT&T Labs. The modems in the pool served as the uplink of an asymmetric connection (the downlink was a cable connection). The setup in which this trace was collected is interesting because it uses static IP addresses. This makes disconnects less undesirable: all TCP connections can stay open until the next modem connection. Thus, the cost of disconnecting a user is primarily the delay of reconnection. As a consequence, the modem pool itself has an aggressive 15-minute idle timeout (which would probably be too short for general ISPs). A few users have no inactivity timeout (at their request) and can stay connected indefinitely. An interesting characteristic of this setup is that all users have a dedicated phone line. A direct consequence is that users cannot be denied service, but also that they have less of an incentive to disconnect explicitly. In fact, the trace we studied contains no disconnection information. Douglis and Killian report that 15 users appeared active at all times. (This number actually depends on what is defined as ``at all times'': for a higher inactivity tolerance, 19 users appear active at all times.)
Our other two traces are from Telesys, the Internet Service Provider for the University of Texas community. In September 1998, Telesys was the largest ISP in the Austin, Texas metropolitan area (we are not aware of more recent data) with over 3,000 modems, serving over 34,000 subscribers [Aca98]. (The total number of subscribers seems to be artificially high: unusually many subscribers did not use their accounts at all during the two periods of our study. The inflation in the number of subscribers is probably due to the low cost of Telesys for its restricted user base: many Telesys users can afford to subscribe even though they rarely use the service.) Our Telesys traces contain explicit disconnect information, and users are allowed to stay connected indefinitely. (Telesys does not disconnect idle users despite a stated policy of a 2-hour idle timeout.) Telesys has not had busy signals due to limited capacity in the past few years [Aca98], hence no users were denied service during our trace collection.
We collected the two traces of Telesys activity by repeatedly polling the modem servers at two-minute intervals. The traces are from periods of significantly different activity. The first was collected in a period of low activity (June 26 to July 6, 1999, which includes the 4th of July holiday), with 18,086 distinct users accessing the system in this time period. The maximum number of simultaneously connected users was 2,151, and 5 users were connected at all times. In all, there were over 315,000 connections (i.e., instances of users connecting and disconnecting). The second trace was collected in a period of high activity (November 1 to 8, 1999). In that time, 21,221 distinct users accessed the system, with a maximum of 3,024 users connected simultaneously. There were over 347,000 connections in total, and 11 users stayed connected throughout the week-long tracing period. The number of connected users as a function of time for the Telesys traces is plotted in Figure 1.
Figure 1: Number of connected users as a function of time. The first plot shows three ``low usage'' days because of the 4th of July holiday.
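As an illustration only, summary statistics of the kind reported for the Telesys traces (distinct users, maximum simultaneous connections, users connected at all times, and number of connections) could be derived from polled data roughly as follows. This is a minimal sketch, not the tooling used for the study: it assumes that each poll yields the set of user identifiers connected at that instant, and the function name `trace_stats` and the dictionary keys are hypothetical. Counting a connection whenever a user appears in a poll but not in the previous one is also an assumption; sessions shorter than the polling interval would be missed.

```python
from typing import Dict, List, Set


def trace_stats(snapshots: List[Set[str]]) -> Dict[str, int]:
    """Summarize a polled trace.

    Each element of `snapshots` is the set of user ids observed as
    connected at one poll, in chronological order (assumed data layout).
    """
    if not snapshots:
        return {"distinct_users": 0, "max_simultaneous": 0,
                "always_connected": 0, "connections": 0}

    # Every user seen in at least one poll.
    distinct = set().union(*snapshots)
    # Largest number of users seen in a single poll.
    max_simultaneous = max(len(s) for s in snapshots)
    # Users present in every poll of the trace.
    always = set.intersection(*snapshots)

    # Approximate connection count: a user who is present now but was
    # absent at the previous poll has started a new connection.
    connections = 0
    previous: Set[str] = set()
    for current in snapshots:
        connections += len(current - previous)
        previous = current

    return {"distinct_users": len(distinct),
            "max_simultaneous": max_simultaneous,
            "always_connected": len(always),
            "connections": connections}


if __name__ == "__main__":
    polls = [{"a", "b"}, {"a"}, {"a", "b", "c"}]
    print(trace_stats(polls))
```

In this toy input, user `a` is present in every poll and user `b` reconnects once, so `b` contributes two connections to the total.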
An interesting issue concerns users who stay permanently connected (termed ``workaholics'' in the Douglis and Killian study, with the term originating in Barbará and Imielinski [BI94]). If ``workaholics'' would be connected regardless of the disconnection policy (e.g., due to periodic tasks using the network), they should be included in the study. If they are active only because they try to ``beat'' the disconnection policy of the tracing environment, they should be excluded. This issue is significant only for our first trace (from AT&T Labs) because of the idle timeout that was in place when the trace was collected, and because of the large percentage of ``workaholics''. In the Douglis and Killian study, ``workaholics'' were excluded from the experiments. The reason was that, after interviews with individual users, Douglis and Killian concluded that their constant activity was due to programs used as a response to the 15-minute idle timeout. (That is, these users ran periodic tasks to simulate activity and avoid automatic disconnections.) Similarly, we excluded ``workaholics'' from our first trace. Nevertheless, we have also performed all experiments with the full trace and concluded that the inclusion of ``workaholics'' does not fundamentally affect our results or our conclusions (absolute counts may change but relative differences are practically the same).
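To make the dependence on the inactivity tolerance concrete, the following sketch shows one possible way to identify users who appear active ``at all times'' and to exclude them from a trace. It is an assumption-laden illustration, not the definition used by Douglis and Killian or in our experiments: the trace is assumed to map each user to the timestamps (in seconds) of the polls at which that user was active, and a user is treated as always active if no gap between consecutive active polls, or between the trace boundaries and the nearest active poll, exceeds the tolerance. The names `always_active_users` and `exclude_users` are hypothetical.

```python
from typing import Dict, List, Set


def always_active_users(active_polls: Dict[str, List[int]],
                        trace_start: int,
                        trace_end: int,
                        tolerance: int) -> Set[str]:
    """Return users whose longest inactive gap is at most `tolerance` seconds."""
    result: Set[str] = set()
    for user, times in active_polls.items():
        # Include the trace boundaries so that inactivity at the very
        # start or end of the trace also counts as a gap.
        points = [trace_start] + sorted(times) + [trace_end]
        longest_gap = max(b - a for a, b in zip(points, points[1:]))
        if longest_gap <= tolerance:
            result.add(user)
    return result


def exclude_users(active_polls: Dict[str, List[int]],
                  excluded: Set[str]) -> Dict[str, List[int]]:
    """Return a copy of the trace without the given users."""
    return {u: t for u, t in active_polls.items() if u not in excluded}


if __name__ == "__main__":
    trace = {"u1": [0, 30, 60, 90], "u2": [0, 90], "u3": [0, 30]}
    strict = always_active_users(trace, 0, 90, tolerance=30)
    loose = always_active_users(trace, 0, 90, tolerance=60)
    print(sorted(strict), sorted(loose))
    print(sorted(exclude_users(trace, strict)))
```

Raising the tolerance can only enlarge the set of users classified as always active, which matches the direction of the difference between the 15 and 19 users noted for the AT&T Labs trace.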