Next: Description of Data Logs Up: Data Characteristics Previous: Data Characteristics

Types of Accesses

The web site is browsed in three different ways, and we categorize browse accesses accordingly: desktop, offline, and wireless. Desktop accesses are requests from desktop and laptop machines connected to the web site via wireline networks. Offline accesses are generated on behalf of handheld devices such as PDAs. Companies such as Avantgo and Vindigo offer services that let users select content from different web sites and download it onto a handheld device for later browsing. The download occurs when a user synchronizes the handheld with a desktop machine and is controlled by a "downloader" program; we refer to these programmatic accesses by the downloader as offline accesses. Wireless accesses result from browse actions initiated by users from their cell phones or other wireless devices. Typically, a request from a cell phone is directed to a "gateway" (operated by the user's service provider) that forwards the request to the web site and relays the reply back to the cell phone. Thus, from the web site's perspective, it communicates only with the gateway machines, using standard HTTP. Since one gateway can serve multiple clients, we do not use IP addresses to identify users; instead, we use a unique identifier that is assigned to every client and logged with each access.
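To illustrate why IP addresses are unsuitable for identifying users behind a shared gateway, the following minimal sketch counts distinct users both ways. The record layout, identifiers, and addresses are hypothetical and chosen only for illustration; they are not the format of the logs studied here.

```python
# Hypothetical log records: (client_id, source_ip). The layout and values are
# illustrative assumptions, not the actual log format.
accesses = [
    ("user-a", "10.0.0.1"),   # two phones behind the same gateway address
    ("user-b", "10.0.0.1"),
    ("user-b", "10.0.0.1"),   # a repeat access by the same client
    ("user-c", "192.0.2.7"),  # a desktop client with its own address
]


def count_unique(records, field):
    """Count distinct values of one field (0 = client_id, 1 = source_ip)."""
    return len({record[field] for record in records})


users_by_id = count_unique(accesses, 0)  # distinct client identifiers
users_by_ip = count_unique(accesses, 1)  # distinct source addresses
print(users_by_id, users_by_ip)
```

In this toy input, counting by source address merges the two clients that share a gateway, so the per-client identifier yields the larger count.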
 
Table 1: User accesses according to browser types
Browser Type    No. of accesses    No. of users
Desktop               7,342,206         639,971
Wireless              2,210,758          58,432
Offline              20,508,272          50,968
Misc                  2,944,708           1,634


 

We determine the type of each access from the browser type stored in the corresponding log entry. For example, entries with browser type "Mozilla Windows", "Avantgo", and "UP.Browser" are categorized as desktop, offline, and wireless accesses, respectively. Table 1 shows the number of accesses for each browser type (in our case, each access corresponds to a single HTML page). The last category (Misc) corresponds to log entries whose browser type was either empty or contained characters that could not be mapped to any known browser client. The table also shows the number of unique users responsible for each type of access. Note that the number of desktop users is much higher than the number of offline and wireless users because a large number of users register with the web site from their desktop machines. For notifications, the logs record a client type that tells us the type of each registered client. More than 99% of the messages were sent to wireless clients; the remainder were sent to desktop clients.
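The categorization described above can be sketched as a substring lookup on the browser-type field, followed by a tally of accesses and unique users per category. This is a minimal sketch under stated assumptions: only the three example strings come from the text, while the substring-matching rule, the `(client_id, browser_type)` record layout, and the sample entries are hypothetical.

```python
from collections import defaultdict

# Marker substring -> category. Only these three example strings appear in
# the text; a real mapping would need many more entries (assumption).
BROWSER_CATEGORIES = [
    ("Avantgo", "Offline"),
    ("UP.Browser", "Wireless"),
    ("Mozilla Windows", "Desktop"),
]


def classify(browser_type):
    """Map a browser-type string to a category; unknown or empty -> Misc."""
    text = browser_type or ""
    for marker, category in BROWSER_CATEGORIES:
        if marker in text:
            return category
    return "Misc"


def summarize(entries):
    """Return {category: (number of accesses, number of unique users)}."""
    accesses = defaultdict(int)
    users = defaultdict(set)
    for client_id, browser_type in entries:
        category = classify(browser_type)
        accesses[category] += 1
        users[category].add(client_id)  # users are keyed by client identifier
    return {c: (accesses[c], len(users[c])) for c in accesses}


# Hypothetical log entries: (client_id, browser_type).
sample = [
    ("u1", "Mozilla Windows"),
    ("u1", "Mozilla Windows"),
    ("u2", "Avantgo"),
    ("u3", "UP.Browser"),
    ("u4", ""),
]
summary = summarize(sample)
```

Each entry counts as one access, matching the convention that an access corresponds to a single HTML page. A user who appears under several browser types is counted once in each of those categories.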
Lili Qiu
2002-04-17