For browsing, the web site is accessed in three different ways and we
categorize the browse accesses based on this usage: desktop,
offline, and wireless. Desktop accesses include requests from desktop
and laptop machines connected to the web site via wireline networks. Offline
accesses are generated on behalf of handheld devices such as PDAs. Companies such as
Avantgo and Vindigo offer services that let users select content from
different web sites and download it onto a handheld device for browsing at a
later time. The content download occurs when a user synchronizes his/her
handheld with a desktop machine and is controlled by a ``downloader'' program;
we refer to these programmatic accesses by the downloader as offline
accesses. Wireless accesses result from browse actions that users initiate
from their cell-phones or other wireless devices. Typically, a request from a
cell-phone is directed to a ``gateway'' (operated by the user's service
provider) that forwards the message to the web site; this gateway also
forwards the reply back to the cell-phone. Thus, from the web site's
perspective, it communicates only with the gateway machines, using
standard HTTP. Since one gateway can serve multiple clients, we do
not use IP addresses to identify users; instead, we use the unique identifier
that is assigned to every client and logged with each access.
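To illustrate why the per-client identifier matters, the following minimal sketch counts distinct users in two ways over a handful of made-up records. The record layout and the field names `ip` and `client_id` are assumptions for illustration only; the actual log format is not shown in this section.

```python
from typing import Iterable, NamedTuple


class Access(NamedTuple):
    """One logged access (hypothetical layout, not the real log schema)."""
    ip: str         # address seen by the web site; a gateway's for wireless
    client_id: str  # unique identifier assigned to the client


def count_users_by_ip(accesses: Iterable[Access]) -> int:
    """Count distinct source addresses (misleading behind a gateway)."""
    return len({a.ip for a in accesses})


def count_users_by_client_id(accesses: Iterable[Access]) -> int:
    """Count distinct client identifiers."""
    return len({a.client_id for a in accesses})


# Made-up sample: three clients share one gateway address, and one of
# them also appears behind a second gateway.
sample = [
    Access("10.0.0.1", "u1"),
    Access("10.0.0.1", "u2"),
    Access("10.0.0.1", "u3"),
    Access("10.0.0.2", "u3"),
]

print(count_users_by_ip(sample), count_users_by_client_id(sample))
```

In this sample, counting by address undercounts the clients behind the shared gateway and would also split one client across two addresses, which is the problem the identifier avoids.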
Table 1: User accesses according to browser types

Browser Type    No. of accesses    No. of users
Desktop               7,342,206         639,971
Wireless              2,210,758          58,432
Offline              20,508,272          50,968
Misc                  2,944,708           1,634
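As a rough aid to reading Table 1, the sketch below derives two quantities from its figures: the average number of accesses per user in each category, and each category's share of all accesses. The numbers are copied from the table; the function names are ours and purely illustrative. User counts are not summed across categories, since the table does not say whether a user can appear in more than one.

```python
# (accesses, users) per category, copied from Table 1.
TABLE_1 = {
    "Desktop": (7_342_206, 639_971),
    "Wireless": (2_210_758, 58_432),
    "Offline": (20_508_272, 50_968),
    "Misc": (2_944_708, 1_634),
}

total_accesses = sum(accesses for accesses, _ in TABLE_1.values())


def accesses_per_user(category: str) -> float:
    """Average number of accesses per unique user in a category."""
    accesses, users = TABLE_1[category]
    return accesses / users


def access_share(category: str) -> float:
    """Fraction of all logged accesses that fall in a category."""
    accesses, _ = TABLE_1[category]
    return accesses / total_accesses


for name in TABLE_1:
    print(f"{name:<9}{accesses_per_user(name):>10.1f}{access_share(name):>8.1%}")
```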
We determine the type of access based on the browser type stored in the log
entry corresponding to that access. For example, entries with browser type
``Mozilla Windows'', ``Avantgo'', and ``UP.Browser'' are categorized as desktop,
offline, and wireless accesses, respectively. In Table 1
we show the number of accesses according to the browser type (in our case,
each access corresponds to a single HTML page). The last category (Misc)
corresponds to log entries for which the browser type either was empty or
contained characters that could not be mapped to any known browser client. The
table also shows the number of unique users that were responsible for
different types of accesses. Note that the number of desktop users is much
higher than the number of offline or wireless users because a large
number of users use their desktop machines to register with the web site.
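The categorization described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: only the three browser-type strings quoted in the text are known, so the pattern list, the case-insensitive substring matching, and the `(client_id, browser_type)` entry layout are all assumptions.

```python
from collections import defaultdict

# Substring -> category. Only the three strings quoted in the text appear
# here; a real rule set would need many more patterns (assumption).
PATTERNS = [
    ("avantgo", "Offline"),
    ("up.browser", "Wireless"),
    ("mozilla windows", "Desktop"),
]


def classify(browser_type):
    """Map a logged browser-type string to an access category."""
    text = (browser_type or "").strip().lower()
    for needle, category in PATTERNS:
        if needle in text:
            return category
    # Empty or unrecognized browser types fall into the Misc category.
    return "Misc"


def summarize(entries):
    """Return {category: (accesses, unique users)}.

    `entries` is an iterable of (client_id, browser_type) pairs, a
    hypothetical stand-in for the real log records.
    """
    accesses = defaultdict(int)
    users = defaultdict(set)
    for client_id, browser_type in entries:
        category = classify(browser_type)
        accesses[category] += 1
        users[category].add(client_id)
    return {c: (accesses[c], len(users[c])) for c in accesses}
```

Users are keyed by the client identifier rather than by address, matching the way users are identified in the logs.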
In the case of notifications, the logs record a client type that identifies the type of each registered client. More than 99% of the messages were sent to
wireless clients; the remainder were sent to desktop clients.
Lili Qiu 2002-04-17