In Section 3, we explored the idea of rejecting requests that could cause security problems or abuse system resources. Figure 17 shows a snapshot of the statistics on the reasons for rejecting requests. The three major reasons are clients exceeding the maximum rate, requests using methods other than GET, and requests with no Host field, which indicates a non-standard browser. Most of the time, these three account for more than 80% of the rejected traffic. The query count represents the number of bandwidth-capped CGI queries, which include the various malicious behaviors mentioned previously. Disallowed CONNECTs and POSTs indicate attempts to send spam through our system. CONNECTs alone constitute over 5% of all rejected requests on average, and sometimes as much as 30%. From this graph, we can get an idea of how many scavenging attempts are made through open proxies like CoDeeN.
We assume most of this traffic is generated automatically by
custom programs. We are now studying how to distinguish these
malicious programs from normal human users and innocuous
programs like web crawlers, in order to provide application-level
QoS based on client classification.
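
To make the rejection categories above concrete, the following is a minimal sketch of how a proxy might label each rejected request and tally the reasons, roughly as a figure like Figure 17 would. It is not CoDeeN's implementation: the `Request` type, the `MAX_REQUESTS_PER_WINDOW` threshold, the reason labels, and the order in which the checks are applied are all assumptions made for illustration, and the bandwidth-capped query count is not modeled.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional

# Hypothetical per-client limit; the paper does not state the actual value.
MAX_REQUESTS_PER_WINDOW = 100


@dataclass
class Request:
    """Illustrative request record (not a CoDeeN data structure)."""
    client: str
    method: str
    headers: Dict[str, str] = field(default_factory=dict)


def rejection_reason(req: Request, recent_count: int) -> Optional[str]:
    """Return a reason label if the request would be rejected, else None.

    The check order is an assumption; a real proxy may test in another order.
    """
    method = req.method.upper()
    if method == "CONNECT":
        return "connect"        # disallowed CONNECT (possible spam relay)
    if method == "POST":
        return "post"           # disallowed POST
    if method != "GET":
        return "other_method"   # any remaining non-GET method
    if not any(name.lower() == "host" for name in req.headers):
        return "no_host"        # missing Host field: non-standard client
    if recent_count > MAX_REQUESTS_PER_WINDOW:
        return "rate"           # client exceeded the maximum rate
    return None


def tally(requests: Iterable[Request], recent_counts: Dict[str, int]) -> Counter:
    """Count rejected requests by reason."""
    stats: Counter = Counter()
    for req in requests:
        reason = rejection_reason(req, recent_counts.get(req.client, 0))
        if reason is not None:
            stats[reason] += 1
    return stats
```

Dividing each entry of the returned counter by the total number of rejections gives per-reason shares of the kind quoted above, such as the fraction of rejected requests that are CONNECTs.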