Servers based on clusters of workstations or PCs are the most popular hardware platform used to meet the ever-growing traffic demands placed on the World Wide Web. This hardware platform combines a cutting-edge price/performance ratio in the individual server nodes with the promise of perfect scalability as additional server nodes are added to meet increasing demands. However, in order to actually attain scalable performance, it is essential that scalable mechanisms and policies for request distribution and load balancing be employed. In this paper, we present a novel, scalable mechanism for content-aware request distribution in cluster-based Web servers.
State-of-the-art cluster-based servers employ a specialized front-end node, which acts as a single point of contact for clients and distributes requests to back-end nodes in the cluster. Typically, the front-end distributes requests such that the load among the back-end nodes remains balanced. With content-aware request distribution, the front-end additionally takes into account the content or type of service requested when deciding which back-end node should handle a given request.
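The distinction between purely load-based and content-aware distribution can be illustrated with a small sketch. This is not the paper's mechanism, only a hypothetical front-end dispatch routine; the names (`BACKENDS`, `pick_backend_*`) and the policy of hash-partitioning static content while load-balancing dynamic requests are illustrative assumptions.

```python
# Hypothetical front-end dispatch sketch (names and policy are illustrative).
BACKENDS = ["node1", "node2", "node3"]
load = {n: 0 for n in BACKENDS}  # outstanding requests per back-end node


def pick_backend_load_only():
    # Pure load balancing: pick the least-loaded node; content is ignored.
    return min(BACKENDS, key=lambda n: load[n])


def pick_backend_content_aware(path):
    # Content-aware: static URLs are partitioned by hash, so each node
    # serves (and caches) a disjoint subset of the content; requests for
    # dynamic content fall back to load-only balancing.
    if path.startswith("/cgi-bin/"):
        return pick_backend_load_only()
    return BACKENDS[hash(path) % len(BACKENDS)]
```

The point of the content-aware variant is that repeated requests for the same URL land on the same node, which is what makes cache-affinity and database-partitioning strategies possible.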
Content-aware request distribution can improve scalability and flexibility by enabling the use of a partitioned server database and specialized server nodes. Moreover, previous work [6,26,31] has shown that by distributing requests based on cache affinity, content-aware request distribution can afford significant performance improvements compared to strategies that only take load into account.
While the use of a centralized request distribution strategy on the front-end affords simplicity, it can limit the scalability of the cluster. Very fast layer-4 switches are available that can act as front-ends for clusters whose request distribution strategies do not consider the requested content [11,21]. The hardware-based switch fabric of these layer-4 switches can scale to very large clusters. Unfortunately, layer-4 switches cannot efficiently support content-aware request distribution, because the latter requires layer-7 (HTTP) processing on the front-end to determine the requested content. Conventional PC/workstation-based front-ends, on the other hand, can only scale to a relatively small number of server nodes (fewer than ten on typical Web workloads [26]).
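The layer-7 processing referred to above amounts to accepting the TCP connection and parsing the HTTP request line before any distribution decision can be made; a layer-4 switch, which routes on IP addresses and ports alone, never sees this information. A minimal sketch of that parsing step (the function name is our own):

```python
def requested_path(request_bytes):
    # Layer-7 (HTTP) processing: extract the URL from the request line,
    # e.g. b"GET /index.html HTTP/1.0\r\n...". Only after this parse can
    # a front-end make a content-aware distribution decision.
    request_line = request_bytes.split(b"\r\n", 1)[0]
    method, path, version = request_line.split(b" ")
    return path.decode()
```

Doing this parse in software on a single PC/workstation front-end is what caps its throughput, and hence the cluster sizes it can serve.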
In this paper, we investigate scalable mechanisms for content-aware request distribution. We propose a cluster architecture that decouples the request distribution strategy from the distributor, which is the front-end component that interfaces with the client and that distributes requests to back-ends. This has several advantages: (1) it dramatically improves the scalability of the system by enabling multiple distributor components to co-exist in the cluster, (2) it improves cluster utilization by enabling the distributors to reside in the back-end nodes, and (3) it retains the simplicity afforded by a centralized request distribution strategy.
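The decoupling described in points (1)-(3) can be sketched in miniature: many client-facing distributors share one strategy component that makes the actual assignment. This is an assumed illustration of the structure, not the paper's implementation; the class names and the hash-based strategy are hypothetical.

```python
class Dispatcher:
    """Centralized strategy component: decides which back-end node
    should handle a given request (here, a simple URL-hash policy)."""

    def __init__(self, backends):
        self.backends = backends

    def assign(self, path):
        return self.backends[hash(path) % len(self.backends)]


class Distributor:
    """Client-facing component, which may be co-located with a back-end
    node; it consults the shared dispatcher for every request."""

    def __init__(self, dispatcher):
        self.dispatcher = dispatcher

    def handle(self, path):
        # In a real cluster the connection would be handed off to the
        # chosen node; here we simply return the assignment.
        return self.dispatcher.assign(path)
```

Because all distributors consult the same dispatcher, the strategy stays centralized and simple, while the client-facing work is spread across many nodes.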
The rest of the paper is organized as follows. Section 2 presents some background information for the rest of the paper. In Section 3, we discuss the cluster configurations currently used for supporting content-aware request distribution and we provide experimental results that quantify limitations in their scalability. Section 4 describes our scalable cluster design for supporting content-aware request distribution. We discuss our prototype implementation in Section 5 and Section 6 presents experimental results obtained with the prototype. Related work is covered in Section 7 and Section 8 presents conclusions.