 
 
 
 
 
 
   
We have presented a new, scalable architecture for content-aware request distribution in Web server clusters. Content-aware distribution improves server performance by allowing partitioned secondary storage, specialized server nodes, and request distribution strategies that optimize for locality.
Our architecture employs a level-4 switch that acts as the central point of contact for the server on the Internet and distributes incoming requests to a number of back-ends. Notably, the switch itself does not perform any content-aware distribution. That function is performed by the back-ends, each of which may forward an incoming request to another back-end based on the requested content. To make their request distribution decisions, the back-ends consult a dispatcher node that implements the request distribution policy.
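The request path described above can be illustrated with a minimal Python sketch. All names here (`Switch`, `Dispatcher`, `handle_request`) are hypothetical and do not come from our implementation. The sticky least-loaded policy is a stand-in chosen for brevity, not the distribution policy used in our system, and the handoff is reduced to a boolean flag.

```python
from itertools import cycle


class Switch:
    """Content-blind level-4 switch: spreads connections round-robin."""

    def __init__(self, backends):
        self._next = cycle(backends)

    def accept(self):
        # The switch never inspects the request; it only picks a back-end.
        return next(self._next)


class Dispatcher:
    """Hypothetical central policy node consulted by every back-end."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.assignment = {}                      # target -> serving back-end
        self.load = {b: 0 for b in self.backends}  # requests assigned so far

    def choose(self, target):
        # Illustrative policy (an assumption): the first request for a target
        # goes to the least-loaded node; later requests stick to that node.
        if target not in self.assignment:
            self.assignment[target] = min(
                self.backends, key=lambda b: self.load[b]
            )
        node = self.assignment[target]
        self.load[node] += 1
        return node


def handle_request(switch, dispatcher, target):
    """Return (receiving back-end, serving back-end, whether a handoff occurred)."""
    receiver = switch.accept()           # connection is established here
    server = dispatcher.choose(target)   # content-aware decision
    return receiver, server, server != receiver
```

In this sketch, connection establishment and any handoff take place on whichever back-end the switch happens to select, so that work is spread across the cluster; only the lightweight `choose` call is centralized.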
In terms of scalability, the proposed architecture compares favorably with existing approaches, in which a front-end node performs content-aware request distribution. In our architecture, the expensive operations of TCP connection establishment and handoff are distributed over all back-end nodes rather than being centralized in the front-end node. The much improved scalability comes at the cost of only a minimal additional latency penalty. Furthermore, the dispatcher module is fast enough that centralizing it on a single 300 MHz PIII machine scales to throughput rates of up to 50,000 connections/sec.
We have implemented this new architecture and demonstrated its scalability by comparing it to a system that performs content-aware distribution in the front-end, under both synthetic and trace-driven workloads.
 
 
 
 
