2006 USENIX Annual Technical Conference Abstract
Pp. 375–380 of the Proceedings
Efficient Query Subscription Processing for Prospective Search Engines
Utku Irmak, Polytechnic University; Svilen Mihaylov, University of Pennsylvania; Torsten Suel, Polytechnic University; Samrat Ganguly and Rauf Izmailov, NEC Laboratories America
Abstract
Current web search engines are retrospective in that they limit
users to searches against already existing pages. Prospective
search engines, on the other hand, allow users to upload
queries that will be applied to newly discovered pages in the
future. Some examples of prospective search are the
subscription features in Google News and in RSS-based blog
search engines.
In this paper, we study the problem of efficiently
processing large numbers of keyword query subscriptions against
a stream of newly discovered documents, and propose several
query processing optimizations for prospective search. Our
experimental evaluation shows that these techniques can improve
the throughput of a well-known algorithm by more than a factor
of 20, and allow matching hundreds or thousands of incoming
documents per second against millions of subscription queries
per node.
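To make the matching problem concrete, the following is a minimal sketch, assuming conjunctive (AND) keyword queries and plain whitespace tokenization. The class name SubscriptionIndex and its methods are hypothetical names chosen for illustration. It shows a naive counting-based baseline over an inverted index of the queries, not the algorithm or the optimizations evaluated in the paper.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Toy inverted index over conjunctive (AND) keyword subscriptions.

    Hypothetical illustration only; not the paper's data structure.
    """

    def __init__(self):
        # term -> ids of the subscription queries that contain the term
        self.postings = defaultdict(set)
        # query id -> number of distinct terms in that query
        self.sizes = {}

    def subscribe(self, query_id, text):
        """Register a keyword query; all of its terms must appear to match."""
        terms = set(text.lower().split())
        if not terms:
            raise ValueError("empty query")
        self.sizes[query_id] = len(terms)
        for term in terms:
            self.postings[term].add(query_id)

    def match(self, document):
        """Return the sorted ids of queries whose terms all occur in the document."""
        hits = defaultdict(int)
        # Count, per query, how many of its distinct terms the document contains.
        for term in set(document.lower().split()):
            for query_id in self.postings.get(term, ()):
                hits[query_id] += 1
        # A conjunctive query matches only when every one of its terms was seen.
        return sorted(q for q, n in hits.items() if n == self.sizes[q])


index = SubscriptionIndex()
index.subscribe("q1", "usenix conference")
index.subscribe("q2", "search engines")
index.subscribe("q3", "prospective search")
matched = index.match("Prospective search engines match new pages")
print(matched)
```

In this sketch the work per document grows with the total length of the posting lists for the document's terms, which is the kind of cost that query processing optimizations for prospective search aim to reduce.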
The Proceedings are published as a collective work, © 2006 by the USENIX Association. All Rights Reserved. Rights to individual papers remain with the author or the author's employer. Permission is granted for the noncommercial reproduction of the complete work for educational or research purposes. USENIX acknowledges all trademarks within this paper.