Inverted Index Support for Parametric Search

In an effort to integrate data from the Web and corporate knowledge portals with data residing in proprietary databases, search engines are required to support a more complex set of features, such as Boolean, XML, and parametric restrictions. In this paper we propose a scheme by which an inverted-index-based search engine can efficiently support queries that contain parametric restrictions in addition to standard free-text portions. We show how inverted lists for parametric fields can be constructed and used seamlessly at query evaluation time. We also show how to maximize query processing performance while respecting limits on index size and build time or, conversely, how to minimize index space and build time while maintaining guarantees on runtime performance. We concisely present the tradeoff among index size, build time, and runtime performance. Finally, our experimental evaluation shows significant performance benefits when compared to alternative approaches.

By: Marcus Fontoura, Jason Zien, Ronny Lempel, Runping Qi

Published in: RJ10329 in 2004

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).