An Improved Ranking Formula For Free Text Search Engines

The current NetQuestion product and its immediate predecessor, SearchManager, incorporate portions of GURU, a free text search engine developed at IBM Research. The ranking formula is the component of a free text search engine that most strongly determines how well it places good documents among the first few entries of a hit list. Yoëlle Maarek and Mark Wegman developed the GURU ranking formula in 1989. Since then, apart from minor bug fixes and algorithmic changes in the C implementation used by the GURU research testbed search engine, no work had been done to substantially change the formula.

By: Herb Chong

Published in: RC21057 in 2001

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

RC21057.pdf

Questions about this service can be mailed to reports@us.ibm.com.