Evaluating Knowledge Base Relevancy

Whenever possible, it is preferable to reuse knowledge base content rather than redevelop it. The difficulty lies in determining how relevant an existing knowledge base is to a new problem domain, or which of several existing knowledge bases is most relevant. In this paper, we describe an algorithm and methodology, together with an implementation, for comparing the relevancy of different knowledge bases to a set of problem tickets. Using distance metrics that have been applied in the field of text clustering, we systematically compare the text of the problem domain to the text of each knowledge base. We then allow an expert in the domain area to check the validity of the problem/solution matches and to determine where to draw the line between the sets of solved and unsolved problem tickets. This approach makes it possible to objectively compare thousands of problems against thousands of solutions in a matter of hours and to estimate the approximate percentage of the problem domain that the solutions cover. We present the results of applying this knowledge base relevancy evaluation to a computer helpdesk problem ticket set, compared against different knowledge base solution sets.
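The general shape of the comparison described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the abstract does not name the distance metric, so cosine similarity over simple term counts is assumed here, and the function names (term_vector, best_matches, coverage) and the numeric threshold are hypothetical. The threshold stands in for the cut-off that a domain expert would choose after inspecting the ranked problem/solution matches.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts for one document."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors (assumed metric)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matches(tickets, solutions):
    """Pair each ticket with its most similar solution.

    Returns (ticket_index, solution_index, score) tuples sorted by
    descending score, so a reviewer can scan down the list and decide
    where matches stop being valid.
    """
    if not solutions:
        raise ValueError("at least one solution document is required")
    solution_vectors = [term_vector(s) for s in solutions]
    matches = []
    for i, ticket in enumerate(tickets):
        ticket_vector = term_vector(ticket)
        scores = [cosine_similarity(ticket_vector, sv) for sv in solution_vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((i, best, scores[best]))
    return sorted(matches, key=lambda m: m[2], reverse=True)


def coverage(matches, threshold):
    """Percentage of tickets whose best match scores at or above threshold."""
    if not matches:
        return 0.0
    solved = sum(1 for _, _, score in matches if score >= threshold)
    return 100.0 * solved / len(matches)
```

Running best_matches once per candidate knowledge base and comparing the resulting coverage values at the expert-chosen threshold would give a rough ranking of the knowledge bases against the same ticket set. A realistic version would likely add term weighting and stop-word removal, which this sketch omits.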

By: Scott Spangler

Published in: RJ10234 in 2002

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
