Discovering Frequently Asked Questions

Most helpdesk centers document each call with a short text description of the customer's problem and how it was solved. Because these descriptions are unstructured, the aggregated data is difficult to analyze in a meaningful way. In particular, we would like to use the unstructured text to answer the question, "What are the current Frequently Asked Questions for a given helpdesk?"

In this paper, we describe an implementation of an algorithm and methodology for discovering Frequently Asked Questions (FAQs). We use text clustering on problem ticket text to determine a set of problem categories. We then use a novel search strategy to find groups of problem tickets that share a sufficiently high number of common keywords. We present to the user the most frequently occurring problem ticket groups, with an appropriate and readable name for each group. Our claim is that this search strategy is a useful method for quickly and automatically determining what the FAQs are for a given helpdesk. We have validated our approach on many different helpdesk datasets. In particular, we compared the FAQs generated by our approach to those generated manually by reading the tickets, and found that our method compares quite favorably to human expert opinion. Finally, we describe how the IBM Virtual Helpdesk™ uses our tool to keep customer helpdesk websites current.
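The keyword-grouping idea can be illustrated with a minimal Python sketch. It is not the search strategy of the report, whose details are not given in this abstract. It assumes a simple greedy procedure: pick the ticket whose keyword set overlaps with the most other tickets, form a group, name the group after the keywords shared by a majority of its members, and repeat. The clustering step is omitted, and the names `keywords`, `find_faq_groups`, `min_shared`, `min_size`, and the stopword list are all hypothetical.

```python
# Illustrative sketch only: a greedy keyword-overlap grouping, assumed for
# demonstration and not taken from the report.

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "the", "to", "on", "in", "is", "of", "and", "for"}


def keywords(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return frozenset(w for w in words if w and w not in STOPWORDS)


def find_faq_groups(tickets, min_shared=2, min_size=2):
    """Return (name, ticket_indices) pairs, largest group first."""
    keys = [keywords(t) for t in tickets]
    remaining = set(range(len(tickets)))
    groups = []
    while remaining:
        # Try every remaining ticket as a seed and keep the largest group
        # of tickets sharing at least `min_shared` keywords with the seed.
        best = []
        for seed in sorted(remaining):
            members = [i for i in sorted(remaining)
                       if len(keys[seed] & keys[i]) >= min_shared]
            if len(members) > len(best):
                best = members
        if len(best) < min_size:
            break  # nothing left that is frequent enough
        # Name the group after keywords found in a majority of its members
        # (the name is empty if no keyword reaches a majority).
        counts = {}
        for i in best:
            for w in keys[i]:
                counts[w] = counts.get(w, 0) + 1
        common = [w for w in counts if 2 * counts[w] > len(best)]
        common.sort(key=lambda w: (-counts[w], w))
        groups.append((" ".join(common[:3]), best))
        remaining -= set(best)
    return groups


if __name__ == "__main__":
    sample = [
        "Password reset for Lotus Notes",
        "Reset Notes password",
        "Printer jam on floor 3",
        "Notes password reset requested",
        "Printer jam, paper stuck",
        "Monitor flickers",
    ]
    for name, members in find_faq_groups(sample):
        print(name, members)
```

Because each round removes the largest remaining group, the groups come out in order of decreasing size, which mirrors the idea of presenting the most frequently occurring problem groups first. A real system would need better tokenization and a more careful naming step.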

By: Scott Spangler, Leo Garcia

Published in: RJ10235 in 2002

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
