Discussion Mining: Knowledge Discovery from Online Discussion Records

Over the last decade, text mining techniques, which discover knowledge from large amounts of text data, have made significant
progress. While these techniques are useful for independent texts, they do not work well for email and BBS (bulletin board system)
discussions because each individual message is incomplete on its own. We therefore need a new method to extract useful information from
these kinds of text fragments. This paper describes methods of structuring discussion records or logs, such as email and BBS
messages, so that knowledge can be discovered using various retrieval and visualization techniques. Each message consists mainly of
quotations and comments. Based on the quote-comment relationships, our proposed system generates a Thread Summary, which is an
abstraction of the ongoing discussion on a particular topic. We treat the Thread Summary as a coherent document and produce customized
summaries that allow users to find their topics of interest more easily. Our system can also recognize an authority, a person who has
good knowledge of a particular field, based on patterns of message exchange.
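
To make the quote-comment relationship concrete, the following is a minimal sketch of how one message body might be split into (quotation, comment) pairs. It is not the authors' implementation. It assumes that quoted lines begin with ">" (a common email convention that the abstract does not specify) and that each comment refers to the quotation directly above it. The function name quote_comment_pairs is hypothetical.

```python
from typing import List, Tuple


def quote_comment_pairs(body: str, marker: str = ">") -> List[Tuple[str, str]]:
    """Split a message body into (quoted text, comment) pairs.

    Assumption: a line starting with `marker` is a quotation, and the
    unquoted lines that follow it are the comment on that quotation.
    """
    pairs: List[Tuple[str, str]] = []
    quote: List[str] = []
    comment: List[str] = []
    for raw in body.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no content
        if line.startswith(marker):
            # A quotation that follows a comment starts a new pair.
            if comment:
                pairs.append((" ".join(quote), " ".join(comment)))
                quote, comment = [], []
            quote.append(line.lstrip(marker).strip())
        else:
            comment.append(line)
    # Emit the final pair, if any text was collected.
    if quote or comment:
        pairs.append((" ".join(quote), " ".join(comment)))
    return pairs


message = (
    "> Which parser should we use?\n"
    "> Speed matters most.\n"
    "I would try the chart parser first.\n"
    "\n"
    "> Any license issues?\n"
    "None that I know of.\n"
)
pairs = quote_comment_pairs(message)
```

Pairs of this kind, collected across all messages in a thread, are one plausible raw material for a summary that follows a topic from message to message. A comment with an empty quotation (a message that quotes nothing) is returned with an empty string in the first position.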

By: Akiko Murakami, Katashi Nagao, Koichi Takeda

Published in: RT0422 in 2002

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

