Tracking the Evolution of Topics in On-line Postings: 2006 IBM Innovation Jam data

We developed a prototype that automatically identifies and tracks topics in large, dynamic databases, based on fast approximation of cluster centroids and inter-cluster correspondence mappings. The prototype is part of a preliminary study toward the construction of a discussion mining system. To verify that the new algorithms in our system apply to real-world data, we conducted implementation studies using data from Innovation Jam 2006, an on-line brainstorming session in which 53,000 participants around the globe posted more than 37,000 opinions. The results output by our system were consistent with the original text of the postings, and uncovering them by hand would have required considerable effort.

By: Mei KOBAYASHI and Raylene Kay YUNG

Published in: RT0710 in 2007
