Construction of an OO Framework for Text Mining

We describe how we constructed a Java class library for text mining and information retrieval. The system consists of Facades around a database, a search engine and a text mining tool. We discuss the design of the object models we used for each of these elements and how they evolved as different databases and search engines became available. Then we discuss how we needed to evolve the system further in work with our customer. Finally we discuss the eventual fate of the system after the customer adopted the final version of the code, showing what we learned from the experience.

By: James W. Cooper, Edward C. So, Christian L. Cesar, Robert L. Mack

Published in: RC22122 in 2001

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

