Gloss-Ont: A Concept-focused Ontology Building Tool

The demand for ontologies is growing rapidly, driven especially by developments in knowledge management, e-commerce, and the Semantic Web. Building an ontology and a background knowledge base manually is so costly and time-consuming that it hampers progress in intelligent information access. Semi-automatic or automatic construction of ontologies is therefore very useful.

This paper presents a concept-focused ontology building method based on text mining technology. The method focuses on one particular domain concept at a time and actively acquires source documents containing ontological knowledge about that concept. It begins by analyzing glossary definitions of the target concept to extract classes and relationships. It then generates advanced queries from the results of the glossary definition processing and invokes a Web search engine to obtain more documents relevant to the target concept. Finally, the method extends the ontological knowledge by extracting more domain-specific concepts and relationships from the retrieved documents.
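The first two steps described above (glossary definition analysis, then query generation) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the method used in the report: the single "X is a Y" pattern, the function names `extract_is_a` and `build_queries`, the query format, and the sample definition are all hypothetical, and the report's actual text mining techniques are not detailed in this abstract.

```python
import re

# Hypothetical pattern for definitions of the form
# "<concept> is a/an <class> [that|which|for|used ...]".
# A real system would need far richer linguistic analysis.
IS_A = re.compile(
    r"^(?P<concept>[\w\s-]+?)\s+is\s+(?:a|an)\s+"
    r"(?P<cls>[\w\s-]+?)"
    r"(?:\s+(?:that|which|for|used)\b.*)?[.]?$",
    re.IGNORECASE,
)


def extract_is_a(definition):
    """Return a (concept, "is_a", class) triple, or None if no match."""
    match = IS_A.match(definition.strip())
    if match is None:
        return None
    concept = match.group("concept").strip().lower()
    cls = match.group("cls").strip().lower()
    return (concept, "is_a", cls)


def build_queries(triple):
    """Turn an extracted triple into phrase queries for a search engine.

    The query format is an assumption; the report does not specify it.
    """
    concept, _, cls = triple
    return [f'"{concept}" "{cls}"', f'"{concept}" definition']


if __name__ == "__main__":
    # Hypothetical glossary definition used only for demonstration.
    definition = (
        "Ontology is a formal specification that describes "
        "concepts in a domain."
    )
    triple = extract_is_a(definition)
    print(triple)
    print(build_queries(triple))
```

The resulting queries would be passed to a Web search engine, and the same kind of extraction would then be applied to the retrieved documents to extend the ontology.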

By focusing on a specific concept and a highly relevant document set, this method suffers from less ambiguity and can identify domain concepts and relations more accurately. In addition, by acquiring source documents from the Web on demand, the method can produce up-to-date ontologies.

By: Youngja Park

Published in: RC22982 REVISED in 2004


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

