Classification Using Heuristics for Computing Hyperplanes

        Classification is the process of learning relationships between a set of attributes and a set of predetermined classes using a given data set. Two methods for classification are presented
        in this paper. Both methods use the same heuristics to compute hyperplanes for separating regions of different classes but differ substantially in the form of the solution and the computational requirements. The first method (COGS) classifies by optimizing geometric shapes.
        A COGS solution consists of a set of convex regions, each of which contains points predominantly from one class. The heuristically computed hyperplanes form tight boundaries for the convex regions. COGS produces a classification solution using a randomized search procedure to
        minimize the number of regions given an upper limit on the training set classification error.
        In contrast to the computationally intensive COGS, the second method (HOT) generates oblique decision trees using a novel and simple algorithm. HOT also differs from COGS by using the
        heuristically computed hyperplanes as candidates for discriminating regions of
        different classes in the nodes of the decision trees generated. Experimental results on various benchmarks indicate that the decision trees generated are well suited for some
        problem domains. HOT is simple and easy to incorporate into most decision tree packages,
        yet its results compare well with much more complex schemes for oblique trees.
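
        The two solution forms described above can be illustrated with a minimal sketch. It is not the report's algorithm: the abstract does not specify the heuristics that compute the hyperplanes, so the candidates are assumed to be given. The function names (side, in_convex_region, gini, best_oblique_split) and the use of Gini impurity as the selection criterion are assumptions made for illustration only. A convex region is modeled as an intersection of half-spaces, and an oblique tree node picks the candidate hyperplane that best separates the classes.

```python
from collections import Counter


def side(w, b, x):
    """Return True when point x lies on the non-positive side of w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b <= 0


def in_convex_region(halfspaces, x):
    """A convex region is modeled as the intersection of half-spaces (w, b)."""
    return all(side(w, b, x) for w, b in halfspaces)


def gini(labels):
    """Gini impurity of a list of class labels (assumed criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_oblique_split(candidates, points, labels):
    """Pick the candidate hyperplane with the lowest weighted impurity.

    The candidates are assumed to come from some external heuristic;
    this function only ranks them for use in one tree node.
    """
    best, best_score = None, float("inf")
    n = len(points)
    for w, b in candidates:
        left = [y for x, y in zip(points, labels) if side(w, b, x)]
        right = [y for x, y in zip(points, labels) if not side(w, b, x)]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best, best_score = (w, b), score
    return best, best_score


# Toy data: two classes that no single axis-parallel cut at x = 0.5 separates.
points = [(0, 0), (1, 0), (0, 1), (2, 2), (3, 2), (2, 3)]
labels = ["a", "a", "a", "b", "b", "b"]
candidates = [
    ((1, 0), -0.5),   # axis-parallel: x <= 0.5
    ((1, 1), -2.5),   # oblique: x + y <= 2.5
]
split, score = best_oblique_split(candidates, points, labels)
```

        On this toy data the oblique candidate separates the two classes completely, while the axis-parallel one does not, which is the kind of situation where oblique splits yield smaller trees.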

By: Vijay S. Iyengar, Daniel Brand, Murray Campbell, Philip Heidelberger

Published in: RC21566 in 1999

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).


Questions about this service can be mailed to reports@us.ibm.com .