Decomposition of Heterogeneous Classification Problems

In some classification problems the feature space is heterogeneous in that the best features on which to base the classification are different in different parts of the feature space. In other problems the classes can be divided into subsets such that distinguishing one subset of classes from another and classifying examples within such subsets require very different decision rules, involving different sets of features. In such heterogeneous problems, many modeling techniques (including decision trees, rules, and neural networks) evaluate the performance of alternative decision rules by averaging over the entire problem space, and are therefore prone to generating a model that is suboptimal in each of the regions or subproblems. Better overall models can be obtained by splitting the problem appropriately and modeling each subproblem separately. This paper presents a new measure of the degree of dissimilarity between the decision surfaces of two given problems, and suggests a way to search for a strategic split of the feature space that identifies regions with different characteristics. We illustrate the concept using a multiplexor problem.
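The kind of heterogeneity described above can be seen in a small Boolean multiplexer. The following is a minimal sketch, not the paper's method: the 6-bit size (2 address bits, 4 data bits), the bit ordering, and the `agreement` score are all assumptions made here for illustration, and `agreement` is a simple stand-in rather than the dissimilarity measure the paper proposes. It shows that each data bit looks only weakly informative when averaged over the whole feature space, yet a single data bit determines the class exactly within each address region.

```python
from itertools import product

N_ADDRESS = 2               # assumed size: 2 address bits ...
N_DATA = 2 ** N_ADDRESS     # ... selecting among 4 data bits
DATA_FEATURES = range(N_ADDRESS, N_ADDRESS + N_DATA)


def multiplexer(bits):
    """Class label: the data bit selected by the address bits."""
    address = bits[:N_ADDRESS]
    data = bits[N_ADDRESS:]
    index = int("".join(str(b) for b in address), 2)
    return data[index]


def agreement(examples, feature):
    """Fraction of examples whose feature value equals the class label.

    Illustrative score only; not the measure defined in the paper.
    """
    return sum(x[feature] == y for x, y in examples) / len(examples)


# Enumerate the full truth table as (feature vector, label) pairs.
examples = [
    (bits, multiplexer(bits))
    for bits in product((0, 1), repeat=N_ADDRESS + N_DATA)
]

# Averaged over the entire space, every data bit scores the same.
global_agreement = {f: agreement(examples, f) for f in DATA_FEATURES}

# Split the space by address value and score each region separately.
regions = {}
for bits, label in examples:
    regions.setdefault(bits[:N_ADDRESS], []).append((bits, label))

regional_best = {}
for address, subset in regions.items():
    scores = {f: agreement(subset, f) for f in DATA_FEATURES}
    best = max(scores, key=scores.get)
    regional_best[address] = (best, scores[best])

print("global:", global_agreement)
print("per region (best feature, score):", regional_best)
```

In this sketch the best feature differs from region to region, which is the situation in which splitting the problem and modeling each part separately can help.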

By: Chidanand Apte, Se June Hong, Jonathan Hosking, Jorge Lepre, Edwin Pednault and Barry Rosen

Published in: Lecture Notes in Computer Science, volume 1280, pages 17-28, 1997

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.
