Classification Trees With Bivariate Linear Discriminant Node Models
- 1 September 2003
- journal article
- Published by Taylor & Francis Ltd in Journal of Computational and Graphical Statistics
- Vol. 12 (3), 512-530
- https://doi.org/10.1198/1061860032049
Abstract
This article introduces a classification tree algorithm that can simultaneously reduce tree size, improve class prediction, and enhance data visualization. We accomplish this by fitting a bivariate linear discriminant model to the data in each node. Standard algorithms can produce fairly large tree structures because they employ a very simple node model, wherein the entire partition associated with a node is assigned to one class. We reduce the size of our trees by letting the discriminant models share part of the data complexity. Being themselves classifiers, the discriminant models can also help to improve prediction accuracy. Finally, because the discriminant models use only two predictor variables at a time, their effects are easily visualized by means of two-dimensional plots. Our algorithm does not simply fit discriminant models to the terminal nodes of a pruned tree, as this does not reduce the size of the tree. Instead, discriminant modeling is carried out in all phases of tree growth and the misclassification costs of the node models are explicitly used to prune the tree. Our algorithm is also distinct from the “linear combination split” algorithms that partition the data space with arbitrarily oriented hyperplanes. We use axis-orthogonal splits to preserve the interpretability of the tree structures. An extensive empirical study with real datasets shows that, in general, our algorithm has better prediction power than many other tree or nontree algorithms.
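The idea of a bivariate linear discriminant node model can be illustrated with a minimal sketch. It is not the article's algorithm: the abstract does not say how the two predictors are chosen, so the sketch assumes the pair is picked by the lowest resubstitution error within the node, and the function names (`fit_lda`, `predict_lda`, `best_bivariate_lda`) are hypothetical. It compares the resulting node error with that of the standard node model, which assigns the whole node to its majority class.

```python
from itertools import combinations

import numpy as np


def fit_lda(X, y):
    """Fit a linear discriminant model (pooled covariance, empirical priors)."""
    classes = np.unique(y)
    n, p = X.shape
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    pooled = np.zeros((p, p))
    for c, m in zip(classes, means):
        d = X[y == c] - m
        pooled += d.T @ d
    pooled /= max(n - len(classes), 1)
    pooled += 1e-8 * np.eye(p)  # small ridge term for numerical stability
    return classes, means, priors, np.linalg.inv(pooled)


def predict_lda(model, X):
    """Assign each row to the class with the largest linear discriminant score."""
    classes, means, priors, inv = model
    # score_k(x) = x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k + log(pi_k)
    quad = 0.5 * np.sum(means @ inv * means, axis=1)
    scores = X @ inv @ means.T - quad + np.log(priors)
    return classes[np.argmax(scores, axis=1)]


def best_bivariate_lda(X, y):
    """Try every pair of predictors; keep the pair with the lowest node error.

    Assumption: selection by resubstitution error is an illustrative choice,
    not the selection rule used in the article.
    """
    best = None
    for i, j in combinations(range(X.shape[1]), 2):
        cols = [i, j]
        model = fit_lda(X[:, cols], y)
        err = float(np.mean(predict_lda(model, X[:, cols]) != y))
        if best is None or err < best[0]:
            best = (err, (i, j), model)
    return best


# Synthetic node data: the class boundary is oblique in predictors 1 and 3,
# so a single axis-orthogonal split cannot capture it.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 1] + X[:, 3] > 0).astype(int)

err, pair, _ = best_bivariate_lda(X, y)
majority_err = 1 - np.bincount(y).max() / len(y)  # standard one-class node model
print("selected pair:", pair)
print("bivariate LDA node error:", err)
print("majority-class node error:", majority_err)
```

Because the fitted model involves only two predictors, its decision boundary in this sketch is a straight line in a two-dimensional scatterplot of the selected pair, which is the kind of visualization the abstract refers to.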