Classification Trees With Bivariate Linear Discriminant Node Models

1 September 2003

journal article
Published by Taylor & Francis Ltd in Journal of Computational and Graphical Statistics

Vol. 12 (3), 512-530
https://doi.org/10.1198/1061860032049

Abstract

This article introduces a classification tree algorithm that can simultaneously reduce tree size, improve class prediction, and enhance data visualization. We accomplish this by fitting a bivariate linear discriminant model to the data in each node. Standard algorithms can produce fairly large tree structures because they employ a very simple node model, wherein the entire partition associated with a node is assigned to one class. We reduce the size of our trees by letting the discriminant models share part of the data complexity. Being themselves classifiers, the discriminant models can also help to improve prediction accuracy. Finally, because the discriminant models use only two predictor variables at a time, their effects are easily visualized by means of two-dimensional plots. Our algorithm does not simply fit discriminant models to the terminal nodes of a pruned tree, as this does not reduce the size of the tree. Instead, discriminant modeling is carried out in all phases of tree growth and the misclassification costs of the node models are explicitly used to prune the tree. Our algorithm is also distinct from the “linear combination split” algorithms that partition the data space with arbitrarily oriented hyperplanes. We use axis-orthogonal splits to preserve the interpretability of the tree structures. An extensive empirical study with real datasets shows that, in general, our algorithm has better prediction power than many other tree or nontree algorithms.
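The node-model idea can be illustrated with a minimal sketch. It assumes equal misclassification costs, a pooled-covariance linear discriminant, and selection of the predictor pair by resubstitution error. None of these details are given in the abstract, and the function names `fit_lda_2d` and `best_pair_model` are hypothetical. It covers only the model fitted within a single node, not the splitting or pruning steps.

```python
from itertools import combinations
from math import log


def fit_lda_2d(points, labels):
    """Fit a linear discriminant on 2-D points; return a predict function."""
    classes = sorted(set(labels))
    n = len(points)
    means, priors = {}, {}
    for c in classes:
        members = [p for p, y in zip(points, labels) if y == c]
        means[c] = (sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members))
        priors[c] = len(members) / n

    # Pooled within-class covariance (2 x 2), with a tiny ridge for stability.
    sxx = sxy = syy = 0.0
    for p, y in zip(points, labels):
        dx, dy = p[0] - means[y][0], p[1] - means[y][1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    dof = max(n - len(classes), 1)
    sxx, sxy, syy = sxx / dof + 1e-9, sxy / dof, syy / dof + 1e-9
    det = sxx * syy - sxy * sxy

    def score(p, c):
        # Linear discriminant score: p'S^-1 m - 0.5 m'S^-1 m + log(prior).
        m0, m1 = means[c]
        a0 = (syy * m0 - sxy * m1) / det
        a1 = (sxx * m1 - sxy * m0) / det
        return p[0] * a0 + p[1] * a1 - 0.5 * (m0 * a0 + m1 * a1) + log(priors[c])

    return lambda p: max(classes, key=lambda c: score(p, c))


def best_pair_model(X, y):
    """Try every pair of predictors; keep the one with fewest training errors.

    Assumption: the selection rule here (resubstitution error, first pair
    wins ties) is illustrative and is not taken from the article.
    """
    best = None
    for i, j in combinations(range(len(X[0])), 2):
        pts = [(row[i], row[j]) for row in X]
        predict = fit_lda_2d(pts, y)
        errors = sum(predict(p) != t for p, t in zip(pts, y))
        if best is None or errors < best[1]:
            best = ((i, j), errors, predict)
    return best


# Toy node data: the class depends on x0 + x1, and x2 is an uninformative column.
X = [(a, b, (2 * a + 3 * b) % 5) for a in range(5) for b in range(5)]
y = [int(a + b > 4) for a, b, _ in X]

pair, errors, predict = best_pair_model(X, y)
print("chosen predictor pair:", pair, "training errors:", errors)
```

In this toy node the class boundary is a diagonal line in the (x0, x1) plane. A tree with single-class nodes would need several axis-orthogonal splits to approximate it, whereas one bivariate discriminant captures it inside a single node, and the fitted model can be drawn on a two-dimensional scatter plot of the chosen pair.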

