Random Forests and Adaptive Nearest Neighbors

Abstract
In this article we study random forests through their connection with a new framework of adaptive nearest-neighbor methods. We introduce the concept of k potential nearest neighbors (k-PNNs) and show that random forests can be viewed as adaptively weighted k-PNN methods. Various aspects of random forests can be studied from this perspective. We study the effect of terminal node size on the prediction accuracy of random forests. We further show that random forests with adaptive splitting schemes assign weights to k-PNNs in a desirable way: for estimation at a given target point, they assign voting weights to the k-PNNs of the target point according to the local importance of the different input variables. We propose a new, simple splitting scheme that achieves this adaptivity in a straightforward fashion. The scheme can be combined with existing algorithms; the resulting algorithm is computationally faster and gives comparable results. Other aspects of random forests, such as the use of linear combinations of inputs in splitting, are also discussed. Simulations and real datasets are used to illustrate the results.
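As a minimal sketch of the weighted-neighbor view described above (not the article's own algorithm), the snippet below assumes scikit-learn's RandomForestRegressor with bootstrapping disabled, so every tree is fit on the full training set. It recovers, for one target point, the voting weights the forest implicitly places on each training response and checks that the weighted average reproduces the forest's prediction. The dataset, parameter settings, and variable names are illustrative choices only.

```python
# Sketch of the weighted nearest-neighbor view of a random forest:
# with bootstrap=False, each tree's prediction at a target point is the mean
# response over the training points sharing its terminal node, so the forest
# prediction is a weighted average of all training responses.
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor

X, y = make_friedman1(n_samples=300, random_state=0)
forest = RandomForestRegressor(
    n_estimators=200, max_features="sqrt", bootstrap=False, random_state=0
).fit(X, y)

x0 = X[:1]                              # a target point, shape (1, n_features)
train_leaves = forest.apply(X)          # (n_samples, n_trees) terminal-node ids
target_leaves = forest.apply(x0)        # (1, n_trees)

# Per tree: training points in the same terminal node as x0 get weight
# 1 / (terminal node size); the final weight averages over trees.
same_leaf = train_leaves == target_leaves        # broadcasts to (n_samples, n_trees)
node_sizes = same_leaf.sum(axis=0)               # size of x0's node in each tree
weights = (same_leaf / node_sizes).mean(axis=1)  # one weight per training point

print("weights sum to    :", weights.sum())                 # ~1
print("weighted average  :", float(weights @ y))
print("forest prediction :", float(forest.predict(x0)[0]))  # should match, up to rounding
```

The weights concentrate on training points that repeatedly share small terminal nodes with the target point, which is the sense in which the forest acts as an adaptively weighted neighbor method.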
