A meta-analysis of research in random forests for classification
- 1 November 2016
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech)
Abstract
Since their introduction, random forests (RFs) have successfully been employed in a vast array of application areas. Fairly recently, a number of algorithms that are related to Breiman's original Forest-RI algorithm have been proposed in the literature. In this paper we conduct a meta-analysis of all (34) 2001-2015 papers that could be found in which a novel RF algorithm was proposed and compared to already established RF algorithms. The analysis revealed several limitations regarding the choice of performance measures, the way in which these measures are estimated, and the methodology for comparisons of multiple algorithms over multiple data sets. In fact, it is shown that in almost a third of the results from RF research papers, a significant improvement over the performance of Forest-RI is not found when comparisons are made using appropriate statistical tests.
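As an illustration of what comparing multiple algorithms over multiple data sets can look like, the sketch below computes average ranks and the Friedman chi-square statistic, a rank-based test commonly used for this purpose. It is not taken from the paper: the function names `average_ranks` and `friedman_statistic` are hypothetical, the error rates in `example_errors` are made up, and no correction for ties is applied to the statistic.

```python
def average_ranks(errors):
    """Mean rank of each algorithm across data sets.

    errors[i][j] is the error rate of algorithm j on data set i
    (lower is better). Tied values share the mean of their ranks.
    """
    n_sets = len(errors)
    k = len(errors[0])
    totals = [0.0] * k
    for row in errors:
        order = sorted(range(k), key=lambda j: row[j])
        pos = 0
        while pos < k:
            # Extend `end` over the run of values tied with row[order[pos]].
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1..end+1
            for idx in order[pos:end + 1]:
                totals[idx] += shared
            pos = end + 1
    return [t / n_sets for t in totals]


def friedman_statistic(errors):
    """Friedman chi-square statistic (without tie correction)."""
    n_sets = len(errors)
    k = len(errors[0])
    ranks = average_ranks(errors)
    expected = (k + 1) / 2  # mean rank if all algorithms were equivalent
    return 12 * n_sets / (k * (k + 1)) * sum((r - expected) ** 2 for r in ranks)


# Made-up error rates: 4 data sets (rows) x 3 algorithms (columns).
example_errors = [
    [0.10, 0.12, 0.15],
    [0.20, 0.22, 0.25],
    [0.05, 0.07, 0.06],
    [0.30, 0.30, 0.35],
]
ranks = average_ranks(example_errors)
statistic = friedman_statistic(example_errors)
```

Under the null hypothesis that all algorithms perform equally, the statistic is approximately chi-square distributed with k - 1 degrees of freedom, although the approximation is rough for as few data sets as in this toy example. If the null hypothesis is rejected, pairwise comparisons against a baseline would typically follow, with a correction for multiple testing.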