A meta-analysis of research in random forests for classification

Abstract
Since their introduction, random forests (RFs) have been applied successfully in a wide range of application areas. More recently, a number of algorithms related to Breiman's original Forest-RI algorithm have been proposed in the literature. In this paper we conduct a meta-analysis of all 34 papers published between 2001 and 2015 that we could find in which a novel RF algorithm was proposed and compared to established RF algorithms. The analysis reveals several limitations regarding the choice of performance measures, the way in which these measures are estimated, and the methodology used to compare multiple algorithms over multiple data sets. In fact, we show that for almost a third of the results reported in these RF research papers, no significant improvement over the performance of Forest-RI is found when comparisons are made using appropriate statistical tests.
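
To make the idea of comparing multiple algorithms over multiple data sets concrete, the following is a minimal sketch of one rank-based test that is commonly used for this purpose, the Friedman test. The abstract does not name the tests used, so the choice of test here is an assumption, as are the function names `average_ranks` and `friedman_statistic` and the toy accuracy values, which are invented purely for illustration and are not results from any paper.

```python
from typing import List, Sequence


def average_ranks(scores: Sequence[Sequence[float]]) -> List[float]:
    """Mean rank of each algorithm over all data sets (rank 1 = highest score).

    scores[i][j] is the score of algorithm j on data set i.
    Tied scores within a data set share the average of their ranks.
    """
    n_datasets = len(scores)
    n_algos = len(scores[0])
    rank_sums = [0.0] * n_algos
    for row in scores:
        # Algorithm indices ordered from best (highest score) to worst.
        order = sorted(range(n_algos), key=lambda j: -row[j])
        pos = 0
        while pos < n_algos:
            end = pos
            # Extend the group while the next score ties with the current one.
            while end + 1 < n_algos and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # average of 1-based ranks pos+1..end+1
            for idx in order[pos:end + 1]:
                rank_sums[idx] += shared
            pos = end + 1
    return [s / n_datasets for s in rank_sums]


def friedman_statistic(scores: Sequence[Sequence[float]]) -> float:
    """Friedman chi-squared statistic for k algorithms over N data sets."""
    n = len(scores)
    k = len(scores[0])
    ranks = average_ranks(scores)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4)


# Hypothetical accuracies: rows are data sets, columns are three algorithms.
toy_scores = [
    [0.80, 0.82, 0.81],
    [0.70, 0.75, 0.72],
    [0.90, 0.91, 0.89],
    [0.60, 0.66, 0.63],
]

print(average_ranks(toy_scores))       # mean rank per algorithm
print(friedman_statistic(toy_scores))  # Friedman chi-squared statistic
```

Under the null hypothesis that all algorithms perform equally, the statistic is usually referred to a chi-squared distribution with k - 1 degrees of freedom. That approximation is rough when only a handful of data sets is available, as in this toy example, and a significant result would normally be followed by a post-hoc test to identify which algorithms differ.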