The roles of nearest neighbor methods in imputing missing data in forest inventory and monitoring databases
Open Access
- 22 June 2009
- journal article
- review article
- Published by Taylor & Francis Ltd in Scandinavian Journal of Forest Research
- Vol. 24 (3), 235-246
- https://doi.org/10.1080/02827580902870490
Abstract
Almost universally, forest inventory and monitoring databases are incomplete, ranging from missing data for only a few records and a few variables, common for small land areas, to missing data for many observations and many variables, common for large land areas. For a wide variety of applications, nearest neighbor (NN) imputation methods have been developed to fill in observations of variables that are missing on some records (Y-variables), using related variables that are available for all records (X-variables). This review attempts to summarize the advantages and weaknesses of NN imputation methods and to give an overview of the NN approaches that have most commonly been used. It also discusses some of the challenges of NN imputation methods. The inclusion of NN imputation methods into standard software packages and the use of consistent notation may improve further development of NN imputation methods. Using X-variables from different data sources provides promising results, but raises the issue of spatial and temporal registration errors. Quantitative measures of the contribution of individual X-variables to the accuracy of imputing the Y-variables are needed. In addition, further research is warranted to verify statistical properties, modify methods to improve statistical properties, and provide variance estimators.
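As an illustration of the basic idea in the abstract (filling in missing Y-variables from records that are close in X-variable space), the following is a minimal sketch, not any specific method from the review. It assumes unweighted Euclidean distance on the X-variables and a simple mean over the k nearest reference records; the function name `knn_impute` and the toy values are hypothetical.

```python
import math

def knn_impute(reference, targets, k=1):
    """Impute Y-variables for target records from their k nearest reference records.

    reference: list of (x, y) pairs, where x holds the X-variables and y the
               observed Y-variables of a complete record.
    targets:   list of x vectors for records whose Y-variables are missing.
    Assumption: unweighted Euclidean distance in X-space, unweighted mean of Y.
    """
    imputed = []
    for x_t in targets:
        # Rank reference records by distance to the target in X-space.
        ranked = sorted(reference, key=lambda rec: math.dist(rec[0], x_t))
        neighbors = ranked[:k]
        n_y = len(neighbors[0][1])
        # Average each Y-variable over the selected neighbors.
        imputed.append(
            [sum(y[j] for _, y in neighbors) / len(neighbors) for j in range(n_y)]
        )
    return imputed

# Toy data (hypothetical): two X-variables and one Y-variable per reference record.
reference = [
    ((0.0, 0.0), (10.0,)),
    ((1.0, 0.0), (20.0,)),
    ((5.0, 5.0), (100.0,)),
]
targets = [(0.9, 0.1), (4.0, 4.0)]

one_nn = knn_impute(reference, targets, k=1)
two_nn = knn_impute(reference, targets, k=2)
print(one_nn)
print(two_nn)
```

With k = 1 each target simply receives the observed Y-values of its single nearest reference record, so imputed values are always ones that actually occur in the data; larger k averages over neighbors, which smooths the imputations. In practice the X-variables would usually be scaled or weighted before distances are computed, which this sketch omits.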