A comparison of machine learning approaches for identifying high-poverty counties: robust features of DMSP/OLS night-time light imagery
- 21 February 2019
- journal article
- research article
- Published by Taylor & Francis Ltd in International Journal of Remote Sensing
- Vol. 40 (15), 5716-5736
- https://doi.org/10.1080/01431161.2019.1580820
Abstract
The goal of the present study is to demonstrate that high-poverty counties and robust classification features can be identified by machine learning approaches using only DMSP/OLS night-time light imagery. To accomplish this goal, a total of 96 high-poverty and 96 non-poverty counties in China in 2010, described by 15 statistical and spatial features extracted from night-time light imagery, formed a training set for identifying high-poverty counties. Seven machine learning approaches were adopted to classify high-poverty counties, and five feature importance measures were used to select robust features. The resulting metrics, including the user’s (>63%), producer’s (>66%) and overall (>82%) accuracies of the poor county identification (probability of poverty greater than 0.6), show that the seven machine learning approaches used in this paper exhibit good performance, although some differences exist among the approaches. The order of feature importance reveals that the relative importance of each feature differs among the models; however, the important features remain consistent. The nine most important features ranked in each approach are relatively robust for poverty identification at the county level. Both spatial features and statistical features, calculated in part from the central tendency, degree of dispersion, and distribution of the night-time light data, were identified as indispensable robust features in all the approaches, indicating that the complex social phenomenon of poverty requires analysis from different aspects. Previous studies that relied primarily on night-time light imagery applied single features related to the central tendency or the distribution of the imagery. This study provides a new method that can act as a reference for feature selection and identification of high-poverty counties using night-time light imagery, and it has potential applications across several scientific domains.
Funding Information
- The National Key R&D Program of China (2017YFB0503500)
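The user's, producer's and overall accuracies quoted in the abstract are standard remote-sensing accuracy measures computed from a confusion matrix. The Python sketch below shows how they could be obtained for the poor class when a county is flagged as poor if its predicted probability exceeds 0.6. The function name `accuracy_metrics` and the toy probabilities and labels are hypothetical and do not come from the study; this is an illustration of the definitions, not the authors' implementation.

```python
def accuracy_metrics(probs, labels, threshold=0.6):
    """Return (user's, producer's, overall) accuracy for the poor class.

    probs  : predicted probability of poverty for each county
    labels : reference labels, 1 = high-poverty, 0 = non-poverty
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")

    # A county is flagged as poor only when its probability exceeds the threshold.
    preds = [1 if p > threshold else 0 for p in probs]

    pairs = list(zip(labels, preds))
    tp = sum(1 for y, h in pairs if y == 1 and h == 1)  # poor, flagged poor
    fp = sum(1 for y, h in pairs if y == 0 and h == 1)  # non-poor, flagged poor
    fn = sum(1 for y, h in pairs if y == 1 and h == 0)  # poor, missed
    tn = sum(1 for y, h in pairs if y == 0 and h == 0)  # non-poor, not flagged

    nan = float("nan")
    # User's accuracy: share of counties flagged as poor that really are poor.
    users = tp / (tp + fp) if (tp + fp) else nan
    # Producer's accuracy: share of truly poor counties that were flagged.
    producers = tp / (tp + fn) if (tp + fn) else nan
    # Overall accuracy: share of all counties classified correctly.
    overall = (tp + tn) / len(pairs) if pairs else nan
    return users, producers, overall


# Hypothetical toy data, not taken from the study.
probs = [0.9, 0.7, 0.65, 0.4, 0.2, 0.1, 0.55, 0.8]
labels = [1, 1, 0, 1, 0, 0, 0, 1]
users, producers, overall = accuracy_metrics(probs, labels)
print(f"user's={users:.2f} producer's={producers:.2f} overall={overall:.2f}")
```

User's accuracy corresponds to precision and producer's accuracy to recall for the poor class, so raising the probability threshold would typically trade producer's accuracy for user's accuracy.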