A comparison of machine learning approaches for identifying high-poverty counties: robust features of DMSP/OLS night-time light imagery

21 February 2019

journal article
research article
Published by Taylor & Francis Ltd in International Journal of Remote Sensing

Vol. 40 (15), 5716-5736
https://doi.org/10.1080/01431161.2019.1580820

Abstract

The goal of the present study is to demonstrate that high-poverty counties and robust classification features can be identified by machine learning approaches using only DMSP/OLS night-time light imagery. To accomplish this goal, a total of 96 high-poverty and 96 non-poverty counties, described by 15 statistical and spatial features extracted from night-time light imagery of China in 2010, formed a training set for identifying high-poverty counties. Seven machine learning approaches were adopted to classify high-poverty counties, and five feature importance measures were used to select robust features. The resulting metrics, including the user’s (>63%), producer’s (>66%) and overall (>82%) accuracies of the poor county identification (probability of poverty greater than 0.6), show that the seven machine learning approaches used in this paper exhibit good performance, although some differences exist among the approaches. The order of feature importance reveals that the relative importance of each feature differs among the models; however, the set of important features remains consistent. The nine most important features ranked in each approach are relatively robust for poverty identification at the county level. Both spatial and statistical features, calculated in part from the central tendency, degree of dispersion, and distribution of the night-time light data, were identified as indispensable robust features in all the approaches, indicating that the complex social phenomenon of poverty requires analysis from different aspects. Previous studies that relied primarily on night-time light imagery applied single features related to the central tendency or the distribution of the imagery. This study provides a new method that can serve as a reference for feature selection and for the identification of high-poverty counties using night-time light imagery, and it has potential applications across several scientific domains.
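The accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard remote-sensing definitions (user's accuracy is the share of counties predicted poor that are actually poor; producer's accuracy is the share of actually poor counties that were predicted poor) and uses the 0.6 probability cut-off mentioned above. The function name `poverty_accuracies` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def poverty_accuracies(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.6,
) -> Dict[str, float]:
    """Compute user's, producer's and overall accuracy for the 'poor' class.

    probs  -- predicted probability of poverty per county (hypothetical input)
    labels -- 1 for a high-poverty county, 0 otherwise
    A county is predicted poor when its probability is greater than threshold.
    """
    if len(probs) != len(labels) or len(labels) == 0:
        raise ValueError("probs and labels must be non-empty and equal in length")

    predicted = [1 if p > threshold else 0 for p in probs]
    pairs = list(zip(labels, predicted))

    tp = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 1)
    fp = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 1)
    fn = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 0)
    tn = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 0)

    # User's accuracy: correct 'poor' predictions / all 'poor' predictions.
    users = tp / (tp + fp) if (tp + fp) else float("nan")
    # Producer's accuracy: correct 'poor' predictions / all truly poor counties.
    producers = tp / (tp + fn) if (tp + fn) else float("nan")
    # Overall accuracy: all correct predictions / all counties.
    overall = (tp + tn) / len(labels)

    return {"users": users, "producers": producers, "overall": overall}


# Toy data for illustration only (not from the study).
toy_probs = [0.9, 0.7, 0.65, 0.4, 0.3, 0.8, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
result = poverty_accuracies(toy_probs, toy_labels)
print(result)
```

Any classifier that outputs a per-county probability could feed this function; raising the threshold would generally trade producer's accuracy for user's accuracy.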

Funding Information

The National Key R&D Program of China (2017YFB0503500)


Cited by 38 articles