Random Forests for Global and Regional Crop Yield Predictions
Open Access
- 3 June 2016
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 11 (6), e0156571
- https://doi.org/10.1371/journal.pone.0156571
Abstract
Accurate predictions of crop yield are critical for developing effective agricultural and food policies at the regional and global scales. We evaluated a machine-learning method, Random Forests (RF), for its ability to predict crop yield responses to climate and biophysical variables at global and regional scales in wheat, maize, and potato in comparison with multiple linear regressions (MLR) serving as a benchmark. We used crop yield data from various sources and regions for model training and testing: 1) gridded global wheat grain yield, 2) maize grain yield from US counties over thirty years, and 3) potato tuber and maize silage yield from the northeastern seaboard region. RF was found highly capable of predicting crop yields and outperformed MLR benchmarks in all performance statistics that were compared. For example, the root mean square errors (RMSE) ranged between 6 and 14% of the average observed yield with RF models in all test cases whereas these values ranged from 14% to 49% for MLR models. Our results show that RF is an effective and versatile machine-learning method for crop yield predictions at regional and global scales for its high accuracy and precision, ease of use, and utility in data analysis. RF may result in a loss of accuracy when predicting the extreme ends or responses beyond the boundaries of the training data.
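The abstract reports RMSE as a percentage of the average observed yield. A minimal sketch of that statistic follows, assuming it is computed as RMSE divided by the mean of the observed values, times 100; the exact computation used in the paper is not given in this record. The function name `relative_rmse` and the yield values are hypothetical and serve only as illustration.

```python
import math


def relative_rmse(observed, predicted):
    """Return RMSE as a percentage of the mean observed value.

    Assumed definition: 100 * sqrt(mean((obs - pred)^2)) / mean(obs).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all observation/prediction pairs
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_obs = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_obs


# Hypothetical yields (t/ha), invented for illustration; not data from the paper
observed = [8.0, 10.0, 12.0]
predicted = [9.0, 10.0, 11.0]
print(f"relative RMSE: {relative_rmse(observed, predicted):.2f}%")
```

Expressing the error relative to the mean observed yield makes values comparable across crops and regions whose absolute yields differ, which is presumably why the abstract reports the RF and MLR results on this scale.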
Funding Information
- Rural Development Administration (PJ01000707)
- U.S. Department of Agriculture (58-1265-1-074)
- National Institute of Food and Agriculture (2011-68004-30057)
- USDA-ARS Headquarters Postdoctoral Research Associate Program
- National Institute of Food and Agriculture (2016-67012-25208)
- Directorate for Geosciences (1521210)
- David and Lucile Packard Foundation