Evaluation of Different Machine Learning Models for Predicting Soil Erosion in Tropical Sloping Lands of Northeast Vietnam

Abstract
Rainfall-induced soil erosion is a major problem for farmers on the tropical sloping lands of Northeast Vietnam. This study evaluates the possibility of predicting erosion status with machine learning models, including fuzzy k-nearest neighbor (FKNN), artificial neural network (ANN), support vector machine (SVM), least squares support vector machine (LSSVM), and relevance vector machine (RVM). Model evaluation employed a historical dataset consisting of ten explanatory variables and soil erosion observations under four different land use management practices on hillslopes in Northwest Vietnam. The 236 data samples, representing soil erosion and non-erosion events, were randomly split (80% for training and 20% for testing) to assess the robustness of the five models. This subsampling process was repeated for 30 rounds to reduce the effect of randomness in data selection. Classification accuracy rate (CAR) and area under the receiver operating characteristic curve (AUC) were used to evaluate the performance of the five models. The statistical significance of differences between the algorithms was verified with the Wilcoxon test. The results showed that the RVM model achieved the best outcomes in both the training phase (CAR = 92.22% and AUC = 0.98) and the testing phase (CAR = 91.94% and AUC = 0.97). The four other learning algorithms also performed well, with CAR values above 80% and AUC values greater than 0.9. Hence, these results strongly confirm the efficacy of applying machine learning models for soil erosion prediction.
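The evaluation protocol described above (30 rounds of random 80/20 subsampling, scored by CAR and AUC) can be sketched as follows. This is a minimal illustration only, not the study's implementation: the data are synthetic, the nearest-centroid scorer `fit_centroid` is a hypothetical stand-in for the five models, and all function names and parameters are assumptions introduced for this example.

```python
import random
from statistics import mean


def car(y_true, y_pred):
    """Classification accuracy rate (CAR), in percent."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid(X_train, y_train):
    """Hypothetical placeholder model: returns a scoring function based on class centroids."""
    def centroid(label):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        return [mean(col) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def score(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 - d1  # positive means closer to the class-1 (erosion) centroid

    return score


def repeated_subsampling(X, y, fit, n_rounds=30, train_frac=0.8, seed=0):
    """Repeat a random train/test split n_rounds times; return per-round test CAR and AUC."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_train = int(round(train_frac * len(y)))
    cars, aucs = [], []
    for _ in range(n_rounds):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        score = fit([X[i] for i in train], [y[i] for i in train])
        y_test = [y[i] for i in test]
        s_test = [score(X[i]) for i in test]
        cars.append(car(y_test, [1 if s > 0 else 0 for s in s_test]))
        aucs.append(auc(y_test, s_test))
    return cars, aucs


# Synthetic stand-in data (assumption): 236 samples, 10 explanatory variables,
# label 1 = erosion event, label 0 = non-erosion event.
data_rng = random.Random(42)
y = [i % 2 for i in range(236)]
X = [[data_rng.gauss(1.0 * label, 1.0) for _ in range(10)] for label in y]

cars, aucs = repeated_subsampling(X, y, fit_centroid)
print(f"mean test CAR = {mean(cars):.2f}%, mean test AUC = {mean(aucs):.3f}")
```

Because every model would be scored on the same 30 splits when the same seed is reused, the per-round CAR or AUC values of two models form paired samples, which is the setting in which a Wilcoxon signed-rank comparison applies.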
Funding Information
  • National Foundation for Science and Technology Development (105.08-2017.302)