Supervised Machine Learning for Estimation of Total Suspended Solids in Urban Watersheds

Open Access

10 January 2021

journal article
research article
Published by MDPI AG in Water

Vol. 13 (2), 147
https://doi.org/10.3390/w13020147

Abstract

Machine Learning (ML) algorithms provide an alternative for the prediction of pollutant concentration. We compared eight ML algorithms (Linear Regression (LR), uniform weighting k-Nearest Neighbor (UW-kNN), variable weighting k-Nearest Neighbor (VW-kNN), Support Vector Regression (SVR), Artificial Neural Network (ANN), Regression Tree (RT), Random Forest (RF), and Adaptive Boosting (AdB)) to evaluate the feasibility of ML approaches for estimation of Total Suspended Solids (TSS) using the National Stormwater Quality Database. Six factors were used as features to train the algorithms, with TSS concentration as the target parameter: drainage area, land use, percent of imperviousness, rainfall depth, runoff volume, and antecedent dry days. Comparisons among the ML methods demonstrated a high degree of variability in model performance, with the coefficient of determination (R²) and Nash–Sutcliffe efficiency (NSE) values ranging from 0.15 to 0.77. The Root Mean Square Error (RMSE) values ranged from 110 mg/L to 220 mg/L. The best fit was obtained using the AdB and RF models, with R² values of 0.77 and 0.74, respectively, in the training step and 0.67 and 0.64 in the prediction step. The corresponding NSE values were 0.76 and 0.72 in the training step and 0.67 and 0.62 in the prediction step. The predictions from AdB were sensitive to all six factors, although the degree of sensitivity varied among them.
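
The three goodness-of-fit measures named in the abstract (R², NSE, and RMSE) can be sketched in a few lines of Python. This is only an illustration, not the authors' code. It assumes R² is computed as the squared Pearson correlation between observed and predicted values, a common convention in hydrology that the abstract does not spell out. The function names and the sample TSS values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root Mean Square Error, in the same units as the data (e.g. mg/L)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R² for this sketch)."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov ** 2 / (var_obs * var_pred)


# Hypothetical TSS concentrations in mg/L, for illustration only.
observed = [50.0, 120.0, 200.0, 80.0]
predicted = [60.0, 110.0, 180.0, 90.0]

print("RMSE:", rmse(observed, predicted))
print("NSE: ", nse(observed, predicted))
print("R²:  ", r_squared(observed, predicted))
```

Under this definition, R² measures only how well predictions track the observations linearly, while NSE also penalizes bias, which is one reason the two are often reported side by side.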

Funding Information

South Dakota Board of Regents (MA2000007)

Cited by 16 articles