Analysing the accuracy of machine learning techniques to develop an integrated influent time series model: case study of a sewage treatment plant, Malaysia
- 1 April 2018
- journal article
- research article
- Published by Springer Science and Business Media LLC in Environmental Science and Pollution Research
- Vol. 25 (12), 12139-12149
- https://doi.org/10.1007/s11356-018-1438-z
Abstract
The function of a sewage treatment plant is to treat the sewage to acceptable standards before being discharged into the receiving waters. To design and operate such plants, it is necessary to measure and predict the influent flow rate. In this research, the influent flow rate of a sewage treatment plant (STP) was modelled and predicted by autoregressive integrated moving average (ARIMA), nonlinear autoregressive network (NAR) and support vector machine (SVM) regression time series algorithms. To evaluate the models' accuracy, the root mean square error (RMSE) and coefficient of determination (R²) were calculated as initial assessment measures, while relative error (RE), peak flow criterion (PFC) and low flow criterion (LFC) were calculated as final evaluation measures to demonstrate the detailed accuracy of the selected models. An integrated model was developed based on the individual models' prediction ability for low, average and peak flow. An initial assessment of the results showed that the ARIMA model was the least accurate and the NAR model was the most accurate. The RE results also prove that the SVM model's frequency of errors above 10% or below -10% was greater than the NAR model's. The influent was also forecasted up to 44 weeks ahead by both models. The graphical results indicate that the NAR model made better predictions than the SVM model. The final evaluation of NAR and SVM demonstrated that SVM made better predictions at peak flow and NAR fit well for low and average inflow ranges. The integrated model developed includes the NAR model for low and average influent and the SVM model for peak inflow.
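The initial assessment measures named in the abstract (RMSE, R², and the share of relative errors outside ±10%) and the idea of an integrated model can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes R² is computed as 1 − SS_res/SS_tot and that RE is (predicted − observed)/observed in percent, neither of which the abstract specifies. The function `integrate_predictions`, its `peak_threshold` parameter, and the rule of using the NAR forecast to decide whether a time step counts as peak flow are hypothetical choices made only for illustration; PFC and LFC are omitted because the abstract does not give their formulas.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between observed and predicted flows."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def relative_error_percent(observed, predicted):
    """Relative error in percent (assumed sign: predicted minus observed)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return (predicted - observed) / observed * 100.0


def fraction_outside_band(re_percent, band=10.0):
    """Share of time steps whose relative error lies above +band or below -band."""
    re_percent = np.asarray(re_percent, dtype=float)
    return float(np.mean(np.abs(re_percent) > band))


def integrate_predictions(nar_pred, svm_pred, peak_threshold):
    """Hypothetical integration rule: use the SVM forecast where the NAR
    forecast reaches peak_threshold, and the NAR forecast otherwise."""
    nar_pred = np.asarray(nar_pred, dtype=float)
    svm_pred = np.asarray(svm_pred, dtype=float)
    return np.where(nar_pred >= peak_threshold, svm_pred, nar_pred)


if __name__ == "__main__":
    # Made-up numbers for demonstration only, not data from the study.
    observed = [100.0, 200.0, 300.0, 400.0]
    predicted = [105.0, 190.0, 300.0, 460.0]
    re = relative_error_percent(observed, predicted)
    print("RMSE:", rmse(observed, predicted))
    print("R^2:", r_squared(observed, predicted))
    print("Share of |RE| > 10%:", fraction_outside_band(re))
    print("Integrated:", integrate_predictions([100, 200, 450], [120, 210, 480], 400))
```

In practice the threshold separating peak flow from low and average flow would have to come from the plant's own flow records; the value 400 above is a placeholder.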
Funding Information
- Universiti Malaya (FL001-13SUS)
- Ministry of Higher Education, Malaysia (FP016-2014A)