The Impact of Data Filtration on the Accuracy of Multiple Time-Domain Forecasting for Photovoltaic Power Plants Generation

Abstract
This paper presents a multiple time-domain forecasting model for photovoltaic power plants, developed in response to the need for accurate and robust power generation forecasts on bad weather days. We briefly describe the piloted short-term forecasting system and closely examine the main sources of error in photovoltaic power plant generation forecasts. We compare the effectiveness of an empirical approach with that of unsupervised learning for filtering source data, with the aim of improving power generation forecasting accuracy under unstable weather conditions. Based on clustering results associated with particular weather and seasonal conditions, the k-nearest neighbors method is justified as the optimal choice for initial data filtration. We further investigate the improvement in photovoltaic power plant forecasting accuracy for the one-hour-ahead time domain. We show that operational forecasting can be implemented by predicting the mismatches of the short-term day-ahead forecast; these predictions form the basis for integrated multiple time-domain forecasting tools. After comparing several time series forecasting approaches, we implement operational forecasting with a second-order autoregression function applied to the short-term forecasting errors, with a resulting accuracy of 87%. The article concludes by proposing a hardware system composition chosen with regard to computational efficiency and scalability.
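The idea of applying a second-order autoregression to short-term forecasting errors can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the error is defined as actual minus day-ahead forecast, that the model includes an intercept and is fitted by ordinary least squares, and that the operational forecast is the day-ahead value plus the predicted error. The function names and the sample numbers are hypothetical.

```python
import numpy as np


def fit_ar2(errors):
    """Least-squares fit of e[t] = c + a1*e[t-1] + a2*e[t-2] (assumed model form)."""
    e = np.asarray(errors, dtype=float)
    if e.size < 5:
        raise ValueError("need at least 5 error samples to fit an AR(2) model")
    # Design matrix columns: intercept, lag-1 error, lag-2 error.
    X = np.column_stack([np.ones(e.size - 2), e[1:-1], e[:-2]])
    y = e[2:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # array (c, a1, a2)


def predict_next_error(errors, coef):
    """One-step-ahead error prediction from the two most recent errors."""
    c, a1, a2 = coef
    return c + a1 * errors[-1] + a2 * errors[-2]


def corrected_forecast(day_ahead_next, errors, coef):
    """Assumed combination rule: day-ahead forecast plus predicted error."""
    return day_ahead_next + predict_next_error(errors, coef)


if __name__ == "__main__":
    # Hypothetical hourly values in MW, for illustration only.
    actual = np.array([4.1, 5.0, 5.6, 5.2, 4.4, 3.9, 3.1, 2.6])
    day_ahead = np.array([4.5, 5.3, 6.1, 5.9, 5.0, 4.2, 3.6, 2.9])
    errors = actual - day_ahead
    coef = fit_ar2(errors)
    # 2.1 MW is a hypothetical day-ahead forecast for the next hour.
    print(corrected_forecast(2.1, errors, coef))
```

In practice the coefficients would be re-estimated on a rolling window of recent errors, so that the correction follows the current weather regime. The window length is a tuning choice that this sketch does not address.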