Comparison and Adaptation of Two Strategies for Anomaly Detection in Load Profiles Based on Methods from the Fields of Machine Learning and Statistics
Open Access
- 1 January 2021
- journal article
- research article
- Published by Scientific Research Publishing, Inc. in Open Journal of Energy Efficiency
- Vol. 10 (02), 37-49
- https://doi.org/10.4236/ojee.2020.102003
Abstract
The Federal Office for Economic Affairs and Export Control (BAFA) of Germany promotes digital concepts for increasing energy efficiency as part of the “Pilotprogramm Einsparzähler”. Within this program, Limón GmbH is developing software solutions in cooperation with the University of Kassel to identify efficiency potentials in load profiles by means of automated anomaly detection. This study therefore evaluates two strategies for anomaly detection in load profiles. To estimate the monthly load profile, strategy 1 uses the artificial neural network LSTM (Long Short-Term Memory), with a data period of one month (1 M) or three months (3 M), and strategy 2 uses the smoothing method PEWMA (Probabilistic Exponentially Weighted Moving Average). The estimate is compared with the original load profile data, and single residuals, or residuals summed over sequence lengths of two, four, six and eight hours, are flagged as anomalies when they exceed a predefined threshold. The thresholds are defined by the Z-Score test, i.e., residuals greater than 2, 2.5 or 3 standard deviations are considered anomalous. Furthermore, the ESD (Extreme Studentized Deviate) test is used to set thresholds at three significance levels of 0.05, 0.10 and 0.15, with a maximum of k = 40 iterations. Five load profiles are examined, which were obtained with the clustering method k-Means as a representative sample from all available data sets of Limón GmbH. The evaluation shows that, on single residuals, strategy 1 achieved a maximum F1-value of 0.4 (1 M) and, across all examined companies, a maximum average F1-value of 0.24 with a standard deviation of 0.09 (1 M). In variant 3 M, the highest average F1-value, 0.21 with a standard deviation of 0.06, was achieved for residuals summed over a partial sequence length of four hours.
The PEWMA-based strategy 2 did not show higher anomaly detection efficacy than strategy 1 for any of the investigated companies.
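The residual thresholding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `summed_residuals` and `zscore_flags`, the hourly sampling, the use of signed residuals, and the two-sided comparison on the absolute Z-score are all assumptions made for illustration. The abstract only states that residuals (or residuals summed over two to eight hours) beyond 2, 2.5 or 3 standard deviations count as anomalous.

```python
import numpy as np

def summed_residuals(actual, estimate, window=1):
    """Residuals (actual - estimate), summed over a sliding window.

    `window` is the number of consecutive samples per sum. With the
    hourly sampling assumed here, window=4 corresponds to four hours.
    """
    residuals = np.asarray(actual, dtype=float) - np.asarray(estimate, dtype=float)
    if window == 1:
        return residuals
    # Element i of the result is the sum of residuals[i : i + window].
    return np.convolve(residuals, np.ones(window), mode="valid")

def zscore_flags(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds `threshold`."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # No spread in the residuals, so nothing can be flagged.
        return np.zeros(values.shape, dtype=bool)
    return np.abs((values - values.mean()) / std) > threshold

# Hypothetical data: a flat estimated profile and one spike in the measurement.
estimate = np.full(100, 10.0)
actual = estimate.copy()
actual[50] += 5.0

single_flags = zscore_flags(summed_residuals(actual, estimate), threshold=3.0)
window_flags = zscore_flags(summed_residuals(actual, estimate, window=4), threshold=3.0)
```

In this toy case `single_flags` marks the spike itself, while `window_flags` marks every four-sample window that contains it. The estimate passed in could come from either strategy (LSTM or PEWMA); the thresholding step does not depend on how the estimate was produced.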