Comparison and Adaptation of Two Strategies for Anomaly Detection in Load Profiles Based on Methods from the Fields of Machine Learning and Statistics

Open Access

1 January 2021

journal article
research article
Published by Scientific Research Publishing, Inc. in Open Journal of Energy Efficiency

Vol. 10 (02), 37-49
https://doi.org/10.4236/ojee.2020.102003

Abstract

The Federal Office for Economic Affairs and Export Control (BAFA) of Germany promotes digital concepts for increasing energy efficiency as part of the “Pilotprogramm Einsparzähler”. Within this program, Limón GmbH, in cooperation with the University of Kassel, is developing software that identifies efficiency potentials in load profiles by means of automated anomaly detection. To this end, this study evaluates two strategies for anomaly detection in load profiles. To estimate the monthly load profile, strategy 1 uses the artificial neural network LSTM (Long Short-Term Memory), with a data period of one month (1 M) or three months (3 M), and strategy 2 uses the smoothing method PEWMA (Probabilistic Exponentially Weighted Moving Average). The estimates are compared with the original load profile data; single residuals, or residuals summed over sequence lengths of two, four, six and eight hours, are flagged as anomalies when they exceed a predefined threshold. The thresholds are defined by the Z-score test, i.e., residuals greater than 2, 2.5 or 3 standard deviations are considered anomalous. In addition, the ESD (Extreme Studentized Deviate) test is used to set thresholds at three significance levels of 0.05, 0.10 and 0.15, with a maximum of k = 40 iterations. Five load profiles are examined, selected with the k-means clustering method as a representative sample of all data sets available at Limón GmbH. The evaluation shows that, for single residuals, strategy 1 achieved a maximum F1 value of 0.4 (1 M) and, averaged over all examined companies, a maximum F1 value of 0.24 with a standard deviation of 0.09 (1 M). In the 3 M variant, the highest F1 value, an average of 0.21 with a standard deviation of 0.06, was achieved for residuals summed over a partial sequence length of four hours. The PEWMA-based strategy 2 did not show higher anomaly detection efficacy than strategy 1 for any of the investigated companies.
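
The residual thresholding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `summed_residuals` and `zscore_flags` are hypothetical, the two-sided test on the absolute Z-score is an assumption (the abstract only says "greater than" a number of standard deviations), and the window is given in samples because the abstract does not state the meter resolution.

```python
import numpy as np


def summed_residuals(actual, predicted, window=1):
    """Residuals (actual - predicted), summed over sliding windows.

    `window` is a number of samples; window=1 returns the single residuals.
    How many samples make up two, four, six or eight hours depends on the
    sampling interval of the load profile (not stated in the abstract).
    """
    residuals = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    if window == 1:
        return residuals
    # 'valid' keeps only windows that lie fully inside the series;
    # entry i is the sum of residuals[i : i + window].
    return np.convolve(residuals, np.ones(window), mode="valid")


def zscore_flags(values, n_sigma=3.0):
    """Flag values whose absolute Z-score exceeds n_sigma (assumed two-sided)."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # A constant series has no outliers under this test.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > n_sigma


# Synthetic example: a flat estimated profile and a measured profile
# with a short excess load in samples 40..43.
predicted = np.full(96, 10.0)
actual = predicted.copy()
actual[40:44] += 5.0

single = zscore_flags(summed_residuals(actual, predicted, window=1), n_sigma=3.0)
summed = zscore_flags(summed_residuals(actual, predicted, window=4), n_sigma=3.0)
print(np.flatnonzero(single), np.flatnonzero(summed))
```

Lowering `n_sigma` to 2 or 2.5 flags more windows, which mirrors the trade-off between the three Z-score thresholds examined in the study. The ESD test and the LSTM and PEWMA estimators are outside the scope of this sketch.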
