Optimizing Small-Sample Disk Fault Detection Based on LSTM-GAN Model

Open Access

23 January 2022

journal article
research article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization

Vol. 19 (1), 1-24
https://doi.org/10.1145/3500917

Abstract

In recent years, disk fault detection based on SMART data combined with various machine learning algorithms has proven effective. However, these methods require a large amount of data. In the early stages of establishing a data center or deploying new storage devices, disk reliability data is relatively limited, and failed-disk data is scarcer still, so machine learning algorithms deliver unsatisfactory detection performance. To solve these problems, we propose a novel small-sample disk fault detection (SSDFD) optimization method based on Generative Adversarial Networks (GANs). To suit the characteristics of hard disk reliability data, the generator of the original GAN is improved with Long Short-Term Memory (LSTM), making it suitable for generating failed-disk data. To alleviate data imbalance and expand the failed-disk dataset from a small amount of original data, the proposed model is trained adversarially, with a focus on generating failed-disk data. Experimental results on real HDD datasets show that SSDFD can generate enough virtual failed-disk data to enable machine learning algorithms to detect disk faults with increased accuracy when only a few original failed-disk samples are available. Furthermore, the model trained with 300 original failed-disk samples significantly improves the accuracy of HDD fault detection. The optimal amount of generated virtual data is 20–30 times that of the original data.
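
As a rough illustration of the augmentation ratio quoted in the abstract, the following Python sketch computes how many virtual failed-disk samples a 20–30x ratio would imply. It assumes, purely for illustration, that the ratio is applied to the 300 original failed-disk samples mentioned above; the abstract does not state this pairing explicitly, and the helper name `augmented_counts` is hypothetical, not taken from the paper.

```python
from typing import Tuple


def augmented_counts(n_original: int, ratio: int) -> Tuple[int, int]:
    """Return (generated, total) failed-disk sample counts.

    n_original: number of real failed-disk samples (assumed 300 below).
    ratio: generated-to-original multiplier (the abstract reports 20-30).
    """
    if n_original < 0 or ratio < 0:
        raise ValueError("n_original and ratio must be non-negative")
    generated = n_original * ratio
    return generated, n_original + generated


# Illustrative only: applying both ends of the reported range to 300 samples.
for ratio in (20, 30):
    generated, total = augmented_counts(300, ratio)
    print(f"ratio {ratio}x: {generated} generated, {total} failed-disk samples in total")
```

Under that assumption, the failed-disk class would grow from 300 samples to between 6,300 and 9,300 samples once the generated data is added.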

Funding Information

National Key Research and Development Plan of China (2016YFB1000303)

