A fuzzy data augmentation technique to improve regularisation

8 November 2021

journal article
research article
Published by Hindawi Limited in International Journal of Intelligent Systems

Vol. 37 (8), 4561-4585
https://doi.org/10.1002/int.22731

Abstract

Deep learning (DL) has achieved superior classification performance in many applications owing to its ability to extract features from data. However, this success comes with a risk of overfitting: a model biased towards the data seen during training generalises poorly. One remedy is to provide enough training data that the classifier becomes invariant to many data patterns. In the literature, data augmentation has been used as a form of regularisation to reduce the chance that a model overfits. However, most of the relevant work focuses on image, sound or text data; little addresses numerical data augmentation, although many real-world problems involve numerical data. In this paper, we propose a technique based on Fuzzy C-Means clustering and fuzzy membership grades. Fuzzy techniques are used to address the variance problem by generating new data items from fuzzy numbers and from each data item's membership in different fuzzy clusters. The augmentation technique is used to improve the generalisation of a Deep Neural Network suited to numerical data. By combining the proposed fuzzy data augmentation technique with Dropout regularisation, we manage to balance the classification model's bias-variance tradeoff. The proposed technique is evaluated on four popular data sets and is shown to provide better regularisation and higher classification accuracy than popular regularisation approaches.
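
The abstract does not spell out the augmentation rule, so the following is only a minimal sketch of the general idea, not the authors' method. It implements the standard Fuzzy C-Means updates with NumPy, then generates one new item per sample by moving it part of the way towards its membership-weighted cluster centre. The step size is drawn from a triangular distribution as a stand-in for a triangular fuzzy number. The names `fuzzy_c_means`, `augment` and `spread`, the toy data, and the augmentation rule itself are assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means; returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    # Random initial membership matrix; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m  # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centre.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def augment(X, y, centers, U, spread=0.1, seed=0):
    """Hypothetical rule: nudge each sample towards its membership-weighted centre."""
    rng = np.random.default_rng(seed)
    target = U @ centers  # membership-weighted centre per sample
    # Step size in [0, spread], peaked at spread / 2 (assumed triangular shape).
    step = rng.triangular(0.0, spread / 2.0, spread, size=(X.shape[0], 1))
    X_new = X + step * (target - X)
    # New items inherit the label of the sample they were derived from.
    return np.vstack([X, X_new]), np.concatenate([y, y])


# Toy numerical data: two Gaussian blobs (illustrative only).
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 0.5, (50, 2)),
               data_rng.normal(3.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

centers, U = fuzzy_c_means(X, n_clusters=2)
X_aug, y_aug = augment(X, y, centers, U)
print(X_aug.shape, y_aug.shape)
```

In this sketch the augmented set doubles the training data while keeping every new item close to its source sample. A small `spread` keeps the perturbation mild, and the combination with Dropout described in the abstract would happen later, inside the network that is trained on `X_aug`.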
