Lexical Data Augmentation for Text Classification in Deep Learning

Abstract
This paper presents part-of-speech focused lexical substitution for data augmentation (PLSDA), a technique for improving the predictive performance of deep learning models. PLSDA uses part-of-speech information to identify candidate words and applies different augmentation strategies to find semantically related substitutes, generating new instances for training. PLSDA is evaluated on a variety of datasets across different text classification tasks. When PLSDA is applied to four deep learning models, classifiers trained with it achieve an average accuracy improvement of 1.3%.
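
The general idea of part-of-speech focused lexical substitution can be illustrated with a minimal sketch. This is not the authors' implementation: the toy POS lexicon, the toy synonym table, the choice of target tags, and the names `augment`, `n_new`, and `p_replace` are all assumptions made for illustration. A real system would presumably use a trained POS tagger and a thesaurus or embedding-based lookup in place of the hard-coded dictionaries.

```python
import random

# Hypothetical toy resources standing in for a POS tagger and a
# source of semantically related words (assumptions, not from the paper).
TOY_POS = {
    "the": "DET", "movie": "NOUN", "plot": "NOUN",
    "was": "VERB", "great": "ADJ", "boring": "ADJ",
}
TOY_SYNONYMS = {
    ("movie", "NOUN"): ["film", "picture"],
    ("plot", "NOUN"): ["storyline"],
    ("great", "ADJ"): ["excellent", "superb"],
    ("boring", "ADJ"): ["dull", "tedious"],
}
# Assumed set of POS tags eligible for substitution.
TARGET_POS = {"NOUN", "ADJ"}


def augment(tokens, label, n_new=2, p_replace=0.5, seed=0):
    """Return n_new (tokens, label) pairs built by POS-guided substitution.

    Each token whose POS tag is in TARGET_POS and that has at least one
    candidate substitute is replaced with probability p_replace.
    The label is copied unchanged to every new instance.
    """
    rng = random.Random(seed)
    new_instances = []
    for _ in range(n_new):
        new_tokens = []
        for tok in tokens:
            pos = TOY_POS.get(tok.lower())
            candidates = TOY_SYNONYMS.get((tok.lower(), pos), [])
            if pos in TARGET_POS and candidates and rng.random() < p_replace:
                new_tokens.append(rng.choice(candidates))
            else:
                new_tokens.append(tok)
        new_instances.append((new_tokens, label))
    return new_instances


if __name__ == "__main__":
    sentence = ["The", "movie", "was", "great"]
    for new_tokens, new_label in augment(sentence, "positive"):
        print(" ".join(new_tokens), "->", new_label)
```

In this sketch the augmented instances would be appended to the original training set before fitting a classifier; restricting substitution to selected POS tags is one plausible way to keep function words and sentence structure intact.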
