Lexical Data Augmentation for Text Classification in Deep Learning
- 6 May 2020
- book chapter
- conference paper
- Published by Springer Science and Business Media LLC in Lecture Notes in Computer Science
Abstract
This paper presents our work on using part-of-speech focused lexical substitution for data augmentation (PLSDA) to enhance the prediction capabilities and performance of deep learning models. It explains how PLSDA uses part-of-speech information to identify words and then applies different augmentation strategies to find semantically related substitutions, generating new instances for training. PLSDA is evaluated on a variety of datasets across different text classification tasks. When PLSDA is applied to four deep learning models, results show that classifiers trained with PLSDA achieve a 1.3% accuracy improvement on average.
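The general idea described in the abstract (select words by part of speech, then swap in semantically related words to create new training instances) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny `TOY_POS` and `TOY_SYNONYMS` tables are hypothetical stand-ins for a real part-of-speech tagger and a lexical resource such as WordNet, and the uniform random choice among candidates is only one assumed augmentation strategy. The function name `augment` and its parameters are likewise illustrative.

```python
import random

# Hypothetical toy resources (assumptions for illustration only).
# A real pipeline would use a POS tagger and a lexical database instead.
TOY_POS = {
    "the": "DET", "movie": "NOUN", "was": "VERB",
    "great": "ADJ", "plot": "NOUN", "boring": "ADJ",
}
TOY_SYNONYMS = {
    ("movie", "NOUN"): ["film", "picture"],
    ("great", "ADJ"): ["excellent", "superb"],
    ("plot", "NOUN"): ["story"],
    ("boring", "ADJ"): ["dull", "tedious"],
}


def augment(tokens, target_pos=("NOUN", "ADJ"), n_new=2, seed=0):
    """Return n_new token lists in which words of the targeted
    parts of speech are replaced by a related word, when one is known."""
    rng = random.Random(seed)  # seeded for reproducible sampling
    new_instances = []
    for _ in range(n_new):
        new_tokens = []
        for tok in tokens:
            pos = TOY_POS.get(tok.lower())
            candidates = TOY_SYNONYMS.get((tok.lower(), pos), [])
            if pos in target_pos and candidates:
                # Assumed strategy: pick one candidate uniformly at random.
                new_tokens.append(rng.choice(candidates))
            else:
                # Words outside the targeted POS classes stay unchanged.
                new_tokens.append(tok)
        new_instances.append(new_tokens)
    return new_instances


if __name__ == "__main__":
    for instance in augment("the movie was great".split(), n_new=3):
        print(" ".join(instance))
```

In a classification setting, each generated instance would keep the label of the sentence it was derived from, so the training set grows without additional annotation.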
This publication has 4 references indexed in Scilit:
- EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Published by Association for Computational Linguistics (ACL), 2019
- That's So Annoying!!!: A Lexical and Frame-Semantic Embedding Based Data Augmentation Approach to Automatic Categorization of Annoying Behaviors using #petpeeve Tweets. Published by Association for Computational Linguistics (ACL), 2015
- WordNet. Published by Wiley, 2012
- Feature-rich part-of-speech tagging with a cyclic dependency network. Published by Association for Computational Linguistics (ACL), 2003