Lexical Data Augmentation for Text Classification in Deep Learning

6 May 2020

book chapter
conference paper
Published by Springer Science and Business Media LLC in Lecture Notes in Computer Science

p. 521-527
https://doi.org/10.1007/978-3-030-47358-7_53

Abstract

This paper presents our work on using part-of-speech focused lexical substitution for data augmentation (PLSDA) to enhance the prediction capabilities and the performance of deep learning models. It explains how PLSDA uses part-of-speech information to identify words and applies different augmentation strategies to find semantically related substitutions, generating new instances for training. PLSDA is evaluated on a variety of datasets across different text classification tasks. When PLSDA is applied to four deep learning models, results show that classifiers trained with PLSDA achieve a 1.3% accuracy improvement on average.
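
The general idea described in the abstract (select words by part of speech, then replace them with semantically related words to create new training instances) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the tag table `TOY_POS`, the substitution table `TOY_SYNONYMS`, the function `augment`, and its parameters are all hypothetical names, and the tiny hand-written tables stand in for a real part-of-speech tagger and lexical resource. The abstract does not specify which augmentation strategies PLSDA uses, so uniform random choice among candidates is assumed here.

```python
import random

# Hypothetical toy resources standing in for a real POS tagger and a
# lexical database. Neither table comes from the paper.
TOY_POS = {
    "the": "DET", "movie": "NOUN", "was": "VERB", "good": "ADJ",
    "and": "CONJ", "acting": "NOUN", "great": "ADJ",
}
TOY_SYNONYMS = {
    ("movie", "NOUN"): ["film", "picture"],
    ("good", "ADJ"): ["fine", "decent"],
    ("great", "ADJ"): ["excellent", "superb"],
}


def augment(tokens, target_pos=("NOUN", "ADJ"), n_new=2, seed=0):
    """Return up to n_new new token lists built by POS-focused substitution.

    Assumption: every eligible word is replaced by a uniformly random
    candidate; the paper's actual selection strategies may differ.
    """
    rng = random.Random(seed)
    # Positions whose tag is targeted and that have substitution candidates.
    eligible = [
        i for i, tok in enumerate(tokens)
        if TOY_POS.get(tok) in target_pos
        and (tok, TOY_POS[tok]) in TOY_SYNONYMS
    ]
    if not eligible:
        return []  # nothing to substitute, so no new instances
    new_instances = []
    for _ in range(n_new):
        out = list(tokens)  # copy so the original instance is untouched
        for i in eligible:
            key = (tokens[i], TOY_POS[tokens[i]])
            out[i] = rng.choice(TOY_SYNONYMS[key])
        new_instances.append(out)
    return new_instances


if __name__ == "__main__":
    sentence = "the movie was good".split()
    for variant in augment(sentence):
        print(" ".join(variant))
```

Each generated variant keeps the label of the original instance, so the augmented set can be appended to the training data of any text classifier. Restricting substitution by part of speech keeps function words such as "the" and "was" fixed, which is one plausible way to limit changes to the sentence's structure.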
