Unsupervised learning in cross-corpus acoustic emotion recognition

Abstract
One of the ever-present bottlenecks in automatic emotion recognition is data sparseness. We therefore investigate the suitability of unsupervised learning for cross-corpus acoustic emotion recognition in a large-scale study with six commonly used databases, covering both acted and natural emotional speech as well as a variety of application scenarios and acoustic conditions. We show that adding unlabeled emotional speech to agglomerated multi-corpus training sets can enhance recognition performance even in this challenging cross-corpus setting. Furthermore, in leave-one-corpus-out validation, the expected gain from adding unlabeled data is on average approximately half that achieved by adding manually labeled data.
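The leave-one-corpus-out protocol mentioned above can be sketched as follows. This is an illustrative toy example, not the paper's implementation: the corpus names, feature vectors, and the majority-class baseline classifier are all hypothetical placeholders.

```python
def leave_one_corpus_out(corpora):
    """corpora: dict mapping corpus name -> list of (features, label) pairs.
    Yields (held_out_name, train_set, test_set) splits, where the training
    set agglomerates all remaining corpora."""
    for test_name in corpora:
        train = [ex for name, data in corpora.items()
                 if name != test_name
                 for ex in data]
        yield test_name, train, corpora[test_name]


def majority_baseline(train):
    """Trivial stand-in for an acoustic emotion classifier:
    predict the most frequent label in the agglomerated training set."""
    labels = [label for _, label in train]
    return max(set(labels), key=labels.count)


# Hypothetical toy data: in the study, each corpus would hold acoustic
# feature vectors with emotion labels.
corpora = {
    "CorpusA": [([0.1], "neg"), ([0.2], "pos")],
    "CorpusB": [([0.3], "pos"), ([0.4], "pos")],
    "CorpusC": [([0.5], "neg")],
}

for test_name, train, test in leave_one_corpus_out(corpora):
    pred = majority_baseline(train)
    acc = sum(pred == label for _, label in test) / len(test)
    print(f"held out {test_name}: baseline accuracy {acc:.2f}")
```

In this scheme the held-out corpus contributes no labeled data to training, which is what makes the setting cross-corpus; the study's question is how much unlabeled data from other sources can close the resulting gap.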
