Sparse Kernel Reduced-Rank Regression for Bimodal Emotion Recognition From Facial Expression and Speech

21 April 2016

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Multimedia

Vol. 18 (7), 1319-1329
https://doi.org/10.1109/tmm.2016.2557721

Abstract

This paper proposes a novel approach to bimodal emotion recognition from facial expression and speech, based on a sparse kernel reduced-rank regression (SKRRR) fusion method. We use the openSMILE feature extractor and the scale-invariant feature transform (SIFT) descriptor to extract features from the speech modality and the facial expression modality, respectively, and then fuse the emotion features of the two modalities with SKRRR. The proposed SKRRR method is a nonlinear extension of traditional reduced-rank regression (RRR), in which both the predictor and the response feature vectors are kernelized by mapping them into two high-dimensional feature spaces via two separate nonlinear mappings. To solve the SKRRR problem, we propose a sparse representation (SR)-based approach for finding the optimal coefficient matrices, where the SR technique is introduced to account for the different contributions of individual training samples to the optimal solution. Finally, we use the eNTERFACE '05 and AFEW 4.0 bimodal emotion databases to conduct monomodal and bimodal emotion recognition experiments. The results indicate that the proposed approach achieves the highest, or a comparable, bimodal emotion recognition rate relative to several state-of-the-art approaches.
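
For context, the traditional RRR that the abstract names as the starting point of SKRRR can be sketched as below. This is a minimal NumPy illustration of classical linear RRR with identity weighting only; it is not the paper's kernelized, sparse method. The function name, the synthetic data, and the chosen rank are assumptions made for illustration.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Classical linear RRR with identity weighting (illustrative helper).

    Minimizes ||Y - X @ B||_F subject to rank(B) <= rank.
    """
    # Unconstrained least-squares coefficients, shape (p, q).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # The leading right singular vectors of the fitted responses span
    # the best rank-constrained response subspace.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # shape (q, rank)
    # Project the least-squares solution onto that subspace.
    return B_ols @ V_r @ V_r.T             # shape (p, q), rank <= `rank`


# Synthetic data (assumed for illustration): a rank-2 linear map plus noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))         # stand-in predictor features
B_true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 6))
Y = X @ B_true + 0.01 * rng.standard_normal((200, 6))  # stand-in responses

B_hat = reduced_rank_regression(X, Y, rank=2)
```

According to the abstract, SKRRR replaces both the predictor and the response vectors with kernel-induced feature maps and solves for the coefficient matrices with a sparse representation-based approach; neither step is attempted in this sketch.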

Funding Information

National Basic Research Program of China (2015CB351704)
National Natural Science Foundation of China (61231002, 61501249)
Natural Science Foundation of Jiangsu Province (BK20150855, BK20130020)
Ph.D. Program Foundation of the Ministry of Education of China (20120092110054)
Natural Science Foundation for Jiangsu Higher Education Institutions (15KJB510022)
NUPTSF (NY214143)

Cited by 76 articles