Facial Expression Recognition Using Weighted Mixture Deep Neural Network Based on Double-Channel Facial Images

Open Access

31 December 2017

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Access

Vol. 6, 4630-4640
https://doi.org/10.1109/ACCESS.2017.2784096

Abstract

Facial expression recognition (FER) is an important task that lets machines understand emotional changes in human beings. However, hand-crafted features that accurately track changes in expression are difficult to extract because of individual differences and variations in emotional intensity. Features that can accurately describe changes in facial expression are therefore needed.

Method: A weighted mixture deep neural network (WMDNN) is proposed to automatically extract features that are effective for FER tasks. Several pre-processing steps, such as face detection, rotation rectification, and data augmentation, are applied to restrict the regions used for FER. WMDNN processes two channels of facial images: facial grayscale images and their corresponding local binary pattern (LBP) facial images. Expression-related features of the grayscale images are extracted by fine-tuning a partial VGG16 network whose parameters are initialized from a VGG16 model trained on the ImageNet database. Features of the LBP images are extracted by a shallow convolutional neural network (CNN) built on DeepID. The outputs of both channels are fused in a weighted manner, and the final recognition result is computed using softmax classification.

Results: Experimental results indicate that the proposed algorithm recognizes six basic facial expressions (happiness, sadness, anger, disgust, fear, and surprise) with high accuracy. The average recognition accuracies on the benchmark data sets CK+, JAFFE, and Oulu-CASIA are 0.970, 0.922, and 0.923, respectively.

Conclusions: The proposed FER method outperforms state-of-the-art FER methods that are based on hand-crafted features or on single-channel deep networks. Compared with deep networks that use multiple channels, the proposed network achieves comparable performance with simpler procedures. Fine-tuning a well pre-trained model is effective for FER tasks when sufficient samples cannot be collected.
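
The two-channel input and the weighted fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `lbp_image` and `fuse_channels`, the basic 3x3 LBP variant, the convex-combination form of the fusion, and the default weight of 0.5 are all assumptions made for illustration, since the abstract does not specify them. The two channel outputs stand in for the score vectors that the partial VGG16 branch and the shallow CNN branch would produce.

```python
import numpy as np

# The six basic expressions named in the abstract.
EXPRESSIONS = ["happiness", "sadness", "anger", "disgust", "fear", "surprise"]


def lbp_image(gray):
    """Basic 3x3 local binary pattern (8 neighbours, radius 1).

    Bit k of each output code is 1 when the k-th neighbour is >= the
    centre pixel. Border pixels are dropped, so the result has shape
    (H - 2, W - 2). The neighbour ordering is a convention chosen here.
    """
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Offsets (dy, dx), clockwise starting from the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    scores = np.asarray(scores, dtype=np.float64)
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fuse_channels(gray_out, lbp_out, weight=0.5):
    """Weighted fusion of two channel outputs followed by softmax.

    Assumption: the fusion is a convex combination with a single scalar
    `weight` on the grayscale channel; the paper's exact rule and weight
    value are not given in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    gray_out = np.asarray(gray_out, dtype=np.float64)
    lbp_out = np.asarray(lbp_out, dtype=np.float64)
    fused = weight * gray_out + (1.0 - weight) * lbp_out
    return softmax(fused)


if __name__ == "__main__":
    patch = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(lbp_image(patch))  # single LBP code for the centre pixel

    # Hypothetical per-class scores from the two branches.
    gray_scores = [2.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    lbp_scores = [0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
    probs = fuse_channels(gray_scores, lbp_scores, weight=0.7)
    print(EXPRESSIONS[int(np.argmax(probs))])
```

In this sketch a weight above 0.5 favours the grayscale branch, so the fused prediction follows that branch when the two disagree by equal margins. In practice the weight would be tuned on validation data.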

Funding Information

National Natural Science Foundation of China (61501060, 61703381)
Natural Science Foundation of Jiangsu Province (BK20150271)
Key Laboratory for New Technology Application of Road Conveyance of Jiangsu Province (BM20082061708)
