Facial Expression Recognition Using Weighted Mixture Deep Neural Network Based on Double-Channel Facial Images
Open Access
- 31 December 2017
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Access
- Vol. 6, 4630-4640
- https://doi.org/10.1109/ACCESS.2017.2784096
Abstract
Facial expression recognition (FER) is a significant task for machines to understand emotional changes in human beings. However, accurate hand-crafted features that are highly related to changes in expression are difficult to extract because of individual differences and variations in emotional intensity. Therefore, features that can accurately describe changes in facial expressions are urgently required.
Method: A weighted mixture deep neural network (WMDNN) is proposed to automatically extract the features that are effective for FER tasks. Several pre-processing approaches, such as face detection, rotation rectification, and data augmentation, are implemented to restrict the regions for FER. Two channels of facial images, namely facial grayscale images and their corresponding local binary pattern (LBP) facial images, are processed by WMDNN. Expression-related features of facial grayscale images are extracted by fine-tuning a partial VGG16 network, the parameters of which are initialized using a VGG16 model trained on the ImageNet database. Features of LBP facial images are extracted by a shallow convolutional neural network (CNN) based on DeepID. The outputs of both channels are fused in a weighted manner. The final recognition result is calculated using softmax classification.
Results: Experimental results indicate that the proposed algorithm can recognize six basic facial expressions (happiness, sadness, anger, disgust, fear, and surprise) with high accuracy. The average recognition accuracies for the benchmark data sets "CK+," "JAFFE," and "Oulu-CASIA" are 0.970, 0.922, and 0.923, respectively.
Conclusions: The proposed FER method outperforms state-of-the-art FER methods based on hand-crafted features or on deep networks using one channel. Compared with deep networks that use multiple channels, the proposed network achieves comparable performance with simpler procedures. Fine-tuning a well pre-trained model is effective for FER tasks when sufficient samples cannot be collected.
Funding Information
- National Natural Science Foundation of China (61501060, 61703381)
- Natural Science Foundation of Jiangsu Province (BK20150271)
- Key Laboratory for New Technology Application of Road Conveyance of Jiangsu Province (BM20082061708)
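Illustrative sketch (not from the paper): the abstract states that the outputs of the grayscale and LBP channels are fused in a weighted manner and then classified with softmax. The minimal Python example below shows one plausible reading of that step. It assumes each channel yields one score per expression, that the fusion is a convex combination, and that softmax is applied after fusion. The function names, the weight `alpha`, and its default value are hypothetical; the abstract does not give the weights or the exact point at which fusion occurs.

```python
import math

# The six basic expressions listed in the abstract.
EXPRESSIONS = ["happiness", "sadness", "anger", "disgust", "fear", "surprise"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_channels(gray_scores, lbp_scores, alpha=0.7):
    """Weighted sum of the two channels' scores, followed by softmax.

    alpha is a hypothetical weight for the grayscale channel; the
    abstract does not state the value actually used.
    """
    if len(gray_scores) != len(lbp_scores):
        raise ValueError("both channels must score the same classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    fused = [alpha * g + (1.0 - alpha) * b
             for g, b in zip(gray_scores, lbp_scores)]
    return softmax(fused)


def predict(gray_scores, lbp_scores, alpha=0.7):
    """Return the label with the highest fused probability, plus all probabilities."""
    probs = fuse_channels(gray_scores, lbp_scores, alpha)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EXPRESSIONS[best], probs


# Made-up scores: the grayscale channel favors happiness, the LBP channel surprise.
gray = [2.0, 0.1, 0.3, 0.0, 0.2, 1.0]
lbp = [0.5, 0.2, 0.1, 0.0, 0.1, 2.5]
label_gray_heavy, probs_gray_heavy = predict(gray, lbp, alpha=0.7)
label_lbp_heavy, probs_lbp_heavy = predict(gray, lbp, alpha=0.2)
```

With these made-up scores, a weight that favors the grayscale channel selects one label and a weight that favors the LBP channel selects another, which is the kind of trade-off a fusion weight controls.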