Adaptive Fusion and Category-Level Dictionary Learning Model for Multiview Human Action Recognition
- 17 April 2019
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Internet of Things Journal
- Vol. 6 (6), 9280-9293
- https://doi.org/10.1109/jiot.2019.2911669
Abstract
Human actions are often captured by multiple cameras (or sensors) to overcome the significant variations in viewpoint, background clutter, object speed, and motion patterns found in video surveillance, and action recognition systems often benefit from fusing multiple types of cameras (sensors). Adaptive fusion of information from multiple domains is therefore essential for multi-view human action recognition. Two widely applied fusion schemes are feature-level fusion and score-level fusion. Both still have limitations that leave considerable room for improvement, such as the separate computation of feature fusion and action recognition, or the use of fixed weights for each action and each camera. Previous fusion methods cannot overcome these limitations. In this work, inspired by nature, we address them for multi-view action recognition by developing a novel adaptive fusion and category-level dictionary learning model (AFCDL). The model jointly learns an adaptive weight for each camera and optimizes the reconstruction of samples for the action recognition task. To induce dictionary learning and the reconstruction of the query set (the test samples), an induced set is built for each category, and a corresponding induced regularization term is designed for the objective function. Extensive experiments on four public multi-view action benchmarks show that AFCDL significantly outperforms state-of-the-art methods, with a 3% to 10% improvement in recognition accuracy.
Funding Information
- National Natural Science Foundation of China (61872270, 61572357)
- Natural Science Foundation of Tianjin City (18JCYBJC85500)
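The two fusion schemes named in the abstract can be illustrated with a minimal Python sketch. It is a generic illustration and not the AFCDL model: feature-level fusion is shown as plain concatenation of per-camera feature vectors, and score-level fusion as a weighted average of per-camera class scores. The function names, the example scores, and the hand-picked weights are all hypothetical. The weights here are fixed by hand, which is the limitation the abstract describes; AFCDL is reported to learn them jointly with the dictionary, which this sketch does not attempt.

```python
from typing import Dict, List, Sequence


def feature_level_fusion(view_features: Sequence[Sequence[float]]) -> List[float]:
    """Feature-level fusion: concatenate per-camera feature vectors."""
    fused: List[float] = []
    for features in view_features:
        fused.extend(features)
    return fused


def score_level_fusion(
    view_scores: Sequence[Dict[str, float]],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Score-level fusion: weighted average of per-camera class scores."""
    if len(view_scores) != len(weights):
        raise ValueError("need exactly one weight per camera")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused: Dict[str, float] = {}
    for scores, weight in zip(view_scores, weights):
        for label, score in scores.items():
            # Normalise the weights so the fused scores stay on the same scale.
            fused[label] = fused.get(label, 0.0) + (weight / total) * score
    return fused


def predict(fused_scores: Dict[str, float]) -> str:
    """Return the action label with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)


# Hypothetical class scores from two cameras that disagree on the action.
camera_scores = [
    {"walk": 0.7, "run": 0.3},  # camera 1
    {"walk": 0.2, "run": 0.8},  # camera 2
]

equal_weight_label = predict(score_level_fusion(camera_scores, [1.0, 1.0]))
camera1_heavy_label = predict(score_level_fusion(camera_scores, [0.9, 0.1]))
```

With these made-up scores, the predicted label depends on the weights given to each camera, which is why weights that stay fixed for every action and every camera can be a poor fit.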