Matrix Capsule Convolutional Projection for Deep Feature Learning
- 13 October 2020
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Signal Processing Letters
- Vol. 27 (ISSN 1070-9908), pp. 1899-1903
- https://doi.org/10.1109/lsp.2020.3030550
Abstract
Capsule projection network (CapProNet) has shown its ability to obtain semantic information and spatial structural information from raw images. However, the vector capsule of CapProNet is limited in representing semantic information because it ignores local information. In addition, the number of trainable parameters grows greatly with the dimension of the feature vector. To address this, we propose a matrix capsule convolution projection (MCCP) module that replaces the feature vector with a feature matrix, each column of which represents a local feature. The feature matrix is then convolved by columns into capsule subspaces, which effectively decreases the number of trainable parameters. Furthermore, CapDetNet is designed to explore the structural information encoding of the MCCP module on an object detection task. Experimental results demonstrate that the proposed MCCP outperforms the baselines in image classification, and CapDetNet achieves a 2.3% performance gain in object detection.
Funding Information
- National Natural Science Foundation of China (61771321, 61872429)
- Department of Education of Guangdong Province (2018KCXTD027)
- Natural Science Foundation of Guangdong Province (2020A1515010959)
- Natural Science Foundation of Shenzhen (JCYJ20170818091621856, JCYJ2020N294)
- Interdisciplinary Innovation Team of Shenzhen University
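Illustrative sketch
The abstract does not give the MCCP equations, so the following is only a rough sketch of the idea it describes, not the authors' implementation. It assumes that each class owns one subspace basis shared by all columns of the feature matrix (a stand-in for the column-wise convolution), that a class score is the combined length of the column projections, and that the names `orthonormalize`, `projection_sq_length`, and `matrix_capsule_scores` are hypothetical helpers introduced here. The last lines compare the per-class parameter count of a flattened vector capsule with that of a shared column basis.

```python
import math

def orthonormalize(basis):
    """Gram-Schmidt: turn a list of basis vectors into an orthonormal list."""
    ortho = []
    for v in basis:
        w = list(v)
        for q in ortho:
            dot = sum(a * b for a, b in zip(w, q))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > 1e-12:  # skip linearly dependent vectors
            ortho.append([a / norm for a in w])
    return ortho

def projection_sq_length(x, ortho):
    """Squared length of the orthogonal projection of x onto span(ortho)."""
    return sum(sum(a * b for a, b in zip(x, q)) ** 2 for q in ortho)

def matrix_capsule_scores(columns, class_bases):
    """Score each class for a feature matrix given as a list of columns.

    Assumption (not from the paper): every column (local feature) is
    projected onto the same class subspace, and the squared projection
    lengths are summed before taking the square root.
    """
    scores = []
    for basis in class_bases:
        ortho = orthonormalize(basis)
        total = sum(projection_sq_length(col, ortho) for col in columns)
        scores.append(math.sqrt(total))
    return scores

# Two classes in a 4-dimensional local feature space, 2 basis vectors each.
class_bases = [
    [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]],  # spans the first two axes
    [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],  # spans the last two axes
]
# Feature matrix with two local features (columns) lying in the first subspace.
columns = [[3.0, 4.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
scores = matrix_capsule_scores(columns, class_bases)

# Per-class parameter count: flattened vector capsule vs. shared column basis.
d, n, c = 4, 2, 2              # local feature size, number of columns, subspace size
vector_params = (d * n) * c    # basis over the flattened d*n vector
shared_params = d * c          # one d-by-c basis reused for every column
```

In this toy setting the feature matrix lies entirely in the first class subspace, so the first score equals the norm of the matrix and the second is zero. The shared basis needs `d * c` parameters per class instead of `(d * n) * c`, which is the kind of saving the abstract attributes to projecting by columns.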