Region-Based Convolutional Networks for Accurate Object Detection and Segmentation
- 25 May 2015
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence
- Vol. 38 (1), 142-158
- https://doi.org/10.1109/tpami.2015.2437384
Abstract
Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50 percent relative to the previous best result on VOC 2012, achieving a mAP of 62.4 percent. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
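The first idea in the abstract, scoring each bottom-up region proposal independently with a classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `crop`, `detect`, and `mean_intensity`, the toy image, the hand-written proposal boxes, and the 0.5 threshold are all assumptions made for illustration, and a mean-intensity function stands in for the CNN.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), upper bounds exclusive
Image = List[List[float]]        # toy grayscale image as nested lists


def crop(image: Image, box: Box) -> Image:
    """Cut the proposal region out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def detect(
    image: Image,
    proposals: Sequence[Box],
    score_region: Callable[[Image], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Tuple[Box, float]]:
    """Score every proposal independently and keep the confident ones."""
    detections = []
    for box in proposals:
        score = score_region(crop(image, box))
        if score >= threshold:
            detections.append((box, score))
    # Highest-scoring regions first.
    return sorted(detections, key=lambda d: d[1], reverse=True)


def mean_intensity(patch: Image) -> float:
    """Hypothetical stand-in for a CNN classifier: average pixel value."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) if values else 0.0


# Toy 4x4 image with a bright 2x2 "object" in the top-left corner.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Hand-written boxes standing in for a bottom-up proposal method.
proposals = [(0, 0, 2, 2), (2, 2, 4, 4), (0, 0, 4, 4)]
detections = detect(image, proposals, mean_intensity)
```

Passing the scorer in as a function keeps the proposal stage and the classification stage separate, which is the structural point of the abstract's first idea; a trained network could replace `mean_intensity` without changing `detect`.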
Funding Information
- US National Science Foundation (IIS-0905647, IIS-1134072, IIS-1212798, MURI N000014-10-1-0933)
This publication has 37 references indexed in Scilit:
- What Makes for Effective Detection Proposals? IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015
- Selective Search for Object Recognition. International Journal of Computer Vision, 2013
- Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013
- The Pascal Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 2009
- Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 2004
- Neural network-based face detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998
- Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998
- Original approach for the localisation of objects in images. IEE Proceedings - Vision, Image, and Signal Processing, 1994
- Multitask Learning: A Knowledge-Based Source of Inductive Bias. Published by Elsevier BV, 1993
- Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 1980