Bounding Box Regression With Uncertainty for Accurate Object Detection
- 1 June 2019
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 2883-2892
- https://doi.org/10.1109/cvpr.2019.00300
Abstract
Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization accuracies of various architectures with nearly no additional computation. The learned localization variance allows us to merge neighboring bounding boxes during non-maximum suppression (NMS), which further improves the localization performance. On MS-COCO, we boost the Average Precision (AP) of VGG-16 Faster R-CNN from 23.6% to 29.1%. More importantly, for ResNet-50-FPN Mask R-CNN, our method improves the AP and AP90 by 1.8% and 6.2% respectively, which significantly outperforms previous state-of-the-art bounding box refinement methods. Our code and models are available at github.com/yihui-he/KL-Loss.
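The abstract names two mechanisms without giving formulas: a regression loss that learns a localization variance alongside each box coordinate, and a merge of neighboring boxes during NMS weighted by that variance. The following is a minimal pure-Python sketch of both ideas, not the authors' released code. The function names (`kl_loss`, `iou`, `variance_vote`), the smooth-L1 style switch at an error of 1, and the `sigma_t` default are assumptions chosen for illustration.

```python
import math


def kl_loss(pred, target, log_var):
    """Per-coordinate regression loss with a predicted log-variance.

    pred, target: predicted and ground-truth offsets for one box coordinate.
    log_var: predicted log(sigma^2); predicting the log keeps sigma^2 positive.
    """
    diff = abs(target - pred)
    if diff <= 1.0:
        data_term = 0.5 * diff * diff  # quadratic zone for small errors
    else:
        data_term = diff - 0.5         # linear zone limits the pull of outliers
    # exp(-log_var) = 1 / sigma^2 down-weights coordinates the model is unsure
    # about; the log_var / 2 term penalizes predicting a large variance everywhere.
    return math.exp(-log_var) * data_term + 0.5 * log_var


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def variance_vote(box, neighbors, variances, sigma_t=0.02):
    """Refine `box` as a weighted mean of overlapping boxes, per coordinate.

    neighbors: candidate boxes, normally including `box` itself.
    variances: one 4-tuple of predicted variances per neighbor.
    sigma_t: hypothetical default controlling how fast weight decays with IoU.
    """
    refined = []
    for k in range(4):
        num = den = 0.0
        for nb, var in zip(neighbors, variances):
            overlap = iou(box, nb)
            if overlap <= 0.0:
                continue  # non-overlapping boxes do not vote
            # Closer boxes and lower-variance coordinates get larger weights.
            weight = math.exp(-((1.0 - overlap) ** 2) / sigma_t) / var[k]
            num += weight * nb[k]
            den += weight
        refined.append(num / den if den > 0 else box[k])
    return refined
```

In a detector, `variance_vote` would be applied to each box kept by NMS, using the boxes that overlap it as `neighbors`. The classification score still decides which box is kept, while the predicted variance decides how much each neighbor moves its coordinates.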