Bounding Box Regression With Uncertainty for Accurate Object Detection

1 June 2019

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 2883-2892
https://doi.org/10.1109/cvpr.2019.00300

Abstract

Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization accuracies of various architectures with nearly no additional computation. The learned localization variance allows us to merge neighboring bounding boxes during non-maximum suppression (NMS), which further improves the localization performance. On MS-COCO, we boost the Average Precision (AP) of VGG-16 Faster R-CNN from 23.6% to 29.1%. More importantly, for ResNet-50-FPN Mask R-CNN, our method improves the AP and AP90 by 1.8% and 6.2% respectively, which significantly outperforms previous state-of-the-art bounding box refinement methods. Our code and models are available at github.com/yihui-he/KL-Loss.
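
As a rough illustration of the two ideas in the abstract, the sketch below shows (1) a per-coordinate regression loss in which the network predicts a log-variance alongside each box coordinate, and (2) an inverse-variance weighted merge of neighboring box coordinates. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the log-variance parameterization, the smooth-L1-style threshold of 1, and the optional overlap weights are all assumptions made for illustration and do not appear on this page. The reference code is in the repository named in the abstract.

```python
import math


def kl_style_regression_loss(pred, target, log_var):
    """Loss for one box coordinate with a predicted log-variance.

    Assumption: the residual term is smooth-L1-like (quadratic below an
    absolute error of 1, linear above) and is scaled by the predicted
    precision exp(-log_var). The 0.5 * log_var term penalizes the network
    for claiming high uncertainty everywhere.
    """
    err = abs(target - pred)
    precision = math.exp(-log_var)
    if err <= 1.0:
        return 0.5 * precision * err ** 2 + 0.5 * log_var
    return precision * (err - 0.5) + 0.5 * log_var


def inverse_variance_merge(coords, variances, weights=None):
    """Merge one coordinate of several neighboring boxes.

    Each candidate is weighted by weight / variance, so boxes with low
    predicted variance dominate the merged coordinate. `weights` is a
    hypothetical per-box factor (for example, derived from overlap with
    the selected box); it defaults to 1 for every box.
    """
    if weights is None:
        weights = [1.0] * len(coords)
    w = [p / v for p, v in zip(weights, variances)]
    return sum(wi * c for wi, c in zip(w, coords)) / sum(w)


if __name__ == "__main__":
    # A large error is cheaper when the model admits higher uncertainty.
    print(kl_style_regression_loss(0.0, 3.0, 0.0))
    print(kl_style_regression_loss(0.0, 3.0, 1.0))
    # The low-variance box pulls the merged coordinate toward itself.
    print(inverse_variance_merge([10.0, 20.0], [1.0, 4.0]))
```

In this sketch, a coordinate with a large residual can lower its loss by predicting a larger variance, which is what lets the variance act as a localization confidence at merge time.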

Cited by 361 articles