Sharing Visual Features for Multiclass and Multiview Object Detection

19 March 2007

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 29 (5), 854-869
https://doi.org/10.1109/tpami.2007.1055

Abstract

We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (runtime) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. We present a multitask learning procedure, based on boosted decision stumps, that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required and, therefore, the runtime cost of the classifier, is observed to scale approximately logarithmically with the number of classes. The features selected by joint training are generic edge-like features, whereas the features chosen by training each class separately tend to be more object-specific. The generic features generalize better and considerably reduce the computational cost of multiclass object detection.
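
The idea of a decision stump shared across classes can be illustrated with a small sketch. It is not the authors' implementation: it assumes a GentleBoost-style weighted least-squares fit, and the function name `fit_shared_stump`, its arguments, and the toy data are hypothetical. For one candidate feature and one subset of classes, it pools the weighted labels of the sharing classes and picks the threshold with the lowest weighted squared error.

```python
def fit_shared_stump(feature, labels, weights, shared):
    """Fit one regression stump shared by the classes in `shared`.

    feature: list of N values of a single candidate feature.
    labels:  labels[i][c] in {-1, +1} for example i and class c.
    weights: weights[i][c] >= 0, per-class boosting weights (assumed layout).
    shared:  indices of the classes that share this stump.
    Returns (error, threshold, a, b); the stump outputs a if value > threshold, else b.
    """
    best = None
    for theta in sorted(set(feature)):
        # Pool weight and weighted-label sums over the sharing classes, per side.
        sw_hi = swz_hi = sw_lo = swz_lo = 0.0
        for v, z_row, w_row in zip(feature, labels, weights):
            for c in shared:
                if v > theta:
                    sw_hi += w_row[c]
                    swz_hi += w_row[c] * z_row[c]
                else:
                    sw_lo += w_row[c]
                    swz_lo += w_row[c] * z_row[c]
        # Weighted least-squares outputs on each side of the threshold.
        a = swz_hi / sw_hi if sw_hi > 0 else 0.0
        b = swz_lo / sw_lo if sw_lo > 0 else 0.0
        # Weighted squared error over the sharing classes only.
        err = 0.0
        for v, z_row, w_row in zip(feature, labels, weights):
            h = a if v > theta else b
            for c in shared:
                err += w_row[c] * (z_row[c] - h) ** 2
        if best is None or err < best[0]:
            best = (err, theta, a, b)
    return best


# Toy data: two classes that both respond to high values of the same feature.
feature = [0.1, 0.2, 0.8, 0.9]
labels = [[-1, -1], [-1, -1], [1, 1], [1, 1]]
weights = [[1.0, 1.0] for _ in feature]
err, theta, a, b = fit_shared_stump(feature, labels, weights, shared=[0, 1])
print(err, theta, a, b)
```

In a full joint-boosting loop, a search over features and class subsets would call such a routine at every round, so a single feature evaluation at runtime serves every class in the chosen subset. That sharing is what lets the total feature count grow more slowly than linearly in the number of classes.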

Cited by 484 articles