Fast Feature Pyramids for Object Detection

16 January 2014

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 36 (8), 1532-1545
https://doi.org/10.1109/tpami.2014.2300479

Abstract

Multi-resolution image features may be approximated via extrapolation from nearby scales, rather than being computed explicitly. This fundamental insight allows us to design object detection algorithms that are as accurate as, and considerably faster than, the state of the art. The computational bottleneck of many modern detectors is the computation of features at every scale of a finely sampled image pyramid. Our key insight is that one may compute finely sampled feature pyramids at a fraction of the cost, without sacrificing performance: for a broad family of features, we find that features computed at octave-spaced scale intervals are sufficient to approximate features on a finely sampled pyramid. Extrapolation is inexpensive compared to direct feature computation. As a result, our approximation yields considerable speedups with negligible loss in detection accuracy. We modify three diverse visual recognition systems to use fast feature pyramids and show results on both pedestrian detection (measured on the Caltech, INRIA, TUD-Brussels and ETH data sets) and general object detection (measured on PASCAL VOC). The approach is general and is widely applicable to vision algorithms requiring fine-grained multi-scale analysis. Our approximation is valid for images with broad spectra (most natural images) and fails for images with narrow band-pass spectra (e.g., periodic textures).
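
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the extrapolation rule, so the code below assumes a power-law correction: a feature channel computed at an octave scale is resampled to a nearby scale and multiplied by (scale ratio)^(-lam). The function names, the gradient-magnitude channel, the nearest-neighbour resampler, and the value lam=0.1 are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def resample(channel, out_shape):
    """Nearest-neighbour resample of a 2-D array (a simple stand-in for a proper resize)."""
    in_h, in_w = channel.shape
    out_h, out_w = out_shape
    rows = np.minimum((np.arange(out_h) * in_h / out_h).astype(int), in_h - 1)
    cols = np.minimum((np.arange(out_w) * in_w / out_w).astype(int), in_w - 1)
    return channel[np.ix_(rows, cols)]


def gradient_magnitude(image):
    """Example feature channel: per-pixel gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)


def approximate_pyramid(image, scales_per_octave=4, n_octaves=3, lam=0.1):
    """Build a finely sampled feature pyramid as a dict {scale: channel}.

    Features are computed explicitly only at octave scales (1, 1/2, 1/4, ...).
    Intermediate scales are extrapolated from the octave just above them.
    lam is an assumed power-law exponent; in practice it would be estimated
    from data for each feature type.
    """
    h, w = image.shape
    pyramid = {}
    for octave in range(n_octaves):
        s_oct = 2.0 ** -octave
        oct_shape = (max(1, round(h * s_oct)), max(1, round(w * s_oct)))
        # Explicit (expensive) computation: resize the image, then compute features.
        real = gradient_magnitude(resample(image, oct_shape))
        pyramid[s_oct] = real
        for k in range(1, scales_per_octave):
            s = 2.0 ** -(octave + k / scales_per_octave)
            ratio = s / s_oct
            shape = (max(1, round(h * s)), max(1, round(w * s)))
            # Cheap approximation: resize the features, then rescale their magnitude.
            pyramid[s] = resample(real, shape) * ratio ** -lam
    return pyramid


if __name__ == "__main__":
    img = np.random.default_rng(0).random((64, 48))
    for scale, chan in sorted(approximate_pyramid(img).items(), reverse=True):
        print(f"scale {scale:.3f}: channel shape {chan.shape}")
```

In this sketch only one scale per octave pays for the image resize and feature computation; the remaining scales reuse those features. For simplicity it extrapolates only downward from each octave, whereas a real detector might extrapolate from whichever computed scale is nearest.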

This publication has 62 references indexed in Scilit.

Cited by 1545 articles