A Cascaded Ensemble of Sparse-and-Dense Dictionaries for Vehicle Detection

Open Access

19 February 2021

journal article
research article
Published by MDPI AG in Applied Sciences

Vol. 11 (4), 1861
https://doi.org/10.3390/app11041861

Abstract

Vehicle detection, a special case of object detection, is of practical importance but faces challenges such as the difficulty of detecting vehicles in various orientations, the strong influence of occlusion, and background clutter. In addition, existing effective approaches, such as deep-learning-based ones, demand large amounts of training time and data, which hinders their application. In this work, we propose a dictionary-learning-based vehicle detection approach that explicitly addresses these problems. Specifically, an ensemble of sparse-and-dense dictionaries (ESDD) is learned through supervised low-rank decomposition. Each pair of sparse-and-dense dictionaries (SDD) in the ensemble is trained to represent either a subcategory of vehicle (corresponding to a certain orientation range or occlusion level) or a subcategory of background (corresponding to a cluster of background patterns) and gives good reconstructions only for samples of the corresponding subcategory, which makes the ESDD capable of distinguishing vehicles from background even though they exhibit various appearances. We further organize the ESDD into a two-level cascade (CESDD) that performs coarse-to-fine two-stage classification for better performance and reduced computation. The CESDD is then coupled with a downstream AdaBoost process to generate robust classifications. The proposed CESDD model is used as a window classifier in a sliding-window scan over image pyramids to produce multi-scale detections, and an adapted mean-shift-like non-maximum suppression process removes duplicate detections. Our CESDD vehicle detection approach is evaluated on the KITTI dataset and compared with other strong counterparts; the experimental results demonstrate the effectiveness of CESDD-based classification and detection, and training the CESDD demands only a small amount of time and data.
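The idea that each dictionary reconstructs only its own subcategory well can be illustrated with a minimal sketch. This is not the paper's SDD formulation: it assumes plain ridge-regularized coding over one random toy dictionary per subcategory, and the names `ridge_residual` and `classify`, the parameter `lam`, and the toy data are all hypothetical. A sample receives the label of the dictionary with the smallest reconstruction residual.

```python
import numpy as np

def ridge_residual(D, x, lam=1e-3):
    """Residual norm of x after ridge-regularized coding over the columns of D.

    Assumption: ridge coding stands in for the paper's sparse-and-dense coding.
    """
    k = D.shape[1]
    coef = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ x)
    return float(np.linalg.norm(x - D @ coef))

def classify(x, dictionaries, labels):
    """Assign x the label of the dictionary that reconstructs it best."""
    residuals = [ridge_residual(D, x) for D in dictionaries]
    return labels[int(np.argmin(residuals))], residuals

# Toy data: three subcategories, each spanning its own random 3-D subspace.
rng = np.random.default_rng(0)
dim, atoms = 20, 3
dictionaries = [rng.standard_normal((dim, atoms)) for _ in range(3)]
labels = ["vehicle", "vehicle", "background"]  # two vehicle subcategories, one background

# A sample drawn from the subspace of the third (background) dictionary.
x = dictionaries[2] @ np.array([1.0, -0.5, 2.0])
label, residuals = classify(x, dictionaries, labels)
print(label, [round(r, 4) for r in residuals])
```

In this toy setting the sample lies in the span of the third dictionary, so that dictionary's residual is near zero while the others stay large. The cascade, AdaBoost stage, and sliding-window scan described in the abstract are not modeled here.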

