A Cascaded Ensemble of Sparse-and-Dense Dictionaries for Vehicle Detection
Open Access
- 19 February 2021
- journal article
- research article
- Published by MDPI AG in Applied Sciences
- Vol. 11 (4), 1861
- https://doi.org/10.3390/app11041861
Abstract
Vehicle detection, a special case of object detection, is of practical importance but faces several challenges: vehicles appear in various orientations, occlusion seriously degrades detection, and backgrounds are cluttered. In addition, existing effective approaches, such as deep-learning-based ones, demand large amounts of training time and data, which hinders their application. In this work, we propose a dictionary-learning-based vehicle detection approach that explicitly addresses these problems. Specifically, an ensemble of sparse-and-dense dictionaries (ESDD) is learned through supervised low-rank decomposition; each pair of sparse-and-dense dictionaries (SDD) in the ensemble is trained to represent either a subcategory of vehicle (corresponding to a certain orientation range or occlusion level) or a subcategory of background (corresponding to a cluster of background patterns) and gives good reconstructions only for samples of the corresponding subcategory, which enables the ESDD to distinguish vehicles from background even though they exhibit various appearances. We further organize the ESDD into a two-level cascade (CESDD) that performs coarse-to-fine two-stage classification for better performance and reduced computation. The CESDD is then coupled with a downstream AdaBoost process to generate robust classifications. The proposed CESDD model is used as a window classifier in a sliding-window scan over image pyramids to produce multi-scale detections, and an adapted mean-shift-like non-maximum suppression process removes duplicate detections. Our CESDD vehicle detection approach is evaluated on the KITTI dataset and compared with other strong counterparts; the experimental results demonstrate the effectiveness of CESDD-based classification and detection, and training the CESDD demands only a small amount of time and data.
This publication has 26 references indexed in Scilit:
- Towards Scene Understanding with Detailed 3D Object Representations. International Journal of Computer Vision, 2014
- Sparse and Dense Hybrid Representation via Dictionary Decomposition for Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014
- Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. Published by Institute of Electrical and Electronics Engineers (IEEE), 2014
- Ensemble dictionary learning for saliency detection. Image and Vision Computing, 2014
- Multi-class AdaBoost. Statistics and Its Interface, 2009
- Rapid object detection using a boosted cascade of simple features. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Histograms of Oriented Gradients for Human Detection. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 2001
- Long Short-Term Memory. Neural Computation, 1997
- Support-vector networks. Machine Learning, 1995
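
The abstract's central idea, that each dictionary reconstructs only samples of its own subcategory well, can be illustrated with a minimal sketch. This is not the authors' method: the SDD pairs learned by supervised low-rank decomposition are replaced here by random dense matrices with plain least-squares coding, and the names `reconstruction_error` and `classify` as well as the toy data are assumptions made for illustration only.

```python
import numpy as np

def reconstruction_error(D, x):
    """Residual norm of x after a least-squares fit on the columns of D."""
    coef, *_ = np.linalg.lstsq(D, x, rcond=None)
    return float(np.linalg.norm(x - D @ coef))

def classify(x, dictionaries, labels):
    """Pick the label of the dictionary that reconstructs x with the smallest error."""
    errors = [reconstruction_error(D, x) for D in dictionaries]
    return labels[int(np.argmin(errors))], errors

# Toy stand-ins (assumption): four subcategories, each spanning its own
# random 3-dimensional subspace of a 20-dimensional feature space.
rng = np.random.default_rng(0)
dictionaries = [rng.standard_normal((20, 3)) for _ in range(4)]
labels = ["vehicle", "vehicle", "background", "background"]

# A sample generated from subcategory 1, perturbed by small noise.
x = dictionaries[1] @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(20)

label, errors = classify(x, dictionaries, labels)
print(label, [round(e, 3) for e in errors])
```

In this toy setting the sample is assigned to the class of the dictionary it was generated from, because only that dictionary leaves a small residual. The cascade, AdaBoost stage, sliding-window scan, and non-maximum suppression described in the abstract are outside the scope of this sketch.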