Deep Learning for Real-Time 3D Multi-Object Detection, Localisation, and Tracking: Application to Smart Mobility

Open Access

18 January 2020

journal article
research article
Published by MDPI AG in Sensors

Vol. 20 (2), 532
https://doi.org/10.3390/s20020532

Abstract

In core computer vision tasks, we have witnessed significant advances in object detection, localisation and tracking. However, there are currently no methods that detect, localize and track objects in road environments while taking real-time constraints into account. In this paper, our objective is to develop a deep learning multi-object detection and tracking technique applied to road smart mobility. Firstly, we propose an effective detector based on YOLOv3, which we adapt to our context. Subsequently, to successfully localize the detected objects, we put forward an adaptive method that extracts 3D information, i.e., depth maps. To do so, a comparative study is carried out taking into account two approaches: Monodepth2 for monocular vision and MADNet for stereoscopic vision. These approaches are then evaluated on datasets containing depth information in order to identify the solution that performs best under real-time conditions. Object tracking is necessary in order to mitigate the risk of collisions. Unlike traditional tracking approaches, which require target initialization beforehand, our approach uses information from object detection and distance estimation to initialize targets and then track them. Specifically, we propose to improve the SORT approach for 3D object tracking. We introduce an extended Kalman filter to better estimate the position of objects. Extensive experiments carried out on the KITTI dataset show that our proposal outperforms state-of-the-art approaches.
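
To make the localisation step more concrete, the following is a minimal sketch of how a 2D detection and a depth map could be combined into a 3D position. It is not the authors' implementation: it assumes a pinhole camera model, and the function name `box_to_3d`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the choice of the median depth inside the box are all illustrative assumptions.

```python
from statistics import median


def box_to_3d(depth_map, box, fx, fy, cx, cy):
    """Back-project the centre of a 2D box to camera coordinates.

    depth_map : list of rows, depth_map[v][u] is the depth at pixel (u, v);
                non-positive values are treated as invalid (assumption).
    box       : (x1, y1, x2, y2) in pixels, x2 and y2 exclusive.
    fx, fy    : focal lengths in pixels (hypothetical calibration values).
    cx, cy    : principal point in pixels (hypothetical calibration values).
    Returns (X, Y, Z) in the units of the depth map, or None if the box
    contains no valid depth.
    """
    x1, y1, x2, y2 = box

    # Collect the valid depth values that fall inside the detection box.
    depths = [
        depth_map[v][u]
        for v in range(y1, y2)
        for u in range(x1, x2)
        if depth_map[v][u] > 0
    ]
    if not depths:
        return None

    # The median is less sensitive to background pixels than the mean.
    z = median(depths)

    # Pinhole back-projection of the box centre.
    u_c = (x1 + x2) / 2.0
    v_c = (y1 + y2) / 2.0
    x = (u_c - cx) * z / fx
    y = (v_c - cy) * z / fy
    return (x, y, z)
```

In a pipeline like the one the abstract describes, `box` would come from the detector and `depth_map` from the monocular or stereo depth network; the resulting 3D point could then serve as the measurement passed to the tracker.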

