Deep Alignment Network Based Multi-Person Tracking With Occlusion and Motion Reasoning

Abstract
Tracking-by-detection is one of the most common paradigms for multi-person tracking, owing to the availability of automatic pedestrian detectors. However, existing multi-person trackers are greatly challenged by misaligned pedestrian detections (i.e., excessive background and missing parts) and by occlusion. To address these problems, we propose a deep alignment network based multi-person tracking method with occlusion and motion reasoning. Specifically, inaccurate detections are first corrected by a deep alignment network, in which an alignment estimation module automatically learns the spatial transformation of these detections. As a result, the deep features from our alignment network have better representation power and thus lead to more consistent tracks. Then, a coarse-to-fine scheme is designed for constructing a discriminative association cost matrix from spatial, motion and appearance information. Meanwhile, a principled approach is developed that allows our method to handle occlusion using motion reasoning and the re-identification ability of the pedestrian alignment network. Finally, the simple, real-time Hungarian algorithm is employed to solve the association problem. Comprehensive experiments on the MOT16, ISSIA soccer, PETS09 and TUD datasets validate the effectiveness and robustness of the proposed method.
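
The association step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact cost terms, so the IoU gate (`min_iou`), the appearance weight (`w_app`), the dictionary keys (`predicted_box`, `box`, `feature`) and the toy feature vectors are all assumptions made for illustration. In the paper, the appearance features would come from the alignment network. The sketch gates pairs coarsely by overlap with a motion-predicted box, refines the remaining costs with an appearance distance, and solves the assignment with a standard Hungarian (Kuhn-Munkres) routine.

```python
import math

BIG = 1e6  # finite stand-in for "forbidden" pairs (assumed value)

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def cosine_distance(f, g):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    nf = math.sqrt(sum(x * x for x in f))
    ng = math.sqrt(sum(x * x for x in g))
    if nf == 0 or ng == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(f, g)) / (nf * ng)

def association_cost(tracks, detections, min_iou=0.1, w_app=0.5):
    """Coarse gate on overlap, then a weighted overlap + appearance cost."""
    cost = []
    for t in tracks:
        row = []
        for d in detections:
            overlap = iou(t["predicted_box"], d["box"])
            if overlap < min_iou:          # coarse step: reject distant pairs
                row.append(BIG)
            else:                          # fine step: blend the two cues
                app = cosine_distance(t["feature"], d["feature"])
                row.append((1 - w_app) * (1 - overlap) + w_app * app)
        cost.append(row)
    return cost

def hungarian(cost):
    """Minimum-cost assignment for an n x m matrix with n <= m."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row / column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:                        # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]

def associate(tracks, detections):
    """Return sorted (track_index, detection_index) pairs that pass the gate."""
    if not tracks or not detections:
        return []
    cost = association_cost(tracks, detections)
    if len(tracks) <= len(detections):
        pairs = hungarian(cost)
    else:  # the solver needs rows <= columns, so solve the transpose
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in hungarian(transposed)]
    return sorted((t, d) for t, d in pairs if cost[t][d] < BIG)

# Toy data: two tracks, three detections (the third is far from both tracks).
tracks = [
    {"predicted_box": (0, 0, 10, 20), "feature": (1.0, 0.0)},
    {"predicted_box": (50, 0, 60, 20), "feature": (0.0, 1.0)},
]
detections = [
    {"box": (51, 0, 61, 20), "feature": (0.1, 0.9)},
    {"box": (1, 0, 11, 20), "feature": (0.9, 0.1)},
    {"box": (200, 200, 210, 220), "feature": (1.0, 1.0)},
]
matches = associate(tracks, detections)
```

In this toy example each track is paired with the detection that overlaps its predicted box, and the distant third detection stays unmatched, so it could start a new track. Using a large finite cost instead of infinity for gated pairs keeps the potential updates in the solver free of `inf - inf` arithmetic.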
Funding Information
  • National Natural Science Foundation of China (61572205, 61802135)
  • Natural Science Foundation of Fujian Province (2017J01113)
