Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering
- 4 December 2014
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Multimedia
- Vol. 17 (2), 186-200
- https://doi.org/10.1109/tmm.2014.2377515
Abstract
The problem of tracking multiple moving speakers in indoor environments has received much attention. Earlier techniques were based purely on a single modality, e.g., vision. Recently, the fusion of multi-modal information has been shown to be instrumental in improving tracking performance, as well as robustness in challenging situations such as occlusions (caused by the limited field of view of cameras or by other speakers). However, data fusion algorithms often suffer from noise corrupting the sensor measurements, which causes non-negligible detection errors. Here, a novel approach to combining audio and visual data is proposed. We employ the direction of arrival angles of the audio sources to reshape the typical Gaussian noise distribution of particles in the propagation step and to weight the observation model in the measurement step. This approach is further improved by solving a typical problem associated with the particle filter (PF), whose efficiency and accuracy usually depend on the number of particles and the noise variance used in state estimation and particle propagation. Both parameters are specified beforehand and kept fixed in the regular PF implementation, which makes the tracker unstable in practice. To address these problems, we design an algorithm which adapts both the number of particles and the noise variance based on the tracking error and the area occupied by the particles in the image. Experiments on the AV16.3 dataset show the advantage of our proposed methods over the baseline PF method and an existing adaptive PF algorithm for tracking occluded speakers with a significantly reduced number of particles.
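The adaptation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual update rule, so everything below is an assumption made for illustration: the function names (`adapt`, `particle_area`, `propagate`), the thresholds, the multiplicative gain, and the choice to treat a large tracking error or a large particle spread as a sign of unreliable tracking are all hypothetical and are not taken from the paper.

```python
import random


def particle_area(particles):
    """Area (pixels^2) of the axis-aligned bounding box of (x, y) particles."""
    xs = [p[0] for p in particles]
    ys = [p[1] for p in particles]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def adapt(n_particles, sigma, error, area, *,
          err_thresh=0.5, area_thresh=400.0,
          n_min=10, n_max=200, sigma_min=1.0, sigma_max=20.0, gain=1.5):
    """Hypothetical rule, not the paper's: grow the particle count and the
    noise standard deviation when tracking looks unreliable, shrink otherwise."""
    unreliable = error > err_thresh or area > area_thresh
    factor = gain if unreliable else 1.0 / gain
    # Clamp both parameters to fixed bounds so they cannot drift indefinitely.
    n_new = int(min(n_max, max(n_min, round(n_particles * factor))))
    sigma_new = min(sigma_max, max(sigma_min, sigma * factor))
    return n_new, sigma_new


def propagate(particles, sigma, n_out, rng):
    """Draw n_out particles uniformly from the current set and perturb each
    with isotropic Gaussian noise of standard deviation sigma."""
    chosen = (rng.choice(particles) for _ in range(n_out))
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [(rng.uniform(100, 110), rng.uniform(50, 60)) for _ in range(100)]
    # The error value here is a placeholder for a normalised tracking error.
    n, sigma = adapt(len(particles), 5.0, error=0.1, area=particle_area(particles))
    particles = propagate(particles, sigma, n, rng)
    print(n, round(sigma, 3), len(particles))
```

In this sketch a reliable frame shrinks both parameters, which mirrors the abstract's claim that adaptation lets the tracker run with fewer particles; the audio-driven reshaping of the noise distribution is deliberately left out because the abstract gives no detail on its form.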
Funding Information
- Engineering and Physical Sciences Research Council (EP/H050000/1, EP/K014307/1, EP/L000539/1)