BM³E: Discriminative Density Propagation for Visual Tracking

17 September 2007

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 29 (11), 2030-2044
https://doi.org/10.1109/tpami.2007.1111

Abstract

We introduce BM³E, a conditional Bayesian mixture of experts Markov model that achieves consistent probabilistic estimates for discriminative visual tracking. The model applies to problems of temporal and uncertain inference and represents the unexplored bottom-up counterpart of pervasive generative models estimated with Kalman filtering or particle filtering. Instead of inverting a nonlinear generative observation model at runtime, we learn to cooperatively predict complex state distributions directly from descriptors that encode image observations (typically, bag-of-feature global image histograms or descriptors computed over regular spatial grids). These are integrated in a conditional graphical model in order to enforce temporal smoothness constraints and allow a principled management of uncertainty. The algorithms combine sparsity, mixture modeling, and nonlinear dimensionality reduction for efficient computation in high-dimensional continuous state spaces. The combined system automatically self-initializes and recovers from failure. The research has three contributions: (1) we establish the density propagation rules for discriminative inference in continuous, temporal chain models, (2) we propose flexible supervised and unsupervised algorithms to learn feed-forward, multivalued contextual mappings (multimodal state distributions) based on compact, conditional Bayesian mixture of experts models, and (3) we validate the framework empirically for the reconstruction of 3D human motion in monocular video sequences. Our tests on both real and motion-capture-based sequences show significant performance gains with respect to competing nearest neighbor, regression, and structured prediction methods.

Cited by 58 articles