Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories

31 March 2008

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 31 (1), 114-128
https://doi.org/10.1109/tpami.2008.67

Abstract

We introduce a probabilistic grammar-Markov model (PGMM), which couples probabilistic context-free grammars and Markov random fields. These PGMMs are generative models defined over attributed features and are used to detect and classify objects in natural images. PGMMs are designed so that they can perform rapid inference, parameter learning, and the more difficult task of structure induction. PGMMs can deal with unknown 2D pose (position, orientation, and scale) in both inference and learning, as well as with different appearances, or aspects, of the model. The PGMMs can be learnt in an unsupervised manner, where the image can contain one of an unknown number of objects of different categories or even be pure background. We first study the weakly supervised case, where each image contains an example of the (single) object of interest, and then generalize to less supervised cases. The goal of this paper is theoretical but, to provide proof of concept, we demonstrate results from this approach on a subset of the Caltech dataset (learning on a training set and evaluating on a testing set). Our results are generally comparable with the current state of the art, and our inference is performed in less than five seconds.

Cited by 46 articles