Unsupervised Learning of Probabilistic Grammar-Markov Models for Object Categories
- 31 March 2008
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Pattern Analysis and Machine Intelligence
- Vol. 31 (1), 114-128
- https://doi.org/10.1109/tpami.2008.67
Abstract
We introduce a probabilistic grammar-Markov model (PGMM) which couples probabilistic context-free grammars and Markov random fields. These PGMMs are generative models defined over attributed features and are used to detect and classify objects in natural images. PGMMs are designed so that they can perform rapid inference, parameter learning, and the more difficult task of structure induction. PGMMs can deal with unknown 2D pose (position, orientation, and scale) in both inference and learning, and with different appearances, or aspects, of the model. The PGMMs can be learnt in an unsupervised manner where the image can contain one of an unknown number of objects of different categories or even be pure background. We first study the weakly supervised case, where each image contains an example of the (single) object of interest, and then generalize to less supervised cases. The goal of this paper is theoretical but, to provide proof of concept, we demonstrate results from this approach on a subset of the Caltech dataset (learning on a training set and evaluating on a testing set). Our results are generally comparable with the current state of the art, and our inference is performed in less than five seconds.