Variational Bayesian learning of directed graphical models with hidden variables

Open Access

1 December 2006

journal article
Published by Institute of Mathematical Statistics in Bayesian Analysis

Vol. 1 (4), 793-831
https://doi.org/10.1214/06-ba126

Abstract

A key problem in statistics and machine learning is inferring a suitable structure for a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately, for most models of interest, notably those containing hidden or latent variables, the marginal likelihood is intractable to compute. We present the variational Bayesian (VB) algorithm for directed graphical models, which optimises a lower-bound approximation to the marginal likelihood in a procedure similar to the standard EM algorithm. We show that for a large class of models, which we call conjugate exponential, the VB algorithm is a straightforward generalisation of the EM algorithm that incorporates uncertainty over model parameters. In a thorough case study using a small class of bipartite DAGs containing hidden variables, we compare the accuracy of the VB approximation to existing asymptotic-data approximations such as the Bayesian Information Criterion (BIC) and the Cheeseman-Stutz (CS) criterion, and also to a sampling-based gold standard, Annealed Importance Sampling (AIS). We find that the VB algorithm is empirically superior to CS and BIC, and much faster than AIS. Moreover, we prove that a VB approximation can always be constructed in such a way that it is guaranteed to be more accurate than the CS approximation.
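
For orientation, the two kinds of approximation compared in the abstract can be sketched in their standard textbook forms. The notation is an assumption for illustration and is not taken from this page: y denotes the observed data, x the hidden variables, θ the parameters, m the model, d the number of free parameters, n the number of data points, and q_x, q_θ the factorised variational distributions.

```latex
% Variational lower bound on the log marginal likelihood (Jensen's inequality,
% with the assumed factorisation q(x, \theta) = q_x(x)\, q_\theta(\theta)):
\ln p(y \mid m)
  = \ln \int p(y, x, \theta \mid m)\, dx\, d\theta
  \;\geq\; \int q_x(x)\, q_\theta(\theta)\,
      \ln \frac{p(y, x, \theta \mid m)}{q_x(x)\, q_\theta(\theta)}\, dx\, d\theta
  \;\equiv\; \mathcal{F}(q_x, q_\theta)

% BIC, a large-sample approximation evaluated at a point estimate \hat{\theta}:
\ln p(y \mid m) \approx \ln p(y \mid \hat{\theta}, m) - \frac{d}{2} \ln n
```

In this sketch, VB raises the bound by alternately updating q_x and q_θ, which mirrors the E and M steps of EM, whereas BIC replaces the integral over θ with a penalised point estimate.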
