Learning mixtures of separated nonspherical Gaussians
Open Access
- 1 February 2005
- journal article
- Published by Institute of Mathematical Statistics in The Annals of Applied Probability
- Vol. 15 (1A), 69-92
- https://doi.org/10.1214/105051604000000512
Abstract
Mixtures of Gaussian (or normal) distributions arise in a variety of application areas. Many heuristics have been proposed for the task of finding the component Gaussians given samples from the mixture, such as the EM algorithm, a local-search heuristic from Dempster, Laird and Rubin [J. Roy. Statist. Soc. Ser. B 39 (1977) 1-38]. These do not provably run in polynomial time. We present the first algorithm that provably learns the component Gaussians in time that is polynomial in the dimension. The Gaussians may have arbitrary shape, but they must satisfy a ``separation condition'' which places a lower bound on the distance between the centers of any two component Gaussians. The mathematical results at the heart of our proof are ``distance concentration'' results--proved using isoperimetric inequalities--which establish bounds on the probability distribution of the distance between a pair of points generated according to the mixture. We also formalize the more general problem of max-likelihood fit of a Gaussian mixture to unstructured data.This publication has 17 references indexed in Scilit:
- A spectral algorithm for learning mixture modelsJournal of Computer and System Sciences, 2004
- The mixing rate of Markov chains, an isoperimetric inequality, and computing the volumePublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Random Vectors in the Isotropic PositionJournal of Functional Analysis, 1999
- Random walks in a convex body and an improved volume algorithmRandom Structures & Algorithms, 1993
- A random polynomial-time algorithm for approximating the volume of convex bodiesJournal of the ACM, 1991
- On the complexity of some geometric problems in unbounded dimensionJournal of Symbolic Computation, 1990
- Projection PursuitThe Annals of Statistics, 1985
- Mixture Densities, Maximum Likelihood and the EM AlgorithmSIAM Review, 1984
- Extensions of Lipschitz mappings into a Hilbert spaceContemporary Mathematics, 1984
- Probability Inequalities for Sums of Bounded Random VariablesJournal of the American Statistical Association, 1963