Near-optimal Sample Complexity Bounds for Robust Learning of Gaussian Mixtures via Compression Schemes

6 October 2020

journal article
research article
Published by Association for Computing Machinery (ACM) in Journal of the ACM

Vol. 67 (6), 1-42
https://doi.org/10.1145/3417994

Abstract

We introduce a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a compression scheme can be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. As an application of this technique, we prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound. Moreover, these results hold in an agnostic learning (or robust estimation) setting, in which the target distribution is only approximately a mixture of Gaussians. Our main upper bound is proven by showing that the class of Gaussians in R^d admits a small compression scheme.
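
As a reading aid, the bounds stated above can be written out in symbols. This is a sketch rather than the paper's formal theorem statement: the names n, n_axis, OPT, the class symbol for k-mixtures, and the constant C are notation assumed here. The tilde hides polylogarithmic factors, and the abstract does not give the exact form of the agnostic guarantee, so the last line shows only the usual shape of such a statement.

```latex
% Total variation distance between distributions p and q (standard definition).
\[
  \mathrm{TV}(p, q) \;=\; \sup_{A} \bigl| p(A) - q(A) \bigr| \;=\; \tfrac{1}{2}\, \| p - q \|_{1}
\]

% Sample complexity for mixtures of k Gaussians in R^d, as stated in the abstract.
% n and n_axis are names assumed here for the number of samples.
\[
  n(k, d, \varepsilon) \;=\; \tilde{\Theta}\!\left( \frac{k d^{2}}{\varepsilon^{2}} \right),
  \qquad
  n_{\mathrm{axis}}(k, d, \varepsilon) \;=\; \tilde{O}\!\left( \frac{k d}{\varepsilon^{2}} \right)
\]

% Assumed shape of an agnostic (robust) guarantee: the output \hat{p} competes with
% the best mixture in the class \mathcal{G}_{d,k}; the constant C is not given in the abstract.
\[
  \mathrm{TV}(\hat{p}, p) \;\le\; C \cdot \mathrm{OPT} + \varepsilon,
  \qquad
  \mathrm{OPT} \;=\; \inf_{g \in \mathcal{G}_{d,k}} \mathrm{TV}(g, p)
\]
```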

Funding Information

NSERC (22R23068)
CRM-ISM postdoctoral fellowship and an IVADO-Apogée-CFREF postdoctoral fellowship
NSERC Discovery

