Near-optimal Sample Complexity Bounds for Robust Learning of Gaussian Mixtures via Compression Schemes

Abstract
We introduce a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a compression scheme can be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. As an application of this technique, we prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound. Moreover, these results hold in an agnostic learning (or robust estimation) setting, in which the target distribution is only approximately a mixture of Gaussians. Our main upper bound is proven by showing that the class of Gaussians in R^d admits a small compression scheme.
Funding Information
  • NSERC (22R23068)
  • CRM-ISM postdoctoral fellowship and an IVADO-Apogée-CFREF postdoctoral fellowship
  • NSERC Discovery
