High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence
Top Cited Papers
Open Access
- 1 January 2011
- journal article
- Published by Institute of Mathematical Statistics in Electronic Journal of Statistics
- Vol. 5, 935–980
- https://doi.org/10.1214/11-ejs631
Abstract
Given i.i.d. observations of a random vector X∈ℝp, we study the problem of estimating both its covariance matrix Σ*, and its inverse covariance or concentration matrix Θ*=(Σ*)−1. When X is multivariate Gaussian, the non-zero structure of Θ* is specified by the graph of an associated Gaussian Markov random field; and a popular estimator for such sparse Θ* is the ℓ1-regularized Gaussian MLE. This estimator is sensible even for non-Gaussian X, since it corresponds to minimizing an ℓ1-penalized log-determinant Bregman divergence. We analyze its performance under high-dimensional scaling, in which the number of nodes in the graph p, the number of edges s, and the maximum node degree d are allowed to grow as a function of the sample size n. In addition to the parameters (p,s,d), our analysis identifies other key quantities that control rates: (a) the ℓ∞-operator norm of the true covariance matrix Σ*; (b) the ℓ∞-operator norm of the sub-matrix Γ*SS, where S indexes the graph edges and Γ*=(Θ*)−1⊗(Θ*)−1; (c) a mutual incoherence or irrepresentability measure on the matrix Γ*; and (d) the rate of decay 1/f(n,δ) on the probabilities {|Σ̂nij−Σ*ij|>δ}, where Σ̂n is the sample covariance based on n samples. Our first result establishes consistency of our estimate Θ̂ in the elementwise maximum-norm. This in turn allows us to derive convergence rates in Frobenius and spectral norms, with improvements upon existing results for graphs with suitably bounded maximum node degree. In our second result, we show that with probability converging to one, the estimate Θ̂ correctly specifies the zero pattern of the concentration matrix Θ*. We illustrate our theoretical results via simulations for various graphs and problem parameters, showing good correspondences between the theoretical predictions and behavior in simulations.
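The estimator studied in the abstract, the ℓ1-regularized Gaussian MLE (widely known as the graphical lasso), is available in standard software. The following sketch is not the authors' own code: it uses scikit-learn's `GraphicalLasso` on a made-up chain-graph precision matrix, with the dimensions, the edge weight 0.4, and the penalty `alpha=0.1` chosen purely for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Hypothetical sparse concentration matrix Theta* for a chain graph on p nodes:
# ones on the diagonal, 0.4 on the first off-diagonals (positive definite).
p, n = 10, 500
theta = np.eye(p)
for i in range(p - 1):
    theta[i, i + 1] = theta[i + 1, i] = 0.4
sigma = np.linalg.inv(theta)  # true covariance Sigma* = (Theta*)^{-1}

# Draw n i.i.d. Gaussian samples and fit the l1-penalized Gaussian MLE,
# i.e. minimize the l1-penalized log-determinant divergence.
X = rng.multivariate_normal(np.zeros(p), sigma, size=n)
model = GraphicalLasso(alpha=0.1).fit(X)

theta_hat = model.precision_  # estimate of Theta*
# The off-diagonal zero pattern of theta_hat estimates the graph's edge set.
support = np.abs(theta_hat) > 1e-8
```

The paper's second result concerns exactly the quantity `support` here: under the stated incoherence conditions, the zero pattern of the estimate matches that of Θ* with probability tending to one.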
This publication has 24 references indexed in Scilit:
- Sparsistency and rates of convergence in large covariance matrix estimation. The Annals of Statistics, 2009
- Sharp Thresholds for High-Dimensional and Noisy Sparsity Recovery Using ℓ1-Constrained Quadratic Programming (Lasso). IEEE Transactions on Information Theory, 2009
- Covariance regularization by thresholding. The Annals of Statistics, 2008
- Regularized estimation of large covariance matrices. The Annals of Statistics, 2008
- Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2008
- Estimation of Gaussian graphs by model selection. Electronic Journal of Statistics, 2008
- Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 2007
- High-dimensional graphs and variable selection with the Lasso. The Annals of Statistics, 2006
- Covariance matrix selection and estimation via penalised normal likelihood. Biometrika, 2006
- On the distribution of the largest eigenvalue in principal components analysis. The Annals of Statistics, 2001