Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling
Top Cited Papers
- 30 June 2011
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Automatic Control
- Vol. 57 (3), 592-606
- https://doi.org/10.1109/tac.2011.2161027
Abstract
The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication. It arises in various application domains, including distributed tracking and localization, multi-agent coordination, estimation in sensor networks, and large-scale machine learning. We develop and analyze distributed algorithms based on dual subgradient averaging, and we provide sharp bounds on their convergence rates as a function of the network size and topology. Our analysis allows us to clearly separate the convergence of the optimization algorithm itself and the effects of communication dependent on the network structure. We show that the number of iterations required by our algorithm scales inversely in the spectral gap of the network, and confirm this prediction's sharpness both by theoretical lower bounds and simulations for various networks. Our approach includes the cases of deterministic optimization and communication, as well as problems with stochastic optimization and/or communication.Keywords
Other Versions
This publication has 21 references indexed in Scilit:
- Fully Distributed Algorithms for Convex Optimization ProblemsSIAM Journal on Optimization, 2010
- Primal-dual subgradient methods for convex problemsMathematical Programming, 2007
- Distributed average consensus with least-mean-square deviationJournal of Parallel and Distributed Computing, 2007
- Toeplitz and Circulant Matrices: A ReviewFoundations and Trends® in Communications and Information Theory, 2005
- Detection, classification, and tracking of targetsIEEE Signal Processing Magazine, 2002
- The capacity of wireless networksIEEE Transactions on Information Theory, 2000
- Support-vector networksMachine Learning, 1995
- Distributed asynchronous deterministic and stochastic gradient optimization algorithmsIEEE Transactions on Automatic Control, 1986
- Robust identificationAutomatica, 1980
- Weighted sums of certain dependent random variablesTohoku Mathematical Journal, 1967