Directed Information Graphs
- 22 September 2015
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 61 (12), 6887-6909
- https://doi.org/10.1109/tit.2015.2478440
Abstract
We propose a graphical model for representing networks of stochastic processes, the minimal generative model graph. It is based on reduced factorizations of the joint distribution over time. We show that under appropriate conditions, it is unique and consistent with another type of graphical model, the directed information graph, which is based on a generalization of Granger causality. We demonstrate how directed information quantifies Granger causality in a particular sequential prediction setting. We also develop efficient methods to estimate the topological structure from data that obviate estimating the joint statistics. One algorithm assumes upper bounds on the degrees and uses the minimal dimension statistics necessary. In the event that the upper bounds are not valid, the resulting graph is nonetheless an optimal approximation in terms of Kullback-Leibler (KL) divergence. Another algorithm uses near-minimal dimension statistics when no bounds are known, but the distribution satisfies a certain criterion. Analogous to how structure learning algorithms for undirected graphical models use mutual information estimates, these algorithms use directed information estimates. We characterize the sample-complexity of two plug-in directed information estimators and obtain confidence intervals. For the setting when point estimates are unreliable, we propose an algorithm that uses confidence intervals to identify the best approximation that is robust to estimation error. Last, we demonstrate the effectiveness of the proposed algorithms through the analysis of both synthetic data and real data from the Twitter network. In the latter case, we identify which news sources influence users in the network by merely analyzing tweet times.Keywords
Other Versions
Funding Information
- National Science Foundation through the NSF Center for Science of Information (CCF-0939370, CCF-1065352, SMA-1451221)
- Air Force Office of Scientific Research through MURI (FA9550-10-1-0573, FA9550-11-1-0016)
- U.S. Department of Energy Computational Science Graduate Fellowship (DE-FG02-97ER25308)
This publication has 56 references indexed in Scilit:
- The Relation between Granger Causality and Directed Information Theory: A ReviewEntropy, 2012
- A Granger Causality Measure for Point Process Models of Ensemble Neural Spiking ActivityPLoS Computational Biology, 2011
- Graphical modelling of multivariate time seriesProbability Theory and Related Fields, 2011
- Estimating the directed information to infer causal relationships in ensemble neural spike train recordingsJournal of Computational Neuroscience, 2010
- A general asymptotic theory for time-series modelsStatistica Neerlandica, 2010
- Motif Discovery in Tissue-Specific Regulatory Sequences Using Directed InformationEURASIP Journal on Bioinformatics and Systems Biology, 2007
- Measures of mutual and causal dependence between two time series (Corresp.)IEEE Transactions on Information Theory, 1987
- The General Equivalence of Granger and Sims CausalityEconometrica, 1982
- Testing for causality: A personal viewpointJournal of Economic Dynamics and Control, 1980
- Investigating Causal Relations by Econometric Models and Cross-spectral MethodsEconometrica, 1969