Theory and Limitations of Genetic Network Inference from Microarray Data

16 November 2007

journal article
review article
Published by Wiley in Annals of the New York Academy of Sciences

Vol. 1115 (1), 51-72
https://doi.org/10.1196/annals.1407.019

Abstract

Since the advent of gene expression microarray technology more than 10 years ago, many computational approaches have been developed that use statistical associations between mRNA abundance profiles to predict transcriptional regulatory interactions. The ultimate goal is to develop causal network models describing the transcriptional influences that genes exert on each other (via their protein products), which can be used to predict network disruptions (e.g., mutations) leading to a disease phenotype, as well as the appropriate therapeutic intervention. However, microarray data measure only a small fraction of the interacting variables in a genetic regulatory network, because cells regulate gene expression via many diverse mechanisms. Although many researchers have acknowledged that statistical dependencies between mRNA profiles are difficult to interpret, very little work has been done on theoretically characterizing the nature of inferred dependencies using models that account for unobserved interacting variables. In this work, we review the theory behind reverse engineering algorithms derived from three separate disciplines (system control theory, graphical models, and information theory) and highlight several mathematical relationships between the various methods. We then apply recent theoretical work on constructing graphical models with latent variables to the context of reverse engineering genetic networks. We demonstrate that even the addition of simple latent variables induces statistical dependencies between non-directly interacting (e.g., co-regulated) genes that cannot be eliminated by conditioning on any observed variables.
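
The final claim of the abstract can be illustrated with a small simulation. The sketch below is not the authors' model; it is a hypothetical linear-Gaussian toy in which a transcription factor's mRNA level (`tf_mrna`, observed) is only a noisy proxy for its protein activity (`tf_activity`, latent), and two genes (`gene_a`, `gene_b`) are co-regulated by that activity without interacting with each other. All variable names, noise levels, and the sample size are assumptions chosen for illustration.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing z (plus intercept) out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def simulate(n=20000, seed=0):
    """Hypothetical toy network: tf_mrna -> tf_activity -> {gene_a, gene_b}."""
    rng = np.random.default_rng(seed)
    tf_mrna = rng.normal(size=n)                     # observed on the array
    tf_activity = tf_mrna + rng.normal(size=n)       # latent (e.g., post-translational effects)
    gene_a = tf_activity + 0.5 * rng.normal(size=n)  # co-regulated target, observed
    gene_b = tf_activity + 0.5 * rng.normal(size=n)  # co-regulated target, observed
    return tf_mrna, tf_activity, gene_a, gene_b


tf_mrna, tf_activity, gene_a, gene_b = simulate()

# Conditioning on the observed mRNA leaves a strong residual dependence,
# because the unmeasured part of tf_activity still drives both targets.
given_observed = partial_corr(gene_a, gene_b, tf_mrna)

# Conditioning on the latent activity (impossible with array data alone)
# would remove the dependence.
given_latent = partial_corr(gene_a, gene_b, tf_activity)

print(f"partial corr given observed mRNA:   {given_observed:.3f}")
print(f"partial corr given latent activity: {given_latent:.3f}")
```

Under these assumptions the partial correlation given the observed mRNA stays large, while the one given the latent activity is close to zero. An inference method that sees only mRNA profiles would therefore tend to report an edge between the two co-regulated genes even after conditioning.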

