Is My Network Module Preserved and Reproducible?
Open Access
- 20 January 2011
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 7 (1), e1001057
- https://doi.org/10.1371/journal.pcbi.1001057
Abstract
In many applications, one is interested in determining which of the properties of a network module change across conditions. For example, to validate the existence of a module, it is desirable to show that it is reproducible (or preserved) in an independent test network. Here we study several types of network preservation statistics that do not require a module assignment in the test network. We distinguish network preservation statistics by the type of the underlying network. Some preservation statistics are defined for a general network (defined by an adjacency matrix) while others are only defined for a correlation network (constructed on the basis of pairwise correlations between numeric variables). Our applications show that the correlation structure facilitates the definition of particularly powerful module preservation statistics. We illustrate that evaluating module preservation is in general different from evaluating cluster preservation. We find that it is advantageous to aggregate multiple preservation statistics into summary preservation statistics. We illustrate the use of these methods in six gene co-expression network applications including 1) preservation of cholesterol biosynthesis pathway in mouse tissues, 2) comparison of human and chimpanzee brain networks, 3) preservation of selected KEGG pathways between human and chimpanzee brain networks, 4) sex differences in human cortical networks, 5) sex differences in mouse liver networks. While we find no evidence for sex specific modules in human cortical networks, we find that several human cortical modules are less preserved in chimpanzees. In particular, apoptosis genes are differentially co-expressed between humans and chimpanzees. Our simulation studies and applications show that module preservation statistics are useful for studying differences between the modular structure of networks. 
Data, R software and accompanying tutorials can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/ModulePreservation.
In network applications, one is often interested in studying whether modules are preserved across multiple networks. For example, to determine whether a pathway of genes is perturbed in a certain condition, one can study whether its connectivity pattern is no longer preserved. Non-preserved modules can either be biologically uninteresting (e.g., reflecting data outliers) or interesting (e.g., reflecting sex-specific modules). An intuitive approach for studying module preservation is to cross-tabulate module membership. But this approach often cannot address questions about the preservation of connectivity patterns between nodes. Thus, cross-tabulation based approaches often fail to recognize that important aspects of a network module are preserved. Cross-tabulation methods also make it difficult to argue that a module is not preserved: the weak statement (“the reference module does not overlap with any of the identified test set modules”) is less relevant in practice than the strong statement (“the module cannot be found in the test network irrespective of the parameter settings of the module detection procedure”). Module preservation statistics have important applications; for example, we show that the wiring of apoptosis genes in a human cortical network differs from that in chimpanzees.
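The contrast with cross-tabulation can be made concrete with a minimal sketch. It is not the authors' R software; it is a toy Python illustration, assuming NumPy is available, of two statistics that use only the reference module's node labels and need no module detection in the test network: a density statistic (mean adjacency among the module's nodes in the test network) and a connectivity statistic (correlation of the module's pairwise adjacencies between reference and test networks). The function names, the simulated data, and the soft-thresholding power `beta=6` are illustrative assumptions.

```python
import numpy as np


def correlation_adjacency(expr, beta=6):
    """Unsigned weighted adjacency |cor|^beta from a samples x genes matrix.

    beta is a hypothetical soft-thresholding power chosen for illustration.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)
    return adj


def pairwise_values(adj, nodes):
    """Off-diagonal adjacencies among the given nodes (upper triangle)."""
    sub = adj[np.ix_(nodes, nodes)]
    return sub[np.triu_indices(len(nodes), k=1)]


def density(adj, nodes):
    """Density statistic: mean adjacency among the module's nodes."""
    return float(pairwise_values(adj, nodes).mean())


def connectivity_preservation(adj_ref, adj_test, nodes):
    """Correlation of the module's pairwise adjacencies across two networks."""
    ref = pairwise_values(adj_ref, nodes)
    test = pairwise_values(adj_test, nodes)
    return float(np.corrcoef(ref, test)[0, 1])


def simulate(rng, n_samples, loadings, n_background):
    """Toy data: module genes share one latent factor; the rest is noise."""
    factor = rng.normal(size=(n_samples, 1))
    module = factor * loadings + rng.normal(size=(n_samples, len(loadings)))
    background = rng.normal(size=(n_samples, n_background))
    return np.hstack([module, background])


rng = np.random.default_rng(0)
loadings = np.linspace(0.5, 3.0, 10)   # module genes differ in factor strength
module = np.arange(10)                 # reference module: first ten genes
background = np.arange(10, 20)         # remaining genes are unstructured

adj_ref = correlation_adjacency(simulate(rng, 200, loadings, 10))
adj_preserved = correlation_adjacency(simulate(rng, 200, loadings, 10))
adj_lost = correlation_adjacency(simulate(rng, 200, np.zeros(10), 10))

print("density in test network (module preserved):", density(adj_preserved, module))
print("density in test network (module lost):", density(adj_lost, module))
print("connectivity preservation:",
      connectivity_preservation(adj_ref, adj_preserved, module))
```

In this toy setting the module's density is high only in the test network that shares the latent factor, and the connectivity statistic checks whether the same node pairs are strongly connected in both networks. Judging significance would additionally require a reference distribution, for example from permuting node labels in the test network, which this sketch omits.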