Human Gene Coexpression Landscape: Confident Network Derived from Tissue Transcriptomic Profiles
Open Access
- 15 December 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 3 (12), e3911
- https://doi.org/10.1371/journal.pone.0003911
Abstract
Genome-wide microarray analysis of gene expression is widely used in genomic studies to find coexpression patterns and locate groups of co-transcribed genes. However, most studies done at a global “omic” scale do not focus on human samples, and those that do very often use heterogeneous datasets that mix normal with disease-altered samples. Moreover, the technical noise present in genome-wide expression microarrays is a well-reported problem that is often not addressed with robust statistical methods, and estimates of the error in the data are frequently not provided. Here, human genome-wide expression data from a controlled set of normal, healthy tissues are used to build a confident human gene coexpression network that avoids both pathological and technical noise. To achieve this, we describe a new method that combines several statistical and computational strategies: robust normalization and expression signal calculation; correlation coefficients obtained by parametric and non-parametric methods; random cross-validations; and estimation of the statistical accuracy and coverage of the data. Together, these methods provide a series of coexpression datasets in which the level of error is measured and can be tuned. To define the errors, the rates of true positives are calculated by assignment to biological pathways. The results provide a confident human gene coexpression network that includes 3327 gene-nodes and 15841 coexpression-links, and a comparative analysis shows good improvement over previously published datasets. Further functional analysis of a subset core network, validated by two independent methods, shows coherent biological modules that share common transcription factors. The network reveals a map of coexpression clusters organized in well-defined functional constellations.
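The combination of parametric and non-parametric correlation, random cross-validation, and pathway-based true-positive rates described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the two coefficients are combined or which thresholds were used, so the rule "keep a link only if both Pearson and Spearman exceed `r_min` in most random subsamples" and the parameters `r_min`, `rounds`, `frac`, and `support_min` are assumptions chosen for illustration, as are the toy gene names and pathway labels.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Parametric (Pearson) correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Non-parametric (Spearman) correlation: Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


def confident_links(expr, r_min=0.8, rounds=50, frac=0.8,
                    support_min=0.9, seed=0):
    """expr maps gene -> list of expression values in a shared sample order.

    Assumed rule (not from the paper): a gene pair is kept when both
    coefficients reach r_min in at least support_min of the random subsamples.
    """
    rng = random.Random(seed)
    genes = sorted(expr)
    n_samples = len(expr[genes[0]])
    k = min(n_samples, max(3, round(frac * n_samples)))
    pairs = list(combinations(genes, 2))
    hits = dict.fromkeys(pairs, 0)
    for _ in range(rounds):
        cols = rng.sample(range(n_samples), k)  # random subset of samples
        sub = {g: [expr[g][c] for c in cols] for g in genes}
        for a, b in pairs:
            if (pearson(sub[a], sub[b]) >= r_min
                    and spearman(sub[a], sub[b]) >= r_min):
                hits[(a, b)] += 1
    return {p for p, n in hits.items() if n / rounds >= support_min}


def pathway_precision(links, pathways):
    """Share of links whose genes have a pathway in common.

    Links with an unannotated gene are skipped rather than counted as errors.
    """
    scored = [(a, b) for a, b in links if a in pathways and b in pathways]
    if not scored:
        return float("nan")
    return sum(1 for a, b in scored if pathways[a] & pathways[b]) / len(scored)


# Toy data: g1 and g2 rise together, g3 is unrelated.
expr = {
    "g1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "g2": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
    "g3": [5, 1, 4, 2, 8, 3, 9, 1, 7, 2],
}
pathways = {"g1": {"p1"}, "g2": {"p1", "p2"}, "g3": {"p3"}}
links = confident_links(expr)
precision = pathway_precision(links, pathways)
```

Raising `r_min` or `support_min` in this sketch trades coverage (fewer links) for accuracy (a higher fraction of links sharing a pathway), which is the sense in which the error level of such a network can be tuned.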
Two major regions in this network correspond to genes involved in nuclear and mitochondrial metabolism, and investigation of their functional assignment indicates that more than 60% are house-keeping and essential genes. The network displays previously undescribed gene associations, and it allows some uncharacterized genes to be placed in a functional context based on their interactions with known gene families. The identification of stable and reliable human gene-to-gene coexpression networks is essential to unravel the interactions and functional correlations between human genes at an omic scale. This work contributes to this aim, and we are making the validated human gene coexpression networks available to the scientific community to allow further analyses of the network or of specific gene associations. The data are freely available online at http://bioinfow.dep.usal.es/coexpression/.
This publication has 32 references indexed in Scilit:
- Comparative analysis of microarray normalization procedures: effects on reverse engineering gene networks. Bioinformatics, 2007
- Characterization of mismatch and high-signal intensity probes associated with Affymetrix genechips. Bioinformatics, 2007
- A genetic signature of interspecies variations in gene expression. Nature Genetics, 2006
- Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 2006
- Assessment and integration of publicly available SAGE, cDNA microarray, and oligonucleotide microarray expression data for global coexpression analyses. Genomics, 2005
- Coexpression Analysis of Human Genes Across Many Microarray Data Sets. Genome Research, 2004
- The yeast coexpression network has a small‐world, scale‐free architecture and can be explained by a simple model. EMBO Reports, 2004
- Estimating genomic coexpression networks using first-order conditional independence. Genome Biology, 2004
- A Gene-Coexpression Network for Global Discovery of Conserved Genetic Modules. Science, 2003
- A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics, 2003