Efficient identification of multiple pathways: RNA-Seq analysis of livers from 56Fe ion irradiated mice
Open Access
- 20 March 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 21 (1), 1-12
- https://doi.org/10.1186/s12859-020-3446-5
Abstract
Interactions of mRNAs with other mRNAs and with other signaling molecules determine different biological pathways and functions. Gene co-expression network analysis methods have been widely used to identify correlation patterns between genes in various biological contexts (e.g., cancer, mouse genetics, yeast genetics). A challenge remains to identify an optimal partition of the networks, in which the individual modules (clusters) are neither too small to support general inferences nor too large to be biologically interpretable. Clustering thresholds for identifying modules are not systematically determined and depend on user-settable parameters that require optimization. The absence of systematic threshold determination may result in suboptimal module identification and a large number of unassigned features. In this study, we propose a new pipeline to perform gene co-expression network analysis. The proposed pipeline employs WGCNA, a software package widely used to perform different aspects of gene co-expression network analysis, together with a Modularity Maximization algorithm, to analyze novel RNA-Seq data in order to understand the effects of low-dose 56Fe ion irradiation on the formation of hepatocellular carcinoma in mice. The network results, along with experimental validation, show that using WGCNA combined with Modularity Maximization provides a more biologically interpretable network in our dataset than that obtainable using WGCNA alone. The proposed pipeline showed better performance than the existing clustering algorithm in WGCNA and identified a module that was biologically validated by a mitochondrial complex I assay. We present a pipeline that can reduce the problem of parameter selection that occurs with the existing algorithm in WGCNA, for applicable RNA-Seq datasets. This may assist in the future discovery of novel mRNA interactions and the elucidation of their potential downstream molecular effects.
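The idea of partitioning a co-expression network by maximizing modularity can be illustrated with a small sketch. It is not the authors' pipeline and does not call WGCNA (which is an R package). It assumes an unsigned soft-threshold adjacency of the form |cor|^beta with an illustrative beta = 6, uses a simple greedy agglomerative search as a stand-in for the Modularity Maximization step, and runs on invented toy expression values. All function names are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency a_ij = |cor_ij| ** beta (beta assumed)."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj

def modularity(adj, labels):
    """Newman's weighted modularity Q for a partition given as one label per gene."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

def greedy_modularity_maximization(adj):
    """Start with singleton modules; repeatedly apply the merge that raises Q most."""
    labels = list(range(len(adj)))
    best_q = modularity(adj, labels)
    while True:
        best_merge = None
        for a, b in combinations(sorted(set(labels)), 2):
            trial = [a if lab == b else lab for lab in labels]
            q = modularity(adj, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:  # no merge improves Q, so stop
            return labels, best_q
        labels = best_merge

# Toy data (invented): rows are genes, columns are samples.
# Genes 0-2 share one expression pattern, genes 3-5 share another.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.3, 2.9, 4.2, 5.1],
    [0.9, 2.1, 3.2, 3.8, 4.9],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [5.2, 0.8, 4.1, 2.2, 2.9],
    [4.9, 1.1, 3.8, 1.9, 3.2],
]
adj = soft_threshold_adjacency(expr)
labels, q = greedy_modularity_maximization(adj)
print("module labels:", labels)
print("modularity Q:", round(q, 3))
```

In this sketch the number of modules falls out of the search, which stops when no merge increases Q, rather than being set by a user-chosen cut height. The brute-force search is only practical for toy inputs; a real analysis would rely on an established community-detection implementation.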
Funding Information
- National Aeronautics and Space Administration (NNX15AD65G)
This publication has 62 references indexed in Scilit:
- STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 2012
- Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research, 2012
- Deciphering Network Community Structure by Surprise. PLOS ONE, 2011
- Parallel Proteomics to Improve Coverage and Confidence in the Partially Annotated Oryctolagus cuniculus Mitochondrial Proteome. Molecular & Cellular Proteomics, 2011
- edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 2009
- A HUPO test sample study reveals common problems in mass spectrometry–based proteomics. Nature Methods, 2009
- WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 2008
- Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America, 2007
- Network biology: understanding the cell's functional organization. Nature Reviews Genetics, 2004
- Comparing partitions. Journal of Classification, 1985