Integration of transcriptomic data identifies key hallmark genes in hypertrophic cardiomyopathy
Open Access
- 6 July 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Cardiovascular Disorders
- Vol. 21 (1), 1-10
- https://doi.org/10.1186/s12872-021-02147-7
Abstract
Background: Hypertrophic cardiomyopathy (HCM) is one of the most common inherited heart diseases. To identify key molecules involved in the development of HCM, gene expression patterns in heart tissue samples from HCM patients were investigated across multiple microarray and RNA-seq platforms.
Methods: Significant genes were obtained by intersecting two gene sets: the differentially expressed genes (DEGs) identified in the microarray data and those identified in the RNA-seq data. These genes were then ranked using the minimum-Redundancy Maximum-Relevance (mRMR) feature selection algorithm and assessed with three machine learning classification methods: support vector machines, random forest and k-Nearest Neighbor.
Results: Outstanding classification results were achieved using only the top eight genes of the ranking. These eight genes were treated as candidate HCM hallmark genes, and their interactions with known HCM disease genes were explored through a protein–protein interaction (PPI) network. Most candidate HCM hallmark genes had direct or indirect interactions with known HCM disease genes in the PPI network, particularly the hub genes JAK2 and GADD45A.
Conclusions: This study highlights the value of integrating transcriptomic data, in combination with machine learning methods, for gaining insight into the key hallmark genes in the genetic etiology of HCM.