Improving Cancer Classification Accuracy Using Gene Pairs
Open Access
- 21 December 2010
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 5 (12), e14305
- https://doi.org/10.1371/journal.pone.0014305
Abstract
Recent studies suggest that the deregulation of pathways, rather than of individual genes, may be critical in triggering carcinogenesis. Pathway deregulation is often caused by the simultaneous deregulation of more than one gene in the pathway. This suggests that robust gene pair combinations may exploit the underlying bio-molecular reactions that are relevant to pathway deregulation, and thus could provide better biomarkers for cancer than individual genes. To test this hypothesis, we used gene pair combinations, called doublets, as input to cancer classification algorithms in place of the original expression values, and we showed that classification accuracy improved consistently across different datasets and classification algorithms. We validated the proposed approach using nine cancer datasets and five classification algorithms: Prediction Analysis for Microarrays (PAM), C4.5 Decision Trees (DT), Naive Bayesian (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN).
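The doublet idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how two expression values are combined into a doublet, so the pairwise difference and sum used here are assumptions made purely for illustration. The helper names (`make_doublets`, `knn_predict`) and the toy expression matrix are likewise hypothetical. A small NumPy k-NN stands in for the classifiers named in the abstract, and the example makes no claim about accuracy.

```python
from itertools import combinations

import numpy as np


def make_doublets(X, op="diff"):
    """Turn an expression matrix into gene-pair (doublet) features.

    X has shape (n_samples, n_genes). Every unordered gene pair (i, j)
    with i < j becomes one feature. The combination rule is an
    assumption: "diff" gives x_i - x_j, "sum" gives x_i + x_j.
    """
    pairs = list(combinations(range(X.shape[1]), 2))
    i = np.array([p[0] for p in pairs])
    j = np.array([p[1] for p in pairs])
    if op == "diff":
        doublets = X[:, i] - X[:, j]
    elif op == "sum":
        doublets = X[:, i] + X[:, j]
    else:
        raise ValueError(f"unknown op: {op!r}")
    return doublets, pairs


def knn_predict(X_train, y_train, X_test, k=3):
    """Plain k-NN with Euclidean distance and majority vote."""
    # Pairwise distances: one row per test sample, one column per training sample.
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])


# Made-up toy data: 6 samples x 4 genes. In class 0, gene 0 is expressed
# above gene 1; in class 1 the order is reversed. Genes 2 and 3 are constant.
X_train = np.array([
    [5.0, 1.0, 3.0, 3.0],
    [6.0, 2.0, 3.0, 3.0],
    [7.0, 3.0, 3.0, 3.0],
    [1.0, 5.0, 3.0, 3.0],
    [2.0, 6.0, 3.0, 3.0],
    [3.0, 7.0, 3.0, 3.0],
])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([
    [9.0, 5.0, 3.0, 3.0],
    [5.0, 9.0, 3.0, 3.0],
])

# Classify on doublet features instead of the raw expression values.
D_train, pairs = make_doublets(X_train)
D_test, _ = make_doublets(X_test)
preds = knn_predict(D_train, y_train, D_test, k=3)
print(pairs)
print(preds)
```

With `n` genes this produces `n * (n - 1) / 2` doublets, so on real microarray data some gene or pair selection step would be needed before building all pairs. The sketch omits that step.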