An efficient SVM based tumor classification with symmetry Non-negative Matrix Factorization using gene expression data
- 1 February 2013
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Reliable and accurate identification of tumor type is crucial to the proper treatment of cancer, so tumor classification is both a practical and a theoretical necessity. DNA microarrays provide a technique for measuring gene expression that has attracted considerable research interest in recent years. Gene expression data from microarrays (biochips) have been suggested for use in many biomedical areas, for example cancer classification. Although several classification methods, both new and existing, have been tested, selecting a proper (optimal) set of genes whose expression levels can serve as the basis for classification is still an open problem. This paper presents a new method for tumor classification using gene expression data. In the proposed method, genes are first selected using Nonnegative Matrix Factorization (NMF). To improve classification performance, Symmetry NMF (SymNMF) is then used to extract features from the selected genes. As a last step, a machine learning approach classifies the tumor samples using the extracted features; for better classification, a Support Vector Machine with Weighted Kernel Width (WSVM) is used. The performance of the proposed approach is tested on the colon cancer data set and the acute leukemia data set. The experimental results show that the proposed approach performs better than the traditional approaches.
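The first two stages described in the abstract (gene selection with NMF, then feature extraction with SymNMF) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the selection criterion, the similarity matrix, or the update rules, so the max-loading gene score, the cosine similarity, the multiplicative updates, the function names, and the synthetic data are all hypothetical choices made here. The classification stage is omitted; the resulting features would be passed to an SVM such as the WSVM named in the abstract.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for V ~ W @ H with all entries nonnegative."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def select_genes(V, k, n_genes):
    """Assumed criterion: rank genes (rows of V) by their largest loading in W."""
    W, _ = nmf(V, k)
    scores = W.max(axis=1)
    top = np.argsort(scores)[::-1][:n_genes]
    return np.sort(top)

def symnmf(A, k, n_iter=300, eps=1e-9, seed=0):
    """Damped multiplicative updates for symmetric A ~ H @ H.T with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k)) + eps
    for _ in range(n_iter):
        H *= 0.5 + 0.5 * (A @ H) / (H @ (H.T @ H) + eps)
    return H

def extract_features(V_sel, k):
    """Assumed similarity: cosine similarity between samples, factored by SymNMF."""
    X = V_sel.T  # samples x selected genes
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    A = X @ X.T  # symmetric and nonnegative because X >= 0
    return symnmf(A, k)

# Synthetic stand-in for a genes x samples expression matrix (not real data).
rng = np.random.default_rng(42)
V = rng.random((500, 40))
V[:20, :20] += 2.0  # a block of genes overexpressed in the first 20 samples

genes = select_genes(V, k=2, n_genes=50)
features = extract_features(V[genes], k=2)  # one row of k features per sample
```

Each row of `features` is a nonnegative, low-dimensional description of one sample. Because SymNMF factors a sample-by-sample similarity matrix, this sketch produces features only for the samples included in that matrix; handling held-out samples would need an additional step that the abstract does not describe.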