Improving molecular cancer class discovery through sparse non-negative matrix factorization
Open Access
- 8 September 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (21), 3970-3975
- https://doi.org/10.1093/bioinformatics/bti653
Abstract
Motivation: Identifying different cancer classes or subclasses with similar morphological appearances is a challenging problem with important implications for cancer diagnosis and treatment. Clustering based on gene-expression data has been shown to be a powerful method for cancer class discovery. Non-negative matrix factorization is one such method and has been shown to have advantages over other clustering techniques, such as hierarchical clustering or self-organizing maps. In this paper, we investigate the benefit of explicitly enforcing sparseness in the factorization process.
Results: We report an improved unsupervised method for cancer classification from gene-expression profiles via sparse non-negative matrix factorization. We demonstrate the improvement by direct comparison with classic non-negative matrix factorization on three well-studied datasets. In addition, we illustrate how to identify a small subset of co-expressed genes that may be directly involved in cancer.
Contact: g1m1c1@receptor.med.harvard.edu, ygao@receptor.med.harvard.edu
Supplementary information: http://arep.med.harvard.edu/snmf/supplement.htm
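To make the idea of enforcing sparseness during factorization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a genes-by-samples matrix V, a generic L1 penalty on the coefficient matrix H applied through multiplicative updates, and cluster assignment by the largest coefficient per sample. The function name `sparse_nmf`, the penalty weight `lam`, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def sparse_nmf(V, k, lam=0.1, n_iter=500, seed=0, eps=1e-9):
    """Approximate V (genes x samples) by W @ H with W, H >= 0.

    Illustrative heuristic: an L1 penalty `lam` on H is folded into
    the multiplicative update. This is an assumed generic variant,
    not the specific algorithm of the paper.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # H update: the lam term in the denominator shrinks H toward zero.
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        # W update: standard multiplicative rule for the squared error.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale so each column of W has unit norm; W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

# Synthetic example (assumed data): two groups of samples, each with
# its own block of highly expressed genes, plus small positive noise.
rng = np.random.default_rng(1)
V = rng.random((40, 10)) * 0.1
V[:20, :5] += 5.0
V[20:, 5:] += 5.0

W, H = sparse_nmf(V, k=2)
# Assign each sample to the factor with the largest coefficient.
labels = H.argmax(axis=0)
```

In this sketch the columns of W play the role of gene patterns and the columns of H give each sample's loading on those patterns; a larger `lam` pushes more entries of H toward zero, so that each sample loads mainly on one pattern.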