Supervised analysis when the number of candidate features (p) greatly exceeds the number of cases (n)
- 1 December 2003
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGKDD Explorations Newsletter
- Vol. 5 (2), 31-36
- https://doi.org/10.1145/980972.980978
Abstract
New genomic and proteomic technologies provide measurements of thousands of features for each case. This provides a context for enhanced discovery and false discovery. Most statistical and machine learning procedures were not developed for the p>>n setting, and the literature of DNA microarray studies contains many examples of misuse of analytic and computational methods such as cross-validation. This paper highlights some of the key aspects of p>>n problems for identifying informative features and developing accurate classifiers.
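One common way cross-validation is misused when p>>n is to select features on the full data set and only then cross-validate the classifier. The sketch below illustrates that effect on simulated pure-noise data; it is not taken from the paper. The scoring rule (standardized difference in class means), the nearest-centroid classifier, the sample sizes, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def top_k_features(X, y, k):
    """Indices of the k features with the largest standardized class-mean difference."""
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)
    score = np.abs(m1 - m0) / (X.std(axis=0) + 1e-12)
    return np.argsort(score)[-k:]

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test case to the class whose training centroid is closer."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_accuracy(X, y, k, n_folds, select_inside, rng):
    """K-fold accuracy; feature selection is done inside each fold or once on all data."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    if not select_inside:
        # Misuse: the held-out cases have already influenced the feature choice.
        global_feats = top_k_features(X, y, k)
    correct = 0
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        X_tr, y_tr = X[train_mask], y[train_mask]
        feats = top_k_features(X_tr, y_tr, k) if select_inside else global_feats
        pred = nearest_centroid_predict(X_tr[:, feats], y_tr, X[test_idx][:, feats])
        correct += int((pred == y[test_idx]).sum())
    return correct / n

# Simulated data with no real signal: n = 40 cases, p = 2000 features (assumed sizes).
rng = np.random.default_rng(0)
n, p, k = 40, 2000, 10
X = rng.standard_normal((n, p))
y = np.repeat([0, 1], n // 2)

biased = cv_accuracy(X, y, k, n_folds=5, select_inside=False, rng=rng)
honest = cv_accuracy(X, y, k, n_folds=5, select_inside=True, rng=rng)
print(f"selection outside CV: {biased:.2f}")
print(f"selection inside CV:  {honest:.2f}")
```

Because the labels carry no information about the features, an unbiased accuracy estimate should sit near chance. Selecting features before splitting lets the held-out cases leak into the model and typically inflates the estimate well above that.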
This publication has 23 references indexed in Scilit:
- Evolutionary algorithms for finding optimal gene sets in microarray prediction. Bioinformatics, 2003
- Selection bias in gene extraction on the basis of microarray gene-expression data. Proceedings of the National Academy of Sciences of the United States of America, 2002
- New feature subset selection procedures for classification of expression profiles. Genome Biology, 2002
- Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. Journal of the American Statistical Association, 2002
- Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Medicine, 2001
- Singular value decomposition for genome-wide expression data processing and modeling. Proceedings of the National Academy of Sciences of the United States of America, 2000
- Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature, 2000
- Tissue Classification with Gene Expression Profiles. Journal of Computational Biology, 2000
- 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. Genome Biology, 2000
- Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, 1999