A new method for class prediction based on signed-rank algorithms applied to Affymetrix® microarray experiments
Open Access
- 11 January 2008
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 9 (1), 16
- https://doi.org/10.1186/1471-2105-9-16
Abstract
The huge amount of data generated by DNA chips provides a powerful basis for classifying various pathologies. However, the constant evolution of microarray technology makes it difficult to combine data from different chip types for class prediction in limited sample populations. Affymetrix® technology provides both a quantitative fluorescence signal and a decision (detection call: absent or present) based on signed-rank algorithms applied to several hybridization repeats of each gene, with a per-chip normalization.

We developed a new method for predicting class membership based only on the detection call from recent Affymetrix chip types. Biological data were obtained by hybridization on U133A, U133B and U133Plus 2.0 microarrays of purified normal B cells and cells from three independent groups of multiple myeloma (MM) patients. After a call-based data reduction step to filter out non-class-discriminative probe sets, the resulting gene list was reduced to a predictor, with correction for multiple testing, by iterative deletion of probe sets that sequentially improve inter-class comparisons and their significance. The error rate of the method was determined using leave-one-out and 5-fold cross-validation. The method was successfully applied to (i) derive a sex predictor from the normal donor group that classified gender without error in all patient groups, except for male MM samples with a Y chromosome deletion, (ii) predict the immunoglobulin light and heavy chains expressed by the malignant myeloma clones of the validation group and (iii) predict sex and light and heavy chain nature for every new patient. Finally, the method was shown to be powerful when compared with the popular classification method Prediction Analysis of Microarray (PAM).

This normalization-free method is routinely used for quality control and correction of collection errors in patient reports to clinicians. It can be easily extended to multiple-class prediction suited to clinical groups, and looks particularly promising in international cooperative projects such as the US FDA "Microarray Quality Control" (MAQC) project, as a predictive classifier for diagnosis, prognosis and response to treatment. Finally, it can be used as a powerful tool to mine published data generated on Affymetrix systems and, more generally, to classify samples with binary feature values.
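
The general idea of classifying samples from binary detection calls can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the filtering statistic, the multiple-testing correction or the iterative deletion rule, so the code below substitutes a simple presence-frequency gap filter and a majority vote. All function names, the `min_gap` threshold and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

# Each sample is a list of detection calls, one per probe set: 1 = present, 0 = absent.
Sample = Sequence[int]


def present_fraction(samples: Sequence[Sample], labels: Sequence[str],
                     probe: int, cls: str) -> float:
    """Fraction of samples of class `cls` in which `probe` is called present."""
    calls = [s[probe] for s, lab in zip(samples, labels) if lab == cls]
    return sum(calls) / len(calls)


def select_probes(samples: Sequence[Sample], labels: Sequence[str],
                  min_gap: float = 0.8) -> List[Tuple[int, str]]:
    """Keep probes whose presence frequency differs strongly between two classes.

    Returns (probe index, class in which the probe is mostly present).
    Assumed stand-in for the paper's call-based data reduction step.
    """
    a, b = sorted(set(labels))  # assumes exactly two classes
    selected = []
    for p in range(len(samples[0])):
        gap = (present_fraction(samples, labels, p, a)
               - present_fraction(samples, labels, p, b))
        if abs(gap) >= min_gap:
            selected.append((p, a if gap > 0 else b))
    return selected


def predict(sample: Sample, selected: Sequence[Tuple[int, str]],
            classes: Tuple[str, str]) -> str:
    """Majority vote: a present call votes for the probe's class, an absent call for the other."""
    a, b = classes
    votes = {a: 0, b: 0}
    for p, present_cls in selected:
        other = b if present_cls == a else a
        votes[present_cls if sample[p] == 1 else other] += 1
    # Ties are broken in favour of the alphabetically first class.
    return max(sorted(votes), key=votes.get)


def leave_one_out_error(samples: List[Sample], labels: List[str],
                        min_gap: float = 0.8) -> float:
    """Leave-one-out error rate; probes are re-selected inside every fold."""
    classes = tuple(sorted(set(labels)))
    errors = 0
    for i in range(len(samples)):
        train_s = samples[:i] + samples[i + 1:]
        train_l = labels[:i] + labels[i + 1:]
        selected = select_probes(train_s, train_l, min_gap)
        if predict(samples[i], selected, classes) != labels[i]:
            errors += 1
    return errors / len(samples)


# Toy data (invented): probe 0 is present only in "F", probe 1 only in "M",
# probe 2 is present everywhere and probe 3 is uninformative noise.
toy_samples = [
    [1, 0, 1, 0], [1, 0, 1, 1], [1, 0, 1, 0],
    [0, 1, 1, 1], [0, 1, 1, 0], [0, 1, 1, 1],
]
toy_labels = ["F", "F", "F", "M", "M", "M"]

if __name__ == "__main__":
    print(select_probes(toy_samples, toy_labels))
    print(leave_one_out_error(toy_samples, toy_labels))
```

Re-selecting probes inside each cross-validation fold keeps the held-out sample from influencing the predictor, which is the usual way to avoid an optimistic error estimate. Because the input is binary, no signal normalization step appears anywhere in the sketch.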