A distribution free summarization method for Affymetrix GeneChip® arrays
Open Access
- 5 December 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (3), 321-327
- https://doi.org/10.1093/bioinformatics/btl609
Abstract
Motivation: Affymetrix GeneChip arrays require summarization in order to combine the probe-level intensities into one value representing the expression level of a gene. However, probe intensity measurements are expected to be affected by different levels of non-specific and cross-hybridization to non-target transcripts. Here, we present a new summarization technique, the Distribution Free Weighted method (DFW), which uses information about the variability in probe behavior to estimate the extent of non-specific and cross-hybridization for each probe. The contribution of the probe is weighted accordingly during summarization, without making any distributional assumptions for the probe-level data.

Results: We compare DFW with several popular summarization methods on spike-in datasets, via both our own calculations and the ‘Affycomp II’ competition. The results show that DFW outperforms other methods when sensitivity and specificity are considered simultaneously. With the Affycomp spike-in datasets, the area under the receiver operating characteristic curve for DFW is nearly 1.0 (a perfect value), indicating that DFW can identify all differentially expressed genes with few false positives. The approach used is also computationally faster than most other methods in current use.

Availability: The R code for DFW is available upon request.

Contact: mmcgee@smu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
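The Results statement relies on the area under the receiver operating characteristic curve (AUC) as a joint measure of sensitivity and specificity. As a minimal sketch of what that number means, and not the authors' or Affycomp's evaluation code, the AUC can be computed as the fraction of (spiked, non-spiked) probe-set pairs in which the spiked one receives the higher score. The function name `roc_auc`, the scoring by absolute log fold change, and the toy values are all assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """AUC via the pairwise (Mann-Whitney) identity; ties count as 0.5.

    scores: one number per probe set (here, hypothetical |log fold change|)
    labels: 1 if the probe set is truly differentially expressed, else 0
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # spiked probe set ranked above non-spiked
            elif p == n:
                wins += 0.5      # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
is_spiked = [1, 1, 0, 0, 1, 0]

# Every spiked probe set outranks every non-spiked one: perfect separation.
auc_perfect = roc_auc([2.1, 1.8, 0.3, 0.2, 1.9, 0.1], is_spiked)

# One spiked probe set (score 0.25) falls below one non-spiked (score 0.3).
auc_imperfect = roc_auc([2.1, 0.25, 0.3, 0.2, 1.9, 0.1], is_spiked)
```

An AUC of 1.0 corresponds to the first case, where a threshold exists that recovers every spiked probe set with no false positives; each misordered pair lowers the value, as in the second case.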