Bias in the estimation of false discovery rate in microarray studies

Open Access

16 August 2005

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 21 (20), 3865-3872
https://doi.org/10.1093/bioinformatics/bti626

Abstract

Motivation: The false discovery rate (FDR) provides a key statistical assessment for microarray studies. Its value depends on the proportion π₀ of non-differentially expressed (non-DE) genes. In most microarray studies, many genes have small effects not easily separable from non-DE genes. As a result, current methods often overestimate π₀ and FDR, leading to unnecessary loss of power in the overall analysis.

Methods: For the common two-sample comparison we derive a natural mixture model of the test statistic and an explicit bias formula in the standard estimation of π₀. We suggest an improved estimation of π₀ based on the mixture model and describe a practical likelihood-based procedure for this purpose.

Results: The analysis shows that a large bias occurs when π₀ is far from 1 and when the non-centrality parameters of the distribution of the test statistic are near zero. The theoretical result also explains substantial discrepancies between non-parametric and model-based estimates of π₀. Simulation studies indicate that mixture-model estimates are less biased than standard estimates. The method is applied to breast cancer and lymphoma data examples.

Availability: An R package, OCplus, containing functions to compute π₀ based on the mixture model, the resulting FDR and other operating characteristics of microarray data, is freely available at http://www.meb.ki.se/~yudpaw

Contact: yudi.pawitan@meb.ki.se and alexander.ploner@meb.ki.se
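The direction of the bias described in the abstract can be illustrated with a small calculation. The sketch below is not the authors' formula or the OCplus implementation. It assumes a simplified setting that the abstract does not state: a two-sided z-test with unit variance in place of the two-sample t-statistic, a single shared non-centrality parameter `delta` for all DE genes, and the common threshold estimator of π₀, namely the number of p-values above a cutoff `lam` divided by m(1 − `lam`). The function names and the default `lam = 0.5` are hypothetical choices made for illustration.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used as a stand-in for the t-distribution


def prob_p_above(lam: float, delta: float) -> float:
    """P(two-sided p-value > lam) when the test statistic is N(delta, 1)."""
    c = _STD.inv_cdf(1 - lam / 2)  # |z| < c is equivalent to p > lam
    return _STD.cdf(c - delta) - _STD.cdf(-c - delta)


def expected_pi0_hat(pi0: float, delta: float, lam: float = 0.5) -> float:
    """Expected value of the threshold estimator #{p > lam} / (m * (1 - lam)).

    Non-DE genes contribute pi0 exactly; DE genes leak into the region
    p > lam with probability prob_p_above(lam, delta).
    """
    return pi0 + (1 - pi0) * prob_p_above(lam, delta) / (1 - lam)


def bias(pi0: float, delta: float, lam: float = 0.5) -> float:
    """Upward bias of the threshold estimator under the assumed model."""
    return expected_pi0_hat(pi0, delta, lam) - pi0


if __name__ == "__main__":
    # Tabulate the bias over a grid of assumed pi0 and delta values.
    for pi0 in (0.5, 0.7, 0.9):
        for delta in (0.5, 1.0, 3.0):
            print(f"pi0={pi0:.1f} delta={delta:.1f} bias={bias(pi0, delta):.3f}")
```

Under these assumptions the bias is never negative, shrinks as `delta` moves away from zero, and scales with 1 − π₀, which agrees qualitatively with the Results statement that a large bias occurs when π₀ is far from 1 and the non-centrality parameters are near zero.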
