Systematic comparison of RNA-Seq normalization methods using measurement error models
Open Access
- 22 August 2012
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 28 (20), 2584-2591
- https://doi.org/10.1093/bioinformatics/bts497
Abstract
Motivation: Further advancement of RNA-Seq technology and its application call for the development of effective normalization methods for RNA-Seq data. Currently, different normalization methods are compared and validated by their correlations with a certain gold standard. Gene expression measurements generated by a different technology or platform such as Real-time reverse transcription polymerase chain reaction (qRT–PCR) or Microarray are usually used as the gold standard. Although the current approach is intuitive and easy to implement, it becomes statistically inadequate when the gold standard is also subject to measurement error (ME). Furthermore, the current approach is not informative, because the correlation of a normalization method with a certain gold standard does not provide much information about the exact quality of the normalized RNA-Seq measurements.
Results: We propose to use the system of ME models based on qRT–PCR, Microarray and RNA-Seq gene expression data to compare and validate RNA-Seq normalization methods. This approach does not assume the existence of a gold standard. The performance of a normalization method can be characterized by a group of parameters of the system, which are referred to as the performance parameters, and these performance parameters can be consistently estimated. Different normalization methods can thus be compared by comparing their corresponding estimated performance parameters. We applied the proposed approach to compare five existing RNA-Seq normalization methods using the gene expression data of two RNA samples from the microArray Quality Control and Sequencing Quality Control projects and gained much insight about the pros and cons of these methods.
Contact: sunz@purdue.edu; yuzhu@purdue.edu
This publication has 15 references indexed in Scilit:
- Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays. BMC Genomics, 2010
- Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Research, 2010
- Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics, 2010
- Modeling non-uniformity in short-read rates in RNA-Seq data. Genome Biology, 2010
- Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data. Bioinformatics, 2009
- Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 2009
- RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Research, 2008
- Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nature Methods, 2008
- Multiple-laboratory comparison of microarray platforms. Nature Methods, 2005
- The elimination of primer-dimer accumulation in PCR. Nucleic Acids Research, 1997
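The idea in the abstract, that three error-prone platforms can be assessed jointly without treating any one of them as a gold standard, can be illustrated with a small simulation. The sketch below is not the authors' model, which the abstract does not specify. It assumes a simple linear ME model in which each platform reports `intercept + slope * theta + noise` for a shared latent expression level `theta`, with noise independent across platforms, and it fixes the scale by setting the first platform's slope to 1. The function names, the simulated coefficients and the method-of-moments estimator are all illustrative assumptions.

```python
import numpy as np


def estimate_me_parameters(x_ref, x_b, x_c):
    """Method-of-moments estimates for an assumed three-platform linear ME model.

    Assumed model (hypothetical, for illustration only):
        x_j = a_j + b_j * theta + e_j,  j = 1, 2, 3,
    with errors e_j independent of theta and of each other, and b_1 = 1
    to fix the scale. The first platform sets the scale but is NOT
    assumed to be error-free.
    """
    cov = np.cov(np.vstack([x_ref, x_b, x_c]))
    c12, c13, c23 = cov[0, 1], cov[0, 2], cov[1, 2]

    # Under the model: c12 = b2*V, c13 = b3*V, c23 = b2*b3*V, V = var(theta).
    var_theta = c12 * c13 / c23
    slopes = np.array([1.0, c23 / c13, c23 / c12])

    # Whatever variance the shared signal does not explain is error variance.
    error_vars = np.diag(cov) - slopes**2 * var_theta
    return slopes, error_vars, var_theta


def simulate(n_genes=100_000, seed=0):
    """Simulate three noisy measurements of one latent expression vector."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(5.0, 2.0, n_genes)                        # latent truth
    qpcr = theta + rng.normal(0.0, 0.5, n_genes)                 # slope 1.0
    array = 1.0 + 0.8 * theta + rng.normal(0.0, 0.7, n_genes)    # slope 0.8
    rnaseq = -0.5 + 1.1 * theta + rng.normal(0.0, 0.4, n_genes)  # slope 1.1
    return theta, qpcr, array, rnaseq


if __name__ == "__main__":
    theta, qpcr, array, rnaseq = simulate()
    slopes, error_vars, var_theta = estimate_me_parameters(qpcr, array, rnaseq)
    print("estimated slopes:         ", slopes)
    print("estimated error variances:", error_vars)
    # Correlation with a noisy "gold standard" understates the true quality.
    print("corr(RNA-Seq, qRT-PCR):", np.corrcoef(rnaseq, qpcr)[0, 1])
    print("corr(RNA-Seq, truth):  ", np.corrcoef(rnaseq, theta)[0, 1])
```

In this toy setting the estimated slope and error variance of the RNA-Seq column play the role the abstract assigns to performance parameters: rerunning the estimator with the RNA-Seq column replaced by the output of a different normalization method would yield a different pair to compare. The last two printed values show why correlation against an error-prone reference is a weak criterion, since the reference's own noise pulls the correlation down.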