pIRS: Profile-based Illumina pair-end reads simulator
- 15 April 2012
- journal article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 28 (11), 1533-1535
- https://doi.org/10.1093/bioinformatics/bts187
Abstract
Motivation: Next-generation high-throughput sequencing technologies, especially those from Illumina, have been widely used in re-sequencing and de novo assembly studies. However, no existing software can yet simulate Illumina reads with real error and quality distributions and coverage bias, a capability that is very useful for developing related software and for designing sequencing projects.
Results: We provide a software package, pIRS (profile-based Illumina pair-end reads simulator), which simulates Illumina reads with empirical Base-Calling and GC%-depth profiles trained from real re-sequencing data. The error and quality distributions, as well as the coverage bias patterns, of reads simulated with pIRS fit the properties of real sequencing data better than those of existing simulators. In addition, pIRS comes with a tool to simulate heterozygous diploid genomes.
Availability: pIRS is written in C++ and Perl, and is freely available at ftp://ftp.genomics.org.cn/pub/pIRS/.
Contact: fanweisz09@gmail.com
Supplementary information: Supplementary data are available at Bioinformatics online.
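As a rough illustration of what "profile-based" simulation means, the sketch below draws a quality score for each read cycle from a per-cycle distribution and then substitutes the base with the error probability implied by that score under the standard Phred definition, p = 10^(-Q/10). It is a toy under stated assumptions, not pIRS's actual algorithm or profile format: the profile values, `TOY_PROFILE`, `phred_to_error_prob` and `simulate_read` are all hypothetical names and numbers introduced here, and GC%-depth coverage bias is not modeled.

```python
import random

# Hypothetical toy profile: for each read cycle, a list of (phred_quality, weight).
# The numbers are invented for illustration; a real profile would be trained
# from re-sequencing data.
TOY_PROFILE = [
    [(40, 0.9), (30, 0.1)],
    [(40, 0.8), (30, 0.15), (20, 0.05)],
    [(30, 0.6), (20, 0.3), (10, 0.1)],
    [(20, 0.5), (10, 0.4), (2, 0.1)],
]


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def simulate_read(template, profile, rng):
    """Return (read, qualities) for a template no longer than the profile."""
    bases = []
    quals = []
    for cycle, ref_base in enumerate(template):
        # Sample this cycle's quality score from its empirical distribution.
        scores, weights = zip(*profile[cycle])
        q = rng.choices(scores, weights=weights, k=1)[0]
        base = ref_base
        # Introduce a substitution with the probability implied by the quality.
        if rng.random() < phred_to_error_prob(q):
            base = rng.choice([b for b in "ACGT" if b != ref_base])
        bases.append(base)
        quals.append(q)
    return "".join(bases), quals


rng = random.Random(0)  # fixed seed so the sketch is reproducible
read, quals = simulate_read("ACGT", TOY_PROFILE, rng)
print(read, quals)
```

Sampling quality first and deriving the error from it keeps the two consistent by construction, which is the simplest way to make simulated error rates track simulated quality scores.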