A Filtering Method to Generate High Quality Short Reads Using Illumina Paired-End Technology
Open Access
- 17 June 2013
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 8 (6), e66643
- https://doi.org/10.1371/journal.pone.0066643
Abstract
Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, lack of consensus between very similar sequences in metagenomic studies can, and often does, represent natural variation of biological significance. The machine-assigned quality scores commonly used on next-generation platforms do not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language, with user instructions, can be obtained from https://github.com/meren/illumina-utils.
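The overlap idea described in the abstract can be illustrated with a minimal sketch. This is not the illumina-utils implementation: the function names, the fixed `overlap_len` parameter, and the `max_mismatches` threshold are all hypothetical choices made for illustration. It assumes read 2 is reported on the opposite strand, so that its reverse complement begins with the same template positions that end read 1, and that the overlap length is known in advance.

```python
# Illustrative sketch only; names and parameters are assumptions,
# not the API of illumina-utils.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(read1: str, read2: str, overlap_len: int) -> int:
    """Count disagreements between the two reads in their overlap.

    Assumes the last `overlap_len` bases of read1 and the first
    `overlap_len` bases of reverse-complemented read2 cover the same
    template positions.
    """
    if not 0 < overlap_len <= min(len(read1), len(read2)):
        raise ValueError("overlap_len must fit inside both reads")
    tail = read1[-overlap_len:]
    head = reverse_complement(read2)[:overlap_len]
    return sum(a != b for a, b in zip(tail, head))


def filter_pairs(pairs, overlap_len: int, max_mismatches: int = 0):
    """Keep only read pairs whose overlap has at most `max_mismatches`."""
    return [
        (r1, r2)
        for r1, r2 in pairs
        if overlap_mismatches(r1, r2, overlap_len) <= max_mismatches
    ]
```

In this sketch a pair is discarded when its two reads disagree in the overlap by more than the threshold, which treats disagreement between two independent observations of the same bases as a sign that at least one read is error-prone. A real pipeline would also need to handle quality scores, ambiguous bases, and variable overlap lengths.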