Assessing and Improving Methods Used in Operational Taxonomic Unit-Based Approaches for 16S rRNA Gene Sequence Analysis
- 15 May 2011
- journal article
- research article
- Published by American Society for Microbiology in Applied and Environmental Microbiology
- Vol. 77 (10), 3219-3226
- https://doi.org/10.1128/aem.02810-10
Abstract
In spite of technical advances that have provided increases in orders of magnitude in sequencing coverage, microbial ecologists still grapple with how to interpret the genetic diversity represented by the 16S rRNA gene. Two widely used approaches put sequences into bins based on either their similarity to reference sequences (i.e., phylotyping) or their similarity to other sequences in the community (i.e., operational taxonomic units [OTUs]). In the present study, we investigate three issues related to the interpretation and implementation of OTU-based methods. First, we confirm the conventional wisdom that it is impossible to create an accurate distance-based threshold for defining taxonomic levels and instead advocate for a consensus-based method of classifying OTUs. Second, using a taxonomic-independent approach, we show that the average neighbor clustering algorithm produces more robust OTUs than other hierarchical and heuristic clustering algorithms. Third, we demonstrate several steps to reduce the computational burden of forming OTUs without sacrificing the robustness of the OTU assignment. Finally, by blending these solutions, we propose a new heuristic that has a minimal effect on the robustness of OTUs and significantly reduces the necessary time and memory requirements. The ability to quickly and accurately assign sequences to OTUs and then obtain taxonomic information for those OTUs will greatly improve OTU-based analyses and overcome many of the challenges encountered with phylotype-based methods.
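As a rough illustration of two ideas named in the abstract, the sketch below groups sequences into OTUs with average neighbor (average linkage) clustering and then assigns each OTU a consensus taxonomy. It is a toy under stated assumptions, not the authors' implementation: the function names `average_neighbor_otus` and `consensus_taxonomy`, the 0.03 distance cutoff, the 51% majority threshold, the distance matrix, and the lineages are all hypothetical values chosen for the example.

```python
from collections import Counter
from itertools import combinations


def average_neighbor_otus(dist, cutoff):
    """Agglomerative average-linkage clustering on a square distance matrix.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance and stops once that distance exceeds `cutoff`.
    """
    clusters = [[i] for i in range(len(dist))]

    def mean_distance(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (x, y), d = min(
            (((a, b), mean_distance(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if d > cutoff:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


def consensus_taxonomy(lineages, min_fraction=0.51):
    """Deepest lineage shared by at least `min_fraction` of an OTU's sequences.

    `min_fraction` is an assumed majority threshold, not a value from the paper.
    """
    consensus = []
    members = list(lineages)
    for rank in range(min(len(lin) for lin in members)):
        name, count = Counter(lin[rank] for lin in members).most_common(1)[0]
        if count / len(lineages) < min_fraction:
            break
        consensus.append(name)
        # Only sequences that agree so far vote at the next rank.
        members = [lin for lin in members if lin[rank] == name]
    return consensus


# Hypothetical pairwise distances among five sequences (illustrative only).
D = [
    [0.00, 0.01, 0.02, 0.12, 0.13],
    [0.01, 0.00, 0.02, 0.11, 0.12],
    [0.02, 0.02, 0.00, 0.10, 0.11],
    [0.12, 0.11, 0.10, 0.00, 0.01],
    [0.13, 0.12, 0.11, 0.01, 0.00],
]

# Hypothetical per-sequence classifications, domain through genus.
TAXONOMY = [
    ("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Streptococcaceae", "Streptococcus"),
    ("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Streptococcaceae", "Streptococcus"),
    ("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Streptococcaceae", "Lactococcus"),
    ("Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacteriales", "Enterobacteriaceae", "Escherichia"),
    ("Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacteriales", "Enterobacteriaceae", "Salmonella"),
]

otus = average_neighbor_otus(D, cutoff=0.03)
otu_labels = [consensus_taxonomy([TAXONOMY[i] for i in otu]) for otu in otus]

for otu, label in zip(otus, otu_labels):
    print(otu, ";".join(label))
```

In this toy, the OTU whose two members disagree at the genus level is labeled only down to the family, which is the point of a consensus label: the taxonomy comes from what the clustered sequences agree on rather than from a fixed distance cutoff standing in for a taxonomic rank.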