Expanding the taxonomic range in the fecal metagenome
Open Access
- 9 June 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 22 (1), 1-14
- https://doi.org/10.1186/s12859-021-04212-6
Abstract
Background: Except for bacteria, the taxonomic diversity of the human fecal metagenome has not been widely studied, despite the potential importance of viruses and eukaryotes. Widely used bioinformatic tools contain limited numbers of non-bacterial species in their databases compared to available genomic sequences and their methodologies do not favour classification of rare sequences which may represent only a small fraction of their parent genome. In seeking to optimise identification of non-bacterial species, we evaluated five widely-used metagenome classifier programs (BURST, Kraken2, Centrifuge, MetaPhlAn2 and CCMetagen) for their ability to correctly assign and count simulations of bacterial, viral and eukaryotic DNA sequence reads, including the effect of taxonomic order of analysis of bacteria, viruses and eukaryotes and the effect of sequencing depth.
Results: We found that the precision of metagenome classifiers varied significantly between programs and between taxonomic groups. When classifying viruses and eukaryotes, ordering the analysis such that bacteria were classified first significantly improved classification precision. Increasing sequencing depth decreased classification precision and did not improve recall of rare species.
Conclusions: Choice of metagenome classifier program can have a marked effect on results with respect to precision of species assignment in different taxonomic groups. The order of taxonomic classification can markedly improve precision. Increasing sequencing depth can decrease classification precision and yields diminishing returns in probability of species detection.
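The abstract reports precision and recall without defining them. As a minimal sketch, assuming the usual read-level definitions (precision is the fraction of reads assigned to a species that truly come from it; recall is the fraction of a species' true reads that were assigned to it), the scoring of a classifier against simulated reads could look like the Python below. The function name `precision_recall`, the input layout and the toy read assignments are hypothetical and are not taken from the paper.

```python
from collections import Counter


def precision_recall(truth, predicted):
    """Score read-level species calls against a simulated ground truth.

    truth:     dict mapping read id -> true species name
    predicted: dict mapping read id -> assigned species name, or None
               when the classifier left the read unclassified
    Returns a dict mapping species -> (precision, recall); a value is
    None when its denominator is zero.
    """
    actual = Counter(truth.values())   # true reads per species
    assigned = Counter()               # reads the classifier gave to each species
    correct = Counter()                # assigned reads that match the truth

    for read_id, true_species in truth.items():
        call = predicted.get(read_id)
        if call is None:
            continue                   # unclassified reads only lower recall
        assigned[call] += 1
        if call == true_species:
            correct[call] += 1

    scores = {}
    for species in set(actual) | set(assigned):
        precision = correct[species] / assigned[species] if assigned[species] else None
        recall = correct[species] / actual[species] if actual[species] else None
        scores[species] = (precision, recall)
    return scores


# Toy data (invented for illustration): one viral read is misassigned to a
# bacterium and one eukaryotic read is left unclassified.
truth = {"r1": "E. coli", "r2": "E. coli", "r3": "crAssphage", "r4": "C. albicans"}
predicted = {"r1": "E. coli", "r2": "E. coli", "r3": "E. coli", "r4": None}
scores = precision_recall(truth, predicted)
```

In this toy case the misassigned viral read lowers the precision of the bacterial species while its recall stays at 1.0, and both non-bacterial species have a recall of 0. That is the kind of cross-group effect the abstract describes when it compares orders of taxonomic analysis.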
Funding Information
- Juvenile Diabetes Research Foundation Australia (3-SRA-2019-899-M-N)
- Leona M. and Harry B. Helmsley Charitable Trust (3-SRA-2019-899-M-N)
- National Health and Medical Research Council (LCH 1037321, LCH 1173945)