Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps
Open Access
- 11 July 2019
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Communications
- Vol. 10 (1), 1-12
- https://doi.org/10.1038/s41467-019-10934-2
Abstract
Metagenomic sequence classification should be fast, accurate and information-rich. Emerging long-read sequencing technologies promise to improve the balance between these factors but most existing methods were designed for short reads. MetaMaps is a new method, specifically developed for long reads, capable of mapping a long-read metagenome to a comprehensive RefSeq database with >12,000 genomes. It achieves 94% accuracy for species-level read assignment and r² > 0.97 for the estimation of sample composition on both simulated and real data when the sample genomes or close relatives are present in the classification database. To address novel species and genera, which are comparatively harder to predict, MetaMaps outputs mapping locations and qualities for all classified reads, enabling functional studies (e.g. gene presence/absence) and detection of incongruities between sample and reference genomes.
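The two headline metrics in the abstract, species-level read-assignment accuracy and r² for sample composition, can be illustrated with a minimal sketch. The abstract does not say how r² is computed; the code below assumes it is the squared Pearson correlation between true and estimated per-taxon abundances, with taxa missing from one profile counted as zero. The function names and the toy inputs are hypothetical and are not part of MetaMaps.

```python
def read_assignment_accuracy(true_labels, predicted_labels):
    """Fraction of reads whose predicted species matches the true species."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one read is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def pearson_r_squared(true_abundance, estimated_abundance):
    """Squared Pearson correlation between two abundance profiles.

    Both arguments map taxon name -> relative abundance. Taxa present in
    only one profile are treated as abundance 0 in the other (an assumption
    of this sketch, not a documented MetaMaps convention).
    """
    taxa = sorted(set(true_abundance) | set(estimated_abundance))
    x = [true_abundance.get(t, 0.0) for t in taxa]
    y = [estimated_abundance.get(t, 0.0) for t in taxa]
    n = len(taxa)
    if n == 0:
        raise ValueError("at least one taxon is required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return (cov * cov) / (var_x * var_y)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper.
    truth = ["E. coli", "E. coli", "S. aureus", "B. subtilis"]
    calls = ["E. coli", "E. coli", "S. aureus", "S. aureus"]
    print(read_assignment_accuracy(truth, calls))

    true_profile = {"E. coli": 0.5, "S. aureus": 0.25, "B. subtilis": 0.25}
    est_profile = {"E. coli": 0.5, "S. aureus": 0.5}
    print(pearson_r_squared(true_profile, est_profile))
```

Taking the union of taxa means that a species the classifier misses entirely, or one it reports spuriously, lowers r² instead of being silently ignored.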
Funding Information
- Jürgen Manchot Stiftung
- U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute