Rapid and precise alignment of raw reads against redundant databases with KMA
Open Access
- 29 August 2018
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 19 (1), 1-8
- https://doi.org/10.1186/s12859-018-2336-6
Abstract
Background: As the cost of sequencing has declined, clinical diagnostics based on next generation sequencing (NGS) have become a reality. Diagnostics based on sequencing will require rapid and precise mapping against redundant databases because some of the most important determinants, such as antimicrobial resistance and core genome multilocus sequence typing (MLST) alleles, are highly similar to one another. To facilitate this, a novel mapping method, KMA (k-mer alignment), was designed. KMA maps raw reads directly against redundant databases and scales well for large redundant databases. KMA uses k-mer seeding to speed up mapping and the Needleman-Wunsch algorithm to accurately align extensions from k-mer seeds. Multi-mapping reads are resolved using a novel sorting scheme (ConClave scheme), ensuring an accurate selection of templates.

Results: The functionality of KMA was compared with SRST2, MGmapper, BWA-MEM, Bowtie2, Minimap2 and Salmon, using both simulated data and a dataset of Escherichia coli mapped against resistance genes and core genome MLST alleles. KMA outperforms current methods with respect to both accuracy and speed, while using a comparable amount of memory.

Conclusion: With KMA, it was possible to map raw reads directly against redundant databases with high accuracy, speed and memory efficiency.
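The two building blocks named in the Background, k-mer seeding and Needleman-Wunsch alignment, can be illustrated with a minimal Python sketch. This is not KMA's implementation: the template names, sequences, value of k and scoring parameters are hypothetical, the alignment covers whole sequences rather than extensions from seeds, and the final step simply picks the highest-scoring template as a stand-in for the ConClave scheme, which is not reproduced here.

```python
from collections import defaultdict


def build_kmer_index(templates, k):
    """Map every k-mer to the set of template names that contain it."""
    index = defaultdict(set)
    for name, seq in templates.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def seed_hits(read, index, k):
    """Count, per template, how many k-mers of the read occur in it."""
    hits = defaultdict(int)
    for i in range(len(read) - k + 1):
        for name in index.get(read[i:i + k], ()):
            hits[name] += 1
    return dict(hits)


def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (score only, no traceback).

    The scoring parameters are illustrative assumptions, not KMA's defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Two hypothetical, highly similar templates that differ at one position.
templates = {"geneA": "ACGTACGTTAGC", "geneB": "ACGTACGATAGC"}
index = build_kmer_index(templates, k=4)

read = "ACGTACGTTAGC"
candidates = seed_hits(read, index, k=4)  # seeding narrows the search
scores = {name: needleman_wunsch_score(read, templates[name]) for name in candidates}
best = max(scores, key=scores.get)  # naive selection, not the ConClave scheme
```

Because the two templates are nearly identical, the read seeds against both of them. Only the alignment score separates the exact match from the near match, which is the kind of ambiguity a redundant database produces.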
Funding Information
- European Union’s Horizon 2020 Research and Innovation Programme (643476)
This publication has 17 references indexed in Scilit:
- Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing. Journal of Antimicrobial Chemotherapy, 2012
- Identification of acquired antimicrobial resistance genes. Journal of Antimicrobial Chemotherapy, 2012
- Fast gapped-read alignment with Bowtie 2. Nature Methods, 2012
- Genomic epidemiology of the Escherichia coli O104:H4 outbreaks in Europe, 2011. Proceedings of the National Academy of Sciences of the United States of America, 2012
- A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics, 2010
- BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 2010
- The Sequence Alignment/Map format and SAMtools. Bioinformatics, 2009
- Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics, 2009
- Bioinformatics, 2006
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970