SnapperDB: a database solution for routine sequencing analysis of bacterial isolates

Abstract
Real-time surveillance of infectious disease using whole genome sequencing data poses challenges in both result generation and communication. SnapperDB is a set of tools to store bacterial variant data and facilitate reproducible and scalable analysis of bacterial populations. We also introduce the ‘SNP address’ nomenclature to describe the relationship between isolates in a population at single nucleotide resolution. We announce the release of SnapperDB v1.0, a program for scalable routine SNP analysis and storage of microbial populations. SnapperDB is implemented as a Python application under the open source BSD license. All code and user guides are available at https://github.com/phe-bioinformatics/snapperdb. Reference genomes and SnapperDB configuration files are available at https://github.com/phe-bioinformatics/snapperdb_references.
Funding Information
  • National Institute for Health Research Health Protection Research Unit in GI Infections
  • NHS
  • NIHR
  • Department of Health
  • Public Health England