Recovery of metagenomic data from the Aedes aegypti microbiome using a reproducible snakemake pipeline: MINUUR
Open Access
- 23 March 2023
- journal article
- Published by F1000 Research Ltd in Wellcome Open Research
Abstract
Background: Ongoing research of the mosquito microbiome aims to uncover novel strategies to reduce pathogen transmission. Sequencing costs, especially for metagenomics, remain significant, however. A resource increasingly used to gain insights into host-associated microbiomes is the large amount of publicly available genomic data from whole organisms such as mosquitoes. These data include sequencing reads from host-associated microbes and so offer a way to gain additional value from sequencing projects that were initially host-focused.
Methods: To analyse non-host reads from existing genomic data, we developed a snakemake workflow called MINUUR (Microbial INsights Using Unmapped Reads). Within MINUUR, reads derived from the host-associated microbiome were extracted and characterised using taxonomic classification and metagenome assembly, followed by binning and quality assessment. We applied this pipeline to five publicly available Aedes aegypti genomic datasets, consisting of 62 samples with a broad range of sequencing depths.
Results: We demonstrate that MINUUR recovers previously identified phyla and genera and is able to extract bacterial metagenome-assembled genomes (MAGs) associated with the microbiome. Of these MAGs, 42 are high-quality representatives with >90% completeness and <5% contamination. These MAGs improve the genomic representation of the mosquito microbiome and can be used to facilitate genomic investigation of key genes of interest. Furthermore, we show that samples with a high number of KRAKEN2-assigned reads produce more MAGs.
Conclusions: Our metagenomics workflow, MINUUR, was applied to a range of Aedes aegypti genomic samples to characterise microbiome-associated reads. We confirm the presence of key mosquito-associated symbionts previously identified in other studies and recover high-quality bacterial MAGs. In addition, MINUUR and its associated documentation are freely available on GitHub and provide researchers with a convenient workflow to investigate the microbiome data contained in the sequencing data of any applicable host genome of interest.
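The high-quality criterion quoted in the Results (>90% completeness and <5% contamination) can be illustrated with a minimal Python sketch. This is not code from MINUUR: the `Mag` record, the function names, and the example values are all hypothetical and serve only to show how the two thresholds combine.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract for a high-quality MAG.
MIN_COMPLETENESS = 90.0   # percent; a MAG must be strictly above this
MAX_CONTAMINATION = 5.0   # percent; a MAG must be strictly below this


@dataclass
class Mag:
    """Hypothetical record for one bin; not a MINUUR data structure."""
    name: str
    completeness: float   # percent
    contamination: float  # percent


def is_high_quality(mag: Mag) -> bool:
    """Return True when both thresholds from the abstract are met."""
    return (mag.completeness > MIN_COMPLETENESS
            and mag.contamination < MAX_CONTAMINATION)


def high_quality_mags(mags):
    """Keep only the bins that pass both thresholds, preserving order."""
    return [m for m in mags if is_high_quality(m)]


# Made-up values for illustration only; these are not results from the study.
example_bins = [
    Mag("bin_1", 98.2, 1.1),   # passes both thresholds
    Mag("bin_2", 90.0, 0.5),   # exactly 90% completeness: fails the strict ">"
    Mag("bin_3", 95.0, 5.0),   # exactly 5% contamination: fails the strict "<"
    Mag("bin_4", 72.4, 0.0),   # too incomplete
]

selected = [m.name for m in high_quality_mags(example_bins)]
print(selected)
```

Both comparisons are strict because the abstract states the limits as ">" and "<"; a bin sitting exactly on either limit is excluded in this sketch.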
Funding Information
- Medical Research Council (MR/N013514/1)
- Biotechnology and Biological Sciences Research Council (BB/T001240/1, BB/V011278/1, BB/W018446/1)
- Engineering and Physical Sciences Research Council (V043811/1)
- Royal Society (RSWF\R1\180013)
- Bill and Melinda Gates Foundation (INV-048598)
- UK Research and Innovation (20197, 85336)
- NIHR (NIHR2000907)
- Wellcome Trust (217303)