inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains
- 18 January 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Biotechnology
- Vol. 39 (6), 727-736
- https://doi.org/10.1038/s41587-020-00797-0
Abstract
Coexisting microbial cells of the same species often exhibit genetic variation that can affect phenotypes ranging from nutrient preference to pathogenicity. Here we present inStrain, a program that uses metagenomic paired reads to profile intra-population genetic diversity (microdiversity) across whole genomes and compares microbial populations in a microdiversity-aware manner, greatly increasing the accuracy of genomic comparisons when benchmarked against existing methods. We use inStrain to profile >1,000 fecal metagenomes from newborn premature infants and find that siblings share significantly more strains than unrelated infants, although identical twins share no more strains than fraternal siblings. Infants born by cesarean section harbor Klebsiella with significantly higher nucleotide diversity than infants delivered vaginally, potentially reflecting acquisition from hospital rather than maternal microbiomes. Genomic loci that show diversity in individual infants include variants found between other infants, possibly reflecting inoculation from diverse hospital-associated sources. inStrain can be applied to any metagenomic dataset for microdiversity analysis and rigorous strain comparison.
Funding Information
- National Science Foundation (DGE 1106400)
- Alfred P. Sloan Foundation (APSF-2012-10-05)
- Foundation for the National Institutes of Health (RAI092531A)