A unified catalog of 204,938 reference genomes from the human gut microbiome
Open Access
- 20 July 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Biotechnology
- Vol. 39 (1), 105-114
- https://doi.org/10.1038/s41587-020-0603-3
Abstract
Comprehensive, high-quality reference genomes are required for functional characterization and taxonomic assignment of the human gut microbiota. We present the Unified Human Gastrointestinal Genome (UHGG) collection, comprising 204,938 nonredundant genomes from 4,644 gut prokaryotes. These genomes encode >170 million protein sequences, which we collated in the Unified Human Gastrointestinal Protein (UHGP) catalog. The UHGP more than doubles the number of gut proteins in comparison to those present in the Integrated Gene Catalog. More than 70% of the UHGG species lack cultured representatives, and 40% of the UHGP lack functional annotations. Intraspecies genomic variation analyses revealed a large reservoir of accessory genes and single-nucleotide variants, many of which are specific to individual human populations. The UHGG and UHGP collections will enable studies linking genotypes to phenotypes in the human gut microbiome.