Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Open Access
- 6 March 2019
- preprint content
- Published by Cold Spring Harbor Laboratory
Abstract
Summary paragraph: The Trans-Omics for Precision Medicine (TOPMed) program seeks to elucidate the genetic architecture and disease biology of heart, lung, blood, and sleep disorders, with the ultimate goal of improving diagnosis, treatment, and prevention. The initial phases of the program focus on whole genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here, we describe TOPMed goals and design as well as resources and early insights from the sequence data. The resources include a variant browser, a genotype imputation panel, and sharing of genomic and phenotypic data via dbGaP. In 53,581 TOPMed samples, >400 million single-nucleotide and insertion/deletion variants were detected by alignment with the reference genome. Additional novel variants are detectable through assembly of unmapped reads and customized analysis in highly variable loci. Among the >400 million variants detected, 97% have frequency <1% and 46% are singletons. These rare variants provide insights into mutational processes and recent human evolutionary history. The nearly complete catalog of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and non-coding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and extends the reach of nearly all genome-wide association studies to include variants down to ~0.01% in frequency.