Rare Genetic Variation Underlying Human Diseases and Traits: Results from 200,000 Individuals in the UK Biobank

Abstract
Background: Many human diseases are known to have a genetic contribution. While genome-wide studies have identified many disease-associated loci, it remains challenging to elucidate causal genes. In contrast, exome sequencing provides an opportunity to identify new disease genes and large-effect variants of clinical relevance. We therefore sought to determine the contribution of rare genetic variation in a curated set of human diseases and traits using a unique resource of 200,000 individuals with exome sequencing data from the UK Biobank.
Methods and Results: We included 199,832 participants with a mean age of 68 years at follow-up. Exome-wide gene-based tests were performed for 64 diseases and 23 quantitative traits using a mixed-effects model, testing rare loss-of-function and damaging missense variants. We identified 51 known and 23 novel associations with 26 diseases and traits at a false discovery rate of 1%. Many Mendelian disease genes conferred a striking risk, including MYBPC3 with a more than 100-fold increased odds of hypertrophic cardiomyopathy, PKD1 with a greater than 25-fold increased odds of chronic kidney disease, and BRCA2, BRCA1, ATM and PALB2 with 3- to 10-fold increased odds of breast cancer. Notable novel findings included an association between GIGYF1 and type 2 diabetes (OR 5.6, P=5.35×10⁻⁸), elevated blood glucose, and lower insulin-like growth factor 1 levels. Rare variants in CCAR2 were also associated with diabetes risk (OR 13, P=8.5×10⁻⁸), while COL9A3 was associated with cataract (OR 3.4, P=6.7×10⁻⁸). Notable associations for blood lipids and hypercholesterolemia included NR1H3, RRBP1, GIGYF1, SCGN, APH1A, PDE3B and ANGPTL8. Several novel genes were associated with height, including DTL, PIEZO1, SCUBE3, PAPPA and ADAMTS6, while BSN was associated with body mass index. We further assessed putatively pathogenic variants in known Mendelian cardiovascular disease genes and found that between 1.3% and 2.3% of the population carried likely pathogenic variants in known cardiomyopathy, arrhythmia or hypercholesterolemia genes.
Conclusions: Large-scale population sequencing identifies known and novel genes harboring high-impact variation for human traits and diseases. A number of novel findings, including GIGYF1, represent interesting potential therapeutic targets. Exome sequencing at scale can identify a meaningful proportion of the population that carries a pathogenic variant underlying cardiovascular disease.
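
The 1% false discovery rate threshold reported in the Methods and Results can be illustrated with a minimal sketch. The abstract does not state which FDR procedure was applied; the Benjamini-Hochberg step-up procedure below is an assumption, and the function name and the toy p-values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Flag which tests pass the Benjamini-Hochberg procedure at the given FDR.

    Assumption: the study's exact FDR method is not stated in the abstract;
    BH is used here purely for illustration.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank k whose ordered p-value satisfies p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical gene-level p-values (not results from the study).
toy_p_values = [1e-8, 0.5, 0.004, 0.9]
print(benjamini_hochberg(toy_p_values, fdr=0.01))
```

In an exome-wide setting the list would hold one p-value per gene-phenotype test, so the per-rank threshold becomes far stricter than in this four-test toy example.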