Clinical laboratory test-wide association scan of polygenic scores identifies biomarkers of complex disease
Open Access
- 13 January 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Genome Medicine
- Vol. 13 (1), 1-16
- https://doi.org/10.1186/s13073-020-00820-8
Abstract
Clinical laboratory (lab) tests are used in clinical practice to diagnose, treat, and monitor disease conditions. Test results are stored in electronic health records (EHRs), and a growing number of EHRs are linked to patient DNA, offering unprecedented opportunities to query relationships between genetic risk for complex disease and quantitative physiological measurements collected on large populations. A total of 3075 quantitative lab tests were extracted from Vanderbilt University Medical Center’s (VUMC) EHR system and cleaned for population-level analysis according to our QualityLab protocol. Lab values extracted from BioVU were compared with previous population studies using heritability and genetic correlation analyses. We then tested the hypothesis that polygenic risk scores for biomarkers and complex disease are associated with biomarkers of disease extracted from the EHR. In a proof-of-concept analysis, we focused on lipids and coronary artery disease (CAD). We cleaned lab traits extracted from the EHR, performed lab-wide association scans (LabWAS) of the lipid and CAD polygenic risk scores across 315 heritable lab tests, and then replicated the pipeline and analyses in the Massachusetts General Brigham Biobank (MGBB). Heritability estimates of lipid values (after cleaning with QualityLab) were comparable to previous reports, and polygenic scores for lipids were strongly associated with their referent lipid in a LabWAS. LabWAS of the polygenic score for CAD recapitulated canonical heart disease biomarker profiles, including decreased HDL and increased pre-medication LDL, triglycerides, blood glucose, and glycated hemoglobin (HgbA1C), in European and African descent populations. Notably, many of these associations remained even after adjusting for the presence of cardiovascular disease and were replicated in the MGBB.
Polygenic risk scores can be used to identify biomarkers of complex disease in large-scale EHR-based genomic analyses, providing new avenues for discovery of novel biomarkers and deeper understanding of disease trajectories in pre-symptomatic individuals. We present two methods and associated software, QualityLab and LabWAS, to clean and analyze EHR labs at scale and perform a Lab-Wide Association Scan.
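To make the idea of a lab-wide association scan concrete, the following is a minimal sketch in Python. It is not the authors' LabWAS software. The function names (`simple_regression`, `labwas`), the simulated data, the normal-approximation p-value, and the Bonferroni threshold are all assumptions chosen for illustration, and the sketch omits the covariate adjustment (for example age, sex, and ancestry) that a real analysis would typically include. It regresses each lab trait on a single polygenic score and ranks the labs by p-value.

```python
import math
import random
from statistics import mean


def simple_regression(x, y):
    """Least-squares fit of y on x; returns slope, standard error, two-sided p-value."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = my - beta * mx
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Two-sided p-value under a normal approximation (reasonable for large n).
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p


def labwas(prs, labs, alpha=0.05):
    """Hypothetical scan: test one polygenic score against every lab trait."""
    threshold = alpha / len(labs)  # Bonferroni correction across labs (an assumption)
    results = []
    for name, values in labs.items():
        # Keep only individuals who have a measurement for this lab.
        pairs = [(s, v) for s, v in zip(prs, values) if v is not None]
        x, y = zip(*pairs)
        beta, se, p = simple_regression(x, y)
        results.append({"lab": name, "n": len(pairs), "beta": beta,
                        "se": se, "p": p, "significant": p < threshold})
    return sorted(results, key=lambda r: r["p"])


# Simulated toy cohort: the score lowers "HDL", raises "LDL", and is unrelated to "Sodium".
rng = random.Random(0)
prs = [rng.gauss(0, 1) for _ in range(2000)]
labs = {
    "HDL": [-0.3 * s + rng.gauss(0, 1) for s in prs],
    "LDL": [0.3 * s + rng.gauss(0, 1) for s in prs],
    "Sodium": [rng.gauss(0, 1) for _ in prs],
}
results = labwas(prs, labs)
for row in results:
    print(row["lab"], row["n"], round(row["beta"], 3), row["significant"])
```

In this toy setup the lab values are simulated on a standardized scale, so each slope reads as the change in the lab trait per standard deviation of the polygenic score. Real EHR labs would first need cleaning and normalization, which is the role the abstract assigns to QualityLab.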
Funding Information
- Canadian Institutes of Health Research (MFE-142936)
- National Institute of General Medical Sciences (5T32GM080178-12)
- American Heart Association (16FTF30130005)
- National Institutes of Health (GM130791-01, UL1TR000427, 1U24CA242637-01, 1R01MH118233-01, 5U54MD010722-04, 5R01MH113362-03, 1R56MH120736-01)