Improved exome prioritization of disease genes through cross-species phenotype comparison
Open Access
- 25 October 2013
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 24 (2), 340-348
- https://doi.org/10.1101/gr.160325.113
Abstract
Numerous new disease-gene associations have been identified by whole-exome sequencing studies in the last few years. However, many cases remain unsolved due to the sheer number of candidate variants remaining after common filtering strategies such as removing low-quality and common variants and those deemed unlikely to be pathogenic. The observation that each of our genomes contains about 100 genuine loss-of-function variants makes identification of the causative mutation problematic when using these strategies alone. We propose using the wealth of genotype-to-phenotype data that already exists from model organism studies to assess the potential impact of these exome variants. Here, we introduce PHenotypic Interpretation of Variants in Exomes (PHIVE), an algorithm that integrates the calculation of phenotype similarity between human diseases and genetically modified mouse models with evaluation of the variants according to allele frequency, pathogenicity, and mode of inheritance approaches in our Exomiser tool. Large-scale validation of PHIVE analysis using 100,000 exomes containing known mutations demonstrated a substantial improvement (up to 54.1-fold) over purely variant-based (frequency and pathogenicity) methods with the correct gene recalled as the top hit in up to 83% of samples, corresponding to an area under the ROC curve of >95%. We conclude that incorporation of phenotype data can play a vital role in translational bioinformatics and propose that exome sequencing projects should systematically capture clinical phenotypes to take advantage of the strategy presented here.
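The core idea of the abstract, ranking candidate genes by combining a variant-based score with a phenotype-similarity score, can be illustrated with a minimal sketch. This is not the Exomiser implementation: the `Candidate` class, the gene names, the score values, and the equal weighting are all hypothetical choices made for illustration, and the abstract does not state how the two scores are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    variant_score: float    # hypothetical 0..1 score from frequency and pathogenicity
    phenotype_score: float  # hypothetical 0..1 similarity to mouse-model phenotypes


def combined_score(c: Candidate, phenotype_weight: float = 0.5) -> float:
    """Weighted mean of the two scores; the weight is an illustrative assumption."""
    return (1.0 - phenotype_weight) * c.variant_score + phenotype_weight * c.phenotype_score


def rank_candidates(candidates, phenotype_weight: float = 0.5):
    """Return candidates sorted from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda c: combined_score(c, phenotype_weight),
        reverse=True,
    )


# Made-up example data: GENE_A has the most damaging-looking variant,
# but GENE_B matches the patient's phenotype far better.
candidates = [
    Candidate("GENE_A", variant_score=0.95, phenotype_score=0.10),
    Candidate("GENE_B", variant_score=0.90, phenotype_score=0.80),
    Candidate("GENE_C", variant_score=0.60, phenotype_score=0.30),
]

ranked = rank_candidates(candidates)
variant_only = rank_candidates(candidates, phenotype_weight=0.0)
```

With these made-up numbers, a purely variant-based ranking puts `GENE_A` first, while the combined ranking promotes `GENE_B`, whose phenotype match is much stronger. This is the kind of reordering that phenotype data is meant to provide.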