The human splicing code reveals new insights into the genetic determinants of disease

Abstract
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.
Funding Information
  • NIH (R37-GM42699a)
  • Autism Speaks
  • Genome Canada
  • Canadian Institutes of Health Research (CIHR)
  • Natural Sciences and Engineering Research Council of Canada (NSERC)
  • Ontario Genomics Institute (OGI)
  • University of Toronto McLaughlin Centre
  • Autism Research Training Fellowship
  • CIHR Banting Fellowship
  • NSERC Alexander Graham Bell Scholarship