ASEP: Gene-based detection of allele-specific expression across individuals in a population by RNA sequencing

Author summary
Allele-specific expression (ASE) analysis quantifies the relative expression of the two alleles in a diploid individual; imbalance between them may contribute to phenotypic variation and disease pathophysiology among individuals. Because the two alleles used to measure ASE share the same cellular environment and genetic background, each serves as an internal control for the other, which eliminates the influence of trans-acting genetic and environmental factors. Existing ASE detection methods analyze one individual at a time, which wastes information shared across individuals and makes results hard to interpret across individuals. To overcome this limitation, we developed ASEP, a method that detects gene-level ASE under one condition, as well as differences in ASE between two conditions (e.g., pre- vs post-treatment), in a population. We have demonstrated ASEP's convincing performance through extensive simulations. Application of ASEP to human kidney and macrophage RNA-seq datasets has further illustrated its ability to uncover ASE genes related to kidney functions and cardiometabolic traits. With the wide application of large-scale transcriptome sequencing in biomedical studies, there is an urgent need to build a comprehensive picture of ASE in diverse populations. We believe ASEP will be well-suited for this purpose and can guide future ASE studies.

Abstract
Allele-specific expression (ASE) analysis, which quantifies the relative expression of two alleles in a diploid individual, is a powerful tool for identifying cis-regulated gene expression variations that underlie phenotypic differences among individuals. Existing methods for gene-level ASE detection analyze one individual at a time, and therefore fail to account for information shared across individuals. Failure to accommodate such shared information not only reduces power, but also makes it difficult to interpret results across individuals. However, when only RNA sequencing (RNA-seq) data are available, ASE detection across individuals is challenging because the data often include individuals who are either heterozygous or homozygous for the unobserved cis-regulatory SNP. This leads to sample heterogeneity: only the heterozygous individuals are informative for ASE, whereas the homozygous individuals have balanced expression. To simultaneously model multi-individual information and account for such heterogeneity, we developed ASEP, a mixture model with a subject-specific random effect to account for multi-SNP correlations within the same gene. ASEP requires only RNA-seq data, and is able to detect gene-level ASE under one condition and differential ASE between two conditions (e.g., pre- versus post-treatment). Extensive simulations demonstrated the convincing performance of ASEP under a wide range of scenarios. We applied ASEP to a human kidney RNA-seq dataset, identified ASE genes, and validated our results against two published eQTL studies. We further applied ASEP to a human macrophage RNA-seq dataset, identified genes showing evidence of differential ASE between M0 and M1 macrophages, and confirmed our findings with results from cardiometabolic trait-relevant genome-wide association studies. To the best of our knowledge, ASEP is the first method for gene-level ASE detection at the population level that requires only RNA-seq data. With the growing adoption of RNA-seq, we believe ASEP will be well-suited for various ASE studies of human diseases.
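The sample heterogeneity described in the abstract can be illustrated with a small simulation. The sketch below is not ASEP's model or its implementation; it is a toy data-generating process under assumed settings. The function name `simulate_gene`, the logit-scale effect, the normal subject-level random effect, and every default value are hypothetical choices made for illustration. It also assumes the higher-expressed haplotype is known, which ignores the phasing problem that arises in real data.

```python
import math
import random


def simulate_gene(n_individuals=20, n_snps=3, depth=50, p_het=0.5,
                  effect=1.0, subject_sd=0.2, seed=0):
    """Simulate allelic read counts for one gene (toy model, not ASEP itself)."""
    rng = random.Random(seed)
    records = []
    for ind in range(n_individuals):
        # Latent mixture component: is this individual heterozygous
        # at the unobserved cis-regulatory SNP?
        is_het = rng.random() < p_het
        # Subject-specific random effect (logit scale), shared by all
        # SNPs of this individual, so their counts are correlated.
        b = rng.gauss(0.0, subject_sd)
        logit = (effect if is_het else 0.0) + b
        frac = 1.0 / (1.0 + math.exp(-logit))
        for snp in range(n_snps):
            # Binomial draw: each read comes from the "major" haplotype
            # with probability frac.
            major = sum(rng.random() < frac for _ in range(depth))
            records.append({
                "individual": ind,
                "snp": snp,
                "is_het": is_het,
                "major_reads": major,
                "total_reads": depth,
            })
    return records


if __name__ == "__main__":
    data = simulate_gene()
    for group in (True, False):
        rows = [r for r in data if r["is_het"] == group]
        if rows:
            mean_frac = sum(r["major_reads"] / r["total_reads"] for r in rows) / len(rows)
            label = "heterozygous" if group else "homozygous"
            print(f"{label}: mean major-allele fraction = {mean_frac:.3f}")
```

In this toy setting, only the simulated heterozygous individuals carry an imbalance signal, while the homozygous ones scatter around a balanced fraction. This is the heterogeneity that a population-level mixture model has to separate when the `is_het` label is not observed.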
Funding Information
  • National Institute of General Medical Sciences (R01GM108600)
  • National Institute of General Medical Sciences (R01GM125301)
  • National Heart, Lung, and Blood Institute (R01HL113147)
  • National Eye Institute (R01EY030192)
  • National Institute of Diabetes and Digestive and Kidney Diseases (R01DK076077)
  • National Institute of Diabetes and Digestive and Kidney Diseases (R01DK087635)
  • National Heart, Lung, and Blood Institute (R00HL130574)
  • Irving Medical Center, Columbia University (UL1TR001873)