ASEP: Gene-based detection of allele-specific expression across individuals in a population by RNA sequencing
Open Access
- 1 May 2020
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 16 (5), e1008786
- https://doi.org/10.1371/journal.pgen.1008786
Abstract
Allele-specific expression (ASE) analysis, which quantifies the relative expression of two alleles in a diploid individual, is a powerful tool for identifying cis-regulated gene expression variations that underlie phenotypic differences among individuals. Existing methods for gene-level ASE detection analyze one individual at a time and therefore fail to account for information shared across individuals. Ignoring this shared information not only reduces power, but also makes it difficult to interpret results across individuals. However, when only RNA sequencing (RNA-seq) data are available, ASE detection across individuals is challenging because the data often include individuals who are either heterozygous or homozygous for the unobserved cis-regulatory SNP. This leads to sample heterogeneity: only the heterozygous individuals are informative for ASE, whereas the homozygous individuals have balanced expression. To simultaneously model multi-individual information and account for such heterogeneity, we developed ASEP, a mixture model with a subject-specific random effect to account for multi-SNP correlations within the same gene. ASEP requires only RNA-seq data, and is able to detect gene-level ASE under one condition and differential ASE between two conditions (e.g., pre- versus post-treatment). Extensive simulations demonstrated the convincing performance of ASEP under a wide range of scenarios. We applied ASEP to a human kidney RNA-seq dataset, identified ASE genes, and validated our results with two published eQTL studies. We further applied ASEP to a human macrophage RNA-seq dataset, identified genes showing evidence of differential ASE between M0 and M1 macrophages, and corroborated our findings with results from cardiometabolic trait-relevant genome-wide association studies. To the best of our knowledge, ASEP is the first method for gene-level ASE detection at the population level that requires only RNA-seq data. With the growing adoption of RNA-seq, we believe ASEP will be well-suited for various ASE studies of human diseases.
Author summary
Allele-specific expression (ASE) quantifies the relative expression of two alleles in a diploid individual, and such expression imbalance potentially contributes to phenotypic variation and disease pathophysiology among individuals. Since the two alleles used to measure ASE come from the same cellular environment and genetic background, they can serve as internal controls for each other and eliminate the influence of trans-acting genetic and environmental factors. Existing ASE detection methods analyze one individual at a time, which not only wastes information shared across individuals, but also makes results hard to interpret across individuals. To overcome this limitation, we developed ASEP, a method that is able to detect gene-level ASE under one condition, as well as ASE differences between two conditions (e.g., pre- vs post-treatment), in a population. We have demonstrated ASEP's convincing performance through extensive simulations. Application of ASEP to human kidney and macrophage RNA-seq datasets has further illustrated its ability to uncover ASE genes related to kidney functions and cardiometabolic traits. With the wide application of large-scale transcriptome sequencing in biomedical studies, there is an urgent need to obtain a comprehensive picture of ASE in diverse populations. We believe ASEP will be well-suited for this purpose and can guide future ASE studies.
Funding Information
- National Institute of General Medical Sciences (R01GM108600)
- National Institute of General Medical Sciences (R01GM125301)
- National Heart, Lung, and Blood Institute (R01HL113147)
- National Eye Institute (R01EY030192)
- National Institute of Diabetes and Digestive and Kidney Diseases (R01DK076077)
- National Institute of Diabetes and Digestive and Kidney Diseases (R01DK087635)
- National Heart, Lung, and Blood Institute (R00HL130574)
- Irving Medical Center, Columbia University (UL1TR001873)