Analysis of alternative polyadenylation from single-cell RNA-seq using scDaPars reveals cell subpopulations invisible to gene expression

Abstract
Alternative polyadenylation (APA) is a major mechanism of post-transcriptional regulation in cellular processes such as proliferation and differentiation, but APA heterogeneity among single cells remains largely unknown. Single-cell RNA sequencing (scRNA-seq) has been used extensively to define cell subpopulations at the transcriptional level, yet most scRNA-seq data have not been analyzed in an "APA-aware" manner. Here, we introduce scDaPars (Dynamic Analysis of Alternative PolyAdenylation from Single-cell RNA-seq), a bioinformatics algorithm that accurately quantifies APA events at both single-cell and single-gene resolution using either 3′-end (10x Chromium) or full-length (Smart-seq2) scRNA-seq data. Validations in both real and simulated data indicate that scDaPars can robustly recover APA events that are missing because of the low amounts of mRNA sequenced in single cells. When applied to cancer and human endoderm differentiation data, scDaPars not only revealed cell type-specific APA regulation but also identified cell subpopulations that are otherwise invisible to conventional gene expression analysis. Thus, scDaPars will enable us to understand cellular heterogeneity at the post-transcriptional APA level.

Funding Information
  • US National Institutes of Health (R01HG007538, R01CA193466, R01CA228140)
  • Cancer Prevention Research Institute of Texas (RR170048)