Abstract
Matthew A.M. Devall¹ and Graham Casey¹

¹ Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA

Correspondence to: Graham Casey, email: gc8r@virginia.edu

Keywords: colorectal cancer; single-cell deconvolution; microsatellite instability; RNA-sequencing; enteroendocrine

Received: January 23, 2021
Accepted: March 22, 2021
Published: PUBLISHED_DATE

Copyright: © 2021 Devall and Casey. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ABSTRACT

Approximately 15% of colorectal cancer (CRC) cases present with high levels of microsatellite instability (MSI-H). Bulk RNA-sequencing approaches have been employed to elucidate transcriptional differences between MSI-H and microsatellite stable (MSS) CRC tumors. These approaches are frequently confounded by the complex cellular heterogeneity of tumors. We performed single-cell deconvolution of bulk RNA-sequencing on The Cancer Genome Atlas colon adenocarcinoma (TCGA-COAD) dataset. Cell composition within each dataset was estimated using CIBERSORTx. Cell composition differences were analyzed using linear regression. Significant differences in abundance were observed for 13 of 19 cell types between MSI-H and MSS/MSI-L tumors in TCGA-COAD. This included a novel finding of increased enteroendocrine (q = 3.71E-06) and reduced colonocyte populations (q = 2.21E-03) in MSI-H versus MSS/MSI-L tumors. We were able to validate some of these differences in an independent biopsy dataset. By incorporating cell composition into our regression model, we identified 3,193 differentially expressed genes (q = 0.05), of which 556 were deemed novel. We subsequently validated many of these genes in an independent dataset of colon cancer cell lines. In summary, we show that some of the challenges associated with cellular heterogeneity can be overcome using single-cell deconvolution, and through our analysis we highlight several novel gene targets for further investigation.
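
The two regression steps summarized in the abstract (testing cell-type abundance against MSI status, then adjusting a gene-level model for cell composition) can be illustrated with a minimal sketch. This is not the authors' pipeline: CIBERSORTx is not called, all data are synthetic, the function names `ols_group_effect` and `benjamini_hochberg` are hypothetical, and the p-values use a normal approximation to the t distribution to keep the example dependency-light. The q-values here are Benjamini-Hochberg adjusted p-values, which is an assumption about how q was computed.

```python
import math
import numpy as np


def ols_group_effect(y, group, covariates=None):
    """Fit y ~ intercept + group (+ covariates) by ordinary least squares.

    Returns the group coefficient, its t statistic, and a two-sided p-value
    based on a normal approximation (a simplification made for this sketch).
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    cols = [np.ones(n), np.asarray(group, dtype=float)]
    if covariates is not None:
        cov = np.asarray(covariates, dtype=float).reshape(n, -1)
        cols.extend(cov.T)  # one design-matrix column per covariate
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = float(resid @ resid) / dof
    var_beta = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
    t_stat = beta[1] / math.sqrt(var_beta)
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return float(beta[1]), float(t_stat), p_value


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.shape[0]
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(ranked, 0.0, 1.0)
    return q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 60
    msi_h = np.repeat([0, 1], n // 2)  # 0 = MSS/MSI-L, 1 = MSI-H (synthetic labels)

    # Synthetic stand-ins for deconvolution-estimated cell fractions.
    fractions = {
        "cell_type_A": 0.10 + 0.05 * msi_h + rng.normal(0, 0.02, n),
        "cell_type_B": 0.30 - 0.04 * msi_h + rng.normal(0, 0.03, n),
        "cell_type_C": 0.20 + rng.normal(0, 0.03, n),
    }

    # Step 1: one regression per cell type, then multiple-testing correction.
    names = list(fractions)
    pvals = [ols_group_effect(fractions[k], msi_h)[2] for k in names]
    for name, q in zip(names, benjamini_hochberg(pvals)):
        print(f"{name}: q = {q:.3g}")

    # Step 2: gene-level model with a cell fraction included as a covariate.
    expr = 5.0 + 1.0 * msi_h + 10.0 * fractions["cell_type_A"] + rng.normal(0, 0.5, n)
    effect, t_stat, p = ols_group_effect(expr, msi_h, covariates=fractions["cell_type_A"])
    print(f"adjusted MSI-H effect = {effect:.3f}, t = {t_stat:.2f}, p = {p:.3g}")
```

Including the cell fraction as a covariate in the second step separates expression changes driven by MSI status from those explained by shifts in cell composition, which is the confounding the abstract describes.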