Identifying transcriptomic correlates of histology using deep learning
Open Access
- 25 November 2020
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 15 (11), e0242858
- https://doi.org/10.1371/journal.pone.0242858
Abstract
Linking phenotypes to specific gene expression profiles is an extremely important problem in biology, which has been approached mainly by correlation methods or, more fundamentally, by studying the effects of gene perturbations. However, genome-wide perturbations involve extensive experimental efforts, which may be prohibitive for certain organisms. On the other hand, the characterization of the various phenotypes frequently requires an expert’s subjective interpretation, such as a histopathologist’s description of tissue slide images in terms of complex visual features (e.g. ‘acinar structures’). In this paper, we use Deep Learning to eliminate the inherent subjective nature of these visual histological features and link them to genomic data, thus establishing a more precisely quantifiable correlation between transcriptomes and phenotypes. Using a dataset of whole slide images with matching gene expression data from 39 normal tissue types, we first developed a Deep Learning tissue classifier with an accuracy of 94%. Then we searched for genes whose expression correlates with features inferred by the classifier and demonstrate that Deep Learning can automatically derive visual (phenotypical) features that are well correlated with the transcriptome and therefore biologically interpretable. As we are particularly concerned with interpretability and explainability of the inferred histological models, we also develop visualizations of the inferred features and compare them with gene expression patterns determined by immunohistochemistry. This can be viewed as a first step toward bridging the gap between the level of genes and the cellular organization of tissues.
Funding Information
- Ministerul Educatiei si Cercetarii (PN 1937-0601/2019)
- Ministerul Educatiei si Cercetarii (PN 1937-0301/CPN 301 300/2019)
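The abstract's search for genes whose expression correlates with classifier-inferred features can be illustrated with a minimal sketch. The abstract does not state which correlation measure or feature representation the authors used, so everything here is an assumption: Pearson correlation, one feature vector per sample, and the hypothetical helper names `feature_gene_correlations` and `top_genes`. It is not the paper's implementation.

```python
import numpy as np


def feature_gene_correlations(features, expression):
    """Pearson correlation between every feature and every gene.

    features:   (n_samples, n_features) array, e.g. per-sample activations
                of a classifier layer (assumed representation).
    expression: (n_samples, n_genes) array of expression values for the
                same samples, in the same order.
    Returns an (n_features, n_genes) array of correlation coefficients.
    """
    f = np.asarray(features, dtype=float)
    g = np.asarray(expression, dtype=float)
    if f.shape[0] != g.shape[0]:
        raise ValueError("features and expression need the same number of samples")

    # Center each column, then divide the dot products by the column norms.
    f = f - f.mean(axis=0)
    g = g - g.mean(axis=0)
    denom = np.outer(np.linalg.norm(f, axis=0), np.linalg.norm(g, axis=0))

    # Constant columns have zero norm; their correlation is reported as 0.
    r = np.zeros_like(denom)
    np.divide(f.T @ g, denom, out=r, where=denom > 0)
    return r


def top_genes(r, feature_index, gene_names, k=5):
    """Return the k genes with the largest |r| for one feature."""
    order = np.argsort(-np.abs(r[feature_index]))[:k]
    return [(gene_names[j], float(r[feature_index, j])) for j in order]
```

Calling `top_genes(r, i, gene_names)` on the result ranks genes for feature `i` by absolute correlation. In practice a significance test with multiple-testing correction would be needed before treating any feature-gene pair as meaningful.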