Classification of colorectal tissue images from high throughput tissue microarrays by ensemble deep learning methods

Abstract
Tissue microarray (TMA) core images are a treasure trove for artificial intelligence applications. However, a common problem of TMAs is multiple sectioning, which can change the content of the intended tissue core and requires re-labelling. Here, we investigate different ensemble methods for colorectal tissue classification using high-throughput TMAs. Hematoxylin and eosin (H&E) core images of 0.6 mm or 1.0 mm diameter from three international cohorts were extracted from 54 digital slides (n = 15,150 cores). After TMA core extraction and color enhancement, five different workflows of independent and ensemble deep learning were applied. Training and testing data, with 2144 and 13,006 cores respectively, included three classes: tumor, normal or “other” tissue. Ground-truth data were collected from 30 ngTMA slides (n = 8689 cores). Test-time augmentation was applied to reduce prediction uncertainty. The predictive accuracy of the best method, a soft voting ensemble of one VGG model and one CapsNet model, was 0.982, 0.947 and 0.939 for normal, “other” and tumor, respectively, which outperformed both independent learning and ensemble learning with a single base estimator. Our high-accuracy algorithm for colorectal tissue classification in high-throughput TMAs is applicable to images from different institutions and with different core sizes and stain intensities. It helps reduce errors in the evaluation of TMA cores with previously assigned labels.
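The combination of test-time augmentation and soft voting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flip and rotation views, the equal model weights, and all names (`tta_views`, `predict_with_tta`, `soft_vote`, `classify_core`) are assumptions made for illustration. Each model is assumed to be a callable that maps an image array to a probability vector over the three classes.

```python
import numpy as np

# Class order is an assumption for this sketch.
CLASSES = ("normal", "other", "tumor")


def tta_views(image):
    """Return 8 views of an H x W x C image: 4 rotations, each with and without a horizontal flip."""
    views = []
    for k in range(4):
        rotated = np.rot90(image, k, axes=(0, 1))
        views.append(rotated)
        views.append(np.flip(rotated, axis=1))
    return views


def predict_with_tta(predict_proba, image):
    """Average one model's class probabilities over all augmented views."""
    probs = [predict_proba(view) for view in tta_views(image)]
    return np.mean(probs, axis=0)


def soft_vote(prob_vectors, weights=None):
    """Soft voting: (optionally weighted) mean of the models' probability vectors."""
    stacked = np.stack(prob_vectors)
    return np.average(stacked, axis=0, weights=weights)


def classify_core(image, models):
    """Fuse TTA-averaged predictions from several models and pick the most probable class."""
    per_model = [predict_with_tta(model, image) for model in models]
    fused = soft_vote(per_model)
    return CLASSES[int(np.argmax(fused))], fused
```

In this sketch, a VGG-style and a CapsNet-style classifier would each be wrapped as a function returning softmax probabilities and passed together as `models`. Averaging probabilities rather than counting hard votes lets a confident model outweigh an uncertain one.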
Funding Information
  • Rising Tide Foundation for Clinical Cancer Research (REF-36-361)
  • Swiss Cancer League (Grant KFS-4427-02-2018)