TN‐USMA Net: Triple normalization‐based gastrointestinal stromal tumors classification on multicenter EUS images with ultrasound‐specific pretraining and meta attention

Abstract
Purpose: Accurate risk stratification of gastrointestinal stromal tumors (GISTs) on multicenter endoscopic ultrasound (EUS) images plays a pivotal role in surgical decision-making. This study focuses on automatically classifying higher-risk and lower-risk GISTs in a multicenter setting with limited data.
Methods: We retrospectively enrolled 914 patients with GISTs (1824 EUS images in total) from 18 hospitals in China. We propose a triple normalization-based deep learning framework with ultrasound-specific pretraining and meta attention, named the TN-USMA model. The triple normalization module consists of intensity normalization, size normalization, and spatial resolution normalization. First, the image intensity is standardized, and same-size regions of interest (ROIs) and same-resolution tumor masks are generated in parallel. Then, transfer learning is used to mitigate data scarcity: the same-size ROIs are fed into a deep architecture with ultrasound-specific pretrained weights, obtained by self-supervised learning on a large volume of unlabeled ultrasound images. Meanwhile, each tumor's size features are calculated from its same-resolution mask. Afterward, the size features, together with two demographic features, are integrated into the model before the final classification layer through a meta attention mechanism to further enhance the feature representation. The diagnostic performance of the proposed method was compared with one radiomics-based method and two state-of-the-art deep learning methods. Four evaluation metrics were used to assess model performance: accuracy, area under the receiver operating characteristic curve, sensitivity, and specificity.
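The triple normalization step described in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the z-score standardization, the nearest-neighbour resampling, the bounding-box cropping, the default `roi_size` and `target_spacing_mm` values, and all function names are assumptions chosen only to show how intensity, size, and spatial resolution can be normalized separately.

```python
from statistics import mean, pstdev

def normalize_intensity(image):
    """Z-score standardization of pixel intensities (assumed scheme)."""
    pixels = [p for row in image for p in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    sigma = sigma or 1.0  # guard against constant images
    return [[(p - mu) / sigma for p in row] for row in image]

def resize_nearest(grid, out_h, out_w):
    """Nearest-neighbour resampling of a 2D list to (out_h, out_w)."""
    in_h, in_w = len(grid), len(grid[0])
    return [[grid[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def bounding_box(mask):
    """Return (first_row, last_row, first_col, last_col) of the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask is empty")
    return rows[0], rows[-1], cols[0], cols[-1]

def crop_roi(image, mask):
    """Crop the image to the bounding box of the tumor mask."""
    r0, r1, c0, c1 = bounding_box(mask)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]

def resample_mask(mask, spacing_mm, target_spacing_mm):
    """Resample a mask so that every pixel covers target_spacing_mm."""
    scale = spacing_mm / target_spacing_mm
    out_h = max(1, round(len(mask) * scale))
    out_w = max(1, round(len(mask[0]) * scale))
    return resize_nearest(mask, out_h, out_w)

def size_features(mask, spacing_mm):
    """Area (mm^2) and bounding-box extent (mm) of a resampled mask."""
    r0, r1, c0, c1 = bounding_box(mask)
    n_pixels = sum(1 for row in mask for p in row if p)
    return {
        "area_mm2": n_pixels * spacing_mm ** 2,
        "height_mm": (r1 - r0 + 1) * spacing_mm,
        "width_mm": (c1 - c0 + 1) * spacing_mm,
    }

def triple_normalize(image, mask, spacing_mm, roi_size=224, target_spacing_mm=0.1):
    """Return a same-size ROI and size features from a same-resolution mask."""
    image = normalize_intensity(image)                                # intensity
    roi = resize_nearest(crop_roi(image, mask), roi_size, roi_size)   # size
    same_res = resample_mask(mask, spacing_mm, target_spacing_mm)     # resolution
    return roi, size_features(same_res, target_spacing_mm)
```

In this sketch the ROI branch deliberately discards physical size, which is why the size features are computed on a separate branch from the resolution-normalized mask.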
Results: The proposed TN-USMA model achieves an overall accuracy of 0.834 (95% confidence interval [CI]: 0.772, 0.885), an area under the receiver operating characteristic curve (AUC) of 0.881 (95% CI: 0.825, 0.924), a sensitivity of 0.844 (95% CI: 0.672, 0.947), and a specificity of 0.832 (95% CI: 0.762, 0.888). Its AUC is significantly higher than those of the other two deep learning approaches (p < 0.05, DeLong's test). Moreover, the performance is stable across different partitions of the multicenter dataset.
Conclusions: The proposed TN-USMA model can successfully differentiate higher-risk GISTs from lower-risk ones. It is accurate, robust, generalizable, and efficient for potential clinical applications.
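For reference, the four evaluation metrics reported above can be computed from binary labels and predicted scores as in the following sketch. The 0.5 decision threshold, the label convention (1 for higher-risk, 0 for lower-risk), and the function names are assumptions. The AUC is computed with the rank-based (Mann-Whitney) formulation; confidence intervals and the DeLong comparison are not shown.

```python
def auc_rank(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and AUC for a binary classifier."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate (higher-risk recall)
        "specificity": tn / (tn + fp),  # true-negative rate (lower-risk recall)
        "auc": auc_rank(y_true, y_score),
    }
```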
Funding Information
  • National Natural Science Foundation of China (61771143, 61871135, 81830058)
  • Science and Technology Commission of Shanghai Municipality (20DZ1100104)