Radiomics analysis combining unsupervised learning and handcrafted features: A multiple‐disease study
- 28 August 2021
- journal article
- research article
- Published by Wiley in Medical Physics
- Vol. 48 (11), 7003-7015
- https://doi.org/10.1002/mp.15199
Abstract
Purpose: To investigate the synergistic benefit of combining conventional handcrafted and learning-based features for disease identification across a wide range of clinical settings.
Methods and Materials: In this retrospective study, we collected data from 170/150/209/137 patients with four different disease types, each associated with its own identification objective: lymph node metastasis status of gastric cancer (GC), 5-year survival status of patients with high-grade osteosarcoma (HOS), early recurrence status of intrahepatic cholangiocarcinoma (ICC), and pathological grades of pancreatic neuroendocrine tumors (pNETs). Image features were derived from CT for GC, HOS, and pNETs, and from MR for ICC. In each study, 67 universal handcrafted features and study-specific features based on the sparse autoencoder (SAE) method were extracted and fed into the subsequent feature selection and learning model to predict the corresponding identification objective. Models using handcrafted features alone, SAE features alone, and hybrid features were optimized and their performance was compared. Prominent features were analyzed both qualitatively and quantitatively to generate study-specific and cross-study insight. In addition to the direct assessment of performance gain, correlation analysis was performed to assess the complementarity between handcrafted and SAE features.
Results: On the independent hold-out test sets, prediction based on handcrafted, SAE, and hybrid features yielded AUCs of 0.761, 0.769, and 0.829 for GC; 0.629, 0.740, and 0.709 for HOS; 0.717, 0.718, and 0.758 for ICC; and 0.739, 0.715, and 0.771 for pNETs, respectively. In three of the four studies, prediction using the hybrid features yielded the best performance, demonstrating the general benefit of using hybrid features. Prediction with SAE features alone performed best in the HOS study, which may be explained by the complexity of HOS prognosis and by possible slight overfitting due to the higher correlation between handcrafted and SAE features.
Conclusion: This study demonstrated the general benefit of combining handcrafted and learning-based features in radiomics modelling. It also illustrates that the performance gain depends on the task and the data, and suggests that while the common methodology of feature combination may be applied across various studies and tasks, study-specific feature selection and model optimization are still necessary to achieve high accuracy and robustness.
This article is protected by copyright. All rights reserved.
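The two quantities the abstract relies on, the AUC of a model's predicted scores and the correlation between handcrafted and SAE features, can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the function names are hypothetical, the abstract does not state which correlation coefficient was used (Pearson is assumed here), and the toy numbers are invented purely for illustration.

```python
from math import sqrt


def auc_score(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Equivalent to the Mann-Whitney U statistic divided by n_pos * n_neg;
    tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return cov / sqrt(var_x * var_y)


def mean_abs_cross_correlation(handcrafted, learned):
    """Mean |r| over every (handcrafted, learned) feature pair.

    Each argument is a list of feature columns (one list of values per
    feature). Assumption: a lower value is read as the two feature
    families carrying more complementary information.
    """
    pairs = [abs(pearson(h, s)) for h in handcrafted for s in learned]
    return sum(pairs) / len(pairs)


if __name__ == "__main__":
    # Toy labels and model scores (invented, not study data).
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print("AUC:", auc_score(labels, scores))

    # Toy feature columns (invented): one handcrafted, two learned.
    handcrafted = [[1.0, 2.0, 3.0, 4.0]]
    learned = [[2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 2.0, 4.0]]
    print("mean |r|:", mean_abs_cross_correlation(handcrafted, learned))
```

In a comparison like the one reported, `auc_score` would be computed once per feature set (handcrafted, SAE, hybrid) on the same hold-out cases, and the cross-correlation summary would help interpret cases such as HOS, where the hybrid model did not outperform SAE features alone.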
Funding Information
- National Natural Science Foundation of China (81871351, 81950410632)