An End-to-End Integrated Clinical and CT-Based Radiomics Nomogram for Predicting Disease Severity and Need for Ventilator Support in COVID-19 Patients: A Large Multisite Retrospective Study

Abstract
Objective: COVID-19 has caused a global pandemic with ~3.93 million deaths worldwide. In this work, we present three models, a radiomics model (MRM), a clinical model (MCM), and a combined clinical–radiomics nomogram (MRCM), to predict from baseline CT scans which COVID-19-positive patients will go on to need invasive mechanical ventilation. Methods: We performed a retrospective multicohort study of 897 COVID-19-positive patients from two institutions (Renmin Hospital of Wuhan University, D1 = 787, and University Hospitals, US, D2 = 110). The patients from institution 1 were divided into a 60% training set, D1T (N = 473), and a 40% test set, D1V (N = 314). The patients from institution 2 were used as an independent validation set, D2V (N = 110). A U-Net-based convolutional neural network (CNN) was trained to automatically segment the COVID-19 consolidation regions on the CT scans. First- and higher-order radiomic textural features were extracted from the segmented regions. The top radiomic and clinical features were selected using the least absolute shrinkage and selection operator (LASSO) with an optimal binomial regression model within D1T. Results: Three of the top five features identified using D1T were higher-order textural features (GLCM, GLRLM, GLSZM), whereas the remaining two were the total absolute infection size on the CT scan and the total intensity of the COVID-19 consolidations. The radiomics model (MRM) was constructed from the radiomic score, which was built using the coefficients obtained from the LASSO logistic model and used within the linear regression (LR) classifier. The MRM yielded an area under the receiver operating characteristic curve (AUC) of 0.754 (0.709–0.799) on D1T, 0.836 on D1V, and 0.748 on D2V. The top prognostic clinical factors identified in the analysis were lactate dehydrogenase (LDH), age, and albumin (ALB).
The clinical model (MCM) had an AUC of 0.784 (0.743–0.825) on D1T, 0.813 on D1V, and 0.688 on D2V. Finally, the combined model (MRCM), integrating the radiomic score, age, LDH, and ALB, yielded an AUC of 0.814 (0.774–0.853) on D1T, 0.847 on D1V, and 0.771 on D2V. The MRCM showed an overall improvement in performance of ~5.85% over the MCM (D1T: p = 0.0031; D1V: p = 0.0165; D2V: p = 0.0369). Conclusion: The novel integrated imaging and clinical model (MRCM) outperformed both the radiomics model (MRM) and the clinical model (MCM). Our results across multiple sites suggest that the integrated nomogram could help identify COVID-19 patients who have a more severe disease phenotype and may require mechanical ventilation.
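As an illustration of the scoring and evaluation steps summarized above, the following minimal sketch shows how a radiomic score can be formed as a coefficient-weighted sum of selected features and how an AUC can be computed from such scores. It is not the study's implementation: the feature names, intercept, coefficient values, and patient data are hypothetical placeholders, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation rather than any particular statistics package.

```python
# Hypothetical output of a LASSO logistic fit: an intercept and the nonzero
# coefficients of the retained (standardized) features. All names and values
# are illustrative assumptions, not results from the study.
INTERCEPT = -0.5
COEFFS = {"glcm_feature": 0.8, "glrlm_feature": -0.4, "infection_size": 0.6}


def radiomic_score(features):
    """Linear predictor: intercept plus coefficient-weighted sum of features."""
    return INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy cohort: (standardized feature values, label), where label 1 means the
# patient needed mechanical ventilation. Entirely synthetic.
patients = [
    ({"glcm_feature": 1.2, "glrlm_feature": -0.3, "infection_size": 0.9}, 1),
    ({"glcm_feature": 0.1, "glrlm_feature": 0.5, "infection_size": -0.2}, 0),
    ({"glcm_feature": 0.7, "glrlm_feature": 0.0, "infection_size": 0.4}, 1),
    ({"glcm_feature": -0.9, "glrlm_feature": 0.2, "infection_size": -1.0}, 0),
    ({"glcm_feature": 0.9, "glrlm_feature": -0.1, "infection_size": 0.1}, 0),
    ({"glcm_feature": -0.2, "glrlm_feature": 0.6, "infection_size": 0.3}, 1),
]

scores = [radiomic_score(features) for features, _ in patients]
labels = [label for _, label in patients]
print(f"Toy AUC: {auc(scores, labels):.3f}")
```

In a combined model of the kind described, the radiomic score would enter a second regression as one predictor alongside the clinical variables (age, LDH, ALB); that step is omitted here to keep the sketch short.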