Random forest can predict 30-day mortality of spontaneous intracerebral hemorrhage with remarkable discrimination

Abstract
Risk-stratification models based on patient and disease characteristics are useful for aiding clinical decisions and for comparing the quality of care between physicians or hospitals. In addition, mortality prediction helps optimize resource utilization. We evaluated the accuracy and discriminatory power of the random forest (RF) in predicting 30-day mortality after spontaneous intracerebral hemorrhage (SICH). We retrospectively studied 423 patients admitted to the Taichung Veterans General Hospital who were diagnosed with SICH within 24 h of stroke onset. The patients' initial evaluation data were used to train the RF model. The area under the receiver operating characteristic curve (AUC) was used to quantify predictive performance. The performance of the RF model was compared with that of an artificial neural network (ANN), a support vector machine (SVM), a logistic regression model, and the ICH score. The RF had an overall accuracy of 78.5% for predicting mortality in patients with SICH, with a sensitivity of 79.0% and a specificity of 78.4%. The AUCs were as follows: RF, 0.87 (0.84-0.90); ANN, 0.81 (0.77-0.85); SVM, 0.79 (0.75-0.83); logistic regression, 0.78 (0.74-0.82); and ICH score, 0.72 (0.68-0.76). The RF thus showed the highest discriminatory power and the best predictive performance among the tested models. We believe that the RF is a suitable tool for clinicians to use in predicting 30-day mortality after SICH.
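For readers less familiar with the performance measures reported above, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and AUC are conventionally computed from binary outcomes and predicted risk scores. It is an illustration only, not the study's analysis code: the function names, the 0.5 decision threshold, and the eight-patient toy data are assumptions made for this example and are not taken from the study.

```python
# Illustrative sketch only: toy data and the 0.5 threshold are assumptions,
# not values from the study.

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = died)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }

def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = death within 30 days) and predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

metrics = classification_metrics(y_true, y_pred)
print(metrics)
print("AUC:", auc(y_true, scores))
```

Note that accuracy, sensitivity, and specificity depend on the chosen cutoff, whereas the AUC summarizes discrimination across all possible cutoffs, which is why it is the usual basis for comparing models such as those listed above.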