A Digital Twins Machine Learning Model for Forecasting Disease Progression in Stroke Patients

Abstract
Background: Machine learning methods have been developed to predict the likelihood of a given event or to classify patients into two or more diagnostic categories. Digital twin models, which forecast entire trajectories of patient health data, have potential applications in clinical trials and patient management. Methods: In this study, we apply a digital twin model based on a variational autoencoder to a population of patients who went on to experience an ischemic stroke. Using International Classification of Diseases (ICD) codes for ischemic stroke and lab values as inputs, we assessed the digital twin’s ability to forecast clinical measurement trajectories leading up to the onset of the acute medical event and beyond. Results: On a withheld test set, the simulated patient trajectories were virtually indistinguishable from real patient data, with similar feature means, standard deviations, inter-feature correlations, and covariance structures. A logistic regression adversary model was unable to distinguish between the real and simulated data, with an area under the receiver operating characteristic (ROC) curve close to the chance level of 0.5 (AUCadversary = 0.51). Conclusion: Through accurate projection of patient trajectories, this model may help inform clinical decision making or provide virtual control arms for efficient clinical trials.
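
The adversary test reported in the Results can be illustrated with a minimal sketch. It is not the study's implementation: the function names (`train_logistic_regression`, `roc_auc`, `adversary_auc`), the plain gradient-descent fit, the 50/50 train/test split, and the synthetic Gaussian data are all assumptions made for illustration. The idea is to label real rows 1 and simulated rows 0, train a logistic regression to tell them apart, and score it by AUC on held-out rows. An AUC near 0.5 means the adversary does no better than chance.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def adversary_auc(real, simulated, seed=0):
    """Train an adversary on half the rows and return its AUC on the rest."""
    rng = random.Random(seed)
    data = [(row, 1) for row in real] + [(row, 0) for row in simulated]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]
    w, b = train_logistic_regression([x for x, _ in train],
                                     [y for _, y in train])
    # The linear score is monotone in the predicted probability,
    # so it yields the same AUC.
    scores = [sum(wj * xj for wj, xj in zip(w, x)) + b for x, _ in test]
    return roc_auc([y for _, y in test], scores)


def gaussian_rows(rng, n, mean, n_features=2):
    """Synthetic stand-in for patient feature vectors (illustrative only)."""
    return [[rng.gauss(mean, 1.0) for _ in range(n_features)]
            for _ in range(n)]


rng = random.Random(42)
real = gaussian_rows(rng, 200, mean=0.0)
good_twin = gaussian_rows(rng, 200, mean=0.0)  # same distribution as real
poor_twin = gaussian_rows(rng, 200, mean=3.0)  # clearly shifted features

print("AUC, well-matched simulation:", round(adversary_auc(real, good_twin), 2))
print("AUC, poorly matched simulation:", round(adversary_auc(real, poor_twin), 2))
```

In this sketch the well-matched simulation should give an AUC near 0.5 and the shifted one an AUC near 1. A linear adversary can only detect differences visible to a linear decision boundary, so a near-chance AUC supports, but does not by itself prove, that two distributions match.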