Predicting Severe Chronic Obstructive Pulmonary Disease Exacerbations: Developing a Population Surveillance Approach with Administrative Data

Abstract
Rationale: Automatic prediction algorithms based on routinely collected health data may be able to identify patients at high risk for hospitalizations related to acute exacerbations of chronic obstructive pulmonary disease (COPD).

Objectives: To conduct a proof-of-concept study of a population surveillance approach for identifying individuals at high risk of severe COPD exacerbations.

Methods: We used British Columbia's administrative health databases (1997-2016) to identify patients with diagnosed COPD. We used data from the previous 6 months to predict the risk of severe exacerbation in the next 2 months after a randomly selected index date. We applied statistical and machine-learning algorithms for risk prediction (logistic regression, random forest, neural network, and gradient boosting). We used calibration plots and receiver operating characteristic curves to evaluate model performance based on a randomly chosen future date at least 1 year later (temporal validation).

Results: There were 108,433 patients in the development dataset and 113,786 in the validation dataset; of these, 1,126 and 1,136, respectively, were hospitalized for COPD within their outcome windows. The best prediction algorithm (gradient boosting) had an area under the receiver operating characteristic curve of 0.82 (95% confidence interval, 0.80-0.83), which was significantly higher than the corresponding value for the model with exacerbation history as the only predictor (current standard of care: 0.68). The predicted risk scores were well calibrated in the validation dataset.

Conclusions: Imminent COPD-related hospitalizations can be predicted with good accuracy using administrative health data. This model may be used as a means to target high-risk patients for preventive exacerbation therapies.
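
The windowing and discrimination measure described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, months are approximated as 30-day blocks, and the choice to start the outcome window on the index date itself is an assumption, since the abstract does not give the exact rules. The area under the receiver operating characteristic curve is computed here by its rank interpretation (the probability that a randomly chosen hospitalized patient receives a higher score than a randomly chosen non-hospitalized one).

```python
from datetime import date, timedelta

# Assumption: a month is approximated as 30 days; the abstract does not
# state how the 6-month and 2-month windows were measured.
LOOKBACK_DAYS = 6 * 30
OUTCOME_DAYS = 2 * 30


def windows(index_date):
    """Return (lookback_start, index_date, outcome_end) for one patient."""
    return (
        index_date - timedelta(days=LOOKBACK_DAYS),
        index_date,
        index_date + timedelta(days=OUTCOME_DAYS),
    )


def count_prior_events(event_dates, index_date):
    """Count events in the lookback window [lookback_start, index_date)."""
    start, idx, _ = windows(index_date)
    return sum(start <= d < idx for d in event_dates)


def has_outcome(event_dates, index_date):
    """True if any event falls in the outcome window [index_date, outcome_end)."""
    _, idx, end = windows(index_date)
    return any(idx <= d < end for d in event_dates)


def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical patient: three severe exacerbation dates and one index date.
events = [date(2014, 11, 1), date(2015, 3, 15), date(2015, 7, 20)]
index = date(2015, 7, 1)
prior = count_prior_events(events, index)   # exacerbation-history predictor
outcome = has_outcome(events, index)        # label for the outcome window
```

A model that uses only `prior` as its score corresponds loosely to the exacerbation-history comparator mentioned in the Results; the richer models would replace that single score with a prediction built from many lookback-window features.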
