Development and validation of a clinical and genetic model for predicting risk of severe COVID-19

Abstract
Clinical and genetic risk factors for severe coronavirus disease 2019 (COVID-19) are often considered independently and without knowledge of the magnitudes of their effects on risk. Using UK Biobank participants who tested positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we developed and validated a clinical and genetic model to predict risk of severe COVID-19. We fitted the model using multivariable logistic regression on a 70% training dataset and held out the remaining 30% for validation. We also validated a previously published prototype model. In the validation dataset, our new model was associated with severe COVID-19 (odds ratio per quintile of risk = 1.77, 95% confidence interval (CI) 1.64–1.90) and had acceptable discrimination (area under the receiver operating characteristic curve = 0.732, 95% CI 0.708–0.756). We assessed calibration by logistic regression of severe COVID-19 status on the log odds of the risk score. The new model showed no evidence of over- or under-estimation of risk (intercept α = −0.08, 95% CI −0.21 to 0.05) and no evidence of over- or under-dispersion of risk (slope β = 0.90, 95% CI 0.80–1.00). Accurate prediction of individual risk is possible and will be important in regions where vaccines are not widely available or where people refuse or are disqualified from vaccination, especially given uncertainty about the extent of infection transmission among vaccinated people and the emergence of SARS-CoV-2 variants of concern.
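
The discrimination and calibration checks described in the abstract can be illustrated with a minimal sketch. It assumes a vector of predicted risks and a binary severe-disease outcome. The data are synthetic rather than UK Biobank records, the function names (`auc`, `calibration`) are hypothetical, and the hand-written Newton-Raphson fit merely stands in for whatever statistical software was actually used. A well-calibrated score should give α near 0 and β near 1.

```python
import math
import random


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def auc(risks, outcomes):
    """Area under the ROC curve: the probability that a randomly chosen
    case has a higher predicted risk than a randomly chosen non-case,
    with ties counted as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def calibration(risks, outcomes, iters=25):
    """Fit logit(P(y = 1)) = alpha + beta * logit(risk) by Newton-Raphson.
    alpha is the calibration intercept, beta the calibration slope."""
    x = [logit(r) for r in risks]
    alpha, beta = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            p = expit(alpha + beta * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. alpha
            g1 += (yi - p) * xi     # gradient w.r.t. beta
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta


# Synthetic validation set (assumption): outcomes are drawn from the
# predicted risks themselves, so the score is calibrated by construction.
rng = random.Random(0)
risks = [rng.uniform(0.05, 0.6) for _ in range(2000)]
outcomes = [1 if rng.random() < r else 0 for r in risks]

alpha, beta = calibration(risks, outcomes)
print(f"AUC = {auc(risks, outcomes):.3f}")
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")
```

In this framing, an α well below 0 would suggest that risks are over-estimated on average, and a β well below 1 would suggest that predicted risks are too widely dispersed.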