Data quality evaluation for observational multiple sclerosis registries

Abstract
Objective: Objective and reproducible evaluation of data quality is essential for studies of ‘real-world’ observational data. Here, we summarise a standardised process for evaluating data quality, density and generalisability, implemented by MSBase, a global multiple sclerosis (MS) cohort study. Methods: The error rate, data density score and generalisability score were developed using all 35,869 patients enrolled in MSBase as of November 2015. The data density score was calculated across six domains (follow-up, demography, visits, MS relapses, paraclinical data and therapy) and emphasised data completeness. The error rate evaluated the syntactic accuracy and consistency of the data. The generalisability score evaluated the believability of the demographic and treatment information. Correlations among the three scores and the number of patients per centre were evaluated. Results: Errors were identified at a median rate of 3 per 100 patient-years. The generalisability score indicated the samples’ representativeness of the known MS epidemiology. A moderate correlation between the density and generalisability scores (ρ = 0.58) and weak correlations between the error rate and the other two scores (ρ = −0.32 to −0.33) were observed. The generalisability score was strongly correlated with centre size (ρ = 0.79). Conclusion: The implemented scores enable objective evaluation of the quality of observational MS data, with an impact on the design of future analyses.
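
As an illustration of the two kinds of quantity reported above, the following minimal Python sketch computes an error rate per 100 patient-years and a rank correlation between two per-centre scores. It is not the MSBase implementation. The function names and the example values are hypothetical, and the use of Spearman's rank correlation is an assumption suggested by the symbol ρ rather than something stated in this abstract.

```python
from statistics import median


def errors_per_100_patient_years(n_errors, patient_years):
    """Error rate normalised to 100 patient-years of follow-up."""
    if patient_years <= 0:
        raise ValueError("patient_years must be positive")
    return 100.0 * n_errors / patient_years


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-centre summaries (illustrative values, not MSBase data).
centres = [
    {"errors": 12, "patient_years": 400.0, "density": 0.81},
    {"errors": 30, "patient_years": 650.0, "density": 0.62},
    {"errors": 5, "patient_years": 300.0, "density": 0.90},
    {"errors": 18, "patient_years": 500.0, "density": 0.74},
]

rates = [errors_per_100_patient_years(c["errors"], c["patient_years"]) for c in centres]
densities = [c["density"] for c in centres]

print("median error rate per 100 patient-years:", median(rates))
print("rho(error rate, density):", spearman_rho(rates, densities))
```

Normalising by patient-years rather than by patient count keeps centres with different follow-up durations comparable, and a rank-based correlation avoids assuming a linear relationship between scores measured on different scales.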