Dead reckoning: can we trust estimates of mortality rates in clinical databases?

Abstract
Objectives: Databases almost invariably contain errors, and improving the quality of recorded data is costly. We sought to assess the extent to which given levels of error in a clinical database can lead to misleading mortality rates.

Methods: We deliberately seeded a large database of congenital heart surgery, covering over 17,600 operations and assumed to be error free, with errors at known rates of 0–20%. We explored the effects of three types of random error: data omission, outcome miscoding (alive or dead) and miscoding of procedures. For each error type, we compared the mortality rates calculated from the ‘seeded’ database with those calculated from the pristine database.

Results: Outcome miscoding typically inflates mortality rates; for low-risk procedures the estimate may well exceed double the true value. Random data omission has relatively little effect. If procedure types are miscoded, procedure-specific mortality estimates tend to be underestimated for high-risk operations and overestimated for low-risk ones. A mathematical model developed to examine these effects accurately forecast the results of the error-seeding experiments. Software implementing this model is available free of charge on the Internet.

Conclusion: Even small levels of data error can substantially affect the accuracy of mortality rate estimates, especially for low-risk operations. Such inaccuracy could lead to misleading analyses of institutional and individual surgeons’ results, and caution is warranted in interpreting mortality estimates derived from clinical databases. Our analysis extends beyond surgical mortality to any adverse event whose frequency is rare.
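
The inflation of low-risk mortality estimates under outcome miscoding can be sketched with a small simulation (an illustrative assumption, not the paper’s actual model or software): if the true mortality rate is p and each recorded outcome is independently flipped with probability e, the observed rate is roughly p(1 − e) + (1 − p)e, which for small p is dominated by the added term e.

```python
import random

def observed_mortality(true_rate, miscode_rate, n=200_000, seed=1):
    """Mortality rate computed after randomly flipping each recorded
    outcome (alive/dead) with probability `miscode_rate`.
    Illustrative sketch only: the function name, sample size and
    error rates are assumptions, not taken from the paper."""
    rng = random.Random(seed)
    deaths = 0
    for _ in range(n):
        dead = rng.random() < true_rate        # true outcome
        if rng.random() < miscode_rate:        # random outcome miscoding
            dead = not dead
        deaths += dead
    return deaths / n

# A low-risk procedure (true mortality 2%) with 5% outcome miscoding:
# expected observed rate ≈ 0.02*0.95 + 0.98*0.05 = 0.068, i.e. more than
# triple the true rate, whereas a high-risk 40% procedure only shifts
# to ≈ 0.40*0.95 + 0.60*0.05 = 0.41.
```

This matches the abstract’s qualitative finding: random outcome flips add roughly the same absolute error everywhere, so the relative distortion is largest where the true rate is smallest.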