Effects of incomplete inter-hospital network data on the assessment of transmission dynamics of hospital-acquired infections

Abstract
In 2020, there were 105 different statutory insurance companies in Germany, with heterogeneous regional coverage. Obtaining data from all of them is challenging, so projects will likely have to rely on data that do not cover the whole population. Consequently, studies of epidemic spread in hospital referral networks that use data-driven models may be biased. We studied this bias using data from three German regional insurance companies covering four federal states: AOK Lower Saxony (in the federal state of Lower Saxony), AOK Bavaria (in Bavaria), and AOK PLUS (in Thuringia and Saxony). AOK historically stood for “general local health insurance company”, but currently only the abbreviation is used. To understand how incomplete data influence network characteristics and related epidemic simulations, we created sampled datasets by randomly dropping a proportion of patients from the full datasets, and scale-up datasets by replacing the dropped patients with random copies of the remaining patients to restore the original size. For the sampled and scale-up datasets, we calculated several commonly used network measures and compared them to those derived from the original data. We found that the network measures (degree, strength and closeness) were rather sensitive to incompleteness. In contrast, infection prevalence as an outcome of the applied susceptible-infectious-susceptible (SIS) model was fairly robust against incompleteness. At incompleteness levels as high as 90% of the original datasets, the prevalence estimation bias was below 5% in scale-up datasets. Consequently, a coverage as low as 10% of the federal state population was sufficient to keep the relative bias in prevalence below 10% for a wide range of transmission parameters as encountered in clinical settings.
Our findings are reassuring: despite incomplete coverage of the population, German health insurance data can be used to study the effects of patient traffic between institutions on the spread of pathogens within healthcare networks.
Patterns of patient transfer between hospitals contribute crucially to the risk of hospital-acquired infections (HAIs) in the health care system. Network models can be applied to quantify this risk. The estimated risk can be inaccurate when data on hospital admissions are incomplete, which can be a consequence of the multiplicity of insurance companies, as is the case in Germany. To better understand how data incompleteness affects network measures and the simulated spread of HAIs, we compared those measures derived from sampled, scale-up and original data, based on hospitalization data from three AOK insurance companies. We found that common network measures were affected by incompleteness, but the simulated prevalence, as a measure of epidemic spread in the network, was robust over a large range of incompleteness proportions. Epidemics and the transmission of infectious diseases may be modelled on hospital data with a coverage as low as 10% of the local population, whilst maintaining accuracy to within 10% of the true population prevalence.
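The sampling and scale-up procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy transfer records, and the choice to count every copied patient's transfers as additional edge weight are assumptions made for illustration only. Degree is taken here as in-degree plus out-degree of a hospital, and strength as the summed weight of its incident edges.

```python
import random
from collections import defaultdict


def sample_and_scale_up(patient_ids, drop_fraction, rng):
    """Drop a fraction of patients at random (sampled dataset), then pad
    with random copies of the kept patients back to the original size
    (scale-up dataset). Hypothetical helper, not from the study."""
    patients = list(patient_ids)
    n_keep = max(1, round(len(patients) * (1 - drop_fraction)))
    sampled = rng.sample(patients, n_keep)                   # without replacement
    copies = rng.choices(sampled, k=len(patients) - n_keep)  # with replacement
    return sampled, sampled + copies


def degree_and_strength(transfers, patients):
    """Build a weighted, directed hospital network from patient transfers.

    transfers: dict mapping patient id -> list of (from_hospital, to_hospital).
    patients:  iterable of patient ids; repeated ids (scale-up copies)
               contribute their transfers once per occurrence.
    """
    weight = defaultdict(int)
    for pid in patients:
        for src, dst in transfers.get(pid, []):
            weight[(src, dst)] += 1

    degree = defaultdict(int)    # in-degree + out-degree
    strength = defaultdict(int)  # summed weight of incident edges
    for (src, dst), w in weight.items():
        degree[src] += 1
        degree[dst] += 1
        strength[src] += w
        strength[dst] += w
    return dict(degree), dict(strength)


# Toy data (invented): four patients moving between hospitals A, B and C.
toy_transfers = {
    1: [("A", "B")],
    2: [("A", "B"), ("B", "C")],
    3: [("C", "A")],
    4: [("A", "C")],
}

rng = random.Random(0)
sampled, scaled = sample_and_scale_up(toy_transfers, drop_fraction=0.5, rng=rng)
full_degree, full_strength = degree_and_strength(toy_transfers, toy_transfers)
scaled_degree, scaled_strength = degree_and_strength(toy_transfers, scaled)
```

Comparing `scaled_degree` and `scaled_strength` with `full_degree` and `full_strength` over many random draws and drop fractions gives a rough picture of how such measures react to incompleteness. Scaling up restores the total number of transfers but cannot restore edges that only dropped patients travelled.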
Funding Information
  • Bundesministerium für Bildung und Forschung (01KI1704C)
  • National Science Centre, Poland (2016/22/Z/ST1/00690)
  • ZonMw (547001005)