Flexible Modeling of Epidemics with an Empirical Bayes Framework

Abstract
Seasonal influenza epidemics cause consistent, considerable, widespread loss annually in terms of economic burden, morbidity, and mortality. With access to accurate and reliable forecasts of a current or upcoming influenza epidemic’s behavior, policy makers can design and implement more effective countermeasures. This past year, the Centers for Disease Control and Prevention hosted the “Predict the Influenza Season Challenge”, with the task of predicting key epidemiological measures for the 2013–2014 U.S. influenza season with the help of digital surveillance data. We developed a semiparametric Empirical Bayes framework for in-season forecasts of epidemics, and applied it to predict the weekly percentage of outpatient doctor visits for influenza-like illness, and the season onset, duration, peak time, and peak height, with and without using Google Flu Trends data. Previous work on epidemic modeling has focused on developing mechanistic models of disease behavior and applying time series tools to explain historical data. However, tailoring these models to certain types of surveillance data can be challenging, and overly complex models with many parameters can compromise forecasting ability. Our approach instead produces possibilities for the epidemic curve of the season of interest using modified versions of data from previous seasons, allowing for reasonable variations in the timing, pace, and intensity of the seasonal epidemics, as well as noise in observations. Since the framework does not make strict domain-specific assumptions, it can easily be applied to some other diseases with seasonal epidemics. This method produces a complete posterior distribution over epidemic curves, rather than, for example, solely point predictions of forecasting targets. We report prospective influenza-like-illness forecasts made for the 2013–2014 U.S. influenza season, and compare the framework’s cross-validated prediction error on historical data to that of a variety of simpler baseline predictors.

Author Summary
Influenza epidemics occur annually and impose significant costs in lost productivity, sickness, and death. Policy makers employ countermeasures, such as vaccination campaigns, to combat the occurrence and spread of infectious diseases, but epidemics exhibit a wide range of behavior, which makes designing and planning these efforts difficult. Accurate and reliable numerical forecasts of how an epidemic will behave, as well as advance notice of key events, could enable policy makers to further specialize countermeasures for a particular season. While a large amount of work already exists on modeling epidemics in past seasons, work on forecasting is relatively sparse. Models tailored specifically to historical data may be overly strict and fail to reproduce the behavior of the current season. We designed a framework for predicting epidemics without making strong assumptions about how the disease propagates, relying instead on slightly modified versions of past epidemics to form possibilities for the current season. We report forecasts generated for the 2013–2014 Centers for Disease Control and Prevention (CDC) “Predict the Influenza Season Challenge”, and assess the framework’s accuracy retrospectively.
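
The idea of forming candidate epidemic curves from modified past seasons can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a simple importance-weighting scheme, Gaussian observation noise, and uniform ranges for the timing shift, pace, and intensity. All function names, parameter ranges, and the synthetic "past seasons" are hypothetical and chosen only for illustration.

```python
import numpy as np


def transform_curve(curve, shift, pace, scale):
    """Shift a past-season curve in time, stretch its pace, and rescale its height."""
    weeks = np.arange(len(curve), dtype=float)
    # Map each week of the new curve back to a (fractional) week of the past curve.
    source_weeks = (weeks - shift) / pace
    # np.interp holds the end values constant outside the observed range.
    return scale * np.interp(source_weeks, weeks, curve)


def sample_candidates(past_curves, n_samples, rng):
    """Draw candidate curves as randomly modified past seasons (assumed uniform ranges)."""
    candidates = np.empty((n_samples, past_curves.shape[1]))
    for i in range(n_samples):
        base = past_curves[rng.integers(len(past_curves))]
        shift = rng.uniform(-3.0, 3.0)   # timing, in weeks (hypothetical range)
        pace = rng.uniform(0.75, 1.25)   # time stretch factor (hypothetical range)
        scale = rng.uniform(0.5, 2.0)    # intensity factor (hypothetical range)
        candidates[i] = transform_curve(base, shift, pace, scale)
    return candidates


def posterior_weights(candidates, observed, noise_sd):
    """Weight each candidate by a Gaussian likelihood of the weeks observed so far."""
    n_obs = len(observed)
    resid = candidates[:, :n_obs] - observed
    log_lik = -0.5 * np.sum((resid / noise_sd) ** 2, axis=1)
    log_lik -= log_lik.max()  # subtract the max for numerical stability
    weights = np.exp(log_lik)
    return weights / weights.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weeks = np.arange(30)
    # Synthetic stand-ins for past seasons (NOT real surveillance data).
    past = np.stack([
        1.0 + h * np.exp(-0.5 * ((weeks - p) / 3.0) ** 2)
        for h, p in [(3.0, 12), (5.0, 15), (2.0, 18)]
    ])
    # Pretend the first 10 weeks of the current season have been observed.
    observed = transform_curve(past[1], 1.0, 1.1, 0.9)[:10]

    candidates = sample_candidates(past, 2000, rng)
    weights = posterior_weights(candidates, observed, noise_sd=0.2)

    # Weighted summaries of the resulting distribution over curves.
    mean_curve = weights @ candidates
    peak_weeks = candidates.argmax(axis=1)
    print("posterior mean peak week:", float(weights @ peak_weeks))
    print("posterior mean peak height:", float(weights @ candidates.max(axis=1)))
```

Because every candidate is a full curve, the weighted sample gives a distribution over any target derived from the curve (peak time, peak height, and so on) rather than a single point prediction. A fuller treatment would also estimate the transformation ranges and noise level from historical data rather than fixing them by hand.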