Forecasting Zika Incidence in the 2016 Latin America Outbreak Combining Traditional Disease Surveillance with Search, Social Media, and News Report Data
Open Access
- 13 January 2017
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Neglected Tropical Diseases
- Vol. 11 (1), e0005295
- https://doi.org/10.1371/journal.pntd.0005295
Abstract
Over 400,000 people across the Americas are thought to have been infected with Zika virus as a consequence of the 2015–2016 Latin American outbreak. Official government-led case count data in Latin America are typically delayed by several weeks, making it difficult to track the disease in a timely manner. Thus, timely disease tracking systems are needed to design and assess interventions to mitigate disease transmission. We combined information from Zika-related Google searches, Twitter microblogs, and the HealthMap digital surveillance system with historical Zika suspected case counts to track and predict estimates of suspected weekly Zika cases during the 2015–2016 Latin American outbreak, up to three weeks ahead of the publication of official case data. We evaluated the predictive power of these data and used a dynamic multivariable approach to retrospectively produce predictions of weekly suspected cases for five countries: Colombia, El Salvador, Honduras, Venezuela, and Martinique. Models that combined Google (and Twitter data where available) with autoregressive information showed the best out-of-sample predictive accuracy for 1-week ahead predictions, whereas models that used only Google and Twitter typically performed best for 2- and 3-week ahead predictions. Given the significant delay in the release of official government-reported Zika case counts, we show that these Internet-based data streams can be used as timely and complementary ways to assess the dynamics of the outbreak. In the absence of access to real-time government-reported Zika case counts, we demonstrate the ability of Internet-based data sources to track the outbreak. Our model predictions fill a critical time-gap in existing Zika surveillance, given that early interventions and real-time surveillance are necessary to curb mosquito transmission. 
Official Zika case reports will likely continue to be delayed in their release; thus, it is important that health and government officials have access to real-time and future estimates of Zika activity in order to allocate resources according to potential changes in outbreak dynamics. The methodologies presented here may be expanded to any country, and perhaps to finer spatial resolutions, to identify changes in Zika transmission for public health decision-makers.
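The abstract describes models that combine autoregressive case counts with Internet-based signals and are re-estimated as new data arrive, but it does not give the model specification. The following is therefore only a minimal sketch of that general idea, not the authors' implementation. It assumes a ridge-regularized linear regression as a stand-in for the paper's estimator, assumes case counts through the current week are already known, and uses synthetic data; the names `build_design` and `rolling_forecast` and all parameter values are hypothetical.

```python
import numpy as np


def build_design(cases, signals, n_lags, horizon):
    """Pair features known at week t with the case count at week t + horizon."""
    X, y = [], []
    for t in range(n_lags - 1, len(cases) - horizon):
        ar = cases[t - n_lags + 1 : t + 1]  # the n_lags most recent case counts
        X.append(np.concatenate(([1.0], ar, signals[t])))  # intercept, AR terms, signals
        y.append(cases[t + horizon])
    return np.array(X), np.array(y)


def rolling_forecast(cases, signals, n_lags=2, horizon=1, min_train=12, ridge=1e-3):
    """Refit the regression each week on past data only, then predict `horizon` weeks ahead."""
    preds = {}
    start = n_lags - 1 + horizon + min_train
    for t in range(start, len(cases) - horizon):
        # Training targets stop at week t, so no future case counts leak into the fit.
        X, y = build_design(cases[: t + 1], signals[: t + 1], n_lags, horizon)
        penalty = ridge * np.eye(X.shape[1])
        penalty[0, 0] = 0.0  # leave the intercept unpenalized
        coef = np.linalg.solve(X.T @ X + penalty, X.T @ y)
        x_now = np.concatenate(([1.0], cases[t - n_lags + 1 : t + 1], signals[t]))
        preds[t + horizon] = float(x_now @ coef)
    return preds


# Synthetic illustration only: an outbreak-shaped curve and two noisy proxy signals
# (standing in for search volume and tweet counts). These are not real data.
rng = np.random.default_rng(0)
weeks = np.arange(40)
cases = 1000.0 * np.exp(-0.5 * ((weeks - 20) / 6.0) ** 2)
signals = np.column_stack([
    0.05 * cases + rng.normal(0.0, 2.0, size=weeks.size),
    0.02 * cases + rng.normal(0.0, 1.0, size=weeks.size),
])

preds = rolling_forecast(cases, signals, horizon=1)
```

Here `preds` maps each target week to its out-of-sample estimate. Setting `horizon=2` or `horizon=3` gives the longer-range variants, and dropping the autoregressive columns from the design would correspond to a signals-only model of the kind the abstract mentions for longer horizons.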