Twitter Improves Influenza Forecasting
Open Access
- 1 January 2014
- journal article
- Published by Public Library of Science (PLoS) in PLoS Currents
Abstract
Accurate disease forecasts are imperative when preparing for influenza epidemic outbreaks; nevertheless, these forecasts are often limited by the time required to collect new, accurate data. In this paper, we show that data from the microblogging community Twitter significantly improves influenza forecasting. Most prior influenza forecast models are tested against historical influenza-like illness (ILI) data from the U.S. Centers for Disease Control and Prevention (CDC). These data are released with a one-week lag and are often initially inaccurate until the CDC revises them weeks later. Since previous studies utilize the final, revised data in evaluation, their evaluations do not properly determine the effectiveness of forecasting. Our experiments using ILI data available at the time of the forecast show that models incorporating data derived from Twitter can reduce forecasting error by 17-30% over a baseline that only uses historical data. For a given level of accuracy, using Twitter data produces forecasts that are two to four weeks ahead of baseline models. Additionally, we find that models using Twitter data are, on average, better predictors of influenza prevalence than are models using data from Google Flu Trends, the leading web data source.
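To make the "17-30% reduction in forecasting error" concrete, the following is a minimal sketch of how a relative error reduction over a baseline could be computed. It assumes mean absolute error as the metric, which the abstract does not specify; the function names and all numbers are hypothetical and are not taken from the paper.

```python
def mean_absolute_error(forecasts, truth):
    """Average absolute gap between forecast and observed ILI rates."""
    if len(forecasts) != len(truth):
        raise ValueError("forecasts and truth must have the same length")
    return sum(abs(f - t) for f, t in zip(forecasts, truth)) / len(truth)


def relative_error_reduction(baseline_error, model_error):
    """Fraction by which model_error improves on baseline_error."""
    return (baseline_error - model_error) / baseline_error


# Hypothetical weekly ILI rates (percent of visits); NOT data from the paper.
observed = [1.0, 2.0, 3.0, 4.0]
baseline_forecast = [1.5, 2.5, 2.5, 4.5]   # stand-in for a history-only model
augmented_forecast = [1.4, 2.4, 2.6, 4.4]  # stand-in for history plus a Twitter-derived signal

baseline_mae = mean_absolute_error(baseline_forecast, observed)
augmented_mae = mean_absolute_error(augmented_forecast, observed)
reduction = relative_error_reduction(baseline_mae, augmented_mae)

print(f"baseline MAE={baseline_mae:.2f}, augmented MAE={augmented_mae:.2f}, "
      f"reduction={reduction:.0%}")
```

Under the abstract's argument, `observed` would hold the ILI values as they were available at forecast time, not the later CDC revisions; scoring against revised values would measure something other than real-time forecasting performance.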