The Parable of Google Flu: Traps in Big Data Analysis
- 14 March 2014
- journal article
- editorial
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 343 (6176), 1203-1205
- https://doi.org/10.1126/science.1248506
Abstract
In February 2013, Google Flu Trends (GFT) made headlines but not for a reason that Google executives or the creators of the flu tracking system would have hoped. Nature reported that GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) than the Centers for Disease Control and Prevention (CDC), which bases its estimates on surveillance reports from laboratories across the United States (1, 2). This happened despite the fact that GFT was built to predict CDC reports. Given that GFT is often held up as an exemplary use of big data (3, 4), what lessons can we draw from this error?
This publication has 27 references indexed in Scilit:
- Reassessing Google Flu Trends Data for Detection of Seasonal and Pandemic Influenza: A Comparative Epidemiological Study at Three Geographic Scales. PLoS Computational Biology, 2013
- Forecasting seasonal outbreaks of influenza. Proceedings of the National Academy of Sciences of the United States of America, 2012
- Beating the news using social media: the case study of American Idol. EPJ Data Science, 2012
- Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences of the United States of America, 2010
- Real-Time Epidemic Monitoring and Forecasting of H1N1-2009 Using Influenza-Like Illness from General Practice and Family Doctor Clinics in Singapore. PLOS ONE, 2010
- FluTE, a Publicly Available Stochastic Influenza Epidemic Simulation Model. PLoS Computational Biology, 2010
- Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences of the United States of America, 2009
- Detecting influenza epidemics using search engine query data. Nature, 2009
- Computational Social Science. Science, 2009
- Real-time epidemic forecasting for pandemic influenza. Epidemiology and Infection, 2006