Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study
Open Access
- 14 April 2020
- journal article
- research article
- Published by JMIR Publications Inc. in JMIR Public Health and Surveillance
- Vol. 6 (2), e18828-198
- https://doi.org/10.2196/18828
Abstract
Background: The recent global outbreak of coronavirus disease (COVID-19) is affecting many countries worldwide. Iran is one of the top 10 most affected countries. Search engines provide useful data from populations, and these data might be useful to analyze epidemics. Utilizing data mining methods on electronic resources’ data might provide a better insight into the COVID-19 outbreak to manage the health crisis in each country and worldwide.
Objective: This study aimed to predict the incidence of COVID-19 in Iran.
Methods: Data were obtained from the Google Trends website. Linear regression and long short-term memory (LSTM) models were used to estimate the number of positive COVID-19 cases. All models were evaluated using 10-fold cross-validation, and root mean square error (RMSE) was used as the performance metric.
Results: The linear regression model predicted the incidence with an RMSE of 7.562 (SD 6.492). The most effective factors besides previous day incidence included the search frequency of handwashing, hand sanitizer, and antiseptic topics. The RMSE of the LSTM model was 27.187 (SD 20.705).
Conclusions: Data mining algorithms can be employed to predict trends of outbreaks. This prediction might support policymakers and health care managers to plan and allocate health care resources accordingly.
This publication has 14 references indexed in Scilit.
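The evaluation procedure named in the abstract's Methods (a linear regression scored by 10-fold cross-validation, reported as mean RMSE with its SD across folds) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a single predictor (previous-day incidence), a shuffled fold assignment, and a synthetic incidence series generated only for demonstration; the abstract does not say how the folds were built or list the full feature set. The function names `fit_simple_ols`, `rmse`, and `kfold_rmse` are hypothetical.

```python
import math
import random
import statistics


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(statistics.fmean(squared))


def kfold_rmse(xs, ys, k=10, seed=0):
    """Return (mean, SD) of the per-fold RMSE over k shuffled folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)  # assumption: shuffled folds
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_simple_ols(
            [xs[i] for i in train], [ys[i] for i in train]
        )
        preds = [intercept + slope * xs[i] for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return statistics.fmean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    # Synthetic daily incidence series (NOT data from the study).
    rng = random.Random(42)
    incidence = [100.0]
    for _ in range(59):
        incidence.append(incidence[-1] * 1.05 + rng.gauss(0, 5))

    prev_day, today = incidence[:-1], incidence[1:]
    mean_rmse, sd_rmse = kfold_rmse(prev_day, today, k=10)
    print(f"RMSE {mean_rmse:.3f} (SD {sd_rmse:.3f})")
```

Reporting the SD alongside the mean RMSE mirrors the "RMSE (SD)" form used in the Results. Shuffled folds let later days inform predictions for earlier ones, so a time-ordered split may be the more conservative choice for incidence series.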