Digital Influenza Surveillance: The Prospects of Google Trends Data for South Africa

Abstract
Influenza (flu) epidemics remain a public health concern worldwide. Conventional disease surveillance systems are slow and resource-intensive because they rely on data reported by medical practitioners. To address these limitations, recent research has explored alternative data sources such as web search queries. In this paper, we investigate the potential of Google Trends (GT) for flu surveillance in South Africa. GT is a freely available tool maintained by Google for analyzing historical Google search queries. Although recent studies have shown significant correlation between GT data and actual flu surveillance records in some countries, the results cannot be generalized because of differences in culture and use of technology. Previous findings therefore need to be validated across geographical and cultural contexts. Such studies focusing on African countries are scarce, and to our knowledge none has been reported for South Africa. We collected GT data for 244 flu-related queries covering all 11 official South African (SA) languages over the period 2010–2018. Pearson's correlation analysis was performed to assess the correlation of these GT data with national flu surveillance data, namely influenza-like illness (ILI) data and laboratory-confirmed influenza cases (LCIC) data, over each epidemiological year. The surveillance data were provided for this study by the South African National Institute for Communicable Diseases (NICD). We find a correlation coefficient of at least 0.5 between GT data and ILI data for 21 terms, and between GT data and LCIC data for 19 terms. For a few terms, the coefficient approaches 0.9. These terms can serve as proxies for flu activity in practice. Overall, the study establishes the potential of Google Trends as a complementary, faster and cheaper data source for influenza surveillance in South Africa.
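The term-selection step described above can be illustrated with a minimal sketch. It assumes one series of GT search interest per term and one series of surveillance counts, aligned over the same weeks of a single epidemiological year. The function names `pearson_r` and `select_terms`, the example terms, and all numbers are hypothetical and are not taken from the study; only the 0.5 threshold comes from the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def select_terms(gt_by_term, surveillance, threshold=0.5):
    """Keep the terms whose correlation with the surveillance series is >= threshold."""
    selected = {}
    for term, series in gt_by_term.items():
        r = pearson_r(series, surveillance)
        if r >= threshold:  # a nan result compares False and is skipped
            selected[term] = r
    return selected


# Made-up illustrative values, NOT data from the study.
ili_cases = [2, 5, 9, 14, 8, 3]
gt_by_term = {
    "flu symptoms": [10, 22, 41, 60, 35, 12],
    "weather": [50, 48, 52, 49, 51, 50],
}
selected = select_terms(gt_by_term, ili_cases)
print(selected)
```

In this toy example only the first term passes the threshold. Repeating the same calculation for each epidemiological year and for each surveillance series (ILI and LCIC) would mirror the per-year analysis described above.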