Discovering Health Topics in Social Media Using Topic Models
Open Access
- 1 August 2014
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 9 (8), e103408
- https://doi.org/10.1371/journal.pone.0103408
Abstract
By aggregating self-reported health statuses across millions of users, we seek to characterize the variety of health information discussed in Twitter. We describe a topic modeling framework for discovering health topics in Twitter, a social media website. This is an exploratory approach with the goal of understanding what health topics are commonly discussed in social media. This paper describes in detail a statistical topic model created for this purpose, the Ailment Topic Aspect Model (ATAM), as well as our system for filtering general Twitter data based on health keywords and supervised classification. We show how ATAM and other topic models can automatically infer health topics in 144 million Twitter messages from 2011 to 2013. ATAM discovered 13 coherent clusters of Twitter messages, some of which correlate with seasonal influenza (r = 0.689) and allergies (r = 0.810) temporal surveillance data, as well as exercise (r = .534) and obesity (r = −.631) related geographic survey data in the United States. These results demonstrate that it is possible to automatically discover topics that attain statistically significant correlations with ground truth data, despite using minimal human supervision and no historical data to train the model, in contrast to prior work. Additionally, these results demonstrate that a single general-purpose model can identify many different health topics in social media.
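The r values reported in the abstract are correlation coefficients between a topic's prevalence and an external reference series. As a minimal sketch of that kind of comparison, assuming the Pearson coefficient and using made-up numbers (the series names and values below are hypothetical and do not come from the paper), the calculation could look like this:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustration only: the weekly share of messages assigned to one
# topic, and a surveillance rate for the same weeks. Not data from the paper.
topic_share = [0.010, 0.012, 0.018, 0.025, 0.021, 0.014]
surveillance_rate = [1.1, 1.3, 2.0, 2.9, 2.2, 1.6]

r = pearson_r(topic_share, surveillance_rate)
print(f"r = {r:.3f}")
```

The same function applies to the geographic comparisons if each list holds one value per region instead of one value per week; a negative result, as reported for obesity, means the topic is less prevalent where the survey measure is higher.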