Developing a standardized protocol for computational sentiment analysis research using health-related social media data
- 22 December 2020
- journal article
- research article
- Published by Oxford University Press (OUP) in Journal of the American Medical Informatics Association
- Vol. 28 (6), 1125-1134
- https://doi.org/10.1093/jamia/ocaa298
Abstract
Sentiment analysis is a popular tool for analyzing health-related social media content. However, existing studies exhibit numerous methodological issues and inconsistencies with respect to research design and results reporting, which could lead to biased data, imprecise or incorrect conclusions, or incomparable results across studies. This article reports a systematic analysis of the literature with respect to such issues. The objective was to develop a standardized protocol for improving the research validity and comparability of results in future relevant studies. We developed the Protocol of Analysis of senTiment in Health (PATH) based on a systematic review that analyzed common research design choices and how such choices were made, or reported, among eligible studies published between 2010 and 2019. Of 409 articles screened, 89 met the inclusion criteria. A total of 16 distinctive research design choices were identified, 9 of which showed significant methodological or reporting inconsistencies among the articles reviewed, ranging from how the relevance of study data was determined to how the selected sentiment analysis tool was validated. Based on this result, we developed the PATH protocol, which encompasses all of these distinctive design choices and highlights the ones for which careful consideration and detailed reporting are particularly warranted. A substantial degree of methodological and reporting inconsistency exists in the extant literature that applied sentiment analysis to health-related social media data. The PATH protocol developed through this research may contribute to mitigating such issues in future relevant studies.
Funding Information
- National Center for Research Resources and the National Center for Advancing Translational Sciences of the National Institutes of Health (UL1TR001414)