Sampling Twitter users for social science research: evidence from a systematic review of the literature

Open Access

27 January 2023

journal article
review article
Published by Springer Science and Business Media LLC in Quality & Quantity

p. 1-41
https://doi.org/10.1007/s11135-023-01615-w

Abstract

All social media platforms can be used to conduct social science research, but Twitter is the most popular because it provides its data via several Application Programming Interfaces, which allow qualitative and quantitative research to be conducted with its members. As Twitter is a huge universe, both in number of users and in amount of data, sampling is generally required when using it for research purposes. Researchers only recently began to question whether tweet-level sampling, in which the tweet is the sampling unit, should be replaced by user-level sampling, in which the user is the sampling unit. The major rationale for this shift is that tweet-level sampling ignores the fact that some core discussants on Twitter tweet far more often than other users, which biases the sample towards the more active users. Knowledge of how to select representative samples of users in the Twitterverse is still insufficient, despite its relevance for reliable and valid research outcomes. This paper contributes to this topic by presenting a systematic quantitative literature review of sampling plans designed and executed in the context of social science research on Twitter, including: (1) the definition of the target populations, (2) the sampling frames used to support sample selection, (3) the sampling methods used to obtain samples of Twitter users, (4) how data is collected from Twitter users, (5) the size of the samples, and (6) how research validity is addressed. This review can serve as a methodological guide for professionals and academics who want to conduct social science research involving Twitter users and the Twitterverse.
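
The contrast between tweet-level and user-level sampling described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy corpus, the user ids, and all function names are hypothetical, and the only assumption is simple random sampling over either tweets or distinct users.

```python
import random
from collections import Counter

# Hypothetical toy corpus: one entry per tweet, holding its author's id.
# "heavy" posts 8 of the 12 tweets; the other four users post one each.
tweets = ["heavy"] * 8 + ["u1", "u2", "u3", "u4"]

def tweet_level_shares(tweet_authors):
    """Chance that one uniformly drawn tweet belongs to each user."""
    counts = Counter(tweet_authors)
    total = len(tweet_authors)
    return {user: n / total for user, n in counts.items()}

def user_level_shares(tweet_authors):
    """Chance that one uniformly drawn user is each user."""
    users = set(tweet_authors)
    return {user: 1 / len(users) for user in users}

def sample_tweet_authors(tweet_authors, k, rng):
    """Simple random sample of k tweets, reported by author (tweet is the unit)."""
    return rng.sample(tweet_authors, k)

def sample_users(tweet_authors, k, rng):
    """Simple random sample of k distinct users (user is the unit)."""
    return rng.sample(sorted(set(tweet_authors)), k)

tweet_shares = tweet_level_shares(tweets)
user_shares = user_level_shares(tweets)

# Under tweet-level sampling the heavy tweeter accounts for 8/12 of the draws;
# under user-level sampling every user has the same 1/5 chance.
print("tweet-level share of 'heavy':", tweet_shares["heavy"])
print("user-level share of 'heavy': ", user_shares["heavy"])

rng = random.Random(0)  # fixed seed so the draws are reproducible
print("tweet-level sample:", sample_tweet_authors(tweets, 3, rng))
print("user-level sample: ", sample_users(tweets, 3, rng))
```

In this toy setting, a user's chance of entering a tweet-level sample grows with how often they tweet, whereas deduplicating to users before drawing gives every account equal weight.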

Funding Information

ISCTE – Instituto Universitário

This publication has 48 references indexed in Scilit:

The benefits of publishing systematic quantitative literature reviews for PhD candidates and other early-career researchers
Higher Education Research & Development, 2013
Towards more systematic Twitter analysis: metrics for tweeting activities
International Journal of Social Research Methodology, 2013
Paradata for Nonresponse Adjustment
The Annals of the American Academy of Political and Social Science, 2012
Don't turn social media into another 'Literary Digest' poll
Communications of the ACM, 2011
Computational social science
WIREs Computational Statistics, 2010
Total Survey Error: Design, Implementation, and Evaluation
Public Opinion Quarterly, 2010
Web Survey Methods: Introduction
Public Opinion Quarterly, 2008
Electronic Survey Methodology: A Case Study in Reaching Hard-to-Involve Internet Users
International Journal of Human–Computer Interaction, 2003
Applied Sampling
Published by Elsevier BV, 1983
Sampling Methods for Random Digit Dialing
Journal of the American Statistical Association, 1978