Improved Inference for Respondent-Driven Sampling Data With Application to HIV Prevalence Estimation

1 March 2011

journal article
Published by Informa UK Limited in Journal of the American Statistical Association

Vol. 106 (493), 135-146
https://doi.org/10.1198/jasa.2011.ap09475

Abstract

Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing representations, our approach respects the essential without-replacement feature of the process, while converging to an existing with-replacement representation for small sample fractions, and to the sample mean for a full-population sample. We present a successive-sampling based estimator for population means based on respondent-driven sampling data, and demonstrate its superior performance when the size of the hidden population is known. We present sensitivity analyses for unknown population sizes. In addition, we note that like other existing estimators, our new estimator is subject to bias induced by the selection of the initial sample. Using data collected among three populations in two countries, we illustrate the application of this approach to populations with varying characteristics. We conclude that the successive sampling estimator improves on existing estimators, and can also be used as a diagnostic tool when population size is not known. This article has supplementary material online.
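The two limiting cases named in the abstract can be illustrated with a small simulation. What follows is a minimal sketch, not the article's estimator: it assumes the degree of every population member is known and positive, which real respondent-driven sampling data would not provide, and it approximates inclusion probabilities by repeatedly simulating successive sampling (draws without replacement, each proportional to degree). The function names `successive_sample`, `inclusion_prob_by_degree`, and `weighted_mean` are hypothetical.

```python
import random
from collections import defaultdict


def successive_sample(degrees, n, rng):
    """Draw n unit indices without replacement; each draw is proportional
    to the degree of the units still remaining (degrees assumed positive)."""
    remaining = list(range(len(degrees)))
    weights = [degrees[i] for i in remaining]
    chosen = []
    for _ in range(n):
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pick))
        weights.pop(pick)
    return chosen


def inclusion_prob_by_degree(pop_degrees, n, sims=2000, seed=0):
    """Monte Carlo estimate of the inclusion probability of a unit with
    degree d. ASSUMPTION: the population degrees are known."""
    rng = random.Random(seed)
    pop_count = defaultdict(int)
    for d in pop_degrees:
        pop_count[d] += 1
    hits = defaultdict(int)
    for _ in range(sims):
        for i in successive_sample(pop_degrees, n, rng):
            hits[pop_degrees[i]] += 1
    # Average number of sampled units of degree d, per unit of that degree.
    return {d: hits[d] / (sims * pop_count[d]) for d in pop_count}


def weighted_mean(values, degrees, pi_by_degree):
    """Inverse-probability-weighted mean with normalized weights."""
    weights = [1.0 / pi_by_degree[d] for d in degrees]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)
```

When the sample size equals the population size, every unit is drawn in every simulation, so all estimated inclusion probabilities equal 1 and `weighted_mean` reduces to the plain sample mean. When the sample is a small fraction of the population, the estimated probabilities are roughly proportional to degree, so the weights approach the inverse-degree weights of a with-replacement treatment.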

Cited by 216 articles