Estimating the Size of Populations at High Risk for HIV Using Respondent-Driven Sampling Data

Abstract
The study of hard‐to‐reach populations presents significant challenges. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. This is especially true of populations at high risk for HIV/AIDS. Respondent‐driven sampling (RDS) is often used in such settings with the primary goal of estimating the prevalence of infection. In such populations, the number of people at risk for infection and the number of people infected are of fundamental importance. This article presents a case study of estimating the size of a hard‐to‐reach population from data collected through RDS. We study two populations of female sex workers and men who have sex with men in El Salvador. The approach is Bayesian, and we consider different forms of prior information, including the UNAIDS population size guidelines for this region. We show that the method is able to quantify the amount of information on population size available in RDS samples. As a separate validation, we compare our results to those obtained by extrapolating from a capture–recapture study of Salvadoran cities. The results of our case study are largely comparable to those of the capture–recapture study when they differ from the UNAIDS guidelines. Our method is widely applicable to data from RDS studies, and we provide a software package to facilitate its use.
Funding Information
  • NICHD (1R21HD063000, 5R21HD075714-02)
  • ONR (N00014-08-1-1015)
  • NSF (MMS-0851555, MMS-1357619, SES-1230081)
  • National Agricultural Statistics Service
  • National Institute of Child Health and Human Development (R24-HD041022)
  • National Science Foundation