Correcting for ascertainment bias in the inference of population structure

Abstract
Background: The ascertainment process of molecular markers amounts to disregarding loci carrying alleles with low frequencies. If the inference algorithm does not properly account for this, it can result in strong biases in inferences under population genetics models. Attempting to model this censoring process in order to infer population structure (i.e. to identify clusters of individuals) raises challenging numerical difficulties.

Method: These difficulties stem from the presence of intractable normalizing constants in Metropolis–Hastings acceptance ratios. They can be overcome with a Markov chain Monte Carlo (MCMC) algorithm known as the single variable exchange algorithm (SVEA).

Result: We show how this general solution can be implemented for a class of clustering models of broad interest in population genetics, which includes the models underlying the computer programs STRUCTURE, GENELAND and GESTE. We also implement the proposed method for a simple example and show that it reduces the bias substantially.

Availability: Further details and a computer program implementing the method are available from http://folk.uio.no/gillesg/AscB/

Contact: gilles.guillot@bio.uio.no