Methods to identify population outliers using genetic markers

Abstract
Introduction: Genetic studies to identify linkage or association usually assume that participants are sampled from a genetically homogeneous population, so that a single set of marker allele frequencies is appropriate for all individuals in the study.

Methods: We have developed a method to identify individuals who are population outliers, in the sense that the marker allele frequency distributions from which their genotypes arise differ from those of the remaining individuals in the study. Using allele frequencies estimated from an independent sample, the genotype log likelihood (GLL) test statistic calculates the likelihood of each individual’s genotypes across all markers. Extreme values of the statistic indicate that the individual arises from a different population. The distribution of the test statistic is derived, and its convergence under the central limit theorem is discussed.

Results and Discussion: Applied to genome search data from rheumatoid arthritis, the method identified a single population outlier family. We used allele frequencies from different populations to show that 100 markers provide high power to identify outliers across a range of populations. The GLL test statistic can be used as a screening tool to identify outlier families in any genetic study with genotyping at independent markers.
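As an illustration only, the idea behind the GLL statistic can be sketched as follows. The abstract does not give the exact form of the statistic, so this sketch makes several assumptions: genotype probabilities follow Hardy-Weinberg proportions, markers are independent, all allele frequencies are strictly positive, and the summed log likelihood is standardized by its null mean and variance so that the central limit theorem gives an approximately standard normal score. The function names (`genotype_prob`, `marker_moments`, `gll_z_score`) are hypothetical and not taken from the paper.

```python
import math


def genotype_prob(freqs, a, b):
    """Hardy-Weinberg probability of the unordered genotype (a, b).

    Assumption: freqs holds strictly positive allele frequencies summing to 1.
    """
    return freqs[a] ** 2 if a == b else 2.0 * freqs[a] * freqs[b]


def marker_moments(freqs):
    """Null mean and variance of the log genotype probability at one marker."""
    mean = 0.0
    second_moment = 0.0
    n_alleles = len(freqs)
    for a in range(n_alleles):
        for b in range(a, n_alleles):
            p = genotype_prob(freqs, a, b)
            log_p = math.log(p)
            mean += p * log_p
            second_moment += p * log_p * log_p
    return mean, second_moment - mean * mean


def gll_z_score(genotypes, marker_freqs):
    """Standardized genotype log likelihood for one individual.

    genotypes: list of (allele_index, allele_index) pairs, one per marker.
    marker_freqs: list of allele-frequency lists, one per marker, estimated
    from an independent reference sample.
    """
    total = 0.0
    expected = 0.0
    variance = 0.0
    for (a, b), freqs in zip(genotypes, marker_freqs):
        total += math.log(genotype_prob(freqs, a, b))
        mean, var = marker_moments(freqs)
        expected += mean
        variance += var
    # Sum of independent per-marker terms: approximately normal for many markers.
    return (total - expected) / math.sqrt(variance)


# Toy data: 100 biallelic markers with a rare second allele.
marker_freqs = [[0.9, 0.1]] * 100
typical = [(0, 0)] * 100   # common homozygote at every marker
unusual = [(1, 1)] * 100   # rare homozygote at every marker

z_typical = gll_z_score(typical, marker_freqs)
z_unusual = gll_z_score(unusual, marker_freqs)
```

Under these assumptions, a strongly negative score means the individual's genotypes are much less likely under the reference allele frequencies than expected, which is the kind of extreme value the abstract describes as indicating a different population of origin. A real analysis would also need to handle missing genotypes and alleles absent from the reference sample, which this sketch does not.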
