Measures of human population structure show heterogeneity among genomic regions

Abstract
Estimates of genetic population structure (FST) were constructed from all autosomes in two large SNP data sets. The Perlegen data set contains genotypes on ∼1 million SNPs segregating in all three samples of Americans of African, Asian, and European descent, and the Phase I HapMap data set contains genotypes on ∼0.6 million SNPs segregating in all four samples from specific Caucasian, Chinese, Japanese, and Yoruba populations. Substantial heterogeneity of FST values was found among segments within chromosomes, although the two data sets showed similar patterns. There was also substantial heterogeneity among population-specific FST values, with the relative sizes of these values often changing along each chromosome. Population-structure estimates are often used as indicators of natural selection, but the analyses presented here show that individual-marker estimates are too variable to be useful for that purpose. These statistics vary inherently because genealogies vary even among neutral loci, and values at pairs of loci are correlated to an extent that reflects the linkage disequilibrium between them. Furthermore, the best indications of selection may come from population-specific FST values rather than from the usually reported population-average values.
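The contrast between individual-marker and segment-level estimates can be illustrated with a minimal sketch. It assumes a simplified moment form of FST (the among-population variance of allele frequencies divided by p̄(1 − p̄), with no sample-size corrections), which is not necessarily the estimator used for the analyses summarized above. The function names and the allele frequencies are hypothetical and chosen only for illustration.

```python
def fst_components(freqs):
    """Return (numerator, denominator) of a simplified FST for one SNP.

    freqs holds the frequency of the same allele in each population.
    The numerator is the among-population variance of those frequencies
    and the denominator is pbar * (1 - pbar). Sampling corrections are
    ignored, which is an assumption of this sketch.
    """
    r = len(freqs)
    pbar = sum(freqs) / r
    num = sum((p - pbar) ** 2 for p in freqs) / r
    den = pbar * (1.0 - pbar)
    return num, den


def fst_single(freqs):
    """Single-marker estimate: ratio of the two components at one SNP."""
    num, den = fst_components(freqs)
    return num / den if den > 0 else float("nan")


def fst_window(snps):
    """Segment estimate: ratio of summed numerators to summed denominators."""
    nums, dens = zip(*(fst_components(f) for f in snps))
    total = sum(dens)
    return sum(nums) / total if total > 0 else float("nan")


# Hypothetical allele frequencies in three populations at four nearby SNPs.
window = [
    [0.10, 0.50, 0.90],
    [0.30, 0.30, 0.30],
    [0.20, 0.40, 0.60],
    [0.45, 0.50, 0.55],
]

per_snp = [fst_single(f) for f in window]
combined = fst_window(window)

print("per-SNP FST:", [round(x, 3) for x in per_snp])
print("window FST: ", round(combined, 3))
```

With these made-up frequencies the single-marker values range from about 0 to about 0.43, while the combined value for the segment is about 0.14. Summing numerators and denominators before dividing, rather than averaging the per-SNP ratios, is one common way to pool markers and is used here as a design choice of the sketch.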