A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations
Open Access
- 23 August 2012
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 8 (8), e1002886
- https://doi.org/10.1371/journal.pgen.1002886
Abstract
Multivariate statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been widely used to summarize the structure of human genetic variation, often in easily visualized two-dimensional maps. Many recent studies have reported similarity between geographic maps of population locations and MDS or PCA maps of genetic variation inferred from single-nucleotide polymorphisms (SNPs). However, this similarity has been evident primarily in a qualitative sense; and, because different multivariate techniques and marker sets have been used in different studies, it has not been possible to formally compare genetic variation datasets in terms of their levels of similarity with geography. In this study, using genome-wide SNP data from 128 populations worldwide, we perform a systematic analysis to quantitatively evaluate the similarity of genes and geography in different geographic regions. For each of a series of regions, we apply a Procrustes analysis approach to find an optimal transformation that maximizes the similarity between PCA maps of genetic variation and geographic maps of population locations. We consider examples in Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, as well as in a worldwide sample, finding that significant similarity between genes and geography exists in general at different geographic levels. The similarity is highest in our examples for Asia and, once highly distinctive populations have been removed, Sub-Saharan Africa. Our results provide a quantitative assessment of the geographic structure of human genetic variation worldwide, supporting the view that geography plays a strong role in giving rise to human population structure. The spatial pattern of human genetic variation provides a basis for investigating the history of human migrations. 
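The comparison described in the abstract can be illustrated with a minimal sketch. It assumes planar sample coordinates, synthetic genotypes, and a similarity score of sqrt(1 - D), where D is the minimized sum of squared distances after centering, scaling, and rotating one map onto the other. The function names, the simulated data, and the choice of score are illustrative assumptions rather than the study's own code or data.

```python
import numpy as np

def pca_map(genotypes, n_components=2):
    """Project samples onto the leading principal components (SVD of the centered matrix)."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def standardize(coords):
    """Center a configuration at the origin and scale it to unit Frobenius norm."""
    centered = coords - coords.mean(axis=0)
    return centered / np.linalg.norm(centered)

def procrustes_similarity(map_a, map_b):
    """Similarity sqrt(1 - D) between two 2-D maps after optimal rotation/reflection and scaling.

    For standardized maps, the minimized squared distance is D = 1 - (sum of the
    singular values of A^T B)^2, so the similarity equals that sum of singular values.
    """
    a, b = standardize(map_a), standardize(map_b)
    singular_values = np.linalg.svd(a.T @ b, compute_uv=False)
    return float(min(1.0, singular_values.sum()))

# Hypothetical data: 60 samples on a plane, 500 SNPs whose allele
# frequencies vary smoothly with location (an assumed toy model).
rng = np.random.default_rng(0)
n_samples, n_snps = 60, 500
geo = rng.uniform(-10.0, 10.0, size=(n_samples, 2))
weights = rng.normal(size=(2, n_snps))
freqs = 1.0 / (1.0 + np.exp(-0.15 * (geo @ weights)))
genotypes = rng.binomial(2, freqs).astype(float)

genetic_map = pca_map(genotypes)
similarity = procrustes_similarity(geo, genetic_map)
print(f"Procrustes similarity between genes and geography: {similarity:.3f}")
```

A score near 1 indicates that the genetic map is close to a rotated, rescaled copy of the geographic map; a score near 0 indicates little shared structure. Real latitude and longitude would need a suitable projection before being treated as planar coordinates.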
Statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been used to summarize spatial patterns of genetic variation, typically by placing individuals on a two-dimensional map in such a way that pairwise Euclidean distances between individuals on the map approximately reflect corresponding genetic relationships. Although similarity between these statistical maps of genetic variation and the geographic maps of sampling locations is often observed, it has not been assessed systematically across different parts of the world. In this study, we combine genome-wide SNP data from more than 100 populations worldwide to perform a formal comparison between genes and geography in different regions. By examining a worldwide sample and samples from Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, we find that significant similarity between genes and geography exists in general in different geographic regions and at different geographic levels. Surprisingly, the highest similarity is found in Asia, even though the geographic barrier of the Himalaya Mountains has created a discontinuity on the PCA map of genetic variation.
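The idea of placing individuals on a two-dimensional map so that Euclidean distances reflect pairwise relationships can be sketched with classical (Torgerson) MDS. This is one standard variant, chosen here for illustration; the summary does not say which MDS variant or distance measure any particular study used, and the function name and toy distance matrix below are assumptions.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Embed points so that Euclidean distances approximate the entries of `dist`."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to obtain an inner-product (Gram) matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues that arise from rounding or non-Euclidean input.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical example: distances computed from known 2-D points, so the
# embedding should reproduce them up to rotation, reflection, and translation.
rng = np.random.default_rng(1)
points = rng.normal(size=(20, 2))
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

embedding = classical_mds(dist)
recovered = np.linalg.norm(embedding[:, None, :] - embedding[None, :, :], axis=-1)
print("max distance error:", float(np.abs(recovered - dist).max()))
```

With genetic data, `dist` would be replaced by a matrix of pairwise genetic distances between individuals, in which case the two-dimensional map is only an approximation of those distances.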