Overview of techniques to account for confounding due to population stratification and cryptic relatedness in genomic data association analyses

14 July 2010

journal article
review article
Published by Springer Science and Business Media LLC in Heredity

Vol. 106 (4), 511-519
https://doi.org/10.1038/hdy.2010.91

Abstract

Population-based genomic association analyses are more powerful than within-family analyses. However, population stratification (unknown or ignored origin of individuals from multiple source populations) and cryptic relatedness (unknown or ignored covariance between individuals because of their relatedness) are confounding factors in population-based genomic association analyses, and both inflate the false-positive rate. As a consequence, false association signals may arise in genomic data association analyses for reasons other than true association between the tested genomic factor (marker genotype, gene or protein expression) and the study phenotype. It is therefore important to correct or account for these confounders in population-based genomic data association analyses. The common correction techniques for population stratification and cryptic relatedness are presented here in the phenotype–marker association analysis context, and comments on their suitability for other types of genomic association analyses (for example, phenotype–expression association) are also provided. Although many of these techniques were originally developed in the context of human genetics, most of them are also applicable to model organisms and breeding populations.
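
The inflation described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: every function name, parameter value and the simulation design are assumptions chosen for illustration. It simulates two subpopulations that differ in allele frequencies and in mean phenotype, with no marker truly affecting the phenotype, and then compares a naive per-marker test with one that adjusts for subpopulation membership. Inflation is summarised by the ratio of the median test statistic to the median of a chi-square distribution with 1 d.f. (the usual genomic control inflation factor). Treating subpopulation labels as known is a simplification; in real data they are typically unknown and must be inferred.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.


def simulate(n_per_pop=500, n_markers=2000, seed=0):
    """Two subpopulations, null markers only (illustrative settings)."""
    rng = np.random.default_rng(seed)
    pop = np.repeat([0.0, 1.0], n_per_pop)  # subpopulation label per individual
    base = rng.uniform(0.2, 0.8, n_markers)  # ancestral allele frequencies
    freq_a = np.clip(base + rng.normal(0, 0.1, n_markers), 0.05, 0.95)
    freq_b = np.clip(base + rng.normal(0, 0.1, n_markers), 0.05, 0.95)
    geno = np.vstack([
        rng.binomial(2, freq_a, size=(n_per_pop, n_markers)),
        rng.binomial(2, freq_b, size=(n_per_pop, n_markers)),
    ]).astype(float)
    # Phenotype depends on subpopulation only, never on any marker.
    pheno = 1.0 * pop + rng.normal(0, 1, 2 * n_per_pop)
    return pop, geno, pheno


def association_stats(geno, pheno, covariates=None):
    """Per-marker F(1, df) statistics from (partial) correlations.

    Genotypes and phenotype are residualised on an intercept and the
    optional covariates before correlating them.
    """
    n = len(pheno)
    cols = [np.ones(n)] if covariates is None else [np.ones(n), covariates]
    design = np.column_stack(cols)
    g = geno - design @ np.linalg.lstsq(design, geno, rcond=None)[0]
    y = pheno - design @ np.linalg.lstsq(design, pheno, rcond=None)[0]
    r = (g * y[:, None]).sum(axis=0) / np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    df = n - design.shape[1] - 1
    return df * r ** 2 / (1 - r ** 2)  # approximately chi-square(1) for large n


def inflation_factor(stats):
    """Median test statistic relative to its expectation under the null."""
    return np.median(stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    pop, geno, pheno = simulate()
    print("lambda, naive:   ", inflation_factor(association_stats(geno, pheno)))
    print("lambda, adjusted:", inflation_factor(association_stats(geno, pheno, pop)))
```

Under these settings the naive inflation factor should be well above 1, while the adjusted one should be close to 1. The exact values depend on the seed and on the assumed effect sizes.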


Cited by 70 articles