Self-reported ethnicity, genetic structure and the impact of population stratification in a multiethnic study
- 25 May 2010
- journal article
- research article
- Published by Springer Science and Business Media LLC in Human Genetics
- Vol. 128 (2), 165-177
- https://doi.org/10.1007/s00439-010-0841-4
Abstract
It is well-known that population substructure may lead to confounding in case–control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of five ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed four important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case–control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.
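The idea of correcting stratification with top PCs can be illustrated with a minimal sketch on simulated data. This is not the authors' pipeline: the two-population setup, the allele frequencies, the sample sizes, and the helper names `top_pcs` and `snp_effect` are all assumptions made for illustration. The outcome depends only on population membership, so any association with a SNP is confounding, and adjusting for the top PCs should shrink it.

```python
import numpy as np

def top_pcs(genotypes, k=2):
    """Top-k PC scores of an (individuals x SNPs) matrix of 0/1/2 genotypes."""
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)                 # center each SNP
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]          # drop monomorphic SNPs, scale the rest
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :k] * s[:k]

def snp_effect(y, snp, covariates=None):
    """Least-squares slope of y on snp, optionally adjusted for covariates."""
    cols = [np.ones(len(y)), snp]
    if covariates is not None:
        cols.append(covariates)
    x = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    return beta[1]

rng = np.random.default_rng(0)
n_per_pop, n_snps = 300, 500               # hypothetical sizes
pop = np.repeat([0, 1], n_per_pop)

# Hypothetical allele frequencies that drift between two populations.
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.15, n_snps), 0.05, 0.95)
freqs = np.where(pop[:, None] == 0, freq_a, freq_b)
genotypes = rng.binomial(2, freqs)

# The outcome depends on population only; no SNP is causal.
y = 2.0 * pop + rng.normal(0, 1, len(pop))

pcs = top_pcs(genotypes, k=2)

# Test the SNP whose frequency differs most between the populations.
test_idx = int(np.argmax(np.abs(freq_a - freq_b)))
snp = genotypes[:, test_idx].astype(float)

crude = snp_effect(y, snp)                 # confounded by population
adjusted = snp_effect(y, snp, pcs)         # adjusted for the top two PCs

print(f"crude slope:    {crude:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

In this toy setting the first PC tracks population membership, which is why including it as a covariate removes most of the spurious signal. Passing the population label itself as the covariate would play the role of adjusting for self-reported race/ethnicity.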