Genomic selection in admixed and crossbred populations

1 January 2010

journal article
Published by Oxford University Press (OUP) in Journal of Animal Science

Vol. 88 (1), 32-46
https://doi.org/10.2527/jas.2009-1975

Abstract

In livestock, genomic selection (GS) has primarily been investigated by simulation of purebred populations. Traits of interest are, however, often measured in crossbred or mixed populations with uncertain breed composition. If such data are used as the training data for GS without accounting for breed composition, estimates of marker effects may be biased due to population stratification and admixture. To investigate this, a genome of 100 cM was simulated with varying marker densities (5 to 40 segregating markers per cM). After 1,000 generations of random mating in a population of effective size 500, 4 lines with effective size 100 were isolated and mated for another 50 generations to create 4 pure breeds. These breeds were used to generate combined, F₁, F₂, 3- and 4-way crosses, and admixed training data sets of 1,000 individuals with phenotypes for an additive trait controlled by 100 segregating QTL and heritability of 0.30. The validation data set was a sample of 1,000 genotyped individuals from one pure breed. Method Bayes-B was used to simultaneously estimate the effects of all markers for breeding value estimation. With 5 (40) markers per cM, the correlation of true with estimated breeding value of selection candidates (accuracy) was greatest, 0.79 (0.85), when data from the same pure breed were used for training. When the training data set consisted of crossbreds, the accuracy ranged from 0.66 (0.79) to 0.74 (0.83) for the 2 marker densities, respectively. The admixed training data set resulted in nearly the same accuracies as when training was in the breed to which selection candidates belonged. However, accuracy was greatly reduced when genes from the target pure breed were not included in the admixed or crossbred population. 
This implies that, with high-density markers, admixed and crossbred populations can be used to develop GS prediction equations for all pure breeds that contributed to the population, without a substantial loss of accuracy compared with training on purebred data, even if breed origin has not been explicitly taken into account. In addition, using GS based on high-density marker data, purebreds can be accurately selected for crossbred performance without the need for pedigree or breed information. Results also showed that haplotype segments with strong linkage disequilibrium are shorter in crossbred and admixed populations than in purebreds, providing opportunities for QTL fine mapping. Copyright © 2010. American Society of Animal Science.
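As a rough illustration of two quantities the abstract relies on, the sketch below simulates an additive trait controlled by 100 QTL with heritability 0.30 in 1,000 individuals and computes accuracy as the correlation of true with estimated breeding values. It is not the authors' simulation: the QTL genotypes are drawn independently (no linkage disequilibrium, breeds, or population history), the allele frequencies and effect distribution are arbitrary assumptions, and Bayes-B is not implemented. The phenotype itself stands in as a naive "estimate", for which the expected accuracy is the square root of the heritability (about 0.55). All function and variable names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; 'accuracy' in the abstract is this correlation
    between true and estimated breeding values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_phenotypes(genotypes, qtl_effects, h2, rng):
    """Return (true breeding values, phenotypes) for an additive trait.

    genotypes: per-individual lists of allele counts (0, 1, or 2) at each QTL.
    The residual variance is scaled so that var(TBV) / var(phenotype) = h2.
    """
    tbv = [sum(g * a for g, a in zip(ind, qtl_effects)) for ind in genotypes]
    mean = sum(tbv) / len(tbv)
    var_g = sum((t - mean) ** 2 for t in tbv) / (len(tbv) - 1)
    sd_e = math.sqrt(var_g * (1.0 - h2) / h2)
    phen = [t + rng.gauss(0.0, sd_e) for t in tbv]
    return tbv, phen


rng = random.Random(1)
n_ind, n_qtl, h2 = 1000, 100, 0.30  # sizes and heritability from the abstract

# Assumption: arbitrary allele frequencies and normally distributed effects.
freqs = [rng.uniform(0.1, 0.9) for _ in range(n_qtl)]
effects = [rng.gauss(0.0, 1.0) for _ in range(n_qtl)]

# Assumption: independent loci, i.e. two independent allele draws per QTL.
genotypes = [
    [(rng.random() < p) + (rng.random() < p) for p in freqs]
    for _ in range(n_ind)
]

tbv, phen = simulate_phenotypes(genotypes, effects, h2, rng)

# Accuracy of the phenotype as a naive predictor of breeding value.
accuracy = pearson(tbv, phen)
print(f"accuracy of phenotype as predictor: {accuracy:.2f}")
```

A marker-based predictor such as Bayes-B would replace `phen` in the final correlation with breeding values estimated from marker effects in a separate validation sample, which is how the accuracies of 0.66 to 0.85 reported above are defined.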
