Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers
Open Access
- 22 October 2010
- journal article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 11 (1), 529
- https://doi.org/10.1186/1471-2105-11-529
Abstract
The information provided by dense genome-wide markers using high-throughput technology has considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often, SNP lie in genomic regions that contribute no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However, a full Bayesian analysis is often not feasible because of the long computation time involved. This article proposes an expectation-maximization (EM) algorithm called emBayesB, which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP, along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis, but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL that explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time.
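The abstract does not spell out the emBayesB model, so the following is only a minimal sketch of the general idea it names: an EM loop over a two-component mixture that returns a posterior probability for each SNP and an estimate of one mixture hyperparameter. It is not the authors' algorithm. The Gaussian components, the fixed and known variances, the use of per-SNP effect estimates as input, and every name in the code (`em_mixture`, `noise_var`, `slab_var`, `gamma`) are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, var):
    """Density at x of a zero-mean normal distribution with variance `var`."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def em_mixture(estimates, noise_var, slab_var, gamma=0.5, n_iter=200, tol=1e-8):
    """Toy EM for a two-component mixture on noisy per-SNP effect estimates.

    Assumed model (illustrative, not taken from the article):
      null SNP:   estimate ~ N(0, noise_var)
      effect SNP: estimate ~ N(0, noise_var + slab_var)
    Only the mixing proportion `gamma` is estimated; variances are held fixed.
    Returns the estimated gamma and the posterior probabilities from the
    last E-step.
    """
    post = [gamma] * len(estimates)
    for _ in range(n_iter):
        # E-step: posterior probability that each SNP belongs to the
        # effect component, given the current gamma.
        post = []
        for b in estimates:
            p_effect = gamma * normal_pdf(b, noise_var + slab_var)
            p_null = (1.0 - gamma) * normal_pdf(b, noise_var)
            post.append(p_effect / (p_effect + p_null))
        # M-step: the updated mixing proportion is the mean posterior probability.
        new_gamma = sum(post) / len(post)
        converged = abs(new_gamma - gamma) < tol
        gamma = new_gamma
        if converged:
            break
    return gamma, post


# Simulated data: 1000 SNP, of which the first 50 carry a nonzero effect.
random.seed(1)
n_snp, n_causal = 1000, 50
noise_var, slab_var = 0.04, 1.0
true_effects = [
    random.gauss(0.0, math.sqrt(slab_var)) if i < n_causal else 0.0
    for i in range(n_snp)
]
estimates = [b + random.gauss(0.0, math.sqrt(noise_var)) for b in true_effects]

gamma_hat, post = em_mixture(estimates, noise_var, slab_var)
print("estimated proportion of SNP with an effect:", round(gamma_hat, 3))
```

In this toy setting the E-step plays the role of the per-SNP posterior probability mentioned in the abstract, and the M-step plays the role of a hyperparameter update. A fuller treatment would fit all SNP jointly against the phenotypes and would also estimate the variance parameters.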