Estimating genome-wide IBD sharing from SNP data via an efficient hidden Markov model of LD with application to gene mapping

Open Access

1 June 2010

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 26 (12), i175-i182
https://doi.org/10.1093/bioinformatics/btq204

Abstract

Motivation: Association analysis is the method of choice for studying complex multifactorial diseases. The premise of this method is that affected persons share some common genomic regions with similar SNP alleles, and that such areas will be found by the analysis. An important disadvantage of genome-wide association (GWA) studies is that they do not distinguish between genomic areas that are inherited from a common ancestor [identical by descent (IBD)] and areas that are identical merely by state [identical by state (IBS)]. Clearly, areas that can be marked with higher probability as IBD, and that have the same correlation with disease status as identical areas that are more probably only IBS, are better candidates to be causative, and yet this distinction is not encoded in standard association analysis.

Results: We develop a factorial hidden Markov model-based algorithm for computing genome-wide IBD sharing. The algorithm accepts as input SNP data of measured individuals and estimates the probability of IBD at each locus for every pair of individuals. For two g-degree relatives with g ≥ 8, the computation yields a precision of IBD tagging more than 50% higher than that of previous methods at 95% recall. Our algorithm uses a first-order Markovian model for the linkage disequilibrium process and reduces the state space of the inheritance vector from exponential in g to quadratic. The higher accuracy, along with the reduced time complexity, marks our method as a feasible means for IBD mapping in practical scenarios.

Availability: A software implementation, called IBDMAP, is freely available at http://bioinfo.cs.technion.ac.il/IBDmap.

Contact: sberco@gmail.com
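
As a rough illustration of what "estimating the probability of IBD at each locus for a pair of individuals" can look like, the following is a minimal sketch of a two-state hidden Markov model (non-IBD vs. IBD) solved with the forward-backward algorithm. It is not the factorial HMM described in the abstract: it ignores linkage disequilibrium and the inheritance vector entirely. The function name `ibd_posterior` and all parameter values (`p_stay`, `p_match_ibd`, `p_match_non`, `prior_ibd`) are assumptions chosen for the example, not values from the paper or from IBDMAP.

```python
def ibd_posterior(ibs, p_stay=0.99, p_match_ibd=0.98, p_match_non=0.6, prior_ibd=0.1):
    """Posterior P(IBD) at each locus for one pair of individuals.

    ibs: list of 0/1 flags, 1 if the pair's alleles match (IBS) at that locus.
    All probabilities are illustrative assumptions, not estimates from data.
    State 0 = non-IBD, state 1 = IBD.
    """
    n = len(ibs)
    if n == 0:
        return []

    # trans[i][j] = P(state j at locus t+1 | state i at locus t)
    trans = [[p_stay, 1 - p_stay], [1 - p_stay, p_stay]]
    # emit[state][obs] = P(obs | state); obs 1 means the alleles match
    emit = [[1 - p_match_non, p_match_non], [1 - p_match_ibd, p_match_ibd]]
    init = [1 - prior_ibd, prior_ibd]

    def normalize(v):
        total = v[0] + v[1]
        return [v[0] / total, v[1] / total]

    # Forward pass, rescaled at every locus to avoid numerical underflow.
    fwd = [normalize([init[s] * emit[s][ibs[0]] for s in (0, 1)])]
    for t in range(1, n):
        prev = fwd[-1]
        fwd.append(normalize([
            (prev[0] * trans[0][s] + prev[1] * trans[1][s]) * emit[s][ibs[t]]
            for s in (0, 1)
        ]))

    # Backward pass, also rescaled; the last locus starts at [1, 1].
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        nxt = bwd[t + 1]
        bwd[t] = normalize([
            sum(trans[s][k] * emit[k][ibs[t + 1]] * nxt[k] for k in (0, 1))
            for s in (0, 1)
        ])

    # Combine both passes and renormalize per locus; return P(IBD).
    return [normalize([f[0] * b[0], f[1] * b[1]])[1] for f, b in zip(fwd, bwd)]


# Toy input: 50 matching loci followed by 50 loci with frequent mismatches.
toy_ibs = [1] * 50 + [0, 1, 0, 0, 1, 0, 1, 0, 0, 1] * 5
toy_post = ibd_posterior(toy_ibs)
```

On the toy input, the posterior should be high inside the long run of matching loci and low in the mismatch-heavy stretch. Thresholding such posteriors is one way a per-locus IBD call could be made, from which precision and recall of IBD tagging can be measured.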

Cited by 23 articles