Inference and analysis of haplotypes from combined genotyping studies deposited in dbSNP
Open Access
- 26 October 2005
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 15 (11), 1594-1600
- https://doi.org/10.1101/gr.4297805
Abstract
In the attempt to understand human variation and the genetic basis of complex disease, a tremendous number of single nucleotide polymorphisms (SNPs) have been discovered and deposited into NCBI's dbSNP public database. More than 2.7 million SNPs in the database have genotype information. This data provides an invaluable resource for understanding the structure of human variation and the design of genetic association studies. The genotypes deposited to dbSNP are unphased, and thus, the haplotype information is unknown. We applied the phasing method HAP to obtain the haplotype information, block partitions, and tag SNPs for all publicly available genotype data and deposited this information into the dbSNP database. We also deposited the orthologous chimpanzee reference sequence for each predicted haplotype block computed using the UCSC BLASTZ alignments of human and chimpanzee. Using dbSNP, researchers can now easily perform analyses using multiple genotype data sets from the same genomic regions. Dense and sparse genotype data sets from the same region were combined to show that the number of common haplotypes is significantly underestimated in whole genome data sets, while the predicted haplotypes over the common SNPs are consistent between studies. To validate the accuracy of the predictions, we benchmarked HAP's running time and phasing accuracy against PHASE. Although HAP is slightly less accurate than PHASE, HAP is over 1000 times faster than PHASE, making it suitable for application to the entire set of genotypes in dbSNP.
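To illustrate why unphased genotypes leave the haplotype information unknown, the following minimal sketch enumerates every pair of haplotypes consistent with one individual's genotype. It is not the HAP or PHASE method, which use population-level information to choose among such candidates. The 0/1/2 genotype encoding and the function name `compatible_haplotype_pairs` are assumptions made for this illustration.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    Assumed encoding, one integer per SNP:
      0 = homozygous for allele 0, 1 = heterozygous, 2 = homozygous for allele 1.
    """
    n_het = sum(1 for g in genotype if g == 1)
    pairs = []
    # Fix the first heterozygous site to allele 0 on the first haplotype so that
    # (h1, h2) and (h2, h1) are not both reported; the remaining heterozygous
    # sites are free, giving 2 ** (n_het - 1) candidate pairs.
    for free_bits in product((0, 1), repeat=max(n_het - 1, 0)):
        het_alleles = iter((0,) + free_bits) if n_het else iter(())
        h1, h2 = [], []
        for g in genotype:
            if g == 1:
                a = next(het_alleles)
                h1.append(a)
                h2.append(1 - a)
            else:
                # Homozygous site: both haplotypes carry the same allele.
                h1.append(g // 2)
                h2.append(g // 2)
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in compatible_haplotype_pairs([1, 0, 1, 2]):
        print(h1, h2)
```

A genotype with k heterozygous sites admits 2^(k-1) such pairs when k is at least 1, so the ambiguity grows quickly with the number of SNPs. This is why a phasing method is needed to select the most plausible pair.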