Effects of reduced panel, reference origin, and genetic relationship on imputation of genotypes in Hereford cattle

Abstract
The objective of this study was to investigate alternative methods of designing and using reduced SNP panels for imputing SNP genotypes. Two purebred Hereford populations were used: an experimental population known as Line 1 Hereford (L1, n = 240) and Hereford registered with the American Hereford Association (AHA, n = 311). Using different reference samples of 62 to 311 animals with 39,497 SNP on 29 autosomes and study samples of 57 or 62 animals for which genotypes were available for ∼2,600 SNP (reduced panels), imputations were performed to predict the other ∼36,900 loci that had been masked. An imputation package, including LinkPHASE and DAGPHASE, was used for imputation. Four reduced panels differing in minor allele frequency (MAF) and marker spacing were evaluated. Reduced panels included every 15th SNP across the genome (SNP_space), the commercial Illumina Bovine3K BeadChip (SNP_3K), SNP with the highest MAF (SNP_MAF), and SNP with high MAF that were also evenly spaced across the genome (SNP_MS). Imputation accuracy was defined as the correlation between imputed and real genotypes. Reference samples were either from L1 or AHA. Among animals with genotypes, genetic relationships were estimated based on molecular marker genotypes or pedigree. Reduced panel design, number of animals in the reference sample, reference origin, and genetic relationship between animals in the reference and study samples all affected imputation accuracy (P < 0.001). Across genotyping schemes, imputed genotypes from SNP_MS had the greatest accuracy. A 0.1 increase in average pedigree relationship or average molecular relationship between reference and study samples increased imputation accuracy by 10 to 20%. Using reference samples from the L1 population resulted in lower imputation accuracy than using reference samples from the admixed AHA population (P < 0.001).
Increasing the number of animals in the reference panel by 100 individuals increased imputation accuracy by 8% when pedigree relationship was used as a covariate and 6% when molecular relationship was used as a covariate. We concluded that imputation accuracy would be increased through optimization of reduced panel design and genotyping strategy. Copyright © 2012. American Society of Animal Science.
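As an illustration of two quantities used in the abstract, the following minimal Python sketch selects every 15th SNP to form an evenly spaced reduced panel and computes imputation accuracy as the Pearson correlation between imputed and real genotypes. The function names, the 0/1/2 genotype coding, and the selection of SNP across one genome-wide index (rather than within each chromosome) are assumptions made for illustration; this is not the LinkPHASE/DAGPHASE workflow used in the study.

```python
from math import sqrt


def every_nth_snp(n_snp, step=15):
    """Return indices of SNP kept in an evenly spaced reduced panel.

    Hypothetical helper: keeps every `step`-th SNP from a single
    genome-wide ordering, starting at index 0.
    """
    return list(range(0, n_snp, step))


def imputation_accuracy(imputed, real):
    """Pearson correlation between imputed and real genotype codes.

    Genotypes are assumed to be coded as allele counts (0, 1, 2);
    imputed values may also be fractional dosages.
    """
    n = len(real)
    if n != len(imputed) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mean_i = sum(imputed) / n
    mean_r = sum(real) / n
    cov = sum((a - mean_i) * (b - mean_r) for a, b in zip(imputed, real))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_r = sum((b - mean_r) ** 2 for b in real)
    if var_i == 0 or var_r == 0:
        raise ValueError("correlation is undefined when a vector is constant")
    return cov / sqrt(var_i * var_r)


# Panel size implied by taking every 15th of 39,497 SNP.
panel = every_nth_snp(39497, step=15)
n_masked = 39497 - len(panel)
print(len(panel), n_masked)  # 2634 36863

# Toy genotypes (made up for illustration, not data from the study).
real = [0, 1, 2, 1, 0, 2]
imputed = [0, 1, 2, 1, 0, 1]  # one genotype imputed incorrectly
print(round(imputation_accuracy(imputed, real), 3))
```

Under these assumptions, every 15th SNP gives a panel of 2,634 markers with 36,863 masked loci, consistent with the approximate counts of ∼2,600 and ∼36,900 reported above.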