Prediction of unobserved single nucleotide polymorphism genotypes of Jersey cattle using reference panels and population-based imputation algorithms

Abstract
The availability of dense single nucleotide polymorphism (SNP) genotypes for dairy cattle has created exciting research opportunities and revolutionized practical breeding programs. Broader application of this technology will lead to situations in which genotypes from different low-, medium-, or high-density platforms must be combined. In this case, missing SNP genotypes can be imputed using family- or population-based algorithms. Our objective was to evaluate the accuracy of imputation in Jersey cattle, using reference panels comprising 2,542 animals with 43,385 SNP genotypes and study samples of 604 animals for which genotypes were available for 1, 2, 5, 10, 20, 40, or 80% of loci. Two population-based algorithms, fastPHASE 1.2 (P. Scheet and M. Stevens; University of Washington TechTransfer Digital Ventures Program, Seattle, WA) and IMPUTE 2.0 (B. Howie and J. Marchini; Department of Statistics, University of Oxford, UK), were used to impute genotypes on Bos taurus autosomes 1, 15, and 28. The mean proportion of genotypes imputed correctly ranged from 0.659 to 0.801 when 1 to 2% of genotypes were available in the study samples, from 0.733 to 0.964 when 5 to 20% of genotypes were available, and from 0.896 to 0.995 when 40 to 80% of genotypes were available. In the absence of pedigrees or genotypes of close relatives, the accuracy of imputation may be modest (generally […] 40,000 SNP) from a reference population. Accurate imputation of high-density genotypes from inexpensive low- or medium-density platforms could greatly enhance the efficiency of whole-genome selection programs in dairy cattle.
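The evaluation design described in the abstract (retain a fixed percentage of loci in the study sample, impute the rest, and report the proportion of genotypes imputed correctly) can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1, or 2 copies of an allele and assumes accuracy is counted per genotype over the masked loci only; the abstract does not spell out either detail. The function names `mask_loci` and `proportion_correct` are hypothetical, and the sketch does not call fastPHASE or IMPUTE; the imputed genotypes are taken as given.

```python
import random


def mask_loci(n_loci, fraction_kept, seed=0):
    """Return sorted indices of loci retained on a simulated lower-density panel.

    Random selection of loci is an assumption made for illustration.
    """
    rng = random.Random(seed)
    n_kept = max(1, round(n_loci * fraction_kept))
    return sorted(rng.sample(range(n_loci), n_kept))


def proportion_correct(true_genotypes, imputed_genotypes, kept_loci):
    """Proportion of masked genotypes imputed correctly.

    Each argument row is one animal; each entry is a genotype code (0, 1, 2).
    Loci listed in kept_loci were observed, so they are excluded from the count.
    """
    kept = set(kept_loci)
    correct = 0
    total = 0
    for true_row, imputed_row in zip(true_genotypes, imputed_genotypes):
        for locus, (t, g) in enumerate(zip(true_row, imputed_row)):
            if locus in kept:
                continue  # observed genotype, not imputed
            total += 1
            correct += int(t == g)
    return correct / total if total else float("nan")


# Toy example: 2 animals, 4 loci, locus 0 observed and loci 1 to 3 imputed.
true_toy = [[0, 1, 2, 1], [2, 2, 0, 1]]
imputed_toy = [[0, 1, 1, 1], [2, 0, 0, 1]]
toy_accuracy = proportion_correct(true_toy, imputed_toy, kept_loci=[0])
```

In the toy example, four of the six masked genotypes match, so `toy_accuracy` is 4/6. Repeating the calculation for each retained fraction (1, 2, 5, 10, 20, 40, and 80% of loci) would give one accuracy value per density scenario.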
Funding Information
  • National Research Initiative Competitive Grant no. 2009-35205-05099