Efficient and accurate KIR and HLA genotyping with massively parallel sequencing data

Abstract
Killer immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes play important roles in innate and adaptive immunity. They are highly polymorphic and cannot be genotyped with standard variant-calling pipelines. Compared with HLA genes, many KIR genes are highly similar to one another in sequence and may be absent from a given chromosome. Therefore, although many tools have been developed to genotype HLA genes from common sequencing data, none of them works for KIR genes, and even specialized KIR genotypers cannot resolve all KIR genes. Here we describe T1K, a novel computational method for the efficient and accurate inference of KIR or HLA alleles from RNA-seq, whole-genome sequencing, or whole-exome sequencing data. T1K jointly considers alleles across all genotyped genes, so it can reliably identify which genes are present and distinguish homologous genes, including the challenging KIR2DL5A/KIR2DL5B pair. This joint model also benefits HLA genotyping, where T1K achieves the highest accuracy in benchmarks. Moreover, T1K can call novel single-nucleotide variants and process single-cell data. Applying T1K to tumor single-cell RNA-seq data, we found that KIR2DL4 expression was enriched in tumor-specific CD8+ T cells. T1K may open opportunities for HLA and KIR genotyping across a variety of sequencing applications.
Funding Information
  • National Cancer Institute (1R01CA245318, 1R01CA258524)
  • National Institutes of Health (R01HG010040, U01HG010961, R01HG011139, U01CA226196)
  • National Institute of General Medical Sciences (P20GM130454)