High-Throughput Prediction of MHC Class I and II Neoantigens with MHCnuggets

1 March 2020

journal article
research article
Published by American Association for Cancer Research (AACR) in Cancer Immunology Research

Vol. 8 (3), 396-408
https://doi.org/10.1158/2326-6066.cir-19-0464

Abstract

Computational prediction of binding between neoantigen peptides and major histocompatibility complex (MHC) proteins can be used to predict patient response to cancer immunotherapy. Current neoantigen predictors focus on in silico estimation of MHC binding affinity and are limited by low predictive value for actual peptide presentation, inadequate support for rare MHC alleles, and poor scalability to high-throughput data sets. To address these limitations, we developed MHCnuggets, a deep neural network method that predicts peptide–MHC binding. MHCnuggets can predict binding for common or rare alleles of MHC class I or II with a single neural network architecture. Using a long short-term memory network (LSTM), MHCnuggets accepts peptides of variable length and is faster than other methods. When compared with methods that integrate binding affinity and MHC-bound peptide (HLAp) data from mass spectrometry, MHCnuggets yields a 4-fold increase in positive predictive value on independent HLAp data. We applied MHCnuggets to 26 cancer types in The Cancer Genome Atlas, processing 26.3 million allele–peptide comparisons in under 2.3 hours, yielding 101,326 unique predicted immunogenic missense mutations (IMM). Predicted IMM hotspots occurred in 38 genes, including 24 driver genes. Predicted IMM load was significantly associated with increased immune cell infiltration (P < 2 × 10⁻¹⁶), including CD8⁺ T cells. Only 0.16% of predicted IMMs were observed in more than 2 patients, with 61.7% of these derived from driver mutations. Thus, we describe a method for neoantigen prediction and its performance characteristics and demonstrate its utility in data sets representing multiple human cancers.
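
The abstract reports a gain in positive predictive value (PPV) but does not say how PPV was computed. As a rough illustration only, the sketch below uses one common convention: rank peptides by predicted score and take the fraction of the top n that are truly presented. The function name, the top-n cutoff, and the toy scores and labels are all hypothetical; none of them come from MHCnuggets or from the paper's evaluation.

```python
def positive_predictive_value(scores, labels, top_n):
    """Fraction of the top_n highest-scoring peptides that are true positives.

    scores: predicted binding/presentation scores (higher = more likely).
    labels: 1 if the peptide was observed as presented, else 0.
    top_n:  number of top-ranked peptides to evaluate (an assumed cutoff).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if top_n <= 0 or top_n > len(scores):
        raise ValueError("top_n must be between 1 and the number of peptides")
    # Rank peptide indices from highest to lowest predicted score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Count true positives among the top_n ranked peptides.
    hits = sum(labels[i] for i in ranked[:top_n])
    return hits / top_n


# Hypothetical toy data, not taken from the paper.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
labels = [1, 0, 1, 0, 0, 1]
ppv_top3 = positive_predictive_value(scores, labels, top_n=3)
```

Under this convention, a fold increase in PPV between two predictors would be the ratio of their PPV values computed on the same peptides with the same cutoff.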

Funding Information

NIH (CA121113, CA006973, CA180950)

Cited by 102 articles