Cancer vaccines targeting mutated tumor proteins are an emerging form of personalized medicine. In a number of clinical trials evaluating these therapies, including several at our institution, each patient's vaccine is individually formulated based on the unique mutations present in his or her tumor. As experimentally testing vaccine immunogenicity is infeasible in this setting, these therapies rely on computational prediction of vaccine immunogenicity. In this talk I will discuss recent work to accurately predict T cell epitopes, focusing on the development of the MHCflurry software package for CD8+ T cell epitope prediction (https://github.com/openvax/mhcflurry). I will also touch on other, less studied modeling tasks that may help improve cancer vaccines in the future.
Therapeutic Cancer Vaccines: Where Predictive Models Matter
1. Therapeutic Cancer Vaccines: Where Predictive Models Matter
Tim O’Donnell
NEC Laboratories Europe GmbH
Oct 15, 2021
2. The OpenVax Project
● Clinical Trials: help run personalized cancer vaccine trials at Mount Sinai
● Software: open source tools for cancer genomics and neoantigen prediction
○ www.github.com/openvax/
● Research: improve methods for predicting the immune response to tumor antigens
Tim O’Donnell (Mount Sinai)
Julia Kodysh (Mount Sinai)
Alex Rubinsteyn (UNC Chapel Hill)
Nina Bhardwaj (Mount Sinai)
9. Short history of cancer immunotherapy
Alex Rubinsteyn
● 1850s-1890s: Infection & fever => tumor regression?
● 1893: Coley’s Toxins (complete response in ~10% of sarcomas)
● 20th century: Dark Age of radiation and chemotherapy
● 2010s: ~20 approved cancer immunotherapies
10. Cancer immunotherapy
Alex Rubinsteyn
Checkpoint blockade
Disinhibit T-cells. Antigens responsible for tumor clearance typically unknown.
Success stories:
● αCTLA-4 (ipi)
● αPD-1 (pembro, nivo, cemi)
● αPD-L1 (atezo, ave, durva)
Cellular therapies
Ex-vivo expansion of patient T-cells after receptor engineering and/or selection.
Success stories:
● CAR T-cells for B-cell malignancies (CD19, CD20, CD22, BCMA)
Vaccines
Therapeutic vaccines against specific tumor antigens, including patient-specific mutated tumor antigens.
Success stories:
● ???
● Hints of efficacy in neoantigen vaccine trials
11. Anti-PD1 vs. chemotherapy in metastatic melanoma
Robert et al. Nivolumab in Previously Untreated Melanoma without BRAF Mutation. NEJM 2014
13. Combination checkpoint blockade in melanoma
Wolchok et al. Overall Survival with Combined Nivolumab and Ipilimumab in Advanced Melanoma. NEJM 2017
Additional therapies needed
14. Cancer vaccines
● Elicit a new T cell response against tumor antigens
● May also reinvigorate pre-existing exhausted T cells
● Once tumor killing gets going, more T cells can come along
Hu Z, Ott P, Wu C. Nature Reviews Immunology 2017
16. First generation shared antigen vaccines unsuccessful
Rosenberg et al. Cancer immunotherapy: moving beyond current vaccines. Nature Medicine 2004
30. MHC binding restricts the space of possible epitopes
Only about 5% of peptides bind to MHC strongly enough to be presented.
31. Peptide binding to MHC can be measured in vitro
The binding preferences for hundreds of MHC alleles have been characterized using in vitro affinity measurements.
32. Mass spec is another source of MHC binding data
Purcell et al. Nature Protocols 2019
40. Can we do better?
Larger training datasets enable more sophisticated models
1. MHC binding prediction
2. Antigen processing prediction
42. Motivation for peptide encoding
Binding predictor input encoding
O’Donnell et al. Cell Systems 2020
HLA-A*02:01 binding the 15-mer peptide FLNKDLEVDGHFVTM
HLA-A*02:01 binding the 9-mer peptide LLFGYPVYV
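Both peptides bind the same allele despite their different lengths, which suggests an encoding that keeps the two termini at fixed positions and lets the middle vary. A minimal sketch of one such scheme follows. The function name, the "X" padding character, the maximum length of 15, and the four-residue anchors are illustrative assumptions, not necessarily the encoding used in MHCflurry.

```python
def encode_fixed_length(peptide, max_length=15, pad="X", anchor=4):
    """Pad a peptide in the middle so both termini stay at fixed positions.

    The first `anchor` and last `anchor` residues keep their place and the
    padding goes between them. This layout is an illustrative assumption.
    """
    if not 2 * anchor <= len(peptide) <= max_length:
        raise ValueError("unsupported peptide length: %d" % len(peptide))
    n_pad = max_length - len(peptide)
    left = peptide[:anchor]
    middle = peptide[anchor:len(peptide) - anchor]
    right = peptide[len(peptide) - anchor:]
    # Put the padding at the centre of the variable middle region.
    half = (len(middle) + 1) // 2
    return left + middle[:half] + pad * n_pad + middle[half:] + right


encoded_9mer = encode_fixed_length("LLFGYPVYV")
encoded_15mer = encode_fixed_length("FLNKDLEVDGHFVTM")
print(encoded_9mer)
print(encoded_15mer)
```

With this layout the 9-mer and the 15-mer occupy the same 15 positions, so a single network can take either as input.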
43. Binding predictor architecture
O’Donnell et al. Cell Systems 2020
Tricks
● Training loss: MSE with inequalities
● Pretrain on synthetic measurements from allele-specific predictor (99 alleles)
● Random negatives to equalize number of non-binder points for each peptide length per allele
● Early stopping
● Dropout after each dense layer (50%)
● Skip connections
● L1 regularization on dense layers
● Ensembles
Training loss: MSE with inequalities
Mass spec hits are assigned “< 100 nM”
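The inequality idea can be sketched as a squared error that penalizes only one side for "<" and ">" measurements. The code below is a hedged illustration, not the package's implementation: the function names are hypothetical, and the affinity transform (1 - log(nM)/log(50000), clipped to [0, 1]) is an assumption made so that larger values mean stronger binding.

```python
import math


def to_target(nm, max_nm=50000.0):
    """Map an affinity in nM to [0, 1]; 1 means strong binding (assumed transform)."""
    return 1.0 - min(1.0, max(0.0, math.log(nm) / math.log(max_nm)))


def mse_with_inequalities(predictions, targets, inequalities):
    """Mean squared error where '<' and '>' measurements penalize one side only.

    Inequalities refer to the measured affinity in nM. Because the transform
    is decreasing, "< 100 nM" means the prediction should be at least
    to_target(100), so predictions above the target incur no loss.
    """
    total = 0.0
    for pred, target, ineq in zip(predictions, targets, inequalities):
        diff = pred - target
        if ineq == "<" and diff > 0:
            diff = 0.0  # predicted at least as strong as the bound
        elif ineq == ">" and diff < 0:
            diff = 0.0  # predicted at least as weak as the bound
        total += diff * diff
    return total / len(predictions)


# A mass spec hit treated as "< 100 nM": only under-prediction is penalized.
ms_hit = to_target(100.0)
print(mse_with_inequalities([0.9, 0.3], [ms_hit, ms_hit], ["<", "<"]))
```

Quantitative measurements use "=" and behave like ordinary MSE, so affinity data and mass spec hits can share one training loss.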
49. Training set generation for AP predictor
Consider only peptides predicted to be top 2% in binding affinity
Model learns to predict which ones are actually detected in MS experiments
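A minimal sketch of this selection step, assuming predicted affinities are available as a dictionary and MS-detected peptides as a set. The function name and the toy identifiers are hypothetical, and real pipelines would rank per allele and per sample.

```python
def build_ap_training_set(candidates, detected, top_fraction=0.02):
    """Return (peptide, label) pairs for the strongest predicted binders.

    candidates: dict mapping peptide -> predicted affinity in nM (lower is stronger).
    detected: set of peptides observed in MS experiments.
    Label 1 means detected by MS, 0 means a predicted binder that was not detected.
    """
    ranked = sorted(candidates, key=candidates.get)
    n_keep = max(1, int(round(len(ranked) * top_fraction)))
    return [(peptide, int(peptide in detected)) for peptide in ranked[:n_keep]]


# Toy identifiers standing in for peptides; affinities are made up.
candidates = {"PEP%03d" % i: float(i + 1) for i in range(100)}
training_set = build_ap_training_set(candidates, detected={"PEP000"})
print(training_set)
```

Because every retained peptide is already a predicted binder, the labels carry signal about factors other than binding, which is what the processing model is meant to learn.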
50. AP predictor convolutional neural network
Intuition: MHC I ligands must be cleaved at their termini but not at interior residues
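Since the cleavage signal sits at the peptide boundaries, one natural model input is the peptide together with the source-protein residues flanking it. The sketch below extracts those three pieces. The function name, the flank length of 5, and the toy protein sequence are assumptions for illustration.

```python
def peptide_with_flanks(protein, start, length, flank=5):
    """Return (n_flank, peptide, c_flank) for the peptide at protein[start:start + length].

    Flanks are truncated when the peptide sits near either end of the protein.
    """
    end = start + length
    if start < 0 or end > len(protein):
        raise ValueError("peptide lies outside the protein")
    n_flank = protein[max(0, start - flank):start]
    c_flank = protein[end:end + flank]
    return n_flank, protein[start:end], c_flank


# Hypothetical protein containing the 9-mer LLFGYPVYV.
toy_protein = "MKTAYIAKQRLLFGYPVYVGSGSEE"
flanked = peptide_with_flanks(toy_protein, start=10, length=9)
print(flanked)
```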
55. Antigen processing motif
TAP
Known bias:
Prefers C-terminal Y, F, L, R
Disfavors C-terminal D, E, N, S
Motif legend: Enriched / Depleted
Uebel et al. PNAS 1997
56. Antigen processing motif
Proteasome
Cleaves after
● Chymotryptic: F, Y, L, W, but not G
● Tryptic: R, K
● Caspase-like: D, E
Motif legend: Enriched / Depleted
Nussbaum et al. 1998; Harris et al. 2001
57. Antigen processing motif
ERAP
Known bias:
Unable to cleave the X-Proline bond. Can trim until there is a P at the second position.
Motif legend: Enriched / Depleted
Serwold et al. Nature 2002
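The trimming rule can be written as a short loop. This is a toy model of the stated bias only: the function name, the minimum length of 8, and the example precursors are assumptions, and real trimming depends on much more than the proline rule.

```python
def erap_trim(peptide, min_length=8):
    """Remove N-terminal residues one at a time, stopping before an X-Pro bond.

    Trimming halts when the second residue is proline (the X-P bond cannot
    be cleaved) or when the peptide reaches `min_length` (an assumed bound).
    """
    while len(peptide) > max(min_length, 1) and peptide[1] != "P":
        peptide = peptide[1:]
    return peptide


# Hypothetical precursors: one with an internal proline, one without.
trimmed_with_proline = erap_trim("GGASPEFGHIKLMN")
trimmed_without_proline = erap_trim("ACDEFGHIKLMN")
print(trimmed_with_proline, trimmed_without_proline)
```

Under this rule, surviving peptides are enriched for proline at the second position, which is the kind of pattern a learned processing motif could be compared against.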
62. Neoantigen prediction
Robbins et al. Nature Medicine 2013; Tran et al. Science 2015; Gros et al. Nature Medicine 2016; Koşaloğlu-Yalçın et al. Oncoimmunology 2018
Steve Rosenberg group (NCI)
● 18 patients with melanoma or gastrointestinal cancers
● 2,841 mutations screened
● 52 identified CD8+ T cell epitopes
Unbiased: mutations screened without use of MHC binding prediction
65. Evaluation on viral epitopes
● Evaluation of CD8+ T cell epitopes deposited in IEDB
● 1,380 epitopes + 527 non-epitopes
● Non-epitopes derive from the same proteins as the epitopes and were assayed in the same studies
● Surprisingly, MHCflurry 2.0 BA (binding affinity) outperforms MHCflurry 2.0 PS (presentation score)
● Suggests that to some extent the learned antigen processing signals may be specific to self proteins
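One common way to compare two predictors on a set of epitopes and non-epitopes is the ROC AUC: the probability that a randomly chosen epitope scores higher than a randomly chosen non-epitope. The sketch below computes it directly from that definition. It is not necessarily the metric used in this evaluation, and the scores shown are made up.

```python
def roc_auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up predictor scores for three epitopes and two non-epitopes.
epitope_scores = [0.9, 0.8, 0.4]
non_epitope_scores = [0.5, 0.3]
auc = roc_auc(epitope_scores, non_epitope_scores)
print(auc)
```

The pairwise loop is quadratic, which is fine at the scale of 1,380 epitopes and 527 non-epitopes.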
66. Conclusions
● Larger training datasets and better modeling have enabled more accurate prediction of peptides presented on MHC class I
● Antigen processing can be learned from MHC-presented peptides identified by mass spec. The resulting predictors show agreement with the known biases of key processing steps
● Integration of antigen processing prediction with MHC binding can improve prediction of MHC-presented peptides and tumor neoantigens
● Still significant room for improvement in CD8+ T cell epitope prediction
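One simple way to integrate the two signals is a logistic combination of a binding score and an antigen processing score. The sketch below assumes both scores lie in [0, 1]. The function name is hypothetical and the weights are placeholders chosen for illustration, where a real model would fit them to presentation data.

```python
import math


def presentation_score(ba_score, ap_score, w_ba=4.0, w_ap=2.0, bias=-3.0):
    """Logistic combination of binding (BA) and antigen processing (AP) scores.

    The weights and bias are placeholder values, not fitted parameters.
    """
    z = bias + w_ba * ba_score + w_ap * ap_score
    return 1.0 / (1.0 + math.exp(-z))


# Same binding score, different processing scores.
well_processed = presentation_score(0.8, 0.9)
poorly_processed = presentation_score(0.8, 0.1)
print(well_processed, poorly_processed)
```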
69. Sources of tumor T cell antigens
● Viral antigens
● Highly expressed genes in tumor cells (TAAs)
● Bacterial antigens
● Cancer-cell specific aberrations in...
○ The genome (mutation derived neoantigens)
○ Regulation of transcription (cancer testis antigens, endogenous retroviruses)
○ Splicing (intron retention, exon skipping)
○ RNA editing and RNA modifications
○ Translation (W-bumps)
○ Post translational modifications (phosphopeptides)
○ Antigen processing (?)
○ Metabolism (?)
○ … sensitivity to drugs that impact any of the above
Predictors needed
70. Emerging data identifies new candidate antigens
Reference | Antigens | Sequencing | Mass spec
Griffin, …, Bernstein Nature 2021 | Transposable elements (esp. LTR) de-repressed by SETDB1 KO | RNA-seq, ATAC-seq | MHC I MS, whole cell lysate MS
Cuevas, …, Yewdell Cell Reports 2021 | Novel isoforms, lncRNAs, frameshifts | RNA-seq, ribo-seq | MHC I MS, whole cell lysate MS
Ouspenskaia, ... , Regev bioRxiv 2020 | lncRNAs, pseudogenes, UTRs | RNA-seq, ribo-seq | MHC I MS
Chong, …, Bassani-Sternberg Nature Communications 2020 | lncRNAs, pseudogenes, UTRs, TEs | WES, RNA-seq, sc-RNA-seq, ribo-seq | MHC I and II MS
Laumont, …, Perreault Science Translational Medicine 2018 | lncRNAs, endogenous retroelements | RNA-seq | MHC I MS
72. Personalized tumor antigen detection from RNA-seq
Given tumor RNA-seq, identify candidate tumor specific antigens
Diagram labels:
● Patient RNA-seq
● Database of tumor-specific translation products
● Predicted translated peptides
● Search
● Vaccine prioritization (MHC binding prediction, expression levels)
● Vaccine
Building the database is the main effort here
Experimenting with: logistic regression on RNA-seq k-mers for each possible antigen to predict translation
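A toy sketch of the k-mer idea: count the k-mers associated with one candidate antigen in a sample's reads, then fit a logistic regression that predicts whether the antigen is translated. Everything here is illustrative. The k-mers, reads, labels, feature scaling, and training settings are made-up assumptions, not the approach's actual data or configuration.

```python
import math
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def featurize(reads, antigen_kmers, k):
    """Log-scaled counts of the k-mers associated with one candidate antigen."""
    counts = kmer_counts(reads, k)
    return [math.log1p(counts[kmer]) for kmer in antigen_kmers]


def predict(w, b, x):
    """Predicted probability that the antigen is translated in this sample."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Stochastic gradient descent for logistic regression; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


K = 5
antigen_kmers = ["ACGTA", "CGTAC"]  # hypothetical k-mers for one candidate antigen
samples = [  # (reads, 1 if translation of the antigen was observed) - made-up data
    (["TTACGTACTT", "GGACGTACGG"], 1),
    (["TTTTTTTTTT", "GGGGGGGGGG"], 0),
    (["CCACGTACCC"], 1),
    (["AAAAACCCCC"], 0),
]
X = [featurize(reads, antigen_kmers, K) for reads, _ in samples]
y = [label for _, label in samples]
w, b = fit_logistic(X, y)
print(predict(w, b, X[0]), predict(w, b, X[1]))
```

Fitting one small model per candidate antigen keeps each model cheap, and the fitted models could then be stored alongside the antigen in the database.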
73. Perspectives
● MHC binding prediction is reasonably well solved; T cell epitope prediction is not
● Immune monitoring from cancer vaccine studies will be useful to improve T cell epitope prediction
● Emerging high-throughput readouts of TCR/pMHC interaction will eventually enable models of TCR/pMHC binding
● For peptide vaccines, peptide pharmacokinetics likely has a huge impact on immunogenicity
● Other vaccine platforms such as mRNA will likely outperform peptide vaccines
● Casting a wider net for additional kinds of tumor antigens is likely to enable a new generation of cancer vaccines, including semi-personalized vaccines
74. Thank you!
Nina Bhardwaj (Mount Sinai)
Alex Rubinsteyn (UNC Chapel Hill)
Julia Kodysh (Mount Sinai)
https://github.com/openvax/mhcflurry