Predicting peptide/MHC interactions:
        Application to epitope identification and
                     vaccine design
                                   Or
             Finding the needle in the haystack

                      Morten Nielsen
           Center for Biological Sequence Analysis
              Department of Systems biology
              Technical University of Denmark
                      mniel@cbs.dtu.dk




Bridging between two worlds


  “To be honest, I do not believe in this
  bioinformatic approach to
  immunology, ...”

   “Talking about detecting epitopes from the genome of an entire
     bacterium seems very complicated to me. It strikes me as impracticable
   and "misleading", in the sense that it can divert funds, effort and
     attention away from the slow but sure routes to this goal
                      through experimental methods.”



                                                        FG, 2006
Vaccines have been
made for 36 of >400
human pathogens




                                                            +HPV & Rotavirus

Immunological Bioinformatics, The MIT press.




Deaths from
infectious diseases
in the world in 2002




www.who.int/entity/whr/2004/annex/topic/en/annex_2_en.pdf
The human immune system




Vaccine review
MHC Class I pathway
  Finding the needle in the haystack
                                       1/200 peptides make it
                                       to the surface




Figure by Eric A.J. Reits




MHC-I molecules present peptides
on the surface of most cells




 Figure courtesy Mette Voldby Larsen
CTL response




             Figure courtesy Mette Voldby Larsen




The death of an infected cell
Antigen Discovery




                             Lauemøller et al., 2000




Influenza A virus (A/Goose/Guangdong/1/96(H5N1))


>Segment 1
             Genome
agcaaaagcaggtcaattatattcaatatggaaagaataaaagaactaagagatctaatg
tcgcagtcccgcactcgcgagatactaacaaaaaccactgtggatcatatggccataatc
aagaaatacacatcaggaagacaagagaagaaccctgctctcagaatgaaatggatgatg
gcaatgaaatatccaatcacagcagacaagagaataatggagatgattcctgaaaggaat


and 13350 other nucleotides on 8 segments
                                                                     9mer
                                                                    peptides
>polymerase
               Proteins                                        MERIKELRD
                                                               ERIKELRDL

MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITAD   RIKELRDLM

KRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFE   IKELRDLMS

KVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSE   KELRDLMSQ

SQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW   ELRDLMSQS

EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQ   LRDLMSQSR

NPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGY   RDLMSQSRT

EEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNF   DLMSQSRTR

...                                                            LMSQSRTRE


and 9 other proteins                                           and 4376 other 9mers
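
The peptide enumeration shown above amounts to sliding a 9-residue window along each protein. A minimal Python sketch, assuming the protein is available as a plain string (the function name `kmers` is illustrative and not from the talk):

```python
def kmers(protein: str, k: int = 9) -> list[str]:
    """Return every overlapping k-residue window of a protein sequence."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]

# First 60 residues of the polymerase sequence shown on the slide.
polymerase_start = "MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITAD"
peptides = kmers(polymerase_start)

print(peptides[:3])   # the first three 9mers, matching the slide
print(len(peptides))  # a protein of length L yields L - 8 9mers
```

A protein of length L contributes L - 8 candidate 9mers, which is why even a small viral genome produces thousands of candidates.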
Experimental validation of Bioinformatics predictions

                          HLA                    Elispot assay 1          Elispot assay 2
Peptide       Sequence    restriction  KD (nM)   + peptide   - peptide    + peptide   - peptide
PB1 591-599   VSDGGPNLY   HLA-A1       6         18 ± 2      3 ± 3        12 ± 4      1 ± 1
NP 44-52      CTELKLSDY   HLA-A1       7         34 ± 5      4 ± 1        13 ± 4      0 ± 0
PB1 166-174   FLKDVMESM   HLA-A2       51        74 ± 10     11 ± 6       140 ± 36    20 ± 7
PB1 41-49     DTVNRTHQY   HLA-A26      6         40 ± 3      20 ± 7       38 ± 5      24 ± 3
PB1 540-548   GPATAQMAL   HLA-B7       6         7 ± 2       2 ± 1        13 ± 2      6 ± 1
NP 225-233    ILKGKFQTA   HLA-B8       664       9 ± 4       1 ± 1        19 ± 7      2 ± 2
PA 601-609    SVKEKDMTK   HLA-B8       NB        23 ± 6      1 ± 1        119 ± 8     2 ± 1
PB1 349-357   ARLGKGYMF   HLA-B27      246       10 ± 6      1 ± 1        14 ± 4      1 ± 1
NP 383-391    SRYWAIRTR   HLA-B27      38        39 ± 6      1 ± 1        40 ± 6      2 ± 1
M1 173-181    IRHENRMVL   HLA-B39      13        14 ± 3      3 ± 1        84 ± 11     3 ± 1
NP 199-207    RGINDRNFW   HLA-B58      42        28 ± 5      1 ± 1        15 ± 6      2 ± 2
PB1 347-355   KMARLGKGY   HLA-B62      178       77 ± 20     3 ± 2        91 ± 8      10 ± 3
PB1 566-574   TQIQTRRSF   HLA-B62      88        15 ± 5      2 ± 2        21 ± 2      2 ± 0



 Wang et al., 2006




 Epitope based vaccines and diagnostics

     • Challenges
            • Identify epitopes in pathogen genome
                 • A small viral genome contains >> 1000 potential CTL
                   epitopes
                 • Bacterial genomes contain 100,000 distinct
                   peptides
            • HLA diversity
                 • No two humans will mount the same response to a
                   pathogen infection
            • Viral escape and viral genomic diversity
                 • No two viral strains will “host” the same set of T
                   cell epitopes
HLA diversity.
Expression of HLA is co-dominant




MHC specificity




Figure courtesy of Can Kesmir
HLA polymorphism




HLA polymorphism

 • Few human beings will share the same set
   of HLA alleles
   – Different persons will react to a pathogen
     infection in a different manner
 • A T cell based vaccine must include
   epitopes specific for each HLA allele in a
   population
   – A peptide based vaccine must consist of many
     hundreds of HLA class I epitopes
   – (and ~1000 class II epitopes)
Figure by Anne Mølgaard




           HLA binding motif
              SLLPAIVEL   YLLPAIVHI   TLWVDPYEV   GLVPFLVSV   KLLEPVLLL   LLDVPTAAV   LLDVPTAAV   LLDVPTAAV
              LLDVPTAAV   VLFRGGPRG   MVDGTLLLL   YMNGTMSQV   MLLSVPLLL   SLLGLLVEV   ALLPPINIL   TLIKIQHTL
              HLIDYLVTS   ILAPPVVKL   ALFPQLVIL   GILGFVFTL   STNRQSGRQ   GLDVLTAKV   RILGAVAKV   QVCERIPTI
              ILFGHENRV   ILMEHIHKL   ILDQKINEV   SLAGGIIGV   LLIENVASL   FLLWATAEA   SLPDFGISY   KKREEAPSL
              LERPGGNEI   ALSNLEVKL   ALNELLQHV   DLERKVESL   FLGENISNF   ALSDHHIYL   GLSEFTEYL   STAPPAHGV
              PLDGEYFTL   GVLVGVALI   RTLDKVLEV   HLSTAFARV   RLDSYVRSL   YMNGTMSQV   GILGFVFTL   ILKEPVHGV
              ILGFVFTLT   LLFGYPVYV   GLSPTVWLS   WLSLLVPFV   FLPSDFFPS   CLGGLLTMV   FIAGNSAYE   KLGEFYNQM
              KLVALGINA   DLMGYIPLV   RLVTLKDIV   MLLAVLYCL   AAGIGILTV   YLEPGPVTA   LLDGTATLR   ITDQVPFSV
              KTWGQYWQV   TITDQVPFS   AFHHVAREL   YLNKIQNSL   MMRKLAILS   AIMDKNIIL   IMDKNIILK   SMVGNWAKV
              SLLAPGAKQ   KIFGSLAFL   ELVSEFSRM   KLTPLCVTL   VLYRYGSFS   YIGEVLVSV   CINGVCWTV   VMNILLQYV
              ILTVILGVL   KVLEYVIKV   FLWGPRALV   GLSRYVARL   FLLTRILTI   HLGNVKYLV   GIAGGLALL   GLQDCTMLV
              TGAPVTYST   VIYQYMDDL   VLPDVFIRC   VLPDVFIRC   AVGIGIAVV   LVVLGLLAV   ALGLGLLPV   GIGIGVLAA
              GAGIGVAVL   IAGIGILAI   LIVIGILIL   LAGIGLIAA   VDGIGILTI   GAGIGVLTA   AAGIGIIQI   QAGIGILLA
              KARDPHSGH   KACDPHSGH   ACDPHSGHF   SLYNTVATL   RGPGRAFVT   NLVPMVATV   GLHCYEQLV   PLKQHFQIV
              AVFDRKSDA   LLDFVRFMG   VLVKSPNHV   GLAPPQHLI   LLGRNSFEV   PLTFGWCYK   VLEWRFDSR   TLNAWVKVV
              GLCTLVAML   FIDSYICQV   IISAVVGIL   VMAGVGSPY   LLWTLVVLL   SVRDRLARL   LLMDCSGSI   CLTSTVQLV
              VLHDDLLEA   LMWITQCFL   SLLMWITQC   QLSLLMWIT   LLGATCMFV   RLTRFLSRV   YMDGTMSQV   FLTPKKLQC
              ISNDVCAQV   VKTDGNPPE   SVYDFFVWL   FLYGALLLA   VLFSSDFRI   LMWAKIGPV   SLLLELEEV   SLSRFSWGA
              YTAFTIPSI   RLMKQDFSV   RLPRIFCSC   FLWGPRAYA   RLLQETELV   SLFEGIDFY   SLDQSVVEL   RLNMFTPYI
              NMFTPYIGV   LMIIPLINV   TLFIGSHVV   SLVIVTTFV   VLQWASLAV   ILAKFLHWL   STAPPHVNV   LLLLTVLTV
              VVLGVVFGI   ILHNGAYSL   MIMVKCWMI   MLGTHTMEV   MLGTHTMEV   SLADTNSLA   LLWAARPRL   GVALQTMKQ
              GLYDGMEHL   KMVELVHFL   YLQLVFGIE   MLMAQEALA   LMAQEALAF   VYDGREHTV   YLSGANLNL   RMFPNAPYL
              EAAGIGILT   TLDSQVMSL   STPPPGTRV   KVAELVHFL   IMIGVLVGV   ALCRWGLLL   LLFAGVQCQ   VLLCESTAV
              YLSTAFARV   YLLEMLWRL   SLDDYNHLV   RTLDKVLEV   GLPVEYLQV   KLIANNTRV   FIYAGSLSA   KLVANNTRL
              FLDEFMEGV   ALQPGTALL   VLDGLDVLL   SLYSFPEPE   ALYVDSLFF   SLLQHLIGL   ELTLGEFLK   MINAYLDKL
              AAGIGILTV   FLPSDFFPS   SVRDRLARL   SLREWLLRI   LLSAWILTA   AAGIGILTV   AVPDEIPPL   FAYDGKDYI
              AAGIGILTV   FLPSDFFPS   AAGIGILTV   FLPSDFFPS   AAGIGILTV   FLWGPRALV   ETVSEQSNV   ITLWQRPLV
HLA binding specificity
                        High information
                        positions




     If we have binding data, we can accurately describe
     the binding specificity!
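
One common way to quantify "high information positions" in a set of binders is the per-position Shannon information used in sequence logos. The sketch below is a simplified illustration under stated assumptions: it uses raw residue counts with no pseudocounts, sequence weighting or small-sample correction, and the function name `information_content` is hypothetical. The six peptides are taken from the HLA binding motif list.

```python
import math
from collections import Counter

def information_content(peptides):
    """Per-position information in bits for equal-length peptides:
    log2(20) minus the Shannon entropy of the observed residues."""
    length = len(peptides[0])
    bits = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(math.log2(20) - entropy)
    return bits

# Six binders from the motif slide; all carry L at position 2.
binders = ["SLLPAIVEL", "YLLPAIVHI", "TLWVDPYEV", "GLVPFLVSV", "KLLEPVLLL", "LLDVPTAAV"]
info = information_content(binders)
for pos, value in enumerate(info, start=1):
    print(f"P{pos}: {value:.2f} bits")
```

With so few sequences the values are strongly biased upward; a realistic analysis would use the full binder set and a background amino acid distribution.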




HLA specificity clustering



 A0201
                               A6802




 A0101                          B0702
Coverage of HLA alleles




 Supertype      Selected allele
 A1             A*0101
 A2             A*0201
 A3             A*1101
 A24            A*2401
 A26 (new*)     A*2601
 B7             B*0702
 B8 (new*)      B*0801
 B27            B*2705
 B39 (new*)     B*3901
 B44            B*4001
 B58            B*5801
 B62            B*1501

 Clustering in: O Lund et al., Immunogenetics. 2004 55:797-810




Data




 • Alleles characterized with 5 or more data points
 • 3% covered
HLA polymorphism!
B0807   B4804   B0710   B1513   A6817   B5130   A0204   B3503   A2415   B0740   B3929   A0250   B5204   A2420   B1804   B3523   B3502   A3202   B0802   A3601   B4047
A6601   A0268   B0817   B5002   B5602   B3811   B4810   A0103   B1530   B4415   A3111   B7803   A6804   B3520   B3528   A2610   A6802   A2404   A7406   B0744   B3701
B4058   B1803   B1527   B3801   A6826   B5606   B0725   B5603   A0110   B1586   A3205   A0212   B3511   A2603   B5120   A0251   A3106   A6801   B5135   B1567   B4012
A3401   B5106   B3912   B1525   B5703   B4402   B0733   A2901   B0711   A6603   B3907   B4023   B2717   B4507   B4502   B4807   A2438   B1312   B1590   A0258   B5310
B5124   B4103   B0811   B3927   B4104   A1110   B1553   A2621   B5115   B1599   A0102   B5102   A0207   B4444   A3002   A6813   B5709   B5515   B4439   B1561   A2618
B2728   A3404   A6820   A3107   A2430   A0235   A2914   B1301   B4004   A2620   B1573   A0259   B0804   B1548   A2616   B5401   B0707   A2453   A2609   B3554   A0245
B4411   A0220   B1510   A2433   B5512   B5306   B1540   B5114   B3934   B5510   B1521   B0810   B5137   B3932   B4802   B4044   B3709   B3915   B2729   B3810   A0238
B0729   B3537   A2314   B0734   B3702   A0214   B4805   A0269   A3102   B5206   A6819   B3707   A3011   A1123   B1822   A6823   A4301   B3917   B4702   B5118   B3708
A0265   B5203   A3013   B3530   B4701   B4061   A0316   B4814   B2710   A7411   B3930   B0702   B5702   A1107   B7801   A0246   B3534   A0228   B1596   A3305   B2711
B3526   B4445   A0216   B1539   A3308   A2455   A0206   B4605   B2725   A0310   B4037   A1104   A2622   B5607   B4504   B4602   B1598   A3112   B0813   B5113   A0237
A3602   B0805   A6808   B4505   B1544   A0285   A3108   B5402   B6701   A6901   B0730   B4056   B5205   B1310   B5805   B1404   A2435   A2614   A7405   B1520   B3920
A0254   B2702   A6815   A3201   B1570   A0255   B5708   B4033   B4435   A2405   B4007   B4034   B4806   B5615   A0218   B3527   B3512   B0814   B5301   A6829   B4904
B4038   A0304   A7408   B7805   B3549   B1503   B4420   A1120   B1815   B5129   B0801   B0827   B5001   A3402   A0314   B4405   A2305   B4438   B4052   B0823   A8001
B1302   B4021   A2909   B3933   B4408   B4105   B0727   B5508   B4108   A3405   B1315   B3517   A1116   B0731   B4053   B1516   B4704   B1403   A6830   B5610   A3009
B0714   B1303   B1566   B2714   B3923   B5801   A2439   B2719   A0219   A2602   A2413   B1821   A0260   B4410   A6605   B1309   B8202   B4426   A2623   B4042   B1805
B3902   A2503   B1536   A0302   A3209   A0205   B2715   B5131   A0262   A6805   B5201   A1119   B1402   A0270   A2450   A1111   A3008   B3806   A6822   A0202   B5503
B0826   B3926   A2428   A1114   A2414   A3301   A0239   B4054   B0825   A0308   B3563   A0305   B4036   B1589   B1314   B1563   B4005   A3104   B4440   B5122   A3206
B7804   B0718   B4446   B4905   B9509   A0112   A0256   A6604   B4029   B1807   B5901   A2906   B1304   B3501   A2502   B5509   B4107   B2707   A0117   B4032   B3914
B3509   A3306   A6602   B1504   B5611   A2904   B3535   A2447   B6702   B1572   A2417   B1811   A2452   B3542   A2612   B1542   B1507   B5406   B3911   A2421   A2443
B4404   A3015   B5704   B4437   B4427   B8101   B4002   B3901   A1103   B3928   A2408   A6827   B1517   B0824   B1576   B4601   A2303   B4811   B4003   A2605   B1505
B4808   A7407   B1809   A0222   B4031   B1511   B4429   B1564   A2406   B1515   B5601   A2301   B4101   B3506   A0113   B5710   A7404   B3531   A0201   B4902   B1581
A2907   B4431   A0252   B4102   A2601   A6825   B5116   B5608   B4201   B5110   B4422   B2720   B2727   A3304   B1306   A2425   B5501   A0233   B0736   A2423   B1549
A1109   B3558   B5134   B5139   A0289   B5121   B4208   A0271   B2705   A2407   B4501   B3550   A2410   B2706   B1552   A1101   A0273   B1546   B3905   B4409   B5808
A2313   B0706   B1534   B5138   B0803   A2429   B5507   A6810   B1405   B2713   B3547   B4013   A3003   B5119   A3010   B0726   A3204   B3552   B3802   A3105   B4062
B4018   B4403   B1550   A0317   B4432   B4433   B3551   B9505   B8201   A3303   B5804   B4008   A0208   A0230   B1819   B2726   B3533   B4428   B5404   A0267   B1529
B4046   A0106   B9507   B3505   B4016   B3922   A7410   B1509   B0822   A3012   A0319   B4503   B5207   B1531   B3904   A2910   B5613   B0717   A2403   A2912   B3510
B0818   B5806   B0724   B7802   B3561   B0728   B1585   B2730   B4030   B4604   B3513   B3809   B5403   B3529   A2617   A3110   B5128   B3504   B3924   B3539   B5511
B5103   B5109   B5604   B1575   A3007   A2627   B3536   A2437   B3805   B4812   A1113   B5518   B3803   A0313   B3514   B9502   A6816   B3808   A2911   A0108   B1524
A2606   B1578   B1538   A2504   B1813   B4407   A0244   B1556   B5307   A0272   A2608   B2723   A2913   A2619   A0231   B2721   B4051   B1551   B5112   B4035   B2701
A0209   B0806   B4418   A2454   A2902   B8301   B4057   B5520   A2903   A6824   B1545   A0275   B4417   A0114   B3548   A0322   B0732   B4059   B3918   A0241   B5132
A2444   B4430   B0739   A3006   B2724   B1818   A2418   A3103   B5514   B0723   A2456   B4060   B5308   B3559   B1547   B5616   B4205   A7402   B4421   B4001   B1597
B5101   B1308   B4406   B4015   A2309   B8102   B0720   B4813   B3557   A6812   A2419   A0277   B4703   B5605   B9506   B3545   A0261   A2615   B5504   B4436   A7403
B1502   B3935   A2312   B4441   A3307   B1592   B0703   B4803   B0708   B5133   B1587   A0225   B5311   B0745   B5519   A0263   B1562   A2458   A2501   B4020   B4009
A6803   A0278   A3004   B4606   B1574   B1535   B1583   B1820   B3909   A2427   B5208   A0234   B0715   B0743   B0709   B5305   A0236   A0274   A2310   B4901   B5706
A2441   B5126   A2426   A1102   A2446   A0307   B1554   A0318   A3001   B1588   B3524   B3936   B3519   B4603   A2442   B1812   A0227   A2424   B0741   A1117   B3546




HLA polymorphism!
B1513   B3811   A3106   B3912   B5102   A3107   B3709   A2314   A7411   A0216   A3108
A2405   B4052   B4408   B4426   A0302   B4036   B5901   A2904   A3001   B1515   B4422
A0273   B4403   B5207   B3514   B1578   A6824   B2724   B5605   A2458   B0709   A2442
Predicting the specificity
Align A3001 (365) versus A3002 (365). Aln score 2445.000 Aln len 365 Id 0.9890
     A3001    0 MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFSTSVSRPGSGEPRFIAVGYVDDTQFVRFDSDAA
                :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     A3002    0 MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFSTSVSRPGSGEPRFIAVGYVDDTQFVRFDSDAA

     A3001    65 SQRMEPRAPWIEQERPEYWDQETRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQIMYGCDVGSD
                 :::::::::::::::::::::::::::: ::::: :::::::::::::::::::::::::::::
     A3002    65 SQRMEPRAPWIEQERPEYWDQETRNVKAHSQTDRENLGTLRGYYNQSEAGSHTIQIMYGCDVGSD

     A3001   130 GRFLRGYEQHAYDGKDYIALNEDLRSWTAADMAAQITQRKWEAARWAEQLRAYLEGTCVEWLRRY
                 ::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::
     A3002   130 GRFLRGYEQHAYDGKDYIALNEDLRSWTAADMAAQITQRKWEAARRAEQLRAYLEGTCVEWLRRY

     A3001   195 LENGKETLQRTDPPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPA
                 :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     A3002   195 LENGKETLQRTDPPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPA

     A3001   260 GDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWELSSQPTIPIVGIIAGLVLLGAVITGA
                 :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
     A3002   260 GDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWELSSQPTIPIVGIIAGLVLLGAVITGA

     A3001   325 VVAAVMWRRKSSDRKGGSYTQAASSDSAQGSDVSLTACKV
                 ::::::::::::::::::::::::::::::::::::::::
     A3002   325 VVAAVMWRRKSSDRKGGSYTQAASSDSAQGSDVSLTACKV




                 HLA-A*3001                             HLA-A*3002
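
The alignment above reports an identity of 0.9890 over 365 positions, so the two molecules differ at only a handful of residues. The sketch below lists the substitutions in one alignment block; it assumes gap-free, equal-length aligned strings, and the helper name `differences` is illustrative.

```python
def differences(seq_a, seq_b, offset=0):
    """List (position, residue_a, residue_b) where two aligned, gap-free
    sequences of equal length differ. Positions are 0-based plus offset."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + offset, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# Second alignment block from the slide (labelled as starting at position 65).
a3001 = "SQRMEPRAPWIEQERPEYWDQETRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQIMYGCDVGSD"
a3002 = "SQRMEPRAPWIEQERPEYWDQETRNVKAHSQTDRENLGTLRGYYNQSEAGSHTIQIMYGCDVGSD"

diffs = differences(a3001, a3002, offset=65)
for position, res_a, res_b in diffs:
    print(f"{position}: {res_a} -> {res_b}")
print(f"identity in this block: {1 - len(diffs) / len(a3001):.4f}")
```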
NetMHCpan - a pan-specific method




               NetMHC                           NetMHCpan


   NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any
  HLA-A and -B Locus Protein of Known Sequence. Nielsen et al. PLoS ONE 2007




Example
Peptide         Amino acids of HLA pockets            HLA      Aff
VVLQQHSIA       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.131751
SQVSFQQPL       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.487500
SQCQAIHNV       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.364186
LQQSTYQLV       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.582749
LQPFLQPQL       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.206700
VLAGLLGNV       YFAVLTWYGEKVHTHVDTLVRYHY             A0201    0.727865
VLAGLLGNV       YFAVWTWYGEKVHTHVDTLLRYHY             A0202    0.706274
VLAGLLGNV       YFAEWTWYGEKVHTHVDTLVRYHY             A0203    1.000000
VLAGLLGNV       YYAVLTWYGEKVHTHVDTLVRYHY             A0206    0.682619
VLAGLLGNV       YYAVWTWYRNNVQTDVDTLIRYHY             A6802    0.407855
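
Each row of the example pairs a peptide with the amino acids of the HLA pockets, which is what allows one model to be trained across alleles. The sketch below shows one plausible way to turn such a pair into a numeric input vector, using a sparse (one-hot) encoding as a stand-in; the actual encoding and network behind NetMHCpan are not shown on the slide. The `score_to_nm` helper assumes that the Aff column uses the transform 1 - log(nM)/log(50000), which is common for this kind of data but is not stated here. All function names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(sequence):
    """Sparse (one-hot) encoding: 20 numbers per residue."""
    vector = []
    for residue in sequence:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector

def encode_pair(peptide, pseudo_sequence):
    """Concatenate peptide and HLA pocket encodings into one input vector."""
    return one_hot(peptide) + one_hot(pseudo_sequence)

def score_to_nm(score, max_nm=50000.0):
    """Invert score = 1 - log(nM)/log(max_nm). Assumed transform, see text."""
    return max_nm ** (1.0 - score)

# Peptide and A0201 pocket residues from the example table.
x = encode_pair("VLAGLLGNV", "YFAVLTWYGEKVHTHVDTLVRYHY")
print(len(x))  # (9 + 24) residues * 20 values each
print(score_to_nm(0.727865))
```

Because the allele enters only through its pocket residues, the same encoding can be applied to an allele with no binding data, as long as its sequence is known.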
Prediction for novel HLA alleles

        MHC allele A*8001               MHC allele A*7401
        Sequence     KD-value (nM)      Sequence     KD-value (nM)
        HSNASTLLY               <1      RVYHLTWLR                1
        KVDWNQFTY               <1      TTMGWLFLK                1
        WMSNGTWNY               <1      MMHEFFGPR                3
        LTAHYCFLY                1      KTYAPLAFR                3
        GMFSWNLAY                3      HMMKRMSYR                4
        LVFLGPGLY                6      KVNNHLFHR               10
        MTDVDLNYY               10      MTMFVTASK               12
        VIAAIHNAY               36      MAMSNYLLR               14
        SMIYFFHHY            1,454      MVAGRTPFK               63
        LMDHWRGYK           16,543      IVFAFHFYR              188
        LSNFGYPGY              non      SVYFWWLNR              402

        75 - 100% accuracy




Evaluation. MHC ligands from SYFPEITHI




                                                   Sort on
                                                   binding


                                                                    Top Rank: F-rank=0.0
                                                                  Random Rank: F-rank=0.5
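
The F-rank summarized above can be read as the fraction of peptides from the source protein that are predicted to bind better than the known ligand. The sketch below follows that reading, which is an assumption about the exact definition; the scores are made-up illustration values, not predictions from any method.

```python
def f_rank(scores, ligand_index):
    """Fraction of peptides from the source protein that score higher
    than the known ligand (0.0 means the ligand is ranked first)."""
    ligand_score = scores[ligand_index]
    better = sum(1 for s in scores if s > ligand_score)
    return better / len(scores)

# Hypothetical predicted scores for ten 9mers from one protein.
scores = [0.10, 0.85, 0.40, 0.30, 0.05, 0.60, 0.20, 0.70, 0.15, 0.25]

print(f_rank(scores, 1))  # ligand has the top score
print(f_rank(scores, 9))  # ligand sits in the middle of the ranking
```

Averaging this value over all ligands in a benchmark gives a single number per method, with 0.0 as a perfect result and about 0.5 expected from random ranking.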
SYFPEITHI benchmark
(1400 ligands restricted to 46 HLA molecules)




Prediction Primate MHCs

  • Can we predict binding specificities for
    non-human primates using the NetMHCpan
    method trained on human specificity data
    only?
Yes. Monkeys are just like humans
                                               Patr B*0101
      Patr A*0101




 Sidney et al. (2006)   Sidney et al. (2006)




And even Pigs and Cows are (somewhat)
like humans
So, we can find the needle in the haystack




  • Given a protein sequence and an HLA molecule, we can
  accurately predict which peptides will bind (70-95%)
  • 15-80% of these will in turn be epitopes




But, can we find the haystack?
Epitope based vaccines and diagnostics

 • Challenges
   • Identify epitopes in pathogen genome
      • A small viral genome contains >> 1000 potential CTL
        epitopes
   • HLA diversity
      • No two humans will mount the same response to a
        pathogen infection
   • Viral escape and viral genomic diversity
      • No two viral strains will “host” the same set of T
        cell epitopes




 Viral escape and pathogen variability

 The virus of today is different from the virus of
 tomorrow (Viral escape)

               ???
                ??
              ????




                Figure courtesy Mette Voldby Larsen
Pathogen variability




HIV Gag phylogenetic tree
                            Clade C
                                           Few peptides
                                           conserved
                                           between all
                                           viral strains




                                      Clade D
  Clade AE




       Clade A                   Clade B
Immuno-dominance
Dominance
• Highly immunogenic
peptides
• High variability = easily
escaped
• Immune response useless

Sub-dominance
• Weakly immunogenic
peptides
• Low variability = not
escapable
• Immune response highly
effective = good vaccine
candidates




Polyvalent vaccines

  • The equivalent of this in epitope based
    vaccines is to select epitopes in a way so
    that they together cover all strains.

      Uneven coverage, Average coverage = 2

                                              Epitope

      Strain 1
      Strain 2


       Even coverage, Average coverage = 2

       Strain 1
       Strain 2
EpiSelect

$$ S_j^{G} = \sum_i \frac{P_i^j}{\beta + C_i} $$

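
A score of this form lends itself to greedy selection: each epitope j is scored by summing, over the genomes i that contain it, one over a small constant plus the number of times genome i is already targeted, so genomes that are still uncovered dominate the score. The sketch below is a minimal illustration under stated assumptions. The function name, the toy data and the value `beta=0.01` are hypothetical, and the published implementation may differ.

```python
def episelect(presence, n_select, beta=0.01):
    """Greedy selection. presence maps epitope -> set of genomes containing it.
    beta is an assumed small constant that keeps the denominator positive."""
    genomes = set().union(*presence.values())
    hits = {g: 0 for g in genomes}  # C_i: times genome i is already targeted
    selected = []
    for _ in range(min(n_select, len(presence))):
        candidates = [e for e in presence if e not in selected]
        best = max(candidates,
                   key=lambda e: sum(1.0 / (beta + hits[g]) for g in presence[e]))
        selected.append(best)
        for g in presence[best]:
            hits[g] += 1
    return selected, hits

# Toy data: two epitopes found only in strain 1, one found only in strain 2.
presence = {"a": {"strain1"}, "b": {"strain1"}, "c": {"strain2"}}
picked, hits = episelect(presence, n_select=2)
print(picked, hits)
```

After the first pick covers strain 1, the remaining strain 1 epitope scores far lower than the strain 2 epitope, which pushes the selection toward even coverage.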

Cross-clade immunogens
                Table 3. Highly immunogenic epitopes and their cross-clade recognition. 21 HLA-supertype
                restricted epitopes were highly immunogenic and induced a CTL response in at least four subjects.
                The table shows the subtype the responding subjects were infected with and the frequency at which the
                epitope sequence is found among the HIV-1 subtype reference strains.

                Epitope sequence   HLA-supertype         The subtypes                        Frequency of the epitope sequence
                                   & protein region      of the responders                   in subtype (1): A, B, C, D, AE
                QVPLRPMTY          A1-nef                B, B, C, D, AE, nd
                LTDTTNQKT          A1-pol                B, B, B, C, C, AE
                KIQNFRVYY          A1-pol                B, D, AE, nd
                FLGKIWPSHK         A2-gag                A1, A1, A1, B, B, B, B, C, AE, nd
                SLYNTVATL          A2-gag                A1, B, B, B, C, C, C
                GALDLSHFL          A2-nef, var. 1 (2)    A1, B, B, B, C, AG
                AAVDLSHFL          A2-nef, var. 2        A1, B, B, B, AG
                ILKEPVHGV          A2-pol                B, B, B, B, C, C, nd
                QLTEAVQKI          A2-pol                B, C
                AVDLSHFLK          A3-nef, var. 1        A1, B, D, nd
                ALDLSHFLK          A3-nef, var. 2        A1, B, D, nd
                AFDLSFFLK          A3-nef, var. 3        B, C, C, C, C, AE, AE
                WYIKIFIII          A24-env               B, B, B, C, C
                HYMLKHLVW          A24-gag               A1, B, B, C
                IPRRIRQGL          B7-env, var. 1        A1, B, C, AE
                IPRRIRQGF          B7-env, var. 2        A1, B, AE, CPX06
                HPVHAGPVA          B7-gag                A1, B, C, D
                RALGPGATL          B7-gag                A1, B, C, D
                TPQDLNTML          B7-pol                A1, B, C, C
                SPAIFQSSM          B7-pol                A1, A1, B, C, C, D, AE
                QEILDLWVY          B44-nef               A1, A1, B, B, B, C

                (1) The color represents the frequencies of the exact epitope sequence in the different subtypes; blue:
                0%, light blue: 1-24%, orange: 25-49% and red: >50%.
                (2) Subtype variants of the same epitope.
                nd: not determined
Perez et al., JI, 2008
All HIV responsive patients respond to at
least one of nine peptides




                Perez et al., JI, 2008




PopCover - Searching in two dimensions.
            HIV class II case story
 • Data
    – 396 full length genomes with annotated tat, nef, gag and
      pol proteins covering A (50), B (104), C (156), D (40) and
      AE (46) strains
 • HLA-DR frequencies taken from
    – 43 (allele frequency in at least one population > 2.5%) HLA
      class II alleles
       • 36 HLA-DRB1, HLA-DRB3, 4, 5, and 4 HLA-DQ alleles
 • Select predicted peptide binders
    – 5608 (tat), 20961 (nef), 31848 (gag), 42748 (pol)
 • Select peptides from each protein with optimal
   genomic and HLA coverage
    – tat(4), nef(15), gag(15) and pol(15)
EpiSelect and PopCover

     • EpiSelect

       $$ S_j^{G} = \sum_i \frac{P_i^j}{\beta + C_i} $$

     The sum is over all genomes i. P_i^j is 1 if epitope j is present in genome i, and 0 otherwise. C_i is
       the number of times genome i has been targeted by the already selected
       set of epitopes. \beta is a small positive constant that keeps the denominator non-zero.

     • PopCover

       $$ S_j^{A+G} = \sum_i \sum_k \frac{R_{ki}^j \cdot f_k \cdot g_i}{\beta + E_{ki}} $$

     The sum is over all genomes i and HLA alleles k. R_{ki}^j is 1 if epitope j is present
       in genome i and is presented by allele k, E_{ki} is the number of times
       allele k has been targeted in genome i by the already selected
       set of epitopes, f_k is the frequency of allele k, and g_i is the frequency of genome i.
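
The PopCover score extends the same greedy idea to two dimensions, weighting every (genome, allele) pair that an epitope covers by the allele and genome frequencies and down-weighting pairs that are already covered. The sketch below is a minimal illustration, not the published implementation. The data layout, the toy frequencies, the generic allele names and `beta=0.01` are all assumptions.

```python
def popcover(binds, allele_freq, genome_freq, n_select, beta=0.01):
    """Greedy selection. binds maps epitope -> set of (genome, allele) pairs
    where the epitope is present in the genome and presented by the allele."""
    covered = {}   # E_ki: number of selected epitopes hitting each pair
    selected = []

    def score(epitope):
        return sum(allele_freq[k] * genome_freq[i] / (beta + covered.get((i, k), 0))
                   for i, k in binds[epitope])

    for _ in range(min(n_select, len(binds))):
        best = max((e for e in binds if e not in selected), key=score)
        selected.append(best)
        for pair in binds[best]:
            covered[pair] = covered.get(pair, 0) + 1
    return selected

# Toy data: two genomes, two alleles, three candidate epitopes.
genome_freq = {"g1": 0.6, "g2": 0.4}
allele_freq = {"allele1": 0.5, "allele2": 0.5}
binds = {
    "e1": {("g1", "allele1"), ("g1", "allele2")},
    "e2": {("g1", "allele1")},
    "e3": {("g2", "allele1"), ("g2", "allele2")},
}
chosen = popcover(binds, allele_freq, genome_freq, n_select=2)
print(chosen)
```

In the toy data the second pick goes to the epitope covering the other genome, because every pair it covers is still untouched while the competing epitope only revisits a covered pair.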




    Benchmark

     • Create 10,000 virtual patients with a given
       HIV genomic sequence and HLA alleles as
       defined by the HLA allele frequencies and
       HIV genomic data
     • Test how many of these patients are
       targeted by at least one of the selected
       peptides
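
The benchmark described above can be sketched as a small simulation. The version below is deliberately simplified and rests on assumptions not stated on the slide: each virtual patient gets one genome drawn by genome frequency and two alleles drawn independently from a single locus, and a patient counts as targeted if any selected peptide is present in that genome and presented by one of the two alleles. All names and data are hypothetical.

```python
import random

def patient_coverage(binds, allele_freq, genome_freq, selected,
                     n_patients=10000, seed=0):
    """Fraction of simulated patients targeted by at least one selected peptide.
    binds maps peptide -> set of (genome, allele) pairs it covers."""
    rng = random.Random(seed)
    genomes, g_weights = zip(*genome_freq.items())
    alleles, a_weights = zip(*allele_freq.items())
    targeted = 0
    for _ in range(n_patients):
        genome = rng.choices(genomes, weights=g_weights)[0]
        patient_alleles = rng.choices(alleles, weights=a_weights, k=2)
        if any((genome, allele) in binds[p]
               for p in selected for allele in patient_alleles):
            targeted += 1
    return targeted / n_patients

genome_freq = {"g1": 0.6, "g2": 0.4}
allele_freq = {"allele1": 0.5, "allele2": 0.5}
binds = {
    "broad": {(g, a) for g in genome_freq for a in allele_freq},
    "narrow": {("g1", "allele1")},
}
print(patient_coverage(binds, allele_freq, genome_freq, ["broad"]))
print(patient_coverage(binds, allele_freq, genome_freq, ["narrow"]))
```

A realistic benchmark would sample alleles per locus from population-specific frequencies and use the real genome sequences rather than presence flags.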
HIV patient coverage




       •Selected peptide pools
         –tat(4), nef(15), gag(15) and pol(15)




So, have we found the haystack?
MTB (Mycobacterium tuberculosis)

 • Bacterial genome coding for more than
   4000 proteins
 • 700 known epitopes, found in only 30
   proteins (ORFs)




 • Is this biology, or history?
   – More than 500,000 unique 9mer peptides
   – Where to start?
     • Each HLA allele will bind ~5000 of these
       peptides
Functional bias in TB epitope proteins




Where are the epitopes?




So no, we cannot find the haystack?




     But, this is the same problem faced by
              experimental methods!
Conclusions

 • Rational epitope discovery is feasible
    – Prediction methods are an important guide for
      epitope identification
    – Given a protein sequence and an HLA molecule, we can
      predict the peptide binders (find the needle in the
      haystack)
 • Pan-specific MHC prediction methods can deal
   with the immense MHC polymorphism
 • Epitope selection strategies can deal with
   pathogen diversity
 • For large pathogens, we still have no handle on
   how to select immunogenic proteins (we cannot
   find the haystack)




CBS immunology web servers
www.cbs.dtu.dk/services
Acknowledgements
•Immunological Bioinformatics group,
CBS, DTU
    – Ole Lund - Group leader
    – Claus Lundegaard - Data bases, HLA
      binding predictions
• Collaborators
    – IMMI, University of Copenhagen
        • Søren Buus: MHC binding
    – La Jolla Institute of Allergy and
      Infectious Diseases
        • A. Sette, B. Peters: Epitope
          database
• and many, many more




                      www.cbs.dtu.dk/services

Predicting peptide/MHC interactions: Application to epitope identification and vaccine design

  • 1. Predicting peptide/MHC interactions: Application to epitope identification and vaccine design Or Finding the needle in the haystack Morten Nielsen Center for Biological Sequence Analysis Department of Systems biology Technical University of Denmark mniel@cbs.dtu.dk Bridging between two worlds “Para serte sincero, no creo en este approach bioinformatico a la inmunologia, ...” “Hablar de detectar epitopes a partir del genoma de una bacteria entera me parece muy complicado. Me parece impracticable y "misleading", en el sentido de que puede quitar fondos, esfuerzos y atencion a las vias lentas pero seguras de llegar a este proposito por metodos experimentales.” FG, 2006
  • 2. Vaccines have been made for 36 of >400 human pathogens +HPV & Rotavirus Immunological Bioinformatics, The MIT press. Deaths from infectious diseases in the world in 2002 www.who.int/entity/whr/2004/annex/topic/en/annex_2_en.pdf
  • 3. The human immune system Vaccine review
  • 4. MHC Class I pathway Finding the needle in the haystack 1/200 peptides make to the surface Figure by Eric A.J. Reits MHC-I molecules present peptides on the surface of most cells Figure courtesy Mette Voldby Larsen
  • 5. CTL response Figure courtesy Mette Voldby Larsen The death of an infected cell
  • 6. Antigen Discovery Lauemøller et al., 2000 Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) >Segment 1 Genome agcaaaagcaggtcaattatattcaatatggaaagaataaaagaactaagagatctaatg tcgcagtcccgcactcgcgagatactaacaaaaaccactgtggatcatatggccataatc aagaaatacacatcaggaagacaagagaagaaccctgctctcagaatgaaatggatgatg gcaatgaaatatccaatcacagcagacaagagaataatggagatgattcctgaaaggaat and 13350 other nucleotides on 8 segments 9mer peptides >polymerase“ Proteins MERIKELRD ERIKELRDL MERIKELRDLMSQSRTREILTKTTVDHMAIIKKYTSGRQEKNPALRMKWMMAMKYPITAD RIKELRDLM KRIMEMIPERNEQGQTLWSKTNDAGSDRVMVSPLAVTWWNRNGPTTSTVHYPKVYKTYFE IKELRDLMS KVERLKHGTFGPVHFRNQVKIRRRVDINPGHADLSAKEAQDVIMEVVFPNEVGARILTSE KELRDLMSQ SQLTITKEKKEELQDCKIAPLMVAYMLERELVRKTRFLPVAGGTSSVYIEVLHLTQGTCW ELRDLMSQS EQMYTPGGEVRNDDVDQSLIIAARNIVRRATVSADPLASLLEMCHSTQIGGIRMVDILRQ LRDLMSQSR NPTEEQAVDICKAAMGLRISSSFSFGGFTFKRTNGSSVKKEEEVLTGNLQTLKIKVHEGY RDLMSQSRT EEFTMVGRRATAILRKATRRLIQLIVSGRDEQSIAEAIIVAMVFSQEDCMIKAVRGDLNF DLMSQSRTR ... LMSQSRTRE and 9 other proteins and 4376 other 9mers
  • 7. Experimental validation of Bioinformatics predictions HLA Elispot assay1 Elispot assay2 Peptide Sequence Restriction KD (nM) + peptide - peptide + peptide - peptide PB1591-599 VSDGGPNLY HLA-A1 6 18 ± 2 3±3 12 ± 4 1±1 NP44-52 CTELKLSDY HLA-A1 7 34 ± 5 4±1 13 ± 4 0±0 PB1166-174 FLKDVMESM HLA-A2 51 74 ± 10 11 ± 6 140 ± 36 20 ± 7 PB141-49 DTVNRTHQY HLA-A26 6 40 ± 3 20 ± 7 38 ± 5 24 ± 3 PB1540-548 GPATAQMAL HLA-B7 6 7±2 2±1 13 ± 2 6±1 NP225-233 ILKGKFQTA HLA-B8 664 9±4 1±1 19 ± 7 2±2 PA601-609 SVKEKDMTK HLA-B8 NB 23 ± 6 1±1 119 ± 8 2±1 PB1349-357 ARLGKGYMF HLA-B27 246 10 ± 6 1±1 14 ± 4 1±1 NP383-391 SRYWAIRTR HLA-B27 38 39 ± 6 1±1 40 ± 6 2±1 M1173-181 IRHENRMVL HLA-B39 13 14 ± 3 3±1 84 ± 11 3±1 NP199-207 RGINDRNFW HLA-B58 42 28 ± 5 1±1 15 ± 6 2±2 PB1347-355 KMARLGKGY HLA-B62 178 77 ± 20 3±2 91 ± 8 10 ± 3 PB1566-574 TQIQTRRSF HLA-B62 88 15 ± 5 2±2 21 ± 2 2±0 Wang et al., 2006 Epitope based vaccines and diagnostics • Challenges • Identify epitopes in pathogen genome • A small viral genome contains >> 1000 potential CTL epitopes • Bacteria genomes contain 100.000 distinct peptides • HLA diversity • No two humans will induce the same reaction to a pathogen infection • Viral escape and viral genomic diversity • No two viral strains will “host” the same set of T cell epitopes
  • 8. HLA diversity. Expression of HLA is co-dominant MHC specificity Figure courtesy of Can Kesmir
  • 9. HLA polymorphism HLA polymorphism • Few human beings will share the same set of HLA alleles – Different persons will react to a pathogen infection in a different manner • A T cell based vaccine must include epitopes specific for each HLA allele in a population – A peptide based vaccine must consist of many 100 HLA class I epitopes – (and ~1000 class II epitopes)
  • 10. Figure by Anne Mølgaard HLA binding motif SLLPAIVEL YLLPAIVHI TLWVDPYEV GLVPFLVSV KLLEPVLLL LLDVPTAAV LLDVPTAAV LLDVPTAAV LLDVPTAAV VLFRGGPRG MVDGTLLLL YMNGTMSQV MLLSVPLLL SLLGLLVEV ALLPPINIL TLIKIQHTL HLIDYLVTS ILAPPVVKL ALFPQLVIL GILGFVFTL STNRQSGRQ GLDVLTAKV RILGAVAKV QVCERIPTI ILFGHENRV ILMEHIHKL ILDQKINEV SLAGGIIGV LLIENVASL FLLWATAEA SLPDFGISY KKREEAPSL LERPGGNEI ALSNLEVKL ALNELLQHV DLERKVESL FLGENISNF ALSDHHIYL GLSEFTEYL STAPPAHGV PLDGEYFTL GVLVGVALI RTLDKVLEV HLSTAFARV RLDSYVRSL YMNGTMSQV GILGFVFTL ILKEPVHGV ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CLGGLLTMV FIAGNSAYE KLGEFYNQM KLVALGINA DLMGYIPLV RLVTLKDIV MLLAVLYCL AAGIGILTV YLEPGPVTA LLDGTATLR ITDQVPFSV KTWGQYWQV TITDQVPFS AFHHVAREL YLNKIQNSL MMRKLAILS AIMDKNIIL IMDKNIILK SMVGNWAKV SLLAPGAKQ KIFGSLAFL ELVSEFSRM KLTPLCVTL VLYRYGSFS YIGEVLVSV CINGVCWTV VMNILLQYV ILTVILGVL KVLEYVIKV FLWGPRALV GLSRYVARL FLLTRILTI HLGNVKYLV GIAGGLALL GLQDCTMLV TGAPVTYST VIYQYMDDL VLPDVFIRC VLPDVFIRC AVGIGIAVV LVVLGLLAV ALGLGLLPV GIGIGVLAA GAGIGVAVL IAGIGILAI LIVIGILIL LAGIGLIAA VDGIGILTI GAGIGVLTA AAGIGIIQI QAGIGILLA KARDPHSGH KACDPHSGH ACDPHSGHF SLYNTVATL RGPGRAFVT NLVPMVATV GLHCYEQLV PLKQHFQIV AVFDRKSDA LLDFVRFMG VLVKSPNHV GLAPPQHLI LLGRNSFEV PLTFGWCYK VLEWRFDSR TLNAWVKVV GLCTLVAML FIDSYICQV IISAVVGIL VMAGVGSPY LLWTLVVLL SVRDRLARL LLMDCSGSI CLTSTVQLV VLHDDLLEA LMWITQCFL SLLMWITQC QLSLLMWIT LLGATCMFV RLTRFLSRV YMDGTMSQV FLTPKKLQC ISNDVCAQV VKTDGNPPE SVYDFFVWL FLYGALLLA VLFSSDFRI LMWAKIGPV SLLLELEEV SLSRFSWGA YTAFTIPSI RLMKQDFSV RLPRIFCSC FLWGPRAYA RLLQETELV SLFEGIDFY SLDQSVVEL RLNMFTPYI NMFTPYIGV LMIIPLINV TLFIGSHVV SLVIVTTFV VLQWASLAV ILAKFLHWL STAPPHVNV LLLLTVLTV VVLGVVFGI ILHNGAYSL MIMVKCWMI MLGTHTMEV MLGTHTMEV SLADTNSLA LLWAARPRL GVALQTMKQ GLYDGMEHL KMVELVHFL YLQLVFGIE MLMAQEALA LMAQEALAF VYDGREHTV YLSGANLNL RMFPNAPYL EAAGIGILT TLDSQVMSL STPPPGTRV KVAELVHFL IMIGVLVGV ALCRWGLLL LLFAGVQCQ VLLCESTAV YLSTAFARV YLLEMLWRL SLDDYNHLV RTLDKVLEV GLPVEYLQV KLIANNTRV FIYAGSLSA KLVANNTRL FLDEFMEGV ALQPGTALL VLDGLDVLL 
SLYSFPEPE ALYVDSLFF SLLQHLIGL ELTLGEFLK MINAYLDKL AAGIGILTV FLPSDFFPS SVRDRLARL SLREWLLRI LLSAWILTA AAGIGILTV AVPDEIPPL FAYDGKDYI AAGIGILTV FLPSDFFPS AAGIGILTV FLPSDFFPS AAGIGILTV FLWGPRALV ETVSEQSNV ITLWQRPLV
  • 11. HLA binding specificity High information positions If we have binding data, can we accurately describe the binding specificity? HLA specificity clustering A0201 A6802 A0101 B0702
  • 12. Coverage of HLA alleles
      Supertype     Selected allele
      A1            A*0101
      A2            A*0201
      A3            A*1101
      A24           A*2401
      A26 (new*)    A*2601
      B7            B*0702
      B8 (new*)     B*0801
      B27           B*2705
      B39 (new*)    B*3901
      B44           B*4001
      B58           B*5801
      B62           B*1501
    Clustering in: O Lund et al., Immunogenetics. 2004 55:797-810 Data • Alleles characterized with 5 or more data points • 3% covered
  • 13. HLA polymorphism! B0807 B4804 B0710 B1513 A6817 B5130 A0204 B3503 A2415 B0740 B3929 A0250 B5204 A2420 B1804 B3523 B3502 A3202 B0802 A3601 B4047 A6601 A0268 B0817 B5002 B5602 B3811 B4810 A0103 B1530 B4415 A3111 B7803 A6804 B3520 B3528 A2610 A6802 A2404 A7406 B0744 B3701 B4058 B1803 B1527 B3801 A6826 B5606 B0725 B5603 A0110 B1586 A3205 A0212 B3511 A2603 B5120 A0251 A3106 A6801 B5135 B1567 B4012 A3401 B5106 B3912 B1525 B5703 B4402 B0733 A2901 B0711 A6603 B3907 B4023 B2717 B4507 B4502 B4807 A2438 B1312 B1590 A0258 B5310 B5124 B4103 B0811 B3927 B4104 A1110 B1553 A2621 B5115 B1599 A0102 B5102 A0207 B4444 A3002 A6813 B5709 B5515 B4439 B1561 A2618 B2728 A3404 A6820 A3107 A2430 A0235 A2914 B1301 B4004 A2620 B1573 A0259 B0804 B1548 A2616 B5401 B0707 A2453 A2609 B3554 A0245 B4411 A0220 B1510 A2433 B5512 B5306 B1540 B5114 B3934 B5510 B1521 B0810 B5137 B3932 B4802 B4044 B3709 B3915 B2729 B3810 A0238 B0729 B3537 A2314 B0734 B3702 A0214 B4805 A0269 A3102 B5206 A6819 B3707 A3011 A1123 B1822 A6823 A4301 B3917 B4702 B5118 B3708 A0265 B5203 A3013 B3530 B4701 B4061 A0316 B4814 B2710 A7411 B3930 B0702 B5702 A1107 B7801 A0246 B3534 A0228 B1596 A3305 B2711 B3526 B4445 A0216 B1539 A3308 A2455 A0206 B4605 B2725 A0310 B4037 A1104 A2622 B5607 B4504 B4602 B1598 A3112 B0813 B5113 A0237 A3602 B0805 A6808 B4505 B1544 A0285 A3108 B5402 B6701 A6901 B0730 B4056 B5205 B1310 B5805 B1404 A2435 A2614 A7405 B1520 B3920 A0254 B2702 A6815 A3201 B1570 A0255 B5708 B4033 B4435 A2405 B4007 B4034 B4806 B5615 A0218 B3527 B3512 B0814 B5301 A6829 B4904 B4038 A0304 A7408 B7805 B3549 B1503 B4420 A1120 B1815 B5129 B0801 B0827 B5001 A3402 A0314 B4405 A2305 B4438 B4052 B0823 A8001 B1302 B4021 A2909 B3933 B4408 B4105 B0727 B5508 B4108 A3405 B1315 B3517 A1116 B0731 B4053 B1516 B4704 B1403 A6830 B5610 A3009 B0714 B1303 B1566 B2714 B3923 B5801 A2439 B2719 A0219 A2602 A2413 B1821 A0260 B4410 A6605 B1309 B8202 B4426 A2623 B4042 B1805 B3902 A2503 B1536 A0302 A3209 A0205 B2715 B5131 A0262 A6805 B5201 A1119 B1402 A0270 
A2450 A1111 A3008 B3806 A6822 A0202 B5503 B0826 B3926 A2428 A1114 A2414 A3301 A0239 B4054 B0825 A0308 B3563 A0305 B4036 B1589 B1314 B1563 B4005 A3104 B4440 B5122 A3206 B7804 B0718 B4446 B4905 B9509 A0112 A0256 A6604 B4029 B1807 B5901 A2906 B1304 B3501 A2502 B5509 B4107 B2707 A0117 B4032 B3914 B3509 A3306 A6602 B1504 B5611 A2904 B3535 A2447 B6702 B1572 A2417 B1811 A2452 B3542 A2612 B1542 B1507 B5406 B3911 A2421 A2443 B4404 A3015 B5704 B4437 B4427 B8101 B4002 B3901 A1103 B3928 A2408 A6827 B1517 B0824 B1576 B4601 A2303 B4811 B4003 A2605 B1505 B4808 A7407 B1809 A0222 B4031 B1511 B4429 B1564 A2406 B1515 B5601 A2301 B4101 B3506 A0113 B5710 A7404 B3531 A0201 B4902 B1581 A2907 B4431 A0252 B4102 A2601 A6825 B5116 B5608 B4201 B5110 B4422 B2720 B2727 A3304 B1306 A2425 B5501 A0233 B0736 A2423 B1549 A1109 B3558 B5134 B5139 A0289 B5121 B4208 A0271 B2705 A2407 B4501 B3550 A2410 B2706 B1552 A1101 A0273 B1546 B3905 B4409 B5808 A2313 B0706 B1534 B5138 B0803 A2429 B5507 A6810 B1405 B2713 B3547 B4013 A3003 B5119 A3010 B0726 A3204 B3552 B3802 A3105 B4062 B4018 B4403 B1550 A0317 B4432 B4433 B3551 B9505 B8201 A3303 B5804 B4008 A0208 A0230 B1819 B2726 B3533 B4428 B5404 A0267 B1529 B4046 A0106 B9507 B3505 B4016 B3922 A7410 B1509 B0822 A3012 A0319 B4503 B5207 B1531 B3904 A2910 B5613 B0717 A2403 A2912 B3510 B0818 B5806 B0724 B7802 B3561 B0728 B1585 B2730 B4030 B4604 B3513 B3809 B5403 B3529 A2617 A3110 B5128 B3504 B3924 B3539 B5511 B5103 B5109 B5604 B1575 A3007 A2627 B3536 A2437 B3805 B4812 A1113 B5518 B3803 A0313 B3514 B9502 A6816 B3808 A2911 A0108 B1524 A2606 B1578 B1538 A2504 B1813 B4407 A0244 B1556 B5307 A0272 A2608 B2723 A2913 A2619 A0231 B2721 B4051 B1551 B5112 B4035 B2701 A0209 B0806 B4418 A2454 A2902 B8301 B4057 B5520 A2903 A6824 B1545 A0275 B4417 A0114 B3548 A0322 B0732 B4059 B3918 A0241 B5132 A2444 B4430 B0739 A3006 B2724 B1818 A2418 A3103 B5514 B0723 A2456 B4060 B5308 B3559 B1547 B5616 B4205 A7402 B4421 B4001 B1597 B5101 B1308 B4406 B4015 A2309 B8102 B0720 B4813 B3557 A6812 A2419 
A0277 B4703 B5605 B9506 B3545 A0261 A2615 B5504 B4436 A7403 B1502 B3935 A2312 B4441 A3307 B1592 B0703 B4803 B0708 B5133 B1587 A0225 B5311 B0745 B5519 A0263 B1562 A2458 A2501 B4020 B4009 A6803 A0278 A3004 B4606 B1574 B1535 B1583 B1820 B3909 A2427 B5208 A0234 B0715 B0743 B0709 B5305 A0236 A0274 A2310 B4901 B5706 A2441 B5126 A2426 A1102 A2446 A0307 B1554 A0318 A3001 B1588 B3524 B3936 B3519 B4603 A2442 B1812 A0227 A2424 B0741 A1117 B3546 HLA polymorphism! B1513 B3811 A3106 B3912 B5102 A3107 B3709 A2314 A7411 A0216 A3108 A2405 B4052 B4408 B4426 A0302 B4036 B5901 A2904 A3001 B1515 B4422 A0273 B4403 B5207 B3514 B1578 A6824 B2724 B5605 A2458 B0709 A2442
  • 14. Predicting the specificity Align A3001 (365) versus A3002 (365). Aln score 2445.000 Aln len 365 Id 0.9890 A3001 0 MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFSTSVSRPGSGEPRFIAVGYVDDTQFVRFDSDAA ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: A3002 0 MAVMAPRTLLLLLSGALALTQTWAGSHSMRYFSTSVSRPGSGEPRFIAVGYVDDTQFVRFDSDAA A3001 65 SQRMEPRAPWIEQERPEYWDQETRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQIMYGCDVGSD :::::::::::::::::::::::::::: ::::: ::::::::::::::::::::::::::::: A3002 65 SQRMEPRAPWIEQERPEYWDQETRNVKAHSQTDRENLGTLRGYYNQSEAGSHTIQIMYGCDVGSD A3001 130 GRFLRGYEQHAYDGKDYIALNEDLRSWTAADMAAQITQRKWEAARWAEQLRAYLEGTCVEWLRRY ::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::: A3002 130 GRFLRGYEQHAYDGKDYIALNEDLRSWTAADMAAQITQRKWEAARRAEQLRAYLEGTCVEWLRRY A3001 195 LENGKETLQRTDPPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPA ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: A3002 195 LENGKETLQRTDPPKTHMTHHPISDHEATLRCWALGFYPAEITLTWQRDGEDQTQDTELVETRPA A3001 260 GDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWELSSQPTIPIVGIIAGLVLLGAVITGA ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: A3002 260 GDGTFQKWAAVVVPSGEEQRYTCHVQHEGLPKPLTLRWELSSQPTIPIVGIIAGLVLLGAVITGA A3001 325 VVAAVMWRRKSSDRKGGSYTQAASSDSAQGSDVSLTACKV :::::::::::::::::::::::::::::::::::::::: A3002 325 VVAAVMWRRKSSDRKGGSYTQAASSDSAQGSDVSLTACKV HLA-A*3001 HLA-A*3002
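The identity value reported for the alignment above can be checked with a short calculation. The following is a minimal sketch, not the tool that produced the slide: the function name `fraction_identity` is hypothetical, and only the second 65-residue block of the alignment (positions 65 to 129) is used as input. Over the full alignment, the reported Id of 0.9890 is consistent with four mismatches in 365 positions (361/365 ≈ 0.989).

```python
def fraction_identity(aligned_a, aligned_b):
    """Fraction of aligned columns holding the same residue.

    Hypothetical helper: assumes both strings come from one alignment
    and therefore have equal length; gap columns ('-') never count
    as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return matches / len(aligned_a)


# Second block of the A*3001 / A*3002 alignment shown on the slide.
a3001_block = "SQRMEPRAPWIEQERPEYWDQETRNVKAQSQTDRVDLGTLRGYYNQSEAGSHTIQIMYGCDVGSD"
a3002_block = "SQRMEPRAPWIEQERPEYWDQETRNVKAHSQTDRENLGTLRGYYNQSEAGSHTIQIMYGCDVGSD"

# 0-based positions inside the block where the two alleles differ.
mismatch_positions = [
    i for i, (a, b) in enumerate(zip(a3001_block, a3002_block)) if a != b
]
block_identity = fraction_identity(a3001_block, a3002_block)
```

Two alleles that differ at only a handful of positions can still differ in binding specificity if those positions sit in the peptide-binding pockets, which is why the pan-specific approach looks at pocket residues rather than overall identity.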
  • 15. NetMHCpan - a pan-specific method NetMHC NetMHCpan NetMHCpan, a Method for Quantitative Predictions of Peptide Binding to Any HLA-A and -B Locus Protein of Known Sequence. Nielsen et al. PLoS ONE 2007 Example
      Peptide    Amino acids of HLA pockets  HLA    Aff
      VVLQQHSIA  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.131751
      SQVSFQQPL  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.487500
      SQCQAIHNV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.364186
      LQQSTYQLV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.582749
      LQPFLQPQL  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.206700
      VLAGLLGNV  YFAVLTWYGEKVHTHVDTLVRYHY    A0201  0.727865
      VLAGLLGNV  YFAVWTWYGEKVHTHVDTLLRYHY    A0202  0.706274
      VLAGLLGNV  YFAEWTWYGEKVHTHVDTLVRYHY    A0203  1.000000
      VLAGLLGNV  YYAVLTWYGEKVHTHVDTLVRYHY    A0206  0.682619
      VLAGLLGNV  YYAVWTWYRNNVQTDVDTLIRYHY    A6802  0.407855
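The example table pairs each peptide with the pocket residues of the HLA molecule, so that one model can be trained across alleles. The sketch below illustrates that idea only. The one-hot encoding, the helper names `one_hot` and `encode_pair`, and the feature layout are assumptions made for illustration; they are not taken from the slide and are not claimed to match the encoding used in the published method.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Hypothetical encoding: 20 numbers per residue, a single 1.0 each."""
    vector = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1.0
        vector.extend(row)
    return vector


def encode_pair(peptide, pocket_residues):
    """Concatenate peptide and HLA pocket residues into one input vector.

    Because the HLA molecule is part of the input, the same peptide
    yields different inputs (and can yield different predicted
    affinities) for different alleles.
    """
    return one_hot(peptide + pocket_residues)


# Two rows from the slide: same peptide, two different HLA molecules.
examples = [
    ("VLAGLLGNV", "YFAVLTWYGEKVHTHVDTLVRYHY", 0.727865),  # A0201
    ("VLAGLLGNV", "YYAVWTWYRNNVQTDVDTLIRYHY", 0.407855),  # A6802
]
features = [encode_pair(peptide, pockets) for peptide, pockets, _ in examples]
targets = [affinity for _, _, affinity in examples]
```

The peptide part of both feature vectors is identical. Only the pocket part differs, which is what allows a model trained on characterized alleles to produce predictions for an allele known only by its sequence.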
  • 16. Prediction for novel HLA alleles 75 - 100% accuracy
      MHC allele A*8001              MHC allele A*7401
      Sequence   KD-value (nM)       Sequence   KD-value (nM)
      HSNASTLLY  <1                  RVYHLTWLR  1
      KVDWNQFTY  <1                  TTMGWLFLK  1
      WMSNGTWNY  <1                  MMHEFFGPR  3
      LTAHYCFLY  1                   KTYAPLAFR  3
      GMFSWNLAY  3                   HMMKRMSYR  4
      LVFLGPGLY  6                   KVNNHLFHR  10
      MTDVDLNYY  10                  MTMFVTASK  12
      VIAAIHNAY  36                  MAMSNYLLR  14
      SMIYFFHHY  1,454               MVAGRTPFK  63
      LMDHWRGYK  16,543              IVFAFHFYR  188
      LSNFGYPGY  non                 SVYFWWLNR  402
    Evaluation. MHC ligands from SYFPEITHI Sort on binding Top Rank: F-rank=0.0 Random Rank: F-rank=0.5
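The evaluation above ranks a known ligand against the other peptides after sorting on predicted binding. The slide gives only the two reference points (top rank 0.0, random rank 0.5), so the sketch below is one reading of the measure: the fraction of competing peptides that score higher than the ligand. The function name `f_rank` and the toy scores are hypothetical.

```python
def f_rank(scores, ligand):
    """Fraction of the other peptides predicted to bind better than the ligand.

    scores: dict mapping peptide -> predicted binding score (higher = stronger).
    Returns 0.0 when the ligand is top ranked; a randomly placed ligand
    averages about 0.5. This definition is an assumption based on the
    two reference values given on the slide.
    """
    others = [pep for pep in scores if pep != ligand]
    if not others:
        raise ValueError("need at least one competing peptide")
    higher = sum(1 for pep in others if scores[pep] > scores[ligand])
    return higher / len(others)


# Toy prediction scores (made up for illustration only).
toy_scores = {"HSNASTLLY": 0.9, "KVDWNQFTY": 0.5, "LSNFGYPGY": 0.1}
top = f_rank(toy_scores, "HSNASTLLY")
middle = f_rank(toy_scores, "KVDWNQFTY")
```

Averaging this value over all ligands in a benchmark gives a single number per allele, where lower is better.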
  • 17. SYFPEITHI benchmark (1400 ligands restricted to 46 HLA molecules) Prediction Primate MHCs • Can we predict binding specificities for non-human primates using the NetMHCpan method trained on human specificity data only?
  • 18. Yes. Monkeys are just like humans Patr B*0101 Patr A*0101 Sidney et al. (2006) Sidney et al. (2006) And even pigs and cows are (somewhat) like humans
  • 19. So, we can find the needle in the haystack • Given a protein sequence and an HLA molecule, we can accurately predict which peptides will bind (70-95%) • 15-80% of these will in turn be epitopes But, can we find the haystack?
  • 20. Epitope based vaccines and diagnostics • Challenges • Identify epitopes in pathogen genome • A small viral genome contains >> 1000 potential CTL epitopes • HLA diversity • No two humans will induce the same reaction to a pathogen infection • Viral escape and viral genomic diversity • No two viral strains will “host” the same set of T cell epitopes Viral escape and pathogen variability The virus of today is different from the virus of tomorrow (Viral escape) Figure courtesy Mette Voldby Larsen
  • 21. Pathogen variability HIV Gag phylogenetic tree Clade C Few peptides conserved between all viral strains Clade D Clade AE Clade A Clade B
  • 22. Immuno-dominance Dominance • Highly immunogenic peptides • High variability = easily escapable • Immune response useless Sub-dominance • Weakly immunogenic peptides • Low variability = not escapable • Immune response highly effective = good vaccine candidates Polyvalent vaccines • The equivalent of this in epitope based vaccines is to select epitopes so that together they cover all strains. Uneven coverage, Average coverage = 2 Epitope Strain 1 Strain 2 Even coverage, Average coverage = 2 Strain 1 Strain 2
  • 23. EpiSelect $S^j_G = \sum_i \frac{P^j_i}{\beta + C_i}$ Cross-clade immunogens Table 3 Highly immunogenic epitopes and their cross-clade recognition. 21 HLA-supertype restricted epitopes were highly immunogenic and induced a CTL-response in at least four subjects. The table shows the subtype the responding subjects were infected with and at which frequency the epitope sequence is found among the HIV-1 subtype reference strains.
      Epitope sequence  HLA-supertype & protein region  The subtypes of the responders
      QVPLRPMTY         A1-nef                          B, B, C, D, AE, nd
      LTDTTNQKT         A1-pol                          B, B, B, C, C, AE
      KIQNFRVYY         A1-pol                          B, D, AE, nd
      FLGKIWPSHK        A2-gag                          A1, A1, A1, B, B, B, B, C, AE, nd
      SLYNTVATL         A2-gag                          A1, B, B, B, C, C, C
      GALDLSHFL         A2-nef, var. 1 [2]              A1, B, B, B, C, AG
      AAVDLSHFL         A2-nef, var. 2                  A1, B, B, B, AG
      ILKEPVHGV         A2-pol                          B, B, B, B, C, C, nd
      QLTEAVQKI         A2-pol                          B, C
      AVDLSHFLK         A3-nef, var. 1                  A1, B, D, nd
      ALDLSHFLK         A3-nef, var. 2                  A1, B, D, nd
      AFDLSFFLK         A3-nef, var. 3                  B, C, C, C, C, AE, AE
      WYIKIFIII         A24-env                         B, B, B, C, C
      HYMLKHLVW         A24-gag                         A1, B, B, C
      IPRRIRQGL         B7-env, var 1                   A1, B, C, AE
      IPRRIRQGF         B7-env, var 2                   A1, B, AE, CPX06
      HPVHAGPVA         B7-gag                          A1, B, C, D
      RALGPGATL         B7-gag                          A1, B, C, D
      TPQDLNTML         B7-pol                          A1, B, C, C
      SPAIFQSSM         B7-pol                          A1, A1, B, C, C, D, AE
      QEILDLWVY         B44-nef                         A1, A1, B, B, B, C
    Frequency of the epitope sequence in subtype [1]: A B C D AE (color-coded cells)
    [1] The color represents the frequencies of the exact epitope sequence in the different subtypes; blue: 0%, light blue: 1-24%, orange: 25-49% and red: >50%. [2] Subtype variants of the same epitope. nd: not determined Perez et al., JI, 2008
  • 24. All HIV responsive patients respond to at least one of nine peptides Perez et al., JI, 2008 PopCover - Searching in two dimensions. HIV class II case story • Data – 396 full length genomes with annotated tat, nef, gag and pol proteins covering A(50), B(104), C(156), D(40) and AE(46) strains • HLA-DR frequencies taken from – 43 (allele frequency in at least one population > 2.5%) HLA class II alleles • 36 HLA-DRB1, HLA-DR3,4,5, and 4 HLA-DQ alleles • Select predicted peptide binders – 5608 (tat), 20961 (nef), 31848 (gag), 42748 (pol) • Select peptides from each protein with optimal genomic and HLA coverage – tat(4), nef(15), gag(15) and pol(15)
  • 25. EpiSelect and PopCover • EpiSelect: $S^j_G = \sum_i \frac{P^j_i}{\beta + C_i}$ The sum is over all genomes i. $P^j_i$ is 1 if epitope j is present in genome i. $C_i$ is the number of times genome i has been targeted by the already selected set of epitopes. • PopCover: $S^j_{A+G} = \sum_i \sum_k \frac{R^j_{ki} \cdot f_k \cdot g_i}{\beta + E_{ki}}$ The sum is over all genomes i and HLA alleles k. $R^j_{ki}$ is 1 if epitope j is present in genome i and is presented by allele k, $E_{ki}$ is the number of times allele k has been targeted by epitopes in genome i in the already selected set of epitopes, $f_k$ is the frequency of allele k, and $g_i$ is the frequency of genome i. Benchmark • Create 10,000 virtual patients with a given HIV genomic sequence and HLA alleles as defined by the HLA allele frequencies and HIV genomic data • Test how many of these patients are targeted by at least one of the selected peptides
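The PopCover score lends itself to a greedy selection loop: score every candidate, pick the best, update the coverage counts, and repeat. The following is a minimal sketch of that loop under stated assumptions, not the authors' implementation. The function name `popcover_select`, the data layout (each epitope mapped to the set of (genome, allele) pairs it hits), the tie-breaking rule, and the value 0.01 for the offset in the denominator are all assumptions made for illustration.

```python
def popcover_select(hits, allele_freq, genome_freq, n_select, beta=0.01):
    """Greedy epitope selection using a PopCover-style score.

    hits: dict epitope -> set of (genome, allele) pairs for which the
          epitope is present in the genome and presented by the allele.
    beta: small offset in the denominator (assumed value).
    """
    covered = {}  # (genome, allele) -> times already targeted
    selected = []
    candidates = set(hits)

    def score(epitope):
        return sum(
            allele_freq[allele] * genome_freq[genome]
            / (beta + covered.get((genome, allele), 0))
            for genome, allele in hits[epitope]
        )

    while candidates and len(selected) < n_select:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        for pair in hits[best]:
            covered[pair] = covered.get(pair, 0) + 1
    return selected


# Toy data (made up): two genomes, two alleles, three candidate epitopes.
toy_hits = {
    "e1": {("g1", "a1"), ("g1", "a2"), ("g2", "a1")},
    "e2": {("g1", "a1"), ("g1", "a2")},  # redundant with e1
    "e3": {("g2", "a2")},                # covers the remaining pair
}
toy_allele_freq = {"a1": 0.5, "a2": 0.5}
toy_genome_freq = {"g1": 0.5, "g2": 0.5}
chosen = popcover_select(toy_hits, toy_allele_freq, toy_genome_freq, n_select=2)
```

In the toy data the second pick is `e3` rather than `e2`, even though `e2` hits more pairs, because everything `e2` covers was already targeted by `e1`. With a single allele of frequency 1 and all genome frequencies set to 1, the same loop reduces to the EpiSelect score.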
  • 26. HIV patient coverage •Selected peptide pools –tat(4), nef(15), gag(15) and pol(15) So, have we found the haystack?
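Patient coverage of a selected peptide pool can be estimated by simulating virtual patients, as in the benchmark on 10,000 virtual patients. The sketch below is an illustration under assumptions: the function name `patient_coverage`, the data layout, the fixed random seed, and the choice of two independently drawn alleles per patient are all hypothetical and not taken from the slides.

```python
import random


def patient_coverage(selected_hits, genome_freq, allele_freq,
                     n_patients=10000, alleles_per_patient=2, seed=0):
    """Fraction of virtual patients targeted by at least one selected peptide.

    selected_hits: dict peptide -> set of (genome, allele) pairs for which
                   the peptide is present and presented.
    Each virtual patient gets one viral genome and a few HLA alleles,
    drawn according to the given frequencies (assumed sampling scheme).
    """
    rng = random.Random(seed)
    genomes = list(genome_freq)
    genome_weights = [genome_freq[g] for g in genomes]
    alleles = list(allele_freq)
    allele_weights = [allele_freq[a] for a in alleles]

    targeted_pairs = set()
    for pairs in selected_hits.values():
        targeted_pairs |= pairs

    covered = 0
    for _ in range(n_patients):
        genome = rng.choices(genomes, weights=genome_weights)[0]
        patient_alleles = rng.choices(
            alleles, weights=allele_weights, k=alleles_per_patient
        )
        if any((genome, allele) in targeted_pairs for allele in patient_alleles):
            covered += 1
    return covered / n_patients


# Toy data (made up): a pool that hits every genome/allele pair, and an empty pool.
toy_genomes = {"g1": 0.5, "g2": 0.5}
toy_alleles = {"a1": 0.5, "a2": 0.5}
full_pool = {"e1": {("g1", "a1"), ("g1", "a2")}, "e2": {("g2", "a1"), ("g2", "a2")}}
full = patient_coverage(full_pool, toy_genomes, toy_alleles, n_patients=1000)
none = patient_coverage({}, toy_genomes, toy_alleles, n_patients=1000)
```

Comparing this fraction for pools chosen by different selection strategies gives a direct measure of how well each pool covers the patient population.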
  • 27. MTB (Mycobacterium tuberculosis) • Bacterial genome coding for more than 4000 proteins • 700 known epitopes, found in only 30 proteins (ORFs) • Is this biology, or history? – More than 500,000 unique 9mer peptides – Where to start? • Each HLA allele will bind ~5000 of these peptides.
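The size of the search space quoted above comes from sliding a 9-residue window over every protein and keeping the distinct peptides. The sketch below shows that enumeration on made-up toy sequences; the function name `unique_kmers` and the example proteins are hypothetical. The last line only restates the arithmetic implied by the slide: about 5000 binders among 500,000 peptides is roughly 1% per HLA allele.

```python
def unique_kmers(proteins, k=9):
    """Collect the distinct length-k peptides found in a set of proteins."""
    kmers = set()
    for sequence in proteins:
        for start in range(len(sequence) - k + 1):
            kmers.add(sequence[start:start + k])
    return kmers


# Two short, made-up protein fragments that overlap in sequence.
toy_proteins = ["MKTAYIAKQRQ", "KTAYIAKQRQI"]
toy_9mers = unique_kmers(toy_proteins)

# Approximate fraction of the 9mers bound by one HLA allele,
# using the round numbers from the slide.
binder_fraction = 5000 / 500000
```

Shared peptides are counted once, which is why the count is reported as unique 9mers rather than as the total number of windows.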
  • 28. Functional bias in TB epitope proteins
  • 29. Where are the epitopes? So no, we cannot find the haystack? But this is the same problem faced by experimental methods!
  • 30. Conclusions • Rational epitope discovery is feasible – Prediction methods are an important guide for epitope identification – Given a protein sequence and an HLA molecule, we can predict the peptide binders (find the needle in the haystack) • Pan-specific MHC prediction methods can deal with the immense MHC polymorphism • Epitope selection strategies can deal with pathogen diversity • For large pathogens, we still have no handle on how to select immunogenic proteins (we cannot find the haystack) CBS immunology web servers www.cbs.dtu.dk/services
  • 31. Acknowledgements • Immunological Bioinformatics group, CBS, DTU – Ole Lund - Group leader – Claus Lundegaard - Databases, HLA binding predictions • Collaborators – IMMI, University of Copenhagen • Søren Buus: MHC binding – La Jolla Institute of Allergy and Infectious Diseases • A. Sette, B. Peters: Epitope database • and many, many more www.cbs.dtu.dk/services