Towards peptide vaccines against Zika virus: Immunoinformatics combined with molecular dynamics simulations to predict antigenic epitopes of Zika viral proteins

The recent outbreak of Zika virus (ZIKV) infection in Brazil has developed to a global health concern due to its likely association with birth defects (primary microcephaly) and neurological complications. Consequently, there is an urgent need to develop a vaccine to prevent or a medicine to treat the infection. In this study, immunoinformatics approach was employed to predict antigenic epitopes of Zika viral proteins to aid in development of a peptide vaccine against ZIKV. Both linear and conformational B-cell epitopes as well as cytotoxic T-lymphocyte (CTL) epitopes were predicted for ZIKV Envelope (E), NS3 and NS5 proteins. We further investigated the binding interactions of altogether 15 antigenic CTL epitopes with three class I major histocompatibility complex (MHC I) proteins after docking the peptides to the binding groove of the MHC I proteins. The stability of the resulting peptide-MHC I complexes was further studied by molecular dynamics simulations. The simulation results highlight the limits of rigid-body docking methods. Some of the antigenic epitopes predicted and analyzed in this work might present a preliminary set of peptides for future vaccine development against ZIKV.

and Zika fetal syndrome occur simultaneously, suggests that these syndromes might share a common etiology such as an autoimmune response that targets neurologic functions, raising the possibility of epitope mimicry 12,13 . Furthermore, high viral loads of ZIKV identified in the semen of infected patients, have indicated the possibility of its transmission by sexual means 14 . Lack of therapeutics or approved vaccines against ZIKV, till date, makes it difficult to control and prevent the infection. As a matter of fact, considerable endeavors are in focus to enhance comprehension of the vulnerability and pathogenesis of ZIKV. Envelope protein (E), NS3 and NS5 have been identified as potential targets for therapeutics and vaccines development. These targets have been recognized to have a major role in viral entry into the cell and in viral replication 15,16 . The risk associated with autoimmune responses to potential epitope mimics must be addressed in the development of vaccines and therapeutics for Zika virus infections 13 . Availability of genomic and immunological data and computer algorithms has brought about a more efficient process in vaccine development that allows for identification of possible epitopes which can assist the production of effective subunit vaccines [17][18][19] .
One of the indispensable and significant steps of vaccine development is the recognition of exceedingly competent B-cell linear (continuous) or conformational (discontinuous) and Cytotoxic T-lymphocyte (CTL) epitopes. It has been reported that approximately > 90% of B-cell epitopes are discontinuous 20,21 . The target vaccine has been reported to elicit competent immune responses where T cells act as mediators. The progress in vaccine-designing research recently, has been facilitated by the development of refined assays designed to measure the T-cell responses against various vaccine candidates 22 . In order to explore T-cell epitopes in peptide sequences, aforementioned reports have accelerated research leading to the development of immunoinformatics methods 23 . With the help of immunoinformatics approach, we tried to make it possible to precisely narrow down potential B-cell and T-cell epitopes that can be characterized as effective vaccine candidates. Identifying which peptide of the virulent bacterial pathogen's proteome binds to the major histocompatibility complex (MHC) molecules is considered as the first step to vaccine design. T-cell immunogenicity is correlated to the strength with which epitopes bind to MHC molecule 24 . With the help of appropriate molecular modeling tools, peptide-MHC complexes were modeled and their post-docking interaction studies helped further towards the selection of potential candidates for the development of peptide vaccines.

Results
For designing peptide vaccines, this study aimed at identifying potential B cell and T cell epitopes with the help of an immunoinformatics approach. The identification of new vaccines by in silico approach can be carried out through the analysis of pathogenomics on a genome-wide scale 25 . There exist several limitations with respect to the conventional experimental approaches. With the help of immuno informatics, analysis of the complete spectrum of potential antigens is possible. Furthermore, the non-feasibility of pathogen culturing and the problems in the expression of antigens in vitro can also be circumvented 26 . and several in silico vaccine candidates reported by immune research groups have been known to produce promising preclinical results 26 .
In the current study, potential B-cell epitopes (continuous and discontinuous) and T-cell epitopes have been identified for the designing of peptide vaccines against ZIKV E, NS3, and NS5 proteins. With the help of immunoinformatics approach as well as docking tools for predicting peptide-protein complexes, the analysis was performed because these tools are very helpful for the identification of new vaccines 25 . Numerous researchers have documented in silico approach to design vaccines with promising clinical trials 27 . In silico process helps in reducing the number of in vitro experiments, hence, prediction of B-cell epitopes and CTL epitopes by various immunoinformatics approaches serves as an essential tool in vaccine design 28,29 . Prediction of Antigenic B-Cell Epitopes. Based upon the physicochemical characteristics of amino acids that have been observed in experimentally determined antigenic epitopes, Kolaskar and Tongaonkar's method 30 was employed to predict antigenic epitopes of a given sequence. It has been reported that this method provides 75% experimental accuracy 30 . This procedure predicted that the ZIKV E protein sequence of 403 amino acids has 21 antigenic peptides that fall in the range of 7-18 amino acids with 6 heptapeptides (Table 1). Similarly, it was predicted that the sequence of NS3 of 151 amino acids contains 6 antigenic peptides. The length of the antigenic peptides was 7-21 amino acids with 2 hendeca-peptides for NS3 (Table 2). Moreover, the sequence of NS5 of 641 amino acids has 22 antigenic peptides. The length of the antigenic peptides was 6-25 with 6 nonapeptides for NS5 (Table 3). Furthermore, Kolaskar and Tongaonkar's method also predicted the maximum residual score for each amino acid in E, NS3, and NS5. 252 out of 403 amino acids of ZIKV E protein has a residual score greater than 1.000. Leucine at the 258 th position, found in the antigenic peptide from position 252 to 260 (RQTVVVLGS), was identified as having the maximum residual score of 1.184.
Likewise, 100 out of 151 amino acids of ZIKV NS3 protein were predicted to have a residual score > 1.000. Leucine at the 81 st position, in the antigenic peptide from position 76 to 86 (SEVQLLAVPPG), was identified as having the maximum residual score of 1.201. It should be noted that the leucines at positions 80 and 81 from this ZIKV NS3 antigenic site were also predicted as comprising a CTL epitope, thus significantly enhancing the scope of amino acid leucine at positions 80 and 81 to be treated as potential peptide vaccine candidates. In ZIKV NS5,322 out of 641 amino acids were predicted to have a residual score > 1.000 and serine at 533 rd position of the antigenic peptide from position 528 to 539 (NAICSSVPVDWV), had the maximum residual score of 1.203.
Graphical representation of the predicted peptides of B-cell E, NS3 and NS5 proteins on the basis of their sequence position (along the x-axis) and antigenic propensity (along the y-axis) are shown in Fig. S1 (Supporting Information). The variation of antigenic propensity is associated with sequence length. The minimum antigenic propensity score for the ZIKV E protein was 0.861 while the maximum score was 1.186 (A). Moreover, the average antigenic propensity score came out to be 1.026 (A). On the other hand, minimum and maximum (B,C) value for NS3 and NS5 (0.911 and 0.851) and (1.202 and1.203) was observed respectively. The average antigenic propensity score for NS3 and NS5 came out to be 1.035 (B) and 1.008 (C) respectively. Surface accessibility for E protein, NS3, and NS5. The surface probability of a hexapeptide more than 1.0 (threshold) predicts that the sequence has an increased probability to be found on the surface 31 . The graphical representation of the predicted peptides for E protein, NS3 and NS5 depending upon the sequence position (along the x-axis) and surface probability (along the y-axis) is shown in Fig. S2. The minimum surface probability score calculated by the program was 0.067 for the E protein from amino acid position 387 to 392 (387IVIGVG392), while the maximum surface probability score calculated was 5.725 for the E protein from amino acid position 159 to 164 with the sequence of hexapeptide 159ETDENR164, where E159 is a surface residue (one with a greater than 20 Å 2 of water-accessible surface).
Moreover, the minimum surface probability score calculated by the program was 0.123 and 0.067 for NS3 and NS5 from the amino acid position 19 to 24 and 314 to 319, respectively. The maximum surface probability score calculated was 4.935 and 9.228 for NS3 and NS5 from the amino acid position 100 to 105 and 207 to 212 with the sequences of hexapeptides 100KTKDGD105 and 207KREKKQ212, where K100 and K207 are surface residues as shown in Fig. S2.
Surface flexibility for E protein, NS3, and NS5. The vibrational motion of atoms within a structure indicated by temperature or B factors is calculated by Karplus and Schulz flexibility method 32 . In well-ordered, that is, organized structure, the atoms have a low B-factor value; on the other hand, the higher the B factor 32 . The graphical representation of the surface flexibility results for E, NS3, and NS5 proteins are shown in the Fig. S3. The minimum flexibility score 0.894 was predicted by the program showing a more ordered structure with a sequence of heptapeptide 137YRIMLSV143 while the maximum score was 1.112 (367ESTENSK373 amino acid sequence) for E protein. On the other hand, the minimum flexibility score for NS3 and NS5 were predicted to be 0.887 and 0.839 respectively, showing a more ordered structure with a sequence of heptapeptides 29FHTMWHV35  Parker Hydrophilicity Prediction for E protein, NS3, and NS5. Based upon the peptide retention times during HPLC on reversed phase column, the Parker hydrophilicity scale method is employed for the prediction of hydrophilicity of peptides 33 . The depicted surface hydrophilic regions are associated with the known antigenic sites in immunological studies 33 . The graphical representation of the predicted peptides for E protein, NS3 and NS5 on the basis of the sequence position (along the x-axis) and hydrophilicity (along the y-axis) is shown in Fig. S4. The minimum hydrophilicity score calculated by the software was − 3.171 for E protein from amino acid position 217 to 223. According to the Parker Surface Hydrophilicity Prediction result is 217WFHDIPL223, the sequence of a heptapeptide. The maximum hydrophilicity score calculated was 7.057 for E protein from amino acid position 83 to 89 with the sequence of a heptapeptide 83DKQSDTQ89.
Moreover, the minimum hydrophilicity score calculated by the program was − 2.529 and − 5.886 for NS3 and NS5 from amino acid position 28 to 34 as well as 29 to 35 (NS3) and 223 to 229 (NS5), respectively. According to the Parker Surface Hydrophilicity Prediction result are 28VFHTMWH34 as well as 29FHTMWHV35 and 223AIWYMWL229 sequences of heptapeptides. The maximum hydrophilicity score calculated was 4.914 and 6.300 for NS3 and NS5 from amino acid position 73 to 79 and 347 to 353 as well as 348 to 354 with the sequences of heptapeptides 73DGHSEVQ79 and 347QDQRGSG353 as well as 348DQRGSGQ354, respectively.
The infected cell with antigen-presentation activates T-cell to become effector cell to kill any infected cell. Self-destruction or cell death is observed after the CTLs attack on infected cells. The peptide fragment of the pathogen and the MHC molecule bind together and appear on the cell surface of infected cells. The peptide-protein complex is recognized by CTLs, and, as a result, infected cells are killed. Peptide fragment (antigen) processing, as well as its presentation to the T-cell, is accomplished by various steps. Peptides are processed in the cytosol by the proteasome and later transported to the endoplasmic reticulum (ER) where MHC is synthesized. Here, transporter associated with antigen processing (TAP) moves the peptide into the MHC I molecule. After that peptide-MHC I complex is transported to the cell surface. A diverse range of peptides is bound to each allelic form of MHC I protein. The MHC molecule has the ability to bind with peptides tightly as the pathogens try to mutate the epitope of the MHC molecule. Hence, MHC molecule exhibits high binding affinity with a variety of peptides 34 .
In the scope of in silico vaccine designing, T-cell epitope prediction is rendered as a milestone 28 . CTL epitopes are proved to be potential candidates for peptide vaccine design for various diseases. By using in silico epitope prediction methods, numberless research groups have reported excellent results. In the development of different  vaccines against various infectious diseases, including autoimmune disease and cancer, these research groups have been found to have played a key role 35 . CTL epitope prediction is an important in silico tool in the vaccine designas it reduces the need for in vitro experiments. NetCTL 1.2 server 36 was employed for the CTL epitope prediction. Peptide sequences from ZIKV-E, NS3 and NS5 were predicted as CTL epitopes on the basis of their specific MHC binding affinity, proteasomal C-terminal cleavage, TAP transport affinity and potential MHC ligands were identified, whose prediction scores were > 0.75000 threshold. Ten peptide sequences from ZIKV-E were predicted as CTL epitopes whose prediction score were > 0.75000 (Table 4). For ZIKV-NS3, five peptide sequences were predicted as CTL epitopes with predicted score above threshold ( Table 5). As mentioned above, the leucine residues at positions 80 and 81 were also predicted as antigenic sites. Hence, 75HSEVQLLAV83, one of the CTL epitopes predicted from ZIKV-NS3, can be considered a potential vaccine candidate. Correspondingly, fifteen peptide sequences from ZIKV-NS5 were predicted as CTL epitopes whose prediction score were > 0.75000 (Table 6).
Homology modeling and structure validation. To perform structure-based epitope prediction of ZIKV proteins (E, NS3 and NS5), homology modeling of proteins E, NS3 and NS5 was performed (Fig. S5). It was observed that the 3D crystal structures of dengue virus type 1 and type 3 envelope proteins (3G7T chain A and 1UZG chain A), West Nile virus NS2b-NS3 protease in complex with 3, 4-dichlorophenylacetyl-Lys-Lys-GCMA (2YOL chain A) and RNA-dependent RNA polymerase domain from the West Nile virus (2HFZ chain A) were found to be the best hits based on E-value, query coverage and identity. Hence, these were considered the best templates to perform homology modeling (Table 7).
HHpred server 37 identified 57% maximum identity of glycoprotein E for ZIKV (403 amino acids) with the Protein Data Bank (PDB) structures: 3G7T chain A (crystal structure of dengue virus type 1 envelope protein) and 1UZG chain A (crystal structure of dengue virus type 3 envelope protein). The program employed PDB codes (3G7T chain A and 1UZG chain A) as a template and built the homology model with "100%" confidence. Greater than 90% confidence reflects that core model is very much precise and correct deviating 2-4 Ǻ in rmsd from the native protein structure. Moreover, the percentage identity between the template and the query sequence being more than 30-40% reflects good accuracy in the model.
As far as NS3 and NS5 proteins are concerned, 70% (NS3) and 72% (NS5) maximum sequence identity for ZIKV were found with the PDB structures 2YOL chain A (crystal structure of West Nile virus NS2b-NS3 protease in complex with 3, 4-dichlorophenylacetyl-Lys-Lys-GCMA) and 2HFZ chain A (crystal structure of RNA-dependent RNA polymerase domain from West Nile virus). Stereochemical quality of the models was also comparable with the template structures (see Methods).
Structure-based Epitope Prediction for ZIKV E, NS3 and NS5 proteins. For the epitope prediction in 3D structures, ElliPro 38 was employed. This web-based advanced program helps studying the correlation  Table 5. Predicted CTL epitopes from the ZIKV NS3 protein. *Prediction score threshold was set at > 0.75000. Bold indicates amino acids that were also predicted as antigenic sites.
between solvent accessibility, flexibility, and antigenicity of a protein structure. Moreover, an important feature is that the predicted epitopes are differentiated on the basis of protein-antibody interactions. For E-protein of ZIKV three-discontinuous peptides with a score value of 0.7 or more were selected. The score (Protrusion Index, PI) reflects the percentage of protein atoms that extend beyond the molecular bulk and are responsible for antibody binding 38 . The highest probability for the E protein was computed as 74.4% (PI score: 0.744). Moreover, the highest probability for both NS3 and NS5 was computed as 87.1% and 73.6% (PI score: 0.871 and 0.736), respectively. Amino acid residues, the number of residues, sequence location as well as their scores are tabulated in the Tables 8, 9 and 10. The graphical representation of the discontinuous epitopes is displayed in Fig. 1.
Molecular Docking of ZIKV-E protein with HLA-A0201. Out of 10 epitopes predicted, five CTL epitopes predicted from ZIKV-E protein were docked to MHC class I HLA-A0201. All five CTL peptides exhibited strong binding affinities in terms of global energy and attractive van der Waals energy (vdW) ranging from − 52.10 to − 59.59 kcal/mol and − 19.87 to− 29.58 kcal/mol as tabulated in Table 11. Among the five peptides, post-docking analysis of three CTL predicted epitopes (MAEVRSYCY, FSDLYYLTM, and TMNNKHWLV) revealed good interactions with HLA-A0201 (Fig. 2). In MAEVRSYCY-MHC HLA-A0201 complex, three hydrogen bonds within a distance of 3.5 Å, suggested the stability to the docked complex. Significant hydrogen bond interactions were seen between the residues Cys8 and Tyr9 of the peptide and Asp77 of the MHC protein. It was further predicted that   (Table 11). His6, Trp7, and Val9 of the peptide have been predicted to be antigenic. Molecular interactions of the specific peptides and their respective docked poses are represented in Fig. 2.

Molecular Docking of ZIKV-NS5 protein with HLA-C0801. Eight CTL epitopes predicted from
ZIKV-NS5 protein, out of 15 peptides, were docked against class I MHC-HLA-C0801. The docked complexes are shown in Fig. 4. The overall stability of the complex structures seems to be well preserved by the formation of hydrogen bonds (Table 13). It can be noted that common residues including Arg97, Tyr99, Tyr116 and Gln155 from MHC-HLA-C0801 were involved in the formation of hydrogen bonding with various CTL predicted    Table 13. Both nitrogens (NH1 and NH2) of Arg97 from the MHC-HLA-C0801 formed hydrogen bonds with the backbone oxygen atom of Gln7 of FTNLVVQLI. Likewise, Gln155 from MHC-HLA-C0801 formed several hydrogen bonds; hydrogen bond interactions were formed between oxygen (OE1) of Gln155 and OG1 of Thr4 and Thr7 of IAMTDTTPY, and as well as with sulphur (SG) of Cys4 (ETACLAKSY). The detailed depiction has been given in the Table 13 and illustrated in Fig. 4.

Molecular dynamics simulations.
The stability of the docked peptide-MHC I protein complexes was further investigated by performing molecular dynamics (MD) simulations of the complexes in an explicit water box at 300 K for a period of 5 nanoseconds. The potential energy of the simulation systems remained stable during the MD simulations (data not shown). Some of the docked complexes had steric clashes between the amino acids of the binding partners (see Tables 11,12 and 13). However, during the energy minimization of the complexes, these clashes were removed and, on the other hand, new favorable interactions (e.g. hydrogen bonds) were formed (Tables 11, 12 and 13). A few peptides kept their initial conformation during the simulation, whereas others lost it partially (Supporting Information, Fig. S6-8; Table S1). Moreover, many peptides moved from their original docking position and formed new favorable interactions during the simulation (Tables 11, 12 and 13). The  binding groove ('F pocket') 39 also changed its size variably depending on the peptide that was inside the groove (Supporting Information, Table S1, Fig. S6-8).

Discussion
The urgent need for preventive measures against a global threat of a ZIKV epidemic has awakened the researchers to investigate the pathogen. Especially cost-effective and fast methods such as immunoinformatics tools have been quickly harnessed by researchers in different countries to, for example, predict possible antigenic epitopes from ZIKV proteins (especially the E protein) for peptide vaccine development (see for example these very recent studies in refs 40-47). In general, such in silico approaches help reduce the number of in vitro experimental assays and provide an essential tool in vaccine design [27][28][29] . In the current study, we have predicted potential epitopes not only for the E protein but also for the NS3 and NS5 proteins. Some of the recent studies have also employed different docking tools to investigate the binding of the predicted peptides to various MHC-I proteins. For example, Alam et. al. 45 docked two predicted epitopes to HLA-A*53:01 with Autodock and reported good predicted binding affinities for the peptides. In none of these studies on ZIKV epitopes has MD been used to investigate the stability of the peptide-MHC-I complexes. On the other hand, there are several studies that have generally investigated the dynamics of peptide binding in the MHC-I binding groove by means of MD simulations (e.g. refs 39, 48 and 49). For example Fleischmann et al. studied the mechanism of how tapasin, a chaperone, takes part in controlling the quality of the binding peptides (high or low affinity binders) 50 . Their study suggests that a high-affinity peptide succeeds to close the binding groove tightly, while a low-affinity peptide (or the absence of a peptide) widens the groove. In our study, two of the predicted epitope peptides were closing the groove as they reduced the F pocket size (Supporting Information, Table S1): MAEVRSYCY (from ZIKV E protein) and IAMTDTTPY (from ZIKV NS5 protein). The rest of the peptides widened the groove to variable extent. In case of linear epitopes, the leucine residues at positions 80 and 81 from the predicted ZIKV-NS3 antigenic site were also predicted as comprising a CTL epitope, thus significantly enhancing the possibility of this peptide to become a potential peptide vaccine candidate. Moreover, surface accessibility, surface flexibility as well as hydrophilicity of epitopes for E protein, NS3 and NS5 have been reported in the current study. The stability of the peptide-MHC I complexes has also been examined in the current study by MD simulations. Overall, the complexes get stabilized during the MD simulation after the favorable interactions have been formed and the possible steric clashes resulting from the rigid-body docking have been removed. Most of the electrostatic and hydrogen binding interactions are formed between the N-and C-terminal ends of the peptides, which is in agreement with the well-known pattern of epitope binding to MHC-I proteins. However, the PEP-FOLD3 predicted conformations of the peptides can be possible in solution but when the peptides bind to the binding groove, the experimental structures show a rather elongated conformation, C-terminus residing mainly in the more flexible F pocket and N-terminus in the less flexible pocket at the other end of the groove and some anchor residues in between interacting with the groove. In this case, flexible docking methods might work better when trying to dock the epitope peptides to the MHC-I protein binding groove in the described way.
Structural and genomic data, alongside the drastic development of vast genome sequence databases, aids in the design and discovery of novel vaccine candidates when coupled together with computational tools. Infection of ZIKV is a serious problem concerning morbidity and mortality worldwide. Unfortunately, the unavailability  Continued of vaccines against the ZIKV has affected many precious lives in various regions of the world. Researchers have beenstruggling to gather data associated to ZIKV to understand its biology, transmission and pathophysiology in order to eradicate the disease successfully. We inferred that the predicted epitopes possess therapeutic potential, with promising scope in the near future. Our immunoinformatics analysis will aid in the development of potential peptide vaccines using the predicted peptides.

Materials and Methods
ZIKV protein sequences. The primary amino acid sequences of ZIKV E, NS3, and NS5 proteins were retrieved from the ZIKV polyprotein sequence (GenBank ID: AHZ13508.1) that is deposited in the National Center for Biotechnology Information (NCBI) database (http://www.ncbi.nlm.nih.gov/protein/). The individual sequence lengths for the viral protein segments were 403 (E), 151 (NS3) and 641 (NS5).

Prediction of Linear and Conformational B-Cell epitopes. Antigen B-cell epitope interacts with
B-lymphocyte which causes the B-lymphocytes to differentiate into antibody-secreting plasma and memory cells 51 . B cell epitope has two important characteristics including hydrophilic nature and accessibility for a flexible region of an immunogen 52 . According to the Parker hydrophilicity prediction 33  (http://www.cbs.dtu.dk/services/Net CTL), CTL epitopes were predicted. The activation of CTLs takes place on the surface of antigen-presenting MHC molecules. The prediction of the MHC class I binding, TAP transport efficiency and the proteasomal C-terminal cleavage was integrated with the help of NetCTL 1.2 server. As an input, the FASTA sequence of the organism was provided. Human leukocyte antigen (HLA) alleles and peptide lengths were both selected and submitted. As an output, T-cell epitopes were predicted. In order to predict proteasomal C-terminal cleavage and MHC class I binding, the artificial neural network was used while a weight matrix was used to predict the TAP transport efficiency.
Homology modeling and structure validation. Homology modeling was performed for E protein, NS3 and NS5 proteins of ZIKV 37 . The Protein Data Bank (PDB) repository was employed to find suitable templates to generate the three-dimensional (3D) coordinate structures for all three protein sequences that were taken from UniProt (www.uniprot.org). PSI-BLAST 55 was used to search for the homologous proteins from the PDB. ALIGN2D module was used to generate the initial alignment between the target and the template. Using a restrained-based approach in MODELLER v. 9.12, the 3D structures of all three proteins (E-protein, NS3, and NS5)  were generated 56 . Moreover, spatial limitations were calculated with stereochemistry by CHARMM 57 . The stereochemical quality and correctness of the modelled structures was verified with WhatIF 58 , PROCHECK 59 and Verify 3D 60 .
Peptide-MHC protein complex and molecular docking studies. The predicted CTL epitope peptides of ZIKV E, NS3 and NS5 proteins that contained antigenic amino acid residues were selected for the docking analysis. 3D structures of these peptides were modeled with the PEP-FOLD3 server 61 , using 200 simulation runs to sample the conformations. After PEP-FOLD3 had clustered the different conformational models, they were sorted using the sOPEP energy 62 value. Subsequently, the best ranked peptide models were docked to the selected class I MHC molecules including HLA-A (PDB ID: 2GIT), HLA-B (PDB ID: 2BST) and HLA-C (PDB ID: 4NT6), using the PatchDock rigid-body docking server 63,64 . PatchDock calculates potential complexes of two given molecules. All complexes that exhibit undesirable penetrations of the atoms of the receptor to the ligand are discarded and the remaining complexes are categorized according to the geometric shape complementarity score 63,64 . The docking results were then refined and scored with the FireDock server 65,66 . FireDock assists in resolving the flexibility and scoring problems produced by fast rigid-body docking programs 67 . The hydrogen bonding interactions of the docked structures were studied with the molecular visualization programs UCSF Chimera 1.11 68 and PyMOL 69 (Schrödinger, Inc).

Molecular dynamics simulations.
The stability of the docked complexes was further investigated by molecular dynamics (MD) simulations using the AMBER 12 simulation package 70 . After a stepwise minimization and equilibration protocol the solvated system (with TIP3P 71 water) was submitted to a production simulation of 5 ns at 300 K and at 1 bar pressure (see Supporting Information for the detailed protocol). The ptraj module of AMBER was used for the trajectory analysis.