Designing a novel multi‑epitope vaccine against Ebola virus using reverse vaccinology approach

Ebola virus (EBOV) is a dangerous zoonotic infectious disease. To date, more than 25 EBOV outbreaks have been documented, the majority of which have occurred in Central Africa. The rVSVG-ZEBOV-GP vaccine (ERVEBO), a live attenuated vaccine, has been approved by the US Food and Drug Administration (FDA) to combat EBOV. Because of the several drawbacks of live attenuated vaccines, multi-epitope vaccines probably appear to be safer than live attenuated vaccines. In this work, we employed immunoinformatics tools to design a multi-epitope vaccine against EBOV. We collected sequences of VP35, VP24, VP30, VP40, GP, and NP proteins from the NCBI database. T-cell and linear B-cell epitopes from target proteins were identified and tested for antigenicity, toxicity, allergenicity, and conservancy. The selected epitopes were then linked together in the vaccine's primary structure using appropriate linkers, and the 50S ribosomal L7/L12 (Locus RL7 MYCTU) sequence was added as an adjuvant to the vaccine construct's N-terminal. The physicochemical, antigenicity, and allergenicity parameters of the vaccine were all found to be satisfactory. The 3D model of the vaccine was predicted, refined, and validated. The vaccine construct had a stable and strong interaction with toll-like receptor 4 (TLR4) based on molecular docking and molecular dynamic simulation (MD) analysis. The results of codon optimization and in silico cloning revealed that the proposed vaccine was highly expressed in Escherichia coli (E. coli). The findings of this study are promising; however, experimental validations should be carried out to confirm these findings.


Results
Identification and retrieval of viral protein sequences. Based on previous studies [38][39][40] and the Protegen server, the VP35, VP24, VP30, VP40, GP, and NP proteins were chosen as target proteins for epitope prediction. The VaxiJen v2.0 server predicted that all target proteins had an antigen score of > 0.4 (threshold value); their accession number and antigenicity scores are given in Table 1.
T-cell epitope prediction. Cytotoxic T lymphocytes (CTLs) are critical elements of adaptive immunity that play a key role in the elimination of infected cells 41 . CTL epitopes are involved in the development of longlasting cellular immunity and can remove circulating viruses as well as virus-infected cells 42 . The NetCTL 1.2 server identified a total of 591 CTL epitopes from six proteins. To determine the best CTL epitopes from a set of CTL epitopes, we first chose CTL epitopes that bind strongly to at least three MHC class I supertypes. These epitopes were then filtered for antigenicity, toxicity, allergenicity, and conservancy. Finally, ten antigenic, nontoxic, and non-allergenic CTL epitopes with more than 80% conservancy were chosen (Tables S1-S6).
Helper T lymphocytes (HTLs) play an important in the elimination of extracellular pathogens using various cytokines and stimulating humoral immune responses 43 . HTL epitopes help regulate the adaptive immune system by inducing the production of T cell cytokines 44 . For the six target proteins, 912 HTL epitopes of 15-mer length were predicted by the NetMHCII 2.3 server. Among them, HTL epitopes that could bind strongly to at least 5 human MHC class II alleles were screened for antigenicity, toxicity, allergenicity, and conservancy, as well as the Evaluating physicochemical, solubility, allergenicity, and antigenicity parameters of the designed vaccine. Several important physicochemical features of the multi-epitope vaccine were characterized by the ExPASy ProtParam tool. The theoretical pI and GRAVY of the vaccine construct were estimated to be 9.09 and − 0.264, respectively. The half-life of the vaccine was calculated to be 30 h in mammalian reticulo-  For the refining process, the model with the highest C value (− 0.61) was chosen. The GalaxyRefne server was used to refine the selected model. The GalaxyRefne server provided a total of five refined models of the vaccine. Model 3 was considered the best refined model based on its model quality scores (Table 3). Higher GDT-HA values indicate higher quality models 48 ; model 3 has a GDT-HA score of 0.9421, which was higher than all refined models. A lower RMSD value means more stability, and an RMSD score of 0 to 1.2 is normally considered acceptable 48 .
The RMSD score for this model is 0.442. The MolProbity score represents the crystallographic resolution of the protein's 3D model. A lower MolProbity score indicates a less critical error 49 . Model 3 has a MolProbity score of 2.428, which is significantly lower than the initial model. The Clash score indicates the number of undesirable all-atom spatial overlaps, and the poor rotamers score indicates the number of residues with a lower capacity of conceivable rotation in their side chains 50 ; the lower the score of these parameters, the better the 3D structure of the protein; the Clash and poor rotamers scores of model 3 are 23.9 and 0.9, respectively. The higher the Rama favored value, the higher the model's quality. Model 3 has Rama favored of 89.6. The initial and refined 3D model of the vaccine has been shown in Fig. 3. The quality of the initial and refined models of the multi-epitope vaccine www.nature.com/scientificreports/ was evaluated using the SWISS-MODEL Structure Assessment and ProSA-web server. Ramachandran plot analysis of the initial model revealed that 72.29% of the residues were in the favoured region, whereas in the refined model 88.76% of the residues were in the favoured region (Fig. S3). The z-scores calculated by the ProSA web server for the initial model and the refined model of the vaccine were − 2.46 and − 3.47, respectively (Fig. S3).
Discontinuous B-cell epitope prediction. The ElliPro server predicted 16 discontinuous B-cell epitopes with scores ranging from 0.585 to 0.829 (at a threshold of 0.5). The size of the epitopes varied from 8 to 39 residues (Table 4). Discontinuous B-cell epitopes with a score above 0.8 are shown in (Fig. S4).
Disulfide engineering of the vaccine construct. In the refined model of our vaccine, the DbD2 server predicted 59 residue pairs with the potential to form disulfide bonds. The residue pairs were screened based on χ3 angle (between − 87° and + 97°) and bond energy (less than 2.2 kcal/mol) parameters. Only six residue pairs were chosen for disulfide bond formation after careful consideration, including PHE28-GLY46, THR31-VAL42, PHE32-LYS79, PRO 101-ALA113, PHE284-GLY317, and GLY519-MET549. The original and mutant models of the vaccine candidate are shown in Fig. S5.
Molecular docking analysis. ClusPro 2.0 server conducted molecular docking of the multi-epitope vaccine and TLR4 and generated 30 docked complexes with different cluster members and the lowest energy. Cluster No. 2 had the lowest energy score of − 1279.1 kcal/mol with 38 members, which was the most negative score (strongest interaction) among all docked complexes (Fig. 4). The residues involved in the interaction between TLR4 and vaccine were analyzed by LigPlot software (Fig. 5). Also, the list of residues with hydrogen bonds along with the bond length is given in Tables 5 and 6.
Molecular dynamics simulation of the docked complex. The molecular dynamic simulation of the vaccine-TLR4 complex was performed using GROMACS 2019.6 software for 40 ns. The RMSD graph is used to assess the structure's stability during the simulation. The RMSD value of TLR4 increased rapidly at the beginning of the simulation, reaching about 0.3 nm at 3000 ps and remaining stable until the end of the simulation. An initial sudden change occurred in the RMSD graph of the vaccine and after 6000 ps it has reached the value of 1.3 nm and showed slight fluctuations around this value until the end of the simulation (Fig. 6a). The input and output structures of the molecular dynamics simulation are stacked so that any changes can be detected. After Table 3. Results of the GalaxyRefne server. Model 3 was selected as the best refined model based on the GDT-HA, RMSD, MolProbity, Clash score, Poor rotamers, and Rama favored parameters.  www.nature.com/scientificreports/ simulating molecular dynamics, the vaccine shows a relatively high conformational change, as shown in Fig. 6b. In molecular dynamics studies, RMSF is the most commonly used method for measuring the oscillating motions of macromolecules. As seen in the previous section, the vaccine is attached to chain B of TLR4, and since the chains A and B have the same sequence on the TLR4, to accurately investigate the effect of vaccine binding on chain B flexibility, the RMSF plot for chains A and B is provided. The flexibility of the residues at both ends of the TLR4 sequence (N-terminal and C-terminal) in chain A was greater than in chain B, while the flexibility of the other regions in the two chains was the same and did not show a notable change. The RMSF plot of the vaccine revealed that the majority of the vaccine's residues had slight flexibility, indicating that the vaccine construct has established stable interactions with TLR4 ( Fig. 6c).
Codon optimization and in silico cloning. The vaccine construct was back translated using JCat to a cDNA sequence with a length of 2034 nucleotides. The vaccine sequence had a CAI value of 0.9571 and a GC content of 53.09%. The vaccine sequence was cloned into the pET-28a (+) vector by SnapGene software. The length of the cloned vaccine was estimated to be 7.166 kbp (Fig. 7).

Discussion
EBOV disease is one of the most lethal viral diseases affecting both humans and non-human mammals. The disease's mortality rate is reported to be 90% 51 . The Ebola outbreak was labeled a public health emergency of international concern by the World Health Organization (WHO) in August 2014 52 . Ebola and Marburg are both Table 4. Predicted discontinuous B-cell epitopes from the refined 3D model of the multi-epitope vaccine.  53 .
Vaccines are a safe and effective way of limiting the spread of lethal infectious diseases and saving millions of lives 54 . Conventional vaccine development methods are labor-intensive, costly, and time-consuming. Furthermore, the likelihood of failure in subsequent trials is high 55 . The reverse vaccinology will employ computational approaches and bioinformatics tools for antigen identification. This method can identify new potential antigenic proteins that play an important role in the immunogenicity and the safety of the vaccines 56 64 .
In this work, we designed a multi-epitope vaccine against EBOV. The identification of antigenic proteins is an important step in vaccine design. Using computational methods, we conducted an organized and thorough evaluation of the Ebola proteins and selected the VP35, VP24, VP30, VP40, GP, and NP proteins as target proteins for epitope prediction. CTL, HTL, and linear B-cell epitopes were predicted from these proteins. In the first step, epitopes (CTL and HTL epitopes) that interacted with a large number of HLA alleles were chosen for further investigation. To ensure that the selected CTL, HTL, and linear B cell epitopes are sufficiently antigenic and conserved while remaining non-toxic and non-allergenic to vaccine recipients, various immunoinformatics tools were used to screen epitopes for the aforementioned parameters. Aside from the aforementioned parameters, HTL epitopes were tested for their ability to induce IFN-gamma and IL-4 production. IFN-gamma has a crucial www.nature.com/scientificreports/ function in both adaptive and innate immune activation, in addition to interfering with viral replication 65 . IL-4 causes allergic reactions. This is due to the fact that IL-4 directs the development of TH2, which results in the production of IgE 66 . Different linkers were utilized to connect epitopes with various patterns in several primary structures of vaccine construct. After analyzing the physicochemical characteristics, particularly the stability of the generated constructs, it was found that AAY, GPGPG, and KK linkers are appropriate for joining CTL, HTL, and linear B-cell epitopes, respectively. Our vaccine construct's primary structure included 1 adjuvant, 7 CTL epitopes, 17 HTL epitopes, 6 linear B-cell epitopes, 1 EAAAK linker, 6 AAY linkers, 17 GPGPG linkers, and 6 KK linkers. The EAAAK linker is a rigid linker that, due to its helix formation properties, helped improve the immunogenic properties 43,67 . AAY, GPGPG, and KK linkers are typically made up of flexible and hydrophilic amino acids, which could help to prevent domain disruption 68 . The vaccine's physicochemical characteristics were evaluated to facilitate subsequent experimental evaluations of the vaccine and to enable the successful setup of in vitro and in vivo assays. The theoretical pI of the vaccine was determined to be 9.09, suggesting that the vaccine is basic in nature. The vaccine construct's GRAVY was − 0.264; a negative value for this parameter indicates that the vaccine is hydrophilic and also has a high degree of solubility 69 . Our vaccine construct has a lower GRAVY than the vaccine designed in the study by Kadam et al. 38 , indicating that our vaccine will interact better with water molecules. The half-life of our vaccine in mammalian, yeast, and E. coli was determined to be 30, 20, and 10 h, respectively, implying that the vaccine is exposed to the immune system for a longer period of time and induces more immune responses 70 . The predicted half-life of the vaccine designed in the study of Shankar et al. 40 was identical to the half-life of our vaccine. The vaccine's molecular weight was determined to be 73.15 kDa; vaccines with molecular weights less than 110 kDa could be a good vaccine candidate since they are easier to clone and express in expression systems than large proteins 71 . The aliphatic index score was 81.47; an aliphatic index greater than 50 indicates that the vaccine is stable at higher temperatures 72 . The vaccine instability index was 26.72, indicating its stability. The instability index of the vaccine designed in the study by Kadam et al. 38 was 38, indicating that our vaccine is more stable. In general, a protein with an instability index of less than 40 is considered stable 73 . The Solpro server result showed that our vaccine was soluble. The vaccine construct was identified as non-allergen by the AllerTOP v. 2.0 server. Our vaccine had high antigenicity scores on both the ANTIGENpro and the Vaxijen v2.0 servers. The results of predicting the vaccine's second structure were acceptable because the number of helices in the second structure, which indicates the number of hydrogen bonds and thus the protein's stability, was sufficient 74 .
I-TASSER server predicted the vaccine 3D model, which was then refined by the GalaxyRefine server. The Ramachandran plot revealed that 72.29% of the residues in the initial vaccine model were in the favoured region, which increased to 88.76% percent after refining. The initial model had a z-score of − 2.46, whereas the refined Table 5. List of residues in the docked complex that are involved in hydrogen bonds between the vaccine and TLR4 (chain B).

TLR4 (chain B)
Vaccine Bond length (Å)  Table 6. List of residues in the docked complex that are involved in hydrogen bonds between the vaccine and TLR4 (chain D). www.nature.com/scientificreports/ model had a z-score of − 3.47. A more negative z-score indicates that the 3D refined model of the vaccine is of higher quality 75 . In the 3D refined model of the vaccine, 16 discontinuous B-cell epitopes were predicted. Because discontinuous B-cell epitopes play a critical role in humoral immune responses by producing antibodies 76 , our designed vaccine has the potential to induce large amounts of antibody production. The vaccine's refined structure was subjected to disulfide engineering, and 6 disulfide bonds were introduced in the refined model to increase the vaccine structure's stability 77 .

TLR4 (chain D) Vaccine Bond length (Å)
Since TLR4 is involved in the activation of proinflammatory mediators after EBOV infection 78 , the molecular docking analysis of the vaccine construct with TLR4 was carried out. The results of this analysis revealed that the vaccine had a significant affinity for TLR4, implying that it may elicit both an innate and adaptive immune response 79 . The MD simulation results of the vaccine-TLR4 docked complex revealed that the vaccine reached a stable state in less than 40 ns and that its flexible residues were present in the range of 500-678, because this region had no interaction with TLR4, and can move freely, so its flexibility is higher than other regions of the vaccine construct. JCat server revealed that the vaccine construct had a CAI of 0.9571 and a GC content of 53.09%. CAI greater than 0.890 80 and GC content between 30 and 70% 76 is ideal for target organism expression. In the present study, we used a set of bioinformatics software to design a multi-epitope vaccine against Ebola, and the results were promising. The only limitation of this study is the need for further in vitro and in vivo studies to demonstrate the vaccine candidate's efficacy and safety.

Conclusion
EBOV is one of the most dangerous viruses among viral hemorrhagic fevers. The purpose of the current study was to design a multi-epitope vaccine against EBOV utilizing immunoinformatics approaches. T-cell and B-cell epitopes of target antigens were predicted, and epitope screening was conducted correctly and sequentially. The vaccine possessed all of the desirable qualities of a vaccine candidate, including good physicochemical properties, solubility, high antigenicity, and non-allergenicity. The vaccine's strong affinity for TLR4 was validated by molecular docking analyses, and the vaccine's stability was ensured by MD simulation. The codon optimization also presented an optimistic CAI value and GC content, confirming the vaccine's expression in a bacterial host. The rational design of the linear structure of the multi-epitope vaccine, which resulted from the proper arrangement of the selected epitopes and adjuvant, made the results of this study more impressive than previous studies.

Materials and methods
Identification and retrieval of viral protein sequences. In the first step of the study, antigen proteins of the Zaire Ebola virus were identified using literature 38,39,81 and the Protegen server (http:// www. violi net. org/ prote gen/). The Protegen is a web-based central database and processing system for collecting, storing, and analyzing protective antigens 82 . The reference sequences of the identified proteins were retrieved in FASTA format from the NCBI database (https:// www. ncbi. nlm. nih. gov/), and their antigenicity was evaluated using the VaxiJen v2.0 server (http:// www. ddg-pharm fac. net/ vaxij en/ VaxiJ en/ VaxiJ en. html). VaxiJen is the first server for alignment-independent prediction of antigenic peptides or proteins from a variety of organisms such as bacteria, viruses, parasites, fungi, and tumors [83][84][85] .
T-cell epitope prediction. CTL epitopes were predicted for the selected proteins using the NetCTL 1.2 server (http:// www. cbs. dtu. dk/ servi ces/ NetCTL/). The server predicts CTL epitopes (9-mer) for 12 MHC class I supertypes, including A1, A2, A3, A24, A26, B7, B8, B27, B39, B44, B58, and B62. This server identifies epitopes based on proteasomal C terminal cleavage, MHC class I binding peptide, and TAP transport efficiency scores 86 . For the present study 12 MHC class I supertypes were selected to predict CTL epitopes, and the cutoff value for epitopes prediction was set at 0.75. The NetMHCII 2.3 server (http:// www. cbs. dtu. dk/ servi ces/ NetMH CII/) was used to predict HTL epitopes. The NetMHCII 2.3 server identifies the binding of the peptides to the HLA-DR, HLA-DQ, HLA-DP, and mouse MHC class II alleles using artificial neural networks (ANNs) 87 . The threshold values corresponding to the strong and weak binder were set at 2% and 10%, respectively.
The predicted CTL and HTL epitopes were analyzed for antigenicity, toxicity, allergenicity, and conservancy. The antigenicity of the epitopes was checked with the help of the VaxiJen v2.0 server at a threshold value of 0.4. The toxicity or non-toxicity of the epitopes was determined by the ToxinPred server (https:// webs. iiitd. edu. in/ ragha va/ toxin pred/ design. php) using the SVM (Swiss-Prot) based method 88 . The AllerTOP v. 2.0 server (https:// www. ddg-pharm fac. net/ Aller TOP/ method. html) was used to analyze the allergenicity of the epitopes. AllerTOP is an alignment-free server for allergenicity prediction based on protein physicochemical parameters 89 . The Epitope Conservancy Analysis tool from the Immune Epitope Database (IEDB) (http:// tools. iedb. org/ conse rvancy/) was used to determine the degree of the conservancy of predicted epitopes. This tool calculates the degree of conservation of an epitope within a defined protein sequence 90 . In addition to the above mentioned screenings, HTL epitopes were checked for IFN-gamma and IL-4 production. The IFNepitope server (http:// crdd. osdd. net/ ragha va/ ifnep itope/ design. php) 91 and the IL4pred server (http:// crdd. osdd. net/ ragha va/ il4pr ed/) 92 were used to predict the ability of HTL epitopes to induce IFN-gamma and IL-4, respectively. org/ bcell/) was used to predict linear B-cell epitopes. The epitopes were predicted using the Bepipred Linear Epitope Prediction 2.0 method 93 . Finally, these predicted epitopes were assessed for antigenicity, toxicity, allergenicity, and conservancy using VaxiJen v2.0, ToxinPred, AllerTOP v. 2.0, and Epitope Conservancy Analysis, respectively.

Multi-epitope vaccine construction.
To construct a multi-epitope vaccine, the selected CTL, HTL, and linear B-cell epitopes were incorporated using suitable linkers. AAY (Ala-Ala-Tyr), GPGPG (Gly-Pro-Gly-Pro-Gly), and KK (bi-lysine) linkers were used to connect the CTL, HTL, and linear B-cell epitopes, respectively. These linkers were added to achieve the successful separation of individual epitopes 94 . Finally, to improve the immunogenicity of the multi-epitope vaccine, 50S ribosomal L7/L12 (Locus RL7_MYCTU) sequence with accession no. P9WHE3 as an adjuvant was added to the N-terminal of the vaccine construct via an EAAAK linker.
Evaluating physicochemical, solubility, allergenicity, and antigenicity parameters of the designed vaccine. The ProtParam tool of the Expasy server (https:// web. expasy. org/ protp aram/) was used to analyze a set of physicochemical parameters of the multi-epitope vaccine, including theoretical pI, grand average of hydropathicity (GRAVY), half-life, molecular weight, aliphatic index, and instability index 95 . The solubility of the proposed vaccine was predicted by the SOLpro server (http:// scrat ch. prote omics. ics. uci. edu/). The SOLpro uses a two-stage SVM algorithm based on multiple representations of the primary sequence to predict whether a protein would be soluble when overexpressed in Escherichia coli (E. coli) 96 . AllerTOP v. 2.0 server was utilized to evaluate the allergenicity of the multi-epitope vaccine. The antigenicity of the proposed vaccine was checked by ANTIGENpro and VaxiJen v2.0 servers. ANTIGENpro server (http:// scrat ch. prote omics. ics. uci. edu/) predicts the antigenicity of proteins or peptides using a two-stage architecture of their sequence and five machine learning algorithms 97 .
Secondary structure prediction. PDBsum (http:// www. ebi. ac. uk/ thorn ton-srv/ datab ases/ cgi-bin/ pdbsum/ GetPa ge. pl? pdbco de= index. html) was employed to predict the secondary structure of the proposed vaccine. PDBsum is a web server that provides information on the protein secondary structure, protein-ligand, protein-DNA interactions, and protein structure quality 98 .
Tertiary structure prediction, refinement, and validation of the vaccine construct. The tertiary structure of the multi-epitope vaccine construct was modeled using the I-TASSER server (https:// zhang group. org/I-TASSER/). I-TASSER server builds 3D structures from the amino acid sequence by reconfiguring the excised sections from the threading templates and computes the C-score to determine the correctness of the predicted models 47,99,100 . The GalaxyRefne server (http:// galaxy. seokl ab. org/ cgi-bin/ submit. cgi? type= REFINE) then refined the selected 3D model of the vaccine. The GalaxyRefine server employs a refining approach that was validated in CASP10. After reconstructing and repacking the side chains, this approach uses molecular dynamics simulation to achieve a general relaxation of the three-dimensional structure. Based on the CASP10 evaluation, this method performed the best in terms of improving the quality of the local structure 101,102 . The SWISS-MODEL Structure Assessment (https:// swiss model. expasy. org/ assess) and ProSA-web server (https:// prosa. servi ces. came. sbg. ac. at/ prosa. php) were used to compare the quality of the initial 3D structure and the refined structure of the vaccine. The Ramachandran plot is obtained using SWISS-MODEL Structure Assessment 103 . The Ramachandran plot illustrates the allowed and disallowed dihedral angles psi (ψ) and phi (ϕ) of amino acid, which are calculated based on the van der Waals radius of the side chain 104 . ProSA web server estimates the overall quality of the model as a z-score. If this z-score falls outside of the normal range for native proteins, the structure most likely contains errors 105,106 .
Discontinuous B-cell epitope prediction. The ElliPro server (http:// tools. iedb. org/ ellip ro/) was utilized to identify discontinuous B-cell epitopes in the refined 3D model of the vaccine. ElliPro assigns a score to each predicted epitope, which is characterized as a Protrusion Index (PI) value averaged over epitope residues 107 . The minimum score in this analysis was set at 0.5, while the maximum distance was set at 6 Å.
Disulfide engineering of the vaccine construct. Disulfide engineering is a new strategy of designing new disulfide bonds in the target protein through cysteine mutation of protein structure residues, resulting in increased protein structure stability 108 . Therefore, the Disulfide by Design 2 (DbD2) server (http:// cptweb. cpt. wayne. edu/ DbD2/) was used to identify residue pairs that had the potential for mutation and could be used in disulfide engineering 109 .

Molecular docking analysis.
Molecular docking is a computational approach that predicts the preferred orientation of the ligand to the receptor when they are joined together to form a stable complex 110 . We used the ClusPro 2.0 server (https:// clusp ro. org/ login. php) to perform molecular docking [111][112][113][114] . Initially, the 3D structure of TLR4 in PDB format (PDB ID: 4G8A) was obtained from the protein data bank (RCSB). The ligands attached to the retrieved TLR4 structure were then removed using Chimera 1.15rc software. The refined 3D model of the multi-epitope vaccine and TLR4 was submitted in the ClusPro 2.0 server as ligand and receptor, respectively. The LigPlot software was also used to analyze the bonds established between the ligand and receptor residues in the docked complex 115 . www.nature.com/scientificreports/ Molecular dynamics simulation of the docked complex. MD simulation is an effective method for studying ligand and receptor stability in a docked complex at the microscopic level 116 . In this study, MD simulation was run for 40 ns by GROMACS 2019.6 software 117 . Firstly, the input structure was prepared using the ff99SB force field. Following that, Na + and Cl − ions were introduced into the system to neutralize the net charge of the system. The complex was then solvated into a 10 Å layer of TIP3P using the gmx solvate software. The energy minimization of the solvated system was performed using the steepest descent approach to eliminate steric clashes. The system was gradually heated to 300 K for 20 ps and then the system was equilibrated at constant pressure, and the SHAKE algorithm was used to keep the hydrogen bond restrained. Finally, the root mean square deviation (RMSD) and root mean square fluctuation (RMSF) were generated using the gmx rms and gmx rmsf modules, respectively.
Codon optimization and in silico cloning. The Java Codon Adaptation Tool (JCat) (http:// www. jcat. de/) was utilized for reverse translation, codon optimization, and calculating the codon adaptation index (CAI) value and GC content of the vaccine construct in E. coli (strain K12) 118 . Three options were chosen here: avoid rho-independent transcription terminators, avoid prokaryotic ribosome binding sites, and avoid cleavage sites of restriction enzyme. The XhoI and BglII restriction sites were tagged at the 5′ and 3′ ends of the vaccine's DNA sequence, respectively. Finally, using SnapGene software (https:// www. snapg ene. com/ free-trial/), the obtained sequence was inserted into the pET28a (+) expression vector between the XhoI and BglII restriction sites.