Immunoinformatics design of a novel epitope-based vaccine candidate against dengue virus

Dengue poses a global health threat, which will persist without therapeutic intervention. Immunity induced by exposure to one serotype does not confer long-term protection against secondary infection with other serotypes and is potentially capable of enhancing this infection. Although vaccination is believed to induce durable and protective responses against all the dengue virus (DENV) serotypes in order to reduce the burden posed by this virus, the development of a safe and efficacious vaccine remains a challenge. Immunoinformatics and computational vaccinology have been utilized in studies of infectious diseases to provide insight into the host–pathogen interactions thus justifying their use in vaccine development. Since vaccination is the best bet to reduce the burden posed by DENV, this study is aimed at developing a multi-epitope based vaccines for dengue control. Combined approaches of reverse vaccinology and immunoinformatics were utilized to design multi-epitope based vaccine from the sequence of DENV. Specifically, BCPreds and IEDB servers were used to predict the B-cell and T-cell epitopes, respectively. Molecular docking was carried out using Schrödinger, PATCHDOCK and FIREDOCK. Codon optimization and in silico cloning were done using JCAT and SnapGene respectively. Finally, the efficiency and stability of the designed vaccines were assessed by an in silico immune simulation and molecular dynamic simulation, respectively. The predicted epitopes were prioritized using in-house criteria. Four candidate vaccines (DV-1–4) were designed using suitable adjuvant and linkers in addition to the shortlisted epitopes. The binding interactions of these vaccines against the receptors TLR-2, TLR-4, MHC-1 and MHC-2 show that these candidate vaccines perfectly fit into the binding domains of the receptors. In addition, DV-1 has a better binding energies of − 60.07, − 63.40, − 69.89 kcal/mol against MHC-1, TLR-2, and TLR-4, with respect to the other vaccines. All the designed vaccines were highly antigenic, soluble, non-allergenic, non-toxic, flexible, and topologically assessable. The immune simulation analysis showed that DV-1 may elicit specific immune response against dengue virus. Moreover, codon optimization and in silico cloning validated the expressions of all the designed vaccines in E. coli. Finally, the molecular dynamic study shows that DV-1 is stable with minimum RMSF against TLR4. Immunoinformatics tools are now applied to screen genomes of interest for possible vaccine target. The designed vaccine candidates may be further experimentally investigated as potential vaccines capable of providing definitive preventive measure against dengue virus infection.

parasitic helminths, bacterial, protozoan, fungal, ectoparasitic and viral diseases. The top 10 hotspots where NTDs dominate include sub-Saharan Africa, Asia, Oceania and the Middle East 5 . For extremely pathogenic NTDs novel therapies are being sought. Although probably less pathogenic or less frequently encountered such as dengue fever, which is neglected despite an apparent increase in prevalence 6 .
The endemic nature of dengue virus has been implicated in more than 100 countries notably in regions of Africa, Americas, Eastern Mediterranean, South-East Asia and the Western Pacific. The largest number of global dengue infection cases ever reported was in 2019 affecting all regions whereas dengue virus transmission was discovered in Afghanistan for the first time. Due to the asymptomatic and mild nature of this disease, the actual number of cases are under-reported with many being misdiagnosed as other febrile illnesses 7 . Dengue is reported as the second most diagnosed cause of fever after malaria and also the dengue fever (DF) being the most rapidly spreading mosquito-borne disease 8 . The rate of infection of dengue virus by Aedes mosquito species (Aedes aegypti and Aedes albopictus) increases geometrically with annual infection of about 400 million and 22,000 deaths 9 . Some of the symptoms include high fever accompanied by severe headache, anorexia, abdominal discomfort, maculopapular rash, fatigue, muscle and joint pain, unpleasant metallic taste in the mouth, loss of appetite, vomiting and diarrhea 10,11 . The World Health Organization (WHO) classified DF into uncomplicated and severe DF. In addition, dengue transmission is attributed to four serotypes of the dengue virus (DENV 1-4), where, individuals can be infected by all the four serotypes in their lifetime.
There are several potential dengue vaccines at different stages of development in clinical trial with only one licensed for use. Most of these vaccines are primarily developed from the envelope proteins prM and E, which are expected to elicit protective immune responses in humans 13,14 . Nonetheless, it is essential to consider that the human immune response to DENV is dominated by highly cross-reactive antibodies equipped with neutralizing and enhancing activity 15,16 . The Dengvaxia manufactured by Sanofi Pasteur is a live, attenuated, and tetravalent recombinant vaccine called ChimeriVax 12 . This formulation was the first vaccine to be licensed against dengue in the year 2015 due to its promising results obtained in various clinical studies 17 . In May 2019, Dengvaxia was licensed and approved by the US Food and Drug Administration (FDA) in the United States in areas where dengue was prevalent. This vaccine is available in 19 countries, and its use is limited to for a particular age range and cannot be administered in Flavivírus-naïve individual 18 . Its formulation consists of chimeric viruses constructed by infectious clone technology. Individuals who receive the vaccine and have not been previously exposed to DV may likely develop severe dengue if they are infected after vaccination 19,20 . Although Dengvaxia was evaluated and licensed for human use in 20 countries, it does not contain the non-structural proteins of DENV which presents low protective efficacy and increases their risk of hospitalization 21 . In 2018 the safety issues from the phase III clinical trial of Dengvaxia was published 22 . Due to the increased rate of hospitalization occurring in seronegative vaccinated children between the age of 9-11, Dengvaxia was predicted as vaccine failures rather than as vaccine enhanced dengue disease 23 .
Other vaccine formulations against DENV include live attenuated tetravalent vaccine for dengue (LATV), TAK-003, tetravalent dengue (TDEN), dengue purified inactivated vaccine (DPIV), tetravalent DNA vaccine against dengue (TVDV), and genetic constructs of the tetravalent vaccine formulation (V180) 24 . The LATV was developed by the National Institute of Allergy and Infectious Diseases (NIAID) using recombinant DNA technology 25 . The formulation of this anti-dengue formulation contains the structural and non-structural DENV proteins and the trial study was proven to be safe 26,27 . LATV is currently in phase III clinical trial. The main challenge of LATV is that it requires a significant amount of DENV2 for a balanced seroconversion and the efficacy is yet unknown 28 . Tak-003 produced by Takeda Pharmaceutical Company Limited, is based on an attenuated virus and chimeric viruses constructed using recombinant DNA technology 29 . Currently in phase III clinical trial, Tak-003 in the was shown to be safe, immunogenic, and effective against all the four DENV serotypes with reduced cellular immune response 30,31 . Its efficacy was reported to be 80.2%. However, Tak-003 induces low levels of antibodies to DENV3 32 .
V180 is another anti-dengue potential vaccine registered under the ClinicalTrials.gov is under the phase I clinical trial. It is composed of the recombinant forms of the DENV envelope glycoprotein 33 . Currently, V180 is being developed by Merck and the vaccine formulation is adjuvanted by ISCOMATRIX™ 34 . In preclinical study, V180 was proven to be safe and immunogenic due to the high quality of the recombinant proteins and ISCOMA-TRIX™ adjuvant. Nonetheless, the immunogenic properties of V180 is dependent on adjuvants and three-dose immunization regimen is required 35 . TDEN, DPIV, and TVDV produced by the U.S. Army Medical Research and Materiel Command have been reported to be immunogenic, well-tolerated and safe. While DPIV and TVDV are in phase I clinical trial, TDEN is in phase II clinical trial [36][37][38] . Challenges associated with these vaccines include general body pain, neutralizing antibody titers decrease over time, TVDV presents only envelop (E) and membrane glycoprotein precursor (M) proteins and requires both adjuvant and high amount of vaccine antigen 39 . Due to the tetravalent nature of DENV, effective vaccine, long-term surveillance for potential adverse events, compatibility with current vaccine schedules, cost and stability are the current challenges in the implementation of the DENV vaccines. Therefore, the improvement in vaccine development against dengue is much needed.
Structurally, the genome of DENV is made of 10,696 nucleotides and 10,173 of these nucleotides are responsible for the single open reading frame (ORF) which encodes a polypeptide of 3,391 amino acid residues 40 . In addition to its single and positive sense RNA stranded (+ ssRNA), it encodes for both structural and non-structural proteins. Immunologically, DENVs are related but genetically and antigenically distinct with approximately 75% relatedness 41,42 . DENV serotypes are subdivided according to the envelope gene responsible for incomplete crossdefensive immunity towards other serotypes in Homo sapiens 43 . As perceived by the immune system, disease development, viral replication and antigenic determinants have been linked with DENV proteins 44,45 . Therefore, any antigenic DENV protein sequences that is fit to elicit immune responses its host could be employed in the identification of vaccine for dengue 46  www.nature.com/scientificreports/ Asides palliative therapy, the treatment of both dengue and severe dengue are not specific and the mode of action by which the immune system react to this infection is unclear. Moreover, successful mosquito control measure to limit the spread of DF is still in its infant stage. As such, epitope-based vaccine development remains a clear solution for this infection.
In this study, several computational approaches were applied to design multiple epitope-based vaccines against DENV that comprises multiple epitope, which elicit the activation of cytotoxic T lymphocytes (CTLs) and helper T lymphocytes (HTLs). These vaccines require simple peptides that are easily chemically synthesized rather than pathogens. These approaches present an advantage over conventional methods of vaccine development by reducing the chance of allergenic reactions. In addition, these methods are rapid, easy and cost-effective. Although, similar approaches have been widely employed for the construction of vaccines against viruses such as SARS-CoV-2 47 , Ebola 48 , Zika 49 , and Chikungunya 50 , to date, no commercial and licensed epitope-based vaccine are available on the market using these bioinformatics approaches 51,52 .

Results
Sequence selection. In order to develop an epitope/ multi-epitope based vaccine for DEN-4, the protein sequences of DENV (structural and non-structural sequences) were obtained from NCBI and UniProt. The process of component sequence selection was modified from Ali et al. 53 . These components are presented in Table 1. Their physical and chemical properties are also presented in Table 2. Structural proteins are responsible for host invasion and particle assembly. On the other hand, enzyme synthesis which aids viral replication and structural protein synthesis are attributed to non-structural proteins. After infection, they prompt different immune responses. Prior to downstream analysis, the retrieved sequences were subjected to antigenicity and transmembrane helices prediction, and physicochemical analysis. All the three structural proteins were confirmed antigenic at a threshold of 0.4. While the non-structural components NS2A, NS2B and NS4A were non-antigenic due to their low prediction score, NS1, NS3 and NS5 were predicted to be antigenic. With a prediction score of 0.544, non-structural NS5 was the most antigenic viral component of DEN-4 ( Table 3).
The TMHMM Server DTU Bioinformatics prediction tool revealed the position and amino acid range of the sequences of the selected viral components ( Table 4). The capsid protein and selected sequence length of the envelop protein (466-471) were predicted to be localized inside the cell. All selected antigenic non-structural protein components and membrane glycoprotein precursor sequences were localized outside the cell.   Table 6 shows the final list of the selected epitopes for each of the antigenic viral components. Briefly, the selected viral components of DV were subjected to various MHC I and II epitopes prediction web-based servers (immune epitope database and analysis resources as well as the CTLPred). T cell epitopes (HTL and CTL) were ranked based on stringent in-house criteria before being shortlisted for downstream analysis. These rules of selection include: Good IEDB score, high conservancy, good binding affinity, B-cell epitope overlap, ≥ 9mer for MHC I, 15mer for MHC II, significantly antigenic/immunogenic and topographically accessible to membrane-bound or free antibody. For high immunogenicity, lower percentile ranks and IC 50 value were considered. According to these specifications, 21 epitopes each (C-2, M-3, E-4, NS1-1, NS3-3, and NS5-8) and (C-5, E-5, NS1-1, NS3-5, and NS5-5) were shortlisted for MHC-I and MHC II binders for multi-epitope based vaccine designed.

T-cell epitopes predictions and prioritization.
Immune receptors. The receptors considered for docking include two immune alleles (HLA-A*11-01 and HLA-DRB1*04-01) and two immune molecules toll-like receptors (TLR2 and TLR4). Their respective crystallographic structures with PDB IDs (HLA-A*11-01: 5WJL and HLA-DRB1*04-01: 5JLZ; TLR2: 3FXI and TLR4: 2Z7X) were retrieved from the protein data bank. Since proteins from the PDB database often lack coordinates for side-chains on the surface of the protein given the complexity to determine side chain coordinates from x-ray crystallographic data, all proteins were refined through the protein preparation wizard in Schrödinger to improve the overall quality of the output. These refined outputs ( Fig. 1) were then used for docking analysis.  Designed vaccine properties analysis. The various vaccine components including their specific properties are presented in Table 3. In addition to the different adjuvants used for the design, the antigenicity, allergenicity, toxicity and surface accessibility of these vaccines were also evaluated ( Table 8). All designed vaccines Table 5. Selected B-cell epitopes. The B-cell epitopes were predicted using BCPREDS web-based tool and were validated by BepiPred 2.0 software. Epitopes are represented as BCPREDS predicted. Green colored epitopes were predicted by BepiPred. Additionally, underlined amino acids are structurally coiled whereas the amino acids (not underlined) are structurally helix. For surface accessibility, subscripted amino acids are buried as against the exposed amino acids. VC viral component; Pos. position; B-score BCPred score, A-score ABCpred score, AT-score antigenicity score. For antigenicity prediction, the threshold was set to 0.4, which implies that any predicted epitope with antigenicity score ≥ 0.4 is said to be antigenic. In addition, regions of the B-cell epitopes predicted by SVMTrip with a score of 1.00 were italicized.  Table 6. Composite table of the prioritized epitopes for vaccine development. Underlined residues are structurally coiled whereas the amino acids (not underlined) are structurally helix. For surface accessibility, subscripted amino acids are buried as against the exposed amino acids. In addition, regions of the B-cell epitopes predicted by SVMTrip with a score of 1.00 were italicized. LGKAYAQMWSLMYFH TREEFISKVRSNAAI CLGKAYAQMWSLMYF PWDVIPMVTQLAMTD ACLGKAYAQMWSLMY www.nature.com/scientificreports/ are antigenic in nature, non-allergenic and non-toxic. Furthermore, their surface accessibilities were predicted to be localized outside of the cell. The physicochemical properties of these vaccines were computed using various web-based servers. Parameters evaluated include; solubility, amino acid composition (AAS), molecular weight Mol. W), theoretical pI (Theo. pI), extinction coefficient (Ext. coeff), half-life, instability index (I.I), aliphatic index (A.I), and the grand average of hydropathicity (GRAVY) ( Table 9). DV-1 had the highest predicted solubility and Theo. pI of 0.636 and 9.87 respectively. However, DV-4 had the highest AAs sequence and predicted ext. coeff of 400 and 59,360 M −1 cm −1 respectively. All of the designed vaccines were found to have similar predicted half-life of 30 h. However, DV-3 had the highest predicted aliphatic index (82.08) while DV-2 had the highest GRAVY of -0.167 among the designed vaccines. However, further laboratory studies are required to investigate the accuracy of these results.    www.nature.com/scientificreports/ Secondary structural prediction and validation of the designed vaccines. The secondary structure of the four designed vaccines were predicted by PSIPRED (Fig. 2). The structures are composed of coils, helixes and strands with no structural deformities. Epitopes often are found in coils and needs to be exposed. The sequences were then modelled by several prediction tools, including Schrödinger, to ensure accuracy (Fig. 3).
Docking study of the designed vaccines against the immune molecules. The interactions between an antigenic molecules and an immune receptors molecules are crucial to effectively transport the antigenic molecule and to activate immune response 54,55 . Therefore, docking analysis was carried out between the immune receptor molecules (MCHs and TLRs) and the designed vaccines (DV-1-4) in order to investigate possible interactions, binding energy and poses. Docking analysis was carried out using Schrödinger as well as PATCHDOCK and FIREDOCK to enhance the accuracy of the prediction (Fig. 5). Schrödinger was specifically used to evaluate the generated poses including their solid molecular surface display (Figs. 6 and 7) 56 . The PATCH-    www.nature.com/scientificreports/ DOCK server was used to generate the binding scores, area and ACE. The top 100 poses were refined using the FIREDOCK server where the global and hydrogen bond energies were generated for the docking calculation (Table 10)    Immune simulation analysis. The result of the immune simulation after injection of DV-1 is presented in Fig. 10. In response to several exposures of DV-1 to the host immune system, a significantly high level of the secondary and tertiary antibodies was observed (higher antibody detection) when compared to the primary antibodies. Consequently, rapid clearance of the antigen was detected (decrease in antigen concentration) (Fig. 10A). The different immune cell population/state count is presented in Fig. 10B-H. These immune cells were significantly increased with significant memory cell development; however, T-cells were gradually decreased after their initial increase. Similarly, an increase in the levels of cytokines were observed for IFN-ϒ (> 400,000 ng/ml) (Fig. 10). These observations suggest that immune memory development and the vaccine constructs may therefore confer immunity against the dengue virus.

Conformational stability analysis of the designed vaccines. Stability of the complex interactions
as well as the dynamic behaviour observed between the designed vaccines and the receptors were elucidated through 100 ns MDs at 300 °K. The statistical parameters such as the root mean square deviation (RMSD), root mean square fluctuation (RMSF), and radius of gyration (rGyr) were used to probe the system for stability and structural adjustment of the vaccine in the binding domains of the selected receptor ( Fig. 11A-D). As a measure of the distance between the Cα atoms of superimposed proteins, RMSD was used to evaluate the stability of DV-1 in the binding domain of TLR4 (Fig. 11B). Residual flexibility of DV-1-TLR4 was investigated by RMSF (Fig. 11C) in order to determine the extent and effect of contact deviation on the binding of DV-1 to TLR4. The average RMSF detected for the system was 1.2 Å. In addition; rGyr was used to characterize the secondary elements regular packing in the 3D structure of the DV-1-TLR4 complex and to evaluate complex compactness.
The rGyr provides effective information on the tendency of protein structures to expand during MDs 57 . At the time of simulation, the rGyr starts with a sudden increase of 4 ns and remained constant with an average value of 5.98 Å (Fig. 11D). The higher the rGyr the less compact the protein, with less folded stability. Interestingly, the rGyr profile of the system is consistent with the RMSD and RMSF profiles for complex fluctuation 58 .  68 . Individuals with primary infection are highly prone to develop dengue hemorrhagic fever and dengue shock syndrome during secondary infection due to antibody-dependent enhancement 69,70 . To date, there is no specific treatment for dengue fever and the main preventive measure is by vaccination in order to reduce the burden of the disease. Hence, this study aimed at designing novel multiepitope based DENV vaccine candidates capable of evoking immune responses in DENV infected individuals using computational approaches. Epitope-based vaccines have potential advantages over conventional vaccines and they have received considerable attention in recent years. DENV protein components were retrieved and used for the multi-epitope based design of dengue vaccine in this study. The selected components were subjected to physical and chemical properties prediction prior to antigenic determination of the sequences. Using the threshold of 0.4, the viral components were classified into antigens and non-antigens using the tumor model of VaxiJen (VaxiJen v2.0). Scores below this threshold were classified as non-antigen and discarded. For accurate component selection, the localization and transmembrane helices of the protein sequences were further determined. This process ascertains the suitability of candidate www.nature.com/scientificreports/ vaccine selection for experimental validation in vaccine development process 71,72 . The final result of all the structural proteins and NS1, NS3, and NS5 from the non-structural proteins were considered for epitope-based vaccine design in this study. Furthermore, the B and T-cell epitopes were predicted from these antigenic viral components. A good vaccine should be able to induce immunity through the antigen with a durable adaptive immunity. A total of 21 antigenic and non-allergenic epitopes were predicted for both MHC classes, while nine antigenic, non-toxic and non-allergenic epitopes were predicted for the B-cells. The CTL epitopes are responsible for developing durable immunity capable of eliminating circulating virus and infected cells 73 . On the other hand, HTL epitopes are associated with the production of both humoral and cellular immune responses. These epitopes elicit a CD4 + helper response for the generation of protective CD8 + T-cell memory and activation of B-cells 74 .
The homology modeling of all the predicted multi-epitope vaccines were predicted by Schrödinger and structurally confirmed by three web-based protein structural prediction tools (Phyre 2 , RaptorX, and SCRATCH predictions). The predicted 3D structures of the multi-epitope vaccines were validated using PROCHECK at PDBSum Generate server and ProSA-web. The result of the predicted structural analysis by PROCHECK server is a Ramachandran plot. This plot gives information about the torsional angles (phi (φ)and psi (ψ)) of the residues contained in the input PDB file. The statistics confirmed that approximately 90% of the predicted vaccine residues are in both the most favoured and additional allowed regions while only few residues were detected in the generously allowed and disallowed regions. The model structures were further queried for potential errors by ProSA-Web. This software tool provides a graphical output of the overall and local model quality based on the input and Z-score of the predicted model. The modeled protein structures of Yadav et al. 75 were reported to be − 5.98 kcal/mol and 6.66 kcal/mol for template and target, respectively. In addition, an investigation of immunogenic properties of Hemolin using an immunoinformatic approach, showed that the Z-score of the modeled protein was − 7.08 kcal/mol 76 . With Z-scores of − 5.26 kcal/mol., − 9.5 kcal/mol and − 2.11 kcal/mol, Overall the results of the validation tools provide evidence that the predicted vaccine structures are acceptable and of good quality. Therefore, the 3D-structures of the predicted vaccines are reliable and were considered for downstream analysis. Molecular docking is a crucial tool employed in the field of Bioinformatics to explore specific interactions and binding affinities of ligands towards selected receptors based on the lock and key mechanism. In the field of immunoinformatics, protein-protein docking is usually employed to investigate the best, stable and effective vaccine through their poses, interacting atoms and their binding energies. This procedure was used to examine the best possible vaccine construct(s) against the MHC alleles (HLA-A*11-01 and HLA-DRB1*04-01) as well as the immune molecules (TLR2 and TLR4). Several immunoinformatics studies have shown the importance of these receptors in the production of an effective immune response [80][81][82][83] . TLRs have a significant function in innate immunity such as immune activation and adaptive immune response. Also, the involvement of TLR2 and TLR4 were previously investigated in the recognition of viral structural proteins leading to inflammatory cytokine production 84 . The docking calculation between the designed constructs (DV-1-4) and the receptors (MHCs and TLRs) showed that the designed vaccines perfectly fit into the binding pockets of the receptors, and therefore, the vaccines may generate stable immune responses. Due to the global energies of the DV-1 and its excellent performance, it was considered for MDs study.
In silico cloning was carried out in a pET28a (+) vector after codon optimization by the JCAT web server in order to avoid codon bias 85 . The CAIs and the GC content of the optimized nucleotides were within the accepted www.nature.com/scientificreports/ ranges of 0.8-1.0 and 30-70% respectively. These data confirmed that the designed constructs were reliable with efficient expression in the E coli strain K12. C-ImmSim is a data-driven immune prediction algorithm and an agent-based simulator of the immune response previously used to study the immunogenic potentials of predicted vaccines 86 . This algorithm simulates the major functional mammal system components (bone marrow, thymus and lymph node). Since a potent vaccine must imitate the natural immunity induced by antigen with the production of long-lasting adaptive immunity, the response of the immune cells (HTL, CTL, B-cells, dendritic cells, immunoglobulins and cytokines) were reported against the designed vaccines. Immune simulation with DV-1 was predicted to trigger IgG, IgM, B-cell, T-cell and cytokines. Similar to the results of previous immunoinformatics vaccine designed studies [87][88][89] , the designed vaccines may offer protection against DENV.
The stability of the complex (DV-1-TLR4) was validated by performing a 100 ns at 300° Kelvin MDs using the Schrödinger software package. OPLS_2005 force field and a predefined solvent model of TIP-3P was chosen for the calculation. Furthermore, NaCl was added as (Na + and Cl − ) and the final salt concentration was 0.15 M in order to simulate the background salt at psychological conditions.From the result obtained, a slight fluctuation of TLR4-vaccine complex was observed from the start of the simulation period to about 300 ns after which there was no major fluctuation in the RMSD profile of the complex. The RMSF profile showed the presence of a highly fluctuating N-terminal region at the beginning of the simulation. This result was consistent with previous studies where immunoinformatics of vaccine-receptor complex stabilization was achieved within a similar time-scale using MDs 59,73,90,91 . The rGyr measures the ' extendedness' of a ligand, and is equivalent to its principal moment of inertia. The rGyr of the DV-1-TLR4 complexes ranged between 5.5 Å to 6.1 Å. This change in rGyr reveals the compactness of the vaccine construct-TLR4 complexes. This result implies that the designed vaccines can strongly interact with immune receptors.
Limitations. This study highlighted an alternative vaccine approach based on multi-epitope construction of the protein components of dengue genome to handle antigenic complexity. Although the predicted vaccines were recommended based on immunoinformatics strategies and believed to be immunogenic, the extent of protection from dengue infection is unknown. Immunoinformatics approaches are extremely useful to perform in silico studies and can guide laboratory experiments contributing to save time and money. Nonetheless, the next phase is to carry out in vitro immunological assays to validate the predicted vaccines, determine their immunogenicity, and further devise challenge-protection preclinical studies to eventually certify these approaches.
There are some limitations to this current study. First, challenges such as standard benchmark, insufficient prediction methods, and unavailability of precise datasets for different computational analysis are associated with immunoinformatics approach. Secondly, little is known about the pathogenic entry and specific immunebypass mechanism of dengue within the human host. With this knowledge, the vaccine could be developed more effectively and efficiently to control and/or prevent the pathogenesis of dengue. Thirdly, the proposed vaccine constructs await experimental investigated through in-vitro and in-vivo bioassays to demonstrate its safety, efficacy, and immunogenicity against dengue by producing an active immunological memory.
On a whole, the application of these results is pending validation in the wet lab experimental models. Figure 12 illustrates the stepwise flow of the methodology followed to design a vaccine against DENV.

Methodology
Sequence retrieval. All the protein components of DENV 1-4 were retrieved from the virus pathogen resource (VIPR) database at https:// www. viprb rc. org/ brc/ home. spg? decor ator= vipr and confirmed with NCBI. These protein components include: the structural proteins (Capsid protein (C), membrane glycoprotein precursor (M), and envelop protein (E)) and the 7 non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B and NS5). A single file containing all the sequences was generated in FASTA format.
Protein selection for vaccine preparation. The complete amino acid sequences of the proteins were retrieved from the NCBI database in the FASTA format. The proteins were subjected to signal peptide analyses using SignalP 4.1 server (http:// www. cbs. dtu. dk/ servi ces/ Signa lP/) to discriminate classical secretory and nonsecretory proteins. SignalP integrates a prediction of cleavage sites and signal/non-signal peptide predictions based on a combination of several artificial neural networks 92 signal was reported to outperform other predictors in benchmark tests 93 . Localization predictions were conducted using DeepLoc (http:// www. cbs. dtu. dk/ servi ces/ DeepLoc/), a template-free algorithm that uses deep neural networks to predict protein subcellular localization using only sequence information with good accuracy 94 .
Multiple sequence alignment and antigen selection. The NCBI BLAST program at https:// blast. ncbi. nlm. nih. gov/ Blast. cgi was used to identify highly conserved sequences of the DV. In addition, multiple sequence alignment was achieved using the Multalin server at https:// www. multa lin. toulo use. inra. fr/ multa lin. The antigenicity of the viral components (structural and non-structural) were evaluated using VaxiJen 2.0 server (http:// www. ddgph armfac. net/ vaxij en/ VaxiJ en/ VaxiJ en. html 95 . Finally, the most highly conserved sequences and antigenic peptides were selected for downstream analysis. Allergenicity prediction. Allergenicity prediction is a critical step in therapeutics and bio-pharmaceuticals due to their involvement in foods and/or food products. Sequence identity of known allergens is important for predicting the cross-reactive potential of novel proteins and their resistance to pepsin digestion and glyco- www.nature.com/scientificreports/ sylation status is used to evaluate denovo allergenicity potential 96 . AllerTOP v2.0 is a bioinformatics tool and web-based server used for protein allergenicity prediction in this study. As outlined by Dimitrov et al. 97 , this tool (https:// www. ddg-pharm fac. net/ Aller TOP/ index. html) uses an auto cross covariance transformation of protein sequences into uniform equal-length vectors. The method was previously applied to quantitative structureactivity relationship studies of peptides with different lengths using five descriptors such as hydrophobicity, molecular size, helix-forming propensity, relative abundance of amino acids and β-strand forming propensity 98 .

Antigenicity.
VaxiJen is a web-based server for prediction of protective antigens, tumour antigens and subunit vaccines 95 . This server accessed at http:// www. ddg-pharm fac. net/ vaxij en/ was used to determine the best antigenic protein components and to identify epitopes from DENV-4. The viral components' sequences, predicted epitopes, and constructed vaccines were used as input with 'tumor' as selected organism target at a threshold of 4.0.
B-cell epitope prediction and selection. The B-cell epitopes were predicted using the BCPreds server at http:// ailab. ist. psu.edu/bcpred/. This server uses a novel kernel method for predicting linear B-cell epitopes. The epitope length was kept at 20 amino acids with 75% sensitivity. The use of multiple tools in epitope prediction has been reported to improve the rate of true positives in epitopes prediction 99 . Therefore, other prediction servers such as IEDB analysis Resource (http:// tools. iedb. org/ popul ation) 100 and the BepiPred-2.0 web server (http:// www. cbs. dtu. dk/ servi ces/ BepiP red/) were used for validation. The method employed by BepiPred-2.0 is more superior to other available tools for sequence-based epitope prediction based on the epitope data derived from solved 3D structures and a large collection of linear epitopes downloaded from the IEDB database. In addition, the proteins were subjected to linear epitope prediction using ABCpred at https:// webs. iiitd. edu. in/ ragha va/ abcpr ed/ index. html to compute ABCpred epitope scores at a threshold of 0.5. At this threshold, the tool shows 65.93% accuracy with equal sensitivity and specificity using a window length of 16. Finally, the server, SVMTrip at http:// sysbio. unl. edu/ SVMTr iP/ predi ction. php was used to improve the prediction performance for the predicted B-cell epitopes.
T-Cell epitope prediction analysis. Following the combined methods of Shey et al. 101 , Ali et al. 53 and Ullah et al. 48 , the IEDB epitope analysis resource at https:// www. iedb. org/ home_ v3. php was used to predict the MHC class I and MHC class II binding epitopes. Briefly, the MCH class I restricted CD8 + cytotoxic T lymphocyte (CTL) epitopes from the antigenic viral components of DENV-4 were predicted using the NetMHCpanEL 4.0 prediction method for HLAA*11-01 allele. The result of this method was further subjected to the SMM method in order to prioritize the selected CTL epitopes using the IC 50 threshold of 500 nM and percentile ranks of ≤ 0.2. In addition, the 15-mer MCH class II-restricted CD4 + helper T lymphocyte (HTL) epitopes were predicted using the NetMHCIIpanEL 4.0 prediction method for HLA DRB1*04-01 alleles. The predicted classes of the MHCs were further screened for antigenicity, non-allergenicity, non-toxicity, highly conservancy and transmembrane topology. www.nature.com/scientificreports/ Toxicity and transmembrane topology analysis. ToxinPred is an in silico software tool used for the prediction as well as the design of both toxic and non-toxic peptides. This tool predicts the toxicity of the epitopes according to their physicochemical properties of the input sequences. With the default settings, the predicted sequences were used as input in order to determine their toxicity. Epitopes that were not predicted as allergens by AllerTOP were subjected to toxicity analysis by the ToxinPred webserver at http:// crdd. osdd. net/ ragha va/ toxin pred/. All the peptides predicted as toxic were discarded. In addition, the transmembrane topology experiment of the viral component and the predicted epitopes were evaluated using the TMHMM v2.0 server (http:// www. cbs. dtu. dk/ servi ces/ TMHMM/). This tool is user-friendly for determining the transmembrane topology of epitopes using their sequences as input 102 .
Analysis of the physicochemical properties and construct solubility. The physicochemical properties of the viral components and the constructed vaccines were predicted by ExPASy ProtParam at https:// web. expasy. org/ protp aram/ 103 . Characteristics such as number of amino acids, molecular weight, theoretical pI, half-lives, instability indexes, extinction coefficient, aliphatic index and grand average of hydropathicity were evaluated 104 . Protein-Sol (https:// prote in-sol. manch ester. ac. uk/) was used to evaluate the solubility of the multiepitope based vaccines 105 .
Epitope conservancy analysis. The predicted epitopes were subjected to conservancy analysis in order to predict the degree of similarity within the serotypes of dengue virus at sequence identity threshold of ≥ 80%.
Using the IEDB analysis resource at http:// tools. iedb. org/ conse rvancy/, each of the epitopes predicted were used as input against all the serotypes of dengue (DENV 1-4).
Vaccine construction. Vaccine construction requires three requisite elements in order to create an efficient molecule capable of eliciting an immune response. These elements include the identified epitopes (the B-and the T-cell epitopes), adjuvants and the spacer sequences. The predicted epitopes from the DENV-4 were selected for the final vaccine construction after the in-house screening processes. In order to amplify the immune system, molecules with immunomodulatory properties were added to the vaccine constructs as safe vaccine adjuvants. The adjuvants were connected to the epitopes using the rigid linker EAAAK. The glycine-proline rich GPGPG linkers were used to space the B-cell and HTL epitopes. In addition, the CTL epitopes were linked using an Homology modelling of the predicted epitopes. The 3D structures of the epitopes and the designed vaccines were modeled using the structure prediction wizard of prime module in the Schrodinger suites v2020-3 using the Maestro v12.3 software interface. Since the process of constructing a 3D atomic-resolution model is based on the principle that evolutionary related proteins share similar sequence and protein structures, homology modeling therefore, relies on high quality sequence alignment and template structure (knowledge-based method of modeling). The predicted epitopes and designed sequences were used as input individually in FASTA format followed by the BLAST homology search. Homology modeling steps such as template recognition and initial alignment, alignment correction, backbone generation, loop and side chain modeling, model optimization and validation were all performed on the 3D predicted models. In addition, the online protein structure predictions; Protein Homology/analogY Recognition Engine V 2.0 (Phyre 2 ) at http:// www. sbg. bio. ic. ac. uk/ phyre2/ html/ page. cgi? id= index, RaptorX at http:// rapto rx. uchic ago. edu/ Conta ctMap/, and SCRATCH protein predictor at http:// scrat ch. prote omics. ics. uci. edu/ were used to remodel and ascertain the modeled 3D structures of the designed vaccines.
Secondary structure prediction. PSIPRED V3.3 at http:// bioinf. cs. ucl. ac. uk/ psipr edtest was utilized to predict the secondary structures of the designed vaccines using their primary sequences as input. PSIPRED is a simple and accurate 2D structure prediction method, incorporating two feed-forward neural networks that performs an analysis on output obtained from PSI-BLAST (Position Specific Iterated-BLAST) 106 .
Structural validation of the modeled vaccines. Following the modified method of Saha and Prasad 107 , the PDBSum module in the EMBL-EBI database (https:// www. ebi. ac. uk/) and ProSA-Web at https:// prosa. servi ces. came. sbg. ac. at/ prosa. php were used to analyze the designed vaccine constructs before and after minimization and refinement in order to validate their purities 108 . Specifically, the PDBSum Generate at http:// www. ebi. ac. uk/ thorn ton-srv/ datab ases/ pdbsum/ Gener ate. html use the PDB files of the prepared constructs after homology modeling and preparation as input to evaluate purity using the PROCHECK and produces a graphical output in the form of a Ramachandran plot 109 . The higher the percentage of atoms in the most favored region the greater the resolution and purity and the models can be used for molecular docking calculations.
Receptor selection. It  www.nature.com/scientificreports/ (TLRs). Following previously methods of receptor selections 64,101,110 , the crystal structures of MHC class I allele (HLA-A*11-01) and MHC class II allele (HLA DRB1*04-01) with PDB IDs, 5WJL and 5JLZ were retrieved from the Protein Data Bank (PDB) at https:// www. rcsb. org/. In addition, the 3D structure of TLR4 dimer and TLR2 (PDB ID: 3FXI and PDB ID: 2Z7X) were retrieved for additional docking purposes. These proteins were refined, minimized and optimized using the protein preparation wizard in the Schrodinger suite and were further considered for molecular docking analysis with the multi-epitope constructed vaccines 111 .
In silico docking calculation. To inspect the binding affinity of the modeled vaccines to MHCs and TLRs molecules, the designed vaccines were docked against the selected receptors using the Prime protein-protein docking module in Schrodinger 112 . The binding score and the atomic contact energy were analyzed using the PATCHDOCK server at https:// bioin fo3d. cs. tau. ac. il/ Patch Dock/ php. php while the refinement of the generated poses including the global energy were carried out using the FIREDOCK server at http:// bioin fo3d. cs. tau. ac. il/ FireD ock/ php. php.
Optimization of designed vaccine candidate. Reverse translation and codon optimization were performed using the Java Codon Adaptation Tool (JCat) server (http:// www. jcat. de/) in order to express the multiepitope vaccine construct in a selected expression vector 113 . Codon optimization was performed in order to express the final vaccine construct in the E. coli (strain K12) host, as codon usage of E. coli differs from that of 'homo sapiens' , where the sequence of the final vaccine construct is derived. In the additional options section, the 'avoid the rho-independent transcription termination' , 'avoid prokaryote ribosome binding site' , and 'avoid restriction enzymes cleavage sites' were all checked. The JCat output includes the codon adaptation index (CAI) and GC content (%), which can be used to assess protein expression levels. CAI provides information on codon usage biases; the ideal CAI score is 1.0 but > 0.8 is considered a good score 114 . The GC content of a sequence should range between 30-70%. GC content values outside this range suggest unfavorable effects on translational and transcriptional efficiencies 53 . To clone the optimized gene sequence of the final vaccine constructs in E. coli pET-28a (+) vector, XhoI and NdeI restriction sites were added to the N and C-terminals of the DNA sequences of the predicted vaccines, respectively. Finally, the optimized DNA sequences (with restriction sites) were inserted into the pET-28a (+) vector using SnapGene tool to ensure vaccine expression.
Immune simulation of the designed vaccines. In order to understand the immune system relative to the predicted vaccines, C-ImmSim at http:// 150. 146.2. 1/C-IMMSIM/ index. php? page=1 were employed to assess the molecular binding of immune complexes. The principle of operation of this server is based on the position-specific scoring matrix and machine learning approaches to predict subunit vaccine and binding interactions. Following the clinical vaccine recommended procedures 115,116 , and immune simulation methods previously reported 89,101,117,118 , three injections were administered for a total of 1000 steps of simulation using the designed vaccine sequences as input.

Molecular Dynamic simulation (MDs).
MDs of the designed vaccines of the immunogenicity were characterized by the system build and molecular dynamic Desmond package in Schrödinger. Prior to simulation, the docked complexes were prepared using the system builder module in Maestro v12.4. The optimized potentials for the liquid simulations OPLS-2005 force field were used to determine the receptor interactions which were solvated with the simple point charged (TIP-3P) water model 119,120 . The orthorhombic water box was used to create a 10 Å buffer region between the atoms on the receptors and box sides. The volume of the box was minimized and the overall charge of the system was neutralized by adding Na + . The temperature and pressure were kept constant at 300° Kelvin and 1.01325 bar using the Nose-Hoover thermostat 121 and Martyna-Tobias-Klein barostat methods. The simulations were performed using NPT ensemble by considering the number of atoms, pressure and timescale and the simulation time at 100 ns. The MD results were analyzed by simulation interactions diagram module and MS-MD trajectory analysis.

Conclusion
In this study, immunoinformatics approaches were employed in the development of potential vaccine candidates against DENV considering their ease of use in experimental investigations. Since the only viable defense for DENV infection is a vaccine; four potential vaccines were designed and predicted to elicit specific immune responses in individuals with dengue infection. The results of this study are based on computational approaches. Therefore, it is required that both in-vivo and in-vitro laboratory experiments be carried out to ascertain the results of this study.

Data availability
All data generated or analysed with respect to this study are included in the submitted manuscript. The sequences of the protein analysed can be retrieved from NCBI (https:// www. ncbi. nlm. nih. gov/), UniProt (https:// www. unipr ot. org/), and Protein Data Bank (https:// www. rcsb. org/) using their accession numbers. www.nature.com/scientificreports/