Identification and construction of a multi-epitopes vaccine design against Klebsiella aerogenes: molecular modeling study

The rapid rise in antibiotic resistance among bacterial pathogens is driven by their adaptation to changing environmental conditions. Antibiotic-resistant infections can be reduced in a number of ways, including the development of safe and effective vaccines. Klebsiella aerogenes is a Gram-negative, rod-shaped bacterium that is resistant to a variety of antibiotics, and no commercial vaccine is available against it. Identifying antigens that can be readily evaluated experimentally is crucial to successful vaccine development. Reverse vaccinology (RV) was used to identify vaccine candidates based on complete pathogen proteomic information. The fully sequenced proteomes comprise 44,115 proteins in total, of which 43,316 are redundant and 799 are non-redundant. Subcellular localization analysis placed 1 protein in the extracellular matrix, 7 in the outer membrane, and 27 in the periplasmic space. A total of 3 proteins were found to be virulent. Next, in the B-cell-derived T-cell epitope mapping phase, the 3 proteins (Fe2+-enterobactin, ABC transporter substrate-binding protein, and fimbriae biogenesis outer membrane usher protein) passed the antigenicity, toxicity, and solubility checks. GPGPG linkers were used to prepare a vaccine construct composed of 7 epitopes and a cholera toxin B subunit (CTBS) adjuvant. Molecular docking of the vaccine construct with major histocompatibility complex class I (MHC-I), class II (MHC-II), and Toll-like receptor 4 (TLR4) revealed robust interactions and stable binding poses. In molecular dynamics simulations, the vaccine-receptor complexes showed stable dynamics and a uniform root mean square deviation (RMSD). Further, the binding energies computed for the complexes again indicated strong intermolecular binding and the formation of stable conformations.


Subcellular localization analysis.
Surface-localized proteins were prioritized as vaccine candidates because they are readily recognized by the host immune system. PSORTb 3.0 was used for this analysis 25. Only extracellular, outer membrane, and periplasmic proteins were selected; all others were excluded from further processing.

Virulent protein analysis. The virulence factor database (VFDB) is a dataset of bacterial virulence factors. We analyzed the prioritized surface-localized proteins for virulence using the Basic Local Alignment Search Tool (BLASTp) against the complete VFDB protein dataset 26. The cut-off values were a bit score of 100 and a sequence identity of 30%. Proteins that did not meet these criteria were dropped 23.
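The VFDB screening step can be sketched as a filter over BLASTp hits. This is a minimal illustration, assuming the hits were exported in the standard 12-column tabular format (`-outfmt 6`), in which column 3 holds percent identity and column 12 holds the bit score. The function name and the sample rows are hypothetical, and because the text does not state whether the cut-offs are strict or inclusive, the sketch assumes inclusive comparisons.

```python
def virulent_queries(blast_lines, min_bitscore=100.0, min_identity=30.0):
    """Return the query IDs that have at least one hit meeting both cut-offs.

    Assumes BLAST tabular rows: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    keep = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]
        identity = float(fields[2])
        bitscore = float(fields[11])
        if bitscore >= min_bitscore and identity >= min_identity:
            keep.add(query_id)
    return keep


# Hypothetical hits, for illustration only (not data from the study).
example_hits = [
    "protA\tVFG001\t45.2\t300\t150\t4\t1\t300\t5\t304\t1e-80\t250.0",
    "protB\tVFG002\t28.0\t120\t80\t3\t1\t120\t1\t120\t2e-05\t45.1",
    "protC\tVFG003\t62.5\t80\t30\t0\t1\t80\t1\t80\t1e-20\t95.0",
]
print(sorted(virulent_queries(example_hits)))
```

In this toy input only `protA` passes: `protB` fails both cut-offs and `protC` has high identity but a bit score below 100.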

Analysis of BLASTp against microbiome and humans. To prevent autoimmune responses to self-antigens, the filtered virulent proteins were checked for homology to the human proteome and the normal flora. The selected proteins were searched with BLASTp against the human proteome and three bacterial species: Lactobacillus casei (taxid: 1582), L. johnsonii (taxid: 33959), and L. rhamnosus (taxid: 47715) 27. A cut-off E-value of 10⁻⁴ and a sequence identity of ≤ 30% were used.
Vaccine epitopes prioritization phase. The proteins were investigated further for B-cell and T-cell epitope prediction. The epitopes then underwent physicochemical analysis, transmembrane helix analysis, and antigenicity, allergenicity, and adhesion probability prediction 28.

Physicochemical analysis. The physicochemical characteristics of the selected proteins were analyzed, including instability index, molecular weight, atomic composition, theoretical isoelectric point (pI), aliphatic index, amino acid composition, and the grand average of hydropathicity (GRAVY). ProtParam 29 was used to calculate these properties. Proteins with a molecular weight below 110 kDa and an instability index < 40 were regarded as appropriate vaccine targets 30. Stable proteins were analyzed further, while unstable proteins were removed from the study.
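Two of the quantities listed above have simple closed-form definitions, which the following sketch illustrates. It assumes the standard published definitions: GRAVY as the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent of each residue. It is not ProtParam's code, the function names are hypothetical, and the instability index is omitted because it requires a dipeptide weight table.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy sequence for illustration only (not a protein from the study).
toy = "AVIL"
print(round(gravy(toy), 3), round(aliphatic_index(toy), 1))
```

A positive GRAVY indicates an overall hydrophobic sequence and a negative value a hydrophilic one, which is why the toy all-aliphatic peptide scores high on both measures.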
Transmembrane helices. Transmembrane helices were detected with HMMTOP 2.0 and TMHMM 2.0 31. Proteins with 0 or 1 predicted transmembrane helices were selected, because such proteins can be easily cloned and purified 32.

Antigenicity and allergenicity prediction. Antigenicity refers to a protein's ability to bind to immune cells and produce an appropriate immune response. In VaxiJen 2.0, a cut-off value of ≥ 0.4 was used to identify potential antigenic vaccine candidates. An antigenic vaccine may provoke the host immune system more effectively 33. These potential candidates then had to pass an allergenicity check with the online server AllerTOP 2.0 34.
Adhesion probability analysis. The next step was to select and prioritize proteins by their adhesion probability. Candidates with a value above 0.5 were considered adhesive. Robust host immunity is more likely when the vaccine attaches effectively to host immune cells. In the adaptive immune response, antibodies and T-cell receptors (TCRs) are produced through adhesion to and interaction with the host's immune receptors. The Vaxign web server was used for this task 35.

Epitopes prediction phase. The immune epitope database (IEDB) was used to predict B-cell and T-cell epitopes 36. The IEDB covers immune epitopes of all types and makes them easily accessible. B-cell epitopes were predicted first and were then used to predict T-cell epitopes. Predicted epitopes with a low percentile rank were prioritized. Binding potency was also analyzed with the MHC-Pred tool for putative epitopes of the DRB1*0101 allele 37. VaxiJen 2.0, ToxinPred, and VirulentPred were used to assess the antigenicity, toxicity, and virulence of the selected epitopes, and their water solubility was also evaluated 27.
Multi-epitope vaccine designing and processing. Multi-epitope vaccines combine multiple antigenic epitopes. After screening, the selected epitopes were joined with GPGPG linkers 38 to create the multi-epitope vaccine. Cholera toxin B subunit 39 was added to the vaccine as an adjuvant to boost immunogenicity.
To determine the most stable vaccine structure that will allow molecular recognition, we modeled the vaccine construct using the 3Dpro tool of the SCRATCH suite. The stability of the design was checked by inspecting its 3D structure 40.

Loops modeling and galaxy refinement. The GalaxyLoop tool of the Galaxy web server was used to model the loops of the vaccine construct and remove unnecessary loops 41. Galaxy Refine 2.0 was then applied to the loop-modeled construct to remove steric clashes and rebuild its side chains. The refined vaccine construct was deemed a good and promising vaccine candidate 42,43.

Disulfide engineering. To improve the vaccine's stability, a disulfide engineering approach was used. Lowering the conformational energy increased the stability of the vaccine. Inter-chain and intra-chain bonds were engineered to improve stability. The Disulfide by Design 2.0 web server was used for disulfide engineering 44,45.

Codon optimization. The multi-epitope vaccine's DNA sequence was generated and adapted to the Escherichia coli expression system using the Java Codon Adaptation Tool (JCat). Expression of the vaccine in E. coli was assessed through the codon adaptation index (CAI) and the GC percentage of the sequence 46.

Molecular docking. The vaccine construct was docked with several immune cell receptors to estimate how effective it would be in generating an immune response 47. The vaccine's binding affinity was predicted in a blind docking study for the TLR-4 (4G8A), MHC-I (1I1Y), and MHC-II (1KG0) receptors 46. Docking was performed with PatchDock 48, and the docked complexes were then refined with the FireDock server 49.
Molecular dynamics simulation. Molecular dynamics simulations were performed using the AMBER20 software. Antechamber was employed to preprocess the complexes, and the vaccine-receptor complexes were solvated in TIP3P water boxes. The ff14SB force field was applied to the docked complexes. Bonds involving hydrogen atoms were constrained with the SHAKE algorithm 50. Each system was equilibrated for 1 ns, followed by a production run of 100 ns. The trajectories were examined using the CPPTRAJ module. In addition, the AMBER MMPBSA.py module was used to assess the intermolecular affinity between the vaccine and the receptor 51, using only 100 frames.
C-Immune simulation. For deciphering host-immune responses to the designed vaccine, the C-ImmSim server was used 52 . Antigens on this server were characterized and their profiles calculated to determine the immune responses of the host, which in this case is the human body.

Results
Complete proteome retrieval. The NCBI databases (https://www.ncbi.nlm.nih.gov/) were consulted to retrieve the complete proteomes of 17 strains of K. aerogenes, followed by a complete pan-genome analysis. The GC content of these strains varies from 54.46% to 57.06%, while genome size ranges from 5.09 Mb to 5.81 Mb. Table 1 lists the genome size, type, and GC content of each strain.

BPGA analysis. In Fig. 2, each of the 17 strains is plotted against the number of gene families. The core-pan plot shows that the pathogen's pan-genome is open, so there is a very high probability of gaining new genes over time as a result of genome plasticity. Further, the distribution analysis of COGs shows that the core proteins also play a role in metabolic regulation and biogenesis. Core-genome genes are also assigned to information storage and processing, which covers RNA processing, replication and recombination, translation, and transcription. Figure 3 shows the pan-phylogenetic tree for all 17 K. aerogenes strains.
CD-HIT analysis. CD-HIT was employed to retrieve core proteome sequences without duplicates 23. According to Fig. 4A, 43,316 proteins are redundant and 799 are non-redundant, while the whole proteome has 44,115 proteins. Redundant protein sequences were eliminated because repetitions of the same proteins are not necessary for vaccine development, while the redundancy-free core proteins were carried into the subcellular localization and virulence analysis phases.
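The idea of redundancy removal can be illustrated with a deliberately simplified sketch. CD-HIT clusters sequences at a chosen identity threshold; the code below does not reproduce that algorithm and only removes exact duplicate sequences from FASTA-formatted text, keeping the first occurrence. All names and the toy records are hypothetical.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_exact_duplicates(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Toy input for illustration only (not sequences from the study).
example_fasta = """>strain1_p1
MKTLLA
>strain2_p1
MKTLLA
>strain1_p2
MSSQW
"""
unique_records = remove_exact_duplicates(parse_fasta(example_fasta))
print([header for header, _ in unique_records])
```

A real run would use CD-HIT itself so that near-identical sequences, not just identical ones, collapse into one representative.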

Subcellular localization.
When surface or membrane proteins are used in vaccine design, the immune system recognizes them easily and an immune response can be generated. Candidate proteins were therefore subjected to subcellular localization analysis 25. The analysis revealed a total of 35 surface-associated proteins: 1 extracellular, 7 outer membrane, and 27 periplasmic.

VFDB analysis. Infectious pathogens are mainly characterized by the presence of virulent proteins 53. The VFDB analysis classified a protein sequence as virulent if its bit score was > 100 and its sequence identity was greater than 30%. Three of the 35 surface-localized proteins were identified as virulent: two periplasmic proteins and one outer membrane protein (Table 2).
Human and normal flora, adhesion probability, physiochemical property analysis. All the virulent proteins were examined for homology against the human proteome and the gut flora in order to prevent autoimmune responses. One protein showed homology with humans and with the normal flora of the host and was discarded. Sequences with more than one transmembrane helix were to be removed from further analysis 31, while those with zero or one transmembrane helix were retained; no sequence was discarded at this step. Two protein sequences were forwarded for further analysis.

Vaccine epitopes prioritization phase. The proteins that passed the above steps and checks entered the epitope prioritization phase, in which both B-cell and T-cell epitopes were predicted for their ability to stimulate immune responses.

B-cell epitopes prediction. B-cell epitopes drive humoral immunity, in which stimulated B-cells differentiate into plasma cells. Two proteins were selected for B-cell epitope prediction: Fe2+-enterobactin ABC transporter substrate-binding protein and fimbrial biogenesis outer membrane usher protein. B-cell epitopes were predicted first, followed by T-cell epitopes. The predicted peptides for the two selected proteins are listed in Table 3.

T-cell epitopes prediction. T-cell epitopes are primarily responsible for triggering cellular immune responses, also known as cellular or T-cell-dependent immunity. Following recognition of peptide antigens, T lymphocytes proliferate and differentiate as part of the primary immune response. Using the predicted B-cell epitopes as input, we predicted B-cell-derived T-cell epitopes able to activate cellular immunity; the lowest percentile scores were used to identify MHC-I and MHC-II epitopes (Table 4).

Epitope prioritization phase. Several analyses were performed after epitope prediction, including binding affinity assessment for DRB1*0101, followed by allergenicity and solubility analysis. For the immune system to respond, vaccine epitopes must bind to immune cell receptors. All selected epitopes were therefore assessed for their potential to bind the HLA DRB1*0101 allele 54, and only epitopes with an IC50 lower than 100 nM were considered for further analysis.

Antigenicity, allergenicity, solubility, and toxicity analysis of selected epitopes. A host's immune system can only be stimulated by antigenic proteins 33, so all non-antigenic sequences were excluded from the study. Allergenicity and toxicity analyses were performed to remove allergenic and toxic epitopes and thereby avoid allergic and toxic responses 34,41, and poorly water-soluble epitopes were also removed. Solubility was calculated with the InvivoGen web server, available at https://www.invivogen.com/ova-peptide. All epitopes shortlisted for the multi-epitope vaccine are given in Table 5.
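The prioritization logic described above can be sketched as a filter followed by a sort. This is a minimal illustration, assuming each epitope already carries the outputs of the external servers as plain fields; the only numeric threshold taken from the text is the IC50 cut-off of 100 nM. The class, field names, and example records are hypothetical and do not correspond to the peptides in Table 5.

```python
from dataclasses import dataclass


@dataclass
class Epitope:
    peptide: str            # label or sequence of the epitope
    percentile_rank: float  # prediction percentile rank (lower is better)
    ic50_nm: float          # predicted IC50 in nM
    antigenic: bool
    allergenic: bool
    toxic: bool
    soluble: bool


def shortlist(epitopes, max_ic50_nm=100.0):
    """Keep epitopes passing every check, ordered by percentile rank."""
    kept = [
        e for e in epitopes
        if e.ic50_nm < max_ic50_nm
        and e.antigenic
        and not e.allergenic
        and not e.toxic
        and e.soluble
    ]
    return sorted(kept, key=lambda e: e.percentile_rank)


# Hypothetical records, for illustration only.
example_epitopes = [
    Epitope("epitope_1", 1.2, 45.0, True, False, False, True),
    Epitope("epitope_2", 0.5, 250.0, True, False, False, True),  # IC50 too high
    Epitope("epitope_3", 0.8, 12.0, True, False, False, True),
    Epitope("epitope_4", 0.3, 30.0, True, True, False, True),    # allergenic
]
print([e.peptide for e in shortlist(example_epitopes)])
```

Filtering before sorting keeps the ranking restricted to epitopes that are usable at all, so a well-ranked but allergenic peptide never reaches the shortlist.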
Multi-epitope vaccine construction and processing. A multi-epitope vaccine is composed of several epitopes rather than a single one. To overcome the limitations of single-peptide vaccines, which are unable to generate an effective immune response against variants of the same pathogen, the vaccine construct was designed by linking the screened epitopes together through GPGPG linkers 55. A second linker, EAAAK, connects the CTBS adjuvant 39 to the vaccine construct in order to increase immune efficacy 50. In addition to providing rigidity, these linkers keep the epitopes separated so that they are efficiently recognized by the immune system. The designed vaccine is therefore expected to generate an immune response that is safe, robust, and efficient. Figure 5 is a schematic representation of the vaccine construct.
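At the sequence level, the construct described above is a concatenation, which the following sketch makes explicit. It assumes the adjuvant sits at the N-terminus, followed by the EAAAK linker and then the epitopes joined by GPGPG; the text does not state the order, so that placement is an assumption. The function name and the input strings are hypothetical placeholders, not the CTBS sequence or the study's epitopes.

```python
def build_construct(adjuvant, epitopes,
                    adjuvant_linker="EAAAK", epitope_linker="GPGPG"):
    """Join adjuvant, EAAAK linker, and GPGPG-linked epitopes into one sequence."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return adjuvant + adjuvant_linker + epitope_linker.join(epitopes)


# Placeholder strings for illustration only.
construct = build_construct("MKKK", ["AAAA", "CCCC", "DDDD"])
print(construct)  # MKKKEAAAKAAAAGPGPGCCCCGPGPGDDDD
```

With n epitopes the construct contains n − 1 GPGPG linkers and exactly one EAAAK linker, which is a quick sanity check on any assembled sequence.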
Structure modeling of vaccine. The 3Dpro tool of the SCRATCH suite 40 was used to model the three-dimensional structure of the vaccine. Since no appropriate template structure was available, the structure was modeled ab initio rather than with a homology-based or threading approach, as shown in Fig. 6. Good vaccine candidates must have structural stability. To avoid structural instability, we modeled loops in the vaccine candidate, including Cys30-Ile38, Glu50-Ile60, Ile61-Gly66, Ala67-Pro74, Thr99-Glu104, and Lys105-Asn111.

Table 5. List of all shortlisted possible antigenic, non-allergenic, nontoxic, and water-soluble peptides.

Disulfide engineering. Disulfide engineering was performed with Disulfide by Design 2.0 45 to ensure structural stability. Because covalent disulfide bonds keep the protein structure stable, the construct retains its geometry. Some amino acid residues are also susceptible to degradation by enzymes. Figure 7 shows, as yellow sticks, the cysteine residues substituted for these enzyme-degradable amino acids.

Table 5 columns: MHC-Pred | MHC-Pred IC50 value (nM) | Antigenicity | Allergenicity | Solubility | ToxinPred
Codon optimization. Codon optimization ensures that the construct's codons match the codon usage of the expression host as closely as possible, for maximum protein production. For maximal expression and production of proteins, CAI values between 105 and 107 have been estimated to be ideal 52.
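The two expression metrics used in this step can be sketched as follows. GC percentage is a direct count. For CAI, the sketch assumes the common definition as the geometric mean of each codon's relative adaptiveness w (a value in (0, 1] taken from a reference set of highly expressed host genes), which makes CAI itself lie in (0, 1]. This is not JCat's implementation, the function names are hypothetical, and the weights below are toy values rather than E. coli codon usage.

```python
import math


def gc_percent(dna):
    """Percentage of G and C bases in a DNA sequence."""
    seq = dna.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def codon_adaptation_index(dna, weights):
    """Geometric mean of relative adaptiveness over the codons of `dna`.

    `weights` maps codon -> relative adaptiveness in (0, 1].
    Codons missing from `weights` are skipped.
    """
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    log_sum, counted = 0.0, 0
    for i in range(0, usable, 3):
        w = weights.get(seq[i:i + 3])
        if w is None:
            continue
        log_sum += math.log(w)
        counted += 1
    if counted == 0:
        raise ValueError("no codon of the sequence is present in weights")
    return math.exp(log_sum / counted)


# Toy weights for illustration only (NOT real E. coli values).
toy_weights = {"GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}
print(gc_percent("ATGC"))
print(round(codon_adaptation_index("GCCAAG", toy_weights), 4))
```

Working in log space avoids numerical underflow when the product of many weights below 1 is taken over a long coding sequence.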
Analysis of molecular docking. To activate cellular and humoral immunity, the designed vaccine construct must interact with the host's innate and adaptive immune cells. The vaccine was therefore docked with the host immune receptors in order to estimate its binding affinity. The binding affinities with the host receptors MHC-I, MHC-II, and TLR-4 were analyzed using a blind docking approach 56. The docking results are reported in the accompanying tables (Table 1, S-Table 2, and S-Table 3; Table 4, S-Table 5, and Table 6).

Docked conformation of vaccine with immune receptors.
We have visualized the best-docked complexes for each immune receptor as illustrated in Fig. 8. This tight binding exposes epitopes, enabling the immune system to recognize and process them. Immune pathways can be stimulated by vaccine epitopes, which further implies that strong and protective immune responses can be formed.
Interactions of vaccine to immune receptors. To generate appropriate immune responses, vaccines need to interact properly with immune cell receptors of the host. The molecular docking results were examined for specific residue-wise interactions between the vaccine construct and the MHC-I, MHC-II, and TLR-4 receptors. Different residues of MHC-I interacted with the model vaccine construct. Similarly, the vaccine formed a significant interaction network with the MHC-II molecule 58. These interactions include hydrogen bonds, salt bridges, and van der Waals contacts, all occurring at close distances. TLR-4 belongs to the TLR family of proteins, which initiates innate and adaptive immune responses. Table 6 lists the residues of TLR-4, MHC-I, and MHC-II that interact with the model vaccine.

Molecular dynamics simulation. Molecular dynamics simulations are used in silico: atoms and molecules are allowed to interact for a specified period of time, giving an idea of how a system "evolves." The docked complexes were analyzed in depth over a period of 250 ns, since it is important to assess how the vaccine construct remains bound to the receptors over time 59. It is likewise imperative that the antigens in the vaccine remain exposed and recognizable by the immune system so that an adequate immune response can develop.
The systems showed no drastic changes over the simulation period, as shown in Fig. 9. RMSD of the carbon alpha atoms was the first analysis conducted. Each system's RMSD increased over time. The vaccine complexes with MHC-I and MHC-II deviated less than the complex with TLR-4: the RMSD of the TLR4-vaccine complex reached 4 Å, owing to its high number of loops and larger system size, whereas the RMSD values of the MHC-I and MHC-II complexes were 2.5 Å and 2.7 Å, respectively.
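For readers unfamiliar with the metric, the RMSD reported above can be sketched for a single frame. The code assumes the frame has already been superimposed on the reference structure, a fitting step that trajectory tools normally perform first and that is not reproduced here. The function name and coordinates are hypothetical.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two equally sized coordinate lists.

    Each argument is a list of (x, y, z) tuples in Å, for example the
    carbon alpha atoms of a reference structure and of one trajectory frame.
    """
    if not reference or len(reference) != len(frame):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, frame):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(reference))


# Toy coordinates for illustration only.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(ref, shifted))  # 2.0
```

Computing this value for every frame against the starting structure gives the RMSD-versus-time curve of the kind plotted in Fig. 9.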
Calculation of binding free energies. The binding free energies of the docked complexes were calculated using the MM-GB/PBSA approach 50. The binding free energy was −349.45 kcal/mol for the TLR4-vaccine complex, −194.72 kcal/mol for the MHC-I-vaccine complex, and −188.28 kcal/mol for the MHC-II-vaccine complex. In MM-GB/PBSA, complex formation is most favorable when the electrostatic and van der Waals energies combine. All three complexes show a dominant gas-phase energy, while the polar solvation energy is unfavorable. Table 7 summarizes the binding energy terms for the different complexes.
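The energy terms named above fit together as in the following sketch of the standard MM-GB/PBSA decomposition. It assumes the usual end-point formulation in which the entropic contribution is neglected; the text does not say whether an entropy term was computed, so that omission is an assumption. The subscripts are labels chosen here for readability.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}}
  - \left( G_{\mathrm{receptor}} + G_{\mathrm{vaccine}} \right) \\
\Delta G_{\mathrm{bind}} &\approx
  \underbrace{\Delta E_{\mathrm{vdW}} + \Delta E_{\mathrm{ele}}}_{\text{gas phase}}
  + \underbrace{\Delta G_{\mathrm{polar}} + \Delta G_{\mathrm{nonpolar}}}_{\text{solvation}}
\end{align}
```

Read this way, a strongly negative gas-phase sum can outweigh a positive polar solvation term and still yield a negative net binding free energy.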

Discussion
Microbes are rapidly developing multidrug resistance and are infecting people and other organisms with deadly infections. As bacteria evolve, the efficacy of antibiotics decreases, and antimicrobial resistance (AMR) develops as a result 60,61. For many pathogens, vaccine candidates are needed because different antibiotic therapies are associated with adverse effects. In silico subtractive proteomics has been applied to a range of infectious microorganisms to predict and discover new antigenic vaccine candidates. Experimental vaccine development takes a long time. Computational methods, combined with advances in genomic sciences, have significantly reduced the time and resources required to develop vaccines against AMR pathogens. Together with the availability of complete genome sequences, computational analyses made it possible to carry out an in silico subtractive proteome analysis to identify prospective vaccine candidates for K. aerogenes. The study also indicates several parameters for prioritizing therapeutic targets 53. Klebsiella aerogenes is an indole-negative, Gram-negative, catalase-positive, oxidase-negative, rod-shaped bacterium with peritrichous flagella that allow it to move 62. K. aerogenes is an opportunistic pathogen that causes nosocomial infections and is associated with outbreaks in intensive care units. This bacterium was originally called K. aerogenes. In human feces and water, K. aerogenes strains produce gas, are non-motile, and are often encapsulated. At present, Klebsiella pathogens cannot be treated by immunological methods, although immunological treatments have been developed successfully for other Gram-negative bacteria acquired in hospital settings. The mortality rate is 50%, even when antimicrobial treatments are used 63.
Most of these bacteria are susceptible to most antibiotics designed for them, but the inducible resistance mechanisms of K. aerogenes, particularly β-lactamases, are a challenge because they allow the bacterium to rapidly develop resistance to standard antibiotics during treatment. Healthy people generally do not get sick from K. aerogenes. In the past, vaccines have prevented millions of cases of infectious disease and saved lives. Successful vaccines include those used to prevent Spanish flu and smallpox, which prevented pandemic mortality 66.
The study examined two vaccine targets: Fe2+-enterobactin ABC transporter substrate-binding protein and fimbrial biogenesis outer membrane usher protein. The proteins identified met all the necessary requirements to be considered vaccine candidates, which should allow the vaccine to cover a range of pathogens. These proteins are located on the surface of the pathogen, so they are quickly recognized by the host defense system, which enables the host immune system to combat the pathogen. Because of their antigenic determinants, these proteins are also immunostimulatory 67. A further advantage is that the selected proteins are not homologous to the human proteome, since similarity with human proteins could lead to autoimmune reactions. Additionally, the proteins are antigenic and able to bind to proteins of the acquired immune system and activate immune signaling pathways. The antigenic epitopes identified in the proteins have a strong binding affinity for the DRB1*0101 allele and do not appear to be toxic or allergenic. This allele is present in most human populations, which supports quick and accurate immune responses. To overcome the limitations of single-peptide vaccines, multi-epitope vaccines can be designed from predicted epitopes. The designed vaccine bound stably to the immune receptors MHC-I, MHC-II, and TLR4.
The use of genomic information in computer-assisted vaccine design continues to gain popularity as vaccine development advances. It can also produce results in a relatively short period, saving both time and money. Based on the results, the designed vaccine is a good candidate for both in vitro and in vivo testing.

Conclusion and limitations
This study attempts to develop a multi-epitope vaccine against K. aerogenes, a bacterial pathogen. A variety of computer-aided approaches were employed, including reverse vaccinology, subtractive proteomic analysis, immunoinformatics, and biophysical analyses. Two potential targets were used to predict vaccine epitopes: Fe2+-enterobactin ABC transporter substrate-binding protein and fimbrial biogenesis outer membrane usher protein. The selected antigens and epitopes are predicted not to cause allergies and to have a high binding affinity for B-cells and T-cells. Immune responses after vaccination were simulated and revealed primary, secondary, and tertiary responses. Based on these results, the vaccine appears to be a suitable candidate for in vivo testing of antigen-specific immunity. The findings and data of this study may help a vaccine against K. aerogenes be developed more quickly. Although the selection criteria applied throughout the study were thorough, some concerns remain to be addressed in future studies. The study does not consider how the order of the epitopes affects vaccine efficacy. In addition, the accuracy of MHC epitope prediction algorithms has not been tested extensively.

Data availability
The data presented in this study are available within the article.

Table 7. Overview of the binding free energies between the vaccine and its receptors, given in kcal/mol.