Designing a multi-epitope vaccine to provoke the robust immune response against influenza A H7N9

A new strain of Influenza A Virus (IAV), so-called "H7N9 Avian Influenza", is the first strain of this virus in which a human is infected by transmitting the N9 of influenza virus. Although continuous human-to-human transmission has not been reported, the occurrence of various H7N9-associated epidemics and the lack of production of strong antibodies against H7N9 in humans warn of the potential for H7N9 to become a new pandemic. Therefore, the need for effective vaccination against H7N9 as a life-threatening viral pathogen has become a major concern. The current study reports the design of a multi-epitope vaccine against Hemagglutinin (HA) and Neuraminidase (NA) proteins of H7N9 Influenza A virus by prediction of Cytotoxic T lymphocyte (CTL), Helper T lymphocyte (HTL), IFN-γ and B-cell epitopes. Human β-defensin-3 (HβD-3) and pan HLA DR-binding epitope (PADRE) sequence were considered as adjuvant. EAAAK, AAY, GPGPG, HEYGAEALERAG, KK and RVRR linkers were used as a connector for epitopes. The final construct contained 777 amino acids that are expected to be a recombinant protein of about ~ 86.38 kDa with antigenic and non-allergenic properties after expression. Modeled protein analysis based on the tertiary structure validation, docking studies, and molecular dynamics simulations results like Root-mean-square deviation (RMSD), Gyration, Root-mean-square fluctuation (RMSF) and Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) showed that this protein has a stable construct and capable of being in interaction with Toll-like receptor 7 (TLR7), TLR8 and m826 antibody. Analysis of the obtained data the demonstrates that suggested vaccine has the potential to induce the immune response by stimulating T and Bcells, and may be utilizable for prevention purposes against Avian Influenza A (H7N9).

In February 2013, for the first time in China, a human infection associated with the new avian influenza A (H7N9) virus was reported. The new Asian lineage avian-origin influenza A (H7N9) virus has a higher infection risk for humans 1 . Since then, H7N9 infections have continued to occur during five waves of epidemics. The virus has caused increased mortality in any epidemic periodic by causing acute respiratory distress syndrome, especially in the elderly 2,3 .
Influenza A virus (IAV) is a negative single-strand RNA (ssRNA) virus that belongs to the family Orthomyxoviridae. Eight gene segments include HA, NA, Nuclear export protein (NEP), Matrix protein 2 (M2), Nucleocapsid protein (NP), Polymerase PA (PA), Polymerase PB1 (PB1) and Polymerase PB2 (PB2) in the genome, approximately 13,000 bp of IAVs, are responsible for encoding about 17 different types of proteins 4,5 . Studies have shown that the H7N9 influenza virus originated probably from the reassortment gene so that its surface genes such as HA and NA related to migratory birds and other internal six genes come from H9N2 avian influenza 6 . In general, gene reassortment, mutation, and recombination are considered as evolution mechanisms in the RNA viruses such as influenza which distinguishes between different strains of influenza [7][8][9] . HA glycoprotein antigen is known as one of the H7N9 major envelope proteins that are divided into two subunits of HA1 and HA2. Continuous mutations over time have resulted in extensive changes in HA proteins, resulting in all classified influenza strains in two phylogenetic-mediate major groups and six clades: Group 1 [H1 clade (H1, H2, H5 and H6); H9 clade (H8, H9 and H12); H11 clade (H11, H13, H16; and Bat Has clade (H17, and H18)] and Group 2 [(H3 clade (H3, H4 and H14) and H7 clade (H7, H10 and H15)] 10 . The HA1 subunit is used as a major factor in inducing immune responses in traditional vaccines; however, the presence of continuous mutations due to immune pressure in the HA1 subunit leads to a lack of cross-immune protection in the use of traditional

Results
Phylogenetic evolution and Protein sequences retrieval. Evolution relationships of NA and HA antigens selected from A/Shanghai/2/2013 show that they belong to the Eurasian strain of avian influenza viruses and are closely related to H7N9 subtypes spread to Europe and the United States. HA antigen of A/ Shanghai/2/2013 can be seen in a common clade with avian-derived H7N9 (Fig. 1). Besides, HA shows a closer relationship with the 2013 and 2014 subtypes. But it still has a common ancestor with other H7N9s. In the same vein, NA antigen is also found in a clade with several human-derived H7N9s and avian-derived H7N9s (Fig. 2).
HA and NA proteins of H7N9 influenza were investigated through categorized population frequently and as shown in Fig. 3a and c, it has spread in three continents of Asia, Europe and America. HA of H7N9 influenza is 100% avian-derived in Europe and Canada, while 3% environment-derived of H7N9 influenza was observed in the USA. H7N9 influenza was 16% human-derived in china, as well as 80% and 4% avian-derived and environment-derived, respectively. NA protein showed 100% avian-derived in both Europe and America. Similar to HA protein, NA protein in China was 80% avian-derived while 17% was human-derived.
It was clarified that more than 81% to been avian-derived among 761 HA proteins of H7N9 influenza, while less than 15% were human-derived and the rest were environment-derived (Fig. 3b). Also, 741 NA proteins of H7N9 influenza showed similar percent so that more than 81% were related to avian-derived and less than 16% were estimated to human-derived (Fig. 3d).
Linear and conformational B cell epitopes prediction. B cell epitopes are an important component of the H7N9vac because of their ability to induce humoral immunity. Both linear and discontinuous B cell epitopes in the final construct were predicted by the BepiPred 2.0 and iBCE-EL servers, respectively. A total of 27 linear B cell epitopes were extracted from the final construct and shown in Table 1. Conformational B cell epitopes from HA and NA are summarized in supplementary Table S1. Each of the 8 predicted scores for conformational B cell epitopes was visualized in supplementary Fig. S1.
Selection of CTL epitopes prediction. The NetMHC4 server was used for HA and NA MHC class I prediction. MHC-I epitopes were selected based on the highest immunogenicity and antigenicity. Furthermore, MHC-I epitopes were considered with a strong binding score based on the NetMHC4 server rating. Selected MHC-I epitopes that had overlapped with each other were merged and finally 13 MHC I epitopes, including 8 HA epitopes and 5 NA epitopes were utilized at influenza construct ( Table 2). All predicted CTL epitopes (binding to MHC-I) are included in Supplementary Table S2. TAP transport/proteasomal cleavage. Due to the importance of TAP transport and Proteasomal cleavage in the antigen presentation pathway binding affinity, HA and NA antigens were examined via the NetCTL1.2 servers. The scores associated with tap transport and Proteasomal cleavage for each of the CTL epitopes shown in Table 2  Construction design and physicochemical properties. Final selected epitopes were linked by linkers so that AAY linkers were selected for linking of MHC-I epitopes as well as GPGPG linkers for linking of MHC-II epitopes. HβD-3 and PADRE sequence as adjuvants were linked by EAAAK linkers in the N-terminal site. The last MHC-I epitope was linked to the first MHC-II epitope via the HEYGAEALERAG linker. KK linkers also were used to bind B-cell epitopes. A "Histidine Tag" was considered in the C-terminal and linked to the final construct through the RVRR linker (Fig. 5a). Obtained results from the ProtParam server demonstrated H7N9vac amino acid sequence (777 amino acid) will translate into a protein with a molecular weight of ~ 86.14 kDa. Theoretical pI and instability index (II) were estimated equal to 9.80 and 34.65, respectively, that it grouped Influenza vaccine in stable proteins. Furthermore, Figure 2. NA circular phylogenetic tree. The amino acid sequence of NA (A/Shanghai/2/2013 H7N9) was evaluated with other 620 (H7N9) NA amino acid-based on NCBI Influenza Virus Resource. The tree scale was considered equal to 0.1. All H7N9 NA proteins are related to human-derived, avian-derived and environmentderived. Spread H7N9 HA in the Americas, Europe and Asia are shown in blue, green and pink line, respectively. NA protein of A/Shanghai/2/2013 H7N9 has been magnified and specified in an orange dashedline box.
Modeling of 3D construct, refinement and validation. RoseTTAFold server was used to model H7N9vac 3D structure (Fig. 5b). Next step, the achieved H7N9vac 3D structure submitted to the GalaxyRefine server. The predicted model validation was performed by online software PROCHECK, PROSA. As shown in Fig. 5c, the Z-Score point was set to − 9.29 after refinement. The Z-Score point is used to validate the modeled protein based on the NMR or X-ray methods. The NMR method is used for the crystallography of proteins with fewer than 200 amino acids and the X-ray method for proteins with more than 200 amino acids. If the Z-Score dot in the PROSA graph is located on the (blue-blue) NMR and (pale-blue) X-ray regions, the simulation accuracy is the highest and the simulated model has the lowest error rate with the highest confidence. Local model quality related to structure is shown in Fig. 5d.
Model analysis of the predicted model after drawing the Ramachandran graph revealed that 92.4% (714/773) of all residues were in favored (98%) regions. Overall, approximately 97.8% (756/773) of all residues were in allowed (> 99.8%) regions. In the Ramachandran graph, amino acids are grouped according to the angles of phi and psi (Fig. 5e). . Chart frequency of HA and NA proteins from H7N9 Influenza. Frequency of H7N9 influenza HA protein, separately by country (a) and as a whole (b). Frequency of H7N9 influenza NA protein, separately by country (c) and as a whole (d). Human-derived, environment-derived and avian-derived H7N9 influenza (based on HA and NA protein) are shown in purple, yellow and blue, respectively. www.nature.com/scientificreports/ Insilico cloning and vaccine optimization. Codon Adaptation Tool (JCAT) was used for the optimization of nucleotide sequences of the final construct. E. coli K12 was selected as a host expression organism. Also, we adjust the properties of JCAT to avoid rho-independent transcription terminators, avoid prokaryotic ribosome binding sites and avoid cleavage sites of restriction enzymes. Codon Adaptation Index-Value (CAI -Value) and GC-Content of the improved Influenza construct with 2310 nucleotide sequence (without His-Tag) were 1.0 and 54.20, respectively, which confirms the probability of appropriate protein expression. Nde I and Xho I restriction enzymes was added to the N-terminal and C-terminal of nucleotide sequence, respectively, so that the PelB sequence is removed from the final structure for achieving an intracellular expression. Furthermore, a stop codon was considered after His-Tag sequence. Finally, the construct was cloned in pET-26b(+) plasmid by SnapGene (Fig. 6). Nucleotide sequence of H7N9-Vac along with Open reading frames of protein expression are included in Supplementary Table S5 and Supplementary Fig. S4, respectively.
Protein-protein docking. Docking results showed H7N9vac can form interaction with TLR7, TLR8 and m826 antibody 3D structures. We selected and included two PADRE and HβD-3 sequences as the adjuvant in the N-terminal of the final construct. As shown in Figs. 7a and 8a, initial areas amino acids of the H7N9vac were involved with TLR7 and TLR8. Furthermore, the H7N9vac directly formed interaction with several amino acids of TLR7 and TLR8 binding sites.
The H7N9vac was also able to interact with both the heavy and light chains of the m826 antibody (Figs. 9a and 10a). HCDR1, HCDR2 and HCDR3 loops from the heavy chain and LCDR1, LCDR2 and LCDR3 loops from the light chain were selected from the m826 antibody to bind to specific regions of the HA antigen in H7N9vac. As shown in Fig. 9a and b as well as Fig. 10a, and b, the interaction was established between the amino acids His94, Gly89 and Lys496 from H7N9vac with the amino acids Ser31, Thr57, Ala58 and Asn59 heavy chain m826 antibody. The amino acids Lys378, Glu382, Glu383, Glu335, Arg336 and Glu332 from H7N9vac also are interacted with the amino acids Ser35, Ser32, Tyr33, Arg97 and Thr95 from the m826 antibody light chain.
The H7N9vac's structure and TLR7, TLR8 and m826 were submitted to the HADDOCK server for identifying the interaction reigns by protein-protein docking, respectively. The highest rankings for each complex were www.nature.com/scientificreports/ selected at the lowest intermolecular binding of the whole-molecular H7N9vac-Proteins complexes from the HADDOCK with the lowest mean RMSD ( Table 5). As shown in Fig. 7b, the H7N9-vac interacted with TLR7 via 6 hydrogen bands. Besides, further investigations of HADDOCK server output show that electrostatic bonds also play an important role in the interactions generated in the H7N9vac-TLR7 complex. In this regard, in the H7N9vac-TLR8 complex, 3 hydrogen bonds and 4 salt bridges are responsible for creating interactions between the H7N9vac and TLR8, just as electrostatic bonds play a major role in the interactions of this complex (Fig. 8b).
Regarding the binding of the vaccine to the m826 antibody, 4 hydrogen bonds in H7N9vac-m826 Heavy chain complex and 6 hydrogen bonds in H7N9vac-m826 Light chain complex were responsible for the interaction between the vaccine and the antibody, respectively (Figs. 9b and 10b). simulation results showed that the hydrogen bond in the H7N9vac-TLR7 and H7N9vac-TLR8 complexes was formed with the RMSD of 0.5 and 1 nm, respectively (Fig. 11a). Also, RMSD for H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes were 0.6 and 0.9 nm, respectively (Fig. 11b).
The RMSF diagram indicated the residues in the H7N9vac-TLR7 and H7N9vac-TLR8 complexes in the MD simulation have very low volatility (Fig. 11c,d). H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes showed a RMSF about 0.4 to 0.5 nm (Fig. 11e,f). These results indicate the maintenance of H7N9 structure in binding with the TLR7, TLR8 and m826 Heavy/Light chain. The H7N9vac-TLR7 and H7N9vac-TLR8 complexes returned to equilibrium after 12 ns, while, H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes showed equilibrium about 10 ns since MD starting.
The gyration radius related to the H7N9vac-TLR7, H7N9vac-TLR8, H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes are shown in supplementary Figure S10 as well as the H-Bond overall results.
The binding affinity of H7N9vac-TLR7, H7N9vac-TLR8, H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes was confirmed using MM/PBSA calculations. The energy of the bonds formed between the receptor and the vaccine in each complex is calculated and as shown in Table 6. The total binding energy Table 2. CLT epitopes prediction by NetMHC4 server. HA and NA epitopes were selected based on MHC-I HLAs binding affinity (nM), %rank and antigenicity rate. TAP transport/Proteasomal C-terminal cleavage potential selected epitopes were evaluated and degree of conservation was considered for all selected epitopes. Overlapped CLT epitopes are shown in underlined characters as well as CLT epitopes overlapped with HTL epitopes are shown in bold characters. a Rank Threshold for Strong binding peptides is equal to < 0.5 and for Weak binding peptides is equal to < 2.0. b Rank Threshold for epitope identification is equal to < 1.0. www.nature.com/scientificreports/ in H7N9vac-TLR7 and H7N9vac-TLR8 complexes is − 793.544 ± 35 kJ/mol and 1574.578 ± 72 kJ/mol, respectively, which indicated a high affinity between H7N9vac and TLR7. However, the amount of positive energy in H7N9vac-TLR8 complex probably stems from the existence of salt bridges between the H7N9vac and TLR8 structures. The H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes also have binding energies of − 503.155 ± 67 kJ/mol and − 657.380 ± 66 kJ/mol, respectively, which indicates that the structure of the H7N9vac binds to both the heavy and light chains of m826 antibody.

Pos
Besides, further studies show that in addition to total binding energy, van der Waals and electrostatic energies have also influenced the complexes that the contribution of van der Waals energy in the H7N9vac-TLR7 and H7N9vac-TLR8 complexes is − 285.137 ± 24 kJ/mol and 178.801 ± 18 kJ/mol, respectively. The electrostatic energy was − 986.686 ± 83 kJ/mol for H7N9vac-TLR7 complexes as well as 987.490 ± 67 kJ/mol for H7N9vac-TLR8.
van der Waal energy and Electrostatic energy for the H7N9vac-m826 Heavy chain complex were − 300.517 ± 14 and − 570.977 ± 89, respectively. In addition, the H7N9vac-m826 Light chain complex had van der Waal energy and Electrostatic energy equal to − 163.482 ± 25 and − 1685.757 ± 160, respectively. Generally, these results indicate that both complexes are stable, especially in H7N9vac-TLR7, H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes.
Immune simulation. The results of online server C-ImmSim showed increased production of secondary immune responses consistent with the actual immune response. Increased levels of IgG2 antibodies and total IgM + IgG1 also increased as the immune response increased (Fig. 12a). The amount of B memory (y 2 ) also increased with decreasing IgG1 and IgG2 isotypes (Fig. 12b). Besides, it has been shown that as TH memory (y 2 ) increases (Fig. 12c), TH and TC cell populations increase (Fig. 12d). Finally, the produced IFN-γ concentration and TH cell population were increased (Fig. 12e,f). The results show that the simulated immune response immunization by the Influenza H7N9vac on the C-ImmSim server corresponds to the actual immune response.

Discussion
All of the influenza A virus subtypes that have been identified so far belong to the waterfowl. This indicates that birds can be a natural host of the influenza A virus 29 . Studies have shown that the H7N9 virus is derived by reassortment between antigens expressed on the surface of the virus (H7N9) from wild birds and antigens Table 3. HTL epitopes prediction by NetMHCII 2.3 server. HA and NA epitopes were selected based on MHC-II HLAs binding affinity (nM), %rank and antigenicity rate. All epitopes were evaluated based SVM a method as IFN-γ inducer potential and degree of conservation were considered for all selected epitopes. Overlapped HLT epitopes are shown in underlined characters as well as HTL epitopes overlapped with CTL epitopes are shown in bold characters. a Rank Threshold for Strong binding peptides is equal to < 0.5 and for Weak binding peptides is equal to < 2.0. b Motif and SVM hybrid/IFN-gamma versus other cytokines. www.nature.com/scientificreports/ expressed on the internal of the virus (H9N2) in poultry 6 . So far, 5 waves of epidemics have been caused by H7N9 30,31 . Reports indicate that the first wave of epidemic viruses is most similar to the first virus discovered from this lineage (A/Shanghai/1/2013) 32,33 . Phylogenetic tree analysis for both HA and NA antigens showed that all H7N9 viruses in this study were located in polytomous clades with a common ancestor. HA and NA are spread in three continents of America, Europe and Asia. However, A/Shanghai/2/2013 influenza H7N9 from China, as human-derived influenza H7N9 along with avian and environmental influenza H7N9, associated with a common ancestor in the phylogenetic tree. Further studies of source diversity show that more than 81% of HA and NA influenza H7N9 is avian-mediated. Human-mediated H7N9 influenza is also approximately 15% and 16% related to HA and NA, respectively. Influenza virus transmission from birds to humans is rare due to host range restrictions. Besides, studies have shown that influenza A viruses select birds' intestinal tract for replication, while in humans, the virus replicates in the respiratory system. But reports confirm the transmission of H7N9, which can cause infection in humans. Although sustained transmission of avian influenza from human to human has not been reported, the occurrence of mutations in which the virus can adapt to human circulating, along with the lack of protective antibodies for humans, raises the risk of an H7N9 virus pandemic 34,35 . Furthermore, immune selection pressures, along with natural selection pressures, are the most important causes of antigenic drift and antigenic shift in IAVs, causing them to evolve continuously. As a result of this continuous evolution, the influenza epidemic and recurring pandemics occur, which can have very serious consequences for human society, such as the Covid-19 pandemic 36 .
Here, is introduced a multi-epitope vaccine against the H7N9 virus including immunogenic peptides of HA and NA proteins. HA protein, which is the most important determinant of antigen and viral entry into the host cell. NA protein was also used in the design of the vaccine, which is known for contributing to the release of the influenza virus into cell 37 . We utilized immunoinformatics tools to develop multi-epitope vaccines that are capable of producing humoral and cellular immunity. The multi-epitope vaccine provides its immunogenicity advantage, based on short immunogenic sequences containing appropriate immune responses. This method can prevent antigen overload as well as allergic responses in the host 38 . Total antigen sequence analysis can be performed using immunoinformatics and molecular instruments to investigate the possibility of T cell receptor (TCR) binding to host immune proteins 39 .
One of the unique features of multi-epitope vaccines over traditional vaccines is that multi-epitope vaccines can induce both humoral and cellular immune responses due to the presence of both B cell and T cell epitopes 40 . For screening of predicted CTL and HTL epitopes, special immunological characteristics such as antigenicity, immunogenicity and ability to bind to several MHC class I and MHC class II alleles were considered. T cell epitopes were predicted and it was found that a vaccine candidate made using the appropriate epitope could provide adequate immunogenicity if expressed. www.nature.com/scientificreports/ B-cell epitopes are regions on the surface of antigens that specific antibodies recognize and attach to and stimulate the immune response [41][42][43] . The ability to identify these binding sites in the antigen sequence or structure is important for the development of recombinant vaccines 44 . Linear epitopes, which consist of a linear sequence of residues and conformational epitopes, which are composed of residues that are not interconnected in the native protein sequence, but are assembled by the folded protein structure 43 . However, the vast majority of B cell epitopes are conformational estimated and to a lesser extent linear 45 . Using BepiPred-2.0 and iBCE-EL servers, we identified linear B cell epitopes and included them in the final vaccine construct. We also used HA and NA crystallographic structures to identify the conformational epitopes obtained from them and after determining the overlap with the linear B cell epitopes, we included them in the final structure to predict the maximum responses related to humoral immunity.  www.nature.com/scientificreports/ HA, as a well-known trimeric surface glycoprotein, is the primary target for the development of antiviral vaccines against influenza virus. HA can bind to sialic acid receptors through the globular head region of HA1 subunit, thereby facilitating the entry of the virus into the cell 46,47 . In addition, HA causes fusion between viral and cell membranes under the influence of structural changes caused by acidic pHs. It is noteworthy that most influenza antibodies produced by immunization or infection are directed against five antigen sites on the HA1 globular head, Ca1, Ca2, Cb, Sa, and Sb [48][49][50] . In addition, studies have shown that mAbs are able to target the stem region of HA2 and exert their antiviral effect through Fc-Fcγ receptor interactions and antibody-dependent cell-mediated cytotoxicity (ADCC) 46,[51][52][53][54] .
Analysis of the immunogenic properties of a HA1-specific fully human monoclonal antibodies (hmAbs) called m826 has shown that m826 binds specifically to hemagglutinin H7 (HA1). The m826 has minimal divergence from their germline predecessors, selected from a big naive antibody library created from the blood of healthy adult donors, and specifically targeting H7N9 46 . Interestingly, Yu et al. Showed that the m826 antibody induces very strong antibody-dependent cytotoxicity (ADCC) activity and is highly effective against H7N9 virus infection in the mouse model, and is completely protects mice exposed to the lethal challenge H7N9 virus via mechanisms that may include ADCC 46 . They also used crystal structure to show that m826 binds with a pH-dependent high affinity to a unique epitope distinct from the common HA1 antigen sites identified in the H7N9 HA1 and m826 complexes 46 . m826 is a germline antibody. this unique feature demonstrates that m826 as a promising therapeutic candidate, provides a tool to study the mechanisms of inhibition of virus infection by antibodies that could be a model for H7N9 vaccine immunogens as well as have the potential to be developed as antiviral agents for vaccine candidates 46 . In our immunoinformatics studies, we included HA1 B-cells epitopes involved in the heavy and light chain of m826 antibody in our construct, based on the crystallographic structure of m826 in complex with H7N9 HA1.
Prolonged immune responses can be enhanced by adding adjuvants to multi-epitope vaccines 55 . The multiepitope vaccines can keep us away from the dangers of working directly with antigens or pathogens factors in vitro 56,57 . The multi-epitope vaccines, in addition to proving their efficacy in vivo by inducing protective immunity, have also entered phase I clinical trials 58,59 .
The EAAAK linker was used to link the adjuvants to each other and attach them to the construct. As rigid linkers between protein domains, EAAAK provides an Alpha helix-forming structure that increases stability, maintains a constant distance, and maintains the independent function of domains 60 . MHC-I epitopes were linked through AAY. The AAY linker helps to form epitopes in a natural form and does not allow the formation of junctional epitopes, which improves the presentation of the epitope 61,62 . CTL and HTL epitopes were fused via HEYGAEALERAG to bring an enhanced epitope presentation for the vaccine. HEYGAEALERAG can www.nature.com/scientificreports/ provide this property through the creation of a specific cleavage target for Proteasomal and lysosomal degradation systems 63,64 . The GPGPG linker was used to link MHCII epitopes. GPGPG is a glycine-rich linker that in addition to increasing construct solubility, provides freely of activity and high accessibility and flexibility for adjacent domains 65 . B-cell epitopes were bonded to each other via KK linkers. The KK linker of the lysosomal protease target sequence is cathepsin B, which is one of the important proteases for antigen processing in the field of MHC-II antigen presentation 66,67 . Linked peptides by KK can help to specially presentation of each peptide to antibodies via avoiding of inducing antibodies into the amino acid sequence that results from the binding of the two peptides 67 . Finally, the RVRR linker was embedded in the C-terminal construct, where it provides the binding of the vaccine to the His-tag sequence 68 . The allergenicity and toxicity of the construct designed using the H7N9vac were evaluated using the Aller-TOP v.2.0 and ToxDL servers, respectively, and it was confirmed that this protein is non-allergenic and nontoxic. The ProtParam tool provided by the ExPASy server was used to analyze the physicochemical properties of the designed construct. The M/W of the structure was ~ 86.38 kDa and pI was equal to 9.80. The aliphatic index indicated the thermal resilience of the construct after expression. The estimated instability index puts the H7N9vac's protein in the classification of stable proteins. Because a protein with an instability index of less than 40 is considered stable, and if it tends to be more than 40, it is predicted to be unstable. The computed GRAVY index of the H7N9vac was considered as a reflection of the polar nature of H7N9vac that its effective interaction with water a lower score indicates the high solubility of this structure.
Structural validation of H7N9vac via Z-score by web server ProSA score (− 9.29), indicated that this protein can be expected to fall into the category of proteins that include Z-scores of structures determined by X-ray crystallographic experiments (more than 200 amino acids). These results confirmed the overall quality of H7N9vac. Evaluation of the third structure using Ramachandran diagram analysis showed that the maximum amino acids (99.1% residual) were in the allowed range, so the third validity ofH7N9vac structure is proven.
Adding adjuvants to a subunit vaccine can help increase the immune response it produces. Adjuvants can act as agonists for TLRs and can act as an effector to expand antibody recognition 69 . It has been indicated IVA can be recognized by TLR7 and TLR8 because of the TLR7 and TLR8 ability in coupling to ssRNAs 70 . TLR-7 has been shown to induce a humoral cellular response and long-term memory response in response to the ssRNA-mediate Recognition of ssRNA by TLR7 is primarily associated with the activation of downstream pathways of myD88, which can lead to IFN-α antiviral-cytokine and IL23/IL-1β responses (related to LT CD4 + T H17 and T-helper17) via IRF7 and mitogen-activated protein kinase (MAPK), respectively. Also TLR 8, after recognition and binding to ssRNA, is associated with the activation of downstream pathways of myD88, which activates IRF3, IFN-B responses, and stimulates B cell-mediated humoral immunity. B cell epitopes are recognized by B cell receptors or secretory antibodies and play a very important role in creating an efficient immune response. Both TLR7 and TLR 8, which are referred to as fraternal twins, are jointly involved in myD88-dependent NF-kB activation, ultimately producing IL-12 and stimulating LT CD4 + T H1 and T-helper1 cellular immunity [74][75][76][77][78][79] .
In this regard, HβD-3 was used as an adjuvant, to take advantage of its synergistic and interactive properties with other TLRs and pathways associated with Myd88 downstream, as well as activating the expression of stimulatory molecules in antigen-presenting cells in this pathway 80,81 . Besides, studies have shown that HβD-3, through TLR1/TLR2-mediated pathways, can induce IFN-γ secretion associated with the cytokine Th-1; in contrast, it is independently involved in the secretion of interferon-gamma (IFN-γ) and the activation of natural killer (NK) cells 82,83 .
The interaction of H7N9vac protein structures with TLR7 and TLR8 structures was analyzed by protein-protein docking. Analysis of H7N9vac-TLR7 and H7N9vac-TLR8 complexes showed that 6 and 4 hydrogen bonds involved these interactions, respectively. Also, 4 and 6 hydrogen bonds were involved in the interaction of H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes, respectively.
The H7N9vac-TLR7 and H7N9vac-TLR8 complexes showed that proteins interact with each other through the formation of hydrogen bonds. The H7N9vac-TLR7 and H7N9vac-TLR8 complexes RMSD diagrams showed that the simulation was quite stable. However, the mean RMSD of 0.6 nm for the H7N9vac-TLR7 complex is higher than the mean RMSD of 0.3 nm for the H7N9vac-TLR87 complex, indicating that the intermolecular energies in the H7N9vac-TLR8 complex are higher. H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes showed they are close to each other and mimic each other in RMSD. Also, the graph of gyration shows the amino acid's consistency in their radial axis and they do not go beyond the range of their radius www.nature.com/scientificreports/ axis, which confirms the stability of the complex during the simulation. The RMSF diagram is also consistent with the results; implying that the amino acids of the H7N9vac-TLR7, H7N9vac-TLR8, H7N9vac-m826 Heavy chain and H7N9vac-m826 Light chain complexes binding sites maintain their stability during the simulation. In general, docking and MD results showed that the complexes can maintain their stability during the simulation. The obtained results in the present study showed that this recombinant construct can help stimulate TLR7/8. Binding of the heavy and light chain of m826 antibody to the residues of the vaccine structure showed that the production of antibodies against the proposed vaccine in this study is to be expected. Because it's important to have the correct 3D structure of the antigen to be recognized by the humoral immune system. Therefore, the B-cell epitopes in the 3D vaccine structure need to mimics the correct fold in the native structure of the antigen to induce the same immune response 46 . Besides, the MM/PBSA results showed that in addition to the total energy, the complexes are also affected by the energies resulting from covalent and electrostatic bonds. However, the positive binding energy generated in H7N9vac-TLR8 complex can be affected by the salt bridges in the interaction between the vaccine structure and TLR8. Studies have shown that although salt bridges may help stabilize protein associations, they also destabilize protein cores 84 . In addition, the findings indicate that the salt bridge interactions between acidic amino acids (Glu − and Asp − ) and alkaline amino acids (Arg + , Lys + , and His + ) are the strongest residual residue interactions. But, may be weakened by the effects of solvation and broken down by lower pH conditions 85 . These findings are consistent with our results. As seen in H7N9vac-TLR8, salt bridges are formed between glutamine-arginine, aspartate-lysine, and lysine-glutamine and may have affected interactions between vaccine and TLR8.
Influenza viruses are ssRNA that, in addition to the humoral immunity induced by B cells, are also required to activate T cell-dependent immune responses to inhibit intracellular survival and proliferation. As the results of the C-ImmSim Online server prediction show, the development of B cells and T cells caused long-term memory. IgG1 and IgG2, which are indices of Th1 and Th2 response to influenza antigens, respectively, are involved in the protection against influenza infection 86,87 . Previous studies have shown that both human and avian hosts show a decrease in T cell populations in response to AI infection 88 . Decreased T lymphocytes are often associated with the upregulation of proinflammatory cytokines such as interferons (IFNs), especially IFN-γ, as well www.nature.com/scientificreports/ as interleukins (ILs) such as IL-6, which eventually lead to hypercytokinemia and Activation of cell apoptosis 89 . Elevated levels of IL-12 and IFN-γ are due to strong cellular immunity. Although studies have shown that T epitopes of H7N9 strains mimicking human sequencing are associated with attenuated IFN responses in humans, the ICM server results in an excellent stimulation of FN-γ by this shows the vaccine [90][91][92] . However, ICM server results indicate that the increase in the number of Th1 cells is associated with an increase in the level of Cytotoxic T cells.  www.nature.com/scientificreports/  Linear and conformational B Cell epitope prediction. B cell epitopes are one of the most important factors contributing to the vaccine design because they are characterized by the immune system. In this study, we used BepiPred-2.0 (http:// www. cbs. dtu. dk/ servi ces/ BepiP red/) to identify linear B Cell epitope on HA and NA protein sequences 43 . BepiPred-2.0 is a web server for predicting B-cell epitopes from antigen sequences. BepiPred-2.0 is based on a random forest algorithm trained on epitopes annotated from antibody-antigen protein structures. BepiPred-2.0 utilizes sequence-based epitope prediction both on epitope data derived from 3D structures, and on a large collection of linear epitopes downloaded from the IEDB database.
Also, iBCE-EL (http:// thegl eelab. org/ iBCE-EL/) was used to confirm B-cell epitopes which were selected by BepiPred-2.0. iBCE-EL is a web based prediction server for linear B-cell epitope. It is an ensemble method that combined extremely randomized tree and gradient boosting algorithms, which respectively utilizes a combination of amino acid composition and physicochemical properties and a combination of dipeptide and physicochemical properties as an input features. For a given peptide, iBCE-EL predicts its calss and probability values 95 .
It is estimated that more than 90% of B cell epitopes are discontinuous; implying they contain sequences that are far from each other by long-distance pathogenic protein sequences but are adjacent by protein folding properties. From the server ElliPro (http:// tools. iedb. org/ ellip ro/) B cell epitopes were used to predict the threedimensional structure.
Cytotoxic T cell epitope. Prediction of T cell epitopes was performed using an improved neural network method provided by the NetMHC-4.0 server at https:// servi ces. healt htech. dtu. dk/ servi ce. php? NetMHC-4.0. All epitopes were considered as 9mers-long for cytotoxic T Cell prediction (recommended from server), which are recognized by the HLA Class including HLA-B, HLA-B, and HLA-C molecules. The threshold for strong and weak binders was adjusted 0.5 and 2, respectively.   Insilico cloning and vaccine optimization. The codon optimization for recombinant construct expression in the pET-26b(+) vector was performed applying the Java Utility Server (JCat) (http:// www. prodo ric. de/ JCat). Given that the expression host used in the present study is different from the native influenza host, the codon optimization was adjusted based on the codon preference of influenza structure in E. coli K12. The CAI score and GC content are ideally 1.0 and 30-70% respectively, which are used to predict protein expression measures. Using the SnapGene tool, the sites of the NdeI and XhoI restriction enzymes were identified on the optimized nucleotide sequence, and the recombinant influenza gene sequence was inserted into the pET-26b(+) vector. EAAAK linkers were used to link HβD-3 and PADRE adjuvants at the N-terminal construct. Then AAY linkers bonded CTL epitopes to each other. The HEYGAEALERAG linker was placed as the interface between CTL epitopes and HTL epitopes. The GPGPG linkers also acted as the link between HTL epitopes. The KK linkers were used to stick B-cell epitopes. Finally, RVRR bonded construct to HisTag sequence in C-terminal of vaccine construct.
Prediction of disulfide bond formation in construct was performed via DiANNA 1.1 server (http:// clavi us. bc. edu/ ~clote lab/ DiANNA/ main. html). This server utilizes a SVM, along with an architecture neural network state-of-the-art method to determine cysteine species and disulfide connectivity.
Physicochemical, antigenicity, allergenicity, toxicity properties of the vaccine. ProtParam tool in the ExPASy server (https:// web. expasy. org/ protp aram/) provided a prediction of physicochemical properties for the final protein construct. VaxiJen v2.0 server on http:// www. ddg-pharm fac. net/ vaxij en/ VaxiJ en. html was used for predicting the antigen. The VaxiJen is considered an independent alignment method. Server accuracy varies from 70 to 89% 96 . Total protein sensitivity was calculated by AllerTOP v. 2.0 (https:// www. ddg-pharm fac. net/ Aller TOP/). AllerTOP v. 2.0 server benefits from an auto cross-covariance (ACC) and k-nearest neighbor algorithm (kNN, k = 1) to predict allergenicity of protein in terms of hydrophobicity, molecular weight, secondary structure properties and relative abundance of amino acids. ToxDL server (http:// www. csbio. sjtu. edu. cn/ bioinf/ ToxDL/ index. html) was utilized to determine protein toxicity. ToxDL server utilizes an interpretable deep learning-based method to classify proteins in two toxic and non-toxic based on multimodal method including three component of CNNs, InterProscan database and word2vec encoder protein domain.
Modeling, refinement, and validation of vaccine construct. RoseTTAFold online software (https:// robet ta. baker lab. org/) was used to build the 3D structure. RoseTTAFold is a "three-track" neural network, meaning it simultaneously considers patterns in protein sequences, how a protein's amino acids interact with one another, and a protein's possible 3D structure and achieved accuracies approaching those of DeepMind. In RoseTTAFold architecture, one-, two-, and three-dimensional information flows back and forth, allowing the network to collectively reason about the relationship between a protein's chemical parts and its folded structure 97  www.nature.com/scientificreports/ To ensure the accuracy of the simulated building, http:// servi ces. mbi. ucla. edu/ PROCH ECK and http:// servi ces. mbi. ucla. edu/ SAVES were used. The PROSA server (https:// prosa. servi ces. came. sbg. ac. at/ prosa. pHp) determined the Z-Score point and protein energy balance. The Ramachandran plot graph was drawn by the online website http:// molpr obity. bioch em. duke. edu/. Chimera V 1.13.1 software determined the best spatial resolution for the optimal energy used in the present study. The website of http:// galaxy. seokl ab. org/ cgi-bin/ submit. cgi? type= REFINE was used to structure refinement.
The prediction of protein secondary structure. NetSurfP-2.0 online software (https:// servi ces. healt htech. dtu. dk/ servi ce. php? NetSu rfP-2.0) was used for predicting the second structure. NetSurfP-2.0 as a sequence-based tool provides a prediction of amino acids secondary structure so that with utilizing a deep machine learning it combined predicted structural features of amino acids including disorders, Phi/Psi angles and surface accessibility to visualizes a protein secondary structure.
Molecular docking of vaccine constructs with TLR7 and TLR8 receptor. TLR7 (PDB code: 7CYN), TLR8 (PDB code: 3w3g) and human germline monoclonal antibody (m826) (PDB code: 5vag) 3D structures were retrieved from RCSB server. Then, the 3D structures of TLRs and m826 were corrected in PyMOL v2.3.4 software for energy. All water molecules were removed from the PDB structure of the vaccine, TLRs and m826. After preparing the 3D structure in Chimera V 1.13.1 software, Influenza constructs, TLR7, TLR8 and m826 were submitted to the HADDOCK server so that the interaction regions will be identified. The highest rankings for each complex were selected at the lowest intermolecular binding of the whole-molecular influenza-TLRs and whole-molecular influenza-m826 heavy and light chains from the HADDOCK with the lowest mean RMSD. H-band formation was assessed by the LigPlot + .
Molecular dynamics and MM/PBSA analysis. The molecular dynamics calculation of the vaccine-TLR7/8 and vaccine-m826 complexes were performed by Gromacs v 4.6.5 and CHARMM36 all-atom force field. Protein-protein complexes were filled with water molecules and then the system was neutralized. The energy minimization of the system was done. The NVT and NPT optimization were done at a temperature of 300 K, the pressure of 1 bar and restraint forces of 1000 kJ/mol. Then, molecular dynamics simulation was performed for 40,000 ps. All bonds were constrained by applying the LINCS algorithm during the simulation.
Immune simulations of vaccine construct. Simulating and predicting the immune response of recombinant influenza structure was performed using the C-ImmSim server. C-ImmSim used an agent-based model for machine learning techniques and immune epitope prediction that utilized a position-specific scoring matrix (PSSM) that allowed it for the prediction of immune interactions. C-ImmSim simulates three distinct anatomical regions for mammals simultaneously. These include bone marrow (to simulate hematopoietic stem cells, lymphoid cells and myeloid cells), thymus (native T cell selection), and the third lymph node, such as the lymph node. Taking into account the 4-week interval between the first and second dose of the vaccine, the Simulation Steps parameter was set to 1050. Then 3 injections were considered so that time-steps were set at 1, 84 and 168, while an one time-step is equal to eight hours of real life. Other parameters were applied by default in this study.