Rank orders of mammalian pathogenicity-related PB2 mutations of avian influenza A viruses

The PB2 gene is one of the key determinants for the mammalian adaptation of avian influenza A viruses (IAVs). Although mammalian pathogenicity-related mutations (MPMs) in PB2 genes were identified in different genetic backgrounds of avian IAVs, the relative effects of single or multiple mutations on viral fitness could not be directly compared. Furthermore, their mutational steps during mammalian adaptation had been unclear. In this study, we collectively compared the effects of individual and combined MPMs on viral fitness and determined their rank orders using a prototypic PB2 gene. Early acquired mutations may determine the function and potency of subsequent mutations and be important for recruiting multiple, competent combinations of MPMs. Higher mammalian pathogenicity was acquired with the greater accumulation of MPMs. Thus, the rank orders and the prototypic PB2 gene may be useful for predicting the present and future risks of PB2 genes of avian and mammalian IAVs.

Waterfowl are reservoirs for influenza A viruses (IAVs), and close interaction between waterfowl and other animals causes occasional cross species transmission to result in successful settle-down by acquiring host adaptive mutations in their eight segmented genomes (PB2, PB1, PA, HA, NP, NA, M, and NS) 1 . In particular, the PB2 protein, which is involved in cap snatching of the host mRNA, has been regarded as one of the key molecules to overcome species-specific host barriers [2][3][4] . PB2 interacts with various host proteins, including importin-α for nuclear localization and ANP32A for polymerase stabilization and mitochondrial antiviral signaling proteins (MAVS), Tu elongation factor, and mitochondrial (TUFM) for anti-viral activity [5][6][7][8] .
Previously, we found a prototypic avian PB2 gene (01310 PB2) that completely attenuated PR8-derived recombinant virus (7 + 1 reassortant) in mice and did not possess any MPMs except 661A and 683T, and identified the
The frequency of each mutation was apparently different between different categories. The frequencies of the mutations D9N, A199S, K526R, A588I, E627K, A674T, and K702R were highest in Human relative to the other categories, but the frequencies of mutations I147T, A588T, and Q591R were higher in the Pig and pdmH1N1 categories than in the others. Although the frequency of D701N in Pig (18.02%) was not high, it was relatively higher than that in other categories (0.06-0.25%). Only T271A and G590S were commonly frequent in the Pig, pdmH1N1 and Human categories. Although most of mutations were less frequent in Bird than in other categories, I147T, K339T, A588T, G590S, E627K, and K702R were relatively more frequent than the other mutations. In the Bird-Human category, the frequencies of E627K (from 2.77 to 40.96%), K339T (from 8.8 to 34.13%), and K526R (from 0.09 to 20.48%) increased steeply in comparison with the corresponding mutations in the Bird category. The frequencies of E158G, E192K, D253N, F404L, Q591K, and S714R were very low, less than 2.4%, in all categories.
The effect of each MPM on polymerase activity, replication efficiency and pathogenicity in mammalian hosts. To compare the effect of each PB2 mutation on polymerase activity, we selected 19 MPMs except A674T, which had no effect on polymerase activity 29 . We measured the polymerase activity by using an in vitro mini-genome assay in 293T cells ( Fig. 2A). Except for the F404L, A588T, and S714R mutations, all PB2 mutations significantly increased polymerase activity relative to the wild-type 01310 PB2 gene. In particular, the E627K, E158G, Q591K, and D253N mutations showed the highest polymerase activities in order and increased polymerase activity by more than 20-fold in comparison with that of the 01310 PB2 gene ( Fig. 2A). D701N, E192K and K526R showed approximately 10-fold increases in polymerase activities, followed by A588I and Q591R. Although D9N, I147T, A199S, T271A, K339T, G590S, and K702R showed significantly higher polymerase activity than wild-type 01310 PB2, they increased polymerase activities by 3-fold or less. The rank order of the effects of PB2 mutations on polymerase activity was summarized in Table S2.
To investigate the effects of the MPMs on viral replication, we selected 10 mutations that increased the polymerase activity by more than 3-fold (E158G, E192K, D253N, T271A, K526R, T588I, Q591K, Q591R, E627K, and D701N) and generated PR8-derived recombinant viruses containing the 10 mutated 01310 PB2 genes. In Figure 1. Mammalian pathogenic PB2 mutations and frequency in different hosts. The location and frequency of 20 MPMs of 18 amino acid residues of PB2 were compared and the frequency difference was visualized by heat-map, 0 (yellow) to 100% (red).
MDCK cells, most of the mutant viruses showed significantly higher replication efficiency than the wild-type virus, rPB2(01310), except rPB2(01310)-588I (P < 0.05) (Fig. 2B). The E627K mutation increased virus titers significantly in comparison with other mutations. The Q591K mutation more greatly increased virus titer than the E158G and D701N mutations (Fig. 2B). The rank order of the effects of MPMs on replication efficiency in MDCK cells was summarized in Table S2.
We investigated the effect of the 10 PB2 mutations on mouse pathogenicity by observing body-weight loss of BALB/c mice (Fig. 3). Most of the single mutations introduced into the prototypic 01310 PB2 gene did not cause body-weight loss during the observation period, but rPB2(01310)-E627K slightly induced the body weight loss of mice up to 12% (Fig. 3A). Only rPR8 caused mortality during the observation period (Fig. 3B). However, all of the single amino acid mutant viruses could replicate in the lungs on 3 and/or 6 dpi in contrast to the parent virus, rPB2(01310) ( Table 1). As expected, rPB2(01310)-627K showed the highest virus titers among other viruses on 3 dpi, but rPB2(01310)-701N reached to the highest virus titers on 6 dpi. The effects of E158G, 192K, 271A and Q591K on virus replication in the lungs were less apparent than those of E627K and D701N but higher than those of D253N, A588I and Q591R. The rank order of the effects of MPMs on virus replication in the lungs was summarized in Table S2.
Inference of the evolutionary steps of swine and human PB2 genes. To understand the mutational steps of MPMs, we performed median joining network analysis using the profiles of 20 mutations of PB2 genes. According to the results, predominant PB2 genotypes in avian, A(H1N1) pdm2009, and human seasonal flu viruses were apparent: MVV, SIB-5-1 and HIB-9-1, respectively. The avian MVV genotype may have evolved to become SIB-5-1 and HIB-9-1 through multiple mutational steps (Fig. 4). However, the chronological data of the PB2 genes were not reflected in the median joining network analysis tree. Therefore, we modified the order of mutational steps from MVV to SIB-5-1 and HIB-9-1 based on chronology-frequency consideration of PB2 mutations as described below.

Discussion
The mutation rate per site may be similar, but the effect of each mutation on viral fitness may be different 28,36 .
Mutations that increase viral fitness may become prevalent among competing mutations. During several pandemics and continuous endemics, frequent infection and transmission among humans may facilitate competition between IAVs and impose selective pressures for more competent PB2 mutations. Therefore, more than 94% frequencies of D9N, A199S, T271A, K526R, E627K, A674T, and K702R mutations in the Human category may reflect the results of intensive competition and selection of such mutations.
The earliest swine IAVs date back to 1918 and are believed to be closely related to pandemic and endemic viruses 37 . However, 'triple reassortant' H3N2 viruses probably appeared via reassortment of the HA, NA and PB1 genomes from human virus, NP, M and NS genomes from 'classical' swine virus, and PB2 and PA genomes from avian virus 38 . Then, the 'triple reassortant' swine descendant virus reassorted with the European avian-like porcine virus by acquiring the NA and M genomes to become A(H1N1) pdm2009 virus [39][40][41] . Thus, avian-origin PB2 and PA genes have co-evolved in the genetic backgrounds of swine IAVs. Although there are no pig-or human-specific mutations, higher numbers of I147T, A588T, and Q591R mutations in the Pig category than in the Human category, and different combinations of mutations, may reflect the unique mutational steps of avian-origin PB2 genes in swine 21 . As the MPM profiles of A(H1N1) pdm2009 and seasonal flu viruses are different, the early starting MPMs are likely to determine the accumulation patterns of subsequent mutations. However, the reasons why reassortant A(H1N1) pdm2009 viruses with stronger PB2 genes of seasonal flu viruses did not appear during their co-circulation may reflect importance of genetic compatibility of viral genomes 42,43 .
Most PB2 mutations tested in this study increased polymerase activity, and the rank orders of the effects on polymerase activity and replication efficiency in the mouse lungs were identical in the cases of E627K, Q591K, E192K, K526R, A588I and Q591R. However, E158G, D253N, T271A and D701N showed different rank orders, and F404L, A588T and A674T did not increase polymerase activity. Therefore, we could not exclude the positive or negative epistasis of already-present mutations of 01310PB2 (661A and 683T) or other PR8 genes such as PA and PB1 genes 30,43 . As expected, E627K is most potent mutation for the mammalian pathogenicity of IAVs, and the high ranking of E627K, D701N and Q591K was coincident with their high frequency among quasi-species during the first mouse infection of a PR8-derived recombinant virus with the 01310 PB2-MVV gene 20,29,32,33 . The 591K mutation was much more potent than 591R in terms of mammalian pathogenicity. The high ranking but extremely low frequency of E158G mutation may raise concerns about a potent, potential risk factor for avian and swine PB2 genes if they evolve to the most compatible genomic constellation 10  www.nature.com/scientificreports www.nature.com/scientificreports/ Figure 5. The effect of combined multiple MPMs on polymerase activity. Polymerase activity was measured using mini-genome assays in 293T cells at 33 and 37 °C. The data were normalized to the polymerase activity of the wild-type 01310 PB2 gene. Statistical significance was calculated using one-way ANOVA. Panel A, Polymerase activity of each combined MPM of the SIB-series (compared to PB2(01310)-MVV, * P < 0.05, ** P < 0.01, and *** P < 0.001; compared to SIB-5-1, # P < 0.05, ## P < 0.01, ### P < 0.001) . The data are the average of three independent experiments ± s.d. Panel B, Polymerase activity of each combined MPMs of the HIBseries (compared to PB2(01310)-MVV, * P < 0.05, ** P < 0.01, and *** P < 0.001; compared to HIB-1-1, # P < 0.05, ## P < 0.01; compared to HIB-8, + P < 0.05). The data are the average of three independent experiments ± s.d.
PB2 interacts with different host factors to play multiple roles. The MPMs are located in different domains and may take part in different functions. The 627 domain containing closely located E627K, Q591K/R, G590S, and A/ V588T/I mutations directly interacts with ANP32A to stimulate RNA synthesis, and E627K increases interaction with the mammalian isoform of ANP32A to result in high mammalian pathogenicity 6,9,44 . TUFM is another host restriction factor differentiating 627E from 627K and impedes the replication of avian IAVs with 627E via autophagy 7 . E627K and Q591R showed negative epistasis, and no synergistic effect was observed in combination 45 . This may be the reason for the extremely low frequency of E627K together with Q591R/K PB2 genes in all categories. However, other mutations such as G590S and A/V588T/I were acquired in addition to E627K or Q591R. The E627K mutation was not selected in avian but in mammalian hosts 29 . Therefore, acquisition of mutations in the 627 domain may be crucial for avian PB2 genes not only to replicate efficiently in mammalian hosts but also to acquire additional mutations for better fitness during competition between early mammal-adapted avian-origin PB2 genes.
The NLS domain interacts with importin-α, and the second highest effect of D701N on the replication efficiency in mouse lungs may reflect the importance of the nuclear localization of PB2 8 . However, unsuccessful acquisition of multiple mutations by PB2 genes with the first D701N mutation may imply that the first strategy to steeply increase nuclear localization of PB2 may not be sustainable. In addition, a significantly higher frequency of less active 702R (98.94%) relative to 701N (0.25%) in the Human category in PB2 bearing E627K may reflect a tendency to determine subsequent mutations by preceding mutations.
The exact function of the third highest effect of E158G in the lid domain is still unclear, but two additional mutations (E192K and A199S) in the lid domain may support the important role of the domain. The N-terminal 200 amino acid residues of PB2 are involved in the interaction with RNP to transcribe and replicate viral RNAs 46 . In addition, basic amino acids in the N-terminal half of PB2, such as 124R, 142R/143R, 268R, and 331K/332R, play important roles in both the transcription and replication of vRNAs 47 . The exact function of 147T is still unclear, but its close location to 142R/143R may be noteworthy 23 . The T271A mutation in the mid domain increased the replication efficiency of IAVs at the human upper respiratory tract temperature, 33-34°C, and individual (T271A) and combined (T271A with MVV) mutations significantly increased polymerase activity at 37°C or at both 33 and 37°C, respectively, relative to 01310PB2 in this study 24,30 .
The role or function related to the highly ranked K526R mutation in the cap-627 linker domain is unclear, but the impinging position of 520-535 residues on the cap-binding site may support speculation regarding its role in viral mRNA transcription 48 . In addition, it is noteworthy that K526R showed a higher effect than the cap-binding site mutations on polymerase activity. Considering the report that K526R enhances the effects of E627K on viral replication via coordination with NEP (nuclear export protein), the role of K526R may not be as simple as has been postulated 14 . Although the polymerase activity of HIB-7 and HIB-8 was indistinguishable, the acquisition of the K526R mutation by rPB2(01310)-SIB-6 increased polymerase activity, replication efficiency in A549 cells www.nature.com/scientificreports www.nature.com/scientificreports/ and mouse pathogenicity in comparison with rPB2(01310)-SIB-5-1. The K526R mutation was suggested to be a mammalian adaptation marker of avian H7N9 viruses in collaboration with the E627K and D701N mutations and to play a role in the successful adaptation of H3N2 seasonal flu since approximately 1970 14 .
The T588I mutation in the PB2 genes of A(H1N1) pdm2009 viruses increased polymerase activity at 33 and 37 °C and increased mouse pathogenicity by way of probable interaction with MAVS, resulting in inhibition of IFN-β expression 11 . The A588T mutation also increased polymerase activity and mouse pathogenicity, while 588I showed significantly higher polymerase activity than 588T in this study 23 . Although HIB-9-1 and HIB-9-2 did not show dramatic differences in polymerase activity and mouse pathogenicity, rPB2(01310)-SIB-5-2, which contained 588I, showed higher viral replication and pathogenicity in mammals than rPB2(01310)-SIB-5-1. The www.nature.com/scientificreports www.nature.com/scientificreports/ high frequency of HIB-9-1 but recent dominance of HIB-9-2 may reflect the fine-tuning of mutation potency for viral fitness (I588T) as observed in mutational selection between 591K and 591R or 701N and 702R (Fig. 4). Interestingly, E627K was incompatible with the PB2 gene of A(H1N1) pdm2009, but most of its MPMs (147T, 590S, 591R, 271A, 526R and 588T/I) were successfully accumulated to E627K (Table S8) 45 . Therefore, the type of early MPMs may be important.
Previously, MVV mutation was speculated to be the minimum essential for avian PB2 genes to acquire additional mutations for mammalian adaptation 29 . The PB2 gene of A(H1N1) pdm1918 (A/Brevig Mission/1/1918) possessed only MVV, E627K, A199S, and K702R mutations, and they may be minimum essential PB2 mutations for catastrophic pandemics 29,49 . Although A(H1N1) pdm2009 viruses without strong PB2 mutations became pandemic, the low mouse pathogenicity of SIB-5-1 is coincident with the low pathogenicity of A(H1N1) pdm2009 viruses 50 . Early mutations in the 627 domain, e.g., G590S and E627K, may be crucial for viruses to become pandemic. A strong E627K mutation may be enough for viral fitness, but a weak G590S mutation may need to recruit more mutations, Q591R and A588T, in the 627 domain 51 . Although the D701N mutation is strong, it may be unsuccessful for recruiting additional mutations to become pandemic at present. However, PB2 genes possessing I147T-K339T mutations have evolved by acquiring E627K or A588T mutation, and H5N1 viruses carried those PB2 genes 23 . The high prevalence of I147T-K339T mutations in the avian category may mean they did not mitigate viral fitness in avian hosts, and they may need to be monitored for additional mutation acquisition and mammalian transmission (Fig. 4).
In this study, we collectively compared individual and combined MPMs in the same genetic backgrounds and determined their rank orders of effects on polymerase activity, replication efficiency in mammalian hosts and mouse pathogenicity. In addition, we provided hypothetical mutational steps of avian PB2 genes during mammalian adaptation. Thus, our data and the prototypic PB2 gene may be useful to predict present and future risks and to understand the evolution of avian and mammalian PB2 genes. Additionally, our present study may encourage more systematic studies to simulate co-evolution and epistatic interaction of PB2, PB1 and PA genes of IAVs.

Materials and Methods
Viruses, eggs, cells and plasmids. A/Puerto Rico/8/34 (H1N1) (PR8) and PR8-derived viruses containing the mutated 01310 PB2 gene were generated using a Hoffmann's reverse genetic system 29,52,53 . They were passaged two times in 10-day-old SPF embryonated chicken eggs (ECEs) via the allantoic cavity route (Charles River Laboratories, Wilmington, MA, USA). 293T, MDCK, and A549 cells were purchased from the Korean Collection for Type Cultures (KCTC, Daejeon, Korea). 293T and MDCK cells were maintained in DMEM supplemented with 10% FBS (Life Technologies Co., Carlsbad, CA, USA), and A549 cells were maintained in DMEM/ F12 supplemented with 10% FBS. To generate recombinant viruses, Hoffmann's reverse genetic system was used as described previously 52,53 . Recombinant viruses were generated and passaged two times in 10-day-old SPF ECEs and then stored at −80 °C until experimental use.
Data mining and sequence analysis. PB2 genes containing a complete coding sequence were collected from the NCBI Influenza Virus Resource (http://www.fludb.org), and they were translated and compared with the BioEdit program (v7.2.5). The profiles of MPM genes were analyzed by median joining analysis with Network 5.0.0.3 54 . Cloning and site-directed mutagenesis. The 01310PB2 segment previously cloned into Hoffmann's bidirectional transcription vector pHW2000 was used in this study 29 . The insert sequence was confirmed by sequencing with primers cmv-SF and bGH-SR as previously described (Macrogen Co, Seoul, Korea) 55 . Site-directed mutagenesis to introduce individual or combined MPMs into 01310 PB2 gene was performed using a Muta-direct Site Directed Mutagenesis Kit as per the manufacturer's protocol (iNtRON, Sungnam, Korea).
Mini-genome assay. To evaluate the polymerase activity of mutated PB2 genes, we used the pHW-NP-Luc plasmid as previously described 29 . Briefly, 293T cells in 12-well plates were cotransfected with 0.1 μg each of pHW-NP-Luc and mutated 01310 PB2 and the PR8 PB1, PA and NP genes. Additionally, 0.1 μg of the Renilla luciferase plasmid pRL-TK (Promega, Madison, WI, USA) was also cotransfected, which served as an internal control to normalize variations in transfection efficiency and sample processing. Then, 24 hours after transfection, luminescence was measured using a Dual-Glo Luciferase Assay System (Promega, Madison, WI, USA) in accordance with the manufacturer's instructions on a TECAN Infinite 200 pro machine (Tecan Benelux bv, Giessen, Netherlands). All results shown are the average of triplicate experiments, and the standard deviation was calculated.

Generation of recombinant viruses by reverse genetics
PR8-derived recombinant virus was generated by transfecting Hoffmann's eight reverse genetics plasmids into 293T cells as described previously, with some modifications 52,53,55 . One day before the transfection, 293T cells were cultured in 6-well plates (5 × 10 5 cells/well), and 300 ng of each plasmid was transfected together into the cells using Lipofectamine Plus reagent (Life Technologies) in 1 mL of Opti-MEM (Life Technologies). After overnight incubation, 1 mL of Opti-MEM and 0.5 µg/mL L-1-tosylamide-2-phenylethyl chloromethyl ketone (TPCK)-treated trypsin (Sigma-Aldrich, USA) were added. After another overnight incubation, 0.2 mL of the harvested culture medium was inoculated into 10-day-old SPF ECEs via the allantoic cavity route. The allantoic fluid was harvested from inoculated ECEs after incubation at 37 °C and checked for the presence and titers of recombinant viruses using plate and 96-well hemagglutination tests with 1% (v/v) chicken red blood Animal experiments. Six-week-old female BALB/c mice (KOATEC Co. Pyeongtaek, Korea) were used for a mouse pathogenicity test (BIoPOA Co., Yongin, Korea). Five BALB/c mice were intranasally inoculated with 10 6 EID 50 /50 µL of each virus after anaesthetized intraperitoneal injection of 15 mg/kg Zoletil 50 (Virbac, Carros, France). Negative control (mock) mice were injected with the same volume of sterilized PBS. To measure the 50% mouse lethal dose (MLD 50 ), four mice were anaesthetized via intraperitoneal injection of 15 mg/kg Zoletil 50 (Virbac, Carros, France) and then intranasally inoculated with 10 2 to 10 5 EID 50 /50 µL of each virus. Mortality and weight loss were measured for 14 days. Mice that lost more than 20% of their original weight were euthanized and recorded as dead. MLD 50 was calculated by the Spearman-Karber method. For the measurement of virus replication in infected mouse lung, six mice from each group were injected with PBS (mock) or 10 6 EID 50 /50 µL of mutant virus. The lungs were collected at 3 and 6 dpi and then stored at −80 °C until use. The lungs were ground using a TissueLyzer 2 (Qiagen, Hilden, G) with 5 mM stainless steel beads and a volume of PBS equal to 10% of the lung weight in suspension. Then, 10 volumes of PBS were mixed with the ground tissues. After centrifugation at 2000 × g for 10 min, the supernatants were used for viral titers, which were measured as EID 50 . Statistical analysis. The polymerase activity and virus titer were compared by one-way analysis-of-variance (IBM SPSS Statistics ver. 23; IBM, USA). The results were considered statistically significant if P < 0.05, P < 0.01 and P < 0.001.

Data availability
All data generated or analysed during this study are included in this published article (and its Supplementary Information files).