First genome report and analysis of chicken H7N9 influenza viruses with poly-basic amino acids insertion in the hemagglutinin cleavage site

We report the full-length sequences of two chicken-source influenza A (H7N9) viruses found in a Guangdong live poultry market (LPM) during the most recent wave of human infections (from October 2016 to the present time). These viruses carry an insertion of poly-basic amino acids (KGKRTAR/G) at the protease cleavage site of the HA protein, a motif previously found in the highly pathogenic (HP) human influenza A (H7N9) [IAV(H7N9)] strains. Phylogenetic analysis of these two novel avian influenza viruses (AIVs) suggested that their genomes are reassortants of the Yangtze River Delta (YRD) and Pearl River Delta (PRD) clades. Molecular clock analysis indicated that they emerged several months before the HP human strains. Collectively, our results suggest that IAV(H7N9) viruses evolve in chickens through antigenic drift to include a signature HP sequence in the HA gene, which highlights challenges in risk assessment and public health management of IAV(H7N9) infections at the human-animal interface.

(43/57) known to have avian contact 2. Therefore, avian contact is considered to be closely related to human IAV(H7N9) cases. Previous findings have also hinted that live poultry markets (LPMs) play a key role in human infection with multiple AIV subtypes 10,11. We therefore recently conducted a survey of AIV infections among chickens, ducks, geese and pigeons sold in a LPM in Guangdong province and found that 2 samples from chickens were positive for AIV. Importantly, our phylogenetic analysis has shown that these novel avian viruses are reassortants of two distinct genetic lineages, the YRD and PRD clades, and that their HA proteins contain the KGKRTAR/G motif, with an insertion of poly-basic amino acids at the HA cleavage site, which has previously been reported among the sequences of HP human influenza virus strains. We therefore propose that the HA sequence change observed in the novel AIVs in our study might be a precursor to those found in HP human H7N9 viruses.

Results
Sample information. On February 25th, 2017, during the fifth IAV(H7N9) epidemic, we collected a total of 200 tracheal swab samples from 50 chickens, 50 ducks, 50 geese and 50 pigeons at a LPM in Foshan City, Guangdong province, China. These birds showed no significant clinical signs. By real-time RT-PCR, 2 samples were found to be positive for the H7 HA gene. These two swab samples were collected from chickens that were kept in separate cages and sold by different vendors. The two strains were named A/Chicken/Guangdong/J1/2017(H7N9) and A/Chicken/Guangdong/J2/2017(H7N9) (abbreviated as CK/J1 and CK/J2, respectively). The whole-genome sequences of the two strains were submitted to NCBI GenBank under accession numbers KY855515-KY855530.
Viral Proteins and Molecular Characteristics of the Two Novel AIVs. The molecular signatures of CK/J1 and CK/J2 associated with host adaptation, receptor specificity, potential pathogenesis and antiviral resistance were assessed and compared with those of three recently described HP human IAV(H7N9) isolates with an amino acid insertion in the HA cleavage site: A/Guangdong/17SF006/2017(H7N9), A/Qingyuan/GIRD1/2017(H7N9) and A/Taiwan/1/2017(H7N9), abbreviated as GD/17SF006, QY/GIRD1 and TW/1, respectively (Table 1). The Q226L/I and G228S substitutions in the HA protein, which are the two main mutations contributing to the high-affinity binding of viruses to human receptors, were not identified in the two novel AIVs. However, several HA substitutions that may increase the binding ability of the viruses to the human α-2,6-linked sialic acid receptor were detected, namely S138A, T160A and G186V. NS1 E172K, which induces viral replication in mammalian cells, was found in both chicken and human influenza viruses 12. The virulence-related NS1 substitutions P42S and D92E were also identified.
M2 S31N, which confers resistance to amantadine and rimantadine, was found in all these viruses. However, NA mutations that may cause oseltamivir resistance, such as E119V and I222L, were not detected in these viruses. Unlike the human strains, the two AIVs found in this study still harbored known avian influenza virus signatures; e.g., the PB2 E627K mutation was found in the 3 human strains but not in CK/J1 and CK/J2. Meanwhile, PA K356R was absent from CK/J1 but was harbored by CK/J2 and the 3 human viruses (Table 1).
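The marker screen summarized above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build Table 1: the function names and the `CHICKEN_RESIDUES` dictionary are hypothetical, the residues entered are those stated in the text for CK/J1 and CK/J2, and the positions are assumed to be already expressed in the same numbering scheme as the substitution labels.

```python
import re
from typing import Dict, List, Tuple


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'E627K' into (reference residue, position, mutant residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def has_substitution(residues: Dict[int, str], label: str) -> bool:
    """True if the mutant residue of `label` is present at its position."""
    _, position, mutant = parse_substitution(label)
    return residues.get(position) == mutant


def screen(strain: Dict[str, Dict[int, str]],
           markers: List[Tuple[str, str]]) -> Dict[str, bool]:
    """Report, for each (protein, substitution) marker, whether the strain carries it."""
    return {f"{protein} {label}": has_substitution(strain.get(protein, {}), label)
            for protein, label in markers}


# Residues reported in the text for the two chicken strains (hypothetical container).
CHICKEN_RESIDUES = {
    "HA": {138: "A", 160: "A", 186: "V", 226: "Q", 228: "G"},
    "PB2": {627: "E"},
    "M2": {31: "N"},
}

MARKERS = [
    ("HA", "Q226L"), ("HA", "Q226I"), ("HA", "G228S"),
    ("HA", "S138A"), ("HA", "T160A"), ("HA", "G186V"),
    ("PB2", "E627K"), ("M2", "S31N"),
]

if __name__ == "__main__":
    for marker, present in screen(CHICKEN_RESIDUES, MARKERS).items():
        print(f"{marker}: {'present' if present else 'absent'}")
```

In practice the residue dictionaries would be filled from an alignment against a numbered reference sequence; that step is omitted here because it depends on the numbering convention chosen.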
Additional Amino Acid Insertion in HA Cleavage Sites of the Two Novel AIVs. In the first four epidemic waves (from Mar 2013 to Oct 2016), all reported viruses had the low pathogenic (LP) cleavage site sequence PEIPKG/GLF in their HA proteins (Fig. 1a). However, during the fifth wave, 5 human IAV(H7N9) strains isolated in Guangdong, A/Qingyuan/GIRD1/2017, A/Guangdong/SP440/2017, A/Guangdong/HP001/2017, A/Guangdong/17SF003/2017 and A/Guangdong/17SF006/2017 (or QY/GIRD1, GD/SP440, GD/HP001, GD/17SF003 and GD/17SF006, respectively), had an insertion of 3 basic amino acid residues (RKR) in the cleavage site connecting the HA1 and HA2 peptide regions, carrying the PEVPKRKRTAR/GLF sequence, which is a signature of HPAI viruses 8. The two novel AIVs CK/J1 and CK/J2 also had similar insertions in the cleavage site of the HA protein, but they differed slightly from the human strain signature. The chicken viruses carried the PEVPKGKRTAR/GLF sequence, with only 2 basic amino acid residues inserted (KR) (Fig. 1a). These amino acid changes corresponded to nucleotide changes. Specifically, 12 additional nucleotides (AACGGACTGCGA) were found in CK/J1, CK/J2 and the 5 new human isolates, but not in strains from the first 4 epidemics (W1 to W4) (Fig. 1b). A nucleotide mutation, G1012A (H7 numbering), was also found in the human strains, but not in CK/J1, CK/J2 or other H7 viruses with the LP cleavage site (Fig. 1b). The HA sequencing chromatograms of the cleavage site regions of CK/J1 and CK/J2 showed a single peak for each nucleotide, which indicates that the sequencing results are trustworthy (Fig. 1c). According to the alignment of viral HA sequences available in the GISAID database, this is the first demonstration of such a molecular sequence signature in chicken IAV(H7N9) viruses since their emergence in 2013.
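The three cleavage-site motifs quoted above can be compared programmatically. The sketch below is a minimal illustration using only the motif strings as written in this section; the helper names are hypothetical, and treating only lysine (K) and arginine (R) as basic residues is an assumption made for this comparison.

```python
from typing import List, Tuple

# Residues counted as basic for this comparison (assumption: K and R only).
BASIC = frozenset("KR")

# Motifs as quoted in the text; "/" marks the HA1/HA2 cleavage point.
LP_MOTIF = "PEIPKG/GLF"
CHICKEN_MOTIF = "PEVPKGKRTAR/GLF"
HUMAN_HP_MOTIF = "PEVPKRKRTAR/GLF"


def upstream(motif: str) -> str:
    """Return the residues on the HA1 side of the cleavage point."""
    return motif.split("/")[0]


def count_basic(motif: str) -> int:
    """Number of basic residues upstream of the cleavage point."""
    return sum(residue in BASIC for residue in upstream(motif))


def longest_basic_run(motif: str) -> int:
    """Length of the longest uninterrupted stretch of basic residues upstream."""
    best = current = 0
    for residue in upstream(motif):
        current = current + 1 if residue in BASIC else 0
        best = max(best, current)
    return best


def mismatches(a: str, b: str) -> List[Tuple[int, str, str]]:
    """1-based positions where two equal-length upstream motifs differ."""
    ua, ub = upstream(a), upstream(b)
    if len(ua) != len(ub):
        raise ValueError("motifs must have the same upstream length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(ua, ub)) if x != y]


if __name__ == "__main__":
    for name, motif in [("LP", LP_MOTIF), ("chicken", CHICKEN_MOTIF),
                        ("human HP", HUMAN_HP_MOTIF)]:
        print(name, count_basic(motif), longest_basic_run(motif))
    print(mismatches(CHICKEN_MOTIF, HUMAN_HP_MOTIF))
```

Run on these strings, the chicken and human HP motifs differ at a single position (G versus R), and the basic residues form a shorter uninterrupted stretch in the chicken motif than in the human one.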
The two glycoprotein genes, HA and NA, of the novel AIVs (CK/J1 and CK/J2) and of the human strains recently found in Guangdong province all belong to the PRD lineage. In terms of genetic relationship, these viruses were closely related to W3 strains (Fig. 2, Supplementary Figures S2 and S3). The polymerase tripartite genes (PB2, PB1 and PA) of CK/J1 and CK/J2 were derived from the YRD clade and were close to W3 and W4 viruses. Similar to the two chicken viruses, PB2 of the five Guangdong human isolates also originated from the YRD clade in W3 to W4. However, the PA genes of the 5 human strains and PB1 of GD/SP440, GD/HP001 and GD/17SF003 were located in the PRD W3-like lineage, while PB1 of QY/GIRD1 and GD/17SF006 was located in the OR clade W3-like branch (Fig. 2, Supplementary Figures S1 and S2). The NP genes of the two avian and five human strains were closely related to W3 to W5 strains. Except for those of QY/GIRD1 and GD/17SF006, which derived from PRD, the NP genes of the other 5 strains originated from the YRD clade (Fig. 2, Supplementary Figure S3). The M and NS genes of all 7 viruses derived from the YRD clade. The M genes of the avian and human strains had a close genetic relationship with W3 to W4 viruses (Fig. 2, Supplementary Figure S4). Although all NS genes originated from the YRD clade, they derived from different epidemic periods. The NS gene of CK/J1 was closely related to the W3 viruses, and that of CK/J2 to the W1 to W2 strains. The NS genes of the 5 human viruses were from W4 strains. All gene segments of A/Shanghai/1/2013(H7N9) (termed SH/1), a common ancestor of IAV(H7N9) viruses, derived from the YRD clade. In the phylogenetic trees, the polymerase tripartite and NS genes of CK/J1 and CK/J2 were closer to avian strains than to human ones.
The Clade I-B was estimated to start around June 2012. Together, the molecular clock analyses of the HA and NA genes of the novel HP IAV(H7N9) further confirmed that the avian HP strains appeared months earlier than the human strains during the evolution process.

Discussion
In this study, the positive rate of potentially HP IAV(H7N9) influenza virus by real-time RT-PCR detection was 1% (2/200). Since we collected samples from only one LPM, this finding does not necessarily represent the overall situation of the ongoing 5th epidemic of influenza virus infections in China, but it still reveals a potential threat to public health. In this study, all the sampled birds, including the sources of CK/J1 and CK/J2, showed no significant clinical signs. It is known that an H5 or H7 virus occasionally causes only mild illness in chickens and turkeys, although it has a genetic signature that classifies it as an HPAI virus [14][15][16]. According to the amino acid signature analysis in our study, this may be related to the non-continuous basic amino acid insertion in the HA cleavage site, which differs from that of the human HP strains; further study is required to verify the relationship between such differences and viral pathogenicity 17. The 226Q and 228G amino acids in the HA protein and/or the 627E and 701D in the PB2 protein indicate that such viruses may still be adapted to avian species 18,19. However, these viruses also show some human or mammalian receptor binding characteristics, such as 138A, 160A and 186V in the HA protein, indicating their potential adaptation to the human 2,6-linked sialic acid receptor [20][21][22]. A study of the recent human HP H7N9 strains GD/17SF003 and GD/17SF006, which both carried the amino acid insertion at the HA cleavage site, suggested a preference for both avian- and human-type receptors 23. Because LPMs play a critical role in the spread of AIVs to humans, birds with no significant clinical signs may still shed viruses during breeding and growing on farms and during transportation and sale in LPMs, and this shedding may not be easily noticed. This would increase the risk of AIV spreading to humans and other animals 10.
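Because the 1% positive rate rests on 2 positives among 200 swabs from a single market, the sampling uncertainty can be made explicit with a confidence interval. The sketch below uses the Wilson score interval, which is one common choice for small counts; it is not an analysis reported in the study, and the function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(positives: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need 0 <= positives <= total and total > 0")
    p = positives / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    low, high = wilson_interval(2, 200)  # 2 H7-positive swabs out of 200
    print(f"positive rate: {2 / 200:.1%}, 95% interval: {low:.2%} to {high:.2%}")
```

For 2/200 the interval spans roughly 0.3% to 3.6%, which is consistent with the caution that a single-market sample cannot characterize the whole epidemic.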
In such conditions, annual systematic and
Phylogenetic analysis indicates that the two novel AIVs (CK/J1 and CK/J2) and the 5 human strains were all reassortant viruses (Fig. 2). The avian strains harbored two surface glycoprotein genes (HA and NA) from the PRD clade, while the other six internal genes were from the YRD clade (Fig. 2). Different from the chicken strains, the human strains in Guangdong province during the fifth wave carried two different genotypes. QY/GIRD1 and GD/17SF006 contained genes from three clades (PB2, M and NS from YRD; PA, HA, NP and NA from PRD; PB1 from OR). GD/SP440, GD/HP001 and GD/17SF003 contained genes from two clades (PB2, NP, M and NS from YRD; the remaining genes from PRD) 13. These results indicate that the two main IAV(H7N9) sources were YRD and PRD, corroborating previous findings 13. In addition, the results suggest that IAV(H7N9) viruses with different genotypes are co-circulating in birds and humans 7,13. Moreover, the surface glycoprotein genes (HA and NA) of the two avian and five human strains are derived from wave 3 and are similar to those of the PRD lineage viruses. According to Wu and colleagues, genes located in this lineage began to circulate in the central region of Guangdong province in wave 2 24. Some genes of the two novel AIVs CK/J1 and CK/J2, such as HA, NA, NP and M, were closely related to human strain lineages [20][21][22]25. However, unlike those of the 5 human strains, the polymerase tripartite and NS genes of CK/J1 and CK/J2 were closely related to avian strains, indicating that such genes may still circulate and undergo adaptation in poultry in Guangdong.
Figure 3. Bayesian maximum clade credibility (MCC) phylogeny of H7 gene sequences of waves 1 to 5. Viruses isolated in the Yangtze River Delta, Pearl River Delta, and other regions are highlighted in red, blue, and green, respectively. BEAST tree of H7N9 viruses estimated using HA gene sequences. Orange shadow indicates Clade I of H7N9 viruses; purple represents Clade II. Figure 3A, time of most recent ancestor (TMRCA) of Clade II. Figure 3B, TMRCA of the entire tree. Figure 3C, TMRCA of Clade I. Figure 3D, TMRCA of the Clade I-A lineage. Figure 3E, TMRCA of the Clade I-B lineage. Figure 3F, TMRCA of the lineage containing H7N9 viruses detected in avian or humans with HP characteristic cleavage site. Purple characters, human HP strains; orange characters, chicken HP strains detected in this study.
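The segment-by-segment clade assignments given in the Discussion can be tabulated and grouped into genotypes. The following sketch simply encodes the assignments as stated in the text (YRD, PRD and OR labels per segment); the data layout and function names are hypothetical and are not part of the phylogenetic pipeline used in the study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")

# Clade of origin per segment, transcribed from the Discussion.
CHICKEN = {"PB2": "YRD", "PB1": "YRD", "PA": "YRD", "HA": "PRD",
           "NP": "YRD", "NA": "PRD", "M": "YRD", "NS": "YRD"}
HUMAN_THREE_CLADE = {"PB2": "YRD", "PB1": "OR", "PA": "PRD", "HA": "PRD",
                     "NP": "PRD", "NA": "PRD", "M": "YRD", "NS": "YRD"}
HUMAN_TWO_CLADE = {"PB2": "YRD", "PB1": "PRD", "PA": "PRD", "HA": "PRD",
                   "NP": "YRD", "NA": "PRD", "M": "YRD", "NS": "YRD"}

STRAINS = {
    "CK/J1": CHICKEN, "CK/J2": CHICKEN,
    "QY/GIRD1": HUMAN_THREE_CLADE, "GD/17SF006": HUMAN_THREE_CLADE,
    "GD/SP440": HUMAN_TWO_CLADE, "GD/HP001": HUMAN_TWO_CLADE,
    "GD/17SF003": HUMAN_TWO_CLADE,
}


def genotype_key(assignments: Dict[str, str]) -> Tuple[str, ...]:
    """Ordered tuple of clades, one per segment, used to identify a genotype."""
    return tuple(assignments[segment] for segment in SEGMENTS)


def group_by_genotype(strains: Dict[str, Dict[str, str]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group strain names that share the same eight-segment constellation."""
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for name, assignments in strains.items():
        groups[genotype_key(assignments)].append(name)
    return dict(groups)


def differing_segments(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Segments whose clade of origin differs between two constellations."""
    return [segment for segment in SEGMENTS if a[segment] != b[segment]]


if __name__ == "__main__":
    for key, names in group_by_genotype(STRAINS).items():
        print(dict(zip(SEGMENTS, key)), names)
    print(differing_segments(CHICKEN, HUMAN_TWO_CLADE))
```

Laid out this way, the chicken constellation differs from the two human genotypes only in internal-gene segments, while the HA and NA assignments are shared.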
The molecular clock analysis of the HA and NA genes suggested that CK/J1 and CK/J2 appeared between July 2015 and March 2016, several months earlier than the newly found human strains with the insertion in the HA cleavage site, which were estimated to appear in March 2016 (Fig. 5). Such results suggest that the newly identified IAV(H7N9) with HP status may have obtained some of the HP mutations in chickens before adaptation to humans 26. As shown by the molecular clock study, nine of thirteen viruses in the branch containing the HA gene of the novel HP IAV(H7N9) strains were of human origin; only four viruses were avian, including CK/J1 and CK/J2 (Fig. 3F). Taken together, the molecular clock analysis, molecular characterization and phylogenetic analyses indicate that the two novel AIV strains may represent intermediary viruses in the spread of IAV(H7N9) between birds and humans. However, direct evidence that these novel AIVs transmit from birds to humans remains to be demonstrated.
Figure 4C, TMRCA of Clade I. Figure 4D, TMRCA of the Clade I-A lineage. Figure 4E, TMRCA of the Clade I-B lineage. Figure 4F, TMRCA of the lineage containing H7N9 viruses detected in avian or humans with HP characteristic cleavage site.
Some limitations of this study are noteworthy. First, because of the necessary biosafety restrictions, we did not attempt to isolate and culture the AIVs from the positive swab samples. However, since their full genomes have been obtained, these novel AIVs can potentially be reconstructed through reverse genetics (e.g., in collaboration with researchers with a BSL-3 lab) for further basic virology research. Second, sample collection was carried out on a single day at one LPM; therefore, the results cannot be generalized to the whole region or to a longer period of time. Additional surveys over a longer time frame and at more sampling sites (e.g., in poultry farms and LPMs in southern China, including but not necessarily limited to Fujian, Hunan, Guangxi and Hainan provinces besides Guangdong province) are being planned. The results from these additional studies will be described in subsequent reports.
In conclusion, we demonstrate for the first time that IAV(H7N9) viruses from chickens in a LPM in Guangdong province have acquired additional basic amino acids in the HA cleavage site, with only one amino acid difference from the insertion found in HP IAV(H7N9) human strains, which can play an important role in increased virulence in humans. According to the divergence time scale, these two AIV strains seem to have appeared months before the isolation of the human HP strains in Guangdong province. Therefore, the poly-basic insertion acquired by the novel AIVs may be attributed to persistent circulation in poultry species, and these viruses may serve as intermediaries in human infections. Further investigation is required to determine whether the poly-basic HA cleavage site of the IAV(H7N9) virus is associated with increased avian and/or human disease severity. The molecular characteristics of these novel chicken AIV strains highlight challenges in risk assessment of IAV(H7N9) infections at the human-animal interface.

Materials and Methods
Ethics Statement. Swab sampling and experiments were approved by the Institutional Animal Care and Use Committee of Guangzhou Medical University. All methods were performed in accordance with the relevant guidelines and regulations.

Sample collection.
A total of 200 tracheal swab samples were collected from chickens, ducks, geese and pigeons without significant clinical signs in a LPM located in Guangdong Province on Feb 25, 2017. Swabs were placed into sterile tubes containing 1 mL of phosphate-buffered saline (PBS) on ice, and transported to the laboratory for further sample processing and RNA extraction within 4 hours. Swabs were vortexed vigorously for 15-20 s and pressed against the tube wall to remove as much organic material as possible. The samples were then cleared by centrifugation for 5 min at 3000 rpm. The resulting supernatants were stored at −80 °C until use.
Real time RT-PCR detection and viral genome sequencing. Before testing, samples were thawed at room temperature. Viral RNA was extracted using the RNeasy Minikit (Qiagen, Germany). Real-time reverse transcription-polymerase chain reaction (RT-PCR) was performed with the PCR-Fluorescence Detection Kit for H7 Influenza A virus RNA (Cat No. SJ-LG-004-3, Shanghai Biogerm Biological Technology Co., LTD, Shanghai, China) following the manufacturer's instructions, on the ABI-7500 Real-time PCR system (Applied Biosystems, Foster City, CA). A Ct value ≤ 30 was considered positive. One-step RT-PCR was performed on the Bio-Rad T100™ Thermal Cycler (Bio-Rad, Hercules, CA) with the Invitrogen Superscript kit, following the manufacturer's instructions. The full-length genomic sequences of the two IAV(H7N9) strains were amplified with specific RT-PCR primers. PCR products were separated by 1% agarose gel electrophoresis, purified using the Qiagen gel extraction kit (Qiagen, Inc., Valencia, CA), and cloned into the pMD18-T vector (TaKaRa). The pMD18-T vectors were transformed into DH5α competent cells, which were cultured at 37 °C for 20 h. Five colonies of each virus gene segment were sent to Sangon Biotech (Shanghai) Co., Ltd. for complete genome sequencing on the ABI 3730XL automatic DNA analyzer (Applied Biosystems) with the ABI BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems). None of the work included in this study involved culturing live viruses. Sample preparation and viral RNA extraction from the swabs were conducted inside a biosafety cabinet in a BSL-2 lab facility, following proper biosafety guidelines and procedures approved by the local institution.
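The positivity rule stated above (Ct ≤ 30) can be written as a small calling function. This is a sketch only: the sample identifiers and Ct values below are hypothetical placeholders rather than data from this study, and representing a reaction with no amplification as `None` is an assumption of this example.

```python
from typing import Dict, List, Optional

CT_CUTOFF = 30.0  # cut-off stated in the Methods: Ct <= 30 is called positive


def is_h7_positive(ct: Optional[float], cutoff: float = CT_CUTOFF) -> bool:
    """Call a sample positive when a Ct value exists and does not exceed the cut-off."""
    return ct is not None and ct <= cutoff


def positive_samples(ct_values: Dict[str, Optional[float]]) -> List[str]:
    """Return the identifiers of samples called positive, in input order."""
    return [sample for sample, ct in ct_values.items() if is_h7_positive(ct)]


if __name__ == "__main__":
    # Hypothetical run sheet; None means no amplification was detected.
    example_run = {"swab-001": 24.8, "swab-002": None, "swab-003": 33.5, "swab-004": 30.0}
    print(positive_samples(example_run))
```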