Introduction

During the 2013 H7N9 influenza outbreak in southeast China, there were 139 serologically confirmed cases and 48 deaths1,2. The H7N9 viruses were generated by the subsequent reassortment of H7 viruses with enzootic H9N2 viruses3,4. H7N9 infection was likely mediated by exposure to poultry, because about 55.9% of H7N9 influenza patients had a clearly defined poultry exposure5. In addition, the closure of the poultry trading markets in China coincided with control of the outbreak. Although many people may have come into contact with such poultry, only a minority became sick6,7, suggesting that there may be genetic determinants of both host susceptibility and severity of infection8. The acute onset of H7N9 infection, its rapid progression to severe pneumonia and acute respiratory distress syndrome (ARDS), and its high mortality highlight the importance of identifying genetic polymorphisms that might predict the host response to H7N9 infection. Knowledge of these polymorphisms might help predict both susceptibility to infection and the severity of the host response during potential future influenza outbreaks.

Materials and Methods

Experimental Design

This study was approved by the Ethics Committee of Fudan University, and all experiments were performed in accordance with the relevant guidelines and regulations of the Ethics Committee of Fudan University. Because of budget limitations, we collected blood samples from 18 H7N9-infected patients, both during the outbreak period and at follow-up after hospital discharge. We performed exome sequencing on 8 of the 12 survivors of H7N9 infection and verified all the identified gene mutations in all 18 blood samples, including those from the non-survivors (Table 1). In-house data from BGI-Shenzhen served as the control9.

Table 1 The clinical information of the 18 patients.

Sample Collection

After informed consent was obtained from all subjects, blood samples were collected from 18 H7N9-infected patients, and exome sequencing was performed on 8 of them. Patients were considered to have H7N9 pneumonia if the following criteria were fulfilled: (1) a nasopharyngeal swab positive for H7N9; (2) chest X-ray or CT showing pulmonary infiltrates; and (3) clinical symptoms of fever and cough. This study was approved by the ethics committee of Shanghai Public Health Clinical Center.

Blood collection and DNA extraction

After consent, 10 ml of venous blood was drawn from each patient into an EDTA-containing tube. Samples were immediately centrifuged at 500 g for 10 min, and the plasma was removed and stored for future measurement. Genomic DNA was extracted from the remaining cell pellet using the SQ Blood DNA Kit II (omegabiotek D0714-250). Briefly, cells were lysed, and the cell nuclei and mitochondria were then separated by centrifugation. The isolated nuclei were resuspended in XL Buffer (supplied by omegabiotek), which contains chaotropic salt and proteinase to remove contaminants. Finally, genomic DNA was purified by isopropanol precipitation.

Exome capture, library preparation and sequencing

The genomic DNA isolated from 8 patients was fragmented to lengths of 150 to 200 bp using Covaris technology, and adapters were then ligated to both ends of the resulting fragments. The adapter-ligated templates were purified with Agencourt AMPure SPRI beads, and fragments with an insert size of about 200 bp were excised. The extracted DNA was amplified by ligation-mediated polymerase chain reaction (LM-PCR), purified, and hybridized to the Agilent SureSelect Human All Exon (50 M) human exome array for enrichment. Hybridized fragments were bound to streptavidin beads, whereas non-hybridized fragments were washed out after 24 h. Captured LM-PCR products were analyzed on an Agilent 2100 Bioanalyzer to estimate the magnitude of enrichment. Each captured library was then loaded onto the HiSeq 2000 platform for high-throughput sequencing. Raw image files were processed with Illumina base calling Software 1.7 using default parameters, and the sequences of each individual were generated as 90 bp paired-end reads (Table 2).

Table 2 Exon sequencing data summary.

Read mapping and variation detection

After reads containing sequencing adapters and low-quality reads were removed, the remaining high-quality reads were aligned to the NCBI human reference genome (hg19/GRCh37) using BWA (Burrows-Wheeler Aligner, v0.5.9-r16) with default parameters. A read was defined as low quality if more than half of its bases were low-quality bases (quality score less than or equal to 5) or if more than 10% of its bases were unknown. Picard (v1.54) (http://picard.sourceforge.net/) was used to mark duplicates. Subsequently, SAM files (sequence alignment/map format) were compressed to BAM files (the binary form of SAM files). SNPs (single-nucleotide polymorphisms) and InDels (small insertions/deletions) were detected with the Unified Genotyper module of GATK (Genome Analysis Toolkit v1.0.6076). ANNOVAR was then used to annotate and classify the SNPs and InDels, respectively. The variants were compared against the dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi), the 1000 Genomes database (www.1000genomes.org/) and BGI’s in-house control database. Most of the in-house controls came from a published whole-exome-sequencing study of genetic risk for psoriasis9, and the controls comprised 800 healthy individuals from across the country. These in-house controls were specifically used for the analysis of rare diseases. Because only 33 patients had confirmed H7N9 infection in Shanghai despite a large population being exposed to risk factors, H7N9 infection was a low-probability event and could be treated as a rare disease, so these control data could be used in this study. We collected 40 genes correlated with avian influenza from HuGE Navigator by a keyword search for “influenza” and, from the SNP results, extracted 89 exonic SNVs (single nucleotide variations) (Supplementary Dataset 1) located in 27 of the 40 genes (Table 3).
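
The low-quality read definition given above can be sketched in Python. This is a minimal illustration under stated assumptions, not the filtering code actually used in the study: the function name `is_low_quality`, its parameter names, and the assumption that quality scores have already been decoded to integer Phred values are hypothetical, and the text does not say whether both mates of a read pair are discarded together.

```python
def is_low_quality(seq, quals, low_q=5, max_low_frac=0.5, max_unknown_frac=0.10):
    """Return True if a read meets the low-quality definition in the text.

    seq   -- read bases as a string; 'N' marks an unknown base
    quals -- per-base quality scores as integers (assumed already decoded)
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if not seq:
        # Assumption: an empty read carries no information and is discarded.
        return True
    n = len(seq)
    low_bases = sum(1 for q in quals if q <= low_q)      # bases with quality <= 5
    unknown_bases = sum(1 for b in seq if b in "Nn")     # unknown bases
    # "More than half" and "more than 10%" are both strict inequalities.
    return low_bases / n > max_low_frac or unknown_bases / n > max_unknown_frac
```

A read with exactly half of its bases at or below the quality threshold is kept under this reading of the definition, because the criterion is "more than half".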

Table 3 The 21 virus infection-relevant genes identified from exon sequencing.

Gene mutation verification

The 89 mutations were verified in all 18 patients by Sanger sequencing. For each patient, 47 fragments were amplified from genomic DNA with PrimeSTAR® HS (Premix) (Takara R040A) to cover the 89 mutated sites. The PCR products were separated by 1% agarose gel electrophoresis and then purified from the gel with the QIAquick Gel Extraction Kit (QIAGEN No. 28706). The purified PCR products were subjected to Sanger sequencing (ABI 3730).

Bioinformatics analysis

A protein-protein interaction (PPI) network of the resulting genes was constructed using the Disease Association Protein-Protein Link Evaluator (DAPPLE, http://www.broadinstitute.org/mpg/dapple/dapple.php)8, with 1000 permutations and a cutoff of 2 for the interacting binding degree. The Database for Annotation, Visualization and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov/)10, a bioinformatics tool that identifies the biological processes in which a group of genes is involved, was used for functional annotation.

Results

Mutational Analysis of Genes from 18 H7N9 Infected patients

We admitted 18 H7N9-infected patients around 10 days after disease onset, and their clinical manifestations, laboratory examinations and prognosis were followed for the subsequent 15 weeks. Six of the 18 patients died, and we found that increased plasma CRP (C-reactive protein), PCT (procalcitonin) and number of virus-positive days were associated with mortality11. After exome sequencing of 8 survivors, 64 exonic SNPs, located in 21 genes, were found to be enriched in the H7N9 patients compared to controls from the NCBI human genome (hg19) (Supplementary Dataset 1 and Table 4). These mutations were found in genes encoding proteins responsible for multiple key host defense mechanisms, including cytokine production, airway epithelium barrier function and the pathogen-associated molecular pattern signaling pathway, suggesting biological plausibility (Table 2).

Table 4 The 64 validated exonic SNVs and their distribution between different groups.

Bioinformatics analysis

The genes carrying exonic SNPs were uploaded to the online tool DAPPLE for PPI network analysis. The results indicated that the PPI network was statistically significant: 5 disease proteins participated in the direct network, with 3 direct interactions in total (expected direct interactions = 0.347, p = 0.004) (Fig. 1, Table 5). Moreover, 13 genes participated in the indirect network under the same conditions (Fig. 2).
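
To make the relationship between the observed number of direct interactions, the expected number and the permutation p-value concrete, the following Python sketch shows one common way such quantities are derived from a set of permuted networks. It is an illustration only: the function `permutation_summary` is hypothetical, the add-one correction is a widely used convention rather than a documented detail of DAPPLE, and the permuted counts in the usage note are invented and do not reproduce the values reported here.

```python
def permutation_summary(observed, permuted_counts):
    """Summarise a permutation test on an interaction count.

    observed        -- number of direct interactions in the real network
    permuted_counts -- the same statistic computed on each permuted network
    Returns (expected, p_value), where expected is the mean over permutations
    and p_value is the one-sided empirical p-value with an add-one correction.
    """
    if not permuted_counts:
        raise ValueError("at least one permutation is required")
    n = len(permuted_counts)
    expected = sum(permuted_counts) / n
    # Count permutations that are at least as extreme as the observation.
    at_least_as_extreme = sum(1 for c in permuted_counts if c >= observed)
    p_value = (at_least_as_extreme + 1) / (n + 1)
    return expected, p_value
```

For example, `permutation_summary(3, [0] * 997 + [3, 4, 5])` uses 1000 hypothetical permuted counts, of which three reach or exceed the observed value of 3.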

Figure 1: Direct connections among gene products from exome sequencing result. Colours indicate significance of participation in the PPI network.

Table 5 The PPI network statistics.
Figure 2: Indirect connections among gene products from exome sequencing result.

We further confirmed the functions of these candidate genes using the online tool DAVID. The genes were significantly enriched for defense-related processes such as response to stimulus (p = 1.81 × 10^-8), immune response (p = 8.85 × 10^-7), immune system process (p = 1.16 × 10^-6), response to biotic stimulus (p = 5.48 × 10^-6) and modulation by symbiont of host immune response (p = 1.53 × 10^-5) (Table 6).

Table 6 Top 10 GO term analysis results.

Gene mutation distribution between different groups

Whole exome sequencing was performed on 8 H7N9 patients, and 89 exonic SNPs were identified. These SNPs were subjected to Sanger sequencing in all 18 patients, and 64 exonic SNVs were verified. We compared the mutation rates of the cases and the in-house controls using Fisher's exact test and found statistically significant differences (Supplementary Dataset 1 and Table 4). Seventeen SNVs differed significantly between the cases and the in-house controls, and we validated 16 of them by Sanger sequencing (Table 7). The 16 validated SNVs were located in 12 genes, and the protein-protein interactions among them (Fig. 3) were consistent with those found earlier among the 21 genes (Fig. 2). It is likely that both the genes whose mutation frequency differed significantly between patients and controls and the genes with the same mutation rate in the two groups participate in the pathogenesis of H7N9 virus infection. We also performed a Mann-Whitney U test between the first 8 patients and the additional 10 patients, and none of the P values was significant (Table 4), which suggests that the in-house control data did not introduce false signals. Moreover, we compared the mutation rates of the death group and the survival group using the Mann-Whitney U test and found no significant difference (Table 4), suggesting that some of the genes identified in this study may be associated with susceptibility to H7N9 influenza.
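
The case-versus-control comparison for a single SNV can be illustrated with a two-sided Fisher exact test on a 2 × 2 table of carriers and non-carriers. The Python sketch below is a minimal standard-library implementation under stated assumptions: the function name `fisher_exact_two_sided` is hypothetical, the text does not specify whether counts were tallied per individual or per allele, and the counts in the usage note are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a -- cases carrying the variant       b -- cases without it
    c -- controls carrying the variant    d -- controls without it
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)
```

With hypothetical counts, `fisher_exact_two_sided(6, 12, 40, 760)` would test whether a variant carried by 6 of 18 cases and 40 of 800 controls differs in frequency between the two groups. Because many SNVs are tested, a correction for multiple comparisons would normally be considered as well; the text does not state whether one was applied.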

Table 7 The 17 SNVs significantly different between case and control.
Figure 3: The protein-protein interaction among the 12 genes significantly different between case and control.

Discussion

The 2013 Chinese H7N9 influenza outbreak led to an estimated 48 deaths, with 33% mortality and significant morbidity in patients who survived the virus. An important observation during the recent H7N9 outbreak in China was the wide variation in host response to infection: some patients developed only mild upper respiratory tract infections, whereas others developed severe ARDS and died. Although several determinants of the host response to infection have been identified, many important genetic factors that dampen or exacerbate the host response to H7N9 infection likely remain undiscovered. Previous studies suggested that genetic mutations in the protein machinery that makes up key host defense mechanisms could affect the outcome of influenza infection12. Differential susceptibility to influenza A(H7N9) has been linked to functional variants of LGALS1 that alter its expression13. The H7N9 influenza outbreak in China provides a unique opportunity to study mutations in this machinery, because many poultry workers were exposed to the virus, yet comparatively few became infected. This may suggest that genetic mutations in host defense mechanisms are responsible for the selectivity of H7N9 infection. Others have identified genes that are protective during influenza infection14, including MX1, NCR1, CCR5, IFITM3 and IL10. Mutations in these genes may lead to increased host susceptibility to infection or to a heightened, and potentially deleterious, host response to infection. We hypothesized that exome sequencing of these patients might reveal genetic mutations that increase susceptibility to viral infection and that, in the future, these mutations could provide information about the risk of infection, especially for poultry workers or family members of infected patients.

Using a variety of computational genetic techniques, we identified 21 genes that showed a high rate of mutation in patients infected with H7N9 when compared to the general population. Some of these genes have been identified in prior studies of H7N9 susceptibility genes14. For example, Wang et al. reported that IFITM3 dysfunction is associated with increased cytokine production during H7N9 infection and is correlated with mortality14. IFITM3 (chr11, 320772, A > G) was reported to be enriched in patients hospitalized due to H1N1/09 infection15. Polymorphisms of CPT2, which encodes carnitine palmitoyltransferase 2, were found in patients suffering from influenza-associated encephalopathy; overexpression of CPT2 variants in vitro suggested that the variants were heat-labile and failed to perform optimally during fever16,17. Four disease outcome-associated SNPs were identified on chromosome 17 (RPAIN and C1QBP), chromosome 1 (FCGR2A), and chromosome 3 (unknown gene). C1QBP and FCGR2A play roles in the formation of immune complexes and complement activation, suggesting that the severe disease outcome of H1N1 infection may result from an enhanced host immune response12,16.

For the 21 genes we identified, we used the online tool DAPPLE to perform a PPI analysis and found that 5 proteins participate directly in the PPI network: LEP, IFNAR1, IL10RB, HLA-DQA1 and HLA-DQB1. The PPI analysis suggests a significant role for these proteins in influenza infection and may point to targets for interventional therapy.

The primary limitation of this study is the relatively small sample size. Only 18 patients were enrolled, and confirmation of these findings in subsequent studies will be needed. We plan to collect more samples for further sequencing.

Conclusion

Using comparative genetic analysis in 18 patients with confirmed H7N9 viral infection in China, we identified mutations in 21 genes that occurred at a higher rate in infected patients than in the general population. Many of the identified genes are involved in key host defense mechanisms, which gives strong biologic plausibility to a role for these genes in both host susceptibility to infection and host immune response-related pathology. Further investigation of the function of these genes in host susceptibility may help identify individuals who are at high risk for infection. In addition, translational research into the function of the genes identified in this study may provide new potential therapeutic targets for influenza virus infection.

Additional Information

How to cite this article: Chen, C. et al. Multiple gene mutations identified in patients infected with influenza A (H7N9) virus. Sci. Rep. 6, 25614; doi: 10.1038/srep25614 (2016).