Genetic mutations in NF-κB pathway genes were associated with the protection from hepatitis C virus infection among Chinese Han population

Host genetic polymorphism is one of the major unalterable factors influencing HCV infection. NF-κB proteins play multiple roles in the immune response and are involved in HCV infection and progression. This study was conducted to explore the relationship between single nucleotide polymorphisms (SNPs) in the NF-κB pathway and the susceptibility to, as well as the resolution of, HCV infection. A total of 1642 Chinese subjects were enrolled, including 963 uninfected controls, 231 subjects with spontaneous viral clearance and 448 patients with persistent HCV infection, and four SNPs (Rel rs842647, NF-κB2 rs12769316, RelA rs7101916, RelB rs28372683) were genotyped by TaqMan assay. Potentially functional polymorphisms were analyzed using online bioinformatics tools. Logistic regression analyses indicated that the RelA rs7101916 T allele (PBonferroni = 0.016) and the RelB rs28372683 A allele (PBonferroni = 4.8e-5) were associated with a decreased risk of HCV infection among the Chinese Han population, consistent with the results of the cumulative effects and haplotype analyses. In silico analysis of SNP function suggested that genetic variation at rs7101916 and rs28372683 could influence gene transcriptional regulation and expression, subsequently affecting NF-κB pathway activation and the susceptibility to HCV infection. This study is the first to report that carriage of RelA rs7101916 T or RelB rs28372683 A is a potential protective factor against HCV infection among the Chinese population.

Study population. This case-control study included a total of 1642 subjects, comprising 693 hemodialysis (HD) subjects recruited from nine hospital hemodialysis centers in southern China from October 2008 to May 2015 and 949 paid blood donors recruited from six villages within Zhenjiang City from October 2008 to September 2016. Subjects who were co-infected with hepatitis B virus (HBV) or human immunodeficiency virus (HIV), who suffered from autoimmune, alcoholic or metabolic liver diseases, or who had received polyethylene glycol (Peg) IFN-α plus ribavirin (RBV) or oral direct-acting antiviral (DAA) treatment were excluded from this study. The flowchart of the selection of patients included in the study is shown in Fig. 1.
The subjects were divided into three groups according to their anti-HCV and HCV RNA results. Group A comprised HCV-uninfected controls who were seronegative for anti-HCV and negative for HCV RNA. Group B comprised spontaneous clearance subjects who were seropositive for anti-HCV and negative for HCV RNA. Group C comprised persistent infection patients who were seropositive for anti-HCV and positive for HCV RNA. Individuals in group B or group C were defined as infected individuals. All serologic test results were verified by three separate experiments within 12 consecutive months. The control subjects (group A) were matched to the infected individuals (group B or group C) by age (5-year interval), gender and village of recruitment.
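The grouping rule above can be written as a small function. This is only an illustrative sketch: the name `classify_subject` and the boolean inputs are assumptions, not part of the study's workflow. The text does not describe subjects who are negative for anti-HCV but positive for HCV RNA, so the sketch rejects that combination rather than guessing.

```python
def classify_subject(anti_hcv_positive: bool, hcv_rna_positive: bool) -> str:
    """Return 'A', 'B' or 'C' following the group definitions in the text.

    Hypothetical helper: the argument names are illustrative only.
    """
    if anti_hcv_positive and hcv_rna_positive:
        return "C"  # persistent infection
    if anti_hcv_positive:
        return "B"  # spontaneous clearance (antibody positive, RNA negative)
    if not hcv_rna_positive:
        return "A"  # uninfected control (both negative)
    # Antibody negative but RNA positive is not covered by the text.
    raise ValueError("combination not defined by the study's grouping rule")


def is_infected(group: str) -> bool:
    """Groups B and C are defined as infected individuals."""
    return group in ("B", "C")


if __name__ == "__main__":
    for anti, rna in [(False, False), (True, False), (True, True)]:
        group = classify_subject(anti, rna)
        print(anti, rna, group, is_infected(group))
```

The susceptibility comparison then contrasts group A with groups B and C combined, and the clearance comparison contrasts group B with group C.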
Structured questionnaires were administered by trained interviewers in an interview with every participant to collect demographic information, environmental exposure history and the medical history of HCV infection. A quality control program was established to guarantee the reliability of all data obtained.
Viral testing. Venous blood samples were drawn from each participant after the interview and stored at −80 °C after centrifugation until the experiment. Anti-HCV antibodies were detected using a third-generation enzyme-linked immunosorbent assay (ELISA: Diagnostic Kit for Antibody to HCV 3.0 ELISA, Intec Products Inc, Xiamen, China) according to the manufacturer's instructions. HCV RNA was extracted using Trizol LS reagent (Takara Biotech, Tokyo, Japan), and reverse transcription polymerase chain reaction (PCR; Takara Biotech) was performed. The Murex HCV Serotyping 1-6 Assay ELISA kit (Abbott, Wiesbaden, Germany) was used to detect type-specific antibodies of various HCV genotypes 30.
SNPs selection. The tagSNP (Rel rs842647) was selected using Haploview software (version 4.2; Broad Institute, Cambridge, MA, USA) on the basis of HapMap Phase II CHB (Chinese in Beijing) data obtained from 1000 Genomes Project resources (http://www.1000genomes.org/). RelA rs7101916, NF-κB2 rs12769316 and RelB rs28372683 were selected based on functional prediction of possible transcription factor binding sites or miRNA binding sites using National Institute of Environmental Health Sciences (NIEHS) (https://www.niehs.nih.gov/) and RegulomeDB (http://www.regulomedb.org/) resources, and on related literature in which the SNPs were reported to be associated with immune-related diseases. In addition, the minor allele frequency (MAF) of each candidate SNP had to be more than 5% among the Chinese Han population.
Genotyping assays. Genomic DNA was extracted from leukocytes in subjects' blood samples using proteinase K digestion, phenol-chloroform extraction and ethanol precipitation. Before genotyping, we used an ultraviolet spectrophotometer (UV-2700220V CH) to measure the concentration and purity of DNA, and excluded samples with OD260/OD280 < 1.75 or a concentration < 100 ng/µl. Candidate SNPs were genotyped using the TaqMan allelic discrimination assay on an ABI 7900HT Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). The primer and probe sequences for the candidate SNPs are shown in Supplementary Table 1. The experimenters were blinded to the subjects' demographic and clinical data. For quality control, two negative controls and two positive controls were included in each 384-well plate to identify reagent or system contamination. The genotyping success rates for the four SNPs were all above 99.8%, and a 100% concordance rate was observed in a randomly selected 10% of samples that were tested again. The genotyping results were analyzed using Sequence Detection System software (SDS, version 2.3; Applied Biosystems, Foster City, CA, USA).
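The DNA exclusion rule can be sketched as a simple filter. `DnaSample`, `passes_qc` and the sample values below are hypothetical, and the sketch assumes that the 100 ng/µl threshold refers to DNA concentration.

```python
from dataclasses import dataclass
from typing import List

MIN_PURITY_RATIO = 1.75      # OD260/OD280 threshold stated in the text
MIN_CONC_NG_PER_UL = 100.0   # assumed to be a DNA concentration threshold


@dataclass(frozen=True)
class DnaSample:
    """Hypothetical record of one spectrophotometer reading."""
    sample_id: str
    od260_od280: float
    conc_ng_per_ul: float


def passes_qc(sample: DnaSample) -> bool:
    """A sample is kept only if both purity and concentration are adequate."""
    return (sample.od260_od280 >= MIN_PURITY_RATIO
            and sample.conc_ng_per_ul >= MIN_CONC_NG_PER_UL)


def filter_samples(samples: List[DnaSample]) -> List[DnaSample]:
    """Return the samples that would proceed to genotyping."""
    return [s for s in samples if passes_qc(s)]


if __name__ == "__main__":
    demo = [
        DnaSample("S1", 1.82, 150.0),  # illustrative values, not study data
        DnaSample("S2", 1.60, 220.0),
        DnaSample("S3", 1.90, 80.0),
    ]
    print([s.sample_id for s in filter_samples(demo)])
```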
In silico analysis. The SNPInfo web server (http://snpinfo.niehs.nih.gov/) was run on data from the HapMap Han Chinese in Beijing (CHB) population to predict the impact of potentially functional polymorphisms on genes. The Vienna RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi) was used to predict the secondary structures of single-stranded RNA sequences based on the latest ViennaRNA Package (Version 2.4.4). The secondary structure with the minimum free energy (MFE; i.e. the MFE structure) or the minimal base pair distance (i.e. the centroid secondary structure) was computed and drawn to characterize and compare the influence of the wild type and mutant type of each polymorphic site on the secondary structure of the gene. We also carried out expression quantitative trait loci (eQTL) analysis using the public Genotype-Tissue Expression (GTEx) Portal (http://www.gtexportal.org/home) to identify whether each SNP was associated with differential expression in different tissues.
Statistical analysis. All data were analyzed using Statistical Package for the Social Sciences software (SPSS: version 21.0.0.0; SPSS Institute, Chicago, IL, USA) and Statistical Analysis System software (SAS: version 9.4; SAS Institute, Cary, NC, USA). Demographic data, clinical characteristics and genotype distributions of subjects were compared using one-way analysis of variance (ANOVA), the χ2 test or the Kruskal-Wallis test where appropriate. A goodness-of-fit χ2 test was used to assess Hardy-Weinberg equilibrium (HWE) for each SNP among the controls. Linkage disequilibrium (LD) parameters (r2 and D′) were calculated using Haploview software (version 4.2; Broad Institute, Cambridge, MA, USA). The associations of SNPs with the susceptibility to HCV infection or the spontaneous clearance of HCV infection were estimated by comparing group A vs. group (B + C) or group B vs. group C, respectively, and were calculated by binary logistic regression analysis, unadjusted or adjusted for age, gender and route of infection, and were expressed as odds ratios (ORs) and 95% confidence intervals (CIs) using

Results
Basic characteristics. The demographics and clinical characteristics of all subjects are presented in Table 1.
No significant age or gender difference was found among the three groups (all P > 0.05) or between any two groups (all P > 0.05, data not shown). However, alanine aminotransferase (ALT) levels, aspartate aminotransferase (AST) levels, routes of infection and HCV genotype differed significantly among the three groups (all P < 0.01) and between any two groups (all P < 0.01, data not shown). The genotype distribution of the four SNPs was in accordance with Hardy-Weinberg equilibrium expectations in the control group (group A: P = 0.120 for rs842647, P = 0.271 for rs7101916, P = 0.153 for rs12769316, P = 0.603 for rs28372683).
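For readers unfamiliar with the Hardy-Weinberg check reported above, the following is a minimal sketch of a goodness-of-fit χ2 test with one degree of freedom. The function name and the genotype counts are illustrative assumptions; the study's own genotype counts are not reproduced here, and the authors' software may apply the test differently.

```python
import math
from typing import Tuple


def hwe_chi_square(n_major_hom: int, n_het: int, n_minor_hom: int) -> Tuple[float, float]:
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_major_hom + n_het + n_minor_hom
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency of the major allele, estimated from genotype counts.
    p = (2 * n_major_hom + n_het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control genotype counts, not taken from the paper.
    chi2, p_value = hwe_chi_square(600, 310, 53)
    print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above the chosen threshold (conventionally 0.05) gives no evidence of departure from equilibrium, which is how the control-group P values above are read.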

Association between NF-κB genes polymorphisms and the susceptibility to HCV infection.
The genotype distributions of SNPs rs842647, rs7101916, rs12769316 and rs28372683 among the three groups are shown in Table 2. The allelic and genotypic frequencies of rs7101916 and rs28372683 differed significantly across the three groups (P < 0.05), whereas no significant difference was seen in the distribution of rs842647 and rs12769316 genotypes. Only the dominant genetic model was used to analyze the association between SNPs and the susceptibility to HCV infection, because the number of subjects carrying the minor allele homozygote of two SNPs (rs842647 and rs12769316) in some groups was too small (<10) for adequately powered statistical analysis. As shown in Table 3, both before and after adjusting for gender, age and route of infection, logistic regression analysis showed that, compared with the major allele homozygote (rs7101916 CC genotype or rs28372683 CC genotype), carriage of the rs7101916 T allele or the rs28372683 A allele was associated with a decreased risk of HCV infection (rs7101916: OR = 0.728, 95% CI = 0.588-0.901, P = 0.004; rs28372683: OR = 0.499, 95% CI = 0.366-0.681, P = 1.2e-5), and both associations remained significant after Bonferroni correction for multiple comparisons (rs7101916: P = 0.016; rs28372683: P = 4.8e-5).
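The two quantities reported in this paragraph can be illustrated with a short sketch. It assumes an unadjusted 2x2 table with a Wald confidence interval, which is a simplification of the adjusted logistic regression used in the study, and the counts in the usage example are hypothetical. The Bonferroni step assumes the correction factor is the number of SNPs tested (four); multiplying the reported P values by four does reproduce the corrected values given above.

```python
import math
from typing import Tuple


def odds_ratio_wald(carrier_cases: int, noncarrier_cases: int,
                    carrier_controls: int, noncarrier_controls: int,
                    z: float = 1.959964) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald 95% CI under a dominant model.

    Carriers have at least one minor allele; non-carriers are major-allele
    homozygotes. Returns (odds_ratio, lower, upper).
    """
    cells = (carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (carrier_cases * noncarrier_controls) / (noncarrier_cases * carrier_controls)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical counts, not taken from Table 2 or Table 3.
    print(odds_ratio_wald(200, 479, 360, 603))
    # Reported raw P values corrected for the four SNPs tested.
    print(bonferroni(0.004, 4), bonferroni(1.2e-5, 4))
```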
Further stratification analysis (Table 4) indicated that, compared with the major allele homozygote (rs7101916 CC genotype or rs28372683 CC genotype), carriage of the rs7101916 T allele was associated with a significantly decreased risk in the male subgroup (OR = 0.734, 95% CI = 0.572-0.941, P = 0.015), and carriage of the rs28372683 A allele was associated with a significantly decreased risk in all subgroups (all P < 0.05, Table 4), after adjusting for gender, age and route of infection.

Association between NF-κB polymorphisms and spontaneous clearance of HCV infection.
No association was found between the four SNPs (rs842647, rs7101916, rs12769316 and rs28372683) and spontaneous clearance of HCV in our logistic regression analysis using the dominant model, adjusting for gender, age, route of infection and HCV genotype (all P > 0.05, Table 3), or in the further stratification analysis (all P > 0.05, Table 4).

Cumulative effects of rs7101916 and rs28372683 on the susceptibility of HCV infection.
The analysis of combined protective alleles (rs7101916 T allele and rs28372683 A allele) suggested that carrying 1-3 protective alleles was associated with a decreased risk of HCV infection compared with carrying no protective allele (all P < 0.05, Table 5). Additionally, the Cochran-Armitage trend test indicated that the risk of HCV infection tended to decrease with these two SNPs (OR = 0.641, 95% CI = 0.514-0.799, P = 7.9e-5, Table 5). Moreover, analysis of the effect of the combined protective genotypes (rs7101916 TT and rs28372683 AA) on the susceptibility to HCV infection (Table 6) showed that carrying one protective genotype conferred a significant protective effect (OR = 0.696, 95% CI = 0.492-0.984, P = 0.04).
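The Cochran-Armitage trend test used for the cumulative allele analysis can be sketched as follows. This is a textbook form of the test with equally spaced scores; the function name is an assumption, the counts in the usage example are hypothetical, and the statistical packages used in the study may apply a slightly different variance estimate.

```python
import math
from typing import Optional, Sequence, Tuple


def cochran_armitage_trend(cases: Sequence[int], controls: Sequence[int],
                           scores: Optional[Sequence[float]] = None) -> Tuple[float, float]:
    """Cochran-Armitage test for trend in proportions.

    cases[i] and controls[i] are counts in ordered category i (for example,
    the number of protective alleles carried). Returns (z, two_sided_p).
    """
    if len(cases) != len(controls) or len(cases) < 2:
        raise ValueError("need matching case/control counts for >= 2 categories")
    if scores is None:
        scores = list(range(len(cases)))  # 0, 1, 2, ... by default
    if len(scores) != len(cases):
        raise ValueError("one score per category is required")
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    if n == 0:
        raise ValueError("empty table")
    p_bar = sum(cases) / n  # overall proportion of cases
    # Score-weighted deviation of observed cases from their expectation.
    t_stat = sum(t * (r - m * p_bar) for t, r, m in zip(scores, cases, totals))
    sum_t = sum(t * m for t, m in zip(scores, totals))
    sum_t2 = sum(t * t * m for t, m in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_t2 - sum_t ** 2 / n)
    if variance <= 0:
        raise ValueError("variance is zero; trend test is undefined")
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts for 0, 1, 2 and 3 protective alleles.
    z, p = cochran_armitage_trend([420, 200, 50, 9], [500, 330, 110, 23])
    print(f"z = {z:.3f}, P = {p:.2e}")
```

A negative z in this orientation means the proportion of cases falls as the number of protective alleles rises.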

Haplotype analysis of rs7101916 and rs28372683 on the susceptibility of HCV infection.
The two-locus haplotypes consisted of the rs7101916 and rs28372683 variant alleles. Compared with the most frequent CC haplotype, the TC and CA haplotypes were significantly associated with a decreased risk of HCV infection (TC haplotype: OR = 0.68, 95% CI = 0.568-0.813, P < 0.001; CA haplotype: OR = 0.604, 95% CI = 0.425-0.858, P = 0.005) (Table 7).
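As background to the two-locus haplotypes, the linkage disequilibrium parameters named in the statistical analysis (D′ and r2) can be derived from haplotype frequencies. The sketch below uses the standard definitions; the function name and the frequencies in the usage example are hypothetical, and Haploview's own estimation from unphased genotypes is more involved than this.

```python
from typing import Tuple


def ld_from_haplotypes(f_cc: float, f_ca: float, f_tc: float, f_ta: float) -> Tuple[float, float]:
    """Compute (|D'|, r2) from the four two-locus haplotype frequencies.

    The first letter is the allele at the first locus (C/T) and the second
    letter is the allele at the second locus (C/A).
    """
    freqs = (f_cc, f_ca, f_tc, f_ta)
    if any(f < 0 for f in freqs) or abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must be non-negative and sum to 1")
    p1 = f_cc + f_ca  # frequency of C at the first locus
    p2 = f_cc + f_tc  # frequency of C at the second locus
    denom = p1 * (1.0 - p1) * p2 * (1.0 - p2)
    if denom <= 0:
        raise ValueError("at least one locus is monomorphic")
    d = f_cc - p1 * p2  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p1 * (1.0 - p2), (1.0 - p1) * p2)
    else:
        d_max = min(p1 * p2, (1.0 - p1) * (1.0 - p2))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom
    return d_prime, r2


if __name__ == "__main__":
    # Hypothetical haplotype frequencies, not taken from Table 7.
    print(ld_from_haplotypes(0.60, 0.08, 0.30, 0.02))
```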
In silico analysis of SNPs function. Rs7101916 is located near the 5′ end of the RelA gene, which contains 11 exons and is mapped to chromosome 11q13.1. Using the SNPInfo web server, rs7101916 was predicted to be a transcription factor binding site (TFBS). Given its location, this variant could be involved in altering the binding of transcription factors and thereby mediating transcriptional regulation. Therefore, RelA RNA secondary structures with the rs7101916 major allele or minor allele were further predicted through energy minimization using the RNAfold web server, based on the latest ViennaRNA Package (Version 2.4.6). The local structure changes are shown in Fig. 2. The minimum free energy of the centroid secondary structure (a structure with minimal base pair distance) for the minor T allele of rs7101916 (−49.90 kcal/mol) was lower than that for the major C allele (−44.70 kcal/mol). The eQTL analysis indicated that different genotypes of rs7101916 were associated with differential mRNA levels of ribonuclease H2 subunit C (RNASEH2C) in transformed fibroblast cells, which could influence the liver fibrosis process after HCV infection.
Rs28372683 is located in the 3′-untranslated region (3′-UTR) of the RelB gene, which contains 12 exons and is mapped to chromosome 19q13.32. According to the SNPinfo web server, rs28372683 was predicted to be a TFBS or a microRNA (miRNA) (hsa-miR-1224-3p, hsa-miR-532-3p) binding site, or to be involved in exonic splicing (enhancer or silencer). Considering the possible effect of rs28372683 variation on the regulation of RelB gene expression at the translational or post-translational level, the RelB RNA secondary structure was further predicted using the RNAfold web server. The local structure changes are shown in Fig. 3. The minimum free energy of the centroid secondary structure for the mutant A allele of rs28372683 (corresponding to the U allele in Fig. 3, −35.00 kcal/mol) was lower than that for the wild C allele (corresponding to the G allele in Fig. 3, −24.30 kcal/mol).
Rs12769316 is located near the 5′ end of the NF-κB2 gene, which contains 25 exons and is mapped to chromosome 10q24.32. Using the SNPinfo web server, rs12769316 was predicted to be a TFBS and could be involved in transcriptional regulation. We therefore also analyzed the NF-κB2 RNA secondary structure using the RNAfold web server. However, no difference between the wild type and mutant type was found (Supplementary Fig. 1). The minimum free energy of the centroid secondary structure for the wild C allele of rs12769316 (corresponding to the G allele in Supplementary Fig. 1, −23.40 kcal/mol) was nearly identical to that for the mutant T allele (corresponding to the A allele in Supplementary Fig. 1, −24.40 kcal/mol). Therefore, rs12769316 may not influence the transcriptional control of NF-κB2 gene expression or HCV infection.

Discussion
Our study is the first to indicate that the RelA rs7101916 T allele and the RelB rs28372683 A allele are associated with a decreased risk of HCV infection among the Chinese Han population. Host genetic background strongly influences the susceptibility and the response to HCV infection. Our previous studies, as well as studies by others, have shown that many genetic variants affect the immune response to HCV infection and are related to different disease outcomes, such as RelA 19 , Toll-like receptor 7 31,32 , interleukin-18 33 , human leukocyte antigen class II [33][34][35][36] , vitamin D receptor 37,38 and estrogen receptor α 39 .
In this study, the mutant T allele of RelA rs7101916 (located near the 5′ end of the gene) was found to be linked to protection from HCV infection. The RelA gene encodes the transcription factor p65, also known as the NF-κB p65 subunit, which forms the heterodimeric p65-p50 complex, the most abundant of the NF-κB complexes 11 . According to the in silico analysis in this study, the minimum free energy of the local structure for the T allele of rs7101916 was lower than that for the C allele. Because rs7101916 is located near the 5′ end of the RelA gene, it may be a TFBS and influence the binding of transcription factors (according to SNPInfo) and hence transcriptional regulation 40,41 . In addition, according to information from UCSC (http://www.genome.ucsc.edu) and the results of the eQTL analysis, this SNP is located in a regulatory element region and may affect gene transcription and expression and influence the liver fibrosis process after HCV infection, in addition to its association with HCV susceptibility. Therefore, genetic variation at rs7101916 may influence RelA transcriptional regulation and gene expression, subsequently affecting NF-κB pathway activation and the susceptibility to HCV infection.
In our study, the mutant A allele of RelB rs28372683 (located in the 3′-UTR of the gene) was associated with protection from HCV infection. RelB is an important arm of the RelB/p52 NF-κB complex in non-canonical NF-κB pathway activation, which occurs through a mechanism dependent on inducible processing of p100 11,15 . According to the results of the in silico analysis, the rs28372683 site may not only bind transcription factors or miRNAs but also act as an exonic splicing (enhancer or silencer) site. In addition, the minimum free energy of the centroid secondary structure for the A allele of rs28372683 was lower than that for the C allele. Therefore, genetic variation at RelB rs28372683 may be associated with NF-κB pathway activation and the susceptibility to HCV infection through these potential functional mechanisms.
The results of the cumulative effects and haplotype analyses suggested that the more protective alleles (rs7101916 T and rs28372683 A) a subject carried, the greater the protection against HCV infection, although some groups with negative results contained too few subjects to provide sufficient statistical power.
No significant relationship was found between rs842647 or rs12769316 and spontaneous clearance of HCV infection. Rs842647 is located in an intron of the Rel gene. Introns are transcribed into the precursor RNA but are spliced out before the RNA leaves the nucleus for translation, and are therefore absent from the mature RNA molecule. Rs12769316 is located 1.5 kb upstream of the 5′ end of the NF-κB2 gene, was found to be related to NF-κB2 protein and mRNA expression, and was predicted to be a TFBS. However, we did not find a correlation between rs12769316 and HCV clearance. The reason may be that the SNP has no effect in our ethnic group and that the number of subjects included in the study was insufficient. More studies should be performed to confirm this result.
Figure 2. Changes in the local structure were illustrated by the RNAfold Web Server. The arrow indicates the position of the mutation (50 bases upstream and 50 bases downstream from the mutation). The minimum free energies of the mRNA centroid secondary structure (a structure with minimal base pair distance) for wild-type and mutant rs7101916 were estimated to be −44.70 kcal/mol (Fig. 2A) and −49.90 kcal/mol (Fig. 2B), respectively. The wild-type and mutant-type sequences are listed below. Underlined type indicates the overlapping nucleotide letters that are unreadable in
Several limitations of our study merit consideration. First, it was difficult to ascertain the subjects' exact age at, and the time of, initial HCV infection, both of which may affect the immune response and the outcome of HCV infection. Second, HCV genotype data were partially missing among spontaneous clearance subjects and persistent infection patients, which could make the analysis of the association between NF-κB polymorphisms and spontaneous clearance of HCV infection unreliable, especially the comparison of genotype distributions between spontaneous clearance subjects and persistent infection patients. Therefore, more studies are needed to confirm the negative results for the associations between the four SNPs and spontaneous clearance of HCV infection in this study. Direct-acting antiviral (DAA) regimens have transformed the treatment of HCV, whereas prediction of the treatment response to IFN-based regimens remains far from adequate. However, the new therapy has not been used extensively in developing countries like China because of its unknown adverse effects and high cost. As before, the PEG-IFN/RBV regimen is still the first-line treatment for patients with HCV genotype 1 infection in China. Future studies should focus on the relationship between genetic polymorphisms and DAA treatment response.
In conclusion, this study is the first to report that carriage of RelA rs7101916 T or RelB rs28372683 A is a potential protective factor against HCV infection among the Chinese population. These findings can serve as a reference for preventive, predictive and therapeutic strategies for HCV infection. Additionally, further epidemiological and functional research on NF-κB pathway genetic variants is required.
Figure 3. The influence of rs28372683 on mRNA centroid secondary structures of RelB in the 3′-UTR region. Changes in the local structure were illustrated by the RNAfold Web Server. The arrow indicates the position of the mutation (60 bases upstream and 60 bases downstream from the mutation). The minimum free energies of the mRNA centroid secondary structure (a structure with minimal base pair distance) for wild-type and mutant rs28372683 were estimated to be −24.30 kcal/mol (Fig. 3A) and −35.00 kcal/mol (Fig. 3B), respectively. The wild-type and mutant-type sequences are listed below. The underlined bold type indicates the nucleotide difference between the wild and mutant allele.
Wild-type sequence: AGAUUGUACAUAUGGGAGGAGGGGGCAGAUUCCUGGCCCUCCCUCCCCAGACUUGAAGGUGGGGGGUAGGUUGGUUGUUCAGAGUCUUCCCAAUAAAGAUGAGUUUUUGAGCCUCCGGGGU
Mutant-type sequence: AGAUUGUACAUAUGGGAGGAGGGGGCAGAUUCCUGGCCCUCCCUCCCCAGACUUGAAGGUUGGGGGUAGGUUGGUUGUUCAGAGUCUUCCCAAUAAAGAUGAGUUUUUGAGCCUCCGGGGU