Introduction

Hepatocellular carcinoma (HCC), one of the most common cancers worldwide, ranks as the third leading cause of cancer-related death and is the second most lethal cancer after pancreatic ductal adenocarcinoma1,2. Most HCC cases occur in East and Southeast Asia and in sub-Saharan Africa; China alone accounts for 55% of newly diagnosed cases worldwide (approximately 400,000 cases)1,3. Hepatocarcinogenesis is driven by diverse aetiologies involving both host genetic and environmental factors. Established environmental risk factors for HCC include chronic hepatitis B virus (HBV) and hepatitis C virus (HCV) infections, chronic alcohol consumption and aflatoxin-B1-contaminated food4. Among these, HBV infection is considered the major aetiology of HCC and is significantly associated with both HCC incidence and mortality, especially in China5. Because only a fraction of exposed individuals eventually develop HCC during their lifetimes, and because HCC clusters within families, a genetic component is likely to explain part of the observed variation in individual susceptibility to HCC.

The mechanisms underlying genetic susceptibility to HCC have been extensively investigated, and the recent surge of genetic association studies, first using candidate gene and subsequently genome-wide association (GWA) approaches, has rapidly advanced our understanding of the genetic basis of HCC. However, the variants identified so far, many with poorly understood functions, are estimated to account for only a portion of the heritability of HCC, which hampers full elucidation of its genetic architecture. SWI/SNF (mating type switch/sucrose non-fermenting) complexes, which regulate gene transcription through ATP-dependent nucleosome remodeling, have been shown to play a widespread role in tumor suppression6. SWI/SNF complexes consist of at least 14 subunits encoded by 28 genes, and frequent inactivating mutations in subunits including ARID1A, BRD7, PBRM1, SMARCA4 and SNF5 (also known as SMARCB1) have been repeatedly identified in a variety of cancers6. Notably, evidence from exome and whole-genome sequencing studies of HCC has drawn attention to the contribution of somatic mutations in SWI/SNF chromatin remodelling complexes to hepatocarcinogenesis7,8,9,10. Similar genetic characteristics of SWI/SNF complexes have also been observed in exome and whole-genome sequence data from other cancers, including renal carcinoma, pancreatic cancer and gastric cancer11,12,13. Intriguingly, a recent study employing proteomic and bioinformatic analysis further established the substantial role of SWI/SNF complexes in human malignancy and indicated that specific subunits protect against cancer in specific tissues14. The core subunit SNF5, which is inactivated in nearly all malignant rhabdoid tumors, has tumour suppressor properties that have been elucidated in genetically engineered mouse models15. In addition, the ARID1A subunit has repeatedly been found to be mutated in various human cancers and is involved in the repression of key cell cycle regulators16,17. Furthermore, previous studies indicated that the catalytic subunit SMARCA4 (BRG1) is frequently inactivated in a variety of cell lines and primary tumors18. Evidence also indicates that reduced expression or haploinsufficiency of SMARCA4 may play a role in driving cancer formation19.

Given the mounting biological evidence and the new observations from exome and whole-genome sequencing studies, we hypothesized that multiple common genetic variants in subunits of SWI/SNF complexes may modulate expression levels and/or protein structures and thereby contribute to HCC susceptibility. We therefore carried out a two-stage case-control study with a total of 1003 cases and 1032 controls to comprehensively examine whether single nucleotide polymorphisms (SNPs) in six candidate genes of the SWI/SNF complexes are associated with the risk of HCC in a Chinese population.

Results

Subject characteristics

Table 1 summarizes the baseline characteristics of the participants included in the two-stage study. In both stages, the cases and controls were well matched on the distribution of sex and age. A significant difference in smoking status was observed in both stages, with ORs of 1.89 (95% CI: 1.42–2.53), 1.38 (95% CI: 1.06–1.81) and 1.59 (95% CI: 1.31–1.93) for smokers in stage 1, stage 2 and the combined analysis, respectively. Drinking status showed a borderline significant association with the risk of HCC in both stages, with an OR of 1.28 (95% CI: 1.06–1.56) in the combined analysis. As expected, HBV infection was an important risk factor for HCC, with a higher prevalence of HBsAg among cases (67.9%) than among controls (6.2%) in the combined analysis.

Table 1 Characteristics of participants in the two-stage case-control study

Association between individual SNP and HCC risk

The detailed information for the 27 tagSNPs included in stage 1 is shown in Supplementary table 1. Three SNPs were excluded from further analysis: rs34502618 (ARID1A) and rs3786725 (SMARCA4), which deviated significantly from Hardy-Weinberg equilibrium (HWE) in controls (FDR-P < 0.05), and rs12685 (ARID1A), which had a minor allele frequency (MAF) < 0.05. None of the remaining SNPs showed significantly different missing data rates between cases and controls (P > 0.001). In the single-locus analysis, only two SNPs, SMARCA4 rs11879293 and rs2072382, were significantly associated with HCC risk (P values for the Cochran-Armitage trend test = 0.001 and 0.024, respectively). In addition, SMARCB1 rs2267032 showed a borderline association with the risk of HCC (Ptrend = 0.059). However, after FDR correction for multiple comparisons (24 single tests), only rs11879293 remained significantly associated with HCC risk (FDR-Ptrend = 0.024) (Supplementary table 2). The associations between all the candidate SNPs and risk of HCC, estimated by logistic regression after adjusting for age, sex, smoking status, drinking status and HBsAg, are presented in Supplementary table 3. Individuals carrying the SMARCA4 rs11879293 AG or AA genotype had an OR of 0.66 (95% CI: 0.51–0.86) compared with individuals with the GG genotype, and the rs11879293 A allele was associated with a significantly decreased risk of HCC under an additive model (OR = 0.72, 95% CI: 0.59–0.87). For SMARCA4 rs2072382, the variant CT or TT genotype was significantly associated with an increased risk of HCC under a dominant model (OR = 1.33, 95% CI: 1.03–1.73) compared with the wild-type CC genotype.
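
The Cochran-Armitage trend test used in the single-locus analysis can be illustrated with a short Python sketch. It is not the authors' SAS code: the function name `cochran_armitage_trend`, the additive genotype scores (0, 1, 2) and the genotype counts are assumptions chosen for illustration, because the per-genotype counts are reported only in the supplementary tables.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x k table of genotype counts.

    case_counts / control_counts: counts per genotype, ordered by the number
    of variant alleles (e.g. GG, AG, AA). Returns (z, two-sided P value).
    """
    n_cases = sum(case_counts)
    n_controls = sum(control_counts)
    n = n_cases + n_controls
    col_totals = [a + b for a, b in zip(case_counts, control_counts)]

    # Score-weighted difference between observed and expected case counts.
    t = sum(s * (r - n_cases * c / n)
            for s, r, c in zip(scores, case_counts, col_totals))

    # Variance of t under the null hypothesis of no trend.
    sum_sc = sum(s * c for s, c in zip(scores, col_totals))
    sum_s2c = sum(s * s * c for s, c in zip(scores, col_totals))
    var_t = n_cases * n_controls / n ** 3 * (n * sum_s2c - sum_sc ** 2)

    z = t / math.sqrt(var_t)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return z, p_value


# Hypothetical genotype counts (GG, AG, AA); NOT taken from the study.
hypothetical_cases = (300, 170, 30)
hypothetical_controls = (250, 200, 50)
z_stat, p_trend = cochran_armitage_trend(hypothetical_cases, hypothetical_controls)
print(f"z = {z_stat:.3f}, Ptrend = {p_trend:.5f}")
```

A negative z indicates that variant alleles are less frequent in cases than in controls, which is the direction reported for the rs11879293 A allele.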

To limit the possibility of false negatives, three SNPs were chosen for stage 2 validation: SMARCA4 rs11879293, which remained significant after FDR correction, and SMARCA4 rs2072382 and SMARCB1 rs2267032, which showed significant or borderline associations before FDR correction. The genotype call rates for SMARCA4 rs11879293, rs2072382 and SMARCB1 rs2267032 in stage 2 were 98.7%, 99.1% and 99.4%, respectively. The genotype distributions of the three SNPs in controls conformed to HWE (P > 0.01). In agreement with stage 1, rs11879293 was associated with a significantly decreased risk of HCC in stage 2. In the combined analysis, rs11879293 was also significantly associated with HCC risk, with an OR of 0.73 (95% CI: 0.62–0.87) under an additive model (Table 2). For SMARCA4 rs2072382 and SMARCB1 rs2267032, no significant associations were observed in stage 2 (Supplementary table 4).
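
The HWE check applied to control genotypes is a goodness-of-fit χ2 test with one degree of freedom. The sketch below shows the calculation in Python; the function name `hwe_chi_square` and the genotype counts are hypothetical and serve only as an illustration, not as the study data.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_var):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP and returns
    (chi-square statistic, P value with 1 degree of freedom).
    """
    n = n_hom_ref + n_het + n_hom_var
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("SNP is monomorphic; HWE test is undefined")

    # Expected genotype counts under HWE: p^2, 2pq, q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_var)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts; NOT taken from the study.
chi2_stat, p_hwe = hwe_chi_square(300, 400, 300)
print(f"chi2 = {chi2_stat:.2f}, P = {p_hwe:.2e}")
```

A SNP would be flagged when the P value falls below the chosen threshold; the text uses P > 0.01 as the criterion for conformity in stage 2.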

Table 2 Genotypes of SMARCA4 rs11879293 and the association with risk of HCC in both stages

Stratified analyses of association between SMARCA4 rs11879293 and risk of HCC

The risk of HCC associated with SMARCA4 rs11879293 was further examined in analyses stratified by age, sex, smoking, drinking and HBV infection in the combined sample (Table 3). The decreased risk of HCC associated with the rs11879293 AG/AA genotypes was evident among males (OR = 0.74, 95% CI: 0.58–0.95) but of borderline significance among females (OR = 0.63, 95% CI: 0.39–1.00). Similarly, more evident associations were found among participants aged 60 years or younger, nondrinkers and especially HBsAg-positive individuals. In the HBsAg-positive subgroup, the association between SMARCA4 rs11879293 and HCC risk was stronger, with an OR of 0.47 (95% CI: 0.27–0.80), whereas no significant association was observed in the HBsAg-negative subgroup (OR = 0.81, 95% CI: 0.62–1.07). When stratified by smoking status, the risk of HCC associated with SMARCA4 rs11879293 did not differ significantly between nonsmokers and smokers. No significant heterogeneity was found between any of the subgroups (P > 0.05).
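
The text does not state which heterogeneity test was used. As a rough check, two stratum-specific ORs can be compared with a z test on the difference of their log ORs, with standard errors recovered from the reported 95% CIs. The Python sketch below applies this approximation to the two HBsAg strata; the helper names are hypothetical, and the result is only approximate because the published ORs and CIs are rounded.

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log OR and its standard error from a reported 95% CI."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se


def heterogeneity_two_strata(stratum_a, stratum_b):
    """z test for the difference between two independent log ORs.

    Each stratum is a tuple (OR, lower 95% limit, upper 95% limit).
    Returns (z, two-sided P value).
    """
    log_a, se_a = log_or_and_se(*stratum_a)
    log_b, se_b = log_or_and_se(*stratum_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Values reported in the text for rs11879293 AG/AA versus GG.
hbsag_positive = (0.47, 0.27, 0.80)
hbsag_negative = (0.81, 0.62, 1.07)
z_het, p_het = heterogeneity_two_strata(hbsag_positive, hbsag_negative)
print(f"z = {z_het:.2f}, P for heterogeneity = {p_het:.3f}")
```

With these rounded inputs the approximation gives a P value above 0.05, in line with the absence of significant heterogeneity reported for the subgroups.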

Table 3 Risk of HCC associated with SMARCA4 rs11879293 genotypes stratified by age, sex, smoking, drinking status and HBV infection in combined analysis

Bioinformatics-based approaches

To further refine the association signal and fine-map the causal variant, bioinformatics-based approaches were employed to assess the functional information of genetic variants; the results are shown in Supplementary table 5. First, 12 SNPs in strong linkage disequilibrium (LD) with rs11879293 (r2 > 0.8) were identified within the LD block using genotyping data from the 1000 Genomes Project (http://browser.1000genomes.org/Homo_sapiens/Info/Index; 1000GENOMES: phase_1_CHS, Southern Han Chinese). Next, three integrated bioinformatics tools, ‘F-SNP’20, ‘FASTSNP’21 and ‘SNPinfo’22, were applied to predict the potential function of these 12 SNPs and rs11879293. Using its decision tree, FASTSNP identified four possibly functional variants with risk scores. Among these, rs11879293 and rs4804556 were predicted to act as intronic enhancers (risk score 1–2, low to medium risk) that may alter a transcription factor binding site (TFBS) in the intronic region, while rs1019935 and rs11880865 are located in the promoter/regulatory region (risk score 1–3, low to medium risk) and may affect the level, location or timing of gene expression. SNPinfo indicated that nine SNPs may affect TFBS activity, with a difference of ≥ 0.2 between the two alleles in the matrix similarity score (MSS) or core similarity score (CSS). F-SNP identified seven SNPs that may influence transcriptional regulation; the detailed FS scores for each SNP are shown in the same table. In addition, the eQTL database (http://www.hsph.harvard.edu/liming-liang/software/eqtl/) was searched for potentially functional SNPs that may influence expression levels of the corresponding genes23. Eleven SNPs were cis-acting SNPs (within 1 Mb) associated with one heritable expression trait (Probe: GI_21071055-S) with a LOD score > 6.
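
The r2 > 0.8 criterion used to define SNPs in strong LD with rs11879293 can be made concrete with a small Python sketch. It computes r2 between two biallelic SNPs from phased haplotypes; the function name `ld_r_squared` and the haplotype data are hypothetical and do not come from the 1000 Genomes CHS panel.

```python
def ld_r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs from phased haplotypes.

    haplotypes: sequence of (allele_at_snp1, allele_at_snp2) pairs, with
    alleles coded 0 (reference) or 1 (variant).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n  # variant frequency at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n  # variant frequency at SNP 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n

    d = p_ab - p_a * p_b  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")
    return d * d / denom


# Hypothetical phased haplotypes for an index SNP and one neighbouring SNP.
hypothetical_haplotypes = (
    [(1, 1)] * 28 + [(1, 0)] * 2 + [(0, 1)] * 1 + [(0, 0)] * 69
)
r2 = ld_r_squared(hypothetical_haplotypes)
print(f"r2 = {r2:.3f}, strong LD: {r2 > 0.8}")
```

In practice the same quantity would be computed for every SNP in the region against the index SNP, keeping those above the threshold.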

Discussion

In this study, we conducted a two-stage case-control study to explore the association of genetic variants in SWI/SNF complexes with the risk of HCC in a Chinese population. SMARCA4 rs11879293 was consistently associated with the risk of HCC across both stages of the study. In addition, this study highlights the potential role of a genetic variant in SWI/SNF complexes, modulated by HBV infection, in the carcinogenesis of HCC.

With new discoveries emerging from human cancer genome projects, chromatin remodeling has been established as an important feature of cancer genomes, and SWI/SNF complexes are the major mutational target affecting chromatin remodeling24. Previous exome and whole-genome sequencing studies have identified chromatin remodeling complexes as important determinants of HCC based on somatic mutations, not germline variants. Therefore, it is not unexpected that the associations between germline variants in SWI/SNF complexes and risk of HCC in the present study were modest. SWI/SNF complexes modulate transcription by using the energy of ATP to remodel chromatin structure; this is the most studied effect of SWI/SNF activity but not the only mechanism contributing to carcinogenesis. Importantly, increasing evidence indicates that SWI/SNF complexes directly interact with numerous important proteins to modulate cancer formation25. Moreover, SWI/SNF complexes not only play an essential role in the activation of transcription but are also involved in transcriptional repression26. Furthermore, SWI/SNF complexes have emerged as tumor suppressors, as specific inactivating mutations in subunits of the complex have recently been identified in various cancers.

In this study, SMARCA4 rs11879293 was significantly associated with a decreased risk of HCC even after correction for multiple comparisons, and the association was validated in stage 2 and in the combined analysis. SMARCA4 is located in chromosomal region 19p13.2, and its protein is the central catalytic component of the SWI/SNF complexes. It is composed of multiple domains, including an evolutionarily conserved catalytic ATPase domain, a conserved C-terminal bromodomain, an AT-hook motif and a less well characterized N-terminal region, which play important roles in the recognition of modified histone proteins, DNA binding, or recruitment of SWI/SNF27,28. SMARCA4 interacts with a diverse group of nuclear proteins involved in various cellular processes including transcriptional regulation, cell cycle control, proliferation, DNA repair and recombination29. Notably, given its frequent inactivating mutations in a variety of tumor cell lines18, SMARCA4 has been proposed to be a tumor suppressor acting through diverse biological mechanisms. First, reintroduction of SMARCA4 into SMARCA4-deficient tumor cells resulted in Rb-dependent cell cycle arrest and a flattened morphology, suggesting that SMARCA4 may function as a tumor suppressor30,31. Second, SMARCA4 heterozygotes are predisposed to differentiated epithelial tumors, which supports an important role for SMARCA4 in the regulation of cellular proliferation32. In addition, SMARCA4 has been shown to be involved in tumor suppression through physical interaction with key cancer-related proteins, including the tumor suppressors pRb and p53 and the oncoprotein c-Myc33,34,35. SMARCA4 has also been found to be under-expressed in human hepatocellular carcinoma according to the Gene Expression Atlas (http://www.ebi.ac.uk/gxa/qrs)36. In the present study, the SMARCA4 rs11879293 A allele was associated with a significantly decreased risk of HCC compared with the G allele in both the detection and validation stages. To date, no association study of rs11879293 with HCC risk has been reported, and the functional consequences of the variant are unknown. The variant is located in the intron between exon 1 and exon 2 of SMARCA4. The region containing exon 1 and exon 2 is necessary for the interaction of SMARCA4 with SS18L1/CREST. SS18L1 has been considered a putative SWI/SNF subunit and has been suggested to play a role in driving hepatocarcinogenesis by remodeling chromatin structure37. To further refine the association signal and fine-map the causal variant, bioinformatics-based approaches were employed to follow up the statistical association between SMARCA4 rs11879293 and risk of HCC. The results of multiple integrated bioinformatics tools and the eQTL database suggest that rs11879293 may act as an intronic enhancer that alters a Pol2-binding site, or that it may tag functional variants in the promoter/regulatory region that influence gene expression and thereby confer susceptibility to HCC. However, independent replication studies with larger sample sizes are warranted to verify the association, and the molecular mechanisms underlying the effect of SMARCA4 rs11879293 on the risk of HCC remain to be dissected in follow-up studies.

In the present study, the reduced risk of HCC associated with SMARCA4 rs11879293 was more pronounced in males, younger individuals (≤60 y) and nondrinkers. The most interesting finding was that the association between SMARCA4 rs11879293 and HCC risk appeared to differ by HBV infection status. HBV infection is a firmly established risk factor and may modify the risk of HCC associated with SMARCA4 rs11879293. To our knowledge, no association study or functional evidence linking SMARCA4 to HBV infection has been published. However, another core SWI/SNF subunit, SMARCE1, has been reported to play an important role in modulating the replication efficiency of HBV38. It can be speculated that SMARCA4 has DNA-binding characteristics similar to those of SMARCE1 and may likewise modify HBV replication. Moreover, SMARCE1, as a SMARCA4-associated factor, may physically interact with SMARCA4 and participate in diverse biological activities. It is also possible that the statistical association between SMARCA4 and risk of HCC is modified by HBV infection status only through the biological relationship of SMARCA4 with SMARCE1. Taken together, the modification by HBV infection of the association between SMARCA4 and risk of HCC should be validated in association studies with larger sample sizes, and these speculations need to be tested using DNA-protein binding assays and other biochemical experiments.

In summary, our study highlights the potential role of the SWI/SNF complexes in conferring susceptibility to HCC, particularly in relation to HBV infection. Moreover, it advances our understanding of the genetic etiology of HCC and may offer insights into personalized strategies for HCC prevention. However, this study has some limitations. First, the sample size at the detection stage provided limited power to detect weak effects of polymorphisms with low minor allele frequencies, and replication studies with larger sample sizes are warranted to verify our results. Second, the participants in this hospital-based case-control study may not fully represent the general population. Third, although bioinformatics-based approaches were employed to follow up the statistical associations between genetic variants in SMARCA4 and risk of HCC, functional analyses are still warranted to dissect the molecular mechanism underlying the association.

Methods

Subjects

A two-stage study design with a total of 1003 newly diagnosed HCC cases and 1032 controls was applied to comprehensively examine the contribution of genetic polymorphisms in the SWI/SNF chromatin remodelling complexes to HCC risk. In both stages, all participants were unrelated Han Chinese. Stage 1 consisted of 502 HCC cases and 487 cancer-free controls. Patients were consecutively recruited between January 1, 2009 and June 30, 2012 at Tongji Hospital of Huazhong University of Science and Technology (HUST), Wuhan, central China. Cancer-free controls were randomly selected from individuals undergoing health screening at the same hospital during the same period and were frequency matched to cases by sex and 5-year age group; some of them were also included in our previous epidemiological studies39,40. Stage 2 included 501 cases and 545 cancer-free controls. Patients were recruited between January 1, 2009 and June 30, 2012 at the Peking Union Medical College Hospital, Chinese Academy of Medical Sciences (Beijing) in northern China. Cancer-free controls were also frequency matched to cases by sex and age (±5 years) and came from a community cancer screening program conducted at the same hospital during the same period. Inclusion criteria for patients were histopathologically confirmed HCC and no previous chemotherapy or radiotherapy; there were no restrictions with regard to sex, age, or disease stage. At recruitment, written informed consent was obtained from every participant, and demographic characteristics including sex, age, smoking and drinking habits were collected by interviewers. The detailed definitions of smoking and drinking status have been described previously39. Information on serological testing, including HBsAg, anti-HBs, HBeAg, anti-HBe and anti-HBc, was collected from medical records at the corresponding hospital with the patients' permission. This study was approved by the institutional review boards of Tongji Medical College of Huazhong University of Science and Technology and Peking Union Medical College Hospital, Chinese Academy of Medical Sciences (Beijing).

SNPs selection and genotyping

The candidate genes in the SWI/SNF chromatin remodelling complexes were selected based on recent findings from exome and whole-genome sequencing studies that the ARID1A and ARID2 subunits carry recurrent inactivating somatic mutations in HCC7,8,9, and on previous evidence that genes encoding SWI/SNF subunits, including ARID1A, BRD7, PBRM1, SMARCA4 and SNF5, are frequently mutated in a variety of cancers6. In stage 1, tagSNPs for ARID2, ARID1A, BRD7, PBRM1, SMARCA4 and SNF5 were selected from genotype data downloaded from HapMap (http://www.hapmap.org/; phase 2 and phase 3, Data Release 27, Han Chinese in Beijing, CHB), using the criteria of r2 > 0.8 and minor allele frequency (MAF) > 0.05 across each candidate gene region. A total of 27 tagSNPs were identified and genotyped in stage 1 using the TaqMan OpenArray assay system. Each 48-sample array chip contained one no-template control (NTC) and one duplicated sample to verify genotyping accuracy. The average call rate for all candidate SNPs was >95% and the concordance rate for the duplicate sets was 100%. In stage 2, the three promising SNPs were genotyped by TaqMan real-time polymerase chain reaction (PCR) assay (Applied Biosystems, Foster City, CA) without knowledge of the case or control status of the subjects. Quality control was monitored by including 5% duplicates and negative controls, and the concordance rate of the duplicate sets was 100%.

Statistical analysis

Pearson's χ2 test was used to examine differences between cases and controls in the distribution of demographic characteristics. Hardy-Weinberg equilibrium (HWE) for genotypes was tested by a goodness-of-fit χ2 test in the control group. The Cochran-Armitage trend test was used to examine the association between SNP genotypes and HCC risk. The risks of HCC associated with SNPs were estimated as odds ratios (ORs) with 95% confidence intervals (95% CIs) from logistic regression models adjusted for sex, age, smoking status, drinking status and HBsAg. The dominant or recessive model was chosen by comparing three ORs: homozygous variant versus homozygous wild type (OR1), heterozygous versus homozygous wild type (OR2) and homozygous variant versus heterozygous (OR3)41. A two-tailed P < 0.05 was used as the criterion of statistical significance. P values were corrected for multiple comparisons using the false discovery rate (FDR)42. For SNPs with a MAF of 0.10, the estimated power of our sample size to detect an OR of 1.60 was 0.68 in stage 1, 0.70 in stage 2 and 0.94 in the combined study. All statistical analyses were conducted using SAS v8.2 software (SAS Institute, Cary, NC).
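
The FDR procedure is not named in the text. The Python sketch below assumes the Benjamini-Hochberg step-up method, which is consistent with the reported values (a trend P of 0.001 across 24 tests gives 0.001 × 24 = 0.024). The three smallest P values are those reported in the Results; the remaining 21 are hypothetical placeholders, and the function name `benjamini_hochberg` is an illustrative choice.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Trend-test P values reported in the Results for rs11879293, rs2072382
# and rs2267032.
reported = [0.001, 0.024, 0.059]
# Hypothetical placeholders for the other 21 SNPs; NOT taken from the study.
placeholders = [round(0.10 + 0.04 * i, 2) for i in range(21)]

p_values = reported + placeholders
adjusted = benjamini_hochberg(p_values)
for raw, adj in zip(p_values[:3], adjusted[:3]):
    print(f"P = {raw:.3f} -> FDR-P = {adj:.3f}")
```

Under this assumption only the smallest P value stays below 0.05 after adjustment, matching the observation that rs11879293 alone remained significant.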