Introduction

Hepatitis B is a worldwide disease caused by hepatitis B virus (HBV) infection that damages the liver and can lead to hepatocellular carcinoma. The natural course of chronic HBV infection (CHB) is long and complicated, with differential underlying changes in liver histology. Patients can shift from a phase with no liver scarring and a high viral replication rate to active liver disease, followed by an inactive hepatitis B phase, and after years may return to the active disease stage. Progression to advanced fibrosis can be rapid, slow, or sporadic. Moreover, liver inflammation, scarring, and even the early stage of severe scarring may be reversible after hepatitis B suppression1,2,3. Although the mechanisms by which HBV causes persistent infection are poorly understood, it is known that viral replication itself does not cause liver damage in a short time. Therefore, it is accepted that the individual’s immune response is essential for sustained viral control and that an inadequate response is associated with chronic hepatitis B. In fact, 90% of infants, whose immune system is not fully matured, become persistently infected when exposed to HBV at birth or in the perinatal period. In adults, HBV infection usually results in an acute disease that resolves over time. However, when an individual's immune response fails, adults may develop chronic hepatitis B and become more prone to liver failure4.

Apart from widely known host factors affecting the course of chronic hepatitis B (CHB), such as sex, immune status, and underlying diseases, host genetic background has been extensively studied, providing evidence of its role in the susceptibility to HBV persistence, treatment response, and the dynamics of liver injury progression to cirrhosis and hepatocellular carcinoma (HCC). The strongest evidence of the influence of host genetic factors on HBV infection outcomes was demonstrated several years ago in twin studies, where higher concordance rates of HBV carrier status and of antibody titers in response to the HBV vaccine were observed in monozygotic twins in comparison to same-sex dizygotic twins5,6. During the past few years, significant progress has been made in genetic and disease-related research due to the development of high-throughput genotyping methods. Since the first implementation of genome-wide association studies (GWAS) on chronic hepatitis B infection in 2009, several single nucleotide polymorphisms (SNPs) have been identified as potential genetic markers influencing the pathogenesis of HBV-related traits7,8,9,10,11,12,13. However, only a small proportion of SNPs are located in protein-coding regions of the genome, with the vast majority situated within non-coding areas, such as regulatory and intergenic regions, where they may impact gene regulation14.

Moreover, systematic localization of common disease-associated variation has shown that nearly 60% of non-coding GWAS SNPs and other variants are located within DNase I hypersensitive sites (DHSs), which serve key roles in the regulation of gene transcription as markers of cis-regulatory elements (CREs)15. Because DHS profiles reflect the occupancy of DNA-binding proteins such as transcription factors (TFs), these loci may alter the transcription factor binding site (TFBS) or induce variation in gene expression1. In this study, we have focused on SNPs within TFs and TFBSs of genes associated with the HBV lifecycle, which have been previously associated with multifactorial diseases or traits.

Results

Study group characteristics

The study group consisted of 284 chronic hepatitis B (CHB) patients, including individuals with mild fibrosis (92), moderate to severe fibrosis (78), liver cirrhosis (LC; 63), hepatocellular carcinoma (HCC; 13), and no fibrosis (38). Table 1 summarizes the distribution of variables evaluated at study inclusion. The overall incidence rates of liver cirrhosis and HCC among CHB patients were 22.18% (63/284) and 4.6% (13/284), respectively, with male patients outnumbering females in both subgroups. The mean age of patients with liver cirrhosis was 61 years, and they were significantly older than patients with no fibrosis (p = 0.000146) and patients with fibrosis (p = 0.000032). No significant differences in clinical parameters were observed between patients with mild and moderate to severe liver fibrosis. Aspartate aminotransferase (AST) (p = 0.016084) and total cholesterol (TC) (p = 0.014987) levels were higher among individuals with liver cirrhosis than in the no fibrosis group. In addition, the prevalence of portal hypertension (HT) and thrombocytopenia (platelet count below 150,000) was much higher in the cirrhotic group.

Table 1 Characteristics of chronic hepatitis B patients with liver cirrhosis, hepatic fibrosis, and healthy controls.

DNA samples from all subjects included in the study were successfully analyzed, and high-quality genotyping data were generated for all nine SNPs. The distribution of genotypes did not follow the Hardy–Weinberg equilibrium (HWE) in the liver fibrosis, cirrhosis, and no fibrosis groups, except for rs225014, rs2016520, and rs4794067 in cirrhotic patients and rs2016520 in the no fibrosis group, which were consistent with HWE (p > 0.5). Surprisingly, in the HCC group only rs12031994 (AKT3) deviated from HWE. Evaluation of the linkage disequilibrium (LD) pattern using the correlation coefficient r2 between pairs of analyzed SNPs showed that all of them were independent (r2 < 0.5). The distribution of SNP genotypes was compared between the no fibrosis, fibrosis, cirrhosis, and HCC groups (Table 2). Significant differences in genotype distribution were observed for rs225014 (DIO2) and rs4794067 (TBX21) between groups of patients affected by different HBV-related liver diseases (Table 2). The rs225014 TT genotype was more common in patients with no fibrosis (52%) than in the cirrhosis group (44%), and its frequency dropped to 8% in patients with HCC. On the other hand, the rs4794067 T allele was more common in patients with more advanced HBV-related liver disease. Moreover, both SNPs within the GADD45A gene (rs532446, rs37834688) demonstrated different distributions between the cirrhosis and fibrosis groups.
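The pairwise r2 criterion described above can be illustrated with a minimal sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2) per patient and uses the squared Pearson correlation of these counts as an approximation of LD r2; the function name and the coding are illustrative assumptions, not the software used in the study.

```python
from typing import Sequence


def genotype_r2(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared Pearson correlation between two SNPs coded as allele counts (0/1/2).

    This is a genotype-based approximation of LD r2 (an assumption of this
    sketch); haplotype-based estimators may give slightly different values.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("both SNPs need the same number (>= 2) of genotypes")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (sxy * sxy) / (sxx * syy)


def are_independent(x: Sequence[int], y: Sequence[int], threshold: float = 0.5) -> bool:
    """Apply the r2 < 0.5 independence criterion quoted in the text."""
    return genotype_r2(x, y) < threshold
```

With real data, each sequence would hold one entry per genotyped patient, in the same patient order for both SNPs.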

Table 2 Genotypic distribution of analyzed SNPs among chronic hepatitis B patients.

Association of gene polymorphisms with viral and clinical characteristics

We found significant differences in DIO2 gene polymorphisms between males and females: the rs225014 CC (p = 0.00124) and rs225017 TT (p = 0.00417) genotypes were more common in men. Furthermore, higher HBsAg levels were found in individuals with the AKT3 rs12031994 TT genotype (sex-adjusted TT vs. CC: OR 0.22, 95% CI 0.05–0.75, p = 0.016). On the other hand, the AKT3 rs12031994 CC genotype (sex-adjusted TT vs. CC: OR 4.80, 95% CI 1.49–15.43, p = 0.008) was associated with higher AST levels at study inclusion. Additionally, the HBeAg antigen was more commonly present in rs225014 CC (DIO2) carriers (sex-adjusted TT vs. CC: OR 2.83, 95% CI 1.24–6.47, p = 0.013).
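For readers less familiar with the odds ratios and confidence intervals reported above, the following minimal sketch shows how a crude odds ratio and its Wald 95% CI are obtained from a 2 × 2 table. It is a simplification: the estimates in this study are sex-adjusted and come from logistic regression, whereas this function is unadjusted, and the function name and any counts passed to it are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio with a Wald confidence interval.

    a: genotype carriers with the outcome      b: carriers without the outcome
    c: non-carriers with the outcome           d: non-carriers without the outcome
    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to a significant association at the chosen level, which is how the intervals in this section can be read.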

Gene polymorphisms and liver aminotransferase levels

In a univariate correlation analysis, ALT levels correlated with sex (p = 0.015), thrombocytopenia (p = 0.008), and HBV DNA levels (p = 0.03). Among the SNPs, we observed significant associations between ALT concentration and the DIO2, GADD45A, and AKT3 genotypes (Table 3). The minor allele at rs225014 and rs12031994 and the major allele at rs225017 and rs532446 were more common in patients with elevated ALT levels.

Table 3 Results of logistic regression analyses for elevated ALT risk in chronic hepatitis B patients.

Next, a multivariate regression analysis was used to identify independent predictors of ALT levels in our patients with HBV infection. Serum ALT activity was considered the dependent variable. The results of this analysis showed that thrombocytopenia, rs225014 TT, rs12031994 TT, and rs532446 CC were independently associated with ALT levels (Table 4).

Table 4 Final multiple logistic regression model for ALT levels.

Genetic polymorphisms and the liver fibrosis progression in chronic hepatitis B

We next assessed the association between analyzed SNPs and liver fibrosis progression. Genotype distribution of the T allele within rs225014 was significantly different in the fibrosis score F0 group when compared to F1 (p = 0.003), F2 (p = 0.012), F3 (p = 0.0002), and F4 (p = 0.0003) patients. Significant differences were also found in genotype occurrence within F0 and F score groups for PPARG rs10865710 (p = 0.028), and TBX21 rs4794067 (p = 0.028, Fig. 1). Also, the GADD45A rs532446 TT genotype was more common in the F0 score in comparison to the F4 group.

Figure 1 Graphs showing genotype distribution of DIO2 rs225014 (A), PPARG rs10865710 (B), and TBX21 rs4794067 (C) in patients with different fibrosis scores.

In further analysis, when the cohort was segregated into those with mild (F0–1) versus advanced fibrosis (F2–4), carriage of the DIO2 rs225014 TT and rs225017 AA genotypes and the PPARG rs10865710 CC genotype was associated with a significantly increased risk of advanced fibrosis, independent of age and gender (Table S1). In multivariate analyses adopting a dominant model, rs225014 TT (DIO2), rs10865710 CC (PPARG), and portal hypertension remained independent predictors of advanced fibrosis (Table 5), whereas DIO2 rs225017 lost significance (p > 0.05).
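The genetic models referred to in the regression analyses (dominant, recessive, additive) differ only in how a genotype is converted into a predictor. A minimal sketch of this coding is given below; the function name, the two-letter genotype strings, and the choice of reference allele are illustrative assumptions rather than the exact coding used in the statistical software.

```python
def encode_genotype(genotype: str, allele: str, model: str) -> int:
    """Convert a biallelic genotype such as "CT" into a regression predictor.

    additive : number of copies of `allele` (0, 1, or 2)
    dominant : 1 if at least one copy is carried, else 0
    recessive: 1 only if two copies are carried, else 0
    """
    if len(genotype) != 2:
        raise ValueError("genotype must consist of exactly two alleles, e.g. 'CT'")
    copies = genotype.count(allele)
    if model == "additive":
        return copies
    if model == "dominant":
        return int(copies >= 1)
    if model == "recessive":
        return int(copies == 2)
    raise ValueError(f"unknown genetic model: {model!r}")
```

For example, under a dominant model for the C allele, CT and CC carriers are pooled and compared with TT homozygotes.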

Table 5 Final multiple logistic regression model for advanced fibrosis (F ≥ 2).

A corresponding analysis was made to investigate associations between the evaluated variables and liver cirrhosis. Univariate analyses showed significant associations with liver cirrhosis for age (p < 0.0001), thrombocytopenia (p < 0.0001), total cholesterol levels (p = 0.00014), gamma-glutamyl transferase levels (p < 0.0001), BMI (p = 0.003), alcohol consumption (p = 0.006), ALT level (p = 0.0036), AST level (p < 0.0001), GADD45A rs532446 (p = 0.033), ATF3 rs11119982 (p = 0.014), and TBX21 rs4794067 CC (p = 0.023). Next, in a multiple logistic regression model, thrombocytopenia, higher ALT levels, rs532446 TT, and rs11119982 TT remained significant predictors of cirrhosis (Table 6).

Table 6 Final multiple logistic regression model for liver cirrhosis.

HCC was detected in 13 of 284 (4.6%) CHB patients. No association was found between analyzed SNPs and HCC presence. However, different genotypic distribution was found for DIO2 rs225014 between patients with cirrhosis who have developed primary malignancy of the liver and those without HCC (p = 0.010426). Rs225014 CC variant was identified in 38% of patients with HCC, and 12% of cirrhotic patients without HCC.

In silico trial results

Using the SIFT algorithm, the substitution at position 92 from threonine (T) to alanine (A) was predicted to be tolerated with a score of 0.51. Median sequence conservation was 3.50. The HOPE report showed that the mutant residue is smaller and more hydrophobic than the wild-type residue, and this variant’s MetaRNN score was 2.324709e-05. Furthermore, rs225014 was analyzed by the I-Mutant 3.0 and MUpro servers. The free energy change (∆∆G) values were below −0.5 kcal/mol (∆∆G = −1.30 for I-Mutant 3.0; ∆∆G = −1.4718185 for MUpro), which indicates that the mutation can largely destabilize the DIO2 protein.
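The interpretation of these scores rests on two cut-offs stated in this paper: a SIFT score of 0.05 or lower indicates a deleterious substitution, and a ∆∆G below −0.5 kcal/mol indicates destabilization. The sketch below simply applies those cut-offs; the function names and the label used for values at or above −0.5 kcal/mol are assumptions made for illustration and are not taken from the prediction servers.

```python
SIFT_DELETERIOUS_MAX = 0.05      # SIFT score <= 0.05 is read as deleterious
DDG_DESTABILIZING_BELOW = -0.5   # ddG (kcal/mol) below this is read as destabilizing


def sift_call(score: float) -> str:
    """Classify a SIFT score, which ranges from 0 to 1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores range from 0 to 1")
    return "deleterious" if score <= SIFT_DELETERIOUS_MAX else "tolerated"


def stability_call(ddg_kcal_per_mol: float) -> str:
    """Classify a predicted free energy change.

    The wording of the second label is an assumption of this sketch.
    """
    if ddg_kcal_per_mol < DDG_DESTABILIZING_BELOW:
        return "destabilizing"
    return "not clearly destabilizing"


# Values reported for rs225014 (Thr92Ala) in the text
calls = {
    "SIFT": sift_call(0.51),
    "I-Mutant 3.0": stability_call(-1.30),
    "MUpro": stability_call(-1.4718185),
}
```

Applied to the reported values, the sketch reproduces the pattern described above: a substitution that SIFT tolerates at the sequence level but that both stability predictors flag as destabilizing.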

For two SNPs analyzed with RegulomeDB, the predicted rank was 5, which suggested that these SNPs have a minimal probability to affect TF binding and/or DNase peak (Table 7).

Table 7 RegulomeDB results for SNPs within selected regions.

Remarkably, the strongest evidence of regulatory function was shown for rs225014, rs10865710, rs532446, and rs4794067. RegulomeDB revealed that rs10865710 is linked to PPARG and TIMP4 expression, may affect JUN protein binding, and falls within NFATC1, NFATC3, NFATC4, and NFAT5 binding motifs. With the same RegulomeDB rank, rs532446 was shown to affect the binding of numerous proteins (Supplementary Table S2) and is localized within ATF4 and PRDM binding motifs. Similarly, rs225014 was demonstrated to affect target gene expression and the binding of a variety of proteins (Supplementary Table S3). Additionally, rs4794067 was shown to have an impact on the expression of multiple genes (Supplementary Table S4) and to influence EZH2 and CTCF binding.

The histone modification analysis showed that rs10865710, rs532446, and rs4794067 were predicted to locate in enhancer histone marks (liver, endocrine gland, exocrine gland). The key information regarding histone modification analysis restricted to the liver organ is shown in Table 8. More detailed information can be found in Table S5. Furthermore, miRNASNP analysis demonstrated that all SNPs may influence the recognition and targeting of miRNA (Table 9).

Table 8 Key histone modification analysis results restricted to liver organ obtained by RegulomeDB.
Table 9 Effect of SNPs on the binding of miRNA (gain or loss).

Discussion

It is well established that multiple risk factors contribute to cirrhosis and HCC development in CHB patients16. Apart from the well-known risk factors such as older age, male gender, chronic active hepatitis, higher ALT levels, or a history of decompensation, the accumulation of genetic alterations during progression from health through fibrosis to HCC is now considered of great importance17. In this study, we focused on genetic polymorphisms within transcription factor binding sites, which have recently been suggested as important players in downstream gene expression and in phenotypic variation predisposing to different diseases18. We performed an extensive literature review for candidate SNPs located at TFBSs identified by GWAS as contributing to complex disease risk. Afterward, we limited the number of loci to those with a potential impact on TF regulation associated with hepatitis B and/or liver disease progression. As a result, our study demonstrated that rs225014 (DIO2), rs532446 (GADD45A), rs12031994 (AKT3), rs11119982 (ATF3), and rs10865710 (PPARG) might contribute to the increased risk of liver disease progression in chronic hepatitis B carriers. Other parameters, including metabolic markers such as body mass index (BMI), diabetes, and triglyceride levels, were not significant in our study. Additionally, no literature data regarding the possible role of the investigated SNPs in these variables were found.

The strongest prognostic value was found for rs225014 (DIO2) and rs532446 (GADD45A), which were correlated with liver tissue scarring as well as with elevated ALT. The CC genotype of DIO2 rs225014, or the C allele, occurred more frequently in patients with higher ALT levels and with more advanced liver fibrosis. Consequently, the C allele had a risk effect for liver disease progression, as it was more common in cirrhotic (56%) and HCC (92%) patients. In the same manner, the C allele at rs532446 of GADD45A was more common in CHB carriers with both raised ALT concentrations and liver cirrhosis. On the other hand, the TT genotype at both rs225014 (DIO2) and rs532446 (GADD45A) had a protective effect on liver scarring progression. Additionally, the genotype distribution differed significantly for rs225014 (DIO2) between groups of patients affected by different stages of HBV-related liver disease, and between the cirrhosis and fibrosis groups for rs532446 (GADD45A). Furthermore, analysis of the functional mechanisms of these SNPs using computational approaches demonstrated their influence on miRNA binding, target gene expression levels, and the binding of different proteins. To the best of our knowledge, this is the first report presenting an association between polymorphisms of the above genes and the severity of liver disease in CHB patients.

Rs225014 (DIO2), also known as Thr92Ala, is involved in thyroid hormone (TH) metabolism and its regulation19. This polymorphism was demonstrated to have an impact on TH levels and therefore may influence a variety of clinical aspects as well as quality of life or cognition. DIO2 SNP rs225014 has so far been associated with symptomatic osteoarthritis20,21, type 2 diabetes mellitus22, atherosclerosis23, and bone mineral density24, with the C allele identified as a risk factor. On the other hand, in contrast to our results, the C allele at rs225014 was protective in response to lung injury25. Although DIO2 is not typically expressed in the liver, it has been shown that the lack of neonatal DIO2 in mouse hepatocytes leads to hepatic epigenetic reprogramming that can alter different liver functions, modifying susceptibility to alcohol- or diet-induced hepatic steatosis, hypertriglyceridemia, and obesity26,27. This may be explained by the fact that the liver is sensitive to the dynamics of THs, which participate in hepatic homeostasis. As the liver is one of the main target tissues of TH, any disruption of TH signaling is closely associated with multiple liver-related diseases28,29,30. Moreover, the rs225014 DIO2 C allele creates a unique TFBS for the NK3 homeobox 2 (NKX3-2) TF, which is eliminated by the T allele20,31. Because homeobox genes are known players in the regulation of HCC tumorigenesis, the elimination of the NKX3-2 binding site by the T allele may in part be associated with the protection against liver disease progression. Furthermore, NKX3-2 (also known as BAPX1) has already been demonstrated as a poor prognostic factor for gastric cancer in vivo32. The BAPX1 gene was also reported to be up-regulated in breast and prostate cancers at the mRNA level33.

GADD45A, a TP53-regulated and DNA-damage responsive protein, plays a leading role in human tumorigenesis. Although the exact mechanism remains uncertain, the expression patterns of GADD45A vary in different carcinomas34. Decreased expression has been observed in patients who suffer from non-small cell lung35 and prostate36 cancers. GADD45A mRNA levels were also down-regulated in most HCC patients in comparison to adjacent nonneoplastic tissue37. GADD45A has also been shown to exert a protective effect against hepatic fibrosis in mice38. In contrast, higher GADD45A expression was observed in breast cancer tissues compared with non-neoplastic tissue samples39. Furthermore, GADD45A expression has been associated with the survival of patients with esophageal cancer, with reduced expression of GADD45A being a poor prognostic factor40. Even though the GADD45A gene is highly conserved in mammals, point mutations in exon 4 have been found in patients with pancreatic cancer, and GADD45A expression combined with p53 status correlated with patients’ survival41. Also, the rs681673 and rs607375 polymorphisms have recently been found to be associated with breast cancer risk34, and a GADD45A promoter SNP (rs581000) with reduced susceptibility to acute lung injury42. Several other studies have found correlations between GADD45A polymorphisms and ovarian cancer43 and rheumatoid arthritis44. In our study, we analyzed the intronic SNP rs532446 (T3812C), located in the p53 binding region, which has been previously reported to possess a functional role in acute lung injury42. We found that the minor allele T at rs532446 is associated with decreased susceptibility to liver cirrhosis development. As the SNP is located in the p53 binding region, it may affect the regulatory activity of p53, which is important for both liver homeostasis and dysfunction. On the one hand, p53 regulates cell cycle checkpoints to protect from transformation. On the other hand, it induces apoptosis of damaged cells and activates liver stem/progenitor cells, leading to functional recovery of the organ45. Therefore, it is quite surprising that the minor variant at rs532446 has a beneficial effect on the patient. However, despite the previously proposed algorithm predicting the affinity of the tumor suppressor p53 for binding sites in DNA, response elements containing equal numbers of mismatches still show different affinities for p53. This is probably caused by higher mutation tolerance in an unusually long DNA-binding site within which only 20% of nucleotides remain unchanged46,47,48. Moreover, other mechanisms, including minor groove shape recognition49 and chromatin status50, should also be considered when explaining differences in the binding of p5348.

In the current study, we also observed an association between the enhancer polymorphism rs10865710 in the PPARG gene and liver fibrosis progression. Although a C → G substitution at this site does not cause an amino acid change, rs10865710 was proposed as a risk factor for a variety of diseases51, such as asthma52, systemic sclerosis53, obesity54, and non-alcoholic fatty liver disease55. Moreover, Lu et al.51 have recently demonstrated that carriers of the rs10865710 CG/GG genotypes express lower levels of PPARG in comparison to individuals with the CC genotype, which suggests that the G allele may downregulate PPARG expression. Additionally, hepatic PPARG expression has been noted to promote liver steatosis56, and inhibition of PPARG has been shown to suppress steatosis-associated liver cancer in mice57. In our study, the rs10865710 minor allele, which is associated with lower PPARG levels, was more common in patients with low fibrosis scores. Of the six unique TFBSs generated by the G allele, MEIS1 has already been shown to play a role in cardiovascular regeneration58. Because inhibitory effects of MEIS1 on tumorigenesis in renal clear cell carcinoma59, non-small-cell lung cancer cells60, and prostate cancer61 have been reported, we suppose that the creation of the MEIS1 TFBS by the minor G allele may in part be responsible for the association of this SNP with liver fibrosis risk. In the same manner, the rs11119982 (ATF3) C allele, which is associated with cirrhosis risk, creates one unique TFBS for the helicase-like transcription factor (HLTF), which is involved in altering chromatin structure. On the other hand, the minor T allele at this site is located in the binding site of five TFs that regulate transcription and control hematopoietic progenitor cells, cellular transcription, and repression.

This cross-sectional study has some limitations. First, although we analyzed ALT levels within groups with different liver damage scores, these measurements were performed at the time of liver assessment, and we have no information regarding the further progression of liver disease. Given that ALT tends to fluctuate, people with early stages of liver cirrhosis can have normal liver function tests. Secondly, our study was performed on Caucasian subjects only. Therefore, similar studies in other geographic regions with genetically different populations should be done.

This study showed that rs225014 (DIO2), rs532446 (GADD45A), rs12031994 (AKT3), rs11119982 (ATF3), and rs10865710 (PPARG) are associated with the increased risk of liver disease progression in chronic hepatitis B carriers. The presence of the C allele at both DIO2 rs225014 and GADD45A rs532446 was independently associated with liver tissue scarring. Moreover, the occurrence of HCC in the study group was more common in individuals carrying the rs225014 CC genotype, and rs532446 together with rs11119982 was associated with liver cirrhosis development.

Materials and methods

Patients

This study included 284 patients with confirmed CHB infection (HBsAg positive for more than 6 months) from the ANRS CO22 HEPATHER cohort (ClinicalTrials.gov registry number: NCT01953458). All the subjects were of Caucasian ethnicity and had no other concomitant liver etiologies (viral coinfection, autoimmune or metabolic). Patients were excluded if they were currently treated or had undergone antiviral treatment within 6 months before the initiation of the study. Serum samples were collected before liver fibrosis assessment, and underwent the standard procedure in the local clinical center laboratory, including hepatitis serologic variables (HBsAg, HBsAb, HBeAg, HBeAb, HBcAb, HBV DNA levels). Fibrosis scores were assessed by non-invasive transient elastography by using FibroScan (Echosens, Paris, France), and the METAVIR scoring system62 was used for patient classification. Patients were subdivided as follows: no fibrosis (no scarring, stage 0), mild fibrosis (fibrosis stage I), liver fibrosis (fibrosis stages II-III), and cirrhosis (fibrosis stage IV; confirmed by two experienced pathologists). The procedures employed followed the ethical standards of the 1975 Declaration of Helsinki revised in 2013. The study protocol was approved by the Local Independent Bioethics Committee and the ANRS CO22 HEPATHER scientific committee. We have received the agreement to use the HEPATHER cohort in the frame of the INFECT-ERA project, and access to samples was paid for by the University of Gdansk. All enrolled subjects signed a free and informed consent form for participation in the study.

SNP genotyping

DNA was extracted from whole blood samples (200 µL) using the MagNA Pure LC DNA Isolation System according to the standard manufacturer protocol for the MagNA Pure Compact Nucleic Acid Isolation Kit I (Roche, Mannheim, Germany). The genotypes were determined by mass spectrometry using iPLEX Pro chemistry for the single base extension reaction according to the protocol (Agena Bioscience, San Diego, CA, USA). For each SNP, one primer pair and a single extension primer sequence were designed using the Mass Assay Designer software package (v.4.0). All primer sequences are listed in Table S6. Out of the nine SNPs included in the study, only one was localized within a transcription factor gene (TBX21, rs4794067). The remaining SNPs, located at transcription factor binding sites (TFBSs), include DIO2 (rs225017, rs225014), PPARG (rs10865710, rs2016520), ATF3 (rs11119982), AKT3 (rs12031994), and GADD45A (rs532446, rs37834688).

41 µL of ultrapure water was used to dilute the final extension product following the transfer into Chip Prep Module (Agena Bioscience, San Diego, CA, USA) for automated sample handling including desalting and dispensing samples onto the SpectroChip Array (Agena Bioscience, San Diego, CA, USA). Mass spectra were acquired with a MassARRAY® Analyzer 4 mass spectrometer and analyzed with MassARRAY® Typer 4.0 software. All procedures were performed according to the company’s recommendations.

Statistical analysis

Statistical analyses were performed using STATISTICA software version 13.3 (StatSoft, Tulsa, OK, USA). The Hardy–Weinberg equilibrium of the analyzed SNPs was tested using the MIDAS software. The chi-squared or Fisher’s exact test was used to analyze the relationship between categorical variables. Logistic regression analysis was used to evaluate the contribution of genetic and nongenetic factors under the dominant, recessive, and additive models. A backward stepwise regression approach was applied when building multivariate models. All of the p-values presented were two-sided, and only p < 0.05 was considered significant.
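As an illustration of the Hardy–Weinberg check, the sketch below implements a standard one-degree-of-freedom chi-squared goodness-of-fit test from genotype counts. It is not the MIDAS implementation used in this study; the function name is hypothetical, and for small groups (such as the 13 HCC patients) an exact test would usually be preferred over this approximation.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (chi-squared statistic, p-value) for deviation from HWE.

    n_aa, n_ab, n_bb are the observed counts of the two homozygous
    genotypes and the heterozygous genotype (n_ab) for a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-squared distribution with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A small p-value indicates that the observed genotype counts deviate from the proportions expected from the allele frequencies.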

Bioinformatics analysis of statistically significant SNPs

Four software tools were used to analyze the effect of rs225014 on the DIO2 protein. The SIFT web server (https://sift.bii.a-star.edu.sg/www/SIFT_seq_submit2.html) was used to predict the SNP's impact on protein function based on sequence homology and the physical properties of amino acids. A score below or equal to 0.05, in a range between 0 and 1, indicated a deleterious effect of the SNP on protein function. The MUpro (http://mupro.proteomics.ics.uci.edu/) and I-Mutant 3.0 (http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi) web tools were used to determine whether the Thr92Ala amino acid substitution affects the stability of the DIO2 protein. Structural and functional effects of rs225014 were analyzed by the HOPE (Have (y) Our Protein Explained) (https://www3.cmbi.umcn.nl/hope/) server. In addition, the MetaRNN pathogenicity prediction score was used (range 0–1), with higher values indicating higher pathogenicity.

Any potentially harmful effect of non-coding SNPs was investigated with RegulomeDB v2.1 (https://beta.regulomedb.org/regulome-search//), which gives a ranking based on DNA binding and provides ChIP data, chromatin states, and motifs. The RegulomeDB probability score ranges from 0 to 1, with 1 being the most likely to be a regulatory variant. Furthermore, to predict the target gain/loss effect of SNPs in miRNA seed regions, miRNASNP was employed (miRNASNP-v3 (hust.edu.cn)).