TANC1 methylation as a novel biomarker for the diagnosis of patients with anti-tuberculosis drug-induced liver injury

We aimed to elucidate the differences in genomic methylation patterns between patients with and without anti-tuberculosis drug-induced liver injury (ADLI) in order to identify DNA methylation-based biomarkers. Genome-wide DNA methylation patterns were obtained using the Infinium MethylationEPIC (EPIC) BeadChip array in 14 peripheral blood samples (7 ADLI cases, 7 non-ADLI controls). Changes in mRNA expression and DNA methylation of the target genes in another 120 peripheral blood samples (60 ADLI cases, 60 non-ADLI controls) were analyzed by real-time polymerase chain reaction and pyrosequencing, respectively. A total of 308 hypermethylated CpG sites and 498 hypomethylated CpG sites were identified. Notably, genome-wide DNA methylation profiling identified two hypermethylated CpG sites in TANC1, cg06961147 and cg24666046, that were associated with ADLI. The mRNA expression of TANC1 was lower in the cases than in the controls. Pyrosequencing validated these two differentially methylated loci, consistent with the results from the EPIC BeadChip array. Receiver operating characteristic analysis indicated that the areas under the curve for TANC1 cg06961147, cg24666046, and their combination were 0.812, 0.842, and 0.857, respectively. These results indicate that patients with ADLI have different genomic methylation patterns than patients without ADLI. The combination of the hypermethylated sites cg06961147 and cg24666046 in TANC1 provides evidence for the diagnosis of ADLI.

Tuberculosis (TB) remains one of the top 10 causes of death worldwide 1. Directly observed treatment, short-course (DOTS) chemotherapy, which emphasizes the use of a combination of isoniazid (H), rifampicin (R), pyrazinamide (Z), and ethambutol (E) for 6 to 8 months, has been the most effective strategy 2. However, the combination of these medications is likely to increase the incidence of anti-tuberculosis drug-induced liver injury (ADLI) 3. The incidence of ADLI currently ranges from 2 to 28%, depending on the definition of liver toxicity and the population being studied 4,5. ADLI has serious consequences, such as interruption of treatment, prolonged disease progression, and development of drug resistance, and may even lead to death 6. There are currently no sensitive and specific biomarkers for the diagnosis of ADLI, and the diagnosis still depends on serum biochemical tests. The identification of novel non-invasive biomarkers for the early diagnosis of ADLI therefore remains a challenge in China, as well as across the world. DNA methylation is one of the most stable epigenetic modifications in mammalian cells. It controls a variety of cellular and developmental processes, including embryonic development, X-inactivation, chromosome stability, and imprinting 7. Over the last decade, aberrant DNA methylation has been shown to be a candidate biomarker of cancer that occurs very early in cancer development 8. Moreover, it has become clear that DNA methylation is reversible and dynamic as a result of enzymatic DNA demethylation 9; therefore, aberrant DNA methylation modifications have attracted increased interest as potential drug targets 10.
Numerous studies have demonstrated that the hypermethylation of CpG islands in the promoter regions of CYP2E1, CYP2D6, and GSTP1 is associated with the occurrence of ADLI 11,12. However, these studies focused only on single genes or a small number of genes, and the conclusions drawn remain limited. Genome-wide DNA methylation studies have so far used the Agilent Human DNA Methylation Microarray 1 × 244 K array. However, this array focuses only on CpG islands and differentially methylated regions. It covers only 27,627 human CpG islands and 5081 UMR regions, and cannot reach single-base resolution. The EPIC BeadChip not only contains over 850,000 probes covering CpG islands, promoters, coding regions, open chromatin, and enhancers, but also includes CpG sites outside the CpG islands, known differentially methylated region sites, and microRNA promoter regions. Moreover, it has the advantage of single-base resolution, which allows the exact site of methylation to be detected directly. It is currently the most suitable DNA methylation technology for epigenome-wide association studies (EWAS). At present, there is no comprehensive and systematic genome-wide DNA methylation analysis using the EPIC BeadChip array in the peripheral blood of ADLI patients. As such, in the present study, we used the EPIC BeadChip to characterize genome-wide DNA methylation profiles in the peripheral blood of ADLI and non-ADLI patients. The differentially methylated CpG sites (dmCpGs) were identified by differential methylation analysis, and pyrosequencing was used to verify the selected gene sites. In summary, this study aimed to identify epigenetic changes in peripheral blood samples in order to identify potential biomarkers of ADLI.

Materials and methods
Study population and ethics. Patients over 18 years old with newly diagnosed TB who were hospitalized in the Tangshan Fourth Hospital between March 2016 and July 2017 were recruited. TB was diagnosed based on previously described guidelines 13. All patients received standardized daily treatment with isoniazid (H), rifampicin (R), pyrazinamide (Z), and ethambutol (E) in the first two months, and H and R on a daily basis in the following four months 14. The follow-up period for patients was from the beginning of treatment until 6 months later. During this period, patient compliance, the choice of treatment options, tuberculosis-related symptoms, and adverse drug reactions were strictly monitored by trained staff. Liver enzymes and bilirubin, as biomarkers of liver function, were monitored. To this end, 10 mL peripheral blood samples were collected every two weeks after starting anti-tuberculosis treatment for the initial 2 months and every 4 weeks for the next 4 months, or at any time when symptoms and signs of hepatitis developed during treatment 15. Blood samples were collected between 8 and 9 am after patients had fasted for 8-12 h, to ensure the accuracy of the test results. Detailed demographic and clinical characteristics of the patients were obtained from electronic medical records.
We selected 67 patients who developed ADLI within 2 to 8 weeks after receiving anti-TB treatment as the liver injury group, and then identified 67 patients without ADLI with similar age, sex, and admission time during the same period 15. A total of 134 TB patients met the inclusion and exclusion criteria and were included in the study. We first performed the EPIC BeadChip array on 7 ADLI patients and 7 age- and sex-matched non-ADLI patients. Subsequently, an independent cohort of 60 patients with ADLI and 60 patients without ADLI was used for pyrosequencing. This study was reviewed and approved by the Ethics Committee of North China University of Science and Technology (process no. 14-016). Informed consent was obtained from patients prior to any experimental procedures.
Diagnosis of anti-tuberculosis drug-induced liver injury. The diagnosis of hepatotoxicity due to anti-TB drug treatment was based not only on the liver enzyme results, according to the criteria of the American Thoracic Society (ATS) 16, but also on the diagnostic criteria developed by the Centre for Drug Re-evaluation (CDR) of the Chinese State Food and Drug Administration, as well as various previous definitions 2,17. Specifically, ADLI cases needed to meet one of the following criteria: (1) an increase in serum alanine aminotransferase (ALT) or aspartate aminotransferase (AST) to over threefold the upper limit of normal (ULN) in the presence of liver injury symptoms; (2) an increase in total bilirubin (TBIL) to over twofold the ULN in the presence of liver injury symptoms; (3) an increase in serum ALT, AST, or TBIL to fivefold the ULN, with or without liver injury symptoms.
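The three case-definition criteria above can be written as a small decision rule. The following is a minimal sketch rather than the authors' procedure: the function name and its arguments are hypothetical, laboratory values are assumed to be pre-divided by the ULN, and treating the fivefold threshold as inclusive is an assumption, since the text does not say whether it is strict.

```python
def meets_adli_criteria(alt_fold, ast_fold, tbil_fold, has_symptoms):
    """Return True if any of the three ADLI criteria is met.

    Each *_fold argument is the measured value divided by its upper
    limit of normal (ULN). Hypothetical helper for illustration only.
    """
    # (1) ALT or AST over 3x ULN, with liver injury symptoms
    if has_symptoms and (alt_fold > 3 or ast_fold > 3):
        return True
    # (2) TBIL over 2x ULN, with liver injury symptoms
    if has_symptoms and tbil_fold > 2:
        return True
    # (3) ALT, AST, or TBIL at 5x ULN, with or without symptoms
    #     (inclusive threshold is an assumption)
    return max(alt_fold, ast_fold, tbil_fold) >= 5


# Example: ALT at 3.5x ULN only counts when symptoms are present.
print(meets_adli_criteria(3.5, 1.0, 1.0, has_symptoms=True))
print(meets_adli_criteria(3.5, 1.0, 1.0, has_symptoms=False))
```

Expressing the values as multiples of the ULN keeps the rule independent of laboratory-specific reference ranges.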

Illumina Infinium MethylationEPIC BeadChip array.
Microarray-based DNA methylation profiling was performed using the Illumina Infinium MethylationEPIC (EPIC) BeadChip (Illumina, Inc., San Diego, CA, USA) on 7 paired blood samples. Genomic DNA from peripheral blood samples was extracted using a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany). Bisulfite conversion of isolated genomic DNA (500 μg) was performed using the EZ DNA Methylation-Gold Kit (Zymo Research, Irvine, USA). Bisulfite-converted DNA was then whole-genome amplified, enzymatically fragmented, and hybridized to the array as per the EPIC BeadChip protocol 18. Subsequent scanning of chips was performed using an Illumina HiScan2000. The raw intensity of the data was determined using GenomeStudio methylation module version 1.9.0 (Illumina, Inc.).
Methylation EPIC BeadChip data pre-processing. The raw intensity data (IDAT) were imported into R version 3.4.2 and processed using the R/Bioconductor package minfi (version 1.22.1) 19. Low-quality data (probes with detection P-value > 0.05), probes on the X and Y chromosomes, and probes overlapping single-nucleotide polymorphisms were removed 20. The data were background-corrected and normalized using the Noob method 21 to generate methylation beta (β) values, which were used for subsequent analysis.
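The probe-level exclusions described above can be illustrated outside of R. The sketch below is plain Python and does not reproduce the minfi API; the record layout (`id`, `chrom`, `overlaps_snp`, `detection_p`) and the probe names are hypothetical, and dropping a probe when any sample exceeds the detection P-value cutoff is an assumption, since the text does not state how many samples must fail.

```python
SEX_CHROMOSOMES = {"chrX", "chrY"}


def filter_probes(probes, det_p_cutoff=0.05):
    """Keep probes that pass all three exclusion rules.

    `probes` is a list of dicts with hypothetical keys:
      id            probe identifier
      chrom         chromosome name, e.g. "chr2"
      overlaps_snp  True if the probe overlaps a known SNP
      detection_p   list of detection P-values, one per sample
    """
    kept = []
    for probe in probes:
        # Low-quality probe: detection P-value above the cutoff
        # in at least one sample (assumption).
        if any(p > det_p_cutoff for p in probe["detection_p"]):
            continue
        # Probes on the X and Y chromosomes are removed.
        if probe["chrom"] in SEX_CHROMOSOMES:
            continue
        # Probes overlapping SNPs are removed.
        if probe["overlaps_snp"]:
            continue
        kept.append(probe["id"])
    return kept


example = [
    {"id": "probe_a", "chrom": "chr2", "overlaps_snp": False, "detection_p": [0.001, 0.002]},
    {"id": "probe_b", "chrom": "chrX", "overlaps_snp": False, "detection_p": [0.001, 0.001]},
    {"id": "probe_c", "chrom": "chr5", "overlaps_snp": True, "detection_p": [0.001, 0.001]},
    {"id": "probe_d", "chrom": "chr7", "overlaps_snp": False, "detection_p": [0.001, 0.20]},
]
print(filter_probes(example))
```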
Differential DNA methylation analyses. The β-values were used as an indicator of the methylation of each locus in each sample. Delta beta (Δβ) is defined as the difference in the mean β-values between the two groups; the larger its absolute value, the greater the difference in methylation. We calculated the mean detection P-value to check the overall data quality. In the present study, dmCpGs between the groups were identified with P < 0.05 and |Δβ| > 0.10. Subsequently, dmCpGs located in promoter regions (defined as the region from 1500 bp upstream to 500 bp downstream of the transcriptional start site (TSS)) and genes containing multiple differentially methylated probes were selected as our candidate CpG sites.
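The selection rule and the promoter window can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: Δβ is taken as mean β in cases minus mean β in controls (the text does not give the direction), P-values are assumed to have been computed beforehand, and the function names and probe identifiers are hypothetical.

```python
from statistics import mean


def select_dmcpgs(beta_cases, beta_controls, p_values,
                  p_cutoff=0.05, delta_cutoff=0.10):
    """Return {probe: (direction, delta_beta)} for probes with
    P < p_cutoff and |delta_beta| > delta_cutoff.

    beta_cases / beta_controls map a probe id to a list of beta values;
    p_values maps a probe id to a precomputed P-value.
    """
    selected = {}
    for probe, p in p_values.items():
        # Assumed direction: cases minus controls.
        delta = mean(beta_cases[probe]) - mean(beta_controls[probe])
        if p < p_cutoff and abs(delta) > delta_cutoff:
            direction = "hyper" if delta > 0 else "hypo"
            selected[probe] = (direction, delta)
    return selected


def in_promoter(cpg_pos, tss_pos, strand="+"):
    """True if the CpG lies between 1500 bp upstream and 500 bp
    downstream of the TSS, taking strand orientation into account."""
    offset = cpg_pos - tss_pos if strand == "+" else tss_pos - cpg_pos
    return -1500 <= offset <= 500


cases = {"probe_1": [0.80, 0.82, 0.78], "probe_2": [0.30, 0.31, 0.29]}
controls = {"probe_1": [0.60, 0.62, 0.58], "probe_2": [0.33, 0.34, 0.32]}
pvals = {"probe_1": 0.001, "probe_2": 0.01}
print(select_dmcpgs(cases, controls, pvals))
```

In this toy input only `probe_1` passes: `probe_2` is significant by P-value but its Δβ of about −0.03 falls short of the 0.10 threshold, which is the purpose of combining both criteria.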
Real-time quantitative polymerase chain reaction. RNA was extracted using TRIzol reagent (Invitrogen, Grand Island, NY, USA). cDNA was generated using the PrimeScript RT Reagent Kit with gDNA Eraser (DRR047A, Takara, Dalian, China). Subsequently, the cDNA was amplified using SYBR Premix Ex Taq II (RR820A; Takara, Dalian, China). The data were normalized to the reference gene glyceraldehyde 3-phosphate dehydrogenase (GAPDH). The primers are shown in Supporting Information

Results
Characteristics of the study population. The characteristics of the two study populations, the discovery group analyzed with the EPIC BeadChip and the validation group analyzed by pyrosequencing, are presented in Table 1. All participants were native Han Chinese. In both the discovery and validation groups, there were no significant differences between the cases and controls in terms of smoking status, drinking status, ALP levels, TBIL levels, albumin (ALB) levels, or TP levels. However, the levels of ALT and AST differed significantly between the cases and the controls.

Differential DNA methylation patterns between ADLI cases and controls.
To clarify the difference in the levels of methylation between the ADLI patients and the control group, we analyzed the methylation status of 866,091 CpG sites in 7 paired blood samples from 14 subjects using the EPIC BeadChip array. The high quality of the samples is reflected in the highly comparable distributions shown by the density plot of the probe β-values (Fig. 1a). Principal component analysis (PCA) of the full methylomes clearly differentiated between ADLI patients and the controls, as shown in Fig. 1b. After data pre-processing and quality filtering, a final data matrix comprising β-values (methylation levels) across 841,456 loci in 14 blood samples was created for further statistical analyses. Next, a pooled t-test was used to identify the differentially methylated CpG loci between the ADLI patients and the matched control blood samples, resulting in the identification of 806 significantly differentially methylated CpG loci (P < 0.05 and |Δβ| > 0.10) (Fig. 2a). Among the 806 dmCpGs, 308 were hypermethylated and 498 were hypomethylated (Fig. 2a). Furthermore, unsupervised hierarchical clustering of the 806 dmCpGs showed a clear segregation between patients with and without ADLI (Fig. 2b), confirming that the DNA methylation patterns of leukocytes in ADLI patients differ from those of non-ADLI patients. Figure 3 depicts the genomic distribution of dmCpGs with respect to CpG island-related regions and gene-related regions. Both hypomethylated and hypermethylated dmCpGs were found to be enriched in open sea regions rather than in CpG islands (Fig. 3a). In terms of gene-related locations, hypermethylated and hypomethylated CpGs were preferentially situated in gene body regions and intergenic regions, respectively, and both were depleted in the 3'UTR regions (Fig. 3b).
We found a total of 53 genes containing multiple differentially methylated CpGs (see Supporting Information Table S1). These genes were located in different gene-related regions. We focused on the analysis of dmCpGs in the promoter since many studies have suggested that promoter methylation significantly affects the levels of gene expression. A total of 10 genes contained two or more dmCpGs in the promoter regions, including 5 hypomethylated genes (C22orf39, HCG27, PKD1L2, BIRC7, and LOC100507140) and 5 hypermethylated genes (C1orf141, CD177, FMOD, LOC102723376, and TANC1) (Table 2).

Expression levels of candidate genes.
To determine whether the methylation of a candidate gene affected the gene expression levels, the mRNA expression of these target genes was analyzed using RT-PCR. Among the hypomethylated genes, the expression of LOC100507140 was higher in the ADLI patients than in the controls (Fig. 4a), while the expression of C22orf39, HCG27, PKD1L2, and BIRC7 showed no significant difference (Fig. 4b-e). Among the hypermethylated differentially methylated genes (DMGs), only TANC1 expression was lower in the ADLI patients than in the controls (Fig. 4f), while the expression levels of the other four genes were not significantly different (Fig. 4g-j). As such, we selected LOC100507140 (2 sites) and TANC1 (2 sites) for the subsequent pyrosequencing experiments.
Pyrosequencing validation of differentially methylated ADLI-associated sites. Pyrosequencing showed that the two CpG sites within the promoter region of LOC100507140 (cg18472223 and cg20517941) had lower methylation levels in ADLI samples, whereas the two CpG sites within the promoter region of TANC1 (cg06961147 and cg24666046) had higher methylation levels, consistent with the EPIC BeadChip array (Fig. 5). Representative pyrosequencing images for the four differentially methylated loci in one patient are shown in Fig. 6.

Correlation analysis results of various indicators.
The mRNA expression level of TANC1 was negatively correlated with ALT and AST (ρ = −0.818 and −0.800, respectively; all P < 0.001), while the mRNA expression level of LOC100507140 was positively correlated with ALT and AST (ρ = 0.893 and 0.824, respectively; all P < 0.001). The mRNA expression level of TANC1 was negatively correlated with the methylation levels of its two CpG sites (cg06961147 and cg24666046), with correlation coefficients ρ of −0.500 and −0.515, respectively (all P < 0.001). Similarly, the mRNA expression level of LOC100507140 was negatively correlated with the methylation levels of its two CpG sites (cg18472223 and cg20517941), with correlation coefficients ρ of −0.383 and −0.368, respectively (all P < 0.001).
Ingenuity pathway analysis of the TANC1 gene. A total of 77 upstream and downstream genes related to the TANC1 gene were enriched in IPA (see Supporting Information Fig. S1). These enriched genes were selected for the analysis of canonical pathways using IPA, which identified five canonical pathways that were significantly enriched with these genes (Fig. 8). The pathways included liver hyperplasia/hyperproliferation, hepatocellular carcinoma, liver inflammation/hepatitis, kidney failure, and heart failure. These genes were therefore mainly enriched in pathways related to the liver, suggesting that TANC1 is related to liver disease.
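The coefficients above are reported as ρ, which conventionally denotes Spearman's rank correlation; the text does not name the statistical software used. As a minimal sketch of how such a coefficient is obtained, the code below ranks both variables (averaging ranks for ties) and computes the Pearson correlation of the ranks. The function names and the example values are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: expression falls as the enzyme level rises.
expression = [1.9, 1.5, 1.2, 0.8, 0.4]
enzyme = [20, 35, 60, 150, 300]
print(spearman_rho(expression, enzyme))
```

Because only ranks enter the calculation, the coefficient is insensitive to the strongly skewed distributions typical of liver enzyme measurements.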

Discussion
In this study, the genome-wide patterns of DNA methylation in the blood of ADLI patients and non-ADLI controls were analyzed. Our results suggest that DNA methylation differs significantly between these two groups. Furthermore, using pyrosequencing in a larger sample, we demonstrated that the expression and methylation of LOC100507140 and TANC1 could be used to distinguish between ADLI patients and the controls. These findings indicate that changes in DNA methylation levels may be related to alterations in the expression of certain genes during ADLI occurrence. They also suggest that aberrant DNA methylation could be used as an indicator of ADLI, similar to the predictive effect of abnormal DNA methylation in other diseases 24-26.
To our knowledge, this study is the first to identify distinct differential DNA methylation patterns in blood samples from patients with and without ADLI using the EPIC BeadChip array platform. The EPIC BeadChip array has the advantage of covering the whole genome, including gene promoter regions, gene coding regions, CpG islands, and the enhancer regions found in the ENCODE 27 and FANTOM5 28 projects. In our results, fundamental differences in DNA methylation patterns between the ADLI and non-ADLI patients were elucidated by PCA (Fig. 1b). Further analysis complemented these results by identifying site-specific changes in CpG methylation percentage that were responsible for the pattern differentiation between the two groups (Fig. 2). The volcano plot in Fig. 2a shows the full distribution of the observed differential sites. The statistically significant CpG sites (N = 806) in the methylation heatmap (Fig. 2b) show a distinct pattern between the groups. The combined results provide a more comprehensive display of the genome-wide methylation of ADLI. Understanding these epigenetic changes will enable the use of epigenetic biomarkers for the diagnosis of disease at an early stage. The genomic distribution of the dmCpGs suggests that the methylation changes vary depending on location: with respect to CpG islands, dmCpGs were mainly observed in the CpG-poor open sea regions, and with respect to gene location, they were mainly observed in gene bodies. This phenomenon underscores the value of screening technologies that accurately examine CpG-sparse regions. In addition, the fact that the majority of methylation changes were identified in gene bodies emphasizes that we cannot focus solely on methylation sites located in CpG-dense regions, such as CpG islands and gene promoters.
Increasing evidence suggests that DNA methylation in gene bodies is able to promote oncogene expression 29,30, such that gene body methylation may serve as a therapeutic target for the treatment of cancer 31,32. Our results suggest that the DNA methylation of gene bodies will be an important topic for future studies in the field of cancer research. Notably, we found that the combination of two CpG sites (cg06961147 and cg24666046) of the TANC1 gene could act as a potential biomarker for distinguishing ADLI cases from non-ADLI controls. This is a novel discovery. Although reports on TANC1 remain scarce, several studies have shown that TANC1 is an important synaptic scaffold protein that plays a critical role in regulating the density of synaptic spines and excitatory synapse strength 33. Studies have also indicated that TANC1 is a candidate gene for neurodevelopmental disorders (NDD) 34,35. Furthermore, the TANC1 locus can influence the development of late radiation-induced damage 36. In the present study, the TANC1 gene was analyzed with IPA to identify any associated significant pathways, upstream regulators, diseases, and functions. A total of 77 upstream and downstream genes related to the TANC1 gene were found to be enriched in IPA (Fig. 8a). Our results indicated that these genes were significantly enriched in several liver-related pathways, including liver hyperplasia/hyperproliferation, hepatocellular carcinoma, and liver inflammation/hepatitis (Fig. 8b), suggesting that the TANC1 gene is related to liver disease. To date, the biochemical properties of TANC1 proteins remain largely unknown, and there is currently no research on the relationship between TANC1 methylation and diseases. Since methylated DNA is very stable 37 and can be detected in clinical blood samples 38,39, it is a promising target for use in disease diagnostics.
Taken together, our results indicate that DNA methylation is a promising diagnostic target for ADLI.
Our study has a number of limitations. First, our sample size was relatively small, with limited power; hence, future studies should investigate whether the CpG sites identified in this study can be replicated in an independent population. In addition, given the large number of dmCpGs found and their distribution across different genomic locations, we were unable to validate all of the dmCpGs. Therefore, we only performed mRNA and pyrosequencing verification of dmCpGs in the TSS1500 region. The dmCpGs located in other regulatory regions will need to be validated in future studies. Finally, although we identified two differentially methylated genes, LOC100507140 and TANC1, we did not study their possible mechanisms of action further. Despite these limitations, this study is the first to use the EPIC BeadChip array platform to identify the DNA methylation signatures associated with ADLI.
In summary, the distinctive differences in DNA methylation patterns between ADLI and non-ADLI patients were the main finding of our study. The expression of the differentially methylated genes LOC100507140 and TANC1 was found to vary significantly during the occurrence of ADLI. More importantly, we found that the combination of the hypermethylated sites cg06961147 and cg24666046 in TANC1 provides a potential target for the diagnosis of ADLI.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Fig. S1) were significantly enriched in the above 5 pathways. The image was generated through the use of IPA (QIAGEN Inc. software version 65367011, https://www.qiagenbioinformatics.com/products/ingenuity-pathway-analysis) 23.