Genome-wide identification and characterization of circular RNA in resected hepatocellular carcinoma and background liver tissue

Circular RNA (circRNA) is a type of non-coding RNA known to affect cancer-related micro RNAs and various transcription factors. circRNA has promise as a cancer-related biomarker because its circular structure affords high stability. We found using high-throughput sequencing that seven candidate circRNAs (hsa_circ_0041150, hsa_circ_0025624, hsa_circ_0001020, hsa_circ_0028129, hsa_circ_0008558, hsa_circ_0036683, hsa_circ_0058087) were downregulated in HCC. The expression of these circRNAs was examined by quantitative PCR in 233 sets of HCC and matched background normal liver tissues, and correlations between candidate circRNA expression and prognosis were evaluated. The results of quantitative PCR showed that expression of hsa_circ_0041150, hsa_circ_0001020 and hsa_circ_0008558 was significantly lower in HCC than in background normal liver tissues. Kaplan–Meier analysis revealed that low expression of hsa_circ_0001020, hsa_circ_0036683, and hsa_circ_0058087 was associated with poor recurrence-free (RFS) and overall survival (OS) in HCC. Additionally, multivariate analysis revealed that low hsa_circ_0036683 expression was a significant prognostic factor, independent from other clinicopathological features, for inferior RFS and OS. There was no significant association between the expression of these circRNAs and hepatitis B/C status or cirrhosis. This study therefore identified circRNAs as potential prognostic markers for patients who undergo curative surgery for HCC and highlighted hsa_circ_0036683 as the most useful biomarker.


Results
Identification of a circRNA signature for HCC and background liver tissues using high throughput sequencing. To develop a circRNA signature specific to HCC tumour and background normal liver tissue, we first interrogated high throughput sequencing (HTS)-based circRNA expression profiles for four cases of resected HCC. This genome-wide circRNA analysis identified 15 circRNAs that were differentially expressed between HCC and background normal liver tissue, omitting circRNAs with null expression in any of the four samples. A heatmap of the 15-circRNA signature is shown in Fig. 1, together with hierarchical clustering and principal component analysis. Eight of the 15 circRNAs corresponded to known hsa_circ IDs and we successfully generated primers for seven of these eight using CircPrimer software for the subsequent validation of the HTS data using qPCR 19 . The IDs for the seven circRNAs were hsa_circ_0041150, hsa_circ_0025624, hsa_ circ_0001020, hsa_circ_0028129, hsa_circ_0008558, hsa_circ_0036683, and hsa_circ_0058087. According to the HTS data, these circRNAs were all expressed at a lower level in HCC tissues, when compared with paired background normal liver tissues (absolute log2 fold change > 2.0, p < 0.05, Student's t-test).
qPCR-based validation of HTS data using HCC patient samples. HCC and background normal liver tissues taken from 233 HCC patients who underwent liver resection, were analyzed by qPCR to determine the expression of the seven candidate circRNAs identified in the HTS analysis (hsa_circ_0041150, hsa_ circ_0025624, hsa_circ_0001020, hsa_circ_0028129, hsa_circ_0008558, hsa_circ_0036683, hsa_circ_0058087). Expression profiles for these circRNAs in the collected paired samples are shown in Fig. 2. Among these seven circRNAs, the expression of hsa_circ_0041150, hsa_circ_0001020, and hsa_circ_0008558 was significantly lower Figure 1. Differential expression of circRNAs between HCC and background normal liver. The figure is prepared using R software (version 3.5.3, https ://www.r-proje ct.org/) 43  www.nature.com/scientificreports/ in HCC tissues than in background normal liver tissues (p < 0.001 for all). These five circRNAs were therefore found to be expressed at a consistently lower level in HCC tissues, when compared with normal background tissues, in both the HTS and qPCR analyses. There was no correlation between the expression of these candidate cirRNAs in normal and tumour samples ( Supplementary Fig. 1).
circRNA expression according to background liver status. Expression of the seven candidate cir-cRNAs according to hepatitis virus or liver cirrhosis status is shown in Supplementary Fig. 2. Regarding background normal tissues, hsa_circ_0025624 was lower in HCV-positive patients than in HBV-positive patients (p = 0.047), hsa_circ_0008558 was lower in HCV-positive patients than in HBV-positive or HBV/HCV-negative patients (p = 0.012 and < 0.001, respectively), and hsa_circ_0036683 was lower in HCV-positive patients than in HBV/HCV-negative patients (p = 0.042). Regarding HCC tissues, hsa_circ_0041150 was lower in HBV/ HCV-negative patients than in HCV-positive patients (p = 0.020), and hsa_circ_0058087 was higher in HBV/ HCV-negative patients than in HBV or HCV-positive patients (p = 0.032 and 0.017, respectively). In HBV-positive patients, hsa_circ_0041150, hsa_circ_0001020, hsa_circ_0028129 and hsa_circ_0008558 were all lower in HCC tissues than in background normal tissues (p < 0.001 for all). In HCV-positive patients, hsa_circ_0041150, hsa_circ_0001020, hsa_circ_0028129, and hsa_circ_0008558 were lower in HCC tissues than in background normal tissues (p < 0.001 for all). In HBV/HCV-negative patients, hsa_circ_0041150, hsa_circ_0001020, hsa_circ_0028129, and hsa_circ_0008558 were all lower in HCC tissues than in background normal tissues (p < 0.001, p < 0.001, p = 0.001 and p < 0.001, respectively). hsa_circ_0001020, hsa_circ_0008558 and hsa_circ_0036683 expression in background normal tissues was lower in patients with liver cirrhosis than in patients without cirrhosis (p = 0.010, p < 0.001 and p = 0.002, respectively), while expression in HCC tissues was independent of cirrhosis status. hsa_circ_0041150, hsa_ circ_0025624, hsa_circ_0001020, hsa_circ_0028129, hsa_circ_0008558, and hsa_circ_0058087 were all lower in HCC tissues than in background normal tissues in patients with liver cirrhosis (p < 0.001, p = 0.005, p < 0.001, p < 0.001, p < 0.001 and p = 0.036, respectively). hsa_circ_0041150 hsa_circ_0001020, hsa_circ_0028129, and hsa_circ_0008558 were lower in HCC tissues than in background normal tissues in patients without liver cirrhosis (p < 0.001 for all).
circRNAs are predictive of HCC patient prognosis. Analysis of recurrence free-survival (RFS) and overall survival (OS) indicated that low expression of hsa_circ_0001020, hsa_circ_0036683, and hsa_ circ_0058087 in HCC tissues was associated with a significantly inferior RFS ( Fig. 3  When the clinical features of the 233 HCC cases were stratified according to circRNA expression, hsa_circ_0001020 was found to be significantly associated with serum alpha fetoprotein (AFP) levels (p = 0.003), hsa_circ_0036683 (p = 0.006) and hsa_circ_0001020 (p = 0.004) with tumour size, hsa_circ_0001020 (p = 0.008) with portal vein or hepatic vein invasion, and hsa_circ_0001020 (p = 0.010) with pathological stage (Table 1).

Univariate and multivariate analysis of prognostic factors associated with RFS and OS in HCC.
Univariate analysis revealed that pathological stage (≥ III), hsa_circ_0001020 expression (low), hsa_ circ_0036683 expression (low), and hsa_circ_0058087 expression (low) were all significant predictors of inferior RFS. When multivariate analysis was performed hsa_circ_0036683 expression (low) and hsa_circ_0058087 expression (low) were identified as independent predictive factors for inferior RFS (Table 2). Regarding OS, univariate analysis revealed that AFP (≥ 20 ng/dl), pathological stage (≥ III), hsa_circ_0001020 expression (low), hsa_circ_0036683 expression (low), and hsa_circ_0058087 expression (low) were all significant predictors of worse OS. When multivariate analysis was performed on these predictors, AFP (≥ 20 ng/dl), and hsa_circ_0036683 expression (low) were identified as independent predictive factors for worse OS (Table 3).
Synergistic effect of hsa_circ_0036683 on alpha fetoprotein, as a biomarker. Supplementary   Fig. 3 shows the results of ROC analysis of RFS and OS 2 years after surgery. Regarding RFS, the combination of hsa_circ_0036683 and AFP had a higher AUC (0.63) than hsa_circ_0036683 (0.57) or AFP alone (0.59). In addition, for OS, the combination of hsa_circ_0036683 and AFP had a higher AUC (0.69) than hsa_circ_0036683 (0.62) or AFP alone (0.62).

Discussion
RNA deep sequencing technology has revealed roles for ncRNAs in the progression of several diseases, including hepatitis, cirrhosis and liver cancer 20 . circRNA is a class of RNA that exhibits a unique single-stranded circular structure that is formed when the upstream 3′ splice site and downstream 5′ splice site join to form a covalently closed and stable continuous loop. circRNAs have been implicated in many important biological processes 21 . In  www.nature.com/scientificreports/ this study, a genome-wide approach employing HTS was used to evaluate circRNA expression in HCC. Seven candidate circRNAs identified in the initial HTS analysis were subsequently examined in 233 sets of HCC samples to evaluate the relationship between expression and prognosis. The HTS data indicated that these seven circRNAs were all expressed at a lower level in HCC tissues, when compared with background normal tissues. Subsequent qPCR analysis using matched patient samples showed that hsa_circ_0041150, hsa_circ_0025624, hsa_circ_0001020, hsa_circ_0008558, and hsa_circ_0036683 were all expressed at a significantly lower level in HCC tissues, when compared with background normal tissues. Regarding prognosis, hsa_circ_0001020, hsa_ circ_0036683, and hsa_circ_0058087 were all associated with worse RFS and OS. Moreover, multivariate analysis revealed that low hsa_circ_0036683 expression was an independent prognostic factor for worse RFS and OS. Several circRNAs have been reported to be involved in HCC with some demonstrating elevated or reduced expression, suggesting that they may have utility as biomarkers in this disease 22,23 . It has been reported that circR-NAs may contribute to the pathology of HCC through their role as micro RNA-sponges, protein-sponges, micro RNA transporters and as regulators of parental gene expression [22][23][24] . The altered expression of the seven circR-NAs examined in this study has not previously been reported for HCC, although changes in hsa_circ_0041150, hsa_circ_0028129, and hsa_circ_0008558 expression have been reported in other carcinomas. An analysis of 20 sets of pancreatic ductal adenocarcinoma samples using microarray and qPCR revealed that hsa_circ_0041150 is downregulated in this disease 25 . Microarray data indicate that hsa_circ_0028129 is downregulated in colorectal cancer 26 , and that hsa_circ_0008558 is upregulated in both bladder cancer and oral mucosal melanoma 27,28 . Our study therefore demonstrates that these three circRNAs are implicated in the regulation of HCC in addition to these other carcinoma types.
Previous studies have examined the relationship between circRNA expression and background liver status 29 . Zhou et al. showed that expression profiles for hepatic circRNAs were significantly different between chronic hepatitis B (CHB) and normal hepatic tissues, with 99 dysregulated circRNAs identified in total 29 . Ji et al., reported the involvement of circ_0070963 in liver fibrosis 30 . Unexpectedly, this study did not reveal any significant association between circRNA expression and HBV, HCV, or cirrhosis status.
In addition to our study, others have also examined the relationship between circRNA expression and patient prognosis in HCC using qPCR. In a study of 70-200 HCC patients, high expression of circ_0021093, circ_0008450, circ_0128298, circ_0003998 and hsa_circ_0006916 was associated with unfavorable prognosis, as determined using Kaplan-Meier and multivariate analysis [31][32][33][34][35] . Low circ_0000567 and circ-ITCH expression was predictive of poor prognosis in a study of 134-288 HCC patients, as determined using Kaplan-Meier analysis 36,37 . In our study of 233 HCC patients we now demonstrate, using both Kaplan-Meier and multivariate analysis, that decreased expression of hsa_circ_0001020, hsa_circ_0036683, and hsa_circ_0058087 is predictive of poor disease prognosis. www.nature.com/scientificreports/ Although we have identified several circRNAs as useful biomarkers and prognostic predictors for HCC, there are some limitations to this study. Firstly, the study cohort consisted of individuals from a single institution. Secondly, we did not investigate the underlying mechanisms of altered circRNA expression and how these changes may impact on prognosis by identifying potential gene or microRNA targets. This study simply evaluated circRNA expression in HCC and normal background liver tissues to examine its utility as a biomarker and prognostic factor in clinical practice.
In conclusion, the circRNAs hsa_circ_0041150, hsa_circ_0025624, hsa_circ_0001020, hsa_circ_0028129, hsa_circ_0008558, hsa_circ_0036683, and hsa_circ_0058087 were all associated with prognosis in HCC. In particular, hsa_circ_0036683, which was found to be an independent and significant prognostic factor for HCC, has potential utility as a candidate biomarker for this disease. Moreover, we believe that further functional analysis of these circRNAs may identify novel therapeutic targets for the treatment of this disease.

Methods
Patients and samples. A total of 233 frozen tumour specimens and paired para-tumour normal background liver tissues were collected from patients with HCC who underwent surgery at Nagoya University Hospital between January 1998 and January 2012. All fresh tissues were immediately frozen in liquid nitrogen and stored at − 80 °C until required. Patient characteristics are summarized in Table 1 RNA isolation and genome-wide high-throughput sequencing. Total RNA was extracted from tissue samples using the Qiagen miRNeasy mini-kit (Qiagen, Hilden, Germany). Approximately 10 μg total RNA was then subject to ribosomal RNA depletion using the Ribo-Zero Gold Kit, as per the manufacturer's instructions (Illumina, San Diego, CA, USA). The RNA fragments were then reverse-transcribed to create the final cDNA library using the ncRNA-Seq sample preparation kit (Illumina) according to the manufacturer's recommended protocol. The prepared libraries were then sequenced on an Illumina Hiseq X ten platform (Illumina) and 2 × 100-bp paired-end reads (PE100) were generated according to the standard Illumina protocol. All procedures for circRNA sequencing were performed by BGI Genomics Services (Beijing, China). Sequencing reads containing low-quality, adaptor-polluted and a high content of unknown base (N) reads were removed using QCleaner v4.0.1 (Amelieff, Tokyo, Japan) before downstream analysis.
Annotation of human circRNAs. CIRI 39 was used to predict circRNA in this project and to integrate results based on the start and end position of circRNA. Burrows-Wheeler Aligner software (BWA 0.7.17, http:// bio-bwa.sourc eforg e.net/) 40 was used to align discovered reads to the hg38 reference genome. circRNAs that had the same start and end position within 10 bases were assigned to the same class. If a circRNA was recorded in the circBase (http://www.circb ase.org/cgi-bin/downl oads.cgi), the corresponding ID code was provided. circRNA was considered novel if it did not overlap with any registered circRNA in circBase.
Estimation of tumour-specific circRNA and differential expression analysis. circRNAs detected in only HCC samples were considered to be tumour-specific. Tumour-specific circRNAs with a number of reads ≥ 6 were used in down-stream analyses. Differential expression analysis of circRNAs between HCC and background normal liver tissues was conducted using the limma package (3.38.3, https ://bioco nduct or.org/ packa ges/relea se/bioc/html/limma .html) 41 . The read count for circRNA was logarithmically transformed (log2 [count + 1]) and established linear models were assessed using the empirical Bayes method. The acquired p-value was adjusted by the Benjamini-Hochberg method.
Validation of circRNA expression by qPCR. Total RNA was extracted from tissue samples using the Qiagen miRNeasy mini-kit and then converted to complementary DNA using M-MLV Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) for subsequent use in qPCR assays. PCR was performed using SYBR Premix Ex Taq II (Takara Clontech, Kyoto, Japan) under the following conditions: denaturing at 95 °C for 10 s, 40 cycles of denaturing at 95 °C for 5 s, and annealing/extension at 60 °C for 30 s. The SYBR Green signal was detected in real-time using a StepOne Plus real-time PCR System (Life Technologies, Carlsbad, CA, USA). The relative quantification method was used where the expression level of each gene in a sample was determined after normalization to the housekeeping control glyceraldehyde-3-phosphate dehydrogenase (GAPDH). Relative gene expression levels were determined using the comparative threshold cycle (2-ΔΔCT) method. To design PCR primers for circRNA targets, templates were generated using CircPrimer (Ver.1.2, https ://www.bioin f.com.cn/) 19 and divergent primers were designed with primer3 (ver.0.4.0, https ://bioin fo.ut.ee/prime r3-0.4.0/) 42 . All qPCR experiments were performed in duplicate, including the template-omitted negative controls. Statistical analysis. Continuous variables were expressed as the median and range, and gene expression comparisons were performed using the Mann-Whitney U, Wilcoxon singed rank, Kruskal-Wallis or Steel-