Main

During recent years, large-scale gene expression profiling has provided important insights into the biology of hepatocellular carcinomas. The results provided new information on classification, aetiology, survival prediction, and identification of signalling pathways that could serve as therapeutic targets.1, 2, 3, 4, 5 Functional genomics integrating comparison with genetically modified mice as models for human hepatocellular carcinoma6 as well as data from promoter regions,7 expression of non-coding genes, that is, microRNA,8 or array-based comparative genomic hybridisation (aCGH)9 has further increased the reliability and significance of the biological and clinical conclusions drawn from gene expression profiles. This will be the basis for developing new targeted therapies, an urgent need to reduce the mortality from hepatocellular carcinoma, which represents the fourth most common malignant tumour with more than one million patients affected worldwide per year, and usually has a very poor prognosis.10

Recently, we have shown that hepatocellular carcinoma is characterised by chromosomal instability.11 In contrast, hepatocellular adenoma, a rare benign tumour occurring mainly in premenopausal women after oral contraceptive use,12 shows a low rate of chromosomal imbalance. This qualifies hepatocellular adenoma as a comparison group since the surrounding tissue in livers harbouring hepatocellular carcinoma often contains significant alterations in gene expression and on the genomic level similar to those found in hepatocellular carcinoma.13, 14, 15

During dedifferentiation of hepatocellular carcinoma, chromosomal instability increases from differentiated G1 to undifferentiated G3 tumours. Recurrent chromosome aberrations of hepatocellular carcinoma are gains of 1q, 8q, 6p, and 20q and losses of 4q, 8p, 13q, and 17p. Gains of 1q are by far the most frequent aberrations and occur early during tumorigenesis.11, 16

Here we aimed to verify recent data suggesting that hepatocellular adenoma can be separated from hepatocellular carcinoma based on gene expression profiles, and to identify candidate driver genes in gained or amplified regions, in particular in 1q. For this purpose, 32 hepatocellular adenoma and hepatocellular carcinoma, cytogenetically characterised by aCGH, were analysed for gene expression using a whole genome microarray. Alterations in gene expression were then set in correlation to genomic imbalances using novel bioinformatic models that enabled us to dissect the overexpressed genes in amplified regions.

Materials and methods

Materials

Analysis was carried out on tumour specimens of 24 patients suffering from hepatocellular carcinoma and of 8 patients with hepatocellular adenoma. All patients were treated at Hannover Medical School. Clinical, serological, and pathomorphological data are given in Table 1. Tumours were obtained as permitted by the Ethics Committee of the Hannover Medical School from surgical specimens taken for diagnostic purposes. Unfixed tissue samples were snap-frozen within 30 min after resection and stored at −80°C.

Table 1 Clinical, morphological, and serological data of the 32 patients analysed

A piece of tissue was cut from the tumour for RNA extraction and another piece of tissue was fixed in formalin and was paraffin-embedded for additional histological examination. This ensured the closest proximity and homogeneity of the tissues taken for the different experiments.

For all tumours included in this study, cytogenetic characterisation was already carried out using aCGH with detailed information given in an earlier study.11 The karyotypes obtained by this approach are listed in Table 2. In brief, the most commonly gained region was 1q22.1–23.1 in 22 out of 24 hepatocellular carcinoma (92%). With a decreasing number of tumours affected, a gain of 8q was observed in 14 hepatocellular carcinoma (58%), losses of 8p in 12 hepatocellular carcinoma (50%), losses of 4q in nine cases (38%), losses of 13q in eight cases (33%), losses of 6q in eight cases (33%), gains of 6p in eight cases (33%), losses of 17p in eight cases (33%), and gains of 20q in eight cases (33%). In the group of hepatocellular adenoma, sample N103 (13%) showed a gain of 1q22–q23.1 as also seen in hepatocellular carcinoma. In this case, adenomatosis hepatis was diagnosed because of more than 20 separate tumour nodules. In tumours N103 and N48, a loss of 8p was found. A gain of 20q13.33–20qter was observed in six out of eight cases (75%). Five hepatocellular adenoma revealed a gain of the small region 22q12.3–q13.1. These two regions were rarely affected in hepatocellular carcinoma.

Table 2 List of the chromosome bands containing sets of genes significantly upregulated (P<0.01) in HCC or in HCA (complete data are given in Supplementary Table 2)

Methods

cDNA Microarray Analysis

Total RNA was isolated using Trizol (Invitrogen, Karlsruhe, Germany) followed by RNeasy Mini Kit (Qiagen, Hilden, Germany) and linearly amplified (MessageAmp aRNA kit, Ambion, Huntingdon, UK). Amplified RNA from each tumour and from the human reference (Universal Human Reference RNA, Stratagene, La Jolla, CA, USA), 1.5 μg each, was labelled by reverse transcription with Cy3 (reference) and Cy5 (tumours) (Amersham Biosciences, Little Chalfont, UK). Labelled sample and reference RNA were comparatively hybridised overnight onto a cDNA chip containing the ‘Resgen clone set’ with more than 43 000 spots covering the entire human genome (Stanford Functional Genomics Facility, Stanford, CA, USA) according to the posted protocol (http://brownlab.Stanford.edu/protocols.html). Arrays were imaged using a GenePix 4000B scanner and the accompanying software (GenePix Pro 4.1, Axon Instruments, Ismaning, Germany). For normalisation of each array, data were imported to the Stanford Microarray Database (available at http://smd.stanford.edu). Data were excluded if the regression correlation was less than 0.6 or the ratio of mean spot intensity to median background intensity was less than 1.5. The well-measured genes were subsequently ‘mean-centred’ (i.e., reported for each gene relative to the mean ratio across all samples, rather than relative to the control spots).
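For illustration, the two quality filters and the mean-centring step can be sketched in Python as follows. This is a minimal sketch and not the Stanford Microarray Database implementation: the function names and the data layout are hypothetical, and applying the centring to log ratios is an assumption. Only the two thresholds (0.6 and 1.5) are taken from the text.

```python
from statistics import mean

# Thresholds quoted in the text; everything else here is an assumed layout.
MIN_REGRESSION_CORRELATION = 0.6
MIN_SIGNAL_TO_BACKGROUND = 1.5


def spot_is_well_measured(regression_corr, mean_intensity, median_background):
    """Return True if a spot passes both quality filters described in the text."""
    if regression_corr < MIN_REGRESSION_CORRELATION:
        return False
    if median_background <= 0:
        # Assumption: a non-positive background cannot be evaluated, so reject.
        return False
    return mean_intensity / median_background >= MIN_SIGNAL_TO_BACKGROUND


def mean_centre(log_ratios):
    """Subtract a gene's mean log ratio across samples from each of its values.

    Missing measurements are given as None and are returned as None.
    """
    measured = [v for v in log_ratios if v is not None]
    if not measured:
        return list(log_ratios)
    centre = mean(measured)
    return [None if v is None else v - centre for v in log_ratios]
```

After centring, each value expresses a gene's expression in one sample relative to that gene's average over all samples, which is the quantity the clustering steps operate on.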

Normalised and log-transformed expression data served as input for significance analysis of microarray with standard parameters given in the literature.17

For detection of modest but coordinate changes in gene sets that share a common chromosomal location, gene set enrichment analysis was used based on the C1 and C2 databases (version 1.0, March 2005) and the software package developed by Subramanian et al.18

RT-PCR

To confirm the array results, quantitative RT-PCR was performed on the selected genes CCT3 and CABC1, found upregulated, and CDKN1B, found downregulated, in hepatocellular carcinoma in comparison to hepatocellular adenoma.

In a total volume of 120 μl, 1 μg of total RNA was reverse transcribed by M-MuLV reverse transcriptase using random hexamer primers (RevertAid First Strand cDNA synthesis kit; Fermentas, St-Leon-Rot, Germany). Two microliters of each cDNA sample were amplified for CCT3 on an iCycler iQ real-time detection system (Bio-rad Laboratories, München, Germany) using a QuantiTect SYBR Green PCR kit (Qiagen, Hilden, Germany). Five microliters of each cDNA sample were used for the analysis of CABC1 and CDKN1B expression, which was performed on the same iCycler iQ real-time detection system. Primers and probes for CABC1 and CDKN1B were synthesised by TibMolBiol (Berlin, Germany). Detailed PCR conditions and primer sequences for all genes used are shown in Table 3. β-Actin was used as internal control gene. For calibration the human reference (Stratagene, Amsterdam, The Netherlands) was used. Relative quantification of gene expression was determined according to the ΔΔCT method using the iCycler iQ real-time detection system software version 3.1 (Bio-rad Laboratories). All samples were analysed in triplicate.
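The ΔΔCT calculation can be sketched as follows. This is a minimal illustration of the commonly used form of the method, in which the target gene is normalised to the control gene (here β-actin) and then to the calibrator (here the human reference); it is not the iCycler iQ software's own routine, whose internals are not described here. The function names are hypothetical, and the conversion to a fold change assumes an amplification efficiency of 100%.

```python
from statistics import mean


def delta_delta_ct(target_ct_sample, control_ct_sample,
                   target_ct_calibrator, control_ct_calibrator):
    """Return the delta-delta-CT value from replicate CT measurements.

    Each argument is a list of CT values (e.g. a triplicate).
    """
    delta_sample = mean(target_ct_sample) - mean(control_ct_sample)
    delta_calibrator = mean(target_ct_calibrator) - mean(control_ct_calibrator)
    return delta_sample - delta_calibrator


def relative_expression(ddct):
    """Fold change versus the calibrator, assuming 100% PCR efficiency."""
    return 2.0 ** (-ddct)
```

With this sign convention, a negative ΔΔCT corresponds to higher expression in the sample than in the calibrator; other conventions report the fold change directly.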

Table 3 Sequences of primers, probes, and cycler conditions used for verification of array results by RT-PCR

Results

Gene Expression Analysis of Hepatocellular Adenoma Compared to Hepatocellular Carcinoma

Evaluating microarray data from 24 hepatocellular carcinoma and eight hepatocellular adenoma, 7733 cDNA clones with the most variable expression among tumour and reference DNA were selected. A hierarchical clustering algorithm based on the (centred) Pearson's correlation coefficient was applied to group the genes, as well as the samples, on the basis of similarity in their expression pattern. This analysis revealed two major clusters of gene expression pattern (Figure 1), one representing hepatocellular carcinoma tumours and the other representing hepatocellular adenoma tumours, with only one exception (N37).
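To make the clustering principle concrete, the following sketch groups samples by agglomerative clustering on the distance 1 − r, where r is the centred Pearson correlation between two expression profiles. The linkage rule is not stated above, so average linkage is an assumption, and the function names and data layout are hypothetical. Profiles are assumed to be complete (no missing values) and not constant.

```python
from math import sqrt


def pearson(x, y):
    """Centred Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters=2):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    profiles: dict mapping a sample name to its expression profile (list).
    Returns a sorted list of sorted lists of sample names.
    """
    names = list(profiles)
    # Pairwise correlation distance between all samples.
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs,
                   key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)
```

Cutting the tree at two clusters mirrors the observation of two major sample groups; the same routine applied to the transposed matrix would group genes instead of samples.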

Figure 1

Subset of the heat map using hierarchical clustering based on significance analysis of microarray. Upregulated genes are marked in red, downregulated genes in green. As seen at the top, annotations of hepatocellular adenoma are printed in green, hepatocellular carcinoma in black. Hierarchical clustering including 7733 genes informative for this approach branched all hepatocellular adenoma from hepatocellular carcinoma with only one hepatocellular carcinoma sample (N37) occurring in the cluster of hepatocellular adenoma. Conversely, no hepatocellular adenoma was seen in the group of hepatocellular carcinoma. The detailed search for a clustered subset of genes includes four genes upregulated in hepatocellular carcinoma and localised in 1q22 (CCT3, SCAMP3, ApoA1BP, and PYGO2), indicating the importance of these genes in the definition of hepatocellular carcinoma by expression analysis.

We then used significance analysis of microarray to identify those genes significantly deregulated. By this approach, 171 genes were expressed at a higher level and 551 genes were expressed at a lower level in hepatocellular carcinoma in comparison to hepatocellular adenoma, as listed in detail in Supplementary Table 1. Comparison to the list of genes most significantly dysregulated in hepatocellular carcinoma as reported by Chen et al19 revealed nine genes detected in both studies.

Genes Commonly Upregulated in Amplified Regions

As the second mathematical approach, gene set enrichment analysis was applied. Gene set enrichment analysis evaluates microarray data not at the level of single genes but at the level of gene sets, that is, groups of genes that share a common biological function, regulation, or chromosomal location. As the basic parameter, sets of genes were analysed with regard to their localisation in the genome, that is, their chromosome band annotation.
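The idea behind the enrichment score can be illustrated with a simplified running-sum statistic. The sketch below is an unweighted variant (each gene-set member contributes equally), not the weighted statistic of the software package used in this study, and it omits the permutation step by which gene set enrichment analysis estimates significance. The function name is hypothetical, and the ranked list is assumed to contain each gene once.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of an unweighted running-sum statistic.

    ranked_genes: gene names ordered from most up- to most downregulated.
    gene_set: names of genes sharing, for example, one chromosome band.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hits      # step up at a gene-set member
        else:
            running -= 1.0 / n_misses    # step down at any other gene
        if abs(running) > abs(best):
            best = running
    return best
```

A score near +1 indicates that the members of a band cluster at the top of the ranked list (coordinately upregulated), and a score near −1 that they cluster at the bottom.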

First, upregulated genes were determined in hepatocellular carcinoma in comparison to hepatocellular adenoma. Second, downregulated genes in hepatocellular carcinoma were identified. Six sets of genes significantly upregulated in hepatocellular carcinoma (P<0.01) were localised in chromosome bands 1q22 (37 genes), 1q32 (88 genes), 1q41 (14 genes), 1q42 (46 genes), 2p22 (21 genes), and Yp11 (14 genes), respectively (Figure 2a, Supplementary Table 2). However, since chromosome band 2p22 was found altered in dedifferentiated but not in well-differentiated hepatocellular carcinoma, this gene set was excluded from further evaluation to avoid a systematic bias with regard to the dedifferentiation of hepatocellular carcinoma. We also excluded Yp11, because 6 out of 8 patients suffering from hepatocellular adenoma were females, whereas 21 out of 24 patients with hepatocellular carcinoma were males, resulting in an expected bias. Nevertheless, this finding demonstrates the reliability of the technique. All genes dysregulated in the remaining chromosome bands are named in Table 4.

Figure 2

(a) Gene set enrichment analysis for comparison of hepatocellular carcinoma with hepatocellular adenoma. Four chromosome bands were found to contain highly significant (P<0.01) upregulated gene sets in hepatocellular carcinoma. All these four chromosome bands are localised in 1q (1q22 (shown here), 1q32, 1q41, and 1q42) known as most frequently gained in hepatocellular carcinoma with 1q22 aberrant in 92% of cases in aCGH. Visualisation of the results is carried out using a gene list index (left), null distribution graph (middle), and heat map (right) with downregulated genes marked in blue and upregulated genes marked in red. (b) Gene set enrichment analysis data obtained when comparing hepatocellular adenoma with hepatocellular carcinoma. Three bands (11p15 (shown here), 11q12, and 12p13) were found.

Table 4 List of chromosome bands and their annotated gene sets found upregulated most significantly by gene set enrichment analysis in hepatocellular carcinoma and adenoma, respectively

Moreover, 10 gene sets commonly downregulated in hepatocellular carcinoma were identified as significantly deregulated (complete set of data is given in Supplementary Table 2). Seven gene sets were again localised at chromosome bands affected only in dedifferentiated hepatocellular carcinoma and were therefore also excluded. The remaining three sets were localised in 11p15 (121 genes), 11q12 (62 genes), and 12p13 (83 genes), respectively (Figure 2b).

Comparison of the data obtained by gene set enrichment analysis and significance analysis of microarray showed that seven genes (SCAMP3, IQGAP3, PYGO2, GPATC4, ASH1L, APOA1BP, and CCT3) located in 1q22 were found to be significantly upregulated in both mathematical models (Table 5). For chromosome band 1q32, two out of three genes were found to be deregulated in both gene set enrichment analysis and significance analysis of microarray. For chromosome band 1q41, 1 out of 1 gene, for 1q42 5 out of 6 genes, for 11p15 10 out of 10 genes, for 11q12 8 out of 8 genes, and for 12p13 2 out of 8 genes were seen as deregulated in both mathematical models.
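The comparison of the two gene lists amounts to a set intersection per chromosome band, which can be sketched as follows. The function name is hypothetical, and the fraction is expressed relative to the genes found by significance analysis of microarray, which is an assumption about how the proportions were counted.

```python
def overlap_summary(sam_genes, gsea_genes):
    """Return the genes found by both methods and their share of the SAM list.

    sam_genes: genes called by significance analysis of microarray in one band.
    gsea_genes: genes of the enriched gene set for the same band.
    """
    sam = set(sam_genes)
    if not sam:
        return [], 0.0
    shared = sam & set(gsea_genes)
    return sorted(shared), len(shared) / len(sam)
```

Applied band by band, this yields proportions of the form "5 out of 6 genes" as reported above.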

Table 5 Comparison of gene set enrichment analysis and significance analysis of microarray results

According to the literature (detailed information given in: http://smd.stanford.edu/cgi-bin/source/sourceResult), the majority of the genes, that is, 24 out of 45 genes, have a function in the induction of proliferation and cell cycle control (Table 6).

Table 6 Genes found dysregulated in SAM and GSEA

Comparison of mRNA Expression Analysis and aCGH

The upregulation of genes may be induced by copy number changes. Therefore, we investigated whether the gene sets significantly upregulated in hepatocellular carcinoma in comparison to hepatocellular adenoma are located in gained or amplified regions. This was true for chromosome bands 1q22, 1q32, 1q41, and 1q42 that were gained in 22 out of 24 hepatocellular carcinoma analysed by both gene expression and aCGH analyses. In chromosome band 11p15, significance analysis of microarray and gene set enrichment analysis revealed a set of upregulated genes. However, only 4 out of 22 hepatocellular carcinoma demonstrated a gain of this chromosomal region. The same is true for chromosome band 11q12, where no chromosomal imbalances were found. For 12p12–13, three hepatocellular carcinoma showed a loss, whereas three hepatocellular carcinoma showed a gain of at least part of this region.

RT-PCR

Results obtained for mRNA expression analysis were confirmed, by way of example, for the genes CCT3, CABC1, and CDKN1B. CCT3 is localised on 1q22, CABC1 on 1q42, and CDKN1B on 12p13. RT-PCR confirmed the results of array analysis, revealing a systematic increase of mRNA expression of CCT3 and CABC1 and a decrease for CDKN1B, as also found by significance analysis of microarray and gene set enrichment analysis and summarised graphically in Figure 3 (raw data given in Supplementary Table 3).

Figure 3

Results obtained by global expression analysis were confirmed by RT-PCR, by way of example, for CCT3 and CABC1, two of the genes found upregulated in hepatocellular carcinoma by significance analysis of microarray as well as gene set enrichment analysis. Experiments were performed for those cases for which exactly the same lot of mRNA was available for RT-PCR as used in the array-based experiments. This was done to avoid differences in the results arising from the laboratory work. The increased gene expression on the mRNA level for these two genes was also detectable by RT-PCR, as demonstrated by significantly higher ΔΔCT values in hepatocellular carcinoma compared to hepatocellular adenoma. Conversely, CDKN1B was seen downregulated in hepatocellular carcinoma, as expected after array-based expression analysis.

Discussion

Array-based gene expression analysis has been established as a highly informative technique in hepatocellular carcinoma. Recent studies have focused on the correlation of mRNA expression with aetiology, clinicopathological findings, grading, and survival.20, 21, 22, 23 However, mRNA expression analysis is hampered, since in livers bearing hepatocellular carcinoma the non-neoplastic tissues already reveal distinct dysregulated sets of genes as also found in hepatocellular carcinoma.24, 25, 26 We therefore focused on the comparison of hepatocellular carcinoma and hepatocellular adenoma, a rare tumour, almost never reported as becoming malignant.

Until now, global gene expression analysis in hepatocellular adenoma has only been described in a study by Chen et al.27 Their aim was to define a set of diagnostically dysregulated genes separating hepatocellular adenoma from well-differentiated hepatocellular carcinoma. When comparing the results of Chen et al with those of our study, five genes were downregulated and 39 upregulated in hepatocellular carcinoma in the former study, whereas in our study, 171 genes were upregulated in hepatocellular carcinoma and 551 genes were upregulated in hepatocellular adenoma. This difference in the detection of significantly altered genes is a phenomenon also reported in other studies and has to be seen together with the different mathematical models used, the different patient cohorts examined, and the different experimental settings.28 Sample sets in particular are not directly comparable since our pool of hepatocellular carcinoma includes not only well differentiated but also poorly differentiated tumours. Furthermore, a sample size of six hepatocellular adenoma vs eight well-differentiated hepatocellular carcinoma is borderline for achieving statistically significant results at the level of single genes.

Nevertheless, at least nine genes were dysregulated concordantly in both studies, including genes for metabolism (ALDH2, ADK, SLC7A2, and CYP4V2), hormonal receptor (ESR1), complement activation (C8A), DNA repair (PCNA), and cell signalling (PTPN3), which is important in cell cycle regulation.

As a further mathematical attempt to better understand the overwhelming data obtained, gene set enrichment analysis was performed. This statistical model was chosen as a second approach to detect coordinate changes on gene sets defined as groups of genes that share common biological function, regulation, or chromosomal location.

In our study, we focused on the evaluation of gene sets defined by their location on distinct chromosome bands because of the known recurrent structural chromosome imbalances reported for hepatocellular carcinoma, but not for hepatocellular adenoma. When performing gene set enrichment analysis to look for gene sets upregulated in hepatocellular adenoma in comparison to hepatocellular carcinoma, 10 sets were found with significant P-values <0.01. However, evaluation was impaired, since seven of these chromosome bands were found altered in dedifferentiated hepatocellular carcinoma, but not in well-differentiated hepatocellular carcinoma. To avoid a systematic bias, these gene sets were excluded from further analysis.

Four sets of upregulated genes are located in 1q. The remaining three gene sets are located in chromosome bands found aberrant by aCGH in only a minority of cases. Therefore, it seems unlikely that gene copy number changes have induced the overexpression of these genes, as shown for 1q. The gene sets in 1q contained 37, 88, 14, and 46 genes within chromosome bands 1q22, 1q32, 1q41, and 1q42, respectively. Reliability of these results was underscored by comparing gene set enrichment analysis and significance analysis of microarray. Thirty-nine out of 44 (89%) upregulated genes identified by significance analysis of microarray were also detected by gene set enrichment analysis. Most interestingly, a gain of 1q has been reported as the most frequent imbalance (for review see progentix.com29) and is known as the chromosomal imbalance separating hepatocellular carcinoma from hepatocellular adenoma most significantly.16

In particular, chromosome band 1q22 was described as the most commonly gained region in hepatocellular carcinoma. Within this small region seven genes were detected in both mathematical approaches, significance analysis of microarray and gene set enrichment analysis. This group of genes comprises SCAMP3, IQGAP3, PYGO2, GPATC4, ASH1L, APOA1BP, and CCT3. Most of these genes play a role in proliferation and cell cycle control. Members of the IQGAP protein family are integral components of cytoskeletal regulation.30 They are known to act on Ca2+/calmodulin signalling, cytoskeletal architecture, and CDC42 and Rac signalling. Furthermore, they are part of E-cadherin-mediated cell–cell adhesion and β-catenin-mediated transcription control.31 HCCA2 contains two Src homology 3 (SH3) domains, indicating a function in intracellular signal transduction. PYGO2 is a member of the canonical Wnt pathway and, after recruitment to nuclear β-catenin, permits transcriptional activation of WNT target genes.32 The function of the secretory carrier membrane protein SCAMP3 has not yet been characterised in detail. Singleton et al33 found a ubiquitous distribution in nearly all tissues examined. They discuss a possible function of this gene in trafficking processes, probably of the post-Golgi apparatus. CCT3 is a chaperone playing an important role in the folding of cytoskeletal components.34 Elevated CCT3 expression possibly impairs correct folding and assembly of complex proteins. For CCT3, significant overexpression in hepatocellular carcinoma has been reported by Wong et al.35

In conclusion, gene expression profiling suggests a coordinated upregulation of genes localised in amplified chromosome regions, in particular of 1q. This underlines the importance of this genomic region for the development of hepatocellular carcinoma regardless of the aetiological factors as already recognised on the genomic level. Therefore, further attempts should be made to narrow down this number of candidate driver genes in carcinogenesis of hepatocellular carcinoma.