Investigation of chromosome 1q reveals differential expression of members of the S100 family in clinical subgroups of intracranial paediatric ependymoma

Gain of 1q is one of the most common alterations in cancer and has been associated with adverse clinical behaviour in ependymoma. The aim of this study was to investigate this region to gain insight into the role of 1q genes in intracranial paediatric ependymoma. To address this issue we generated profiles of eleven ependymoma, including two relapse pairs and seven primary tumours, using comparative genome hybridisation and serial analysis of gene expression. Analysis of 656 SAGE tags mapping to 1q identified CHI3L1 and S100A10 as the most upregulated genes in the relapse pair with de novo 1q gain upon recurrence. Moreover, three more members of the S100 family had distinct gene expression profiles in ependymoma. Candidates (CHI3L1, S100A10, S100A4, S100A6 and S100A2) were validated using immunohistochemistry on a tissue microarray of 74 paediatric ependymoma. In necrotic cases, CHI3L1 demonstrated a distinct staining pattern in tumour cells adjacent to the areas of necrosis. S100A6 significantly correlated with supratentorial tumours (P<0.001) and S100A4 with patients under the age of 3 years at diagnosis (P=0.038). In conclusion, this study provides evidence that S100A6 and S100A4 are differentially expressed in clinically relevant subgroups, and also demonstrates a link between CHI3L1 protein expression and necrosis in intracranial paediatric ependymoma.

Identification of cancer-specific molecular alterations has had a major impact on understanding the biology of cancer and improving treatment options in many cancers. For example, therapeutic agents such as Imatinib (Gleevec) targeting genes altered in the cancer cell, but not in normal cells, are increasingly being tested in the clinical setting. By systematically deciphering the genomes of different cancers, frequent genomic aberrations and gene expression changes have been revealed and correlated with clinical details. However, the biological and clinical relevance of most aberrations in cancer is largely unknown. One such cancer, where the investigation of the tumour-specific genetic aberrations would have a major benefit upon the understanding of the disease that could lead to better treatment choices and patient survival, is ependymoma.
Ependymoma is the third most common brain tumour of childhood, with around 50% occurring in infants younger than 5 years of age (Bouffet et al, 1998). Treatment and prognostication are currently based predominantly on clinical criteria, despite many genomic studies identifying common molecular aberrations (Reardon et al, 1999; Zheng et al, 2000; Scheil et al, 2001; Ward et al, 2001; Carter et al, 2002; Dyer et al, 2002; Grill et al, 2002; Jeuken et al, 2002; Gilhuis et al, 2004; Taylor et al, 2005; Suarez-Merino et al, 2005; Mendrzyk et al, 2006; Modena et al, 2006; Sowar et al, 2006; Lukashova-v Zangen et al, 2007). Currently, complete tumour resection is the only confirmed independent prognostic marker, indicating a better patient outcome (Bouffet et al, 1998; Sala et al, 1998). Despite complete resection, local recurrence is reported in up to 50% of paediatric cases (McLaughlin et al, 1998; van Veelen-Vincent et al, 2002). Some improvements in survival rates have been seen over the last 30 years, with some 50% of patients now obtaining 5-year survival (Gatta et al, 2003, 2005). However, when compared with other cancers such as acute lymphoblastic leukaemia, where more than 80% of children are long-term survivors, these improvements lag far behind. There is a need to identify robust biological markers and to better understand the biology of ependymoma to improve therapeutic strategies and patient survival. Gain of 1q is one of the most common genomic aberrations in cancer (Struski et al, 2002) and is frequent in ependymoma, occurring at an incidence of >20% (Reardon et al, 1999; Scheil et al, 2001; Ward et al, 2001; Dyer et al, 2002; Grill et al, 2002). The gain of the whole of the q-arm of chromosome 1 has been associated with a poor prognosis in ependymoma (Dyer et al, 2002), and has also been shown to adversely affect patient survival in other paediatric cancers, including Wilms' tumour and Ewing's sarcoma (Hirai et al, 1999; Hing et al, 2001).
The region-specific amplicon, 1q25, has been demonstrated as an independent prognostic marker, indicating a poor prognosis (Mendrzyk et al, 2006). However, the mechanisms by which 1q, or 1q25, confer adverse biological behaviour in ependymoma are unclear and a more detailed analysis of 1q is necessary.
Chromosome 1q gain has also been shown to be the most common global genetic change in ependymoma recurrent tumours, seen in 67% of cases, with de novo gain of 1q frequently occurring in the relapse sample (Dyer et al, 2002). The region-specific amplicon 1q21.1-q32.1 has been associated with tumour recurrence in intracranial ependymoma (Mendrzyk et al, 2006). Several other region-specific amplicons frequently gained in ependymoma include 1q21.3-q23.1, 1q21-q31 and 1q22-q31, 1q31.1-q31.3, 1q31-q32 and 1q41-qter (Kramer et al, 1998; Scheil et al, 2001; Ward et al, 2001; Mendrzyk et al, 2006; Modena et al, 2006). The 1q32 amplicon contains two genes, laminin and GAC1, that are overexpressed in ependymoma, and the candidate gene DUSP12 is located within the frequently gained 'hotspot' 1q23.3 (Suarez-Merino et al, 2005; Mendrzyk et al, 2006). Despite these observations, the role of these amplicons remains unclear and, to date, no specific genes on 1q have been shown to directly relate to ependymoma tumorigenesis, relapse or patient outcome.
Taken together this evidence implicates the gain of 1q as a marker for adverse clinical behaviour. However, the underlying biology and the gene(s) involved remain to be elucidated. To address these issues we used a combination of comparative genome hybridisation (CGH) and serial analysis of gene expression (SAGE) to identify candidate genes on 1q. Candidates were validated using immunohistochemistry (IHC) on a tissue microarray and the protein expression levels correlated with clinicopathological data to determine their potential role in intracranial paediatric ependymoma.

Sample cohort
For SAGE and CGH analysis, 11 fresh-frozen tumour samples were obtained from the Duke Brain Tumour Bank, USA and Birmingham Children's Hospital, UK (Table 1). Five normal brain libraries (white matter, cerebral cortex, paediatric frontal cortex and two cerebellum) and six other brain tumour types (one astrocytoma grade I, eight astrocytoma grade II, 11 astrocytoma grade III, 10 glioblastoma, two oligodendroglioma and 20 medulloblastoma) were downloaded from the SAGE Genie website (http://cgap.nci.nih.gov/SAGE; Boon et al, 2002).
For immunohistochemistry, a tissue microarray (TMA) was constructed using formalin-fixed paraffin-embedded (FFPE) tumour material from 74 primary tumours. The samples were obtained from the Histopathology Department at the Birmingham Children's Hospital and further Neuropathology Departments of the Children's Cancer Leukaemia Group (CCLG) Centres. The histology of each tumour was verified, representative areas were identified by a pathologist (MAB) and a minimum of three cores were taken. Haematoxylin and Eosin smears of corresponding frozen material were used to confirm viable tumour. Clinical information was obtained from the CCLG Data Centre, West Midlands Children's Tumour Registry and case notes.

CGH and SAGE libraries
Comparative genome hybridisation was performed as described by Dyer et al (2002). Serial analysis of gene expression libraries were constructed using the RNA isolated from 11 frozen tissue samples as described by Boon and Riggins (2003). SAGE2000 software (http://www.sagenet.org) was used to extract tags from the original sequence files and to remove duplicate ditags, linker sequences and repetitive tags. Tag counts and library information for nine SAGE libraries have been posted to CGAP's SAGE Genie website (http://cgap.nci.nih.gov/SAGE) (Boon et al, 2002). The complete list of genes mapping to chromosome 1q was downloaded from the Ensembl genome browser (NCBI36) and the best tag for each gene was identified by searching the SAGE Genie website (http://cgap.nci.nih.gov/SAGE) using the HUGO gene symbol. All SAGE libraries were normalised to tags per 200 000 to enable cross-library comparison.
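The normalisation to tags per 200 000 can be illustrated with a minimal sketch. It assumes a simple proportional rescaling of each library; the function name, the tag labels and the raw counts are hypothetical and are not taken from the study or from the SAGE2000 software.

```python
def normalise_library(tag_counts, scale=200_000):
    """Rescale raw SAGE tag counts so that the library totals `scale` tags.

    Assumption: normalisation is a plain proportional rescaling.
    """
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {tag: count * scale / total for tag, count in tag_counts.items()}


# Hypothetical raw library of 100 tags (illustrative values only)
raw_library = {"TAG_A": 30, "TAG_B": 10, "TAG_C": 60}
normalised = normalise_library(raw_library)
```

After rescaling, counts from libraries of different sequencing depth are expressed on the same scale and can be compared directly.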

SAGE analysis
The SAGE data was analysed in four ways:
(1) Tags were identified in relapse pair R1 with a higher tag count in the relapsed sample (E1023) than in the corresponding primary (E628). This data was filtered to determine tags with either the same, or less, count in the relapse compared with the primary of R2. Tags were then ranked based on the difference between the relapse and the primary of pair R1.
(2) The mean normalised tag counts were calculated for the 656 1q tags across 10 ependymoma SAGE libraries and five normal brain libraries to identify ependymoma-specific genes (E1023 was removed from this analysis). Results were then sorted based on the difference between the mean ependymoma and mean normal brain tag count.
(3) The data from (2) was filtered to identify tags with ≤0.5 in normal brain and >2 in ependymoma, then sorted by the highest ependymoma tag count.
(4) The mean tag count for each S100 gene was calculated across the six different tumour types and fold changes between ependymoma and normal brain and ependymoma and other brain tumour types determined.
Investigation of chromosome 1q in paediatric ependymoma V Rand et al
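The first three filtering steps can be sketched as follows. This is an illustrative reading of the text rather than the authors' pipeline: the function names, the dictionary layout and all counts are hypothetical, and the thresholds in step (3) are read as a normal brain mean of at most 0.5 and an ependymoma mean above 2.

```python
from statistics import mean


def rank_relapse_specific(r1_primary, r1_relapse, r2_primary, r2_relapse):
    """Step (1): tags higher at relapse in pair R1 but not higher at relapse
    in pair R2, ranked by the R1 relapse-minus-primary difference."""
    hits = []
    for tag, relapse_count in r1_relapse.items():
        diff = relapse_count - r1_primary.get(tag, 0)
        r2_up = r2_relapse.get(tag, 0) > r2_primary.get(tag, 0)
        if diff > 0 and not r2_up:
            hits.append((tag, diff))
    return sorted(hits, key=lambda item: item[1], reverse=True)


def mean_difference(tumour_libs, normal_libs, tag):
    """Step (2): mean tumour tag count minus mean normal brain tag count."""
    tumour_mean = mean(lib.get(tag, 0) for lib in tumour_libs)
    normal_mean = mean(lib.get(tag, 0) for lib in normal_libs)
    return tumour_mean - normal_mean


def passes_threshold(tumour_mean, normal_mean, normal_max=0.5, tumour_min=2):
    """Step (3): near-absent in normal brain but present in tumour."""
    return normal_mean <= normal_max and tumour_mean > tumour_min


# Hypothetical normalised counts for two relapse pairs (illustrative only)
r1_primary = {"GENE_A": 5, "GENE_B": 10, "GENE_C": 2}
r1_relapse = {"GENE_A": 45, "GENE_B": 30, "GENE_C": 1}
r2_primary = {"GENE_A": 8, "GENE_B": 4, "GENE_C": 3}
r2_relapse = {"GENE_A": 8, "GENE_B": 9, "GENE_C": 3}
ranked = rank_relapse_specific(r1_primary, r1_relapse, r2_primary, r2_relapse)
```

In this toy example only GENE_A survives step (1), because GENE_B also rises at relapse in the pair without 1q gain and GENE_C falls at relapse in R1.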

Statistical analysis
Statistical analyses were performed using SPSS v15.0. Univariate analysis of the association of protein expression levels with clinical variables (Table 4) was assessed by Fisher's exact test. Kaplan-Meier survival curves were constructed to investigate candidates as prognostic markers. Multivariate Cox regression hazard analysis was used to identify independent prognostic markers. A P-value of <0.05 was considered statistically significant and all values are given in Table 4.
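To make the univariate test concrete, the sketch below computes a two-sided Fisher's exact P-value for a 2x2 table from first principles. It is not the SPSS implementation used in the study; the function name is hypothetical and the example table contains made-up counts, not data from Table 4.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x in the top-left cell given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Hypothetical table: rows = marker high/low, columns = two clinical groups
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

An exact test of this kind is the usual choice when some cells of the contingency table contain small counts, as is common with subgroup analyses in a cohort of this size.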

Patients and CGH profiles
Genomic profiles were generated for 11 flash-frozen ependymoma (Table 1). Of the nine tumours with a CGH profile, six tumours (four paediatric and two adults) had a balanced genome (i.e., had no detectable genomic losses or gains) and three (two paediatric and one adult) had a structural genome (i.e., had few, mainly partial chromosome gains). Two paediatric relapse pairs, R1 and R2, were included in this study. Comparative genome hybridisation revealed that the recurrent sample (E1023) of relapse pair R1 had gain of 1q whereas the primary (E628) was balanced (i.e., de novo 1q gain). The genomes of the primary (E1p) and recurrent sample (E1r) of relapse pair R2 were both balanced.

SAGE libraries and 1q tags
A total of 801 076 SAGE tags were generated from 11 ependymoma samples with, on average, over 26 000 unique tags per library. The complete libraries for 9 of the 11 ependymoma are available to download from the SAGE Genie website. The unique SAGE tags representing the genes on 1q were identified using the Ensembl genome browser, reducing the number of tags to be analysed to 656.

Upregulated genes associated with 1q gain in recurrent ependymoma
From the filtered dataset of 656 1q tags, 205 were selected that had a higher tag count in the relapse of pair R1 than in the corresponding primary. To identify the genes upregulated on account of gain of 1q, filtering was done to select tags that were specifically upregulated upon relapse in the R1 pair compared with the relapse pair R2. This reduced the number of tags to 149. Once the tags were ranked based on the difference in tag count between the recurrent sample of pair R1 and the primary, CHI3L1, S100A10 and PSMB4 were revealed as the top three genes upregulated as a consequence of the gain of 1q in recurrent ependymoma (Table 2).

Ependymoma-associated 1q transcripts
A comparison of 1q tags in 10 ependymoma with five normal brain libraries revealed S100A10 as the most upregulated gene (125.5 tags; Table 3). A second member of the S100 family, S100A6, was identified with one of the highest differences between ependymoma and normal brain of 27.1. CHI3L1 also ranked highly, with a difference of 25.7 tags. When the data was then filtered for tags meeting the criteria ≤0.5 mean tag count in normal brain but >2 tags in ependymoma, the uncharacterised gene C1orf192 showed the highest difference in expression of 17.3 (Table 3). S100A4, a third member of the S100 family, was also one of the most upregulated genes in ependymoma with a difference in tag count of 6.6.

S100 gene expression in ependymoma and other brain tumours
Our analyses revealed that several members of the same gene family were associated with 1q gain and were also upregulated in ependymoma compared with normal brain tissue. Therefore, to investigate this gene family, the mean tag counts were calculated for all 14 S100 genes located on 1q21.3 represented in SAGE Genie in the ependymoma SAGE libraries (Figure 1). Of the 13 S100 genes, S100A4 had the highest mean tag count for ependymoma relative to normal brain with a fold change of 20.7 and S100A10 had the highest mean tag count of 133 in ependymoma. S100A2 was the only S100 gene that was expressed in ependymoma but not in normal brain with mean tag counts of 1.5 and zero, respectively. No expression of five members of the S100 family (A15, A7, A5, A14 and A13) was observed in ependymoma. Members of the S100 family have been associated with different cancers, including brain tumours. Therefore, this analysis was extended to six other brain tumour types and the mean SAGE tags were calculated for the 14 S100 genes in each tumour type (Figure 1). S100A10 and S100A6 showed the highest mean expression across the six brain tumour types of 43.6 and 41.7 tags, respectively. In both grade I astrocytoma and glioblastoma (GBM) S100A10 had the highest tag count of 134 and 112.3, respectively. S100A6 had the highest tag counts in oligodendroglioma (15 tags), medulloblastoma (10.4 tags) and grade II and III astrocytoma (49.1 and 55 tags, respectively). Overall, S100A10 had the highest fold change between ependymoma and other brain tumours of three and S100A4 had the highest fold change between ependymoma and normal brain of 20.7.

Figure 1 Summary of the mean SAGE tag counts for 14 S100 genes in ependymoma and six other brain tumour types (astrocytoma grade I, II, III, glioblastoma, oligodendroglioma and medulloblastoma). The S100 genes are in genomic order (S100A10, S100A11, S100A9, S100A8, S100A15, S100A7, S100A16, S100A6, S100A5, S100A4, S100A3, S100A2, S100A14, S100A13) and the start and end positions on chromosome 1 are given in megabases (Mb). Mean tag counts are binned as <1, <3, <10, <30, <100 and >100. a: Mean of the 52 SAGE libraries from the six brain tumour types. A red star marks the genes selected for further investigation.
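The fold changes quoted above are ratios of mean tag counts, which raises the question of how to treat a reference mean of zero, as for S100A2 in normal brain. The sketch below shows one possible convention (returning no value when the ratio is undefined); the function name is hypothetical and the source does not state how the authors handled this case. Dividing the reported S100A10 mean in ependymoma (133) by its reported mean across the six other tumour types (43.6) gives roughly three, which is consistent with the reported fold change, although the exact calculation is not spelled out in the text.

```python
def fold_change(tumour_mean, reference_mean):
    """Ratio of mean tag counts; None when the reference mean is zero,
    because the ratio is undefined (assumed convention, not the authors')."""
    if reference_mean == 0:
        return None
    return tumour_mean / reference_mean


# Means quoted in the text
s100a10_vs_other_tumours = fold_change(133, 43.6)
s100a2_vs_normal_brain = fold_change(1.5, 0)
```

Under this convention a gene such as S100A2 is reported as "expressed in tumour only" rather than being assigned an arbitrary fold change.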

Differential expression of S100 proteins in intracranial paediatric ependymoma
Four members of the S100 family were selected for further investigation based on their distinct gene expression profiles in ependymoma. Protein expression levels of S100A10, S100A6, S100A4 and S100A2 were determined by immunohistochemistry using an independent cohort of seventy-four primary paediatric ependymoma arrayed on a tissue microarray (Figure 2A-H). One of the eleven ependymoma samples used to create SAGE libraries, ER1p, was represented on the TMA. In this sample, no gene or protein expression was observed for S100A2 and S100A4 with SAGE tag counts of 0 per 200 000 tags and negative protein staining observed by IHC. For S100A6 the SAGE tag count was 15 per 200 000 tags and immunostaining determined as negative/weak. Both gene and protein expression was observed for S100A10, where the SAGE tag count was 76 per 200 000 tags and moderate protein expression was observed by IHC. Univariate analysis was performed to explore possible associations between the S100 protein expression levels and clinical variables in primary ependymoma (Table 4). S100A6 significantly correlated with tumours arising in the supratentorial region of the brain (P<0.001) and S100A4 correlated with age at diagnosis under 3 years (P=0.038). No significant correlations were found for S100A10 or S100A2. Kaplan-Meier survival analysis of the clinical parameters and S100 protein expression showed that resection status and tumour location were the only indicators of prognosis, with complete resection and supratentorial tumours indicating a better patient outcome (P<0.001 and P=0.020, respectively). Similarly, multivariate analysis revealed extent of resection and tumour location as independent prognostic markers (P=0.007 and P=0.001, respectively).

CHI3L1 expression in ependymoma
Forty-eight primary tumours were scored for CHI3L1 protein expression. Tumours were categorised as having negative (0%), weak (<25%) or strong (≥25%) expression levels. Twenty-eight (58%) were negative for CHI3L1 protein expression, 14 (29%) showed weak and six (13%) demonstrated strong expression. CHI3L1 protein expression was determined as weak for sample ER1p, which had a SAGE tag count of 30 per 200 000 tags. No significant correlations with the clinical parameters investigated were found. Histopathological review revealed that in five cases, where areas of necrosis were visible, CHI3L1 protein expression was restricted to the cytoplasm of viable tumour cells adjacent to the areas of necrosis (Figure 2I and J).
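The three-level CHI3L1 scoring scheme and the reported percentages can be checked with a short sketch. It reads the cut-offs as negative at 0%, weak below 25% and strong at 25% or above; the function names are hypothetical, and only the category counts (28, 14 and 6 of 48) come from the text.

```python
def chi3l1_category(percent_positive):
    """Map the percentage of immunopositive tumour cells to a score
    (assumed cut-offs: 0% negative, below 25% weak, otherwise strong)."""
    if percent_positive == 0:
        return "negative"
    if percent_positive < 25:
        return "weak"
    return "strong"


def percentage(count, total):
    """Whole-number percentage, rounding halves up."""
    return int(100 * count / total + 0.5)


# Category counts reported for the 48 scored primary tumours
counts = {"negative": 28, "weak": 14, "strong": 6}
total = sum(counts.values())
summary = {label: percentage(n, total) for label, n in counts.items()}
```

Rounding halves up is used here because six of 48 is exactly 12.5%, which the text reports as 13%; Python's built-in `round` would instead round this value to the nearest even integer.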

DISCUSSION
Little is known about the genes and genetic mechanisms underlying ependymoma tumorigenesis, patient relapse and survival. To address these issues we focused our study on chromosome 1q, one of the most commonly gained regions in ependymoma. Using CGH and SAGE profiling we identified CHI3L1 and members of the S100 family as candidate genes in ependymoma. Immunohistochemical analysis on a large cohort of paediatric ependymoma revealed that CHI3L1 protein expression is associated with necrosis and that members of the S100 family are differentially expressed in clinically relevant subgroups. S100A6 is significantly associated with paediatric ependymoma arising in the supratentorial compartment and S100A4 strongly correlates with patients aged less than 3 years at diagnosis.
Figure 2 Examples of immunohistochemical staining patterns of S100A2, S100A4, S100A6, S100A10 and CHI3L1 on the paediatric ependymoma tissue microarray. The tumour-specific staining pattern for S100A2 (nuclear and/or cytoplasmic) was scored as either negative (0%; A) or positive (>0%; B). For S100A4 (nuclear and cytoplasmic staining) cores were grouped as either negative/weak (<1%; C) or moderate (≥1%; D). S100A6 and S100A10 protein expression (cytoplasmic and/or nuclear staining for A6 and membranous staining for A10, respectively) was determined based on the percentage of immunopositive cells (0%, <10%, ≥10% and 0%, <50%, ≥50%, respectively) ranging in intensity from negative to strong (0 to 2+). The cumulative scores denoted the expression levels, which were grouped as either negative/weak or moderate/strong for S100A6 (E, F) and S100A10 (G, H). CHI3L1 expression (cytoplasmic, granular) was scored as either negative (0%), weak (<25%; I) or strong (≥25%; J). Magnification ×100 for all figures except H, which is ×10.

In this study, different approaches were taken to mine the SAGE data to identify the ependymoma-associated genes on chromosome 1q. Analysis of the effect of de novo 1q gain on gene expression in recurrent ependymoma revealed that CHI3L1 was the most highly expressed gene. Strikingly, four members of the same gene family (S100A10, S100A6, S100A4 and S100A2) were also identified as having distinct gene expression profiles in ependymoma. Comparison of 10 ependymoma SAGE libraries with five normal brain libraries identified S100A10 as the most highly expressed gene on 1q in ependymoma; it was also the second most highly expressed gene in the relapse sample with gain of 1q. S100A6 was also identified as one of the most highly expressed genes in ependymoma when compared with 'normal' brain. S100A4 was implicated in ependymoma as, again, being one of the most highly expressed genes in ependymoma with very low expression in 'normal' brain. Analysis of all 1q S100 genes represented in SAGE Genie identified S100A2 as the only S100 gene expressed in ependymoma but not expressed at any level in 'normal' brain. S100A10, S100A6, S100A4 and S100A2 are all members of the S100 family of calcium-binding proteins and are located in a cluster on 1q21.3, a region that has been shown to have both high level gains (1q21-q31; Ward et al, 2001) and an association with tumour recurrence in ependymoma (1q21.1-q23.1; Mendrzyk et al, 2006). Members of the S100 family show divergent expression patterns in a range of tissues and several have been linked with cancer, including medulloblastoma (Hernan et al, 2003; Lindsey et al, 2007) and astrocytoma (Camby et al, 1999, 2000). S100A6 has been shown to clearly distinguish between low (WHO grade I and II) and high (WHO grade III and IV) grade astrocytic tumours (Camby et al, 1999).
In ependymoma, we have clearly demonstrated that S100A6 is differentially expressed in tumours arising in different locations of the brain, and is significantly associated with supratentorial tumours (P<0.001).
Clinically, supratentorial ependymoma are associated with better survival rates when compared with posterior fossa tumours (Schiffer and Giordana, 1998). This survival difference could be because of a number of confounding factors, for example, the resectability of supratentorial when compared with infratentorial tumours (Palma et al, 2000). There is now evidence that ependymoma arising within different regions of the central nervous system exhibit specific and distinct genetic signatures (Taylor et al, 2005). For example, genes upregulated specifically in supratentorial tumours include members of the EPHB-EPHRIN and NOTCH cell-signalling systems. Our observation of the differential expression of S100A6 in different regions of the brain adds to this supratentorial gene signature.
In ependymoma we have shown that S100A4 is significantly associated with patients under the age of 3 years at diagnosis in intracranial paediatric ependymoma (P=0.038). Differences in the genomic profiles between tumours from patients under the age of 3 years and older children have previously been identified. For example, balanced genomes (with no detectable genomic losses and gains) are significantly associated with children younger than 3 years of age at surgery (Dyer et al, 2002). This finding suggests that ependymoma occurring in patients less than 3 years are biologically distinct from those occurring in older children. It has been hypothesised that tumours occurring in infants may be driven by powerful genetic events that lead to presentation at a young age without the requirement for additional genetic changes (Dyer et al, 2002). Although S100A4 is one of the best characterised of the S100 genes in terms of its role in cancer (Emberley et al, 2004; Salama et al, 2007), no other study has previously reported a link with patient age or explored its role in ependymoma. The significance of S100A4 in ependymomas arising in children under 3 years of age is not clear but its differential expression demonstrates that there is a distinction between genetic events occurring in children of different ages. S100A4 and S100A6 are clearly differentially expressed in paediatric ependymoma and can be used to distinguish clinically and biologically relevant subgroups. Expanding our study to the gene expression levels across six other brain tumour types showed that for a number of S100 genes the expression levels were notably elevated in a particular tumour type. For example, S100A16 is elevated in grade I astrocytoma and S100A11 in glioblastoma. In contrast, several S100 genes had similar expression levels across multiple tumour types, including S100A14, S100A13 and S100A3. Similar expression profiles were also seen across all tumour types for S100A8 and S100A9 but this could be attributed to their function in which they form a heterodimer complex (Vogl et al, 2006). These findings highlight the importance of further investigation of specific members of the S100 family to understand their function in ependymoma and other brain tumours.

Table footnote: The median age at diagnosis of the primary tumours was 3.8 years (range, 8 months to 14.9 years). Significant P-values (<0.05) are given in bold.
CHI3L1, located on 1q32.1, was the most overexpressed gene in the relapse pair with 1q gain and was also one of the most upregulated genes in ependymoma when compared with 'normal' brain. Immunohistochemistry of CHI3L1 revealed a correlation between the immunostaining and necrosis. Notably, the TMAs were constructed by selecting three representative areas from each tumour. In this process we tended to avoid areas of necrosis; however, in five cases necrosis was present. In these five cases staining showed that CHI3L1 was more highly expressed in the cytoplasm of tumour cells adjacent to the necrotic regions (Figure 2I and J). CHI3L1 encodes YKL-40, a secreted protein that has been reported to be overexpressed in a number of different cancers, including glioma, and has been proposed as a new therapeutic target (Pelloski et al, 2005; Johansen et al, 2007). The role of CHI3L1 in cancer is unknown, but it has been suggested that it has a function in a number of pro-survival processes (Johansen et al, 2007). In glioblastoma (GBM), where necrosis is a characteristic, both CHI3L1 expression and necrosis are associated with poor prognosis (Burger and Green, 1987; Raza et al, 2002; Pelloski et al, 2005; Homma et al, 2006; Kleihues et al, 2007). In ependymoma, we did not find a correlation with prognosis, raising the possibility that CHI3L1 is a marker of necrosis rather than of adverse biology per se. These observations in GBMs and our findings in ependymoma suggest a link between CHI3L1 and necrosis in brain tumours. As the cores represented on the TMA are selected to avoid regions of necrosis, our findings may be an under-representation and further investigation of CHI3L1 expression on whole tissue sections is necessary.
Previously we identified gain of 1q as one of the most common gains in primary and recurrent ependymoma and demonstrated a tendency that patients with gain of 1q have a poorer outcome (Dyer et al, 2002). The aim of this study was to investigate 1q in ependymoma to gain insight into the role of genes located in this region. In this study, we have identified members of the S100 family located within the commonly gained amplicon 1q21.3 and provide evidence of their differential expression in clinical subgroups of paediatric ependymoma: S100A4 is associated with patients of a very young age at diagnosis and S100A6 with supratentorial tumour location. We also demonstrated a link between CHI3L1 protein expression and necrosis. However, we are yet to elucidate the underlying mechanism by which 1q gain confers adverse biological behaviour in paediatric ependymoma.
We are now extending this study to a larger tumour cohort to further unravel the underlying biology of 1q in this complex tumour.