The integrated genomic and epigenomic landscape of brainstem glioma

Brainstem gliomas are a heterogeneous group of tumors that encompass both benign tumors cured with surgical resection and highly lethal cancers with no efficacious therapies. We perform a comprehensive study incorporating epigenetic and genomic analyses on a large cohort of brainstem gliomas, including Diffuse Intrinsic Pontine Gliomas. Here we report, from DNA methylation data, distinct clusters termed H3-Pons, H3-Medulla, IDH, and PA-like, each associated with unique genomic and clinical profiles. The majority of tumors within H3-Pons and-H3-Medulla harbors H3F3A mutations but shows distinct methylation patterns that correlate with anatomical localization within the pons or medulla, respectively. Clinical data show significantly different overall survival between these clusters, and pathway analysis demonstrates different oncogenic mechanisms in these samples. Our findings indicate that the integration of genetic and epigenetic data can facilitate better understanding of brainstem gliomagenesis and classification, and guide future studies for the development of novel treatments for this disease.

B rainstem gliomas represent a heterogeneous group of tumors that arise from the midbrain, pons, or medulla. Among these tumors, pediatric diffuse intrinsic pontine glioma (DIPG), with a median overall survival of 9-12 months 1 , has been the main research focus for the past five decades due to the inoperability and resistance to chemotherapy and radiotherapy [2][3][4][5][6] . Approximately 80% of pediatric DIPGs harbor K27M mutations affecting H3F3A or HIST1H3B/C 1,7-12 . These K27Mmutant tumors are associated with a particularly poor prognosis 13 . In the 2016 World Health Organization (WHO) classification of CNS tumors, the term "diffuse midline gliomas, H3K27M-mutant" was introduced to represent this DIPG tumor subset 14 . However, the molecular characteristics of the nonpediatric brainstem gliomas, such as midbrain and medulla oblongata gliomas, remain poorly characterized. Research on these tumors has been challenging due to the relatively low incidence rate and the high risks related to surgical resection resulting in low tissue availability. We collected, to our knowledge, the largest and most comprehensive cohort of brainstem gliomas, encompassing all age groups and anatomic locations, including medulla, pons, and midbrain. We performed integrated whole genome sequencing, RNA sequencing, and array-based genome-wide methylation analysis to acquire a more comprehensive picture of the molecular composition of these brain tumors. Here, we report methylation-based clusters that identify brainstem glioma subsets associated with tumor location and mutation landscape. We present two distinct clusters of H3mutant brainstem gliomas, H3-Pons and H3-Medulla, which despite their similar genetic mutations, differ not only in location, but in methylation pattern, gene expression, and prognosis.
Methylation classification reveals distinct H3 clusters correlated with tumor locations in brainstem gliomas. DNA methylation status has been utilized for classification of brain tumors, and could assist diagnosis and prognostication 2,13,15-17 . We performed unsupervised hierarchical clustering (linkage method: WPGMA; distance: Euclidean) using the top 20,000 most variable probes (Supplementary Table 3), excluding methylation probes on sex chromosomes and common single nucleotide polymorphisms sites. This approach revealed four distinct methylation clusters (Fig. 1). The hypermethylated cluster (methylation cluster IDH) consisted primarily of tumors bearing isocitrate dehydrogenase 1 (IDH1) mutations. Histone H3 mutant samples formed two different clusters associated with tumor location (methylation clusters H3-Pons and H3-Medulla) (Fig. 1). The remaining cluster (methylation cluster PA-like) consisted largely of lower grade gliomas without detectable IDH1 or H3 mutations. These clusters matched with the DKFZ methylation classifier by three main classes 15 , "diffuse midline gliomas H3 K27M mutant", "pilocytic astrocytoma", and "IDH mutant". However, our methylation-based clustering analysis revealed that H3-mutant samples were made up of two distinct sub-clusters, which correlated with their anatomic localization in the brainstem of either the pons (methylation cluster H3-Pons) or medulla (methylation cluster H3-Medulla). Principal component analysis of the methylation array probe data using R packages and functions (ggbiplot and prcomp) 18,19 also revealed distinct groups based on the DKFZ methylation classifier ( Fig. 2a; Supplementary Fig. 2a). When performing PCA specifically for whole probes from tumors within methylation cluster H3-Pons and H3-Medulla (Fig. 2b), as well as locations in methylation cluster H3-Pons and H3-Medulla ( Supplementary  Fig. 2b), we observed similar trends in these clusters and locations. The first principal component could explain most of the variance in the various principal component analyses performed (96.4% in Fig. 2a and 96.7% in Fig. 2b, Supplementary Fig. 2b), indicating methylation profiling could provide significant utility as a classifier of these samples. The four distinct methylation clusters, corresponding to the H3-Pons, H3-Medulla, IDH, and PA-like subtypes, could be readily identified. Analysis using Tumor Map supported these findings by demonstrating a similar distinction of clusters correlated to tumor location 20 ( Fig. 2c; Supplementary Fig. 3).
The top 20,000 variable probes used above were based on methylation data from all samples, including samples from methylation clusters IDH and PA-like. To prevent potential confounding factors for those differential probes mainly for methylation clusters IDH and PA-like, we selected the top 20,000 variable probes from only the methylation clusters H3-Pons and H3-Medulla, and we performed hierarchical clustering on these cases ( Supplementary Fig. 4). Heatmap analysis showed this hierarchical clustering of subclusters indeed maintained distinct clusters. Most samples from methylation cluster H3-Medulla are tumors from the medulla, or the nearby dorsal pontomedullary junction, ( Supplementary Fig. 1). Tumors in the methylation cluster H3-Pons are largely pontine tumors (29 out of 47, 61.7%) and 4 from the nearby middle cerebellar peduncle (8.5%). Methylation cluster PA-like primarily consisted of medullary tumors (18 out of 34, 52.9%), along with 12 tumors from midbrain regions (8 from midbrain tegmentum and 4 from tectum, 35.3%). Unlike other clusters having focused locations, tumors from cluster IDH were distributed across brainstem regions, 3 from the pons (20%), 3 from the middle cerebellar peduncle (20%), and 4 from the medulla (26.7%). Among all 31 DIPG tumors with methylation data, 27 of them were included in the H3-Pons cluster (87.1%), with the remaining 4 DIPG cases clustering in the IDH cluster (12.9%).
All patients in this study were of Asian ethnicity. To evaluate if these distinct H3 clusters can be found in a predominantly non-Asian population, we combined our dataset with published studies of 28 DIPG samples (Buczkowicz et al. 9 ). From tSNE results of those selected top 20,000 variable probes, we found that those DIPG samples grouped closely with our H3-Pons samples as expected ( Supplementary Fig. 5), indicating that classification according to the methylation cluster H3-Pons may be robust across ethnicities. Notably, of the 6 DIPG cases from Buczkowicz et al. that clustered toward the PA-like group, 4 were H3 WT and all were previously classified in either the "silent" or "MYCN" methylation clusters of that study.
Tumors of distinct methylation clusters display different genomic landscapes. To identify somatic genetic alterations in this brainstem glioma cohort, whole genome sequencing and panel targeted sequencing for 68 common mutated brain tumor genes were used on both the tumor samples and matched blood (Fig. 3). BWA and GATK MuTect2 21,22 were used for variant calling, and IntOgen 23,24 and FML 25 were used to predict potential driver mutations and significant noncoding region mutations. The mutation landscape of these tumors is grouped by the four distinct DNA methylation clusters defined above ( Fig. 3  As expected, methylation cluster IDH is enriched for IDH1 mutations (78.57%) and most of these cases harbored cooccurring TP53 (92.86%) and ATRX (50%) mutations. FML/ Intogen analysis showed that TP53, ATRX, and IDH1 were significantly mutated and potential driver mutations in this methylation cluster. Of note, the patients in this cluster are adults (age range 23-60 years).
Methylation cluster PA-like, consisting primarily of grade I or II brainstem gliomas, showed distinct patterns in DNA methylation and genetic mutations (Figs. 1 and 3). The number of mutations for each sample was lower in samples of cluster PAlike compared with samples in other clusters (mean mutation count: 6.9; methylation cluster IDH, H3-Medulla, H3-Pons mean mutation count: 24.1). Interestingly, FML/Intogen driver analysis revealed potential driver mutations in NF1 and SF3B1 coding regions, and in the noncoding 3′ UTR of EXD3, despite their low frequency of mutations. Overall, few of the commonly associated glioma driver genes were mutated in the methylation cluster PAlike, and NF1 and SF3B1 were the only recurrently mutated genes in this cluster.
Gene expression profiling reveals distinct enriched gene sets in methylation clusters H3-Pons and H3-Medulla. We performed RNAseq on samples from the brainstem glioma cohort and evaluate patterns in gene expression ( Supplementary Fig. 6). When selecting genes encoding transcription factors, similar to DNA methylation profiling, methylation clusters H3-Pons and H3-Medulla could be differentiated by gene expression profiles 27 ( Supplementary Fig. 6). Next, we used HTseq 28 and edgeR 29,30 to identify differentially expressed genes between methylation clusters H3-Pons and H3-Medulla, followed by enrichment analysis      Tables 4 and Supplementary Table 5). The top altered pathways and GOs identified in our analyses are related to cell cycle, cell division, or mitosis, showing the potential different mechanisms involved in the tumors in methylation clusters H3-Pons or H3-Medulla. We also applied Gene Set Enrichment Analysis for the normalized FPKM values 32,33 ( Fig. 4c-h). Gene sets enriched in methylation cluster H3-Medulla were primarily immune response-related gene sets such as interferon gamma signaling and cytokine receptor interaction, while gene sets enriched in methylation cluster H3-Pons were cell cycle or mitosis-related such as DNA replication, mitotic phase, and checkpoints.
Fusion genes and copy number alterations. Using whole genome sequencing data and RNA sequencing data, we evaluated our cohort for genomic rearrangements (Manta 34 ) and fusion genes (STAR-fusion 35 ). Several common fusion genes were detected, including KIAA1549/BRAF (n = 3) and NTRK fusions, which have been reported in gliomas 11,36,37 . The KIAA1549/BRAF fusion was detected in 1 out of 8 pilocytic astrocytomas in our cohort and in two grade II astrocytomas (all 3 in methylation cluster PA-like). We validated several recurrent fusion genes identified in our analyses, including C15orf57-CBX3 genes (n = 3), and NTRK2-other genes (n = 6) ( Fig. 5; Supplementary Table 6 and Supplementary Table 7). Sanger sequencing was performed to confirm the fusion genes and specific breakpoints in these samples. We also used methylation array data to assess copy number alterations for each methylation cluster by conumee 38 and GISTIC 39 (Supplementary Tables 8 and 9; Supplementary Fig. 7a,  b), which showed different patterns in copy number gains and deletions among the four methylation clusters. Although methylation clusters H3-Pons and H3-Medulla shared similar frequent copy number alterations in 3p26.32, 8p23.1 (gains) and 5q31.3 (loss), H3-Medulla globally harbored more copy number gains (11 loci in H3-Medulla vs. 5 loci in H3-Pons) while H3-Pons exhibited more frequent copy number losses (13 loci in H3-Pons vs. 5 loci in H3-Medulla). Interestingly, only H3-Pons showed 4q12 amplification which contains the frequently amplified gene PDGFRA in midline gliomas. Copy number alterations in methylation cluster IDH (7 loci in gains or losses) and PA-like (7 loci in gains or losses, including 7q34: KIAA1549 and BRAF amplification) were fewer in comparison with methylation clusters H3.
Cases included in methylation cluster PA-like showed better overall long-term survival compared with the other groups. Importantly, this improved survival trend for patients in the methylation cluster PA-like occurred in the context of the majority of these cases being diagnosed histologically as astrocytoma or oligoastrocytoma, grades II-III (21 out of 34). Only 7 out of 34 tumors in this cluster were diagnosed as pilocytic astrocytoma (Fig. 6a). Collectively, these results suggest that methylation classification into these subgroups may serve as a better correlate to identify patients with grade I-III tumors that have a more benign clinical course.
In terms of tumor grade, most of the tumors from methylation cluster H3-Pons were grade IV (20 out of 47, 42.6%), while 19 tumors were grade III (40.4%), and 8 tumors were grade II (17.0%). Tumors classified in the methylation cluster H3-Medulla were composed of grade II and III cases (II: 11, 40.7%, III: 12, 44.4%;). When evaluating only grade II and III tumors in both groups, H3-Medulla cases exhibited longer median overall survival (26.3 months) compared to tumors classified as H3-Pons (11.1 months) (Log-rank test, p = 0.00034) (Fig. 6c). DIPGs are known to have the worst prognosis among brainstem gliomas, and none of the DIPGs in our cohort were in methylation cluster H3-Medulla. When DIPGs in methylation cluster H3-Pons were excluded, samples from methylation cluster H3-Medulla still showed a significantly longer overall survival relative to tumors classified as H3-Pons (26.3 months vs. 10.6 months) (p value = 0.0064, log-rank test) (Fig. 6d). We also conducted Cox proportional hazards regression models for multivariate analysis ( Supplementary Fig. 8). When including methylation cluster and age as factors, H3-Pons still showed higher risk than H3-Medulla (hazard ratio: 1.04-6.6; p value = 0.041), while age showed only limited effect (hazard ratio: 0.95-1.0, p value = 0.066) (Supplementary Fig. 8a). When including whether the sample is DIPG or non-DIPG, methylation cluster remains the most dominant factor (Supplementary Fig. 8b).

Discussion
Brainstem gliomas represent a heterogeneous group of tumors arising from the midbrain, pons, and the medulla, affecting both children and adults. These tumors have different histologic features, but also differing levels of resectability and therefore variable clinical courses. Among these tumors, DIPG has been the most extensively studied, due to its relative prevalence and lack of therapeutic options resulting in a poor prognosis. Integrated genomic, epigenomic and transcriptomic studies have provided insightful understanding of the tumorigenesis of pediatric DIPG, which also might hold promise for future utilization of molecular marker-driven clinical trials and use of novel targeted therapies such as HDAC, JMJD3, ACVR1, PPM1D, and BET bromodomain inhibitors, and CDK7 blockade [40][41][42][43][44] . However, the molecular profiling of the many other brainstem tumors that are nonpediatric DIPG has remained elusive due in large part to the lack of sample availability. This has limited our understanding of these diseases and ability to objectively stratify these patients and implement use of novel targeted therapies. Here, we provide a comprehensive integrated genomic analysis of gliomas of the brainstem, with tumors spanning from the most rostral midbrain to the pons and medulla. The newly updated WHO guidelines use a new diagnostic term for brainstem gliomas with the H3 K27M mutation, however, this classification is unable to distinguish the variable prognoses of these gliomas, categorizing all of them as WHO Grade IV. From our study, we show using the epigenetic and genetic signatures that tumors from various locations in the brainstem can be classified into four major epigenetic subtypes, each with distinct clinical courses and potential therapeutic targets.
Using methylation data we showed that brainstem gliomas could be classified into four major methylation clusters: H3-Pons, H3-Medulla, IDH, and PA-like. We summarized the integrated genetic and clinical features of these subtypes in Fig. 7. Our study revealed the presence of two distinct epigenetic subgroups of H3-mutant tumors, H3-Pons and H3-Medulla. There were significant differences in the survival trends between these two clusters, with the H3-Pons group having a more aggressive course as compared to the H3-Medulla tumors. Based on RNA-seq based differential expression analysis, we found these tumors to have different enrichment of gene expression pathways, with the H3-Medulla tumors enriched for immune-response related pathways, as compared with the more aggressive H3-Pons tumors having more cell cycle-related pathways. Despite these tumors having similar mutation patterns, with common alterations in core mutations such as H3F3A, these striking differences in epigenetic and expression patterns may inform distinct origins or influences of the tumor microenvironment, and warrant further investigation. These discoveries indicate that methylation status might improve the classification for brainstem gliomas and guide clinical decision making for patients.
Tumors in the PA-like cluster consisted primarily of tumors originally diagnosed as grades II-III infiltrative gliomas, pilocytic astrocytomas, PMA, and PXA. The PA-like-group tumors had a benign clinical course, despite variability in grade. Based on histologic criteria, such variability may lead to classification of a subset of these cases as higher grade gliomas and lead to overtreatment in clinical practice. Here again we demonstrate that epigenetic and genomic patterns can more precisely stratify patient tumors diagnosed with small biopsies. Using methylationbased classification of tumors could better inform clinical decision making and identify patients that are candidates for therapeutic intervention.
We also utilized whole genome sequencing data to establish the mutation landscape of brainstem gliomas and discovered methylation patterns closely matched with mutation landscapes. Several frequently mutated genes were identified in these clusters, including H3F3A, HIST1H3B, IDH1, TP53, PPM1D, ATM, ATRX, FGFR1, PIK3CA, NF1, PTEN, PDGFRA, and TCF12. We used additional algorithms to predict driver mutations in noncoding regions of genes, such as UBE3C, CXorf28, and EXD3, as well as structural variants and copy number changes. Although several genes could be identified in multiple clusters, certain genes, such as NF1 and PPM1D were more frequent in cluster H3-Medulla, while the percentage of TP53 mutations was higher in H3-Pons. Also, we identified the cluster IDH in brainstem gliomas, with tumors in this cluster harboring co-occurring IDH1, TP53, and ATRX mutations. The majority of these IDH cluster tumors were restricted to adult patients, consistent with previous studies focusing on pediatric brainstem tumors and showing very rare IDH mutations in these pediatric tumors. This indicates age is a key factor in developing brainstem glioma with IDH1 mutation.
This comprehensive study of brainstem gliomas provides an overview of this heterogeneous tumor entity. Using an integrated genomic analysis of more than one hundred brainstem gliomas from various anatomic locations, we show the promise of molecular profiling of brainstem tumors for improved tumor classification and understanding of their molecular underpinnings, and identify new potential therapeutic targets, all to improve outcomes for these patients.

Methods
Sample collection and cohort characteristics. Brain tumor and peripheral blood samples were collected from patients at Beijing Tiantan Hospital, Capital Medical University, China between 2013 and 2017 with informed consent reviewed by Institutional Review Board of Beijing Tiantan Hospital, with accreditation of the Association for the Accreditation of Human Research Protection Program. Biopsy or resected tumors were for clinical diagnosis and therapy. All the FFPE and snapfrozen tumor tissues used for sequencing were reviewed by an experienced team of neuropathologists at Beijing Tiantan Hospital, Captial Medical University. Tissues whose tumor content were less than 70% were excluded from subsequent sequencing. 126 leftover samples were used in this analysis. Among these samples, 97 samples were used for whole genome sequencing, 123 for methylation microarray, 75 for RNAseq, and 21 samples for panel sequencing. Clinical information and survival data are available for these patients. Kaplan-Meier analysis, log-rank test, and Cox proportional hazards regression model (R package survminer) were used to test for survival analysis.
Whole genome sequencing and RNA sequencing. Whole exome sequencing, RNAseq, and panel targeted sequencing (for 68 common mutated brain tumor genes) were performed by GenetronHealth, Beijing, China. For whole genome sequencing and panel sequencing, BWA was used for alignment and GATK mutect2 was utilized for variant calling. For RNAseq, STAR, or hisat2 was use for alignment, cufflinks was used for gene expression profiling, Htseq and edgeR were used for differential counts analysis, and GSEA was used for gene sets analysis.
Methylation microarrays. The Illumina HumanMethylation450 BeadChip and Infinium MethylationEPIC BeadChip were used for assessing genome-wide methylation profiling of 123 samples. GenomeStudio Methylation Module was used for data processing and quality check. Hierarchical clustering, t-distributed stochastic neighbor embedding (tSNE), and principal component analysis by R package Rtsne, and pheatmap with linkage method WPGMA and Euclidean distance were performed for evaluation of subgroups 18,[45][46][47] .
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Whole Genome Sequencing data of this study has been deposited to Genome Sequence Archive in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences 48,49 , https://bigd.big.ac.cn/gsa-human, accession number HRA000092, and RNAseq, panel targeted sequencing, and methylation microarray data to the European Genome-phenome Archive (EGA, http://ega-archive.org) under accession number EGAS00001004341. The deposited and publicly available data are compliant with the regulations of the Ministry of Science and Technology of the People's Republic of China.
MB Midbrain tegmentum (11) Tectum (5) Pontomesencephalic junction (2) Focal pontine tumor (5) Focal intrinsic medullary ( II  III IV  I  II  III IV  I  II  III IV  I  II  III IV   GBM Fig. 7 Overview of subtypes of brainstem glioma Integrative analysis of brainstem gliomas revealed four different subtypes with distinct clinical, genetic, and epigenetic characteristics. The illustration on the top shows the locations of tumors selected for this study. Numbers in the parentheses indicated case counts in the code index respectively. Based on methylation patterns, brainstem gliomas could be differentiated into four subtypes: H3-Pons, H3-Medulla, IDH, and PA-like. Clinical information was listed in columns. Tumor location, histopathology, grade, and diagnosis of DIPG are showed in frequency as barplots. Survival data was showed as Kaplan-Meier curves, and age at diagnosis is represented as density plot. Several hallmark mutations are selected and showed in frequency as barplots.