Epigenetic landscape of pancreatic neuroendocrine tumours reveals distinct cells of origin and means of tumour progression

Recent data suggest that Pancreatic Neuroendocrine Tumours (PanNETs) originate from α- or β-cells of the islets of Langerhans. The majority of PanNETs are non-functional and do not express cell-type specific hormones. In the current study we examine whether tumour DNA methylation (DNAme) profiling combined with genomic data is able to identify cell of origin and to reveal pathways involved in PanNET progression. We analyse genome-wide DNAme data of 125 PanNETs and sorted α- and β-cells. To confirm cell identity, we investigate ARX and PDX1 expression. Based on epigenetic similarities, PanNETs cluster in α-like, β-like and intermediate tumours. The epigenetic similarity to α-cells progressively decreases in the intermediate tumours, which present unclear differentiation. Specific transcription factor methylation and expression vary in the respective α/β-tumour groups. Depending on DNAme similarity to α/β-cells, PanNETs have different mutational spectra, stage of the disease and prognosis, indicating potential means of PanNET progression.

The authors studied tumour DNA methylation (DNAme) together with genomic data in order to identify the cell of origin and pathways involved in PanNET progression. They studied genome-wide DNAme data of 125 PanNETs and used sorted α-and 13-cells as reference. To confirm cell identity, also ARX and PDX1 expression was studied. Based on epigenetic similarities, PanNETs clustered in α-like, 13-like (1/3) and intermediate tumours (2/3). Interestingly, CNAs were lowest in the former two subgroups. In addition, PanNETs with mutations in DAXX or ATRX were not found, explaining -in part (see below) -the shorter disease-free survival compared to wild type tumours. The two subgroups (alpha-and beta-like) were low in grading and also primarily T1 and T2, most likely functional (data not shown) and showed -not surprisingly -also a lower relapse rate. In detail, 14/19 of alpha-like PanNETs were G1 and 14/18 (1 without data) were of low stage (T1 or T2, Fig. 1A and table S1). Similarly, 8/14 beta-like PanNETs were G1 and 9/11 (3 without data) were T1 or T2. DNA methylation corresponded to the cell type assigned by TF expression. Tumours positive for PDX1 had the lowest risk of relapse. However, DAXX/ATRX status did not improve stratification of ARX+ samples for risk of relapse. The authors found different mutational spectra, stage of the disease and prognosis and suggest these findings (relevant for 1/3 of patients) as potential means. In summary, the presented data are novel and clinically relevant. However, some criticisms which should be ruled out.
We thank the reviewer for his positive feed-back. The reviewer is correct in stating that alpha-like and beta-like tumours are low in grade and primarily T1 and T2 tumours. Functionality of the tumours is reported in table S1. Notably, only 2/15 (13%, 4 cases NA) of the alpha-like samples were functional (1 glucagonoma and 1 insulinoma), while the vast majority (9/10,91%,4 NA) of the beta-like samples are insulinomas.

Criticisms:
1.1: The studied patient cohort encompasses a rather large number of small-sized PanNETs thereby limiting the validation parameter DSF/relapse rate, since small PanNETs are also clinically well-known for slow progression in contrast to larger tumors. Please, explain patient selection.

1.1
We thank the reviewer for this comment. The reviewer is correct, our collective includes 13% (16/123 -2 NA) of PanNETs smaller than 2cm. This reflects the surgical situation, as our collective is composed of a surgical rather than an oncologic series (Marinoni et al., 2014). We believe that such a collective composition is relevant to predict relapse: while for late stage tumours relapse is very likely to happen, only a portion of small low grade tumours will relapse hence the identification of those patients who need to be closely followed up or who may even need adjuvant therapy, is crucial. Most importantly, since we are interested in identifying cell of origin and way of progression, it is crucial to include in the study small tumours representing early phase of progression. Additionally, the main focus of our research are G1 and G2 tumours which most often have small size, G3 PanNET which always will relapse were excluded. 1.2 Why were patients included lacking correct stage, nodal and metastatic status? 1. 2 We agree with the reviewer that we had a relevant number of samples with no information. We were able to retrieve the information for 10 cases, which we added now in table S1. Nevertheless, we decided to include in the clustering analysis also the samples without TNM stage to increase the statistical power (by increasing the number) hence ensuring a more robust molecular tumour classification.
1.3 Furthermore, although a correlation between driver mutation status and epigenetic profiles across all PanNETs was observed, the clinical application in daily practice should be explained.  The overall picture in this report is that DNA methylation/bisulfite sequencing was used in a large primary cohort of pancreatic neuroendocrine tumors (125 PNETs) to categorize, cluster and predict the cell type of origin for PNETs, which the authors infer is an alpha cell a beta cell or a more primitive neuroendocrine progenitor. Their findings are confirmed in a second set of 65 PNETs. The authors suggest, based on DNA methylome analysis, that human PNETs can be divided into three or four broad groups, and these groups correspond to alpha-like cells, beta-like cells, and intermediate cells, and attribute the origin of PNETs as arising from alpha or beta cells in the endocrine pancreas. They further observe that the beta-like PNETs display largely benign clinical outcomes, the alpha cell-like have generally favorable outcomes (with a few exceptions), and the intermediate PNETs generally have unfavorable outcomes, and are associated with additional mutations or CNVs in DAXX, ATRX, and ALT or MEN1, which are well known to be associated with aggressive behavior and poor survival in PNETs. The authors conclude that adding DNA bisulfite sequencing to current targeted mutation sequencing studies in current use clinical use, as well as immunohistochemistry studies (for ARX, PDX1, insulin, glucagon), may provide additional diagnostic and prognostic data for patients with PNETs. Overall, the work is interesting, but poorly described, and overinterpreted. I am not sure there is much here that is new. Specific comments follow: We thank you the reviewer for the short overview of our work, however important points are missing: with our approach (not whole genome bisulphite sequencing but HumMeth450 BeadChip array) we were able to define epigenetically novel distinct PanNET subgroups, which overlap with genetic mutations. These subgroups are pro gnostically relevant, in addition to the mutational status. The subgroups furthermore allow the proposition of a progression model of human PanNET.

Perhaps most importantly, I am not sure we have learned much from this study. It is generally accepted that functional insulin-and glucagon-expressing
PNETs have a better prognosis than those that produce no hormones. It is also well established that PNETs with DAXX, ATRX, ALT with or without MEN1 mutations are more aggressive, more likely to metastasize, and have worse outcomes than PNETs that lack these mutations. Also, there is no mention of PTEN, mTOR pathways, PI3kinase pathway, TSC1, TSC2 mutations which may also be associated with aggressive tumors.

2.1
In our work, we clearly described for the first time on a large number of samples 4 epigenetic subgroups, with distinct: epigenetic, genetic, genomic aberration and prognosis. Our DNA methylation classification, besides confirming the clinical relevance of DAXX/ATRX loss, also identifies a new group of samples with poor prognosis (intermediate-WT), which cannot be detected by other types of molecular data.
The scope of the study goes beyond the mere description of prognosis associated with DAXX/ATRX mutation or functional tumours. It further focuses on identifying specific cell of origin and way of progression using DNA methylation alongside with gene mutations and CNA and on using these parameters to group clinically distinct PanNETs. We have been able to describe two distinct cells of origin for PanNETs, based on DNAmethylation, confirming the recent findings observed in Cejas et al. (Cejas et al., 2019), who however performed the study on a very limited number of samples (n=21). Additionally, Cejas and co-workers used a super-enhancer signature, while our approach, looking at genome-wide methylation signatures, is much broader. In conclusion, our findings move one-step forward in better defining PanNET groups, cell of origin and clinical implication.
The reviewer is correct in saying that other mutations such as member of mTOR pathways may be associated with aggressive behaviour. Following on the reviewer comment we added the information in table S1. These mTOR pathway mutations are usually occurring in about 15% of sporadic PanNETs (Jiao et al., 2011;Scarpa et al., 2017)  2.2. The details and results of the bisulfite DNA sequencing are inadequate and difficult to follow. What exactly was done? Was deep, next-gen bisulfite sequencing performed across the entire genome? Or more likely, chip assays for specific CpGs in specific regions? How were these sites selected? How many CpGs were assessed? Where specifically in the genome were they? And after filtering, 2131 were differentially methylated in PNETs vs. islet cells. This is a tiny number (there are ~30,000 CpG's in the insulin locus at 11p15.5-15.4 alone). And among these, 49 and 51 CpGs were in the regions of the "TFs". Again, these are very tiny numbers. And where in the genome specifically were these 49 and 51 differentially methylated CpGs?? Did they bear any relation to INS, PDX, ARX1, GCG loci or expression? Are they in promoters or enhancers? In imprinting control regions (ICRs)? Related to cell cycle regulatory genes? It is hard to ascribe any significance to such a small number of differentially methylated CpGs.

2.2
We thought that the method and technical details were exhaustively explained in the material and methods section, nevertheless upon the reviewer comment we made some repetitions in the results section. As reported in the material and methods we did not perform bisulfite sequencing but we used instead and array based approach.  (Sandoval et al., 2011). As stated in the materials and methods section (lines 370-372), probe filtering was performed as described in the ChAMP pipeline (Morris et al., 2014) and in details we filtered for: SNPs (SNP list generated from Zhou's Nucleic Acids Research paper in 2017 ), low quality probes (detection p value >0.01 and/or probes with <3 beads in at least 5% of samples per probe), all non-CpG probes, multi-hit probes (list come from Nordlund's Genome Biology paper in 2013 (Nordlund et al., 2013)) and finally for probes located on chromosome X and Y. After filtration, the PanNET sample matrix retained 363,665 probes and 125 samples (added at lines 88-89). All the relevant publications are referenced in the paper.

And after filtering, 2131 were differentially methylated in PNETs vs. islet cells.
This is a tiny number (there are ~30,000 CpG's in the insulin locus at 11p15.5-1 5.4 alone).

As already reported above, in our analysis we used chip based methylation data rather than bisulfite DNA sequencing data. The 2,131 differentially methylated sites refer to the comparison between samples of sorted alpha (n=2 samples) and sorted beta (n=2 samples) normal cells. Only sites with high differences in methylation (>20%) and low BH adjusted p value (adj. p value < 0.001) were selected in this case. The strict statistical thresholds and the low number of replicates might explain the number of CpGs reported. Nevertheless these numbers comply with numbers found in other studies, more in detail, Neiman and colleagues, comparing alpha and beta cell methylation (using the HumMeth450 BeadChip) show that: "genes active in one cell type and silent in the other
tend to share demethylated promoters, while methylation differences between α-and 13cells are concentrated in enhancers". Additionally, "Among over 450,000 sites analyzed from all autosomal chromosomes, they found 745 sites uniquely hypomethylated in 13cells, 353 sites uniquely hypomethylated in α-cells, and 3,753 sites commonly hypomethylated in both cell types" (Neiman et al., 2017). Table S2 includes all the 2,131 selected sites. In the table is reported location and features of the sites, we have added here a column reporting the specific chromatin state detected in normal islets by M. Thurner and colleagues (Thurner et al., 2018) as well as annotation about the imprinting control regions (only one probe is located on a paternal ICR). The attached figures depict number of DMPs (red) and total number of probes of the 450K array in the 15 pancreatic islet specific chromatin states (A) and the percentage of DMPs relative to the total number of probes in the specific chromatin states (B).

And among these, 49 and 51
CpGs were in the regions of the "TFs". Again, these are very tiny numbers. And where in the genome specifically were these 49 and 51 differentially methylated CpGs?? Did they bear any relation to INS, PDX, ARX1, GCG loci or expression? Are they in promoters or enhancers? In imprinting control regions (ICRs)? Related to cell cycle regulatory genes? It is hard to ascribe any significance to such a small number of differentially methylated CpGs. (Sandoval et al., 2011) and we agree with the reviewer that the selection of CpG sites located at regulatory regions of the genome ("closed weak enhancer", "lowly-methylated weak enhancer", "open weak enhancer", "closed strong enhancer", "open strong enhancer", "genic enhancer", "insulator" and "polycomb repressed states" -described in materials and methods, lines 392-398) results in a tiny number of probes (in total 102,666 probes).

As stated above we performed the analysis by using the HumMeth450 BeadChip, indeed this assay shows underrepresentation of enhancer/regulatory regions
The selection of sites, respectively to specific genes, was based on the list generated by Muraro et al. (Muraro et al., 2016) (line 108). In the study they describe the top 10 TF checkpoints specific for each of the 5 pancreatic endocrine cell types.

A better description of probe selection is included in lines 111-114.
Despite the small number of probes selected (49 and 51) we see a clear correlation between the clusters of samples defined by cell type specific TF checkpoints regulatory regions DNA methylation and expression of PDX1 and ARX as well as with the clusters defined by the phyloepigenetic trees (see figure 1

and table S1). [Redacted] we see high expression of the selected alpha/beta TF checkpoints in the alpha-like and beta-like tumours, respectively and gradual downregulation in the intermediate samples (see heatmap below). [Redacted]
[Redacted] The analysis we have performed is independent from any differential methylation between samples, providing a different approach for confirming the segregation of the samples according to the relative hypothetical cell of origin, when specific regions are selected. We have included as part of the manuscript 2 tables listing the 49 and 51 probes which are located in regulatory regions of the genome closed to specific alpha and beta TF checkpoints (table S3 and S4). None of the selected probes is located on an ICR (known ICRs were retrieved from WAMIDEX and igc.otago.ac.nz -extended annotations reported in A- GEOD-18809 -Illumina HumanMethylation450 BeadChip (v1.2, extended annotation) -description added at lines 399-401).

2.3
And the "transcription factor" data are confusing. The "TF"s are listed in Table 2. How specifically was this list generated? And to be clear, many of the genes listed do not encode TFs: for example, CDKN1C, BMP5, SMARCA1 are not TFs. And what is the point of the "TF" data? Muraro et al., the selection refers to figure 1E (Muraro et al., 2016). In this study the authors "... selected those genes that have been reported to function as transcription factors using the TFcheckpoint database (Chawla et al., 2013) (Table S4). Several genes and transcription factors found here have never been reported as markers for specific cell types of the human pancreas..."

Seit e 8/ 14
We recognized that we have oversimplified the definition of the genes as transcription factors, and we have corrected the text accordingly (TF checkpoints). Indeed some of these proteins are not direct transcription factors but involved in the regulation of transcription (CDKN1C reducing phosphorylation of the RNA polymerase II (pol II) Cterminal domain (CTD) (Ma et al., 2010), BMP5 acts as coactivator of the SMAD4 pathway and inhibitor of the STAT/JAK pathway (Chen et al., 2018), and SMARCA1 acting as ATPase which contributes to the chromatin remodeling complex involved in transcription (Peterson and Workman, 2000)). Additionally, it has been shown that CDKN1C is upregulated in NEUROG3 expressing cells, blocking cell cycle of the progenitor cells (Georgia et al., 2006). Similarly, induction of BMP signalling modulates cell fate choice between liver and pancreas (Rodríguez-Seguel et al., 2013). SMARCA1 shows downregulation with aging in islet cells (Rankin and Kushner, 2010) and specific developmental roles (Lazzaro and Picketts, 2001). Given the implication in regulating transcription processes and the involvement into islet cell development we decided to include the listed genes.
As pointed above the analysis we have performed is independent from any differential methylation between samples, providing a different approach for confirming the segregation of the samples according to the relative hypothetical cell of origin, when specific regions are selected. Furthermore, this analysis shows the similarity of intermediate-ADM to alpha-like tumours and the most likely different nature of intermediate-WT tumours, as discussed in lines 234-236.

The authors interpret the ARX/glucagon and PDX1/insulin expression data as implying that alpha cells and beta cells are the cells of origin in PNETs. I found this
unconvincing, and the authors also add a sentence to the Discussion (lines 286-288) acknowledging that time course studies would be required to confirm this hypothesis. But is there evidence that other endocrine progenitor cells in normal islets, such as delta, PP, ghrelin and other neuroendocrine cell types cannot be involved? It is not obvious to this reader that alpha or beta cells must be the cell of origin. Why not a primitive neuroendocrine cell rest left over from fetal pancreas development?
2. 4 We would agree with the reviewer if the data on ARX/glucagon and PDX1/insulin expression would stand alone. We recognize that we might have not been clear enough in the text in explaining our hypothesis. ARX and PDX1 expression have been used only to confirm the cell type identified by DNA methylation analysis and per se they are not an indicator of a specific cell of origin. On the contrary, our hypothesis on the cell of origin is based on the DNA methylation profiles and not on ARX/PDX1 expression.

Indeed, comparison of PanNET DNA methylation with the ones of sorted pancreatic cells including alpha, beta, acinar, ductal cells showed that a group of PanNET clearly resemble alpha and beta endocrine cells (alpha-and beta-like tumours)
, while an intermediate group shows low degrees of similarity to alpha or any cell type (see figure  1A, S2B and table S1). Nevertheless intermediate-ADM are more similar to alpha cells than other cell types (see figure 1A and table S1). Based on these evidences and given the recent publication in Nature Medicine from Cejas and colleagues (Cejas et al., 2019), we concluded that subgroups of tumours may originate either from alpha or from beta cells.
As stated in the introduction, DNA methylation profile is highly conserved along differentiation. For example, it was shown that cells with a common embryonic origin share more overlap in their DNA methylation than cells from the same tissues but with different origin (Kundaje et al., 2015), suggesting that common developmental origin is a primary determinant for global DNA methylation. For these reasons, epigenome is suitable to learn biologically meaningful relationships between cell type, tissues and lineages. As mentioned in the text, this approach has been used already in several studies in brain tumours to identify distinct cell of origin and developmental path (Gibson et al., 2010;Northcott et al., 2017) which were confirmed by scRNA-seq experiments (Hovestadt et al., 2019). Similarly, integrative analysis of colorectal adenoma, carcinoma, and crypt section methylomes shows that the methylation patterns of specific intestinal cell differentiation stages are maintained during colorectal carcinogenesis (Bormann et al., 2018).
In conclusion, we agree with the reviewer that intermediate tumours might arise from an adult ε, δ or PP cells, or even from a common progenitor cell but there is no methylome data or other evidence that supports this hypothesis. DNA methylation of alpha and beta like tumours instead strongly speaks in favour of an origin from alpha and beta endocrine cells. We do not exclude that intermediate tumours may originate from a precursor cell, especially the intermediate-WT for which DNA methylation profile do not clearly indicate a specific cell similarity. Upon the reviewer comments we rephrase this part in the discussion (lines 292-298).
2.5 It would be important to correlate the findings with a measure of proliferation (Ki67, mitotic index) since these are widely used in tumor prognosis.

2.5
We appreciate the reviewer suggestion and have performed the analysis for a subset of 56 samples for which data on Ki-67 was available. Results are shown in figure S6. The results are described in line 179 and  are reported in table S1.
2.6 Line 50-51. "Only a minority of PanNETs are functional, leading to clinical syndromes due to inadequate hormone secretion." This sentence on makes no sense and should be re-written.

2.6
This sentence has been corrected (lines 50-51): "Only a minority of PanNETs are functional, secreting inadequate hormones that lead to clinical syndromes".