The genomic landscape of 85 advanced neuroendocrine neoplasms reveals subtype-heterogeneity and potential therapeutic targets

Metastatic and locally-advanced neuroendocrine neoplasms (aNEN) form clinically and genetically heterogeneous malignancies, characterized by distinct prognoses based upon primary tumor localization, functionality, grade, proliferation index and diverse outcomes to treatment. Here, we report the mutational landscape of 85 whole-genome sequenced aNEN. This landscape reveals distinct genomic subpopulations of aNEN based on primary localization and differentiation grade; we observe relatively high tumor mutational burdens (TMB) in neuroendocrine carcinoma (average 5.45 somatic mutations per megabase) with TP53, KRAS, RB1, CSMD3, APC, CSMD1, LRATD2, TRRAP and MYC as major drivers versus an overall low TMB in neuroendocrine tumors (1.09). Furthermore, we observe distinct drivers which are enriched in somatic aberrations in pancreatic (MEN1, ATRX, DAXX, DMD and CREBBP) and midgut-derived neuroendocrine tumors (CDKN1B). Finally, 49% of aNEN patients reveal potential therapeutic targets based upon actionable (and responsive) somatic aberrations within their genome; potentially directing improvements in aNEN treatment strategies.

Subsequent alignment, somatic mutation detection and in silico tumor cell percentage estimation were performed in a uniform manner as detailed by Priestley et al. (2019). Briefly, paired-end sequencing reads were aligned against the human reference genome (GRCh37) using ). Duplicate reads were marked and small insertion and deletions (InDels) were realigned using GATK IndelRealigner (v3.4.46). Prior to somatic SNV and InDel variant calling, base qualities were recalibrated using GATK BQSR (v3.4.46). Somatic SNV, InDels and MNV were called by Strelka (v1.0.14) using the matched peripheral blood WGS sample for matched-normal variant calling.
Additional in-depth settings and optimizations of the HMF pipeline are described by Priestley et al. (2019) and tools are available at https:// github.com/hartwigmedical/.
SIFT and PolyPhen-2 scoring was applied for additional functional effect prediction.
During downstream analysis, we only retained SNV, InDels and MNV which passed all of the following heuristic filters; default Strelka filters (PASS-only), gnomAD exome (ALL) allele frequency < 0.001, gnomAD genome (ALL) < 0.005, not present in " 5 samples from the Hartwig Medical Foundation germline panel-of-normals (GATK Haplotyper) and not present in " 3 samples from the Hartwig Medical Foundation Strelka-specific somatic blacklist.
Putative protein-altering (coding) or high-impact (e.g., splicing) mutations were aggregated per sample and gene by selecting the most deleterious annotated effect (from VEP) on any known overlapping gene-wise transcript (except those transcripts flagged as retained intron and nonsense mediated decay). In addition, structural variants with a Tumor Allele Frequency (TAF) " 0.1, as calculated by PURPLE and GRIDSS84, that overlapped only partly with the respective coding sequences (i.e., not all exons of the respective gene), were annotated as 'Structural Variant' mutations. Multiple coding mutations and/or SV per gene were annotated as 'multiple mutations'.
Discovery of somatic structural variants (SV), copy-number alterations and in-frame fusions of EWSR1 was performed using the GRIDDS, PURPLE and LINX suite.84 During the downstream analyses, we only retained somatic structural variants passing all default QC filters (PASSonly) and with an upstream and/or downstream TAF " 0.1.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.
Mean read coverages of the reference and tumor samples were calculated using Picard Tools (v1.141; CollectWgsMetrics) based on GRCh37. Genomic and coding tumor mutational burden (TMB) was calculated as previously described by van Dessel, van Riet et al. (2019).
Detection and annotation of recurrent copy-number alterations To detect recurrent copy-number alterations, we performed a GISTIC289 (v2.0.23) analysis over the entire mNEN cohort and, again, four separate GISTIC2 analysis on the major subgroups (mNEC, mNET and pancreas-and midgut-derived mNET).
The GISTIC2 was performed using the following settings: Genes were annotated to GISTIC2 peaks (q ! 0.1) based on the following strategy; GISTIC2 focal peaks (all_lesions.conf_95.txt) were overlapped to genes (from verified and manually annotated loci, no pseudogenes or readthroughs and from standard chromosomes; n = 36574) from GENCODE (GRCh37; v33), taking into consideration only the genes overlapping with at least 100 base pairs within the detected GISTIC2 peak. If a GISTIC2 focal peak overlapped with multiple GENCODE genes, a combined database containing known drivers detected in a metastatic pan-cancer dataset (CPCT-02), COSMIC Cancer Gene Census (v85) 6d). By comparing the cophenetic correlation coefficient, residual sum of squares and silhouette, we opted to generate seven custom de novo signatures. Custom signatures were correlated to existing (COSMIC v3) mutational signatures using cosine similarity. Detection of chromothripsis Shatterseek (v0.4) using default parameters was used to detect chromothripsis-like events. As input, we used the rounded absolute copy numbers (as derived by PURPLE) and structural variants with an TAF " 0.1 at either end of the breakpoint. The male sex chromosome (chrY) was excluded. The criteria for a chromothripsis-like event were based on the following criteria: a) total number of intra-chromosomal structural variants involved in the event "25; b) max. number of oscillating CN segments (2 states) "7 or max. number of oscillating CN segments (3 states) "14; c) total size of chromothripsis event "20 megabase pairs (Mbp); d) satisfying the test of equal distribution of SV types (p > 0.05); and e) satisfying the test of non-random SV distribution within the cluster region or chromosome (p ! 0.05).

Classification of homologous recombination deficiency genotypes
To determine Homologous Recombination Deficiency (HRD) due to possible loss-of-function of BRCA1 and/or BRCA2 (amongst others), we utilized the Classifier for Homologous Recombination Deficiency with default settings (CHORD; v2.0). CHORD uses a random-forest approach to classify samples into HR-deficient / HR-proficient categories. Briefly, we make use of CHORD31; a random-forest based classifier designed to classify samples with evidence of HRD (BRCA1-type, BRCA2-type or otherwise) by using all the information captured within all the somatic small mutations and somatic structural variants of whole-genome sequenced samples. If a sample contains sufficient HRD-related genomic scars (structural variants) and additional markers for HRD, that sample will be classified as HR-deficient (HRD).

Detecting enrichment of mutant genes within major subgroups.
To determine the enrichment of mutant genes within our major subgroups (mNEC, midgut-and pancreas-derived mNET), we generated a list of potential driver genes based on captured genes through our dN/dS (q ! 0.1) analysis and/or present within the focal amplification and deletion peaks captured by GISTIC2. We extended this list by selecting genes which contained a coding mutation in "20% of a respective subgroup or which harbored a deep amplification or deletion in "20% of the respective subgroup (i.e., 20% of the respective subgroup contained coding mutations and/or "20% contained a copy-number alteration, irrespective of coding mutation). Using this list of genes (n = 20), we performed a one-sided (enrichment) Fisher's Exact Test with Benjamini-Hochberg correction between each pairwise comparison per major subgroup against the remaining major subgroups (e.g., mNEC vs. the combined group of midgut-and pancreas-derived mNET). Inventory of clinically-actionable somatic alterations and putative therapeutic targets Current clinical relevance of somatic alterations in relation to putative treatment options or resistance mechanisms and trial eligibility was determined based upon the following databases; CiViC (Nov. 2018), OncoKB (Nov. 2018), CGI (Nov. 2018) and the iClusion (Dutch) clinical trial database (Sept. 2019) from iClusion (Rotterdam, the Netherlands). The databases were aggregated and harmonized using the HMF knowledgebase-importer (v1.7). This list was manually corrected for discrepancies and subsequently, we curated the linked putative treatments for current on-and off-label mNEN and mNEN-subtype treatment options, as defined within the Netherlands by the Dutch Medicines Evaluation Board ("College ter Beoordeling van Geneesmiddelen; CBG).

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability Field-specific reporting Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative. No a priori sample size test was performed. We performed analysis on all the distinct mNEN samples which had been successfully sequenced within the CPCT-02 consortium (n = 85). Considering the rare occurrence of this malignancy, we considered this to be representative of mNEN as a whole. This is the largest (WGS) repository to date with previous (m)NEN cohort-analysis being performed on as small as~5-25 samples.
If multiple WGS samples were available from the same patient(s), these were excluded and only the first biopsy was used in order to prevent inflation of observed measurements. In addition, if the clinical records of the patients (among other biopsy location, prior therapy, sample identifiers etc.) were not all available, we did not take along these samples. This is further detailed within the manuscript and main figure 1.
In addition, following review of the associated pathological reports we've excluded a single sample prior to analysis.
A duplicate (WGS) cohort of mNEN is not available for replication/validation. We compared previous findings within the field of (m)NEN genomics and found them to be concordant with our cohort in to testify for the quality of our cohort as no other surrogate was available.
N/A, this study makes no use of separate experimental groups. Inclusion criteria of patients within the CPCT-02 and DRUP studies are detailed within the manuscript.
Blinding was not relevant as investigators specifically wished to make use of sample meta-data (biopsy location, prior therapy) within the various analysis to, for example, compare pancreas vs. midgut-derived mNEN. All results are based upon WGS-derived metrics as detailed within the manuscript.