Integrative genomic analysis of salivary duct carcinoma

Salivary duct carcinoma (SDC) is one of the most aggressive subtypes of salivary gland cancer. Conventional chemotherapy and/or radiation have shown only limited clinical efficacy in the treatment of recurrent or metastatic SDC. Currently, clinically approved targeted therapeutics are applicable only in very limited cases, and there is a strong need for the development of treatments against this unique tumor type. To further interrogate the genomic features of SDC, we conducted multi-omic profiling to describe the genomic alterations prevalent in this disease. Whole-genome sequencing, whole-exome sequencing and transcriptome sequencing were performed on a discovery cohort of 10 SDC samples. Targeted genomic profiling was performed on an additional 32 SDC samples to support the findings obtained from the discovery cohort. The cohort was characterized by an average mutation burden of 85 somatic exonic mutations per tumor sample and harbored mutational signatures associated with BRCA and APOBEC/AID. Several genes, including TP53, RB1, SMAD4, HRAS, APC, PIK3CA and GNAQ, were recurrently somatically altered in SDC. A novel fusion gene generated by genomic rearrangement, MYB-NHSL1, was also noted. Our findings add a significant layer to the systematic understanding of potentially clinically useful genomic and molecular targets for a subset of recurrent/metastatic SDC.


Supplementary Materials
Supplementary Methods S1: Biospecimen collection and clinical data
Tumor samples were collected from SDC patients who underwent surgical resection. Each fresh-frozen tumor sample was collected with corresponding adjacent normal tissue or a matched blood sample. Each case was reviewed by two or more independent pathologists at Samsung Medical Center. For sequencing studies, we initially collected 30 SDC tissue samples with matched adjacent normal tissue or blood samples from the same individual. Of these, 14 samples passed initial quality control screening for DNA and RNA sequencing studies, and 4 of the 14 were further excluded on the basis of low tumor cellularity. The final 10 samples were sequenced at multiple levels, including whole-genome, whole-exome and transcriptome sequencing.
An additional 37 samples from FFPE tissue archives were screened for somatic mutations with the Ion Torrent AmpliSeq assay. Copy number alterations were determined with the NanoString nCounter assay. Of the 37 samples, 5 were removed from the final analysis because of quality control issues. The final composite SDC set therefore consists of 10 samples in the discovery cohort, analyzed with multi-dimensional sequencing, and 32 additional validation samples.
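The sample accounting described above can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative and are not part of the study's pipeline.

```python
# Discovery cohort: counts taken from the text above.
collected = 30          # fresh-frozen SDC samples initially collected
passed_qc = 14          # passed initial DNA/RNA quality control
low_cellularity = 4     # excluded for low tumor cellularity
failed_qc = collected - passed_qc
discovery = passed_qc - low_cellularity

# Validation cohort: FFPE archive samples.
ffpe_screened = 37
ffpe_removed = 5        # removed for quality control issues
validation = ffpe_screened - ffpe_removed

total = discovery + validation
print(failed_qc, discovery, validation, total)
```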

Supplementary Methods S2: Transcriptome sequencing
Total RNA from each tumor sample was prepared using the RNeasy Mini Kit (Qiagen, Germany). Transcriptome sequencing was performed on the 10 SDC discovery cohort samples with an adequate amount of RNA (1 µg) and integrity (RIN > 7.0, rRNA ratio > 1.5). RNA libraries for sequencing on the Illumina HiSeq 2000 were generated according to the protocol for the Illumina TruSeq sample preparation kit. In the analysis of the transcriptome-sequencing data, RNA fusions were detected by running three different algorithms in parallel: CHIMERASCAN, DEFUSE and FUSIONMAP. Fusion transcripts supported by more than two pipelines with an adequate reliance-score were selected as primary fusion candidates.
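The consensus step described above, keeping fusions that are reported by several callers, might be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the input layout and the example calls are assumptions, the reliance-score filter is not modeled, and `min_tools=3` reflects a literal reading of "more than two" of the three pipelines.

```python
from collections import Counter

def consensus_fusions(calls_by_tool, min_tools):
    """Return fusion gene pairs reported by at least `min_tools` callers."""
    support = Counter()
    for fusions in calls_by_tool.values():
        # Convert to a set so each tool counts at most once per fusion pair.
        support.update(set(fusions))
    return sorted(pair for pair, n in support.items() if n >= min_tools)

# Hypothetical toy calls; GENE_A/GENE_B are placeholders, and the per-tool
# support shown for MYB-NHSL1 is invented for illustration only.
calls = {
    "chimerascan": [("MYB", "NHSL1"), ("GENE_A", "GENE_B")],
    "defuse": [("MYB", "NHSL1")],
    "fusionmap": [("MYB", "NHSL1"), ("GENE_A", "GENE_B")],
}

candidates = consensus_fusions(calls, min_tools=3)
print(candidates)
```

Lowering `min_tools` to 2 would also keep the placeholder pair supported by two of the three callers, which shows how sensitive the candidate list is to this threshold.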
Supplementary Table: Non-silent recurrent mutations identified from whole-exome sequencing and targeted sequencing are summarized above. Shown in the table are the gene names, the number of non-silent mutations (nnon) identified in the cohort and the number of independent patients harboring such mutations (npat).
Unbiased hierarchical clustering of the contributing weights of each mutational signature in the samples of the current study yielded three mutation signature clusters. The violet cluster in the middle commonly possessed signature 3, which is associated with BRCA-related mutational processes. The dark orange cluster was characterized by signature 16. APOBEC signatures (signatures 2 and 13) were more broadly interspersed between the clusters. An outlying sample (sd01) formed another signature cluster; it displayed a low rate of somatic mutation, implicating a distinct mutational process. The left column shows the signatures contributing to the current SDC cohort, the bottom row shows the patients' sample IDs and the color key shows the samplewise normalized contribution of each mutational signature.
Whole-genome sequencing data from each SDC matched pair were used to analyze somatic copy number alterations. Depicted below is the copy number representation from the FACETS algorithm. The x-axis is the chromosome number and the y-axis is the ploidy and absolute copy number estimate.
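For the mutational-signature heat map described above, the samplewise normalization of signature contributions could look like the sketch below. It assumes that normalization means dividing each sample's signature weights by that sample's total, which the text does not state explicitly; the function name, sample names and weights are hypothetical.

```python
def normalize_per_sample(weights):
    """Scale each sample's signature weights so that they sum to 1."""
    normalized = {}
    for sample, sig_weights in weights.items():
        total = sum(sig_weights.values())
        if total == 0:
            # No attributable mutations: keep zeros rather than divide by zero.
            normalized[sample] = {sig: 0.0 for sig in sig_weights}
        else:
            normalized[sample] = {sig: w / total for sig, w in sig_weights.items()}
    return normalized

# Hypothetical raw weights (e.g. mutation counts attributed to each signature).
raw = {
    "sample_A": {"signature 3": 30, "signature 13": 10},
    "sample_B": {"signature 16": 5, "signature 2": 15},
}

fractions = normalize_per_sample(raw)
print(fractions)
```

Normalizing per sample puts tumors with very different mutation counts on a common scale, so that clustering reflects the mix of signatures instead of the overall mutation burden.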
Supplementary Fig. 5: Statistical annotation of recurrently amplified/deleted genomic regions in SDC cohort
The segmentation output from FACETS/BIC-seq2 was used as the input file for GISTIC 2.0 analyses. GISTIC false discovery rate (FDR) q-values (x-axis) are plotted across genomic and chromosomal locations.
Chromosomal regions with focal amplifications are shown in red, and those with focal deletions are shown in blue. Peaks mark the chromosomal regions that exceed the q-value significance threshold.
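To illustrate what an FDR q-value threshold does in a plot like this one, the sketch below converts per-region p-values into q-values with the Benjamini-Hochberg procedure and keeps the regions that pass a cutoff. This is a generic illustration and not GISTIC 2.0's own code; the region labels, p-values and the 0.05 cutoff are all hypothetical, since the threshold used in the study is not given.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

def significant_regions(region_pvalues, q_threshold):
    """Return {region: q-value} for regions with q <= q_threshold."""
    names = list(region_pvalues)
    qvals = bh_qvalues([region_pvalues[n] for n in names])
    return {n: q for n, q in zip(names, qvals) if q <= q_threshold}

# Hypothetical region labels and p-values.
pvals = {"region_1": 0.005, "region_2": 0.01, "region_3": 0.03, "region_4": 0.6}

peaks = significant_regions(pvals, q_threshold=0.05)
print(sorted(peaks))
```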
Regions of focal copy number amplification detected by GISTIC 2.0 are annotated in the supplementary data file. Listed in the file are the names of the genes within the peak regions of copy number alterations.
Fig. 9: Immune-profiling of SDC tumor samples
Tumor RNA sequencing data were used to deconvolute the immune cell profile of SDC surgical tissue samples. Shown below is the relative composition of the various immune cells in the tumor compartment. The legend on the right gives the color code for the immune cell types used to deconvolute the immune composition of the SDC samples.
Supplementary Fig. 10: Tumor-map representation of SDC
RNA sequencing data were processed following the processing pipeline of TCGA data. In the updated pan-cancer tumor map, each sample was positioned according to its RNA expression profile and represented in a 2-D principal component analysis plot. Each sample's RNA sequencing data was then matched to the closest tumor cluster in the pan-cancer tumor map space. Shown below is the tumor map representation of the 10 SDC transcriptome samples. Red balloon marks represent each of the 10 SDC samples. Eight of the 10 SDC samples clustered closely with breast cancer subtypes (shown in pink in the upper left corner of the figure below).
Two samples did not co-cluster with breast cancer (BRCA): one received an orphan assignment and the other fell in the head and neck squamous cell cancer cluster (pale blue in the figure above). Histological examination of these two samples indicated slight squamous differentiation, consistent with their assignment outside the breast cancer clusters.
Shown below is an enlargement of the breast cancer tumor map clusters. Three SDC samples co-clustered with the Her2 subtype of BRCA, two of which displayed ERBB2 focal somatic copy number alterations (3 red balloon marks in the yellowish tail). Four other samples co-clustered with the LumA subtype (blue cluster), and 1 sample co-segregated with the Basal subtype of BRCA.