Main

The mutational landscape of invasive breast carcinoma has been extensively documented in recent landmark studies.1, 2, 3, 4, 5 However, little is known about the mutational profile of ductal carcinoma in situ (DCIS). Studies have addressed mutations of individual genes6, 7, 8, 9, 10, 11, 12, 13 but until very recently, assessment of a broad panel of genes in DCIS has not been performed, largely due to the challenges in obtaining DNA from DCIS cases compatible with highly multiplexed methodologies. To date, whole genome or whole exome sequencing has been performed in only 45 published DCIS cases in total, all using DNA from fresh frozen tissue.4, 14, 15 Therefore, the cases used to date have been highly selected, derived from large mass-forming tumors with sufficient tissue to spare after diagnostic samples were taken.

To document the mutational profile of DCIS we performed massively parallel sequencing of a comprehensive panel of cancer-related genes on formalin-fixed, paraffin-embedded derived DNA from 20 DCIS tumors representing a full range of phenotypes. Our aims were to (1) document the mutational landscape of DCIS of different grades, hormone receptor and ERBB2 status, (2) identify differences between invasive and pre-invasive disease, (3) assess whether particular genomic alterations correlate with clinico-pathological parameters, and (4) determine whether genomic profiling of DCIS can be implemented in the routine diagnostic setting.

Materials and methods

Clinicopathological Parameters of Patients and Tumors

Twenty DCIS cases were selected to represent a range of tumor characteristics (Table 1) and to have sufficient DNA quantity and quality to ensure successful sequencing. The median age of patients at surgery was 56.5 years (range 29–90 years, mean 56.5 years). Ten cases were of high nuclear grade, seven of intermediate nuclear grade, and three of low nuclear grade. Sixteen cases were estrogen receptor (ER)-positive, 14 progesterone receptor (PR)-positive, and five cases were HER2 amplified. No long term clinical outcome data was available for these cases. Five cases had matched normal DNA available derived from adjacent breast tissue. Additional cases for GATA3 sequencing were obtained from Royal Melbourne Hospital as previously described.16 Approval for the study was obtained from the ethics committee of Peter MacCallum Cancer Centre (project numbers 02/26, 10/16, and 00/81).

Table 1 DCIS cohort

Sample Processing and Sequencing

Areas of DCIS were microdissected from formalin-fixed paraffin embedded sections and DNA extracted using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. A total of 500 ng per sample was used for sequencing. The KAPA Hyper Prep Kit (Kapa Biosystems, Wilmington, MA, USA) was used for library preparation with Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA) for library clean-up. Target capture was performed using the SureSelectXT Target Enrichment System (Agilent Technologies, Santa Clara, CA, USA), targeting the exons of 107 cancer-related genes (Supplementary Methods), including 61 genes of specific relevance to breast cancer, covering a total of 360 kb. Captured libraries were sequenced using a NextSeq500 sequencer (Illumina, San Diego, CA, USA).

Identification and Filtering of Variants

Sequencer output was aligned to the reference genome using Burrows–Wheeler Aligner software17 and variants were called using the SNPPET algorithm provided in the Agilent SureCall software package (Agilent Technologies). The variants identified by the Agilent SureCall algorithm were filtered for non-synonymous variants in exonic or essential splice site locations with a variant allele frequency of at least 10%. The allele frequency cut-off of 10% was chosen to minimize inclusion of sequencing artefacts related to formalin fixation.18 As no matched normal DNA was submitted for sequencing, the 1000 Genomes database was used to exclude potential germline variants, which may represent non-pathogenic genetic variation in the population.19 The sequencing reads in regions containing variants were then visually inspected using Integrative Genomics Viewer to exclude potentially artefactual variants, such as those occurring in variant-rich regions or variants identified exclusively at read ends.
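The automated part of this filtering can be illustrated with a minimal sketch. It is not the SureCall implementation: the `Variant` record, the consequence labels, and the example calls are hypothetical, and only the 10% allele-frequency cut-off and the population-database exclusion come from the text. Visual inspection of reads remains a separate manual step.

```python
from dataclasses import dataclass

# Hypothetical consequence labels; real annotation tools use their own vocabularies.
NON_SYNONYMOUS = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}
MIN_VAF = 0.10  # allele-frequency cut-off stated in the text


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str       # hypothetical annotation label
    vaf: float             # variant allele frequency, 0..1
    in_1000_genomes: bool  # True if listed in the population database


def passes_filters(v: Variant) -> bool:
    """Keep non-synonymous or splice-site calls at >=10% VAF that are absent from 1000 Genomes."""
    return (
        v.consequence in NON_SYNONYMOUS
        and v.vaf >= MIN_VAF
        and not v.in_1000_genomes
    )


def filter_variants(variants):
    return [v for v in variants if passes_filters(v)]


# Invented example calls, for illustration only.
calls = [
    Variant("PIK3CA", "missense", 0.23, False),    # kept
    Variant("PIK3CA", "synonymous", 0.40, False),  # dropped: synonymous
    Variant("TP53", "missense", 0.06, False),      # dropped: below 10% VAF
    Variant("GATA3", "splice_site", 0.35, True),   # dropped: in population database
]
kept = filter_variants(calls)
```

Only the first of the four invented calls survives all three criteria.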

Validation of Variants by Sanger Sequencing

The variants identified by massively parallel sequencing were validated by Sanger sequencing. Normal DNA, where available (n=5), was also subjected to Sanger sequencing alongside matching DCIS samples. Additional samples were sequenced for GATA3 exons 5 and 6 using different primers with greater exon coverage. Sanger sequencing primers were designed using the Primer 3 tool.20, 21 Target sequences were amplified using primers and conditions listed in Supplementary Methods. The BigDye Terminator system (Applied Biosystems) was used for sequencing on a 3730 DNA Analyzer (Applied Biosystems). The sequencer output was viewed using Sequencer 4.8 software (Gene Codes Corporation, Ann Arbor, MI, USA) or Geneious (Biomatters, Auckland, New Zealand).

p53 Immunohistochemistry

Immunohistochemistry for p53 was performed using a Ventana BenchMark Ultra (Roche Diagnostics, USA) on 3 μm formalin-fixed paraffin embedded tissue sections of tissue microarrays containing 0.5–2 mm cores of DCIS with up to eight-fold redundancy as described previously.22, 23 Antigen retrieval was performed in a high pH Ultra cell conditioning solution (CC1, Roche Diagnostics) for 32 min at 100 °C. Sections were incubated with the p53 antibody (Novocastra Liquid Mouse Monoclonal DO-7, Leica Biosystems) at 1/100 for 24 min at 36 °C. The On-board detection system, OptiView Universal DAB Detection Kit (Roche Diagnostics) was used in accordance with the manufacturer’s instructions.

p53 immunohistochemical nuclear reactivity was scored for intensity (0=no reactivity, 1=weak, 2=moderate, 3=strong reactivity) and percentage of tumor cells positive to the closest 5%. Absence of p53 nuclear reactivity or nuclear reactivity in ≥60% of tumor cells was considered to be an abnormal pattern and suggestive of TP53 mutation.24 For cases represented by more than one tumor core, absence of p53 nuclear reactivity in all cores or ≥60% nuclear reactivity in any core was considered an abnormal pattern.
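The multi-core decision rule above can be written down as a short sketch. The function name and the output labels are illustrative choices; the 60% threshold and the "all cores negative" versus "any core at or above threshold" logic follow the text.

```python
ABNORMAL_POSITIVE_PCT = 60  # threshold stated in the text


def classify_p53(core_percentages):
    """Classify one case from the percentage of p53-positive tumor nuclei in each core."""
    if not core_percentages:
        raise ValueError("at least one scored core is required")
    # Over-expression pattern: any single core at or above the threshold
    if any(p >= ABNORMAL_POSITIVE_PCT for p in core_percentages):
        return "abnormal (over-expression)"
    # Complete absence pattern: every core must be negative
    if all(p == 0 for p in core_percentages):
        return "abnormal (absent)"
    return "normal"
```

For instance, a case with cores scored 0% and 10% is classified as normal, because absence must hold in all cores, whereas cores scored 5% and 75% give an over-expression pattern.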

Copy Number Analysis

Copy number data were generated using the CopywriteR tool in R,25 with a normal DNA sample included in the same sequencing batch serving as the baseline control. Copy number data from invasive breast carcinomas was downloaded as Level 3 data from the TCGA Data portal. Data were imported to Nexus (BioDiscovery) with thresholds of gain and loss of ±0.2 log2 ratio. Fraction of the genome altered was calculated as the percentage of the genome in base pairs affected by gain or loss. Mutation data from TCGA was obtained from the cBio Portal.26 The results published here are, in part, based upon data generated by TCGA project established by the NCI and NHGRI. Information about TCGA and the investigators and institutions who constitute the TCGA research network can be found at http://cancergenome.nih.gov.
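As an illustration of the fraction-of-genome-altered calculation, the sketch below sums the base pairs in segments whose log2 ratio reaches the ±0.2 threshold. The segment tuple layout, the function name, and the choice to treat the threshold as inclusive are assumptions; Nexus may handle these details differently.

```python
THRESHOLD = 0.2  # |log2 ratio| cut-off for gain or loss, as stated in the text


def fraction_genome_altered(segments, threshold=THRESHOLD):
    """Percentage of covered base pairs called as gain or loss.

    segments: iterable of (start_bp, end_bp, log2_ratio) with end exclusive.
    """
    total = 0
    altered = 0
    for start, end, log2_ratio in segments:
        length = end - start
        total += length
        if abs(log2_ratio) >= threshold:  # inclusive threshold is an assumption
            altered += length
    if total == 0:
        raise ValueError("no segments supplied")
    return 100.0 * altered / total


# Invented toy segments: 600 bp neutral, 300 bp gained, 100 bp lost.
toy_segments = [(0, 600, 0.05), (600, 900, 0.45), (900, 1000, -0.30)]
fga = fraction_genome_altered(toy_segments)  # 400 of 1000 bp altered
```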

Results

Sequence Variants Detected

The median number of reads per sample was 20 054 319 (range 17 504 368–24 942 911; mean 20 300 466), with a median of 16.0% being duplicate reads (range 10.2–27.7%; mean 16.6%). A median of 99.7% of reads were mapped (range 99.6–99.8%; mean 99.7%), and a median of 60.1% were on-target reads (range 51.8–66.1%; mean 60.1%). The median read-depth over the target regions was 1692 reads (range 1325–2031; mean 1651) and the average percentage of target bases with over 1000 reads was 99.1%. These parameters indicate very good technical performance of the assay, which is expected to yield good quality and reliable data.

There were a total of 11 145 variants in 103 genes identified by the Agilent SureCall algorithm. The median number of variants per sample was 600.5 (range 150–652; mean 557.3 variants per sample). Filtering based on inspection of the sequencing reads and the 1000 Genomes data set resulted in a reduction in the number of variants (81.9% and 73.8% of variants, respectively), suggesting that most of the variants in the target regions were likely sequencing artefacts or single nucleotide polymorphisms. After filters were applied, 52 candidate somatic variants in 25 genes remained and were validated by Sanger sequencing (Table 2).

Table 2 Variants (filtered) detected by sequencing panel

The majority of variants identified were validated by Sanger sequencing (48/50, 96%), apart from those occurring in FGFR1 and PALB2 (one variant each). It is unclear why these failed to validate, although with allele frequencies of 0.21 and 0.12, respectively, it is possible these were below the sensitivity of Sanger sequencing. The GATA3 variants of two samples (chr10:8111479insC, allele frequency 0.26 and chr10:8115709_8115711delGAC, allele frequency 0.35) were not validated due to insufficient DNA. However, variants occurring in the same regions as these two GATA3 variants were validated in other samples, giving confidence to the sequencing results (Table 2). Four variants in two patients were present in both the DCIS and matching normal DNA samples (Table 2). It is possible that other very rare germline variants remain in the validated variants, as some samples did not have matching normal DNA for comparison.

Mutated Genes in DCIS

Following Sanger validation there were a total of 46 variants in 19 genes over 20 samples. The cohort had a median of two variants per sample (range 1–4 variants per sample; mean 2.3 variants per sample). Forty variants (87%) were located in exons and the remaining six (13%) at essential splice sites. Similar to previous studies of DCIS,14, 15 single-nucleotide substitutions were the most frequent mutation type detected (33/46, 71.7%), with the most frequent substitution being C>T:G>A (14/33, 42.4%) (Figure 1).

Figure 1

Mutation types in DCIS. (a) Distribution of mutation types. (b) Distribution of single nucleotide substitutions. Both consider only filtered variants; known germline variants excluded.
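The paired notation used for substitutions (for example C>T:G>A) reflects the fact that a change read on one strand is equivalent to its complement on the other, so the twelve possible single-base changes collapse into six classes. A minimal sketch of that collapsing step follows; the function name is illustrative and the authors' own tabulation method is not described in the text.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Collapse a single-base change onto the pyrimidine reference base (six classes)."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the change on the opposite strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}:{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"


# Invented example calls: G>A and C>T fall into the same class.
counts = Counter(substitution_class(r, a) for r, a in [("G", "A"), ("C", "T"), ("G", "C")])
```

With these three invented calls, `counts` holds two C>T:G>A events and one C>G:G>C event.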

PIK3CA was the most frequently mutated gene, harboring 12 mutations in 11 cases (11/20, 55%). All the PIK3CA mutations were missense mutations. There were ten known activating PIK3CA mutations, five in the helical domain (one E542K, one E545K, and three E545A mutations) and five located in the kinase domain (one H1047Y and four H1047R mutations). In addition there was an E542V mutation (helical domain), predicted to be damaging and deleterious by the SIFT27 and PROVEAN28 algorithms, respectively. One mutation (E726K) was predicted to be non-deleterious.27, 28 One case had co-existing E542K and E545A mutations, which were mutually exclusive on the sequencing reads with differing allele frequencies (0.23 and 0.10, respectively), suggesting that these mutations were present in separate clones rather than being mutations on separate alleles of the same clone.

The next most commonly mutated genes were GATA3 (9/20, 45%) and TP53 (6/20, 30%). In contrast to PIK3CA, only one of the nine GATA3 mutations was a missense mutation, the remainder being four splice site mutations, three frameshift insertions, and one in-frame deletion. Both the missense mutation and in-frame deletion were predicted to be damaging and deleterious by the SIFT27 and PROVEAN28 algorithms, respectively. All but one of the mutations affected exons 5 and 6, consistent with the mutation types previously reported for this gene.5 Of the TP53 mutations, three were missense mutations, two were splice site mutations, and one was a frameshift deletion. The missense and splice site mutations are known deleterious mutations recorded in the IARC database.29 Co-existing mutations of GATA3 and TP53 were present in one case, of PIK3CA and TP53 in three cases, and of PIK3CA and GATA3 in four cases (Figure 2). RUNX1 (Figure 3) and TSC2 were mutated in two cases each (2/20, 10%). The remaining genes were mutated in one case each (1/20, 5%, Table 2 and Figure 2). Of note, one case (P124) had two truncating mutations in PIK3R1 suggestive of bi-allelic inactivation.

Figure 2

Mutations and copy number alterations identified in DCIS (germline variants excluded).

Figure 3

RUNX1 mutations. (a) Sample P44 with chr21:36252866G>C. (b) Sample P89 with chr21:36206724 insG. In both panels the Integrative Genomics Viewer plot of the targeted sequencing BAM file is shown at top, while the Sanger sequencing validation electropherogram is shown below.

The proportion of GATA3 mutated cases observed (45%) was considerably higher than that observed in studies of invasive breast cancer (4–22%).4, 5, 30, 31 We therefore evaluated GATA3 mutation status in a further 91 pure DCIS cases using Sanger sequencing of exons 5 and 6 (as DNA was limited). Non-synonymous or splice region mutations were identified in 21/91 cases (23%, Supplementary Table 1), for a final frequency of 30/111 cases (27%). This frequency is significantly higher than observed in invasive breast cancer (TCGA 54/507, P<0.0001, Fisher’s exact test), and remains significantly higher when only those with known ER positive status are considered (25/70 (36%) compared with 54/390 (14%) in TCGA, P<0.0001). We observed a higher proportion of missense mutations in our cohort than expected compared with invasive cohorts (8/30). It is possible that these are rare polymorphisms not represented in any existing databases, as no matching normal DNA was available to remove germline variants. However, even considering only overtly deleterious mutations (splice site and truncating), we still observed a significantly higher frequency than expected from TCGA data (19% all, P=0.02, 27% ER-positive only, P=0.007).

We hypothesized that GATA3 mutations might predict a better outcome. However, within ER positive cases treated by wide local excision, there was no association with recurrence (P=0.20, Cox log-rank test), although the power of this analysis was limited, with only seven events and no cases of invasive recurrence.

Mutations and DCIS Phenotype

There was no difference in the median number of mutations or the median number of genes mutated by nuclear grade, ER status, PR status, or HER2 amplification status (Table 3). However, mutations of some genes did show an association with DCIS phenotype. TP53 mutations occurred exclusively in high nuclear grade DCIS (P=0.011) and occurred more frequently in PR-negative DCIS (4/6, 66.7%) compared with PR-positive DCIS (2/14, 14.3%) (P=0.037). There was a trend of TP53 mutations occurring more frequently in ER-negative DCIS cases but this association did not reach statistical significance (3/4, 75% ER-negative vs 3/16, 18.8% ER-positive DCIS, P=0.061). No association was observed between the presence of PIK3CA mutations and DCIS phenotype. Mutations of other genes occurred in too few cases for a meaningful assessment of association with phenotypic features.
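The test behind these phenotype associations is not named in this paragraph. Assuming a two-sided Fisher's exact test, as used for the GATA3 comparison, the 2x2 counts given above can be checked with the dependency-free sketch below. The function name is illustrative, and each table is laid out as [[mutant, wild-type] in the first group, [mutant, wild-type] in the second group].

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table no more likely than the observed one (tolerance guards float ties)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# TP53 mutant vs wild-type counts taken from the text
p_grade = fisher_exact_two_sided(6, 4, 0, 10)  # high grade vs other grades
p_pr = fisher_exact_two_sided(4, 2, 2, 12)     # PR-negative vs PR-positive
p_er = fisher_exact_two_sided(3, 1, 3, 13)     # ER-negative vs ER-positive
```

Under this assumption the three values round to 0.011, 0.037, and 0.061, in line with the P values reported in the paragraph above.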

Table 3 Associations between mutations and DCIS phenotype

Copy Number Alterations

We undertook genome-wide copy number analysis using the CopywriteR algorithm, which uses the off-target reads to estimate copy number.25 Two cases, P13 and P92, gave poor quality copy number output and were excluded from further analysis. All the remaining 18 cases had at least one copy number event, and the copy number profiles generated were similar to those we previously reported for pure DCIS32, including common gains on 1q, 8q, 17q, and 20q and frequent losses on 8p, 11q, 16q, and 17p. There was good correlation between the copy number values generated by the on- and off-target reads for the genes in the panel (Spearman r=0.96, P<0.0001, Figure 4), and ERBB2 amplifications detected by sequencing were consistent with SISH data.

Figure 4

Integration of mutation with copy number data. (a) Good correlation between on- and off-target copy number predictions was observed. (b) Fraction of the genome altered by copy number (FGA) by TP53 mutation status. (c) Number of telomeric imbalances (NTAI) by TP53 mutation status. (d) Comparison of copy number profiles of TP53 mutant vs wild-type tumors (blue=gain, red=loss). Bar at the top indicates statistical significance at P<0.05.

Copy number profiles were evaluated for various measures of genomic instability, including overall aberration type, the fraction of the genome altered by copy number and a measure of homologous recombination deficiency, the number of telomeric allelic imbalances.33 Seven cases had simple profiles (all were low or intermediate grade and ER positive), three cases had a background of simple copy number change but with two or more high-level amplifications, and the remaining eight cases had highly complex profiles, most with high level amplifications. No cases had chromothripsis. The fraction of the genome altered by copy number ranged from 2 to 47% (median 12.9%). TP53 mutant tumors had a significantly higher fraction of the genome altered by copy number than wild-type tumors (Figure 4, P=0.005, Mann–Whitney two-tail test). The median telomeric imbalance score for the 18 DCIS cases was 8 (range 0–22), slightly lower than the median of 12 in invasive breast cancer.34 In our previous cohort of 53 DCIS with molecular inversion probe array data, the median was 8.5, thus we do not think the score is reduced by not having allelic imbalance information available. The telomeric imbalance score was significantly associated with TP53 mutation (Figure 4, P=0.001) and high nuclear grade (P=0.01).

We integrated our copy number and mutation data to investigate associations of particular mutations with copy number profiles. All TP53 mutant tumors had a complex copy number profile and were significantly enriched for gain of ERBB2, 3q, and 20q and loss of 9p, 17p, and X (Figure 4). In invasive breast cancers from TCGA, 3q gain was also strongly associated with TP53 mutation but the other regions were not significant at a threshold of >25% difference in frequency. All TP53 mutant tumors had either amplification (n=4) or gain (n=2) of ERBB2, similar to Abba et al14 (amplified n=4, gain n=1). To further investigate this association, we compared p53 immunohistochemistry with ERBB2 amplification status by silver in situ hybridization in a wider DCIS cohort, using tissue microarrays including both pure DCIS (n=187) and DCIS associated with invasive breast cancer (n=19). Abnormal p53 staining was observed in 68/187 (36.0%) pure DCIS and 9/19 (47%) mixed DCIS, with the majority of abnormal cases (57%) showing over-expression (Table 4). There was no significant difference in p53 positivity between pure and mixed DCIS (P=0.46). A significant association of p53 abnormal staining with ERBB2 amplification was observed in pure DCIS (P=0.033). Half of ERBB2 amplified pure DCIS cases were p53 abnormal, compared with 30% of ERBB2 non-amplified. Considering only p53 over-expression, 38% of ERBB2 amplified were p53 over-expressing, compared to 16% of ERBB2 non-amplified (P=0.003). Abnormal p53 protein was also significantly associated with high nuclear grade, ER negativity, and PR negativity (Table 4).

Table 4 p53 immunohistochemistry

Twelve of the twenty DCIS cases submitted for panel sequencing had p53 immunohistochemistry data. All five TP53 mutants with protein data were classified as p53 abnormal (either entirely negative for a truncating mutation or strongly positive in >60% of cells for a missense mutation; sensitivity 100%). Six of the seven cases classified as TP53 wildtype by sequencing were p53 normal by immunohistochemistry (specificity 85.7%), with the one discordant case negative for protein. Immunohistochemistry is therefore a fair proxy for mutation status in DCIS using these thresholds, given likely tumor heterogeneity.
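The sensitivity and specificity quoted above follow directly from the counts in the paragraph, treating sequencing as the reference and immunohistochemistry as the test. A minimal sketch of the arithmetic (function and variable names are illustrative):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Counts from the text: 5 mutants all p53 abnormal; 6 of 7 wild-type cases p53 normal
sens, spec = sensitivity_specificity(tp=5, fn=0, tn=6, fp=1)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```

With only twelve cases, each figure rests on very few observations, so the point estimates carry wide uncertainty.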

PIK3CA-mutated tumors were more likely to have loss of chromosome 18, while GATA3-mutated tumors were less likely to have loss of 8p. Neither observation was significant in TCGA invasive breast cancer data. Neither PIK3CA nor GATA3 mutations were associated with a significant difference in the fraction of the genome altered by copy number or the number of telomeric imbalances.

Discussion

Little is known about the mutations that occur in DCIS. We therefore used massively parallel sequencing of a targeted panel of 107 cancer-related genes to determine mutations and copy number alterations in 20 DCIS cases of various phenotypes. The panel was selected to represent frequently mutated breast cancer drivers and clinically actionable genes. The technique was shown to be robust for the detection of genomic alterations in formalin-fixed paraffin embedded derived DNA (96.0% validation rate). Previous studies examining mutations of a broad panel of genes in DCIS have utilized DNA from fresh-frozen tumor specimens, but this study shows that interrogation of formalin-fixed paraffin embedded-derived DNA can now be applied to a variety of DCIS cases from hospital-based series, including cases archived for up to nine years at the time of DNA extraction.

The rate of variants in DCIS samples was higher (5.6 mutations/Mb) than that previously reported for DCIS (1.61 mutations/Mb)14 and invasive breast cancer (1.66 mutations/Mb)4 and is likely to be due to the use of a panel focussed on genes known to be mutated in breast cancer. Although potential germline variants might have contributed to this (these were unable to be completely excluded as few cases had matched normal DNA available), the number of these germline variants is expected to be small, as stringent population filters based on large public databases were applied.

The most prevalent single-base substitution observed was the C>T:G>A alteration, accounting for 42% of single-nucleotide substitutions detected. The C>T:G>A alteration forms a major part of three of the five mutational signatures identified in breast cancer35 and is thought to occur through the spontaneous deamination of 5-methyl-cytosine which accumulates with age35, 36 and overactivity of the APOBEC family of cytidine deaminases.35, 37 Although C>T:G>A alterations are also characteristic of formalin-related DNA damage,18 the proportion of C>T:G>A alterations is similar to that reported in fresh-frozen samples of invasive breast cancer (~40%)4 and DCIS (~50%)14, 15, so the C>T:G>A alterations detected are unlikely to represent formalin-fixation artefacts.

Mutations of 19 genes were identified including those involved in DNA repair and cell cycle control (ARID1A, ATM, CDKN2A, NBN, RAD51D, TP53), the PI3K/AKT/mTOR pathway (PIK3CA, PIK3R1, TSC2), the Hedgehog pathway (PTCH1), transcription factors (CBFB, GATA3, RUNX1, TBX3), receptor tyrosine kinases (ERBB2, FGFR4), serine/threonine-protein kinases (STK11), other tumor suppressor genes (NF2), and splicing factors (SF3B1), suggesting that multiple mechanisms are disrupted in the development of DCIS. Notably, unlike Abba et al,14 we only saw one case that had no common driver (ie, wild-type for TP53, PIK3CA, ERBB2, and GATA3). Although this case (P124) had a relatively stable genome (5% affected by copy number), the copy number changes included a CCND1 amplification, CBFB loss, and CDH1 loss. The sample also carried two PIK3R1 truncating mutations. Thus, DCIS cases of all phenotypes carry mutational drivers reminiscent of invasive breast cancer, although the frequency of some events, such as GATA3 mutations, may vary. Analysis of more DCIS cases with matching normal DNA will be required to determine whether the more rarely mutated genes are indeed more frequently affected in DCIS than invasive breast cancer.

The prevalence of PIK3CA mutations in this cohort (55%) is somewhat higher than that previously reported for DCIS (17–48%)7, 8, 14, 15, 38, 39 and invasive breast cancers (25–36%).4, 5, 40, 41 The relatively small number of cases included in this and other DCIS studies (each involving between six and 202 cases)7, 8, 14, 15, 38, 39 may account for the variation in prevalence. Alternatively, the high read-depth of our targeted sequencing may have led to increased detection rates compared with exome or Sanger studies: four DCIS cases had a PIK3CA variant frequency of <20%, which could have been missed by less sensitive methods. The role of PIK3CA mutations as markers of tumor progression in DCIS is uncertain. PIK3CA mutations in invasive breast cancers have been associated with favorable tumor features41 and better prognosis in patients with ER-positive, HER2-negative tumors;40 however, we saw no association of PIK3CA mutations with clinico-pathological features in our cohort.

The GATA3 mutations were mostly observed in ER-positive cases (93%), in keeping with the reported association of GATA3 mutations with luminal breast cancer.5, 42 Only one case had both GATA3 and TP53 mutations, reflecting a pattern reported in invasive breast tumors where mutations of these two genes rarely co-exist.42 Four DCIS cases had both GATA3 and PIK3CA mutations, in contrast to invasive carcinomas in which mutations of these two genes are almost mutually exclusive.1, 42 In one case the allele frequencies suggest that the GATA3 and PIK3CA mutations could be present in different clones.

The prevalence of GATA3 mutations in the current DCIS cohort is significantly higher than that reported in invasive breast cancer (4–22%)4, 5, 30, 31 and in DCIS. Recently, Abba et al14 reported a GATA3 mutation rate of 7% in 30 pure high grade DCIS cases while Kim et al15 observed GATA3 mutation in one low grade DCIS out of a cohort of six DCIS lesions of various nuclear grades (16.7%). If the prevalence of GATA3 mutations observed in the current cohort is truly reflective of the general DCIS population, the high prevalence of GATA3 mutations in DCIS compared with invasive carcinomas suggests that GATA3 mutations, like HER2 amplification, are not selected for during the transition to invasive disease and perhaps may indicate better prognosis, similar to that observed in invasive carcinomas.42 Our extended cohort did not have sufficient invasive recurrences to be able to test this hypothesis.

TP53 mutations were present exclusively in high grade DCIS, and half the mutations were missense; both these features are consistent with previous reports of TP53 mutations in both invasive and in situ breast cancer.1, 4, 5, 6, 12, 13 We also noted an association of abnormal p53 protein with ERBB2 amplification. In invasive breast cancer from TCGA, TP53 mutation is also significantly associated with ERBB2 amplification (P<0.0001), and the proportion of ERBB2 amplified cases with TP53 mutation (61/118, 52%) is similar to our pure DCIS cohort (50%), despite the different methods used (exome vs immunohistochemistry and SNP arrays vs in situ hybridization for p53 and ERBB2, respectively). A key question in DCIS biology is the relatively higher proportion of cases with ERBB2 amplification compared to invasive cases: our data would not support a role for TP53 in explaining this difference.

We identified RUNX1 mutations in two DCIS cases. Such mutations have not previously been reported in DCIS but occur in 4–5% of invasive breast carcinomas,4, 5 where they are associated with luminal B expression profiles and high grade tumors.1 The DCIS cases with RUNX1 mutations were both ER positive, HER2 negative and high or intermediate grade, consistent with a luminal B phenotype. RUNX1 is required for ER localization and regulation of target genes43 and is believed to function as a tumor suppressor gene in breast cancer.44 We also observed a mutation of CBFB which was mutually exclusive with RUNX1 mutation, consistent with their role as subunits of a heterodimeric transcription factor. The CBFB-mutated case was also ER positive, HER2 negative and intermediate grade. The mutually exclusive relationship between mutations of RUNX1 and CBFB, and also between those of PIK3R1 and PIK3CA, is observed in invasive breast cancer.4, 5

As the panel assays genes predominantly altered in invasive breast cancer rather than an agnostic exome or genome approach, mutations unique to DCIS were unable to be identified. Nevertheless, a targeted approach has several advantages, including high read-depth, which enabled detection of mutations even at allele frequencies of 10%. In addition, the cost was relatively low and bioinformatics analysis quick and straightforward. We were also able to obtain good quality copy number data from almost all samples, which enabled copy number drivers such as CCND1 amplification and MAP2K4 homozygous deletion to be observed. Copy number alterations of MAP2K4 have not previously been reported in DCIS.

In conclusion, this study provides a snapshot of the mutational profiles of DCIS, incorporating both copy number and somatic point mutations. The entire spectrum of mutations in DCIS and even in an individual tumor is unlikely to have been documented in this study due to the small cohort, targeted gene panel approach and limited sampling of the lesions. Nonetheless, mutations were present in all samples suggesting that mutational processes have a role in DCIS biology. While generally similar mutational patterns to invasive breast carcinomas were observed, there was a surprisingly high prevalence of GATA3 mutations. Given that GATA3 mutations are associated with improved survival in patients with invasive breast cancer, it is hypothesized that GATA3 mutations in DCIS may be a marker of less aggressive behavior. In contrast, TP53 mutations were associated with adverse tumor characteristics. The role of copy number alteration, telomeric imbalance and specific gene mutations as prognostic biomarkers will require testing in a suitable cohort of DCIS with outcome data available. The panel testing approach used here will be an appropriate methodology to address these questions.