## Introduction

Increasing adoption of breast cancer screening and advances in imaging capabilities have improved our ability to identify breast ductal carcinoma in situ (DCIS). Rarely diagnosed 40 years ago, DCIS now comprises nearly 20% of all breast cancer-related diagnoses1,2. Unfortunately, this progress has not resulted in decreased breast cancer mortality. Standard treatment, involving surgical excision often complemented with radiation therapy (in the setting of breast-conserving surgery) and endocrine recurrence risk reduction (particularly with ER + DCIS), therefore constitutes overtreatment, and not without treatment-related consequences for many2,3. DCIS progression is particularly difficult to study longitudinally due to the current standard of surgical excision of the lesion and the infrequent progression and/or occurrence of new primary lesions over a long timespan (5–10% after 10 years)4. Clinicopathological risk factors such as large size, dense breast, younger age, high pathological grade, presence of comedo necrosis, or Her2 positivity have been associated with increased risk of recurrence, but the resulting predictive models, or those relying on gene expression signatures, are currently insufficient to safely distinguish patients to watch from patients to treat5.

Contrary to models of progression in other tissue types, there is little evidence for the sequential accumulation of somatic alterations during progression from in situ to invasive breast cancer (IBC), but rather all IBC intrinsic subtypes and known driver mutations have been identified in DCIS, albeit at variable prevalence6,7,8,9,10,11,12. Moreover, both single-cell and bulk studies have shown similar clonal make-up of synchronous invasive and in situ lesions, convoluting the idea that clonal selection drives invasion11,13. The role of the immune environment has also been investigated, highlighting the higher lymphocyte infiltration in Her2+ or Triple Negative DCIS, or specific immunological make-up of samples at higher risk of progression12,14,15,16,17,18,19. Similarly, the role of the basal layer, fibroblasts, adipocytes, other stromal cells, or overall extracellular matrix has identified features that are different between DCIS and IBC, likely mediated by chemokine signaling, and can be associated with known progression risk factors20,21,22,23. Their active participation in the malignant transformation of the breast epithelium remains to be established as similar mechanisms are typically involved in normal development, activity, and aging of the mammary gland24,12.

Progress in our understanding of the processes mediating DCIS onset and progression has been considerably hindered by technical and logistical limitations. Indeed, pure DCIS lesions are commonly small in size, formalin and paraffin embedded (which damages nucleic acids), and can display significant histological heterogeneity25. As a consequence, comprehensive molecular and cellular assays and their integrated analysis have seldom been performed in pure DCIS cohorts. Capturing evidence of phenotypic, genetic, and cellular heterogeneity, and how they relate to each other is necessary to develop a better spatial, temporal, and functional understanding of the mechanisms at play. Recent advances in genome-wide assays, becoming compatible with ever more challenging samples26,27,28,29, have improved our ability to connect histological and molecular observations and enabled such application even to individual microbiopsies from a histological slide of pure DCIS.

Here we describe the combined, parallel histological, molecular, and immunological profiling of premalignant lesions from 39 patients diagnosed with DCIS, including multiple epithelial microbiopsies within a subset of samples. The dissection of specific epithelial lesions provided a detailed assessment of the association of their histological architecture with intrinsic subtypes, mutational landscape, driver mutations, and immunological states. Multi-region profiling resulted in the inference of clonal relationships, illustrating how genotypes related to phenotypes within a specimen. We, therefore, report multi-modal and sub-histological profiling of a cohort of pure DCIS, illustrating spatial heterogeneity and placing diverse states of immune activity observed in their specific molecular and histological context.

## Results

### Histological and molecular characterization

We collected a total of 43 specimens (referred to as samples) from 39 patients diagnosed with pure DCIS, including three samples from subsequent DCIS diagnosed between 14 and 70 months after the index DCIS (Fig. 1a, b, Table 1, Supplementary Table 1). Sixty-nine percent (29/42) of the samples were positive for estrogen receptor (ER) expression and 40% (16/40) had ERBB2 gene overexpression or amplification (Supplementary Fig. 1a). Each sample was further annotated for grade and histological architecture and the annotations were used to identify regions of interest, guide the microbiopsies of the epithelial areas and the immuno-histological analysis. On the basis of their studied regions, the cohort consisted of 32 high or intermediate grade DCIS (HG-DCIS), nine low-grade DCIS (LG-DCIS), and two low-grade atypical ductal hyperplasia (ADH). The DCIS regions could be further annotated according to their dominant histological architecture (17 cribriform, 19 solid, three mixed, two micropapillary) and the presence of necrosis (ten comedo necrosis, 17 other). LG-DCIS were more frequently of cribriform architecture (8/9), while HG-DCIS were frequently necrotic (25/32). The relative area of adipose tissue in each sample varied between four and 91% as estimated by segmental classification of the whole slide digital image (Fig. 1c, Methods). The lower adipose fraction was associated with higher mammographic breast density (p = 0.0067) suggesting the sample histology was representative of the whole breast texture. Interestingly, solid DCIS were associated with a higher adipose fraction (median 69% vs 40%, p = 0.008), suggesting a contribution of the breast microenvironment to the growth architecture. Overall, the cohort represents a diverse set of pure in situ lesions identified in absence of any detectable invasive component. The studied samples are enriched for DCIS lesions and specifically annotated for their histological architecture. Each sample was profiled using multiple assays, performed on sequential histological sections (4–7 µm) used for whole transcriptome, whole exome, and spatial immune profiling. Whenever possible, the investigated regions were matched across assays to preserve the spatial information in the analysis and limit the variation due to spatial heterogeneity. Spatial heterogeneity was further addressed in 21 samples for which multiple sub-regions were profiled independently.

The expression of genes was measured using high-throughput sequencing of RNA-seq libraries directly prepared from the microbiopsied regions27 (Supplementary Table 2). The samples were classified according to the PAM50 intrinsic subtypes used for invasive breast cancer (IBC), which identified Basal (N = 5), Luminal A (N = 10), Luminal B (N = 6), Her2-like (N = 7), and Normal-like (N = 10) samples (Fig. 1d). Consistent with IBC classification, Luminal A and B were enriched for samples from ER + cases, while Her2-like were enriched for Her2+ cases. Similarly, Luminal B and Her2-like were enriched in HG-DCIS, while Luminal A was almost exclusively composed of cribriform LG-DCIS. Luminal A and Normal-like represented closely related classes and together comprised the majority of the samples (20/38), which is not unexpected given the higher fraction of low-grade and pure in situ lesions in the cohort, in contrast with IBC and previous DCIS expression profiling studies18,30. The PAM50 subtype of two independent sub-regions with matching histology and grade was determined in ten samples and observed to be discordant in five samples (Supplementary Table 2B), which was associated with larger distances between the regions (Mann–Whitney, p = 0.005, Supplementary Fig. 1b). Interestingly, matched index and recurrent samples from two patients had at least one region with a concordant subtype. Across all samples, the distribution of probabilities for each PAM50 subtype likely captures such heterogeneity. Normal-like were truly a mix of Normal and Luminal A, while Her2-like tended to have two main subsets: Her2/Basal and Her2/Luminal B. This suggests that subtypes inferred from bulk analysis, even after epithelial microdissection, are frequently the result of a variable mixture of pure subtypes.

### Subtype differences in the mutational landscape

To determine whether any of the histological or molecular subtypes described above were associated with specific genetic alterations, we characterized their mutational landscape. Whole exome sequencing was carried out on microbiopsies from 30 samples using a procedure specifically optimized for a low amount of damaged DNA26. Mutations and copy number alterations (CNA) were identified in 27 and 30 samples, respectively (Supplementary Table 3). The median copy number burden—or fraction of the genome involved in CNA—was 0.14 and was 2.5 fold higher in HG-DCIS (Mann–Whitney, p = 0.017, Fig. 2a, b). Whole-genome doubling (WGD) events were detected in 3/8 eligible samples, all of which were low or intermediate grade cribriform DCIS consistent with its early timing in breast carcinogenesis31 (Supplementary Table 3). Consistent with previous studies, loss of 16q (13/30) and 17p (12/30) or gain of 1q (12/30) were among the most frequent chromosomal alterations and, while many events were more frequent in HG-DCIS (Fig. 2c, Supplementary Table 4), these hallmarks were also observed in low-grade or benign lesions, including ADH: (1q gain: 1/9, 16q loss: 7/9, 17p loss: 3/9).

We identified between 74 and 207 coding mutations per sample. The mutational burden was higher in HG-DCIS (Mann–Whitney, p = 0.003) and Her2-like subtypes (Mann–Whitney, p = 0.025), recognizing that these categories are overlapping. The HG-DCIS burden (4.4 mut/Mb) was higher than previous reports, possibly due to residual germline variants in our study11,13. We identified aging-associated mutational signatures (SBS1 and SBS5) in all samples eligible for analysis (N = 13), APOBEC signature (SBS2 and SBS13) on one intermediate grade solid DCIS and mismatch repair signature (SBS15 or SBS21) in three DCIS of variable grade and architecture (Supplementary Table 5). The APOBEC signature is, therefore, rarer in DCIS than IBC (~8% vs >75%), but can be present in premalignant lesions. Interestingly, this sample also displayed clustered mutations (N = 3 within 1,416 bp) in chromosome 17q (Supplementary Fig. 2), an APOBEC-driven kataegis site frequently seen in IBC32. The most recurrently mutated genes were PIK3CA (44%), TP53 (31%), and GATA3 (20%), and were all affected by known somatic mutations in breast cancer at similar rates to previous studies of pure DCIS6,7,9,10,11 (Fig. 2d and Table 2). TP53 mutations were only found in HG-DCIS and associated with high CNA burden (Mann–Whitney, p = 0.018), while GATA3 mutations were only found in cribriform or ADH histologies and associated with LG-DCIS (Fisher Exact, p = 0.005). Interestingly, GATA3 mutations were identified in larger lesions (Mann–Whitney, p = 0.038), consistent with a similar observation in invasive cancer and the larger tumor size of GATA3 mutated xenograft models33,34. Another nine selected genes known to be mutated in IBC were recurrently affected by 18 mutations predicted to be deleterious, four of which are known somatic mutations7. The result suggests that oncogenic driver mutations are already present at the premalignant stage, including in LG-DCIS (e.g. SF3B1 c.2098 A > G) or ADH (GATA3 c.925-3_925-2del). This is consistent with previous reports and reports of field-effect mutations in normal ducts or benign lesions35,36, though the contribution of these mutations to the lesion progression remains to be determined.

### Genetic heterogeneity and clonal diversity

The histologic assessment and expression profiling have revealed variable levels of phenotypic heterogeneity across the samples. In order to determine whether such heterogeneity is present at the genetic level, we measured genetic heterogeneity in two distinct ways: (1) divergence, which measures the genetic distance between regions of a sample and, (2) clonal relationships, which uses phylogenetic tree construction to establish evolutionary order to genetic alterations (Supplementary Table 3). We measured divergence by computing a CNA-based score on 19 pairs of histologically matching regions in 11 samples (Supplementary Table 3, Methods). With no pairs completely independent, the spatial distance separating dissected DCIS regions was correlated with the extent of their genetic divergence (R2 = 0.65, p = 0.00017, Supplementary Fig. 3), while this could simply be a result of local proliferation, it could also be a consequence of selective pressures of the microenvironment, migratory capacity or genomic instability of particular clones. Interestingly, one ADH had the lowest divergence despite a large distance, suggesting either a different pattern in ADH or a distance threshold for the extent of the correlation. Divergence was not associated with grade, Her2/ER status, or adipose fraction suggesting that local genetic heterogeneity is not associated with progression risk factors.

More precise clonal relationships between regions were evaluated using phylogenetic analysis in 12 samples, comparing CNA, and mutations when available (Fig. 3, Supplementary Fig. 4, Methods). While the majority (88.4%) of CNA were shared across all regions of a sample, 11.6% were private to some regions, as observed in 7/12 samples. Multiple samples (3/12) contained mutations in putative cancer driver genes that were private to one region only. These included known and likely pathogenic mutations in ATR, PIK3CA, MET, KDM5C, suggesting that not all driver mutations are acquired early. Interestingly, the three samples with the most private CNA displayed discordant histological architecture or discordant PAM50 subtypes between regions, suggesting that within a sample, genetic and phenotypic differences are linked. Furthermore, in 4/5 samples containing regions with discordant histology and 3/4 with discordant PAM50 subtypes, features historically associated with low-risk of progression (benign histology, Normal or LumA subtype), appeared earlier than regions with high-risk features (Her2 or Basal subtype, presence of necrosis). Overall, these results illustrate that in these samples, regions evolved to acquire distinct histological and molecular features, and in particular, regions with low-risk features can precede regions with high-risk features.

Substantial heterogeneity and evolutionary patterns are evident in samples like MCL76_061_16200 (Fig. 3a–c), where a region of benign columnar alterations preceded two cribriform regions. While all regions shared a WGD event as well as several arm-level CNA and pathogenic mutations in GATA3 and SF3B1, the cribriform region A acquired private 5q and 8q gains and necrotic features. While this example shows tandem genetic and histological changes as seen across the cohort, it also illustrates that despite occurring earlier, the benign region shares many “driver-like” alterations with both cribriform regions. Furthermore, in another example, despite homogeneous cribriform histologies in regions of MCL76_077_15300 (Fig. 3d–f) only one cribriform region lost a copy number of chromosome 8 and presented with Her2 PAM50 subtype as opposed to its Luminal A predecessor. Notably, bulk studies have shown chromosome 8 loss to be more frequent in Her2 vs Luminal A breast cancers37. Taken all together we illustrate abundant genetic heterogeneity in pure DCIS of all histologies and grades that parallels the levels of phenotypic heterogeneity and often accompanies it, even in regions that are millimeters apart.

### Regional differences in the immune microenvironment

To measure the diversity of the immune landscape and to investigate its potential association with molecular or histological features, we used multiplex immunohistochemistry (mIHC) to measure the number and density of four cell types—T-cells (CD3+), B-cells (CD20+), T-regs (CD3+/FOXP3+) and epithelial cells (PanCK+)—according to their proliferative status (Ki67+). Both epithelial (PanCK+) and adjacent stromal (PanCK- proximal to epithelium) areas from premalignant (N = 36 regions across 32 samples) or normal (N = 21 across 21 samples) histologies were evaluated. Among premalignant regions, the high-grade epithelial areas had lower cell density due to larger cell sizes and frequent central necrosis (median 3.8 vs 6.4 103 cells/mm2 p < 0.03—Mann–Whitney). Solid lesions had the highest fraction of proliferating epithelial cells (median 11.5% vs 2.8% p < 0.02—Mann–Whitney, Supplementary Fig. 5a), and interestingly 3/10 HG-DCIS cribriform lesions (two Her2-like, one Luminal B) had markedly higher proliferation. Consistent with previous findings, we observed higher lymphocyte infiltration in ER- and Her2+ samples compared to ER + ones (Supplementary Fig. 5b, Mann–Whitney, p < 0.001). We next classified all regions using non-negative matrix factorization of the stromal and epithelial cell densities, resulting in three immune states characterized by their dominant meta-markers (MM; Fig. 4a, b Supplementary Table 7a–c): (1) “Active”—ubiquitous high T-cells (high MM2), including a subset with elevated T-cell proliferation (high MM1), (2) “Suppressed”—ubiquitous low T-cells (low MM1 and MM2), high B-cells and T-regs (high MM3), and (3) “Excluded”—high stromal, low epithelial densities (high MM4). To further confirm differences between immune states, we compared the total T-cell, B-cell and T-regs densities in epithelial and stromal compartments. While overall lymphocyte densities were much higher in stroma than in epithelium across all examined regions (median ratio 9.8, Supplementary Table 7a), the skew was a distinguishing feature in regions in Excluded state for all three cell types (Fig. 4c, Supplementary Fig. 6 and Supplementary Table 7b). Furthermore, regions in Active states had the highest epithelial T-cell density (120 cells/mm2) while regions in Suppressed state had the highest T-regs and B-cells epithelial densities (10.4 and 8.1 cells/mm2 respectively). A larger fraction of the normal regions were found in Active (7/21) or Suppressed state (9/21) rather than in Excluded state (4/21) and premalignant regions in Excluded state were more likely to be high grade (7/15 vs 2/17 p = 0.049). Interestingly, the immune states of normal and premalignant regions were concordant in 12/19 matched cases and discordant in seven whose lesions were specifically in the Excluded state (Fig. 4d). This suggests that the eExcluded state may be acquired in response to premalignant growth, while other states may be intrinsic to various breast microenvironments. Furthermore, premalignant regions in Suppressed state were more likely identified in cases younger than 55 (5/8 vs 4/24 OR = 7.6 p = 0.02), consistent with the younger age of DCIS patients with infiltrating PD-L1 + lymphocytes38. We did not observe any associations between immune states and intrinsic subtype, ER or Her2 status, tumor size, breast density, adipose fraction, or DCIS architecture suggesting that they may be independent from traditional histopathological progression risk factors.

In order to identify functional differences between immune states, we evaluated the differential activity of Hallmark and Reactome processes among the 29 DCIS regions with available gene expression information (Supplementary Fig. 7). Compared to Active and Suppressed states, the Excluded state was associated with upregulation of Type 1 and 2 Interferon response, PD1 signaling, and proliferation-related processes as well as the repression of Calreticulin-Calnexin cycle (Supplementary Fig. 7). Noting that the epithelium of DCIS in Excluded state were not completely depleted of infiltrating lymphocytes, the upregulated processes were consistent with the higher expression of PCDC1 or CTLA4 genes in DCIS in Excluded state (Fig. 4e), albeit not significant, and suggesting a likely continuum of increasing immuno-suppression from Suppressed to Excluded states. More interestingly, the repression of the Calreticulin-Calnexin cycle was confirmed via single-sample enrichment analysis and showed a progressive repression from Active, to Suppressed, to Excluded states (p = 0.022, ANOVA, Fig. 4f). This suggests that the export of glycoproteins—including components of MHC1 complex—via the endoplasmic reticulum, impacts immune-surveillance. To verify this hypothesis, we measured the in situ expression of MHC1 complex in 15 samples (Supplementary Table 8 and Supplementary Fig. 8) and compared its levels in adjacent normal and DCIS in each immune state. While the level of MHC1 expression in DCIS region were not significantly different between Excluded and non-Excluded samples, the change between normal and DCIS was different, with the non-Excluded samples displaying increased expression between normal and DCIS, while the Excluded samples remained constant (p = 0.0009, Mann–Whitney, N = 15, Fig. 4g). This therefore suggests that the Excluded immune state may be mediated by both intrinsic expression level of MHC1 and ability to increase it in DCIS.

## Discussion

There is a compelling requirement for a DCIS atlas that delivers a relatively unbiased, multi-modal perspective of pre-invasive breast cancer. Here, we report the multi-modal profiling of a diverse set of pure DCIS. This comprehensive atlas both confirms previous molecular findings and provides a higher resolution histological and spatial context to interpret them. However, with only three known recurrences, the significance of our observations for progression prognosis could not be formally established. Our findings provide a landscape of representative pure DCIS identified in absence of invasive lesions. While some lesions were small, others were quite extended (N = 14 >4 cm), which should capture factors that may be associated with robust containment. The cohort therefore spans a variety of clinical, histological, phenotypic, and genotypic features. Such variety and contrast are critical to ensure this atlas’ utility in designing larger studies, or perhaps providing more cautionary interpretation of observations from cohorts enriched for specific risk factors.

At the heart of our study’s innovation was the ability to generate molecular profiles from limited amounts of dissected archival tissue specimen. Similar approaches are used to study clonal expansion in normal tissues28,29, but generally not performed in parallel for RNA and DNA. Importantly some limitations remain and not all assays were successful. The large variability in success rate was not easy to predict. Likely the age of the specimen, its size, fixation conditions and storage conditions all contribute to success variability which cannot be controlled in a retrospective investigation. Additional limitations are analytical, such as the absence of a matched source of normal DNA from every sample which can result in residual germline variants, perhaps inflating the overall mutation rate observed. The use of adjacent normal tissue can also be problematic and there is ample evidence that they also accumulate somatic mutations39. In our study, we clearly identified known breast cancer driver mutations in samples from ADH or other benign alterations. Overall, while some samples are unlikely to ever contain sufficient material for profiling or dissection of adjacent normal, as methodologies evolve and advance, the success rate and data quality will improve to make molecular pre-malignant profiling more accessible and as routine as is the case in invasive cancer.

Our report contributes to two major advances for understanding pre-malignant lesions. First, we characterized most samples across four important modalities all within a maximum of 50 µm sequentially sectioned tissue. Such advances were enabled by pre-analytical improvements allowing us to reduce the tissue requirement, to include small lesions, and to precisely match regions of interest across each modality: histology, epithelial gene expression, DNA mutations, and immune landscape. As a result, we could isolate regions with different histological features that may coexist within a specimen and more confidently establish their association with expression subtypes, clonal heterogeneity, or immune state. For example, the integration of histology and expression subtypes showed a clear correlation between cribriform architecture and the Luminal A subtype. By integrating histology, expression subtype, and immune state we showed that some immune states are found in normal areas and that there is no clear association between immune state and expression subtype. Hence, the depth and interpretability of the analysis are considerably increased by integrating all modalities at the regional level. This has been clearly the case in large cancer studies such as the TCGA, or, more recently through the integrated analysis of histological and somatic features in normal, aging tissues28,29,40. While most studies do not typically include immunohistochemical or other multiplexed spatial analysis, other important advancements in this field in the past year include spatial proteomics used to evaluate the structure of the myoepithelium in DCIS, and spatial transcriptomics used to identify the transcriptional effect of driver mutations in DCIS, representing the emerging frontier of pre-malignant tissue characterization10,41. It is therefore likely that additional spatial profiling compatible with FFPE specimens will bring additional prognostic and mechanistic insights in future DCIS studies.

The other important contribution of our study is the sub-histological analysis to compare regions of interest from the same sample and infer phylogenetic relationships between them. While we determined that the majority of the DCIS samples were classified as Normal-like and Luminal A subtype, typically considered less-aggressive subtypes in breast cancer and reflective of the known precursor stage that DCIS represents, we showed evidence for intrinsic heterogeneity in the PAM50 probabilities, either from the distribution of probabilities within a region or from physically separated regions. This is not entirely surprising as bulk expression subtypes are the result of averaging heterogeneity, similar to glioblastoma subtypes42 or IBC subtypes43 from single-cell analysis. Such heterogeneity, especially in DCIS, had been proposed before on the basis of marker staining44 and our results confirm that it may be rather common. Similarly to the frequency of heterogeneity between region subtypes, we identified evidence of genetic heterogeneity in 7/12 cases, including the presence of private putative driver mutations. This fraction may be an underestimate given the close proximity of many selected pairs. However, the majority of putative genetic drivers, copy number hallmarks and even WGD were clonal, shared by all regions investigated, including a few normal regions. This observation supports evolutionary models derived from invasive cancer, including multi-sample studies, that suggest that most driver mutations occur early followed by a phase of clonal expansion. Similar observations were also made in early multi-regional studies in DCIS44,45,46 and studies comparing synchronous DCIS-IBC cases using single-cell sequencing13, providing further evidence that breast cancer genetic evolution starts in the pre-invasive stage and possibly in normal regions. It is likely that driver alterations may even be present in adjacent histologically normal tissue as observed in field effects studies in normal ducts39,47. Such effects support an important contribution of host factors to the initial genetic injury. Hence, unlike previous attempts which were focused on histopathological features, including grade, surgical margins48,49, future DCIS prognostic models will likely need to be derived from lifetime cancer risk models like GAIL50 or BOADICEA51 and incorporate host-specific factors, such as polygenic risk scores and reproductive factors, that likely contribute to the DCIS initiation and trajectory.

The immune microenvironment of DCIS has been previously investigated, using both quantification of tumor infiltrating lymphocytes (TIL) and more specific immunohistochemical approaches and revealed clear quantitative and qualitative variation in lymphocyte infiltration, including higher TIL number and more immunosuppressive features in high-risk lesions19. Importantly, previous studies in pure DCIS did not quantify stromal and epithelial TILs separately12,19. This distinction may be hard to make in IBC, where both compartments interact at the invasive front and pathologist subjectivity can have a major impact52,53. However, this separation can be more clearly established in the analysis of DCIS and was critical in the identification of the Excluded immune state in our atlas. While the Active and Suppressed states have been observed before and could readily be identified in our data, the identification of the Excluded state required the use of an analytical method (NMF) to account for the strong correlations that can exist between TILs type and compartments. The inclusion of adjacent normal areas was also important to interpret the significance of the immune states, as the Excluded state appeared more likely in reaction to the DCIS growth and increased grade. The Excluded state exhibited features of immune evasion and could represent a more advanced level of immuno-suppression than the Suppressed state, with the consequence of a topological exclusion from the duct. The downregulation of components of the Calreticulin-Calnexin cycle in the epithelium in Excluded state could impact MHC-I export or maturation, as suggested by the lack of MHC-I expression induction in DCIS of the eExcluded state, hence providing an evasion mechanism, and contrasting with evasion mediated by MHC-I genetic loss observed in IBC54,55. It would be interesting to determine whether the immune states identified can explain the variability of response to local injection of anti-PD1 antibody in DCIS patients and whether any of the states would elicit, or prevent, the desired ductal infiltration by T-cells56.

As illustrated by our study and recent advances in the profiling of normal tissues28,29, histopathology, and molecular pathology are becoming more integrated fields, generating deeper and broader datasets at increased cellular and spatial resolution, from the most challenging human samples. Future studies of early transformation and pre-cancer biology such as the one presented here will likely benefit the most from such approaches which capture heterogeneity at scale and can help reconcile analog (optical) and digital (genomics and multiplex) observations. As a result, such multi-dimensional integration may help identify common factors mediating epithelial transformation and progression across multiple glands and organs.

## Methods

### Sample collection and preparation

FFPE blocks were obtained from UCSD or UVM Pathology Departments after surgical biopsy, excision or mastectomy. The study was reviewed and approved by each institutional review board and they granted a waiver of consent. Eligibility criteria were: (1) adult female, (2) pure DCIS diagnosis (without evidence of invasive disease), and (3) with available pathology blocks. Few cases also had bilateral disease or were matching index and recurrent lesion (ipsilateral or contralateral—Table 1 and Supplementary Table 1). Importantly there was no attempt to enrich high-risk cases or investigate specifically the role of certain candidate risk factors. Factors such as age, grade, race, ER, or Her2 status were not part of the selection criteria and the cohort was designed to reflect patients seen in a regular DCIS clinic. All specimen blocks were de-identified and sectioned sequentially for the following purpose: Hematoxylin-Eosin (H&E) staining (N = 1; 4 µM glass slide), Laser Capture Microdissection (LCM; N = 3; 7 µM glass slide coated with polyethylene naphthalate—ThermoFisher #LCM0522), multiplex or regular immunohistochemistry (N ≥ 3 4 µM glass slide) and a final H&E staining (N = 1; 4 µM glass slide). The H&E slides were scanned at high resolution and reviewed and annotated by the study pathologist. The LCM slides were stored at −20 °C in an airtight container with desiccant until ready for dissection (1 day to 3 months). H&E sections were diagnosed according to the standard of care criteria (AJCC TNM 8th ed./CAP Breast DCIS Reporting Protocol v4.3). DCIS features recorded included lesion grade: Grade I (low), Grade II (intermediate), or Grade III (high), and associated histology: e.g., papillary, cribriform, solid, comedo necrosis. DCIS lesion, normal glands (and in some cases hyperplasia) were delineated on H&E images to assist LCM. DCIS laterality and size, patient age, and menopausal status, and lesion mode of detection were obtained from the original pathology reports or from the Vermont Breast Cancer Surveillance System (UVM specimen) or local cancer registry and chart review (UCSD specimen). Hormone receptor and Her2 statuses (where available) were gathered from the patient reports and/or by de novo IHC staining. The LCM sections were thawed and stained with eosin, sections were kept in xylene and dissected within 2 h of staining. LCM was performed using the ArcturusXM Laser Capture Microdissection System (ThermoFisher). Matching regions from six adjacent sections were collected on CapSure Macro Cap (for DNA, N = 3 slides) or HS caps (for RNA, N = 3 slides), region size, and unambiguous match permitting. Post-dissection, all caps were covered and stored at −20 °C with desiccant.

### DNA extraction and QC

The membrane and adhering tissue were peeled off the caps using a razor blade and the peeled membrane was incubated in proteinase K digestion reaction overnight for 16 h at 56 °C to maximize DNA yield after cell lysis The DNA was extracted using the QIAamp DNA Micro Kit (Qiagen) and the elution was done in 20 µL. The extracted DNA was quantified by fluorometry (HS dsDNA kit Qbit—Thermofisher).

### RNA sequencing and analysis

#### Library Preparation

RNA sequencing was performed using SMART-3Seq, a 3' tagging strategy specifically designed for degraded RNA directly from FFPE LCM specimen27. LCM dissected SMART-3Seq libraries were prepared using the standard protocol for FFPE tissue on Arcturus HS LCM Cap and the individual library SPRI purification option. All FFPE LCM dissected libraries were amplified using 19 PCR cycles during indexing to minimize over-amplification of high abundance mRNAs in each library. Libraries were individually analyzed for size distribution on an Agilent 2200 TapeStation with High Sensitivity D1000 reagent kits to verify average library size of 190 bp and stored at −20 °C until sequencing. When all libraries were ready for sequencing, 1 µL of each library was then used to create two library pools used for sequencing and quantified by Qubit 2.0 Fluorometer HS DNA assay. Library pools were sequenced with a 1% PhiX spike-in control library and sequenced on an Illumina HiSeq 4000, a run type of single read 75 (SR75), and dual index sequencing.

#### Transcriptome analysis

Read count data was obtained using a dedicated analysis workflow https://github.com/danielanach/SMART-3SEQ-smk. Briefly, sequencing reads were trimmed using cutadapt 1.18, UMIs were processed using the umi_homopolymer.py script in the SMART-3SEQ tools (https://github.com/jwfoley/3SEQtools), aligned using STAR 2.6.1a, deduplicated using the dedup.py script from https://github.com/jwfoley/umi-dedup and read counts were calculated using featureCounts 1.6.357,58. Count data were then merged and filtered to remove samples with <55,000 counts and genes with <10 read counts across all samples. Filtered count data was then loaded into Seurat version 3.2.3 and processed using the SCTransform() function version 0.3.2 to regress out the high mitochondrial content variability across the samples59. Batch correction was then performed using ComBat to remove variation attributable to the sequencing center (UCSD vs UVM)60. PAM50 subtype probabilities were calculated from the SCTransform and batch normalized data using the genefu package61. Gene set enrichment analysis (GSEA) was performed as in62 and single-sample GSEA as in63. Gene sets from the REACTOME and Hallmark collections in MSigDB were used to compare the Excluded to the non-Excluded groups, a permutation test was performed to assess the significance of the GSEA results64,65. ANOVA was used to compare the ssGSEA results between the three mIHC groups. FDR of <0.1 and p-values of <0.05 were considered significant.

### Whole exome sequencing and primary analysis

#### Library preparation

DNA was sheared down to 200 base pairs (bp) using Adaptive Focused Acoustics on the Covaris E220 (Covaris Inc) following manufacturer recommendations with 10 μL Low EDTA TE buffer supplemented with 5 μL of truSHEAR buffer using a microTUBE-15. Libraries were prepared using the Accel-NGS 2 S PCR-Free DNA Library Kit (Swift Biosciences). Ligated and purified libraries were amplified using KAPA HiFi HotStart Real-time PCR 2X Master Mix (KAPA Biosystems). Samples were amplified with 5 μL of KAPA P5 and KAPA P7 primers. The reactions were denatured for 45 seconds (sec) at 98 °C and amplified 13–15 cycles for 15 sec at 98 °C, for 30 sec at 65 °C, and for 30 sec at 72 °C, followed by final extension for 1 min at 72 °C. Samples were amplified until they reached Fluorescent Standard 3, cycles being dependent on input DNA quantity and quality. PCR reactions were then purified using 1x AMPure XP bead clean-up and eluted into 20 μL of nuclease-free water. The resulting libraries were analyzed using the Agilent 4200 Tapestation (D1000 ScreenTape) and quantified by fluorescence (Qubit dsDNA HS assay).

#### Capture and sequencing

Samples were paired and combined (12 μL total) to yield a capture “pond” of at least 350 ng, and supplemented with 5 μL of SureSelect XT HS and XT Low Input Blocker Mix. The hybridization and capture were performed using the Human All Exon V7 panel (S31285117) paired with the Agilent SureSelect XT HS Target Enrichment Kit following the manufacturer’s recommendations. Post-capture amplification was performed on the beads in a 25 μL reaction: 12.5 μL of nuclease-free water, 10 μL 5x Herculase II Reaction Buffer, 1 μL Herculase II Fusion DNA Polymerase, 0.5 μL 100 millimolar (mM) dNTP Mix and 1 μL SureSelect Post-Capture Primer Mix. The reaction was denatured for 30 sec at 98 °C, then amplified for 12 cycles of 98 °C for 30 sec, 60 °C for 30 sec and 72 °C for 1 min, followed by an extension at 72 °C for 5 min and a final hold at 4 °C. Libraries were purified with a 1x AMPure XP bead clean-up and eluted into 20 μL nuclease-free water in preparation for sequencing. The resulting libraries were analyzed using the Agilent 4200 Tapestation (D1000 ScreenTape) and quantified by fluorescence (Qbit−ThermoFisher). All libraries were sequenced using the HiSeq 4000 sequencer (Illumina) for 100 cycles in Paired-End mode. Libraries with distinct indexes were pooled in equimolar amounts. The sequencing and capture pools were later deconvoluted using program bcl2fastq [19].

#### Sequencing reads processing and coverage quality control

Sequencing data were analyzed using bcbio-nextgen (v1.1.6) as a workflow manager [20]. Adapter sequences were trimmed using Atropos (v1.1.22), the trimmed reads were subsequently aligned with BWA-MEM (v0.7.17) to reference genome hg19, then PCR duplicates were removed using biobambam2 (v2.0.87)66,67,68. Additional BAM file manipulation and collection of QC metrics were performed with Picard (v2.20.4) and Samtools (v1.9)69. The summary statistics of the sequencing and coverage results are presented in Supplementary Table 9.

### Identification of somatic mutation and copy number alterations

#### Variant calling

Single nucleotide variants (SNVs) and short insertions and deletions (indels) were called with VarDictJava (v1.6.0), and Mutect2 (v2.2)70,71. Variants were required to fall within a 10 bp boundary of targeted regions that overlapped with RefSeq genes (v 109.20190905). A pool of normal DNA was created using whole exome sequencing data of blood of 18 unrelated individuals and was used to eliminate artifacts and common germline variants. Only variants called by both algorithms were considered. These variants were then subjected to an initial filtering step with default bcbio-nextgen tumor-only variant calling filters and the following parameters were used: position covered by at least five reads, mapping quality > 45, mean position in read >15, number of average read mismatches < 2.5, microsatellite length < 5, tumor log odds threshold > 10, Fisher strand bias Phred-scaled probability < 10 and VAF > 0.172. Functional effects were predicted using SnpEff (v4.3.1)73. All samples were re-evaluated for the presence of COSMIC (v91) database mutations which have been previously observed in at least 15 patients and fall within 137 known breast cancer driver genes (Supplementary Table 10)7.

#### Germline variant filtering

In absence of matched normal tissue for DCIS samples, somatic mutations were prioritized computationally using the approach from the bcbio-nextgen tumor-only configuration then additionally subjected to more stringent filtering72. Briefly, common variants (MAF > 103 or >9 individuals) present in population databases - 1000 genomes (v2.8), ExAC (v0.3), or gnomAD exome (v2.1)—were removed unless in a tier 1 gene from the cancer gene consensus and present in either COSMIC (v91) or clinvar (20190513)7,74,75,76,77. Variants were removed as likely germline if found at a variant allelic fraction (VAF) greater or equal to 0.9 in non-LOH genomic segments—as determined by CNA analysis (below). Lastly, variants were also removed as potential germline (or artifact) if found in >2 patients in the pool of normal (described above).

#### Single-sample CNA calling

CNVkit78 was used for calling somatic copy number alterations (CNA) to measure both overall CNA burden, arm and gene level CNA, and identify LOH as previously described in26. Allele-specific copy number calling algorithm, ASCAT, was used on a select number of samples for which there was sufficient coverage and the algorithm converged on a solution, in order to identify whole-genome doubling events as well as confirm CNA identified by CNVkit79. Default parameters were used with ASCAT with the exception of a segmentation penalty of 100 and a gamma of 1.

#### Multi-region CNA segmentation

To generate harmonized segmentation breakpoints between regions belonging to the same sample, multi-region segmentation was performed with the R CopyNumber (v1.26.0) package80. Outliers in CNVkit bin-level log2 copy ratios were detected and modified using Median Absolute Deviation Winsorization with the winsorize() function, segments were then called using the multipcf() function with a gamma of 40.

#### Mutational signatures

Mutational signatures were called on merged region samples using a single-sample variation of SigProfiler with default parameters to decompose into known single-base substitutions (SBS) reported in COSMIC81,82.

### Analysis of the clonal evolution and genetic heterogeneity

#### Measurement of genetic divergence

Divergence was measured on each pair of related regions, a and b, using Eq. (1):

$$Divergence_{a,b} = \mathop {\sum}\nolimits_{k = 0}^n {\left| {copy\;ratio\;a_k - copy\;ratio\;b_k} \right| \ast \left( {\frac{{bins_k}}{{total\;bins}}} \right)}$$
(1)

Where k is the copy number segment, n is the total segments, and bins is the number of bins covered by a segment from the CNVkit input file. The $$\frac{{bins_k}}{{total\;bins}}$$ term was used as a weighted correction factor for the number of bins contributing to a segment. For samples with >2 regions, the maximum divergence between any two regions was used to represent the sample.

#### CNA-based phylogenetic reconstruction

Construction of phylogenetic trees was performed similarly to the methodology outlined in83. Briefly, for each sample the log2 copy ratios from multi-sample copy number segments with at least 12 probes (see above), were translated into a matrix containing −1 for loss (log2 copy ratio < −0.6), 0 for neutral (−0.4 ≤ log2 copy ratio ≤ 0.3) and undetermined for anything else. This matrix was then used to generate Maximum Parsimony trees using phangorn using default parameters84.

#### Mutation-based phylogenetic reconstruction

To allow the analysis of clonal relationships between regions of the same sample, the coverage depth of each allele at any remaining mutated position in any region was extracted using Mutect2 joint variant caller on the sets of aligned reads from each region. In order to call a mutation either absent or present in a region, we used a Bayesian inference model specifically designed for multi-region variant calling85. Treeomics (v1.7.10) was run with the default parameters except for e = 0.02. The tree solution which matched the CNA-based reconstruction was then integrated into a single tree for Fig. 3 and S4.

### Multiplex immunohistochemistry

#### Staining

Tissue sections were prepared from formalin-fixed paraffin embedded tissue blocks and cut to 4 μm serial sections and mounted on Superfrost Plus (VWR). The procedure for multiplex immunohistochemistry (mIHC) was followed by a manufacturer’s protocol for Opal7-color automation IHC kit (Akoya Bioscience), and the staining was performed with Autostainer DISCOVERY ULTRA (Ventana). Antibodies used in mIHC are anti-CD3 (clone 2GV6, Ventana), anti-CD20 (clone L26, Ventana), anti-Ki67 (clone 30-9, Ventana), anti-FOXP3 (clone SP97, Spring), anti-pan cytokeratin (CK; clone AE1/AE3, DAKO), anti-CD117 (clone c-kit, DAKO). The molecular markers of immune panel (CD3, CD20, Ki67, CKs, FOXP3, and CD117) were visualized with Opal520, Opal540, Opal570, Opal620, Opal650, and Opal690, respectively. DAPI counterstaining was performed with Discovery QD DAPI (Roche). ProLong Diamond Antifade Mounting (ThermoScientific) was used for mounting the coverslip. Detailed staining conditions and autostainer’s protocols are reported in our recent report86.

#### Visualization and analysis

Tissue samples stained with mIHC were scanned with multispectral imaging microscopy (Vectra 3, Akoya Bioscience). Scanned multispectral images were unmixed on inForm software (ver.2.4.0, Akoya Bioscience) to acquire the fluorescence signal from each marker86. Imaging analysis was performed on inForm software by identifying tumor (CK + area) and stroma (CK- area proximal to the epithelium), each nucleated cell, and its cell type. Alternatively, QuPath software 2.3.187 was also used to perform similar imaging analysis on unmixed images converted to multi-layered TIFF format by inForm software86. The images of the regions of the same type (DCIS or normal) from the same case, were typically stitched together and stored and shared as one single larger multi-layered TIFF image (data availability below). Scanned image areas were aggregated into up to three histological regions per sample: main pre-invasive lesion, alternate pre-invasive lesion, normal epithelium. In each region, the stromal and epithelial densities of each cell type and state were calculated, including when cells were not present (density = 0). Regularized marker densities into distribution deciles were then used to classify samples using non-negative matrix factorization (Supplementary Table 7c). The immune states were assigned and named after the hierarchical clustering of the H-matrix (meta-marker values).

### MHC1 immunostaining and analysis

Four-micron sections were baked at 60 ° for 1 h, followed by deparaffinization through three successive changes of xylene. Tissue was then rehydrated in decreasing grades of alcohol, with two changes of 100%, 95%, and then 70% EtOH, followed by diH20. Antigen retrieval was performed using Antigen Unmasking Solution Citrate Based pH6, H-3300 (Vector) at 95 °C for 30 min. Staining was performed using the intelliPATH Automated IHC stainer (Biocare). Endogenous peroxidase was blocked using Bloxall blocking solution, SP-6000 (Vector) for 10 min, followed by two washes in TBST. Afterward, tissue was blocked with a 3% Donkey Serum for 10 min, followed by blocking with Anti-HLA Class I ABC Primary Mouse Antibody, ab70328m (Abcam) at 1:1000 for 1 h and subsequently washed twice with TBST. Tissues were then blocked with Anti-Mouse HRP UltraPolymer IgG, 2MH-100 (Cell IDx) for 30 min, and washed twice with TBST. The reaction was then developed with 3,3'-Diaminobenzidine Chromogen, 95041-478 (VWR) for 5 min, and then stopped with two washes in diH20. Counterstaining was performed with Mayer’s Hematoxylin Solution, 51275 (Sigma) for 5 min. Lastly tissues were washed twice in TBST, and once in diH20, dehydrated in increasing grades of EtOH, then cleared and mounted with xylene based mountant. MHC1 expression was scored from 0 to 3 separately for DCIS and normal epithelium throughout the entire section, away from possible biopsied areas. The scores were established as follows: 0: no staining or weak staining in <50% of cells; 1: weak staining in >50% of cells; 2: intermediate staining in >50% of cells; 3: strong staining in >50% of cells.

### Whole slide image digital analysis

High resolution whole slide images of H&E stains were loaded into a QuPath (v2.3) project87. One analysis area was defined for each specimen, avoiding the location of biopsies as well as dust or marked areas. The analysis areas were segmented into superpixels (sigma = 5 µm, spacing = 50 µm, maxIterations = 10, regularization = 0.25) and each superpixel was annotated with both Hematoxylin and Eosin Intensity features (size = 2 µm, tile size = 25 µm). The mean, median, min, max, and standard deviation values were then smoothed (Haralick distance = 1, Haralick bins = 32). Multiple training areas were annotated from each of the following classes: adipose, stroma, inflammation, epithelium (normal and atypical), void, necrosis, blood vessels. Multiple areas across 2–4 samples were used to train a Random Tree classifier. The classifier was then applied to all superpixels included in the analysis area. The accuracy of the classifier was assessed both visually and with multiple test areas for each class. Superpixels of the same class were merged into single annotations and the resulting areas were recorded. Separate classifiers were used for images from different institutions, to mitigate possible variation staining, scanning, or image format. The fraction of adipose area was compared to breast density using Mann–Whitney test comparing dense & heterogeneously dense breast to other lower densities, or comparing solid DCIS to non-solid DCIS lesions.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.