Introduction

The management paradigm for early breast cancer (EBC) is shifting away from adjuvant treatment towards neoadjuvant strategies as a standard of care for patients with more aggressive subtypes1. Originally used for downstaging locally advanced tumors, neoadjuvant systemic therapy (NST) has established benefits in improving surgical outcomes by increasing operability, facilitating breast-conserving surgery, and reducing the extent of lymph node resection2,3. Furthermore, NST enables in vivo chemosensitivity testing to inform prognosis and guide subsequent treatment decisions based on individual response4. The overall prognosis in patients who attain pathologic complete response (pCR) is exceptional5, and the highest pCR rates are observed in patients with triple-negative breast cancer (TNBC) and human epidermal growth factor receptor-2 (HER2)-positive subtypes, in whom the association between pCR and improved outcomes is also strongest5,6. Meanwhile, for patients not achieving pCR, a new model of post-neoadjuvant treatment escalation is emerging7, with recent trials demonstrating benefit in this setting for adjuvant capecitabine for TNBC and trastuzumab emtansine for HER2-positive disease8,9.

Despite the positive prognostic impact of invasive disease eradication, a significant minority of patients with pCR following NST ultimately relapse10. In a large analysis conducted by the German Breast Group on a database which included 2188 patients with pCR from five neoadjuvant trials, the rate of disease-free survival (DFS) at 5 years was reported to be 87%11. Similarly, a recent meta-analysis reported a 5-year event-free survival (EFS) of 88% among 5748 pCR-attaining patients from clinical trials or cohort studies5. In that analysis, 5-year EFS in patients with pCR was lower in TNBC (90%) or HER2-positive subtypes (86%) than in hormone receptor (HR)-positive disease (97%). Relapse risk in patients with pCR has been associated with several clinicopathologic and demographic variables, including more advanced primary tumor or nodal stage, HER2-positivity, younger age, and premenopausal status10,11,12,13,14. However, it is not currently possible to reliably predict post-pCR recurrence. Prospective identification of patients remaining at high risk despite pCR could enable targeted use of intensified post-neoadjuvant treatment and/or more stringent monitoring approaches.

It is clear that the molecular and immunologic characteristics of the breast tumor and its microenvironment are important determinants of both treatment response and prognosis7. In the neoadjuvant setting, pCR attainment is predicted by the degree of tumor lymphocyte infiltration15, as well as tumor transcriptomic features including intrinsic subtype and expression of genes involved in proliferation, immune regulation, and cell signaling7,16,17,18. However, there are only limited data regarding the impact of such parameters on the risk of relapse in the context of pCR19.

To identify transcriptomic changes associated with post-pCR recurrence, we compared the expression of an extensive panel of genes and gene signatures in matched primary and recurrent tumors from patients who relapsed after their primary tumors had achieved pCR, identified from our institutional database of >4500 breast cancer patients (intraindividual comparison). In addition, we assessed differential gene expression between the primary tumors from these patients and the primary tumors from matched controls with pCR who did not relapse (interindividual comparison).

Results

Patient population

From a total of 4616 primary breast cancer patients in our database, we identified 1450 EBC patients who had received NST prior to surgery (Fig. 1). Across all breast cancer subtypes, the tumors in approximately half of these patients (n = 672, 46.3%) achieved pCR. The rate of relapse among patients with pCR was 9.7% (n/N = 65/672). After further shortlisting of patients as described in the “Methods” section, tumor samples from a total of 14 patients (primary and recurrent tumor) and 41 matched controls (primary tumor) were sent to NanoString Technologies Germany GmbH (Hamburg, Germany) for transcriptomic analysis using the BC360 panel.

Fig. 1: Flowchart identifying patient and control selection.

*Two primaries in the same breast, one relapsed while the other showed continued pCR. **A total of 42 controls were identified but immunohistochemical analysis failed in one control. Abbreviations: EBC early breast cancer, KEM Kliniken Essen-Mitte, NST neoadjuvant systemic therapy, pCR pathologic complete response.

Patient and control characteristics are described in Table 1. In both groups, the majority were node-positive and presented with grade 3, stage T1–T2 tumors at the time of diagnosis. According to immunohistochemical analysis, patients' tumors were classified as HR-positive/HER2-negative (n = 4, 28.6%), HER2-positive (n = 5, 35.7%), or TNBC (n = 5, 35.7%) (Table 1). In the two patients whose tumors had gBRCA1 mt status, the primary tumors were estrogen receptor (ER)-low positive (<10%)/progesterone receptor (PR)-negative in one case and ER-negative/PR-positive in the other.

Table 1 Patient characteristics.

Median time from diagnosis to any relapse was 23.5 months (range: 9.0–75.0). The most common sites of recurrence were lymph nodes (regional and distant; n = 7; 50.0%), breast (n = 6; 42.9%), brain (n = 4, 28.6%), liver (n = 4, 28.6%), and lung (n = 4, 28.6%). A total of eight patients (57.1%) relapsed with distant metastases.

Since only 2 (instead of 3, as was the case for others) matched controls were available for one of the patients with relapse (who had TNBC), the control cohort comprised tumors from 41 patients with pCR and no relapse. Matched controls for the two relapsed patients whose tumors were gBRCA1 mt included four patients with tumors that had TNBC/gBRCA1 wt status, due to limited availability of non-relapsed gBRCA1 mt controls and the clinical similarity between TNBC and gBRCA mt.

Gene expression analysis

A total of 69 RNA samples were analyzed: 14 primary and 14 recurrent tumors from patients with post-pCR relapse, and 41 primary tumors from controls without relapse. Gene expression analysis failed to meet quality control criteria for one sample (a primary tumor from a control matched to a patient with TNBC/gBRCA1 wt tumor and local recurrence only), providing available data for 68 tumor samples (98.6%). Subgroup assessments for patients with available gene expression data are indicated in Table 1.

Intrinsic subtype analyses

According to PAM50 analysis of the BC360 panel, the intrinsic subtype of patients’ tumors (N = 14) was luminal B in one (7.1%) patient, HER2-enriched in four (28.6%) patients, and basal-like in nine (64.3%) patients. Intrinsic subtype differed between primary and recurrent tumors in four (28.6%) patients (Fig. 2a). Of those, two had TNBC that converted from basal-like primaries to luminal A recurrent tumors. One patient with HER2-positive disease had a HER2-enriched primary tumor and a basal-like recurrence, while one patient with a HER2-positive tumor had a luminal B primary and a HER2-enriched recurrence. The intrinsic subtype was basal-like for both the primary and the recurrent tumors in the two patients who were gBRCA1 mt, indicating that tumors harboring gBRCA mutations behave fundamentally similarly to TNBC despite some ER or PR expression. In contrast, comparison of the immunohistochemical analysis of the primary tumors with that of their corresponding recurrent tumors showed an entirely different pattern (Fig. 2b). In this case, while all patients with HER2-positive primary tumors presented with HER2-positive recurrences, a substantial proportion of patients with HR-positive primary tumors relapsed with TNBC tumors.

Fig. 2: Comparison of intrinsic subtype and immunohistochemistry between patients and controls.

Shift in intrinsic subtype (a) and immunohistochemical classification (b) between primary tumor and relapse in patients, and comparison of intrinsic subtype of primary tumors between patients and matched controls (c). For interindividual analysis, patients and controls were matched by immunohistochemistry. *Also gBRCA1 mt. Abbreviations: HER2 human epidermal growth factor receptor 2, HER2E HER2-enriched, HR hormone receptor, IHC immunohistochemistry, LumA luminal A, LumB luminal B, NST neoadjuvant systemic therapy, TNBC triple-negative breast cancer.

In controls with pCR and no relapse, intrinsic subtype of tumors (N = 40) was luminal A in two patients (5.0%), luminal B in four patients (10.0%), HER2-enriched in 14 patients (35.0%) and basal-like in 20 patients (50.0%) (Fig. 2c). As matching was done according to immunohistochemistry (IHC) and nodal status, primary tumors from patients and their matched controls had the same IHC. Primary tumor intrinsic subtype in controls was the same as that of their matched patient in 28/40 cases (70.0%). This disparity can be attributed to the fact that patients were matched to controls using classical IHC analyses, before intrinsic subtyping with the BC360 panel was performed.

Interindividual comparison

Primary tumor expression of major histocompatibility complex-class II (MHC-II) molecules was significantly lower in the tumors of patients who later relapsed despite pCR compared with controls (Fig. 3a). This difference appeared more pronounced in tumors from patients with distant relapse (Fig. 3b). Patients with distant relapse also showed a trend for decreased interferon gamma (IFNγ) signaling (logFC = −0.759; 95% confidence interval (CI): −1.534 to 0.172; P = 0.055) and significantly higher homologous recombination deficiency (HRD) signature expression in the primary tumor versus controls (Fig. 3c; Supplementary Table S1).

Fig. 3: Interindividual comparison of primary tumor gene expression between patients with relapse despite pCR and controls with pCR and no relapse.

Data are shown for MHC-II (a, b) and HRD (c) signature expression for the overall cohort (a) and for the distant relapse subgroup (b, c). The central line indicates the median, the box indicates the 25th and 75th percentiles and the whiskers indicate the range. Circles represent individual data points. Negative values for logFC indicate lower expression in patients with relapse versus controls. Abbreviations: CI confidence interval, HRD homologous recombination deficiency; logFC log2-fold change, MHC-II major histocompatibility complex-class II.

Several other differences were observed between primary tumors from patients with any relapse despite pCR and controls in subgroup analyses. In patients with HER2-positive tumors, proliferation score was significantly higher versus controls (Fig. 4a). In patients with TNBC and relapse, tumor expression of the endothelial cell signature, mammary stemness signature, and PR gene was significantly greater than in controls (Fig. 4b–d). P values in interindividual analyses were not significant after false discovery rate (FDR) adjustment (Supplementary Tables S1 and S2).

Fig. 4: Interindividual comparison of primary tumor gene expression between patients and matched controls with HER2-positive disease or TNBC.

For the HER2 subgroup, data are shown for proliferation (a) while for the TNBC subgroup, data are shown for endothelial cell activation (b), mammary stemness (c), and progesterone receptor (d). The central line indicates the median, the box indicates the 25th and 75th percentiles and the whiskers indicate the range. Circles represent individual data points. Negative values for logFC indicate lower expression in patients with relapse versus controls. Abbreviations: CI confidence interval, HER2 human epidermal growth factor receptor 2, logFC log2-fold change, PR progesterone receptor, TNBC triple-negative breast cancer.

Intraindividual comparison

Compared with primary tumors, post-pCR recurrences had significant downregulation of ER signaling signature expression (Fig. 5a). This effect was also seen when the analysis was restricted to distant recurrences (Fig. 5b), which additionally showed lower expression of genes or signatures for apoptosis, CD8+ T-cells, IFNγ signaling, stromal cells, T-cell immunoreceptor with immunoglobulin and immunoreceptor tyrosine-based inhibition motif domains (TIGIT), and regulatory T cells (Treg), relative to the primary tumor (Fig. 5c–h; Supplementary Table S3).

Fig. 5: Intraindividual comparison of gene expression between matched primary tumors and post-pCR recurrences.

Data are shown for ER signaling (a, b), apoptosis (c), CD8+ T cells (d), IFNγ signaling (e), stroma (f), TIGIT (g), and Treg (h) signature expression for the overall cohort (a) and for the distant relapse subgroup (b–h). The central line indicates the median, the box indicates the 25th and 75th percentiles and the whiskers indicate the range. Circles represent individual data points. Negative values for logFC indicate lower expression in recurrent versus primary tumors. Abbreviations: CI confidence interval, ER estrogen receptor, IFNγ interferon gamma, logFC log2-fold change, TIGIT T-cell immunoreceptor with Ig and ITIM domains, Treg regulatory T cells.

Significant differences were also observed in subgroup analyses conducted irrespective of the site(s) of recurrence according to the pathologic classification of the primary tumor. ESR1 was downregulated in recurrent tumors in the HER2-positive subgroup (Fig. 6a), along with decreases in genes coding the phosphatase and tensin homolog, a tumor suppressor (Fig. 6b) and transforming growth factor-β (TGFβ) (Fig. 6c), a multifunctional cytokine. In the TNBC subgroup, expression of both TGFβ and the stromal cell signature were downregulated in the recurrent tumor (Fig. 6d, e). P values in intraindividual analyses were not significant after FDR adjustment (Supplementary Tables S3 and S4).

Fig. 6: Intraindividual comparison of tumor gene expression between matched primary tumors and post-pCR recurrences that were HER2-positive or TNBC.

For the HER2 subgroup, data are shown for estrogen signaling (a), PTEN (b), and TGFβ (c) while for the TNBC subgroup, data are shown for stromal signaling (d), and TGFβ (e). The central line indicates the median, the box indicates the 25th and 75th percentiles and the whiskers indicate the range. Circles represent individual data points. Negative values for logFC indicate lower expression in recurrent versus primary tumors. Abbreviations: CI confidence interval, ESR1 estrogen receptor 1, logFC log2-fold change, PTEN phosphatase and tensin homolog, TGFβ transforming growth factor beta, TNBC triple-negative breast cancer.

Discussion

Despite the strongly favorable prognosis following pCR, there is an unmet need for biomarkers to identify the substantial minority of EBC patients who remain at risk of relapse. To our knowledge, ours is the first study investigating the association of gene expression with both tumor evolution at relapse (intraindividual analysis) and relapse risk (interindividual analysis). Starting from our institute’s database of >4500 patients with primary breast cancer, in which approximately 10% of patients with pCR after NST subsequently relapsed, our transcriptomic analysis was able to detect differential expression of several key pathways mediating tumor biology and progression, such as antitumor immunity, DNA damage response, stromal factors, hormonal signaling, and tumor regulation.

The interaction between the breast tumor and the immune system plays a key role in shaping the course and outcome of the disease20. MHC-II is an important driver of immune activation which further leads to an increase in tumor response21,22,23,24. Various studies have identified a correlation between lowered MHC-II expression and reduced tumor lymphocyte infiltration, greater lymphovascular invasion, and poor outcome in patients with TNBC treated with or without adjuvant chemotherapy23,24,25. Attenuated MHC-II expression is expected to limit activation of CD4+ T cells, which mediate anticancer immunity by facilitating CD8+ T cell activation, secreting effector cytokines and direct cell-killing26, and also regulate metastasis via effects on the vasculature27. Although predominantly expressed by professional antigen-presenting cells, MHC-II can be induced in breast cancer cells by IFNγ released by activated T cells into the microenvironment28. Our interindividual analyses demonstrated a lower expression of MHC-II in the primary tumors of patients with post-pCR relapse in comparison with those from their matched controls (Fig. 3a, b). Furthermore, the reduced MHC-II expression was paralleled by a trend for lower expression of the IFNγ signaling signature, albeit only in the distant relapse subgroup. Interestingly, tumors may adapt to suppress IFNγ-mediated induction of MHC-II by activation of the Ras-mitogen-activated-protein-kinase (RAS/MAPK) pathway, which is implicated in immune evasion of residual TNBC persisting after NST29.

Another signal of interest from our interindividual analyses is the substantially higher expression of the HRD signature (from the 143 DNA damage repair genes in the BC360 panel) in primary tumors from patients with post-pCR relapse than in those from the controls (Fig. 3c). The role of HRD, especially in TNBC, has been widely explored in recent years, from both a therapeutic and a prognostic angle. While the former has led to the development of a number of poly(adenosine diphosphate-ribose) polymerase (PARP) inhibitors for clinical use in metastatic and advanced breast cancer30, the importance of HRD assessment for predicting treatment response and attainment of pCR is still being elucidated. Although it is well established that the homologous recombination DNA repair pathway is mutated in a large proportion of TNBCs, Timms et al. were the first to postulate that other breast cancer subtypes could also present defects in this pathway31 and hence that a metric of HRD could be utilized for clinical assessment of treatment outcomes. A recent comparison of BRCA-positive tumors with BRCA-negative tumors also showed a higher HRD score in the former32. From a translational perspective, HRD status has also been shown to be predictive of achievement of pCR in patients with TNBC/BRCA-positive breast cancer33 as well as those with HR-positive breast cancer34.

Our interindividual analysis demonstrated higher primary tumor expression of endothelial cell and mammary stemness signatures than controls in patients with TNBC who subsequently relapsed (Fig. 4b, c). Detected by immunohistochemistry, both endothelial cells (reflected in microvessel density) and cancer stem cells are most abundant in TNBC and are poor prognostic factors35,36,37. Cancer stem cells, in particular, are critical determinants of metastatic dissemination and treatment resistance, as underscored by their enrichment in occult metastatic lesions and residual disease post-therapy38. In HER2-positive tumors we saw an association of higher pretherapeutic proliferation index with post-pCR recurrence. Greater baseline tumor proliferation, as reflected in Ki-67 positivity, has previously been associated with higher pCR rates across breast cancer subtypes, but its prognostic significance in the neoadjuvant setting may be dependent not only on subtype but also on pCR status39.

We observed significant downregulation of ER signaling signature in post-pCR recurrences versus paired primary tumors (Fig. 5a, b), with a reduction in the expression of ESR1 in HER2-positive subgroups (Fig. 6a). Loss or mitigation of ER signaling is a known mechanism of endocrine resistance occurring in around one-fifth of initially ER-positive tumors and may be accompanied by the upregulation of alternative growth-stimulating pathways40. We also noted a paradoxical association between tumor PR signature and relapse in patients who had TNBC and therefore negative PR expression by immunohistochemistry. This finding likely reflects differences in PR expression at the mRNA and protein level41, which could be introduced by translational regulation, for example, by microRNAs42.

Taken together, our analyses have two main clinical implications. Firstly, our results suggest that post-pCR recurrence is associated with a change in the tumor immune microenvironment from an anti-tumorigenic to a pro-tumorigenic phenotype. Indeed, through interindividual comparison we have shown lower expression of MHC-II, a known driver of anti-tumorigenic immune activity, in the primary tumors of patients who subsequently developed recurrent tumors. Furthermore, through intraindividual comparison we have shown a downregulation of key anti-tumorigenic effectors of the immune system such as CD8+ T cells (Fig. 5d), IFNγ (Fig. 5e), TIGIT (Fig. 5g), and Treg cells (Fig. 5h) in recurrent tumors. A number of recent studies postulate that such alterations in the tumor immune microenvironment occur in response to NST and are likely to have predictive value for treatment outcomes43,44,45. Secondly, our findings suggest that the immunologically quiescent primary tumors which are at risk of post-pCR relapse could benefit from the early use of immune-oncologic agents either alone or in combination with NST. The improvement in pCR rate in early TNBC from the addition of immune checkpoint inhibitors to NST, as seen in landmark clinical trials such as IMpassion03146, Keynote 52247, and I-SPY248, supports this suggestion. Consequently, it would be interesting to conduct analyses similar to the one we present here on the datasets of these trials to assess whether early use of immunotherapy is effective in patients with immunologically quiescent tumors who achieve pCR. One open question is to what extent the risk factors for relapse after pCR might differ from more general risk factors for recurrence. Many of the variables associated with relapse in the present study have shown negative prognostic significance in wider populations.
A previously reported analysis conducted by the German Breast Group evaluated the impact of classical clinical parameters on relapse and concluded that initial tumor size and nodal status were the only prognostic factors associated with long-term survival11. However, it is unclear whether these results are comparable to our findings since the vast majority of patients in our analysis were clinically designated as cT1/2 and clinically node negative, thereby representing relatively early stage breast cancer. In addition, since post-pCR recurrence implies a discordant response to NST, mechanisms that increase mutational load and clonal heterogeneity, such as HRD, may be especially expected to drive relapse in this setting.

Our analyses have certain limitations. Due to their nature, we had to retrospectively identify eligible patients and controls from our database and hence had no control over the sample size. Consequently, our subgroup analyses for different breast cancer subtypes were underpowered. Moreover, given the relatively short time to recurrence, we could not gather certain types such as luminal A breast cancers which are known to relapse much later. In addition, our analysis was based on data from a single center, and this could affect generalizability of our findings. Finally, statistical significance did not persist after FDR adjustment.

A key strength of our study was the requirement for time to distant relapse of >12 months, which enabled us to exclude patients with occult primary metastatic disease, who represent a distinct population. In addition, given our institutional practice of collecting tumor tissue for biopsy from the primary as well as relapsed tumor, we could perform paired intraindividual analyses, an advantage not always available to large national and international study groups. We also included both local recurrences and distant metastases since invasive DFS is one of the prime endpoints in breast cancer clinical trials. Taken together, our unique approach of intra- and interindividual analyses allowed us to identify significant differences in several gene expression variables which may drive relapse after pCR attainment.

In conclusion, we have identified several transcriptomic correlates of relapse despite pCR and changes in tumor gene expression associated with recurrences occurring after pCR attainment, which warrant further investigation in prospective trials. Even in patients whose tumors attain a pCR, insufficiently activated immunogenic pathways may play a key role in relapse. If validated, the identified risk factors and mechanisms of recurrence could inform the design of novel post-neoadjuvant strategies for patients who are at risk of recurrence despite their tumors achieving pCR8,9. For example, tumors with HRD may be sensitive to platinum agents and PARP inhibitors49,50. Moreover, the role of immune oncology in the neoadjuvant setting is still under investigation, and genomic analysis of patients after NST comprising immuno-oncological agents may in future help to identify the patients who benefit most from tailored therapy strategies. Conversely, downregulation of immune markers in post-pCR recurrences suggests that immunotherapies may be better deployed earlier in the course of disease51. Given the absence of tumor in surgical specimens at pCR, liquid biopsy methods for quantifying residual disease burden and tracking tumor evolution after NST hold particular promise for the future personalization of therapy in this setting52.

Patients and methods

Study design and patients

This study involved a retrospective analysis of female patients diagnosed with early or locally advanced breast cancer at a single center (Kliniken Essen-Mitte [KEM]) in Essen, Germany between September 2011 and January 2020.

To identify appropriate study patients and controls, we evaluated the records of breast cancer patients in KEM’s database for the aforementioned period (Fig. 1). Data eligible for analysis came from patients who had: received standard of care NST for a minimum of 12 weeks followed by requisite surgery to the breast and the axillary lymph nodes; subsequently attained pCR, defined as absence of invasive cancer in the breast and axilla (i.e., ypT0/is and ypN0); received appropriate treatment and follow-up on relapse; and had biopsy samples available from the primary cancer (acquired prior to initiating NST) as well as from the site of recurrence. Certain patients in whom further treatment was deemed not necessary by the treating physician, e.g., those with TNBC achieving pCR following mastectomy, were also considered eligible for inclusion in our analysis. To exclude occult primary metastatic disease, we excluded patients in whom the time to distant relapse was <12 months.

Eligible controls were patients in our database who were relapse-free following pCR for a period comparable to that of the corresponding study patient and who matched that patient for receptor status (HR, HER2), nodal status, or gBRCA status using baseline immunohistochemical analysis. For patients with gBRCA1 mutation (mt) status, if no matching gBRCA1 mt control was available, the patient was instead matched to a control with TNBC and gBRCA1 wildtype (wt) status. We aimed to identify three controls for every patient who fulfilled the aforementioned criteria.
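The matching rules described above can be illustrated with a minimal sketch. It is not the procedure actually used for this study, which was carried out on clinical records; the `Case` fields, the function names, and the interpretation of "comparable period" as relapse-free follow-up at least as long as the patient's time to relapse are all assumptions made for illustration. The sketch also assumes that gBRCA1 mt controls are taken first and that any remaining slots are filled with TNBC/gBRCA1 wt controls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """Hypothetical record for one patient or control (field names are assumptions)."""
    case_id: str
    hr_positive: bool
    her2_positive: bool
    node_positive: bool
    gbrca1_mutated: bool
    months_followed: float  # time to relapse (patient) or relapse-free follow-up (control)


def is_tnbc(case):
    """Triple-negative by IHC: neither HR-positive nor HER2-positive."""
    return not case.hr_positive and not case.her2_positive


def select_controls(patient, pool, n_controls=3):
    """Return up to n_controls relapse-free controls matched to one relapsed patient."""
    # Assumption: "comparable period" means follow-up at least as long as the
    # patient's time to relapse, and nodal status must always agree.
    candidates = [
        c for c in pool
        if c.months_followed >= patient.months_followed
        and c.node_positive == patient.node_positive
    ]
    if patient.gbrca1_mutated:
        # Prefer gBRCA1 mt controls; fill remaining slots with TNBC/gBRCA1 wt controls.
        preferred = [c for c in candidates if c.gbrca1_mutated]
        fallback = [c for c in candidates if is_tnbc(c) and not c.gbrca1_mutated]
        chosen = preferred + fallback
    else:
        chosen = [
            c for c in candidates
            if c.hr_positive == patient.hr_positive
            and c.her2_positive == patient.her2_positive
            and not c.gbrca1_mutated
        ]
    return chosen[:n_controls]
```

When fewer than three candidates satisfy the criteria, the function simply returns the shorter list, mirroring the one patient in this study for whom only two matched controls were available.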

Ethics committee approval was obtained from the institutional ethics committee (Ethik-Kommission der Ärztekammer Nordrhein, Düsseldorf, Germany) and patients had provided written informed consent previously.

Sample processing and gene expression analysis

Biopsy specimens of patients (primary as well as recurrent tumors) and controls (primary tumors), previously preserved by fixing in formalin and embedding in paraffin according to KEM’s standard protocol, were retrieved and sent to NanoString Technologies Germany GmbH (Hamburg, Germany) for analysis using the Breast Cancer 360 (BC360) panel on the multiplexed digital nCounter® platform (NanoString Technologies Inc., Seattle, WA, USA).

Briefly, RNA was isolated from biopsy specimens and, when necessary, amplified using NanoString’s in-house standardized protocol, followed by hybridization with the BC360 panel and signal readout on the nCounter® System.

The BC360 panel consists of 758 genes of interest for breast tumor biology, including the 50-gene prediction analysis of microarray (PAM50) set53,54, and 18 housekeeping control genes55. Gene counts were normalized to housekeeping gene expression, as well as either a panel standard (for non-PAM50 genes) or a reference sample (for PAM50 genes). Normalized gene expression data were log2-transformed and used to derive expression scores for the 42 genes and gene signatures that are a preselected focus of the panel and which reflect tumor biology, the immune response and abundance of different cell populations in the microenvironment. The PAM50 gene set enabled determination of the intrinsic subtype (i.e., luminal A, luminal B, HER2-enriched or basal-like) and risk of recurrence score according to published methods53,54.
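The general idea of housekeeping normalization followed by log2 transformation can be sketched as follows. This is a simplified illustration only: the BC360 signature algorithms and the panel-standard and reference-sample calibration steps are not reproduced here, the gene names are placeholders, and the use of a pseudocount and of an unweighted mean as the signature score are assumptions.

```python
import math


def log2_normalize(counts, housekeeping_genes, pseudocount=1.0):
    """Log2-transform raw counts and centre them on the housekeeping genes.

    Subtracting the mean log2 housekeeping count is equivalent to dividing the
    raw counts by the geometric mean of the housekeeping counts.
    """
    log_counts = {gene: math.log2(count + pseudocount) for gene, count in counts.items()}
    hk_mean = sum(log_counts[g] for g in housekeeping_genes) / len(housekeeping_genes)
    return {
        gene: value - hk_mean
        for gene, value in log_counts.items()
        if gene not in housekeeping_genes
    }


def signature_score(normalized, member_genes):
    """Unweighted mean of normalized log2 expression over a signature's genes (assumption)."""
    return sum(normalized[g] for g in member_genes) / len(member_genes)


# Placeholder counts for one sample; the gene names are hypothetical.
sample = {"HK1": 255, "HK2": 1023, "GENE_A": 4095, "GENE_B": 63}
normalized = log2_normalize(sample, housekeeping_genes=["HK1", "HK2"])
score = signature_score(normalized, ["GENE_A", "GENE_B"])
```

Working on the log2 scale means that differences between two samples can be read directly as log2-fold changes.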

Statistics

Since our analysis entailed identifying eligible patients and controls from our institutional database, no formal sample size calculation was performed. The decision to identify three controls to every patient for interindividual analysis was also arbitrary although we attempted to stringently match patients and controls according to their immunohistochemical profile.

Differential gene expression was expressed as log2-fold change (logFC) with its associated 95% CI. Statistical significance was assessed using Student’s t-test (paired for intraindividual comparison and unpaired for interindividual comparison). P values were corrected for multiplicity using Benjamini–Yekutieli false-discovery rate (FDR) adjustment56. Subgroup analyses were performed to analyze differences between any and distant relapse; between tumors categorized by immunohistochemistry (HR-positive/HER2-negative; HER2-positive [irrespective of HR status]; TNBC); and by gBRCA1 status. For subgroups in interindividual analyses, controls were included based on the characteristics of their matched patient.
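As an illustration of these quantities, the following sketch computes logFC from log2-scale expression scores and applies the Benjamini–Yekutieli step-up adjustment to a list of P values. It is a generic re-implementation for explanatory purposes, not the SPSS or Prism workflow used in the study; the function names are assumptions, and the t-test statistic and CI are omitted because they require the t distribution, which is not part of the Python standard library.

```python
from statistics import fmean


def paired_log_fc(primary, recurrent):
    """Mean within-patient difference (recurrent minus primary) of log2 scores."""
    return fmean(r - p for p, r in zip(primary, recurrent))


def unpaired_log_fc(patients, controls):
    """Difference in mean log2 score (patients minus controls)."""
    return fmean(patients) - fmean(controls)


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted P values in the original order."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic-sum correction that makes the procedure valid under dependence.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Because the harmonic-sum factor grows with the number of tests, the Benjamini–Yekutieli adjustment is more conservative than the Benjamini–Hochberg procedure, which is one reason nominally significant P values may lose significance after adjustment when many signatures are tested in a small cohort.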

Analyses were performed using SPSS Statistics Version 23.0 (IBM Corporation, Armonk, NY, USA) and Prism Version 9.0 (GraphPad Software Inc., San Diego, CA, USA).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.