Cyclin E expression is associated with high levels of replication stress in triple-negative breast cancer

Replication stress entails the improper progression of DNA replication. In cancer cells, including breast cancer cells, an important cause of replication stress is oncogene activation. Importantly, tumors with high levels of replication stress may have different clinical behavior, and high levels of replication stress appear to be a vulnerability of cancer cells, which may be therapeutically targeted by novel molecularly targeted agents. Unfortunately, data on replication stress are largely based on experimental models. Further investigation of replication stress in clinical samples is required to optimally implement novel therapeutics. To uncover the relation between oncogene expression, replication stress, and clinical features of breast cancer subgroups, we immunohistochemically analyzed the expression of a panel of oncogenes (Cyclin E, c-Myc, and Cdc25A) and markers of replication stress (phospho-Ser33-RPA32 and γ-H2AX) in breast tumor tissues prior to treatment (n = 384). Triple-negative breast cancers (TNBCs) exhibited the highest levels of phospho-Ser33-RPA32 (P < 0.001 for all tests) and γ-H2AX (P < 0.05 for all tests). Moreover, expression levels of Cyclin E (P < 0.001 for all tests) and c-Myc (P < 0.001 for all tests) were highest in TNBCs. Expression of Cyclin E positively correlated with phospho-RPA32 (Spearman correlation r = 0.37, P < 0.001) and γ-H2AX (Spearman correlation r = 0.63, P < 0.001). Combined, these data indicate that, among breast cancers, replication stress is predominantly observed in TNBCs, and is associated with expression levels of Cyclin E. These results indicate that Cyclin E overexpression may be used as a biomarker for patient selection in the clinical evaluation of drugs that target the DNA replication stress response.


INTRODUCTION
Breast cancers are the most frequently diagnosed neoplasms worldwide, with approximately 1.38 million women being diagnosed with breast cancer every year. One-third of these women subsequently die of this disease, accounting for 14% of all cancer-related deaths in women 1 . Therefore, there is an urgent clinical need for improved breast cancer treatment.
Breast cancers are very heterogeneous, and multiple classification methods have been developed to stratify patient groups. Using gene expression profiling, at least six major breast cancer subgroups have been defined, including "normal-like", "luminal A", "luminal B", "HER2-enriched", "claudin-low", and "basal-like" 2 . Furthermore, combining copy number variations with gene expression analysis allowed identification of ten clusters that are associated with differential clinical outcome 3 . In standard care, breast cancers are subtyped based on the expression of the estrogen and progesterone receptors (ER and PR) and the human epidermal growth factor receptor-2 (HER2), as these receptors are "oncogenic drivers" and relevant drug targets. Patients with breast cancers that do not express the ER, PR, and HER2, so-called triple-negative breast cancers (TNBCs), do not benefit from antihormonal or anti-HER2-targeted treatments, and rely on conventional chemotherapeutic regimens. Initially, high response rates to conventional chemotherapeutics are seen in TNBC; however, tumors often recur and women have a poor prognosis overall 4 . TNBCs display aggressive behavior, and account for ~15-20% of all invasive breast cancers 4 .
TNBC tumors show a large degree of overlap with the intrinsic "basal-like" and "claudin-low" subgroups, lack common "druggable" aberrations, but share a profound genomic instability 5 .
Finding novel treatment options for such genomically instable cancers is not only relevant for TNBCs, but also for other hard-to-treat cancers with extensive genomic instability, including ovarian and pancreatic cancers 6,7 .
Evidence is increasingly pointing to "replication stress" as a driver of genomic instability 8,9 . DNA replication is initiated at certain genomic loci called "replication origins" 9 . Replication origins are fired in a temporally controlled way, which prevents exhaustion of the nucleotide pool. A key source of replication stress in cancer cells appears to be the uncoordinated firing of replication origins due to oncogene activation [8][9][10][11] . As a consequence, oncogene activation depletes the nucleotide pool, leading to slowing or complete stalling of replication forks 12 . Oncogenes that have been linked to the induction of replication stress are the transcription factor c-Myc 13 and Cyclin E, which acts in conjunction with cyclin-dependent kinase-2 (CDK2) to promote S-phase entry. It was shown that Cyclin E overexpression triggers aberrant origin firing with consequent nucleotide pool depletion, leading to replication fork stalling and genomic instability 12,14 . Likewise, overexpression of the Cdc25A phosphatase, which activates CDK2, promotes premature cell cycle progression and genomic instability [15][16][17] .
Cells are equipped with multiple mechanisms to survive replication stress. During replication fork stalling, single-stranded DNA (ssDNA) is exposed and rapidly activates the so-called "replication checkpoint", in which the ATR kinase is the central player 18 . Activation of the replication checkpoint facilitates the rapid coating of ssDNA at replication forks with replication protein-A (RPA), which is phosphorylated by ATR 19,20 . When stalled replication forks are not resolved in time, they can collapse and cause DNA double-strand breaks (DSBs), triggering phosphorylation of the histone variant H2AX at serine 139, which is referred to as γ-H2AX 21 .
Genomically instable tumors increasingly rely for their survival on mechanisms that allow cells to resolve replication stress-induced DNA lesions, including cell cycle checkpoints 22 . Hence, cell cycle checkpoint kinases, including WEE1 and ATR, are potential therapeutic targets for tumors with high levels of replication stress. In order to implement novel therapeutic agents that target tumors with high levels of replication stress optimally, it is essential to know which tumor subgroups display replication stress. To this end, and to find potential biomarkers for tumors with high levels of replication stress, we examined replication stress levels in relation to oncogene expression and clinicopathological data in a consecutive well-defined series of breast cancer samples.

RESULTS
Overexpression of Cyclin E1 results in replication stress and increased sensitivity to ATR and WEE1 inhibition
To study the potential effects of Cyclin E1, encoded by the CCNE1 gene, on DNA replication kinetics in vitro, we transduced MDA-MB-231 TNBC cells with a doxycycline-inducible Cyclin E1 construct (Fig. 1a). Cells were treated for 48 h with doxycycline to induce Cyclin E1 overexpression, and were then sequentially labeled with the thymidine analogues CldU and IdU to probe replication kinetics (Fig. 1b). Measurement of individual IdU tract lengths revealed that overexpression of Cyclin E1 resulted in a reduction in ongoing DNA synthesis speed of approximately 25% (Fig. 1c). To assess whether Cyclin E1 overexpression affects the sensitivity of cancer cells to inhibitors of cell cycle checkpoint kinases, we induced Cyclin E1 overexpression and inhibited ATR and WEE1 kinases using VE-822 and MK-1775, respectively (Fig. 1d, Supplementary Table 1). Induction of Cyclin E1 overexpression increased the sensitivity to ATR and WEE1 inhibitors in MDA-MB-231 cells, as assessed using MTT assays (Fig. 1d). Taken together, overexpression of Cyclin E1 resulted in replication stress in TNBC cells and enhanced the sensitivity towards inhibitors of the WEE1 and ATR cell cycle checkpoint kinases.

Analysis of breast cancer tissues
To further investigate oncogene-induced replication stress in clinical samples, we selected a study population that comprised 384 breast cancer patients (Fig. 2a), whose baseline clinical, pathological and treatment characteristics are summarized in Supplementary Table 2. Breast cancer patients were divided into four subgroups according to their hormone receptor status and HER2 expression. Molecular subgroup analysis showed that our cohort consisted of n = 161 ER/PR + HER2 − , n = 90 ER/PR + HER2 + , n = 27 ER/PR − HER2 + , and n = 106 ER/PR − HER2 − (TNBC) patients (Supplementary Table 2). Compared to other patient subgroups, the median age at diagnosis was lowest for TNBC patients, followed by ER/PR − HER2 + patients 23 (Supplementary Table 2). In addition, tumor grade significantly varied across breast cancer subgroups (Supplementary Table 2, P = 1.72 × 10 −13 ), and was highest in patients with TNBC (Supplementary Table 2, P < 0.05 for all tests). In accordance with treatment guidelines, chemotherapy was most frequently used in TNBC patients (66.0%), whereas radiotherapy and endocrine therapy were more frequently used in non-TNBC patients (Supplementary Table 3, P < 0.001 for all tests).
Expression of Cyclin E, c-Myc, and Cdc25A in breast cancer subgroups
We next performed immunohistochemical analysis in breast cancer tissues taken prior to treatment (Supplementary Table 4) to examine the expression levels of Cyclin E (encoded by CCNE1) and c-Myc (encoded by MYC), two oncogenes that are frequently amplified in TNBC, and have been associated with replication stress in experimental models [24][25][26][27] (Fig. 2b). For Cyclin E, we separately assessed nuclear and cytoplasmic Cyclin E (Fig. 2b), since cytoplasmic Cyclin E has been related to reduced breast cancer survival 28,29 . We also assessed expression of the Cdc25A phosphatase (Fig. 2b). Although Cdc25A is less frequently overexpressed in breast cancer 15 , Cdc25A overexpression is frequently used to induce replication stress in experimental models 30 , and has been linked to oncogenic activity 31,32 , and for that reason was included in our analysis.
Expression levels of nuclear Cyclin E were significantly higher in TNBC than in the other breast cancer subgroups (Supplementary Table 5, and Fig. 2c, P < 0.05 for all tests). In contrast, cytoplasmic Cyclin E levels were high in both TNBC and ER/PR − HER2 + tumors, compared to ER/PR + HER2 − tumors (Supplementary Table 5 and Fig. 2c, P < 0.05 for all tests). Expression levels of c-Myc were also higher in TNBC (Supplementary Table 5 and Fig. 2c, P < 0.001) compared to the other subgroups. Although TNBC tumors also displayed the highest levels of Cdc25A, these differences were not statistically significant (Supplementary Table 5 and Fig. 2c). We next analyzed mRNA expression levels of CCNE1, MYC, and CDC25A in a set of 7270 gene expression profiles from primary breast tumors obtained from the Gene Expression Omnibus (GEO) 33 . The mRNA expression of CCNE1, MYC, and to a lesser extent CDC25A was significantly higher in TNBC when compared to the other subgroups (Fig. 2d). These findings confirm at the mRNA level that, among breast cancer subgroups, TNBCs exhibited the highest expression levels of Cyclin E, c-Myc and Cdc25A.

Levels of replication stress in breast cancer subgroups
To determine levels of replication stress, we immunohistochemically examined the expression of RPA32 phosphorylated at Ser33 (further referred to as pRPA) in breast cancer tissues taken prior to treatment. In addition, we analyzed the expression of γ-H2AX, an established marker for collapsed replication forks and double-strand breaks, which are consequences of replication stress 34,35 . Representative immunohistochemical pRPA and γ-H2AX stainings are shown in Fig. 3a. We also compared expression of pRPA and γ-H2AX with other DNA damage response components in a subset of samples. Specifically, we immunohistochemically stained n = 45 cases for 53BP1 and FANCD2, two proteins involved in the repair of DNA lesions induced by replication stress (Supplementary Fig. 1a). High levels of 53BP1 were present in all cases analyzed (Supplementary Fig. 1b), whereas expression levels of FANCD2 showed larger variation (Supplementary Fig. 1b). Importantly, we found that expression of γ-H2AX and FANCD2 were associated (Supplementary Fig. 1c, r = 0.344, P = 0.028), in line with their roles in resolving the consequences of replication stress-induced DNA lesions. In contrast, 53BP1 expression did not show statistically significant associations with expression of either pRPA or γ-H2AX (Supplementary Fig. 1c).
Since the highest expression levels of c-Myc and Cyclin E as well as replication stress markers were observed in TNBC, we further analyzed relevant TNBC subgroups. Expression of the androgen receptor (AR) has been described to define a TNBC subgroup with distinct characteristics 36 . In our cohort, n = 29 out of 106 TNBC cases (27.4%) expressed the AR (Supplementary Fig. 2a, Supplementary Table 6a, b). We next analyzed the expression levels of replication stress markers in TNBC-AR − and TNBC-AR + subgroups (Supplementary Fig. 2b). Tumor expression of pRPA (Supplementary Fig. 2b Fig. 2b and Supplementary Table 6b, P > 0.05 for all tests). These data indicate that in this cohort, expression of replication stress markers and expression of oncogenes are similarly distributed in TNBC-AR − and TNBC-AR + subgroups.
Correlations between replication stress markers and Cyclin E, c-Myc, and Cdc25A expression
To determine whether expression of Cyclin E, c-Myc, or Cdc25A was associated with replication stress in our study population, we examined associations between expression of Cdc25A, Cyclin E, and c-Myc with expression of replication stress markers pRPA and γ-H2AX. We first analyzed associations of replication stress markers with oncogene expression as continuous variables (Table 1), because no clear biphasic distributions of staining intensities were observed. Expression levels of pRPA were positively correlated with those of c-Myc (Table 1, r = 0.26, P < 0.001), as well as expression levels of nuclear Cyclin E (Table 1, r = 0.37, P < 0.001) and cytoplasmic Cyclin E (Table 1, r = 0.28, P < 0.001) in the entire cohort. Among breast cancer subgroups, the strongest correlations were found in TNBC between Cyclin E and pRPA expression (Table 1, r = 0.43, P < 0.001), and between c-Myc and pRPA (Table 1, r = 0.36, P < 0.001). Furthermore, Cyclin E expression was strongly correlated with levels of γ-H2AX staining (Table 1, r = 0.63, P < 0.001). Spearman correlation analysis of breast cancer subgroups revealed that the association between nuclear Cyclin E and γ-H2AX expression was strongest in ER/PR − HER2 + (Table 1, r = 0.86, P < 0.001) and TNBC (Table 1, r = 0.71, P < 0.001). Combined, these data indicate that expression of Cyclin E is associated with expression of replication stress markers in our study population, especially in the TNBC and ER/PR − HER2 + subgroups.
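For readers unfamiliar with the statistic, the rank correlation reported above can be sketched as follows. This is a minimal pure-Python illustration, not the SPSS procedure used in the study; the function names and the six example scores are hypothetical, and the P-value calculation is omitted.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's r: the Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical staining scores (0-200 scale) for six tumors; not study data.
cyclin_e = [10, 40, 40, 120, 150, 200]
gamma_h2ax = [5, 30, 60, 90, 80, 160]
rho = spearman_r(cyclin_e, gamma_h2ax)
```

Because only the ranks enter the calculation, the coefficient does not require the staining scores to be normally distributed, which suits semi-quantitative scores of this kind.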
Similar associations were observed when we dichotomized samples on the basis of oncogene expression. Specifically, Cyclin E stainings were categorized into nuclear and cytoplasmic negative (N−/C−, n = 113), nuclear positive and cytoplasmic negative (N+/C−, n = 78) and cytoplasmic positive with either nuclear   Table  7b, Cdc25A: r = −0.100, P = 0.607 and (Cyclin E: r = 0.048, P = 0.806). In conclusion, markers of replication stress appear equally expressed in AR-negative and AR-positive TNBCs, although the associations between replication stress (pRPA, γ-H2AX) and oncogene expression (Cdc25A, Cyclin E) are strongest in AR-negative TNBCs within our cohort.
Associations of expression of replication stress markers with clinicopathological characteristics and tumor expression of Cyclin E, c-Myc and Cdc25A
Linear regression analyses were performed to evaluate the relation between expression of replication stress markers versus clinicopathological characteristics and tumor expression of Cdc25A, Cyclin E, and c-Myc. Univariate regression analysis showed that pRPA was associated with γ-H2AX (Table 3, β = 0.409, P < 0.001). Also, pRPA was associated with positivity for cytoplasmic Cyclin E (Table 3, β = 0.345, P < 0.001). In contrast, weaker associations were found between oncogene expression and pRPA levels (Table 3). The covariates from univariate regression analyses that displayed P < 0.05 were included for multivariate analysis, and showed that pRPA was weakly related to γ-H2AX (Table 3, β = 0.351, P < 0.001). Conversely, tumor expression of γ-H2AX was associated with nuclear Cyclin E (Table 3, β = 0.407, P < 0.001) and with cytoplasmic Cyclin E (Table 3, β = 0.324, P < 0.001). No strong independent associations were found between clinicopathological parameters and pRPA or γ-H2AX expression. In summary, our findings indicate that replication stress measured with pRPA, corrected for age, tumor subgroup, tumor grade, and tumor expression, is related to γ-H2AX, whereas replication stress quantified with γ-H2AX, corrected for tumor subgroup and tumor expression, is positively associated with tumor expression of nuclear and cytoplasmic Cyclin E.
Associations between expression of oncogenes and markers of replication stress with survival
We next analyzed the relationship between the expression of Cyclin E, c-Myc, Cdc25A, and markers of replication stress with disease-free survival (DFS), recurrence-free survival (RFS) or overall survival (OS) in our breast cancer cohort (n = 379) using Cox regression analyses (Table 4). Univariate analyses showed that positivity of nuclear Cyclin E expression was associated with worse OS (Table 4, β = 0.633, P = 0.035) and borderline associated with DFS (Table 4, β = 0.565, P = 0.058) but not RFS (  (Table 4) and were not associated with expression of Cyclin E, Cdc25A, c-Myc or γ-H2AX.
Next, we analyzed DFS using the publicly available patient data retrieved from GEO for which clinical data were also available (n = 3450 out of n = 7270). A shorter DFS was found in patients with ER + /HER2 + (n = 341) and ER − /HER2 + (n = 291) tumors, when compared to other subgroups (Supplementary Fig. 4a), and a shorter OS in the TNBC (n = 263) and ER − /HER2 + (n = 341) patients (Supplementary Fig. 4b). When we evaluated the relation between CCNE1 mRNA expression and DFS (n = 846) or OS (n = 632) using multivariate Cox regression analysis of data from publicly available mRNA samples of primary breast tumors, we found no associations between CCNE1 mRNA expression and DFS. In contrast, a higher mRNA expression level of CCNE1 was associated with reduced OS (Supplementary Table 8, HR = 1.660, P = 0.004) in the ER + /HER2 − subgroup (n = 417).

DISCUSSION
In the present study, we examined the relation between oncogene expression and replication stress marker expression in breast cancer subgroups. Our immunohistochemical analyses show that levels of replication stress and oncogene expression vary among breast cancer subgroups and that the highest expression levels of replication stress markers and oncogenes were found in the TNBC and ER/PR − HER2 + subgroups. Furthermore, both nuclear and cytoplasmic Cyclin E expression, and to a lesser extent c-Myc expression, were strongly associated with the levels of replication stress. These findings are relevant in the context of ongoing clinical studies using novel agents that target replication stress, for which proper patient selection is warranted.
high-grade serous ovarian cancer and head-and-neck squamous cell carcinoma 31,38 , which are characterized by genomic instability 39,40 .
Concerning c-Myc expression, our results indicated that high c-Myc expression was predominantly observed in TNBCs, and that c-Myc expression was associated with expression of the replication stress marker pRPA. These findings are in line with previous reports, showing frequent MYC amplification in TNBC 24 . Importantly, our results provide confirmation that the link between c-Myc-overexpression and induction of replication stress is also observed in patient samples 13,26,41 .
CCNE1 is frequently amplified in TNBC, in line with our finding that high levels of Cyclin E expression are most prominent in TNBC cases 24 . Importantly, our observation that Cyclin E expression is associated with expression of replication stress markers is in line with experimental models, in which Cyclin E overexpression has been shown to trigger a DNA damage response 12,14,42 . Specific isoforms of Cyclin E, so-called low molecular-weight Cyclin E isoforms (LMW-E), are suggested to accumulate in the cytoplasm because they lack the NH2-terminal nuclear localization signal 43 . In line with experiments in which expression of cytoplasmic Cyclin E was shown to induce various features that relate to replication stress, including chromosome missegregation 44 , our data show that expression of cytoplasmic Cyclin E, like expression of nuclear Cyclin E, is associated with expression of replication stress markers pRPA and γ-H2AX. Of note, we found that cytoplasmic and nuclear Cyclin E showed a similar distribution among breast cancer subgroups, with highest expression observed in TNBC. However, no clear biphasic staining distributions of nuclear or cytoplasmic Cyclin E were observed. For this reason, we analyzed the expression of Cyclin E and other markers as continuous variables.
Survival analysis of our cohort of patients and publicly available patient data showed that higher levels of replication stress (pRPA) or high Cyclin E expression levels were associated with worse DFS, whereas higher nuclear Cyclin E levels were related to worse OS, albeit only in univariate analysis. CCNE1 mRNA was not found to be independently associated with DFS or OS in public data, except in the ER + /HER2 − subgroup, in which higher levels of CCNE1 expression were associated with worse OS. Immunohistochemical analysis of Cyclin E previously identified Cyclin E as an independent predictor of survival 45 , although hormone receptor status was not included in this analysis. Also, total and low-molecular-weight Cyclin E levels, as assessed by Western blot, were shown to be independent predictors of overall survival in a breast cancer cohort 46 . Studies in other cancer types, including serous ovarian cancers and endometrial carcinomas, also showed that CCNE1 amplification or Cyclin E overexpression was associated with more aggressive tumor features, but was not an independent predictor of survival [47][48][49] . However, high Cyclin E expression was shown to be a significant predictive marker for survival in suboptimally debulked ovarian cancers 50 .

Association analysis between oncogene expression and pRPA or γ-H2AX expression in the combined cohort (n = 384) and in breast cancer subgroups. Oncogene expression was used as a continuous variable in a Spearman rank correlation analysis.

S. Guerrero Llobet et al.
Taken together, our findings indicate that among breast cancer subgroups, TNBCs and ER/PR − HER2 + tumors are characterized by overexpression of the c-Myc and Cyclin E oncogenes, and by higher expression levels of replication stress markers. These findings are relevant, as increasing numbers of drugs are being developed that target cancer cells with high levels of replication stress. Specifically, inhibitors of the cell cycle checkpoint kinases Chk1 and ATR are currently being tested in combination with genotoxic agents that interfere with DNA replication 51,52 . In parallel, inhibitors of the WEE1 kinase have been developed. The potential of WEE1 inhibition was early on attributed to high levels of replication stress 53 and preclinical data indicated that WEE1 inhibition would be preferentially effective in Cyclin E-overexpressing cancer cells 54 . In line with these data, ovarian cancer patients that responded favorably to WEE1 inhibitor treatment more frequently showed tumor overexpression of Cyclin E 55 . Based on these observations, a clinical trial testing WEE1 inhibitor treatment in patients selected on CCNE1 amplification is currently ongoing (clinicaltrials.gov identifier: NCT03253679).
Although different cell cycle checkpoint inhibitors are already in clinical development, an effective patient selection strategy is required to identify those patients who might benefit from these drugs. For breast cancer patients, our data underscore that overexpression of nuclear and/or cytoplasmic Cyclin E could be used as a selection criterion for treatment with drugs that target replication stress, including inhibitors of WEE1 and ATR.
METHODS
Western blotting
MDA-MB-231 cells were lysed in M-PER lysis buffer (Pierce), supplemented with protease and phosphatase inhibitor cocktail (Thermo Scientific). Protein content was measured using the Pierce BCA protein quantification Kit (Thermo Scientific). Protein samples were separated using sodium dodecyl sulfate-polyacrylamide gels (SDS-PAGE) and transferred to polyvinylidene fluoride membranes (Immobilon). Membranes were blocked in 5% skimmed milk (Sigma), in tris-buffered saline (TBS) containing 0.05% Tween-20 (Sigma) and incubated overnight with primary antibodies at 4°C and subsequently incubated with secondary antibodies for 1 h at room temperature. Primary antibodies used were mouse anti-Cyclin E1 (Abcam, ab3927, 1:500) and mouse anti-β-actin (MP Biomedicals, 69100, 1:10,000). The secondary antibody used was horseradish peroxidase-linked anti-mouse IgG (1:2000, DAKO), and signals were visualized using chemiluminescence (Lumi-Light, Roche Diagnostics) on a Bio-Rad bioluminescence device. Protein imaging was performed using Image Lab software (Bio-Rad). All blots derive from the same experiment and were processed in parallel.
DNA fiber analysis      Bethyl, TX, USA). Staining was detected by the application of 3,3'-diaminobenzidine (DAB), and hematoxylin as a counterstaining. For c-Myc and pRPA, the complete staining procedure was performed on an autostainer (BenchMark Ultra IHC/ISH, Roche, Basel, Switzerland). Additional information about antibodies and staining protocols is provided in Supplementary Table 4.
Scoring was performed semi-quantitatively by two independent researchers, without knowledge of clinical data, and was supervised by a breast cancer pathologist. Stainings were categorized according to percentages of cells that showed staining and on intensity of staining. Staining intensity was scored in three categories: 0 (negative), 1 (medium), and 2 (high). In order to calculate the score for each core, the percentage of cells in each group was multiplied by their intensity score, resulting in a range from 0 to 200 points. Next, the scores from each case and staining were averaged and considered for analysis.
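The scoring arithmetic described above can be illustrated with a short sketch. It assumes that the products of percentage and intensity are summed over the three intensity categories (which yields the stated 0 to 200 range) and that the case score is the plain mean of the available cores; the function names and example cores are hypothetical.

```python
def core_score(pct_by_intensity):
    """Score one core: sum of (% of cells) x (intensity 0, 1 or 2)."""
    total_pct = sum(pct_by_intensity.values())
    if abs(total_pct - 100) > 1e-6:
        raise ValueError("percentages must sum to 100")
    return sum(intensity * pct for intensity, pct in pct_by_intensity.items())


def case_score(core_scores):
    """Average the scores of the cores available for one case and staining."""
    return sum(core_scores) / len(core_scores)


# Hypothetical cores: keys are intensity categories, values are % of cells.
core_a = {0: 50, 1: 30, 2: 20}  # 0*50 + 1*30 + 2*20 = 70
core_b = {0: 0, 1: 0, 2: 100}   # maximum possible score (200)
core_c = {0: 100, 1: 0, 2: 0}   # minimum possible score (0)
score = case_score([core_score(c) for c in (core_a, core_b, core_c)])
```

With these three made-up cores, the core scores are 70, 200, and 0, so the case score is 90.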
For Cdc25A, only nuclear staining was considered, in line with a previous study 60 . For Cyclin E, nuclear and cytoplasmic staining were scored individually 28 . In addition, nuclear c-Myc, pRPA, γ-H2AX, 53BP1, and FANCD2 stainings were evaluated. A concordance of more than 90% was found between observers. Discordant scores were reviewed and adjusted to consensus. The status of ER, PR, HER2, and AR was determined according to the guidelines of the American Society of Clinical Oncology/College of American Pathologists by counting at least 100 cells.
Immunohistochemical stainings were considered evaluable when a tumor core contained at least 10% tumor cells. In addition, tumor stainings were included for analysis when at least 2 out of 3 cores were evaluable. Core loss over 558 cases was on average 15
Evaluation of mRNA expression of CCNE1, MYC and CDC25A
Publicly available mRNA profiles of 7270 primary breast tumors were collected from GEO platforms GPL96 (generated with Affymetrix HG-U133A) and GPL570 (generated with Affymetrix HG-U133 Plus 2.0) as previously described 33 . Expression profiles were batch corrected using COMBAT 61 . CCNE1 expression values were calculated using probe 213523_at. CDC25A expression values were calculated using probe 204695_at. MYC expression values were calculated using probe 202431_s_at.

Statistical analyses
Analyses were performed on the total study population as well as on four patient subgroups based on hormone receptor status and HER2 expression. Differences regarding clinicopathological features, treatment, and immunohistochemical expression levels between the four groups were analyzed using Pearson chi-square tests in case of categorical variables, while Kruskal-Wallis tests and Mann-Whitney U tests were used in case of continuous variables.
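As an illustration of the rank-based group comparison named above, the Kruskal-Wallis H statistic can be sketched in pure Python. This is not the SPSS implementation used in the study; the function names are hypothetical, and the P-value (obtained by comparing H against a chi-square distribution with k − 1 degrees of freedom for k groups) is left out.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    # Accumulate (rank sum)^2 / n per group, walking the pooled list in order.
    acc = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * acc - 3 * (n_total + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over sets of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical staining scores for four subgroups; not study data.
h_stat = kruskal_wallis_h([[20, 35, 50], [40, 60, 75], [90, 110, 95], [150, 160, 120]])
```

A significant overall result would typically be followed by pairwise Mann-Whitney U tests between subgroups, which is consistent with the "for all tests" wording used when results are reported.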
Univariate linear regression analyses were performed to study the relation between expression of replication stress markers versus clinicopathological characteristics and tumor expression of Cyclin E, c-Myc, or Cdc25A. Comparisons that reached P < 0.05 in univariate linear regression analyses were selected for multivariate linear regression analyses. All statistical analyses in this study were performed using SPSS Statistics 23.0 (IBM).
Associations between mRNA expression levels and survival in breast cancer subgroups were determined using multivariate Cox regression analyses with age, tumor size, tumor grade, lymph node involvement, ER status, HER2 status, and treatment regimen as co-variates. DFS was calculated as the interval between the date of diagnosis and the date of diagnosis of distant metastasis. RFS was based on the interval between the date of diagnosis of disease and the date of diagnosis of distant metastasis or the date of death. OS was calculated as the interval between the date of diagnosis and the date of death by any cause. Survival probabilities for different breast cancer subgroups were calculated using Kaplan-Meier curves.
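The survival intervals and the Kaplan-Meier estimate described above can be sketched as follows. This is a minimal illustration rather than the software used in the study; the function names, the conversion of days to months, and the five example patients are assumptions made for the example.

```python
from datetime import date


def interval_months(start, end):
    """Interval between two dates in months (assumes 30.4375 days per month)."""
    return (end - start).days / 30.4375


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical OS for one patient: date of diagnosis to date of death.
os_months = interval_months(date(2005, 1, 1), date(2007, 1, 1))

# Hypothetical follow-up (months) and event indicators for five patients.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 0, 1])
```

Censored patients (event indicator 0) never lower the curve themselves; they only shrink the number at risk for later event times.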

Reporting summary
Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.