Prognostic significance of RNA-based TP53 pathway function among estrogen receptor positive and negative breast cancer cases

TP53 and estrogen receptor (ER) are essential in breast cancer development and progression, but TP53 status (by DNA sequencing or protein expression) has been inconsistently associated with survival. We evaluated whether RNA-based TP53 classifiers are related to survival. Participants included 3213 women in the Carolina Breast Cancer Study (CBCS) with invasive breast cancer (stages I–III). Tumors were classified for TP53 status (mutant-like/wildtype-like) using an RNA signature. We used Cox proportional hazards models to estimate covariate-adjusted hazard ratios (HRs) and 95% confidence intervals (CIs) for breast cancer-specific survival (BCSS) among ER- and TP53-defined subtypes. RNA-based results were compared to DNA- and IHC-based TP53 classification, as well as Basal-like versus non-Basal-like subtype. Findings from the diverse (50% Black), population-based CBCS were compared to those from the largely white METABRIC study. RNA-based TP53 mutant-like was associated with BCSS among both ER-negatives and ER-positives (HR (95% CI) = 5.38 (1.84–15.78) and 4.66 (1.79–12.15), respectively). Associations were attenuated when using DNA- or IHC-based TP53 classification. In METABRIC, few ER-negative tumors were TP53-wildtype-like, but TP53 status was a strong predictor of BCSS among ER-positives. In both populations, the effect of TP53 mutant-like status was similar to that for Basal-like subtype. RNA-based measures of TP53 status are strongly associated with BCSS and may have value among ER-negative cancers where few prognostic markers have been robustly validated. Given the role of TP53 in chemotherapeutic response, RNA-based TP53 as a prognostic biomarker could address an unmet need in breast cancer.

INTRODUCTION
TP53 status has generally been observed to be an independent prognostic factor among breast cancer cases [1][2][3][4][5][6]. However, recent studies suggest the prognostic effect is subtype-dependent, with conflicting reports regarding its prognostic performance [7][8][9][10]. It is important to understand the prognostic value of TP53 status within ER subtypes, given that TP53 and ER pathways play essential roles in breast cancer, and due to recent evidence of crosstalk between their signaling pathways 11-14. Inconsistent results across previous studies may have been due in part to technical differences. Most studies on TP53 and survival among breast cancer patients have classified TP53 status using either DNA sequencing or immunohistochemistry (IHC) to detect nuclear overexpression of TP53 protein as a surrogate marker of mutation status. IHC methods may misclassify some mutant tumors as wildtype, and both methods may miss some tumors with functional defects in the TP53 pathway [4][5][6][7][8][9][10][11][12][13][14][15][16]. In contrast, RNA methods detect patterns of loss or activity downstream in the TP53 signaling pathway. As such, RNA-based TP53 classification methods may reduce misclassification of functional status and clarify associations with survival outcomes. It is also important to address the role of TP53 in diverse populations and across ER subtypes.
We have sought to address these gaps by evaluating the prognostic value of a validated, RNA-based signature of TP53 functional status (overall and within ER subtypes). Black women have higher rates of TP53 mutant-tumors [15][16][17] and may have different mutation types 17 , and therefore, we used data from the Carolina Breast Cancer Study, which oversampled Black and younger women. We compared the prognostic effects of TP53 in this diverse population to those from another large, mostly European dataset.

RESULTS
The eligible population included 3213 and 1343 breast cancer cases in CBCS and METABRIC, respectively (Table 1, Supplementary Fig. 1). The numbers of events for each outcome in the two populations are provided in Supplementary Fig. 1. Because the populations differ substantially in the distribution of ER status (50 and 29% ER negative in CBCS and METABRIC, respectively), Table 1 is stratified by ER to facilitate comparisons. Compared to METABRIC, both ER-positive and -negative cases in CBCS were younger at diagnosis, with tumors diagnosed at a lower grade, and a lower proportion of node-positive tumors. As the METABRIC population is predominantly non-Black, the most comparable population is the non-Black subgroup in CBCS. The differences in clinical characteristics between the studies became more pronounced when comparing METABRIC to the non-Black population in CBCS.
Breast cancer-specific survival patterns varied across TP53 subtypes. Kaplan-Meier plots (Figs. 1, 2) and multivariable models (Tables 2 and 3 As 60% of TP53 mutant-like tumors were Basal-like in CBCS, it was of interest to also evaluate Basal-like vs. non-Basal-like subtypes to see whether the survival associations mirrored those for TP53. The Kaplan-Meier plots and multivariable models showed that these markers have similar effects. For example, in CBCS the HR (95% CI) for Basal-like vs. non-Basal-like status was 3.37 (1.99-5.71). In multivariable models for both populations, the overall associations were recapitulated when restricting to ER-positive cases. When restricting to ER-negative cases, there were no statistically significant associations between tumor subtypes and BCSS, except in CBCS where the magnitude of association between RNA-based TP53 status and survival was similar among ER-positive and -negative cases (4.66 [1.79-12.15] and 5.38 [1.84-15.78], respectively). Sensitivity analyses restricting CBCS to non-Black cases resulted in no change among ER-positive cases and an increased magnitude among ER-negative cases.
TP53 status was also associated with overall survival, regardless of classification method. Kaplan-Meier plots (Supplementary Figs. 2 and 3) only showed statistically significant associations with OS when using DNA-based TP53 classification (as well as RNA-based TP53 in METABRIC). When adjusting for other clinical and tumor characteristics (Supplementary Tables 2 and 3, Supplementary Fig. 4), however, statistically significant associations were observed between all subtype classifications and OS. In CBCS, the strongest associations were observed when using RNA-based TP53 classification, with a similar magnitude among ER-positive and -negative cases. In METABRIC, survival associations were only observed among ER-positive cases.
The association between TP53 status and recurrence-free survival varied by ER status. In both populations, Kaplan-Meier plots (Supplementary Figs. 5 and 6) and multivariable models (Supplementary Tables 4 and 5, Supplementary Fig. 7) demonstrated that RNA-based TP53 mutant-like status was associated with worse RFS, but the effect was only observed among ER-positive cases. In CBCS, the association was stronger when using RNA-based TP53 status (6.21 [3.27-11.80]) than when using IHC-based TP53 status (2.16 [1.24-3.78]). In METABRIC, IHC-based TP53 was not associated with RFS.
RNA-based TP53 status provided more prognostic information than the other markers of interest (DNA- and IHC-based TP53, and Basal-like status) in both populations (Supplementary Table 6 It is of interest to understand whether the effects of TP53 status differ between Black and non-Black cases; however, the sample size in CBCS allowed only exploratory analysis of these associations. Among ER-positive cases there were no interactions between RNA- or DNA-based TP53 status and race (p = 0.96 and 0.78, respectively), but an interaction was observed by IHC-based TP53 status (p = 0.03). Specifically, the association between mutant-like status and poorer BCSS was more pronounced for non-Black cases compared to Black cases. Among ER-negative cases there were suggestions of interactions between RNA- and DNA-based TP53 and race (p = 0.18 and 0.12, respectively), with the association between TP53 mutant/mutant-like status and

DISCUSSION
RNA-based TP53 functional score had stronger prognostic value than other technical methods in a population-based cohort including Black and non-Black women in North Carolina. The survival effect of TP53 mutant-like status was most consistent among ER-positive cases, but also showed significant effects among ER-negative cases in CBCS (where ER negatives were prevalent at 33%). Given the proportion of cases who had both TP53 mutant-like and Basal-like phenotypes, it was important to also evaluate the effects of TP53 among Basal-like vs. non-Basal-like. The BCSS associations for Basal-like and TP53 were similar, but more high-risk cases were captured with the TP53 status classification. TP53 is an important prognostic marker with potential clinical value and may be useful among ER-negative patients for whom prognostic markers are otherwise lacking. Prior studies have evaluated the survival effects of IHC- and DNA-based TP53 status among breast cancer patients, with near consensus that TP53 mutant cases have poorer survival compared to wildtype (Table 4) [1][2][3][4]6,7,18,19. Very few studies, however, have assessed survival differences by ER status. Among those that have, TP53 mutant cases were generally associated with worse outcomes among ER-positive cases 7,9,10,20, in line with our findings. However, results among ER-negatives have been more mixed, with some reporting TP53 mutant cases having better survival 9, but most finding no effect 7,10,21. It may seem paradoxical that the more aggressive tumors were sometimes found to have better outcomes, but several mechanisms have been proposed, largely indicating enhanced chemosensitivity in ER-negative/TP53 mutant tumors. In the present study we found a strong association between TP53 mutant-like status and poorer BCSS among ER negatives, which may demonstrate the importance of functional TP53 status over other classification methods. Additionally, the present findings come from a population-based study, unlike the previous studies.
Sampling differences between METABRIC and CBCS may explain differences in results among ER-negative cases. In METABRIC, the sample size of ER-negative cases was relatively small (n = 303) and among these, almost all (92%) were classified as TP53 mutant-like by the RNA signature. In CBCS, by contrast, there was a larger sample of ER-negatives (n = 1067), which included a smaller proportion of TP53 mutant-like cases (86%). CBCS ER-negative cases were also lower grade and more frequently node negative. Given that the METABRIC samples were sourced from tumor banks, it is plausible that this study oversampled more aggressive tumors, reducing variation of TP53 phenotypes. It is also possible that the more diverse CBCS population led to a different distribution of TP53 mutations (i.e., different types of mutations). Ethnically diverse population-based studies incorporating multigene signatures are important for understanding the diversity of ER-negative cases. When population characteristics become a key consideration in interpreting differences across studies, it suggests that either selection bias or relevant variables that vary across populations have not been addressed. However, the current study does show that stratification by ER status is critical and should be included in future studies of TP53-based prognostication.
A strength of this analysis was the racially diverse population, which included more younger women and a larger proportion of ER-negative cases. Previous studies of TP53 and prognosis have included populations that are exclusively, or nearly exclusively, of European descent. Another strength was availability of data on TP53 status using three different classification methods. Perhaps the most important limitation was that we did not model treatment differences, precluding the assessment of the predictive value of TP53. A lesser limitation was our choice to use the full dataset for each classification method, inhibiting comparability across methods; but sensitivity analysis in METABRIC among those with complete data for all three classification methods (n = 752) produced effect estimates that were unchanged or slightly stronger than those reported in the main analysis. Due to overlap of the TP53 mutant-like and Basal-like phenotypes, we evaluated survival effects of Basal-like vs. non-Basal-like, but we did not evaluate all possible comparisons (e.g., Basal-like versus each of the other individual PAM50 intrinsic subtypes) because even within relatively large data sets, sample sizes did not allow for further stratification. Lastly, this RNA-based TP53 signature has been widely used and validated for research purposes and is operationalized using cohort normalization. However, a single sample predictor has not yet been developed, so it cannot be applied to a single sample or small cohort without making important assumptions. If this signature continues to demonstrate clinical value, development of a single sample predictor is warranted.
The science of prognostication and prediction has generally been led by applications for ER-positive cases and has relied on factors that reflect tumor growth (e.g., proliferation scores). A marker such as TP53, which represents underlying tumor biology and may define molecular vulnerabilities to chemotherapeutics 22,23 , could address an unmet need. Particularly as immunotherapies become widely utilized, markers that identify tumors likely to benefit will be important. Homologous recombination deficiency status has been proposed as one possible approach 24,25 , but TP53 status may also merit consideration. RNA-based TP53 may be particularly valuable because of its interpretability as a pathway-level change and because it can be conveniently paired with other RNA-based assays. Further consideration of multigene TP53 scores in clinical care could be particularly important for ER-negative cases, for whom fewer predictive biomarkers are currently available.

Study populations
The Carolina Breast Cancer Study (CBCS) is a population-based study that enrolled participants in three phases between 1993 and 2013. Study details have been described previously 26 . Briefly, incident invasive breast cancers among women 20-74 years of age were identified using rapid case ascertainment. Black women and those younger than 50 years of age were oversampled. Clinical characteristics at diagnosis were assessed by collecting medical records and formalin-fixed paraffin-embedded (FFPE) tumor samples at study enrollment. All CBCS study procedures were approved by the University of North Carolina School of Medicine Institutional Review Board and participants provided written informed consent.
We compared the results from CBCS to those from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC), which includes fresh-frozen primary breast tumors collected from five tumor banks across the UK and Canada between 1977 and 2005. Clinical and genomic data were downloaded from cBioPortal (http://www.cbioportal.org/study?id=brca_metabric). About 93% of subjects were of European descent and the population ranges in age at diagnosis from 22 to 96 years. With an age distribution that skews older (median = 61 years), METABRIC includes a large proportion of ER-positive cases (77%).
Eligible cases were those diagnosed at stage I-III, with available data on TP53 status (Supplementary Fig. 1). In METABRIC, only cases with data on tumor characteristics (stage, grade, size, and node status) were included.

Breast tumor markers
CBCS. ER status was abstracted from clinical records for Phases 1-2. When missing, ER status was determined by the UNC central laboratory. For Phase 3, ER status for all cases was determined by the central laboratory.
Concordance between central laboratory and clinical record was 93% 27. Methods for tissue processing and IHC analysis of tumor markers have been described previously 17,[27][28][29]. ER positivity and TP53 mutant-like status were defined using a 10% positivity threshold. We selected the 10% cutoff for ER because at the time of enrollment for Phases 1-2, it was not yet the clinical standard to classify ER borderline tumors (1% to <10% positivity) as ER positive. Additionally, a 10% cutoff for ER positivity has been shown to have a stronger association with molecular phenotypes (e.g., intrinsic subtypes) 27. Tumor stage and size were abstracted from the medical records. Tumor grade was defined by centralized pathology review. RNA expression in CBCS was quantified using NanoString assays on at least one FFPE tumor sample per patient, with random replication to assess reproducibility 27,30,31. A previously validated RNA signature that aggregates expression information on TP53-dependent genes was used to classify TP53 functional status (mutant-like or wildtype-like) based on a similarity-to-centroid approach (Supplementary Table 1) 32. A research version of the PAM50 predictor was used to classify tumors into intrinsic subtypes 30,33, which were then dichotomized as Basal-like or non-Basal-like (i.e., luminal A, luminal B, HER2-enriched, or normal-like).
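To illustrate the similarity-to-centroid idea described above, the following is a minimal sketch rather than the published classifier. It assumes that similarity is measured by Pearson correlation and that expression values are already cohort-normalized; the centroid values, the four-gene length, and the function names are hypothetical and are not taken from Supplementary Table 1.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def classify_by_centroid(sample, centroids):
    """Return the label of the centroid most correlated with the sample."""
    return max(centroids, key=lambda label: pearson(sample, centroids[label]))


# Hypothetical centroids over four illustrative genes (NOT the published signature).
CENTROIDS = {
    "mutant-like": [1.2, 0.8, -0.9, -1.1],
    "wildtype-like": [-1.0, -0.7, 1.1, 0.9],
}

# Made-up, cohort-normalized expression values for one tumor.
tumor = [0.9, 0.5, -0.4, -1.3]
label = classify_by_centroid(tumor, CENTROIDS)
```

Because each tumor is compared with centroids defined on normalized expression, a classifier of this kind depends on how the cohort is normalized, which is consistent with the single sample limitation noted in the Discussion.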
For cases in CBCS phase 1, two complementary DNA-based methods were employed for detecting TP53 mutations using FFPE tumor samples. First, single strand conformational polymorphism (SSCP) analysis was used as a screening procedure to detect mutations in exons 4-8 of the TP53 gene, with subsequent manual radiolabeled sequencing of SSCP positives 34 . The Roche p53 Amplichip research test was also used to detect single base pair substitutions and single base pair deletions in exons 2-11, as well as splice sites (2 base pairs before and after each exon), in the TP53 gene 35 . All assays were carried out by the UNC central laboratory.
METABRIC. ER status and other tumor characteristics (tumor grade, stage, and size) were obtained from the medical records. RNA and DNA were extracted for transcriptional and genomic profiling on the Illumina Human v3 microarray and Affymetrix SNP 6.0 platforms, respectively 36. Tumors were classified for TP53 functional status (mutant-like/wildtype-like) using the RNA-based TP53 signature 32 and for PAM50 intrinsic subtype (Basal-like/non-Basal-like) using a research version of the PAM50 predictor 30,33.

Outcome assessment
The follow-up period for both studies was defined as the number of years between diagnosis and breast cancer death (for breast cancer-specific survival (BCSS)) or death due to any cause (for overall survival (OS)). For CBCS Phases 1-2, vital status and date of death were determined by linking with the National Death Index (NDI) in 2020. Breast cancer deaths were defined using the International Classification of Diseases breast cancer codes 174.9 (ICD-9) or C50.9 (ICD-10) as derived from death certificates. For METABRIC, vital status and time to death were obtained from the medical records.
Recurrence-free survival (RFS) was defined as time in years from diagnosis to first subsequent recurrent breast cancer (either local, regional, or distant). In CBCS Phase 3, recurrence date was abstracted from medical records after a patient reported a recurrence during follow-up telephone interviews (occurring at regular intervals). In METABRIC, recurrences and time to recurrences were obtained from the medical records.
All subjects who did not experience the outcome of interest were administratively censored at their date of last contact or the last linkage date to the NDI (for CBCS).
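The outcome definitions above can be sketched as a small coding routine. This is an illustration under stated assumptions, not the study's data pipeline: the function and field names are hypothetical, a year is approximated as 365.25 days, and deaths from other causes are assumed to be censored for BCSS at the date of death.

```python
from datetime import date

# ICD-9 and ICD-10 breast cancer codes named in the text.
BREAST_CANCER_CODES = {"174.9", "C50.9"}


def years_between(start, end):
    """Elapsed time in years, approximating a year as 365.25 days (assumption)."""
    return (end - start).days / 365.25


def code_outcomes(diagnosis, last_contact, death=None, cause=None):
    """Return follow-up time in years and event flags for OS and BCSS.

    Subjects without a recorded death are censored at last contact or linkage.
    Deaths from other causes count as OS events and are treated as censored
    for BCSS (assumption made for this sketch).
    """
    end = death if death is not None else last_contact
    os_event = death is not None
    bcss_event = os_event and cause in BREAST_CANCER_CODES
    return {
        "time": years_between(diagnosis, end),
        "os_event": os_event,
        "bcss_event": bcss_event,
    }


# Made-up example records.
alive = code_outcomes(date(2000, 1, 1), date(2010, 1, 1))
bc_death = code_outcomes(date(2000, 1, 1), date(2010, 1, 1),
                         death=date(2005, 1, 1), cause="C50.9")
other_death = code_outcomes(date(2000, 1, 1), date(2010, 1, 1),
                            death=date(2005, 1, 1), cause="I21.9")
```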

Statistical analyses
Kaplan-Meier plots were generated to compare survival patterns between TP53 subtypes defined using different classification methods (RNA signature, DNA sequencing, and IHC). Because of the overlap in TP53 mutant status and Basal-like intrinsic subtype, we also evaluated survival patterns by PAM50 intrinsic subtype (Basal-like/non-Basal-like) to determine whether the effects mirrored those for TP53. Survival patterns were assessed overall and within ER subtypes. Differences between the curves were evaluated using log-rank tests. Kaplan-Meier plots were restricted to node-negative cases, whereas multivariable models included cases of any node status and adjusted for node status.
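As a minimal illustration of the two techniques named above, the sketch below implements a Kaplan-Meier estimator and a two-group log-rank statistic in plain Python. It is not the software used in the study; the function names and the toy data are hypothetical, and event indicators are assumed to be coded 1 for an event and 0 for censoring.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        n_b = sum(1 for u, _, g in pooled if u >= t and g == 1)
        d_a = sum(1 for u, e, g in pooled if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in pooled if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance


# Toy data: follow-up in years, 1 = event, 0 = censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
chi2 = logrank_chi2([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

In the toy example every event in the first group precedes every event in the second, so the log-rank statistic is large relative to its single degree of freedom.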
The prognostic value of the TP53 subtypes was evaluated using Cox proportional hazards models to compute hazard ratios (HRs) and 95% confidence intervals (CIs), overall and stratified by ER status, analyzing each TP53 classification method (RNA-, IHC-, and DNA-based) separately. Again, we estimated survival effects for PAM50 intrinsic subtype (Basal-like vs. non-Basal-like) to assess whether they mirrored those for TP53. Minimally adjusted models accounted for age at diagnosis (as well as race and study phase in CBCS). Fully adjusted models additionally accounted for tumor stage, grade, size, and node status. Since tumor grade was missing for about 26% of cases in CBCS, covariates with missing values were addressed using the multiple imputation plus outcome approach 37. TP53 status and PAM50 subtype were modeled with addition of a time-varying term (T) due to the observed violation of the proportionality assumption of the Cox model. The direction and magnitude of the change in HR over time is indicated by the log of this coefficient (i.e., log(T) < 1 indicates a decreasing hazard and log(T) > 1 indicates an increasing hazard). We estimated the prognostic value of each TP53 classification method as the change in likelihood ratio chi square (Δχ²) following a
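The quantities named in this paragraph can be related with a short sketch. It is a hedged illustration, not the study's analysis code: the function names are hypothetical, the CI is assumed to be a Wald interval on the log-hazard scale, and the likelihood ratio comparison is assumed to involve nested models that differ by a single term (1 degree of freedom), which the text does not state.

```python
from math import erfc, exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta, se):
    """Convert a Cox log-hazard coefficient and its SE to (HR, lower, upper)."""
    return exp(beta), exp(beta - Z_95 * se), exp(beta + Z_95 * se)


def se_from_ci(lower, upper):
    """Recover the SE of log(HR) from a reported 95% Wald interval."""
    return (log(upper) - log(lower)) / (2 * Z_95)


def lr_chi2_change(loglik_base, loglik_full):
    """Likelihood ratio chi-square for one added term (1 df) and its p-value."""
    chi2 = 2 * (loglik_full - loglik_base)
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    return chi2, p_value


# The ER-negative estimate reported in CBCS: HR 5.38 (1.84-15.78).
se = se_from_ci(1.84, 15.78)
hr, lower, upper = hazard_ratio_ci(log(5.38), se)

# Made-up log-likelihoods for a base model and a model with one added marker.
delta_chi2, p = lr_chi2_change(-100.0, -98.0)
```

Rebuilding the interval from the reported HR and the back-calculated SE closely reproduces the published bounds, which is what one would expect if the interval is symmetric on the log scale.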

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The CBCS datasets generated and/or analyzed during the current study are not publicly available due to some human subjects restrictions, but may be available from the corresponding author on reasonable request. METABRIC data can be found here: http://www.cbioportal.org/study?id=brca_metabric.