Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups

Rueda, Oscar M.; Sammut, Stephen-John; Seoane, Jose A.; Chin, Suet-Feung; Caswell-Jin, Jennifer L.; Callari, Maurizio; Batra, Rajbir; Pereira, Bernard; Bruna, Alejandra; Ali, H. Raza; Provenzano, Elena; Liu, Bin; Parisien, Michelle; Gillett, Cheryl; McKinney, Steven; Green, Andrew R.; Murphy, Leigh; Purushotham, Arnie; Ellis, Ian O.; Pharoah, Paul D.; Rueda, Cristina; Aparicio, Samuel; Caldas, Carlos; Curtis, Christina

doi:10.1038/s41586-019-1007-8

Letter
Published: 13 March 2019


Nature volume 567, pages 399–404 (2019)

Abstract

The rates and routes of lethal systemic spread in breast cancer are poorly understood owing to a lack of molecularly characterized patient cohorts with long-term, detailed follow-up data. Long-term follow-up is especially important for those with oestrogen-receptor (ER)-positive breast cancers, which can recur up to two decades after initial diagnosis^1,2,3,4,5,6. It is therefore essential to identify patients who have a high risk of late relapse^7,8,9. Here we present a statistical framework that models distinct disease stages (locoregional recurrence, distant recurrence, breast-cancer-related death and death from other causes) and competing risks of mortality from breast cancer, while yielding individual risk-of-recurrence predictions. We apply this model to 3,240 patients with breast cancer, including 1,980 for whom molecular data are available, and delineate spatiotemporal patterns of relapse across different categories of molecular information (namely immunohistochemical subtypes; PAM50 subtypes, which are based on gene-expression patterns^10,11; and integrative or IntClust subtypes, which are based on patterns of genomic copy-number alterations and gene expression^12,13). We identify four late-recurring integrative subtypes, comprising about one quarter (26%) of tumours that are both positive for ER and negative for human epidermal growth factor receptor 2, each with characteristic tumour-driving alterations in genomic copy number and a high risk of recurrence (mean 47–62%) up to 20 years after diagnosis. We also define a subgroup of triple-negative breast cancers in which cancer rarely recurs after five years, and a separate subgroup in which patients remain at risk. Use of the integrative subtypes improves the prediction of late, distant relapse beyond what is possible with clinical covariates (nodal status, tumour size, tumour grade and immunohistochemical subtype). These findings highlight opportunities for improved patient stratification and biomarker-driven clinical trials.

**Fig. 1: A multistate model of breast-cancer relapse enables individual risk-of-relapse predictions throughout disease progression.**

**Fig. 2: The integrative breast-cancer subtypes exhibit distinct patterns of relapse.**

**Fig. 3: The integrative subtypes improve prediction of late, distant recurrence in ER⁺/HER2⁻ breast cancer beyond clinical covariates.**

**Fig. 4: Organ-specific patterns and timing of distant relapse in ER-positive and ER-negative patients.**

Code availability

All code and scripts are available for academic use at https://github.com/cclab-brca/brcarepred.

Data availability

The genomic copy number, gene-expression and molecular-subtype information has been described previously¹² and is available at the European Genome-Phenome Archive at https://www.ebi.ac.uk/ega/studies/EGAS00000000083. Clinical data are available in Supplementary Tables 5–8. The breast-cancer-recurrence predictor is available as a web application for academic use at https://caldaslab.cruk.cam.ac.uk/brcarepred.

References

1. Blows, F. M. et al. Subtyping of breast cancer by immunohistochemistry to investigate a relationship between subtype and short and long term survival: a collaborative analysis of data for 10,159 cases from 12 studies. PLoS Med. 7, e1000279 (2010).
2. Davies, C. et al. Long-term effects of continuing adjuvant tamoxifen to 10 years versus stopping at 5 years after diagnosis of oestrogen receptor-positive breast cancer: ATLAS, a randomised trial. Lancet 381, 805–816 (2013).
3. Sestak, I. et al. Factors predicting late recurrence for estrogen receptor-positive breast cancer. J. Natl Cancer Inst. 105, 1504–1511 (2013).
4. Sgroi, D. C. et al. Prediction of late distant recurrence in patients with oestrogen-receptor-positive breast cancer: a prospective comparison of the breast-cancer index (BCI) assay, 21-gene recurrence score, and IHC4 in the TransATAC study population. Lancet Oncol. 14, 1067–1076 (2013).
5. Pan, H. et al. 20-year risks of breast-cancer recurrence after stopping endocrine therapy at 5 years. N. Engl. J. Med. 377, 1836–1846 (2017).
6. Dowsett, M. et al. Integration of clinical variables for the prediction of late distant recurrence in patients with estrogen receptor-positive breast cancer treated with 5 years of endocrine therapy: CTS5. J. Clin. Oncol. 36, 1941–1948 (2018).
7. Harris, L. N. et al. Use of biomarkers to guide decisions on adjuvant systemic therapy for women with early-stage invasive breast cancer: American Society of Clinical Oncology clinical practice guideline. J. Clin. Oncol. 34, 1134–1150 (2016).
8. Sledge, G. W. et al. Past, present, and future challenges in breast cancer treatment. J. Clin. Oncol. 32, 1979–1986 (2014).
9. Richman, J. & Dowsett, M. Beyond 5 years: enduring risk of recurrence in oestrogen receptor-positive breast cancer. Nat. Rev. Clin. Oncol. 1, https://doi.org/10.1038/s41571-018-0145-5 (2018).
10. Perou, C. M. et al. Molecular portraits of human breast tumours. Nature 406, 747–752 (2000).
11. Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167 (2009).
12. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).
13. Ali, H. R. et al. Genome-driven integrated classification of breast cancer validated in over 7,500 samples. Genome Biol. 15, 431 (2014).
14. Putter, H., van der Hage, J., de Bock, G. H., Elgalta, R. & van de Velde, C. J. H. Estimation and prediction in a multi-state model for breast cancer. Biom. J. 48, 366–380 (2006).
15. Fisher, B. et al. Significance of ipsilateral breast tumour recurrence after lumpectomy. Lancet 338, 327–331 (1991).
16. Insa, A. et al. Prognostic factors predicting survival from first recurrence in patients with metastatic breast cancer: analysis of 439 patients. Breast Cancer Res. Treat. 56, 67–78 (1999).
17. Putter, H., Fiocco, M. & Geskus, R. B. Tutorial in biostatistics: competing risks and multi-state models. Stat. Med. 26, 2389–2430 (2007).
18. Wishart, G. C. et al. PREDICT: a new UK prognostic model that predicts survival following surgery for invasive breast cancer. Breast Cancer Res. 12, R1 (2010); erratum 12, 401 (2010).
19. Michaelson, J. S. et al. Improved web-based calculators for predicting breast carcinoma outcomes. Breast Cancer Res. Treat. 128, 827–835 (2011).
20. Ormandy, C. J., Musgrove, E. A., Hui, R., Daly, R. J. & Sutherland, R. L. Cyclin D1, EMS1 and 11q13 amplification in breast cancer. Breast Cancer Res. Treat. 78, 323–335 (2003).
21. Sanchez-Garcia, F. et al. Integration of genomic data enables selective discovery of breast cancer drivers. Cell 159, 1461–1475 (2014).
22. Shrestha, Y. et al. PAK1 is a breast cancer oncogene that coordinately activates MAPK and MET signaling. Oncogene 31, 3397–3408 (2012).
23. Holland, D. G. et al. ZNF703 is a common luminal B breast cancer oncogene that differentially regulates luminal and basal progenitors in human mammary epithelium. EMBO Mol. Med. 3, 167–180 (2011).
24. Reis-Filho, J. S. et al. FGFR1 emerges as a potential therapeutic target for lobular breast carcinomas. Clin. Cancer Res. 12, 6652–6662 (2006).
25. Liu, H. et al. Pharmacologic targeting of S6K1 in PTEN-deficient neoplasia. Cell Reports 18, 2088–2095 (2017).
26. Delmore, J. E. et al. BET bromodomain inhibition as a therapeutic strategy to target c-Myc. Cell 146, 904–917 (2011).
27. Pearson, A. et al. High-level clonal FGFR amplification and response to FGFR inhibition in a translational clinical trial. Cancer Discov. 6, 838–851 (2016).
28. Wapnir, I. L. et al. A randomized clinical trial of adjuvant chemotherapy for radically resected locoregional relapse of breast cancer: IBCSG 27-02, BIG 1-02, and NSABP B-37. Clin. Breast Cancer 8, 287–292 (2008).
29. Clark, G. M., Sledge, G. W. Jr, Osborne, C. K. & McGuire, W. L. Survival from first recurrence: relative importance of prognostic factors in 1,015 breast cancer patients. J. Clin. Oncol. 5, 55–61 (1987).
30. Kennecke, H. et al. Metastatic behavior of breast cancer subtypes. J. Clin. Oncol. 28, 3271–3277 (2010).
31. Fix, E. & Neyman, J. A simple stochastic model of recovery, relapse, death and loss of patients. Hum. Biol. 23, 205–241 (1951).
32. Broët, P. et al. Analyzing prognostic factors in breast cancer using a multistate model. Breast Cancer Res. Treat. 54, 83–89 (1999).
33. Meier-Hirmer, C. & Schumacher, M. Multi-state model for studying an intermediate event using time-dependent covariates: application to breast cancer. BMC Med. Res. Methodol. 13, 80 (2013).
34. Therneau, T. M. & Grambsch, P. M. Modeling Survival Data: Extending the Cox Model (Springer, New York, 2000).
35. de Wreede, L. C., Fiocco, M. & Putter, H. mstate: an R package for the analysis of competing risks and multi-state models. J. Stat. Software 38, 1–30 (2011).
36. Klein, J. P., Keiding, N. & Copelan, E. A. Plotting summary predictions in multistate survival models: probabilities of relapse and death in remission for bone marrow transplantation patients. Stat. Med. 12, 2315–2332 (1993).
37. Aalen, O., Borgan, O. & Gjessing, H. Survival and Event History Analysis—A Process Point of View (Springer, New York, 2008).
38. Fiocco, M., Putter, H. & van Houwelingen, H. C. Reduced-rank proportional hazards regression and simulation-based prediction for multi-state models. Stat. Med. 27, 4340–4358 (2008).
39. Hothorn, T., Bretz, F. & Westfall, P. Simultaneous inference in general parametric models. Biom. J. 50, 346–363 (2008).
40. Dunnett, C. W. A multiple comparison procedure for comparing several treatments with a control. J. Am. Stat. Assoc. 50, 1096–1121 (1955).
41. Prentice, R. L., Williams, B. J. & Peterson, A. V. On the regression analysis of multivariate failure time data. Biometrika 68, 373–379 (1981).
42. Harrell, F. E. J. Regression Modeling Strategies (Springer, 2001).
43. Li, Y. et al. Amplification of LAPTM4B and YWHAZ contributes to chemotherapy resistance and recurrence of breast cancer. Nat. Med. 16, 214–218 (2010).
44. Clarke, C. et al. Correlating transcriptional networks to breast cancer survival: a large-scale coexpression analysis. Carcinogenesis 34, 2300–2308 (2013).
45. Loi, S. et al. Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen. BMC Genomics 9, 239 (2008).
46. Nagalla, S. et al. Interactions between immunity, proliferation and molecular subtype in breast cancer prognosis. Genome Biol. 14, R34 (2013).
47. Schmidt, M. et al. The humoral immune system has a key prognostic impact in node-negative breast cancer. Cancer Res. 68, 5405–5413 (2008).
48. Desmedt, C. et al. Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin. Cancer Res. 13, 3207–3214 (2007).
49. Miller, L. D. et al. An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc. Natl Acad. Sci. USA 102, 13550–13555 (2005); correction 102, 17882 (2005).
50. Gautier, L., Cope, L., Bolstad, B. M. & Irizarry, R. A. affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20, 307–315 (2004).
51. Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
52. Gendoo, D. M. A. et al. Genefu: an R/Bioconductor package for computation of gene expression-based signatures in breast cancer. Bioinformatics 32, 1097–1099 (2016).
53. Schröder, M. S., Culhane, A. C., Quackenbush, J. & Haibe-Kains, B. survcomp: an R/Bioconductor package for performance assessment and comparison of survival models. Bioinformatics 27, 3206–3208 (2011).
54. R Core Team. R: A Language and Environment for Statistical Computing. http://www.r-project.org/ (2015).

Acknowledgements

We thank the women who participated in this study and the UK Cancer Registry. O.M.R. was supported by a Cancer Research UK (CRUK) travel grant (SWAH/047) to visit C. Curtis’ laboratory. C.R. is supported by award MTM2015-71217-R. C. Caldas is supported by ECMC, NIHR, the Mark Foundation for Cancer Research and Cancer Research UK Cambridge Centre (C9685/A25177). C. Curtis is supported by the National Institutes of Health through the NIH Director’s Pioneer Award (DP1-CA238296), the American Association for Cancer Research and the Breast Cancer Research Foundation. This study is dedicated to J.M.W. and J.N.W.

Reviewer information

Nature thanks Jeff Gerold, Martin A. Nowak, Peter Van Loo and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

These authors contributed equally: Stephen-John Sammut, Jose A. Seoane.

Authors and Affiliations

Cancer Research UK Cambridge Institute and Department of Oncology, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
Oscar M. Rueda, Stephen-John Sammut, Suet-Feung Chin, Maurizio Callari, Rajbir Batra, Bernard Pereira, Alejandra Bruna, H. Raza Ali, Bin Liu, Paul D. Pharoah & Carlos Caldas
Department of Medicine, Division of Oncology, Stanford University School of Medicine, Stanford, CA, USA
Jose A. Seoane, Jennifer L. Caswell-Jin & Christina Curtis
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Jose A. Seoane & Christina Curtis
Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
Jose A. Seoane & Christina Curtis
Cambridge Breast Unit, Addenbrooke’s Hospital, Cambridge University Hospital NHS Foundation Trust, Cambridge, UK
Elena Provenzano, Paul D. Pharoah & Carlos Caldas
NIHR Cambridge Biomedical Research Centre and Cambridge Experimental Cancer Medicine Centre, Cambridge University Hospital NHS Foundation Trust, Cambridge, UK
Elena Provenzano, Paul D. Pharoah & Carlos Caldas
Research Institute in Oncology and Hematology, Winnipeg, Manitoba, Canada
Michelle Parisien & Leigh Murphy
NIHR Comprehensive Biomedical Research Centre at Guy’s and St Thomas’ NHS Foundation Trust and Research Oncology, Cancer Division, King’s College London, London, UK
Cheryl Gillett & Arnie Purushotham
Department of Molecular Oncology, British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada
Steven McKinney & Samuel Aparicio
Division of Cancer and Stem Cells, School of Medicine, University of Nottingham and Nottingham University Hospital NHS Trust, Nottingham, UK
Andrew R. Green & Ian O. Ellis
Strangeways Research Laboratory, University of Cambridge, Cambridge, UK
Paul D. Pharoah
Departamento de Estadística e Investigación Operativa, Universidad de Valladolid, Valladolid, Spain
Cristina Rueda

Contributions

O.M.R., C. Caldas and C. Curtis conceived the study. O.M.R. performed statistical analyses and implemented the model. J.A.S. compiled the validation cohort and performed statistical analyses. S.-J.S. led the annotation of clinical samples, with input from S.-F.C., M.C., R.B., B.P., A.B., H.R.A., E.P., B.L., M.P., C.G., S.M., A.R.G., L.M., A.P., I.O.E., S.A. and C. Caldas. A.R.G., L.M., A.P., I.O.E., S.A. and C. Caldas provided data. P.D.P. and C.R. provided statistical advice. C. Caldas and S.A. are METABRIC principal investigators. O.M.R., J.A.S., J.L.C.-J., C. Caldas and C. Curtis interpreted the results. O.M.R., J.L.C.-J., C. Caldas and C. Curtis wrote the manuscript, which was approved by all authors. C. Caldas and C. Curtis supervised the study.

Corresponding authors

Correspondence to Carlos Caldas or Christina Curtis.

Ethics declarations

Competing interests

S.A. is founder and shareholder of Contextual Genomic and a scientific advisor to Sangamo Biosciences and Takeda Pharmaceuticals. C. Caldas is a scientific advisor to AstraZeneca-iMed and has received research funding from AstraZeneca, Servier and Genentech/Roche. C. Curtis is a scientific advisory board member and shareholder of GRAIL and consultant for GRAIL and Genentech. A patent application has been filed on aspects of the described work, entitled ‘Methods of treatment based upon molecular characterization of breast cancer’ (C. Curtis, C. Caldas, J.A.S. and O.M.R.).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Description of the cohorts used in this study.

a, Description of the METABRIC discovery cohort, clinical characteristics and flow chart of sample inclusion for analysis. b, Description of the validation cohort, clinical characteristics and flow chart of sample inclusion for analysis. DRFS, distant-relapse-free survival; DSS, disease-specific survival; OS, overall survival; RFS, relapse-free survival. The cohorts are as follows: GSE19615 (DFHCC cohort⁴³), GSE42568 (Dublin cohort⁴⁴), GSE9195 (Guyt2 cohort⁴⁵), GSE45255 (IRB/JNR/NUH cohort⁴⁶), GSE11121 (Mainz cohort⁴⁷), GSE6532 (TAM cohort⁴⁵), GSE7390 (Transbig cohort⁴⁸) and GSE3494 (Upp cohort⁴⁹). NA, not available.

Extended Data Fig. 2 Effect of censoring nonmalignant deaths on the estimation of disease-specific survival, and prognostic value of clinical covariates at different disease states.

a, Cumulative incidence computed as 1 − Kaplan–Meier (KM) estimator, using only disease-specific death as an end point and censoring other types of death. b, Cumulative incidence computed using a competing-risk model that takes into account different causes of death. The bias of the 1 − Kaplan–Meier estimator is visible. c, Distribution of age at the time of diagnosis for ER-negative and ER-positive patients. The number of patients in each group is indicated in all panels. This analysis was done with the full dataset. Box plots were computed using the median of the observations (centre line). The first and third quartiles are shown as boxes, and the whiskers extend to the ±1.58 interquartile range divided by the square root of the sample size. Outliers are shown as dots. d, Log hazard ratios calculated using the multistate model stratified by ER status (n = 3,147) for different covariates, namely grade, lymph-node (LN) status, tumour size (size), time from surgery and time from local relapse (LR). Log hazard ratios are shown for different states, including post-surgery (PS; hazard ratio of progressing to relapse or disease-specific death, DSD), locoregional recurrence (LR; hazard ratio of progressing to distant relapse or DSD) and distant recurrence (DR; hazard ratio of cancer-specific death). 95% confidence intervals are shown. This analysis was done with the full dataset.
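
The contrast between panels a and b can be illustrated with a small sketch. The Python below is not the authors' code (their scripts are in R); it is a minimal illustration on made-up toy data, with hypothetical function names, of why 1 − Kaplan–Meier tends to overstate cumulative incidence when deaths from other causes are treated as censoring, compared with a competing-risk (Aalen–Johansen-type) estimator.

```python
# Event codes (an assumption of this sketch, not taken from the paper):
# 0 = censored, 1 = disease-specific death, 2 = death from other causes.

def _counts(times, events, t, cause):
    """Number at risk at t, deaths from `cause` at t, deaths from any cause at t."""
    at_risk = sum(1 for u in times if u >= t)
    d_cause = sum(1 for u, e in zip(times, events) if u == t and e == cause)
    d_any = sum(1 for u, e in zip(times, events) if u == t and e != 0)
    return at_risk, d_cause, d_any


def one_minus_km(times, events, cause=1):
    """Naive estimate: 1 - Kaplan-Meier, treating every other outcome as censoring."""
    surv = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk, d_cause, _ = _counts(times, events, t, cause)
        surv *= 1.0 - d_cause / at_risk
        curve.append((t, 1.0 - surv))
    return curve


def competing_risk_cif(times, events, cause=1):
    """Cumulative incidence of `cause`, accounting for all other causes of death."""
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        at_risk, d_cause, d_any = _counts(times, events, t, cause)
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve


# Toy follow-up times (years) and outcomes for six hypothetical patients.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [2, 1, 2, 1, 0, 1]

naive = one_minus_km(toy_times, toy_events)
adjusted = competing_risk_cif(toy_times, toy_events)
print("1 - KM at end of follow-up:       ", round(naive[-1][1], 3))
print("competing-risk CIF at end:        ", round(adjusted[-1][1], 3))
```

On this toy data the naive estimate is never below the competing-risk estimate at any time point, which is the direction of bias the caption refers to.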

Extended Data Fig. 3 Model calibration and validation in an external dataset.

a, Internal validation of the global predictions of the models on all transitions using bootstrap (n = 200). Discriminant measures of predictive ability are shown on the x axis, as described in the Methods section ‘Model validation and calibration’. The y axis shows the optimism, that is, the difference between the training predictive ability and the test predictive ability of the discriminant measures (see Methods). b, Internal calibration of the global predictions of the models on all transitions using bootstrap (n = 200). The distribution of the mean absolute error between observed and predicted is plotted. c, External calibration of DSD risk and nonmalignant death risk using PREDICT 2.1 (n = 1,841). The distribution of the mean absolute error between the predictions of PREDICT and our model based on ER status only is plotted. a–c, Box plots were computed using the median of the observations (centre line). The first and third quartiles are shown as boxes, and the whiskers extend to the ±1.58 interquartile range divided by the square root of the sample size (see Methods). d, Scatter plot of the predictions of DSD risk computed by PREDICT and our model based on the IntClust subtypes only at ten years (n = 1,841; see Methods). The Pearson correlation is shown. e, Concordance index (C-index) of prediction of risk of distant relapse (DRFS), disease-specific death (disease-specific survival, DSS), death (overall survival, OS) and relapse (RFS) in the 178 withheld METABRIC samples and in a metacohort composed of eight published studies among ER⁺/HER2⁻ patients in the high-risk IntClust subtypes, where results are shown for individual cohorts and the combined metacohort (see Methods and Supplementary Information). Error bars correspond to 95% confidence intervals for the C-index. The number of patients in each group is indicated on the right.
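
As a rough illustration of the two quantities in this caption, the sketch below computes a Harrell-style concordance index and the mean bootstrap optimism (training minus test predictive ability). It is an assumption-laden toy in Python, not the authors' implementation: the function names and the data are hypothetical, and the estimator used in the paper may differ in detail (for example in its handling of ties and censoring).

```python
def c_index(times, events, risk):
    """Harrell-style C-index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before the follow-up time of patient j.
    Ties in predicted risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


def mean_optimism(train_test_pairs):
    """Average of (training score - test score) over bootstrap replicates."""
    diffs = [train - test for train, test in train_test_pairs]
    return sum(diffs) / len(diffs)


# Hypothetical follow-up times, event indicators and predicted risks.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = c_index(toy_times, toy_events, toy_risk)

# Hypothetical (training, test) C-index values from two bootstrap replicates.
toy_optimism = mean_optimism([(0.80, 0.75), (0.78, 0.76)])
print(round(toy_c, 3), round(toy_optimism, 3))
```

An optimism close to zero suggests that the apparent (training) performance is not much inflated by overfitting.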

Extended Data Fig. 4 Different subtypes have distinct probabilities of recurrence.

a, Average probability of experiencing a distant relapse (defined as the probability of having a distant relapse at any point followed by any other transition) or cancer-related death for the high-risk ER⁺ IntClust (IC) subtypes (IC1 n = 134, IC6 n = 81, IC9 n = 134, IC2 n = 69) relative to IC3 (n = 269), the ER⁺ subgroup with the best prognosis. This analysis was restricted to ER⁺/HER2⁻ cases, which represent the vast majority for each of these subtypes. Error bars represent 95% confidence intervals around the mean. b, As for a, but showing the average probability of experiencing distant recurrence or cancer-related death after a local recurrence (IC1 n = 21, IC6 n = 10, IC9 n = 21, IC2 n = 13, IC3 n = 30). c, Average probability of recurrence (distant relapse or cancer-specific death) after locoregional relapse for all patients in each of the 11 IntClust subtypes. d, Median time until an additional relapse (distant recurrence or cancer-specific death) after local recurrence for all patients in each of the 11 IntClust subtypes (n = 270). This has been computed using a Kaplan–Meier approach with competing risks of progression and nonmalignant death. Error bars represent 95% confidence intervals around the median time. Asterisks denote situations in which the median time cannot be computed because fewer than 50% of the patients relapsed. This analysis was done with the molecular dataset. e, Average probability of cancer-related death after distant recurrence for all patients by subtype. f, As for d, except that the median time until cancer-specific death after distant recurrence is shown (n = 596). g, Mean probabilities of relapse after surgery and after five and ten disease-free years (see Methods and Supplementary Table 4) for the patients in each of the four IHC subtypes. Error bars represent 95% confidence intervals. The number of patients in each group is indicated. h–k, As for c–f, but for the IHC subtypes (same sample sizes). l, As for g, but for the PAM50 subtypes. The number of patients in each group is indicated. m–p, As for h–k, but for the PAM50 subtypes (with the same sample sizes, except for p where n = 593).

Extended Data Fig. 5 The ER⁻/HER2⁻ integrative subtypes exhibit distinct risks of relapse.

The probabilities of distant relapse or cancer-related death among ER⁻/HER2⁻ patients who were disease-free at five years after diagnosis reveal marked differences in the risk of relapse for the triple-negative breast cancer (TNBC) IntClust subtype IC4ER⁻ versus the IC10 (basal-like enriched) subtype. Here the base clinical model with IHC subtypes is compared with the base clinical model plus IntClust subtype information. Error bars represent 95% confidence intervals. The number of patients in each group is indicated.

Extended Data Fig. 6 Subtype-specific risks of relapse after locoregional relapse.

Transition probabilities from locoregional recurrence to other states for individual average patients, stratified on the basis of ER, IHC, PAM50 or IntClust subtype. 95% confidence bands were computed using bootstrap. This analysis was done with the full dataset for the comparisons between ER⁺ and ER⁻, and the molecular dataset for the remainder.

Extended Data Fig. 7 Associations between probabilities of distant relapse ten years after locoregional relapse with clinico-pathological and molecular features of the primary tumour.

For each patient that had a locoregional recurrence, the ten-year probability of having a distant relapse or cancer-related death is plotted against different variables. A loess fit is overlaid to highlight the relationship between the probability and tumour size or time of relapse. Box plots were computed using the median of the observations (centre line). The first and third quartiles are shown as boxes, and the whiskers extend to the ±1.58 interquartile range divided by the square root of the sample size. Outliers are shown as dots. This analysis was done with the molecular dataset and the model was stratified by IntClust subtype (n = 257).

Extended Data Fig. 8 Subtype-specific risks of cancer-related death after a distant relapse.

Transition probabilities from distant relapse to other states for individual average patients stratified on the basis of ER, IHC, PAM50 or IntClust subtype. 95% confidence bands were computed using bootstrap. This analysis was done with the full dataset for the comparisons between ER⁺ and ER⁻, and the molecular dataset for the remainder.

Extended Data Fig. 9 Distribution of the number of relapses by molecular subtype.

a, Times of distant recurrence for ER⁻ and ER⁺ patients (n = 605). Each dot represents a distant recurrence, coded by colour for different sites. b, Distribution of the number of distant relapses for different subtypes (n = 609), based on ER status (ER⁺ n = 422, ER⁻ n = 187), IHC ER/HER2 status (ER⁺/HER2⁻ n = 263, ER⁻/HER2⁻ n = 82, ER⁺/HER2⁺ n = 36, ER⁻/HER2⁺ n = 41), PAM50 subtype (normal n = 33, luminal A n = 101, luminal B n = 138, basal n = 79, HER2 n = 69) and IntClust subtype (IC1 n = 40, IC2 n = 25, IC3 n = 32, IC4ER⁺ n = 46, IC4ER⁻ n = 16, IC5 n = 72, IC6 n = 23, IC7 n = 24, IC8 n = 54, IC9 n = 38, IC10 n = 52). ER status was imputed on the basis of expression in four samples. These analyses were done with the recurrent-events cohort.

Extended Data Fig. 10 Site-specific patterns of relapse in the IHC, PAM50 and IntClust subtypes.

a, Left, percentages of patients with metastases at a given site in the IHC subtypes (bar plots, total numbers also indicated). Upright triangles indicate significant positive differences in that group with respect to the overall mean and inverted triangles indicate significant negative differences in that group with respect to the overall mean using simultaneous testing of all sites (see Methods). Location of metastatic sites is not anatomically accurate. Right, cumulative incidence functions (as 1 − Kaplan–Meier estimates) for each site of metastasis in the IHC subtypes. The same patient can have multiple sites of metastasis. b, As for a, but for the PAM50 subtypes. c, As for a, but for the IntClust subtypes. These analyses were done with the recurrent-events cohort. Female silhouettes are from the public-domain human body diagrams at https://commons.wikimedia.org/wiki/Human_body_diagrams.
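The cumulative incidence curves in this legend are described as 1 − Kaplan–Meier estimates. A minimal Python sketch of that calculation for a single metastatic site follows; the function name `one_minus_km`, the event coding and the toy follow-up times are hypothetical, and patients without the event are simply treated as censored.

```python
def one_minus_km(times, events):
    """Return [(t, 1 - S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if metastasis at the site was observed at that time, 0 if censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            # Kaplan-Meier product-limit step.
            survival *= 1 - n_events / at_risk
            curve.append((t, 1 - survival))
        at_risk -= n_events + n_censored
    return curve


# Toy example: four patients, one censored at time 2.
curve = one_minus_km([1, 2, 3, 4], [1, 0, 1, 1])
```

Since the same patient can have metastases at several sites, a curve of this kind would be computed separately for each site.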

Supplementary information

Supplementary Information

Supplementary Methods.

Reporting Summary

Supplementary Table 1

Summary of clinico-pathological features of the cohort according to ER status (based on the full dataset) and for the IHC, PAM50 and IntClust subtypes (based on the molecular dataset).

Supplementary Table 2

Number of transitions between each state in the multistate model according to ER status (based on the full dataset) and for the IHC, PAM50 and IntClust subtypes (based on the molecular dataset).

Supplementary Table 3

Proportion of cases classified into each IntClust subtype mapping onto the IHC and PAM50 subtypes within the molecular dataset.

Supplementary Table 4

Transition probabilities and standard errors for each of the breast cancer subgroups. a, Predictions for each subgroup were computed taking the average and the standard deviation of the probabilities of all patients in each group. Standard deviations represent variability within each subtype. The probabilities of any transition ending up in a relapse group and all transitions visiting that state of the multistate model are included for patients stratified by ER status (based on the full dataset) and for the IHC, PAM50, and IntClust subtypes (based on the molecular dataset). b, Predictions for an average individual from each subgroup. These probabilities are computed by selecting an average individual and predicting the trajectory between each state of the multistate model in the corresponding dataset for the distinct subtypes. The probabilities for staying in relapse are omitted for clarity and can be computed as one minus the sum of moving to the rest of the states. Standard errors represent uncertainty in the individual predictions.
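The two calculations described for Supplementary Table 4 can be sketched in Python as below. All names and numbers are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, because the table description does not say which form was used.

```python
import statistics


def subgroup_summary(probs_by_subgroup):
    """Mean and standard deviation of per-patient probabilities in each subgroup.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return {
        subgroup: (statistics.mean(probs), statistics.stdev(probs))
        for subgroup, probs in probs_by_subgroup.items()
    }


def prob_staying(leaving_probs):
    """Probability of remaining in the current state.

    Computed as one minus the sum of the probabilities of moving to each
    of the other states.
    """
    return 1.0 - sum(leaving_probs.values())


# Hypothetical per-patient probabilities for two subgroups.
summary = subgroup_summary({"IC1": [0.2, 0.4], "IC2": [0.5, 0.5, 0.8]})

# Hypothetical transition probabilities out of a relapse state.
stay = prob_staying({"cancer_death": 0.5, "other_death": 0.25})
```

The standard deviation in part a describes spread between patients within a subtype, whereas the standard errors in part b describe uncertainty in a single predicted trajectory, so the two are not interchangeable.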

Supplementary Table 5

Clinical information for the full dataset.

Supplementary Table 6

Clinical information for the molecular dataset.

Supplementary Table 7

Clinical information for the recurrent-events dataset.

Supplementary Table 8

Description of clinical variables provided in Supplementary Tables 5–7 for the full, molecular and recurrent-events datasets.

About this article

Cite this article

Rueda, O.M., Sammut, SJ., Seoane, J.A. et al. Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature 567, 399–404 (2019). https://doi.org/10.1038/s41586-019-1007-8

Received: 04 July 2018
Accepted: 31 January 2019
Published: 13 March 2019
Issue Date: 21 March 2019
DOI: https://doi.org/10.1038/s41586-019-1007-8
