Abstract
Hormone receptor-positive (HR+)/human epidermal growth factor receptor 2-negative (HER2−) breast cancer is the most prevalent type of breast cancer, in which endocrine therapy resistance and distant relapse remain unmet challenges. Accurate molecular classification is urgently required for guiding precision treatment. We established a large-scale multi-omics cohort of 579 patients with HR+/HER2− breast cancer and identified the following four molecular subtypes: canonical luminal, immunogenic, proliferative and receptor tyrosine kinase (RTK)-driven. Tumors of these four subtypes showed distinct biological and clinical features, suggesting subtype-specific therapeutic strategies. The RTK-driven subtype was characterized by the activation of the RTK pathways and associated with poor outcomes. The immunogenic subtype had enriched immune cells and could benefit from immune checkpoint therapy. In addition, we developed convolutional neural network models to discriminate these subtypes based on digital pathology for potential clinical translation. The molecular classification provides insights into molecular heterogeneity and highlights the potential for precision treatment of HR+/HER2− breast cancer.
Data availability
The WES data, CNA data, RNA sequencing data and metabolome data for this study have been deposited into the Genome Sequence Archive (GSA) database under accession code PRJCA017539 (https://ngdc.cncb.ac.cn/bioproject/browse/PRJCA017539). TMT-based mass spectrometry (MS)-quantified protein data have been submitted to iProX (https://www.iprox.cn) under accession code IPX0006535000. Human Primary Cell Atlas data were obtained from the celldex package (v1.11; https://github.com/LTLA/celldex). The TCGA, METABRIC and CPTAC data were downloaded from the cBioPortal website (https://www.cbioportal.org/). Source data are provided with this paper.
Code availability
All data were analyzed and processed using published software packages whose details are provided and cited either in the Methods section or the Supplementary Note. The CNN models and code from this manuscript are available at GitHub (https://github.com/yifanzhou330/SNF) and Zenodo (https://doi.org/10.5281/zenodo.8022438) (ref. 87).
References
Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2020. CA Cancer J. Clin. 70, 7–30 (2020).
Huppert, L. A., Gumusay, O., Idossa, D. & Rugo, H. S. Systemic therapy for hormone receptor-positive/human epidermal growth factor receptor 2-negative early stage and metastatic breast cancer. CA Cancer J. Clin. 73, 480–515 (2023).
Ma, C. X., Reinert, T., Chmielewska, I. & Ellis, M. J. Mechanisms of aromatase inhibitor resistance. Nat. Rev. Cancer 15, 261–275 (2015).
Dowsett, M. et al. Meta-analysis of breast cancer outcomes in adjuvant trials of aromatase inhibitors versus tamoxifen. J. Clin. Oncol. 28, 509–518 (2010).
Pan, H. et al. 20-year risks of breast-cancer recurrence after stopping endocrine therapy at 5 years. N. Engl. J. Med. 377, 1836–1846 (2017).
Park, Y. H. et al. Patterns of relapse and metastatic spread in HER2-overexpressing breast cancer according to estrogen receptor status. Cancer Chemother. Pharmacol. 66, 507–516 (2010).
Pusztai, L. et al. Durvalumab with olaparib and paclitaxel for high-risk HER2-negative stage II/III breast cancer: results from the adaptively randomized I-SPY2 trial. Cancer Cell 39, 989–998 (2021).
Nanda, R. et al. Effect of pembrolizumab plus neoadjuvant chemotherapy on pathologic complete response in women with early-stage breast cancer: an analysis of the ongoing phase 2 adaptively randomized I-SPY2 trial. JAMA Oncol. 6, 676–684 (2020).
Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).
Cancer Genome Atlas Network Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
Jiang, Y. Z. et al. Genomic and transcriptomic landscape of triple-negative breast cancers: subtypes and treatment strategies. Cancer Cell 35, 428–440 (2019).
Jiang, Y. Z. et al. Molecular subtyping and genomic profiling expand precision medicine in refractory metastatic triple-negative breast cancer: the FUTURE trial. Cell Res. 31, 178–186 (2021).
Gluz, O. et al. West German Study Group phase III plan B trial: first prospective outcome data for the 21-gene recurrence score assay and concordance of prognostic markers by central and local pathology assessment. J. Clin. Oncol. 34, 2341–2349 (2016).
Wang, L. B. et al. Proteogenomic and metabolomic characterization of human glioblastoma. Cancer Cell 39, 509–528 (2021).
East, M. P., Laitinen, T. & Asquith, C. R. M. PIP5K1A: a potential target for cancers with KRAS or TP53 mutations. Nat. Rev. Drug Discov. 19, 436 (2020).
Semba, S. et al. Down-regulation of PIK3CG, a catalytic subunit of phosphatidylinositol 3-OH kinase, by CpG hypermethylation in human colorectal carcinoma. Clin. Cancer Res. 8, 3824–3831 (2002).
Repana, D. et al. The Network of Cancer Genes (NCG): a comprehensive catalogue of known and candidate cancer genes from cancer sequencing screens. Genome Biol. 20, 1 (2019).
Sondka, Z. et al. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).
Johnston, S. R. D. et al. Abemaciclib combined with endocrine therapy for the adjuvant treatment of HR+, HER2−, node-positive, high-risk, early breast cancer (monarchE). J. Clin. Oncol. 38, 3987–3998 (2020).
Sledge, G. W. Jr. et al. The effect of abemaciclib plus fulvestrant on overall survival in hormone receptor-positive, ERBB2-negative breast cancer that progressed on endocrine therapy-MONARCH 2: a randomized clinical trial. JAMA Oncol. 6, 116–124 (2020).
Turner, N. C. et al. Overall survival with palbociclib and fulvestrant in advanced breast cancer. N. Engl. J. Med. 379, 1926–1936 (2018).
Slamon, D. J. et al. Phase III randomized study of ribociclib and fulvestrant in hormone receptor-positive, human epidermal growth factor receptor 2-negative advanced breast cancer: MONALEESA-3. J. Clin. Oncol. 36, 2465–2472 (2018).
Mayer, E. L. et al. Palbociclib with adjuvant endocrine therapy in early breast cancer (PALLAS): interim analysis of a multicentre, open-label, randomised, phase 3 study. Lancet Oncol. 22, 212–222 (2021).
Mirza, M. R. et al. Niraparib maintenance therapy in platinum-sensitive, recurrent ovarian cancer. N. Engl. J. Med. 375, 2154–2164 (2016).
Maman, S. & Witz, I. P. A history of exploring cancer in context. Nat. Rev. Cancer 18, 359–376 (2018).
Newman, A. M. et al. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 12, 453–457 (2015).
Becht, E. et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol. 17, 218 (2016).
Bartok, O. et al. Anti-tumour immunity induces aberrant peptide presentation in melanoma. Nature 590, 332–337 (2021).
Burstein, H. J. Systemic therapy for estrogen receptor-positive, HER2-negative breast cancer. N. Engl. J. Med. 383, 2557–2570 (2020).
Du, Z. & Lovly, C. M. Mechanisms of receptor tyrosine kinase activation in cancer. Mol. Cancer 17, 58 (2018).
Östman, A. PDGF receptors in tumor stroma: biological effects and associations with prognosis and response to treatment. Adv. Drug Deliv. Rev. 121, 117–123 (2017).
Krug, K. et al. Proteogenomic landscape of breast cancer tumorigenesis and targeted therapy. Cell 183, 1436–1456 (2020).
Gui, Y. et al. Metastatic breast carcinoma-associated fibroblasts have enhanced protumorigenic properties related to increased IGF2 expression. Clin. Cancer Res. 25, 7229–7242 (2019).
Bertero, T. et al. Tumor-stroma mechanics coordinate amino acid availability to sustain tumor growth and malignancy. Cell Metab. 29, 124–140 (2019).
Jungwirth, U. et al. Impairment of a distinct cancer-associated fibroblast population limits tumour growth and metastasis. Nat. Commun. 12, 3516 (2021).
Perrone, F. et al. PDGFRA, PDGFRB, EGFR, and downstream signaling activation in malignant peripheral nerve sheath tumor. Neuro Oncol. 11, 725–736 (2009).
Lin, N. U. & Winer, E. P. Advances in adjuvant endocrine therapy for postmenopausal women. J. Clin. Oncol. 26, 798–805 (2008).
Hanker, A. B., Sudhan, D. R. & Arteaga, C. L. Overcoming endocrine resistance in breast cancer. Cancer Cell 37, 496–513 (2020).
Loibl, S. et al. Palbociclib for residual high-risk invasive HR-positive and HER2-negative early breast cancer—the Penelope-B trial. J. Clin. Oncol. 39, 1518–1530 (2021).
Pereira, B. et al. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nat. Commun. 7, 11479 (2016).
Razavi, P. et al. The genomic landscape of endocrine-resistant advanced breast cancers. Cancer Cell 34, 427–438 (2018).
Patten, D. K. et al. Enhancer mapping uncovers phenotypic heterogeneity and evolution in patients with luminal breast cancer. Nat. Med. 24, 1469–1480 (2018).
Ades, F. et al. Luminal B breast cancer: molecular characterization, clinical management, and future perspectives. J. Clin. Oncol. 32, 2794–2803 (2014).
Gatza, M. L., Silva, G. O., Parker, J. S., Fan, C. & Perou, C. M. An integrated genomics approach identifies drivers of proliferation in luminal-subtype human breast cancer. Nat. Genet. 46, 1051–1059 (2014).
Kim, J. A. et al. Comprehensive functional analysis of the tousled-like kinase 2 frequently amplified in aggressive luminal breast cancers. Nat. Commun. 7, 12991 (2016).
Saito, Y. et al. LLGL2 rescues nutrient stress by promoting leucine uptake in ER+ breast cancer. Nature 569, 275–279 (2019).
Golden, E. et al. The oncogene AAMDC links PI3K-AKT-mTOR signaling with metabolic reprograming in estrogen receptor-positive breast cancer. Nat. Commun. 12, 1920 (2021).
Huang, C. et al. Proteogenomic insights into the biology and treatment of HPV-negative head and neck squamous cell carcinoma. Cancer Cell 39, 361–379 (2021).
Gillette, M. A. et al. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell 182, 200–225 (2020).
Petralia, F. et al. Integrated proteogenomic characterization across major histological types of pediatric brain cancer. Cell 183, 1962–1985 (2020).
Modi, S. et al. Trastuzumab deruxtecan in previously treated HER2-low advanced breast cancer. N. Engl. J. Med. 387, 9–20 (2022).
Dijkstra, K. K. et al. Generation of tumor-reactive T cells by co-culture of peripheral blood lymphocytes and tumor organoids. Cell 174, 1586–1598 (2018).
Neal, J. T. et al. Organoid modeling of the tumor immune microenvironment. Cell 175, 1972–1988 (2018).
Gao, Q. et al. Driver fusions and their implications in the development and treatment of human cancers. Cell Rep. 23, 227–238 (2018).
Hammond, M. E. et al. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for immunohistochemical testing of estrogen and progesterone receptors in breast cancer (unabridged version). Arch. Pathol. Lab. Med. 134, e48–e72 (2010).
Salgado, R. et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann. Oncol. 26, 259–271 (2015).
Parker, J. S. et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27, 1160–1167 (2009).
Ciriello, G. et al. Comprehensive molecular portraits of invasive lobular breast cancer. Cell 163, 506–519 (2015).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
Chen, B., Khodadoust, M. S., Liu, C. L., Newman, A. M. & Alizadeh, A. A. Profiling tumor infiltrating immune cells with CIBERSORT. Methods Mol. Biol. 1711, 243–259 (2018).
Hanzelmann, S., Castelo, R. & Guinney, J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics 14, 7 (2013).
Xiao, Y. et al. Multi-omics profiling reveals distinct microenvironment characterization and suggests immune escape mechanisms of triple-negative breast cancer. Clin. Cancer Res. 25, 5002–5014 (2019).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Whitfield, M. L. et al. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol. Biol. Cell 13, 1977–2000 (2002).
Timms, K. M. et al. Association of BRCA1/2 defects with genomic scores predictive of DNA damage repair deficiency among breast cancer subtypes. Breast Cancer Res. 16, 475 (2014).
Telli, M. L. et al. Homologous recombination deficiency (HRD) score predicts response to platinum-containing neoadjuvant chemotherapy in patients with triple-negative breast cancer. Clin. Cancer Res. 22, 3764–3773 (2016).
Abkevich, V. et al. Patterns of genomic loss of heterozygosity predict homologous recombination repair defects in epithelial ovarian cancer. Br. J. Cancer 107, 1776–1782 (2012).
Birkbak, N. J. et al. Telomeric allelic imbalance indicates defective DNA repair and sensitivity to DNA-damaging agents. Cancer Discov. 2, 366–375 (2012).
Popova, T. et al. Ploidy and large-scale genomic instability consistently identify basal-like breast carcinomas with BRCA1/2 inactivation. Cancer Res. 72, 5454–5462 (2012).
Ock, C. Y. et al. Genomic landscape associated with potential response to anti-CTLA-4 treatment in cancers. Nat. Commun. 8, 1050 (2017).
Goode, A., Gilbert, B., Harkes, J., Jukic, D. & Satyanarayanan, M. OpenSlide: a vendor-neutral software foundation for digital pathology. J. Pathol. Inform. 4, 27 (2013).
Zhao, S. et al. Deep learning framework for comprehensive molecular and prognostic stratifications of triple-negative breast cancer. Fundam. Res. (2022).
Paszke, A., Gross, S., Massa, F., Lerer, A. & Chintala, S. PyTorch: an imperative style, high-performance deep learning library. Proceedings of the 33rd International Conference on Neural Information Processing Systems Article 721 (Curran Associates Inc., 2019).
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A. & Torralba, A. Learning deep features for discriminative localization. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2921–2929 (IEEE, 2016).
Migliozzi, S. et al. Integrative multi-omics networks identify PKCδ and DNA-PK as master kinases of glioblastoma subtypes and guide targeted cancer therapy. Nat. Cancer 4, 181–202 (2023).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Sachs, N. et al. A living biobank of breast cancer organoids captures disease heterogeneity. Cell 172, 373–386 (2018).
Grunwald, B. T. et al. Spatially confined sub-tumor microenvironments in pancreatic cancer. Cell 184, 5577–5592 (2021).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
McGinnis, C., Murrow, L. & Gartner, Z. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 8, 329–337 (2019).
Lun, A. T. L. et al. EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Genome Biol. 20, 63 (2019).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Aran, D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 20, 163–172 (2019).
Karlsson, M. et al. A single-cell type transcriptomics map of human tissues. Sci. Adv. 7, eabh2169 (2021).
Wu, S. Z. et al. A single-cell and spatially resolved atlas of human breast cancers. Nat. Genet. 53, 1334–1347 (2021).
Zhou, Y. Molecular classification of hormone receptor-positive HER2-negative breast cancer. Zenodo https://doi.org/10.5281/zenodo.8022438 (2023).
Acknowledgements
We are grateful to the patients and their families who contributed to this study. This work was supported by grants from the National Key Research and Development Project of China (2021YFF1201300), the National Natural Science Foundation of China (82341003, 91959207, 92159301, 82272822, 82272704 and 82103039), the Shanghai Key Laboratory of Breast Cancer (12DZ2260100), the Shanghai Hospital Development Center (SHDC) Municipal Project for Developing Emerging and Frontier Technology in Shanghai Hospitals (SHDC12021103), the Program of Shanghai Academic/Technology Research Leader (20XD1421100), the Natural Science Foundation of Shanghai (22ZR1479200 and 23ZR1411800), Youth Talent Program of Shanghai Health Commission (2022YQ012), China Postdoctoral Science Foundation (2022M720790), Shanghai Sailing Program (20YF1408700) and Youth Medical Talents of Shanghai (WJWRC202014). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
X.J., Y.Z.J. and Z.M.S. conceived and designed the study. Y.F.Z., D.M., C.J.L. and C.L.L. performed the proteomics and contributed to the data processing and analyses. X.J. and Y.Z.J. wrote the first draft and organized the figures. S.Z. reviewed the pathological sections and performed the deep-learning-based digital pathology. Y.X. and W.X.X. performed the metabolomics. T.F. performed scRNA-seq. Y.Y.C. carried out IHC experiments and PDO assays. Y.Q.L., Q.W.C., Y.Y., J.X.S., L.M.S. and W.H. performed the WES, OncoScan and RNA sequencing. J.F.R. and Z.M.S. supervised all aspects of the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Landscape of FUSCC HR+/HER2- breast cancer cohort, related to Fig. 1.
(a, b) Schematic overview of multi-omics data acquired for this cohort. (c) The proportion of PAM50 subtypes. (d) Determination of optimal cluster number. (e) Silhouette plot of SNF clustering. (f) Summary of adjusted P values for differences in each multi-omic feature among subtypes under different clustering strategies. NS: not significant. Mut: mutation. Amp: amplification. Met: metabolite. HRD: homologous recombination deficiency. Bold font indicates the clustering strategy we used. (g) Distribution of PAM50 subtypes among SNF subtypes. P = 5e-04. P values were from the two-sided Fisher’s exact test.
Extended Data Fig. 2 SNF subtype-specific metabolomic features, related to Fig. 2.
(a) Global differences in metabolic gene expression between tumors and normal tissues in the luminal cohort. The distribution distances (r.m.s.d.) were calculated between tumors and corresponding normal tissues (red), different samples of tumor tissues (yellow), and different samples of normal tissues (blue). The inset shows the average distances between pairs of tissues as a percentage of the average distance between tumors and normal tissues. Tumor samples n = 351, normal samples n = 11. P(T VS N - T VS T) < 2.2e-16, P(T VS N - N VS N) < 2.2e-16, P(T VS T - N VS N) < 2.2e-16. P values were from the two-sided Wilcoxon rank-sum test and Kruskal-Wallis test. (b) Heatmap illustrating subtype-specific metabolic genes. P values were from the two-sided Kruskal-Wallis test. (c) Illustration of subtype-specific polar metabolite subclasses. P(Amino acid) = 4.952e-05, P(Carbohydrates) = 0.4084, P(Lipid) = 0.02426, P(Nucleotide) = 0.3861, P(Other) = 0.6101, P(Peptide) = 0.0002545, P(Vitamins and Cofactors) = 0.8432, P(Xenobiotics) = 0.7095. P values were from the two-sided Kruskal-Wallis test. SNF1 subtype n = 86 biologically independent samples, SNF2 subtype n = 89 biologically independent samples, SNF3 subtype n = 118 biologically independent samples, SNF4 subtype n = 58 biologically independent samples, normal samples n = 11. (d) Illustration of subtype-specific lipid subclasses. P(FA) = 0.0005, P(GL) = 0.008, P(GP) = 9.113e-14, P(SP) = 0.0005, P(ST) = 0.0005. P values were from the two-sided Kruskal-Wallis test. SNF1 subtype n = 86 biologically independent samples, SNF2 subtype n = 89 biologically independent samples, SNF3 subtype n = 118 biologically independent samples, SNF4 subtype n = 58 biologically independent samples, normal samples n = 11. FA: Fatty acyls. GL: Glycerolipids. GP: Glycerophospholipids. SP: Sphingolipids. ST: Sterol Lipids.
In all boxplots, the center lines represent median values; the bounds of the boxplot represent the interquartile ranges; the whiskers show the range of the data. All P values were adjusted using the Benjamini‒Hochberg procedure. ***FDR < 0.001; **0.001 ≤ FDR < 0.01; *0.01 ≤ FDR < 0.05; ns, FDR ≥ 0.05.
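The Benjamini‒Hochberg adjustment and the star notation used in these captions can be illustrated with a minimal Python sketch. The function names and the input P values are hypothetical and chosen only for illustration; they are not taken from the study's code or data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (FDR), in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_label(fdr):
    """Map an adjusted P value to the star notation used in the captions."""
    if fdr < 0.001:
        return "***"
    if fdr < 0.01:
        return "**"
    if fdr < 0.05:
        return "*"
    return "ns"


# Hypothetical raw P values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
labels = [fdr_label(q) for q in fdr]
```

Adjusting before assigning stars matters because the thresholds in the captions are stated for the FDR, not for the raw P value.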
Extended Data Fig. 3 The associations of polar metabolites and lipids with genomic features, related to Fig. 2.
(a) Heatmap showing the associations between the abundances of metabolites and the presence of mutations within the indicated genes. The mutations include high frequency somatic mutations (mutated in at least 6% of the cases in at least one SNF subtype) within cancer-related genes and high frequency germline mutations in BRCA1 and BRCA2. T statistics were calculated by a linear regression model that adjusted for the confounding factors. (b) Correlations between TP53 mutations and deoxyinosine (top panel) and OxPG levels (bottom panel). All samples were ordered based on the abundance (y-axis) of deoxyinosine (top panel) or OxPG (bottom panel), and those with TP53 mutations were highlighted in red and indicated by the corresponding lines displayed on the x-axis. Two-sided T statistics were calculated. (c) Heatmap showing the associations between the abundances of metabolites and copy number values of SCNA peaks. T statistics were calculated by a linear regression model that adjusted for the confounding factors. (d) Top panel: correlations between the copy number values of 8q23.3 and the abundances of deoxyinosine, ribothymidine, uracil and some amino acids. Bottom panel: correlations between the copy number values of 1q32.1 and the abundances of uridine and D-pantethine. SCNA-related metabolites were shown as lines, and samples were ordered by increasing copy number values. The abundances of the metabolites were illustrated in colors. (e) Heatmap showing the correlations between the mRNA expression of cell cycle-related genes (y-axis) and the abundances of metabolites (x-axis). T statistics were calculated by a linear regression model that adjusted for the confounding factors. (f) The correlation of deoxyinosine and dUMP abundance with AURKA mRNA expression. P(AURKA-Deoxyinosine) < 2.2e-16, P(CCND3-dUMP) = 3.2e-4. P values were from two-sided Pearson’s correlation analysis. ***FDR < 0.001; **0.001 ≤ FDR < 0.01; *0.01 ≤ FDR < 0.05.
Extended Data Fig. 4 Extended analysis of clinical status of four subtypes, related to Fig. 3.
(a) Association of the SNF subtypes with different clinical statuses. P values were from the two-sided Fisher’s exact test. (b) Association of the SNF subtypes with relapse-free survival (RFS). P = 9.3e-04. (c) Association of the SNF subtypes with metastasis-free survival (MFS) in PAM50 Luminal B patients. (d) Forest plot of univariate Cox regression analysis for MFS adjusting for tumor size, lymph node status, SNF subtypes, chemotherapy, histological grade. The included patients all received endocrine therapy (n = 296). The hazard ratios (HR) were shown with 95% confidence intervals (CI). Error bar center indicates HR. SNF2vsSNF1: HR = 1.14[0.61–2.11], P = 0.687. SNF3vsSNF1: HR = 1.32[0.75–2.32], P = 0.330. SNF4vsSNF1: HR = 2.25[1.23–4.12], P = 0.008. Lymph node met: HR = 1.08[1.06–1.10], P = 1.02e-15. Chemotherapy: HR = 1.41[0.77–2.59], P = 0.262. Tumor size: HR = 1.59[1.37–1.85], P = 2.17e-09. Grade: HR = 1.31[0.86–1.99], P = 0.213. Bold font indicates statistical significance. met: metastasis.
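As a rough consistency check on the forest plot values, a hazard ratio and its 95% CI can be converted back to an approximate two-sided P value. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which may not match exactly how the authors' software derived the reported figures; the function name is hypothetical, and rounding of the published HR and CI limits the precision.

```python
import math


def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate the Wald z statistic and two-sided P value from HR and 95% CI.

    Assumes the CI is exp(log(HR) +/- z_crit * SE), i.e. symmetric on the log scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# SNF4 versus SNF1, values as reported in panel (d).
z, p = wald_p_from_hr_ci(2.25, 1.23, 4.12)
```

Under these assumptions the recovered P value for SNF4 versus SNF1 lands close to the reported 0.008.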
Extended Data Fig. 5 Prediction of SNF subtypes based on the transcriptomics data.
(a) Workflow for the prediction of SNF subtypes based on the transcriptomics data. (b) ROC curves for using the random forest classifier to identify the SNF subtypes. Molecular features of inferred SNF subtypes in the (c) CPTAC, (d) METABRIC and (e) TCGA cohorts. ***FDR < 0.001; **0.001 ≤ FDR < 0.01; *0.01 ≤ FDR < 0.05; ns: not significant. P values were from the two-sided Kruskal-Wallis test.
Extended Data Fig. 6 Extended analysis of SNF3 subtype, related to Fig. 4.
(a) Representative gene set enrichment analysis plot showing upregulated cell cycle pathway in SNF3 subtype in FUSCC and TCGA cohorts. (b) The CNA, mRNA abundance, and protein abundance of CCND1, CDK2, CDK1; the mRNA expression of E2F1, E2F2, and E2F target genes among different subtypes. Copy number amplification was defined as copy number value > log2(4/2). P values were from the two-sided ANOVA or Fisher’s exact test. MGPS: multi-gene proliferation scores. (c) The CNV alteration, and mRNA abundance of CCND1, CDK1, and CDK2 among different subtypes in TCGA cohort. Copy number amplification was defined as copy number value > log2(4/2). P values were from the two-sided ANOVA or Fisher’s exact test. (d) Heatmap showing the mRNA expression of E2F1, E2F2, and E2F target genes in TCGA cohort. P values were from the two-sided ANOVA test. (e) The alteration of two key G2/M cell-cycle regulators (MDM2 and ATM) at the copy number level and mRNA level compared between SNF3 (n = 233) and the other subtypes (n = 259) in TCGA cohort. P(MDM2 CNA) = 7.1e-07, P(ATM CNA) = 1.2e-09, P(MDM2 RNA) = 2e-11, P(ATM RNA) = 5.3e-12. P values were from the two-sided Wilcoxon or Fisher’s exact test. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles. Whiskers were plotted at 1.5xIQR and the data points outside the whisker were outliers. ***FDR < 0.001; **0.001 ≤ FDR < 0.01; *0.01 ≤ FDR < 0.05; NS, FDR ≥ 0.05.
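The amplification cutoff stated in this caption, copy number value > log2(4/2), can be written out as a short Python sketch. It assumes that the copy number value is a log2 ratio of tumor copies to a diploid baseline of two copies; that interpretation, and the helper names, are assumptions for illustration rather than details confirmed by the source.

```python
import math

# Threshold from the caption: log2(4/2) = 1.0.
AMP_THRESHOLD = math.log2(4 / 2)


def log2_ratio_from_copies(copies, baseline=2):
    """Convert an absolute copy number to a log2 ratio against a diploid baseline."""
    return math.log2(copies / baseline)


def is_amplified(copy_number_value):
    """Apply the strict '>' cutoff used in the caption."""
    return copy_number_value > AMP_THRESHOLD


# Four copies sit exactly on the threshold and are not called amplified;
# five copies exceed it.
four_copies_amplified = is_amplified(log2_ratio_from_copies(4))
five_copies_amplified = is_amplified(log2_ratio_from_copies(5))
```

Because the inequality is strict, a segment with exactly four copies in a diploid genome would fall just short of the amplification call under this reading.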
Extended Data Fig. 7 Extended analysis of SNF2 subtype, related to Fig. 5.
(a) Heatmap showing the estimated abundance of 24 microenvironment cell types among four SNF subtypes in TCGA cohort. P values were from the two-sided ANOVA test. (b) Representative gene set enrichment analysis plot showing upregulated T cell activation and adaptive immune response in SNF2 subtype in FUSCC and TCGA cohorts. (c) PDCD1 mRNA expression in the SNF2 subtype (n = 80) versus other subtypes (n = 412) in TCGA cohort. P = 5.6e-05. P values were from the two-sided Wilcoxon test. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles. Whiskers were plotted at 1.5xIQR and the data points outside the whisker were outliers. (d) The mRNA expression of CD8A, GZMA, PRF1, and IDO1 between SNF2 subtype and other subtypes in TCGA cohort. P values were from the two-sided ANOVA test. ***FDR < 0.001; **0.001 ≤ FDR < 0.01; *0.01 ≤ FDR < 0.05; NS, FDR ≥ 0.05.
Extended Data Fig. 8 Cell types detected based on scRNA-seq, related to Fig. 5.
(a) Heatmap showing the expression of marker genes in the indicated cell types. (b) Heatmap showing copy number variations for individual cells (rows) in different genomic segments (columns). Sampled immune cells were used as references. (c) Distribution of each cell subtype in each SNF subtype. (d) GSEA on differentially expressed genes in CD8+ T cells from SNF2 versus non-SNF2 patients for REACTOME, GO and hallmark gene sets. The top 10 pathways enriched in CD8+ T cells from SNF2 samples were shown. (e, f) Violin plots comparing the cytotoxic/dysfunction score (e) or the cytotoxic-related genes GNLY and GZMK (f) between CD8+ T cells (n = 6827 cells) from SNF2 (n = 3) versus non-SNF2 (n = 6) patients. P values were obtained by two-sided Wilcoxon test. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles. Whiskers were plotted at 1.5xIQR and the data points outside the whisker were outliers.
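The boxplot convention repeated in these captions (box at the 25th and 75th percentiles, whiskers at 1.5xIQR, points beyond the whiskers treated as outliers) can be sketched in Python as follows. The quartile interpolation used here (`method="inclusive"`) is an assumption, since plotting packages differ in how they compute percentiles, and the sample values are made up.

```python
from statistics import quantiles


def tukey_whiskers(values, k=1.5):
    """Return quartiles, whisker ends and outliers under the 1.5xIQR rule."""
    data = sorted(values)
    # Quartiles by linear interpolation over the observed range (assumed convention).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical expression values with one extreme observation.
summary = tukey_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Note that the whiskers stop at observed data points rather than at the fences themselves, which is why a single extreme value is drawn separately instead of stretching the whisker.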
Extended Data Fig. 9 Extended analysis of SNF4 subtype, related to Fig. 6.
(a) Cleveland plot showing the top 15 statistically significant pathways (p-value < 0.05, q-value < 0.25) with the highest NES value in FUSCC cohort. Pathways with gene sizes between 50 and 200 have been included. All pathways included were statistically significant. Gene Ontology Molecular Function (GOMF) gene sets were used for GSEA. Receptor tyrosine kinase related pathways were highlighted in red. P-values were calculated by two-sided nonparametric permutation test and adjusted using the Benjamini‒Hochberg procedure (q-value). (b) Cleveland plot showing the top 15 statistically significant pathways (p-value < 0.05, q-value < 0.25) with the highest NES value in TCGA cohort. Pathways with gene sizes between 50 and 200 have been included. All pathways included were statistically significant. Receptor tyrosine kinase related pathways were highlighted in red. P-values were calculated by two-sided nonparametric permutation test and adjusted by the false discovery rate (q-value). (c, d) The mRNA expression levels of EGFR, PDGFRA, KIT, and MET in SNF4 (n = 34) versus other subtypes (n = 458) in TCGA cohort. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles. Whiskers were plotted at 1.5xIQR and the data points outside the whisker were outliers. P(EGFR) = 9.7e-15, P(PDGFRA) = 3.8e-11, P(KIT) = 9.4e-15, P(MET) = 1e-12. P values were from the two-sided Wilcoxon test.
Extended Data Fig. 10 Extended analysis of SNF4 subtype, related to Fig. 6.
(a) Heatmaps showing phosphosite abundance of EGFR and its downstream substrates. P values were from the two-sided Kruskal-Wallis test without multiple test corrections. (b) Heatmaps showing phosphosite abundance of PDGFRA and its downstream substrates. P values were from the two-sided Kruskal-Wallis test without multiple test corrections. (c) Schematic diagram of PDGFRA/EGFR and their downstream MAPK signaling pathway. (d) Immunohistochemical detection of Phospho-ERK 1/2 and the immunohistochemical staining score quantification among the SNF1 (n = 45), SNF2 (n = 47), SNF3 (n = 55), and SNF4 (n = 27) subtypes. P values were from the two-sided Kruskal-Wallis test without multiple test corrections. Scale bar: 100 μm. Center line indicates the median, and bounds of box indicate the 25th and 75th percentiles. Whiskers were plotted at 1.5xIQR and the data points outside the whisker were outliers. ***P < 0.001; **0.001 ≤ P < 0.01; *0.01 ≤ P < 0.05; NS, P ≥ 0.05.
Supplementary information
Supplementary Information
Supplementary Figs. 1–3 and Supplementary Note.
Supplementary Tables
Supplementary Table 1: Cohort information. Supplementary Table 2: Correlation between metabolites and oncogenic somatic mutations in HR+/HER2− breast cancer. Supplementary Table 3: Correlation between metabolites and oncogenic SCNA in HR+/HER2− breast cancer. Supplementary Table 4: Correlation between metabolites and mRNA expression of cell-cycle-related genes in HR+/HER2− breast cancer. Supplementary Table 5: Comparisons among SNF1–SNF4. Supplementary Table 6: List of SNF-specific features in transcriptomics data-based model for SNF subtype classification. Supplementary Table 7: Patient-derived organoids information.
Source data
Source Data Fig. 2
Statistical source data.
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 6b,c
Unprocessed western blots.
Source Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 7
Statistical source data.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 9
Statistical source data.
Source Data Extended Data Fig. 10
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jin, X., Zhou, YF., Ma, D. et al. Molecular classification of hormone receptor-positive HER2-negative breast cancer. Nat Genet 55, 1696–1708 (2023). https://doi.org/10.1038/s41588-023-01507-7
DOI: https://doi.org/10.1038/s41588-023-01507-7