A population-level digital histologic biomarker for enhanced prognosis of invasive breast cancer

Amgad, Mohamed; Hodge, James M.; Elsebaie, Maha A. T.; Bodelon, Clara; Puvanesarajah, Samantha; Gutman, David A.; Siziopikou, Kalliopi P.; Goldstein, Jeffery A.; Gaudet, Mia M.; Teras, Lauren R.; Cooper, Lee A. D.

doi:10.1038/s41591-023-02643-7

Article
Published: 27 November 2023

Nature Medicine volume 30, pages 85–97 (2024)

Abstract

Breast cancer is a heterogeneous disease with variable survival outcomes. Pathologists grade the microscopic appearance of breast tissue using the Nottingham criteria, which are qualitative and do not account for noncancerous elements within the tumor microenvironment. Here we present the Histomic Prognostic Signature (HiPS), a comprehensive, interpretable scoring of the survival risk incurred by breast tumor microenvironment morphology. HiPS uses deep learning to accurately map cellular and tissue structures to measure epithelial, stromal, immune, and spatial interaction features. It was developed using a population-level cohort from the Cancer Prevention Study-II and validated using data from three independent cohorts, including the Prostate, Lung, Colorectal, and Ovarian Cancer trial, Cancer Prevention Study-3, and The Cancer Genome Atlas. HiPS consistently outperformed pathologists in predicting survival outcomes, independent of tumor–node–metastasis stage and pertinent variables. This was largely driven by stromal and immune features. In conclusion, HiPS is a robustly validated biomarker to support pathologists and improve patient prognosis.

**Fig. 1: Overview of the methodological approach and datasets used.**

**Fig. 2: Thematic categorization and selection of features using the CPS-II cohort.**

**Fig. 4: Stromal features critically impact the HiPS score and alter risk categorization in stage I cancers.**

**Fig. 5: Kaplan–Meier analysis of HiPS groups compared with the control groups.**

**Fig. 6: The HiPS score is consistent with established risk profiles.**
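The Kaplan–Meier analysis named in the Fig. 5 caption can be illustrated with a minimal, dependency-free sketch of the standard product-limit estimator. This is not the authors' code (the paper reports using Lifelines for outcomes analysis); the function name `kaplan_meier` and the two small groups below are hypothetical, and the data are synthetic.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit survival estimate at each distinct event time.

    durations: follow-up time per subject
    events:    1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) pairs.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove both events and censored subjects
    return curve


# Synthetic follow-up times (arbitrary units) for two hypothetical risk groups.
low_risk = kaplan_meier([5, 8, 10, 12, 12], [0, 1, 0, 0, 0])
high_risk = kaplan_meier([2, 3, 3, 6, 9], [1, 1, 0, 1, 1])

for label, curve in (("low", low_risk), ("high", high_risk)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored subjects contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at observed event times.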

Data availability

Supplementary Table 27 contains our calculated histomic feature values, HiPS scores and subscores, and related data for the TCGA cohort. We provide this to facilitate reproducibility and to act as a resource for the scientific community. TCGA clinical data and WSIs are publicly available at gdc.cancer.gov. The Breast Cancer Semantic Segmentation dataset is available at github.com/PathologyDataScience/BCSS, and the NuCLS dataset is available at sites.google.com/view/nucls. These datasets were combined to produce the PanopTILs dataset, available at sites.google.com/view/panoptils. Requests for ACS data from the CPS-II or CPS-3 studies should be submitted to maddison.hall@cancer.org. Requests for PLCO data should be submitted at cdas.cancer.gov/learn/plco. Breast cancer genomic subtypes, hypoxia scores, fraction genome altered, aneuploidy scores, and mRNA expression profiles were obtained from the Genomic Data Commons Pancancer Atlas: gdc.cancer.gov/about-data/publications/pancanatlas. Immune subtypes and related pathway activations, as well as angiogenesis and lymphangiogenesis scores, were obtained from the PanImmune dataset: gdc.cancer.gov/about-data/publications/panimmune. CAF subtype abundance data for TCGA were obtained from a previous study⁸¹. xCell cell type abundance data for TCGA were obtained from a previous study⁶¹.

Code availability

The code is publicly available at github.com/PathologyDataScience/HiPS. Processing of histology images was performed using HistomicsTK (v.1.2.10, github.com/DigitalSlideArchive/HistomicsTK), histolab (v.0.6.0, github.com/histolab/histolab), and scikit-image (v.0.18.0, scikit-image.org). Analysis of clinical outcomes data was performed using Lifelines (v.0.27.8, github.com/CamDavidsonPilon/lifelines). Enrichment analysis with RNA profiles was performed using GSEAPy (v.1.0.6, gseapy.rtfd.io). Additional Python libraries used for database management, graphical plotting, scientific calculations, and other tasks include numpy v.1.19.4, pandas v.1.1.5, SQLAlchemy v.1.3.21, scipy v.1.5.4, scikit-learn v.0.23.2, imageio v.2.9.0, pillow v.8.0.1, matplotlib v.3.3.3, seaborn v.0.11.0, torch v.1.7.1, torchvision v.0.8.2, and pyvips v.2.1.15.
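For readers trying to reproduce the environment, the following sketch compares locally installed package versions against those listed above. The version strings come from the statement itself; the distribution names used as keys (for example `histomicstk`, `gseapy`, `scikit-image`) are assumptions about how each library is published and may need adjusting, and `compare_versions` is a hypothetical helper, not part of the HiPS repository.

```python
from importlib import metadata

# Versions as listed in the Code availability statement.
# Keys are assumed distribution names and may differ from the import names.
REPORTED_VERSIONS = {
    "histomicstk": "1.2.10",
    "histolab": "0.6.0",
    "scikit-image": "0.18.0",
    "lifelines": "0.27.8",
    "gseapy": "1.0.6",
    "numpy": "1.19.4",
    "pandas": "1.1.5",
    "SQLAlchemy": "1.3.21",
    "scipy": "1.5.4",
    "scikit-learn": "0.23.2",
    "imageio": "2.9.0",
    "pillow": "8.0.1",
    "matplotlib": "3.3.3",
    "seaborn": "0.11.0",
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "pyvips": "2.1.15",
}


def compare_versions(reported=REPORTED_VERSIONS):
    """Map each package to (reported, installed_or_None, matches)."""
    report = {}
    for name, wanted in reported.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: reported {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily prevent the code from running; it only flags where results might differ from those reported.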

References

Global Cancer Facts & Figures 4th Edition (American Cancer Society, 2018).
Siegel, R. L., Miller, K. D., Wagle, N. S. & Jemal, A. Cancer statistics, 2023. CA Cancer J. Clin. 73, 17–48 (2023).
American Joint Committee on Cancer. AJCC Cancer Staging Manual 2017 (Springer International Publishing, 2017).
Coughlin, S. S. Social determinants of breast cancer risk, stage, and survival. Breast Cancer Res. Treat. 177, 537–548 (2019).
Li, X. et al. Validation of the newly proposed American Joint Committee on Cancer (AJCC) breast cancer prognostic staging group and proposing a new staging system using the National Cancer Database. Breast Cancer Res. Treat. 171, 303–313 (2018).
Scarff, R. W. & Handley, R. S. Prognosis in carcinoma of the breast. Lancet 232, 582–583 (1938).
Black, M. M., Opler, S. R. & Speer, F. D. Survival in breast cancer cases in relation to the structure of the primary tumor and regional lymph nodes. Surg. Gynecol. Obstet. 100, 543–551 (1955).
Bloom, H. J. & Richardson, W. W. Histological grading and prognosis in breast cancer; a study of 1409 cases of which 359 have been followed for 15 years. Br. J. Cancer 11, 359–377 (1957).
Elston, C. W. & Ellis, I. O. Method for grading breast cancer. J. Clin. Pathol. 46, 189–190 (1993).
Elston, C. W. & Ellis, I. O. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology 19, 403–410 (1991).
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
Cardenas, M. A., Prokhnevska, N. & Kissick, H. T. Organized immune cell interactions within tumors sustain a productive T-cell response. Int. Immunol. 33, 27–37 (2021).
Sahai, E. et al. A framework for advancing our understanding of cancer-associated fibroblasts. Nat. Rev. Cancer 20, 174–186 (2020).
Liu, T., Zhou, L., Li, D., Andl, T. & Zhang, Y. Cancer-associated fibroblasts build and secure the tumor microenvironment. Front. Cell Dev. Biol. 7, 60 (2019).
Savas, P. et al. Clinical relevance of host immunity in breast cancer: from TILs to the clinic. Nat. Rev. Clin. Oncol. 13, 228–241 (2016).
Ha, S. Y., Yeo, S.-Y., Xuan, Y. & Kim, S.-H. The prognostic significance of cancer-associated fibroblasts in esophageal squamous cell carcinoma. PLoS ONE 9, e99955 (2014).
Conklin, M. W. et al. Aligned collagen is a prognostic signature for survival in human breast carcinoma. Am. J. Pathol. 178, 1221–1232 (2011).
Provenzano, P. P. et al. Collagen reorganization at the tumor–stromal interface facilitates local invasion. BMC Med. 4, 38 (2006).
Shekhar, M. P., Werdell, J., Santner, S. J., Pauley, R. J. & Tait, L. Breast stroma plays a dominant regulatory role in breast epithelial growth and differentiation: implications for tumor development and progression. Cancer Res. 61, 1320–1326 (2001).
Couture, H. D. et al. Image analysis with deep learning to predict breast cancer grade, ER status, histologic subtype, and intrinsic subtype. NPJ Breast Cancer 4, 30 (2018).
Rawat, R. R. et al. Deep learned tissue “fingerprints” classify breast cancers by ER/PR/Her2 status from H&E images. Sci. Rep. 10, 7275 (2020).
Gamble, P. et al. Determining breast cancer biomarker status and associated morphological features using deep learning. Commun. Med. 1, 14 (2021).
Bychkov, D. et al. Outcome and biomarker supervised deep learning for survival prediction in two multicenter breast cancer series. J. Pathol. Inform. 13, 9 (2022).
Calle, E. E. et al. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer 94, 2490–2501 (2002).
Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70 (2012).
Zhu, C. S. et al. The Prostate, Lung, Colorectal and Ovarian Cancer (PLCO) screening trial pathology tissue resource. Cancer Epidemiol. Biomark. Prev. 25, 1635–1642 (2016).
Patel, A. V. et al. The American Cancer Society’s Cancer Prevention Study 3 (CPS-3): recruitment, study design, and baseline characteristics. Cancer 123, 2014–2024 (2017).
Haralick, R. M., Shanmugam, K. & Dinstein, I. Textural features for image classification. IEEE Trans. Syst. Man Cybern. SMC-3, 610–621 (1973).
Doyle, S., Agner, S., Madabhushi, A., Feldman, M. & Tomaszewski, J. Automated grading of breast cancer histopathology using spectral clustering with textural and architectural image features. In Proc. 2008 5th IEEE Int. Symposium on Biomedical Imaging: From Nano to Macro 496–499 (IEEE, 2008).
Gurcan, M. N. et al. Histopathological image analysis: a review. IEEE Rev. Biomed. Eng. 2, 147–171 (2009).
Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).
Liu, Y., Han, D., Parwani, A. V. & Li, Z. Applications of artificial intelligence in breast pathology. Arch. Pathol. Lab. Med. https://doi.org/10.5858/arpa.2022-0457-RA (2023).
Abels, E. et al. Computational pathology definitions, best practices, and recommendations for regulatory guidance: a white paper from the Digital Pathology Association. J. Pathol. 249, 286–294 (2019).
Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).
Bychkov, D. et al. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci. Rep. 8, 3395 (2018).
Chen, R. J. et al. Pathomic fusion: an integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans. Med. Imaging 41, 757–770 (2022).
Duanmu, H. et al. A spatial attention guided deep learning system for prediction of pathological complete response using breast cancer histopathology images. Bioinformatics 38, 4605–4612 (2022).
Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vis. 128, 336–359 (2020).
Ribeiro, M. T. et al. "Why should I trust you?": explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (ACM, 2016).
Amgad, M. et al. Explainable nucleus classification using Decision Tree Approximation of Learned Embeddings. Bioinformatics 38, 513–519 (2022).
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).
Leavitt, M. L. & Morcos, A. Towards falsifiable interpretability research. Preprint at arxiv.org/abs/2010.12016 (2020).
Koh, P. W. et al. Concept bottleneck models. In Proc. 37th Int. Conf. on Machine Learning (eds Daumé III, H. & Singh, A.) Vol. 119, 5338–5348 (PMLR, 2020).
Kirillov, A., He, K., Girshick, R., Rother, C. & Dollár, P. Panoptic segmentation. In Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR) (2019).
Amgad, M., Salgado, R. & Cooper, L. A. D. A panoptic segmentation approach for tumor-infiltrating lymphocyte assessment: development of the MuTILs model and PanopTILs dataset. Preprint at medRxiv https://doi.org/10.1101/2022.01.08.22268814 (2023).
Amgad, M., Salgado, R. & Cooper, L. A. D. MuTILs: a multiresolution deep-learning model for interpretable scoring of tumor-infiltrating lymphocytes in breast carcinomas using clinical guidelines. Preprint at medRxiv https://doi.org/10.1101/2022.01.08.22268814 (2022).
Amgad, M. et al. Structured crowdsourcing enables convolutional segmentation of histology images. Bioinformatics 35, 3461–3467 (2019).
Amgad, M. et al. NuCLS: a scalable crowdsourcing approach and dataset for nucleus classification and segmentation in breast cancer. Gigascience 11, giac037 (2022).
Gutman, D. A. et al. The Digital Slide Archive: a software platform for management, integration, and analysis of histology for cancer research. Cancer Res. 77, e75–e78 (2017).
Schmid, P. et al. Pembrolizumab plus chemotherapy as neoadjuvant treatment of high-risk, early-stage triple-negative breast cancer: results from the phase 1b open-label, multicohort KEYNOTE-173 study. Ann. Oncol. 31, 569–581 (2020).
Liu, J. et al. An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics. Cell 173, 400–416.e11 (2018).
Wang, X. et al. Characteristics of The Cancer Genome Atlas cases relative to U.S. general population cancer cases. Br. J. Cancer 119, 885–892 (2018).
Kalinsky, K. et al. 21-gene assay to inform chemotherapy benefit in node-positive breast cancer. N. Engl. J. Med. 385, 2336–2347 (2021).
Paik, S. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351, 2817–2826 (2004).
van’t Veer, L. J. et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415, 530–536 (2002).
van de Vijver, M. J. et al. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347, 1999–2009 (2002).
Howard, F. M. et al. Integration of clinical features and deep learning on pathology for the prediction of breast cancer recurrence assays and risk of recurrence. NPJ Breast Cancer 9, 25 (2023).
Lehmann, B. D. et al. Multi-omics analysis identifies therapeutic vulnerabilities in triple-negative breast cancer subtypes. Nat. Commun. 12, 6276 (2021).
Aran, D., Hu, Z. & Butte, A. J. xCell: digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 18, 220 (2017).
Yoshihara, K. et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat. Commun. 4, 2612 (2013).
Hoadley, K. A. et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304.e6 (2018).
Berger, A. C. et al. A comprehensive pan-cancer molecular study of gynecologic and breast cancers. Cancer Cell 33, 690–705.e9 (2018).
Bhandari, V. et al. Molecular landmarks of tumor hypoxia across cancer types. Nat. Genet. 51, 308–318 (2019).
Buffa, F. M., Harris, A. L., West, C. M. & Miller, C. J. Large meta-analysis of multiple cancers reveals a common, compact and highly prognostic hypoxia metagene. Br. J. Cancer 102, 428–435 (2010).
Winter, S. C. et al. Relation of a hypoxia metagene derived from head and neck cancer to prognosis of multiple cancers. Cancer Res. 67, 3441–3449 (2007).
Ragnum, H. B. et al. The tumour hypoxia marker pimonidazole reflects a transcriptional programme associated with aggressive prostate cancer. Br. J. Cancer 112, 382–390 (2015).
Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
DeNardo, D. G. et al. Leukocyte complexity predicts breast cancer survival and functionally regulates response to chemotherapy. Cancer Discov. 1, 54–67 (2011).
Mahmoud, S. M. A. et al. Tumor-infiltrating CD8+ lymphocytes predict clinical outcome in breast cancer. J. Clin. Oncol. 29, 1949–1955 (2011).
Oh, H. & Ghosh, S. NF-κB: roles and regulation in different CD4(+) T-cell subsets. Immunol. Rev. 252, 41–51 (2013).
Olkhanud, P. B. et al. Tumor-evoked regulatory B cells promote breast cancer metastasis by converting resting CD4+ T cells to T-regulatory cells. Cancer Res. 71, 3505–3515 (2011).
Varn, F. S., Mullins, D. W., Arias-Pulido, H., Fiering, S. & Cheng, C. Adaptive immunity programmes in breast cancer. Immunology 150, 25–34 (2017).
Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830.e14 (2018).
Chang, H. Y. et al. Gene expression signature of fibroblast serum response predicts human cancer progression: similarities between tumors and wounds. PLoS Biol. 2, e7 (2004).
Saltz, J. et al. Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images. Cell Rep. 23, 181–193.e7 (2018).
Costa, A. et al. Fibroblast heterogeneity and immunosuppressive environment in human breast cancer. Cancer Cell 33, 463–479.e10 (2018).
Li, B. et al. Cell-type deconvolution analysis identifies cancer-associated myofibroblast component as a poor prognostic factor in multiple cancer types. Oncogene 40, 4686–4694 (2021).
Mhaidly, R. & Mechta-Grigoriou, F. Fibroblast heterogeneity in tumor micro-environment: role in immunosuppression and new therapies. Semin. Immunol. 48, 101417 (2020).
Asif, P. J., Longobardi, C., Hahne, M. & Medema, J. P. The role of cancer-associated fibroblasts in cancer invasion and metastasis. Cancers 13, 4720 (2021).
Kim, I., Choi, S., Yoo, S., Lee, M. & Kim, I.-S. Cancer-associated fibroblasts in the hypoxic tumor microenvironment. Cancers 14, 3321 (2022).
Ebbing, E. A. et al. Stromal-derived interleukin 6 drives epithelial-to-mesenchymal transition and therapy resistance in esophageal adenocarcinoma. Proc. Natl Acad. Sci. USA 116, 2237–2242 (2019).
Yu, Y. et al. Cancer-associated fibroblasts induce epithelial-mesenchymal transition of breast cancer cells through paracrine TGF-β signalling. Br. J. Cancer 110, 724–732 (2014).
Mariotto, A. et al. Expected monetary impact of Oncotype DX score-concordant systemic breast cancer therapy based on the TAILORx trial. J. Natl Cancer Inst. 112, 154–160 (2020).
Davis, B. A. et al. Racial and ethnic disparities in Oncotype DX test receipt in a statewide population-based study. J. Natl Compr. Canc. Netw. 15, 346–354 (2017).
Losk, K. et al. Factors associated with delays in chemotherapy initiation among patients with breast cancer at a comprehensive cancer center. J. Natl Compr. Canc. Netw. 14, 1519–1526 (2016).
Yousif, M. et al. Artificial intelligence applied to breast pathology. Virchows Arch. 480, 191–209 (2022).
Abubakar, M. et al. Tumor-associated stromal cellular density as a predictor of recurrence and mortality in breast cancer: results from ethnically diverse study populations. Cancer Epidemiol. Biomark. Prev. 30, 1397–1407 (2021).
Li, H. et al. Collagen fiber orientation disorder from H&E images is prognostic for early stage breast cancer: clinical trial validation. NPJ Breast Cancer 7, 104 (2021).
Chen, Y. et al. Computational pathology improves risk stratification of a multi-gene assay for early stage ER+ breast cancer. NPJ Breast Cancer 9, 40 (2023).
Diao, J. A. et al. Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes. Nat. Commun. 12, 1613 (2021).
Bejnordi, B. E. et al. Deep learning-based assessment of tumor-associated stroma for diagnosing breast cancer in histopathology images. In Proc. IEEE Int. Symp. Biomed. Imaging 929–932 (2017).
Lu, M. Y. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021).
Ehteshami Bejnordi, B. et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 318, 2199–2210 (2017).
Ghassemi, M., Oakden-Rayner, L. & Beam, A. L. The false hope of current approaches to explainable artificial intelligence in health care. Lancet Digit. Health 3, e745–e750 (2021).
Bilal, M. et al. Development and validation of a weakly supervised deep learning framework to predict the status of molecular pathways and key mutations in colorectal cancer from routine histology images: a retrospective study. Lancet Digit. Health 3, e763–e772 (2021).
Mercan, C. et al. Deep learning for fully-automated nuclear pleomorphism scoring in breast cancer. NPJ Breast Cancer 8, 120 (2022).
Beck, A. H. et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci. Transl. Med. 3, 108ra113 (2011).
Karagiannis, G. S. et al. Cancer-associated fibroblasts drive the progression of metastasis through both paracrine and mechanical pressure on cancer tissue. Mol. Cancer Res. 10, 1403–1418 (2012).
Yuan, Y. et al. Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling. Sci. Transl. Med. 4, 157ra143 (2012).
He, L. et al. Association between levels of tumor-infiltrating lymphocytes in different subtypes of primary breast tumors and prognostic outcomes: a meta-analysis. BMC Womens Health 20, 194 (2020).
Denkert, C. et al. Tumour-infiltrating lymphocytes and prognosis in different subtypes of breast cancer: a pooled analysis of 3771 patients treated with neoadjuvant therapy. Lancet Oncol. 19, 40–50 (2018).
AbdulJabbar, K. et al. Geospatial immune variability illuminates differential evolution of lung adenocarcinoma. Nat. Med. 26, 1054–1062 (2020).
Huang, Z. et al. Artificial intelligence reveals features associated with breast cancer neoadjuvant chemotherapy responses from multi-stain histopathologic images. NPJ Precis. Oncol. 7, 14 (2023).
Amgad, M. et al. Report on computational assessment of tumor infiltrating lymphocytes from the International Immuno-Oncology Biomarker Working Group. NPJ Breast Cancer 6, 16 (2020).
Ping, Z. et al. A microscopic landscape of the invasive breast cancer genome. Sci. Rep. 6, 27545 (2016).
Thennavan, A. et al. Molecular analysis of TCGA breast cancer histologic types. Cell Genom. 1, 100067 (2021).
Garfinkel, L. Selection, follow-up, and analysis in the American Cancer Society prospective studies. Natl Cancer Inst. Monogr. 67, 49–52 (1985).
Stellman, S. D. & Garfinkel, L. Smoking habits and tar levels in a new American Cancer Society prospective study of 1.2 million men and women. J. Natl Cancer Inst. 76, 1057–1063 (1986).
Howard, F. M. et al. The impact of site-specific digital histology signatures on deep learning model accuracy and bias. Nat. Commun. 12, 4423 (2021).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In Proc. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (eds Navab, N. et al.) 234–241 (Springer, 2015).
van Rijthoven, M., Balkenhol, M., Siliņa, K., van der Laak, J. & Ciompi, F. HookNet: multi-resolution convolutional neural networks for semantic segmentation in histopathology whole-slide images. Med. Image Anal. 68, 101890 (2021).
Steyerberg, E. W. & Harrell, F. E. Prediction models need appropriate internal, internal-external, and external validation. J. Clin. Epidemiol. 69, 245–247 (2016).
Marcolini, A. et al. histolab: a Python library for reproducible digital pathology preprocessing with automated testing. SoftwareX 20, 101237 (2022).
Achanta, R. et al. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 34, 2274–2282 (2012).
Macenko, M. et al. A method for normalizing histology slides for quantitative analysis. In 2009 IEEE Int. Symposium on Biomedical Imaging: From Nano to Macro 1107–1110 (IEEE, 2009); https://doi.org/10.1109/ISBI.2009.5193250
Ripley, B. D. The second-order analysis of stationary point processes. J. Appl. Probab. 13, 255–266 (1976).
Amgad, M., Itoh, A. & Tsui, M. M. K. Extending Ripley’s K-function to quantify aggregation in 2-D grayscale images. PLoS ONE 10, e0144404 (2015).
Lester, S. C. et al. Protocol for the examination of specimens from patients with invasive carcinoma of the breast. Arch. Pathol. Lab. Med. 133, 1515–1538 (2009).
Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22 (2010).
Campbell, H. & Dean, C. B. The consequences of proportional hazards based model selection. Stat. Med. 33, 1042–1056 (2014).
Stensrud, M. J. & Hernán, M. A. Why test for proportional hazards? JAMA 323, 1401–1402 (2020).
Fang, Z., Liu, X. & Peltz, G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics 39, btac757 (2023).
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 14, 128 (2013).

Acknowledgements

We express sincere appreciation to all CPS-II and CPS-3 participants and to each member of the study and biospecimen management group. We would like to acknowledge the contributions to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries and cancer registries supported by the National Cancer Institute’s Surveillance Epidemiology and End Results Program. We thank the National Cancer Institute for access to NCI’s data collected by the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. We are grateful to the annotation team for the Breast Cancer Semantic Segmentation and NuCLS datasets. We would also like to acknowledge F.M. Howard and A.T. Pearson (University of Chicago) for providing us with the research-use Oncotype DX and MammaPrint scores for TCGA. Figures 1–4 and 6, and multiple supplementary figures, were created in part using BioRender.com. This work was supported by the US National Institutes of Health grants U01CA220401 and U24CA19436201. The ACS funds the creation, maintenance, and updating of the CPS-II and CPS-3 cohorts.

Author information

These authors jointly supervised this work: Lauren R. Teras, Lee A.D. Cooper.

Authors and Affiliations

Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
Mohamed Amgad, Kalliopi P. Siziopikou, Jeffery A. Goldstein & Lee A. D. Cooper
Department of Population Science, American Cancer Society, Atlanta, GA, USA
James M. Hodge, Clara Bodelon, Samantha Puvanesarajah & Lauren R. Teras
Department of Medicine, John H. Stroger, Jr. Hospital of Cook County, Chicago, IL, USA
Maha A. T. Elsebaie
Department of Pathology, Emory University School of Medicine, Atlanta, GA, USA
David A. Gutman
Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
Mia M. Gaudet

Authors

Mohamed Amgad
James M. Hodge
Maha A. T. Elsebaie
Clara Bodelon
Samantha Puvanesarajah
David A. Gutman
Kalliopi P. Siziopikou
Jeffery A. Goldstein
Mia M. Gaudet
Lauren R. Teras
Lee A. D. Cooper

Contributions

M.A. and L.A.D.C. conceived of the research idea. M.A. carried out data analysis, model development, and model validation. M.A., M.A.T.E., and L.A.D.C. wrote the paper, and J.M.H., C.B., K.P.S., J.A.G., M.M.G., and L.R.T. edited the paper. J.M.H. and S.P. performed data curation for the Cancer Prevention Studies cohorts. K.P.S. provided expertise on breast cancer pathology. C.B., M.M.G., and L.R.T. provided expertise on breast cancer epidemiology and population science and assisted with the interpretation of results. D.A.G. provided assistance with computing and data visualization. L.R.T. and L.A.D.C. jointly supervised the work.

Corresponding author

Correspondence to Lee A. D. Cooper.

Ethics declarations

Competing interests

L.A.D.C. has invention disclosures registered at the Northwestern Office of Innovation and New Ventures, consults for Tempus, and advises Veracyte and Targeted Bioscience. D.A.G. holds stock options in Histowiz LLC and is a cofounder and stockholder of Switchboard, MD. The other authors declare no competing interests.

Peer review

Peer review information

Nature Medicine thanks Po-Hsuan Cameron Chen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lorenzo Righetto, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Tables 1–4, 6–23, and 29, and Figs. 1–51.

Reporting Summary

Supplementary Data

Excel file containing Supplementary Tables 5 and 24–28.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Amgad, M., Hodge, J.M., Elsebaie, M.A.T. et al. A population-level digital histologic biomarker for enhanced prognosis of invasive breast cancer. Nat Med 30, 85–97 (2024). https://doi.org/10.1038/s41591-023-02643-7

Received: 17 May 2023
Accepted: 13 October 2023
Published: 27 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1038/s41591-023-02643-7
