Screening programs must balance the benefit of early detection against the cost of overscreening. Here, we introduce Tempo, a novel reinforcement learning-based framework for personalized screening, and demonstrate its efficacy in the context of breast cancer. We trained our risk-based screening policies on a large screening mammography dataset from Massachusetts General Hospital (MGH; USA) and validated them in held-out patients from MGH and in external datasets from Emory University (Emory; USA), Karolinska Institute (Karolinska; Sweden) and Chang Gung Memorial Hospital (CGMH; Taiwan). Across all test sets, we find that the Tempo policy combined with an image-based artificial intelligence (AI) risk model is significantly more efficient than current regimens used in clinical practice, as measured by simulated early detection per unit of screening frequency. Moreover, we show that the same Tempo policy can be easily adapted to a wide range of possible screening preferences, allowing clinicians to select their desired trade-off between early detection and screening costs without training new policies. Finally, we demonstrate that Tempo policies based on AI-based risk models outperform Tempo policies based on less accurate clinical risk models. Altogether, our results show that pairing AI-based risk models with agile AI-designed screening policies has the potential to improve screening programs by advancing early detection while reducing overscreening.
All datasets were used under license to the respective hospital system for the current study and are not publicly available. To access the MGH dataset, investigators should contact C.L. to apply for an IRB-approved research collaboration and obtain an appropriate data use agreement. To access the Karolinska dataset, investigators should contact F.S. to apply for an approved research collaboration and sign a data use agreement. To access the CGMH dataset, investigators should contact G.L. to apply for an IRB-approved research collaboration. To access the Emory dataset, investigators should contact H.T. to apply for an approved collaboration.
This work was supported by grants from Susan G. Komen, the Breast Cancer Research Foundation, Quanta Computer, Anonymous Foundation and the MIT Jameel Clinic. This work was also supported by the Chang Gung Medical Foundation (grant SMRPG3K0051) and Stockholm Läns Landsting HMT (grant 201708002). We are grateful to the Cancer Center of Linkou CGMH for assistance with data collection under IRB no. 201901491B0C601 and R. Yang, J. Song and their team (Quanta Computer) for providing technical and computing support for analyzing the CGMH dataset.
The authors declare no competing interests.
Peer review information
Nature Medicine thanks William Lotter and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Javier Carmona was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Extended Data Fig. 1 Estimated (circle) and observed (square) Mirai 5-year risk for two random patients in the MGH test set.
We estimated unobserved risk observations using an RNN, which was optimized to predict future risk assessments from past risk assessments on the MGH training set.
Histogram of early detection benefit in months relative to historical screening for patients who developed cancer in the MGH (top left), Emory (top right), Karolinska (bottom left), and CGMH (bottom right) test sets.
MGH (top left), Emory (top right), Karolinska (bottom left), CGMH (bottom right).
Extended Data Fig. 4 Our early detection metric assumed that a cancer could be caught up to 18 months before diagnosis.
To test the robustness of our results to this assumption, we also evaluated our screening policies with the window changed to 6 months, 12 months and 24 months. For each policy, we report its screening efficiency, defined as its early detection benefit in months divided by the number of mammograms it recommends per year. An asterisk denotes the policy with the highest screening efficiency.
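The efficiency definition above can be illustrated with a minimal Python sketch. Only the ratio (early detection benefit in months divided by mammograms recommended per year) and the 18-month default window come from the caption. The function names, the argument layout and the rule that the earliest recommended screen inside the window determines the benefit are assumptions made for illustration. The sketch does not reproduce the comparison against historical screening used in the study.

```python
from typing import Sequence


def early_detection_benefit(
    screen_months: Sequence[float],
    diagnosis_month: float,
    max_lead_months: float = 18.0,
) -> float:
    """Months of earlier detection for one patient who developed cancer.

    Assumption (hypothetical rule, not taken from the article): a recommended
    screen can catch the cancer only if it falls within `max_lead_months`
    before the diagnosis; the benefit is the lead time of the earliest such
    screen, or 0 if no screen qualifies.
    """
    lead_times = [
        diagnosis_month - month
        for month in screen_months
        if 0.0 <= diagnosis_month - month <= max_lead_months
    ]
    return max(lead_times, default=0.0)


def screening_efficiency(benefit_months: float, mammograms_per_year: float) -> float:
    """Early detection benefit in months divided by mammograms per year."""
    if mammograms_per_year <= 0:
        raise ValueError("mammograms_per_year must be positive")
    return benefit_months / mammograms_per_year


# Hypothetical patient: screens at months 0, 12, 24 and 30, diagnosis at month 36.
screens = [0, 12, 24, 30]
for window in (6, 12, 18, 24):
    benefit = early_detection_benefit(screens, 36, max_lead_months=window)
    print(window, benefit, screening_efficiency(benefit, mammograms_per_year=1.0))
```

Varying `max_lead_months` in this way mirrors the sensitivity analysis described in the caption: the same screening schedule yields a different benefit, and therefore a different efficiency, under each assumed detection window.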
Dataset construction flow chart for the MGH dataset (top left), Emory (top right), Karolinska test set (bottom left), and CGMH test set (bottom right).
Yala, A., Mikhael, P.G., Lehman, C. et al. Optimizing risk-based breast cancer screening policies with reinforcement learning. Nat Med 28, 136–143 (2022). https://doi.org/10.1038/s41591-021-01599-w