Mammographic screening is widely used for the detection of breast cancers, but has its flaws. For example, false-positive findings can lead to unnecessary medical interventions and patient anxiety, whereas false-negative results delay diagnosis and potentially preclude cure. Now, a collaboration between Google Health and physician scientists has resulted in an artificial intelligence (AI) approach with the potential to enhance the efficiency of breast cancer diagnosis.

Credit: Reproduced from McKinney, S. M. et al. Nature 577, 89–94 (2020).

A deep learning-based AI system was trained using mammograms from ~76,000 women in the UK and >15,000 in the USA, and was then retrospectively applied to UK and US test sets comprising 25,856 and 3,097 women, respectively. Relative to the judgement of the first or sole radiologist, the AI system achieved absolute reductions in the rates of false-positive and false-negative detection of biopsy-confirmed breast cancers of 1.2% and 2.7%, respectively, in the UK test set and of 5.7% and 9.4%, respectively, in the US test set.
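For readers unfamiliar with the metric, an absolute reduction is the simple difference between two rates, expressed in percentage points. The sketch below shows that arithmetic under stated assumptions: the class `ReadingCounts`, the function `absolute_reduction` and all of the counts are hypothetical and chosen only for illustration. They are not data from the study, and this is not the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingCounts:
    """Outcome counts for one reader against biopsy-confirmed ground truth."""
    true_positives: int
    false_positives: int
    true_negatives: int
    false_negatives: int

    @property
    def false_positive_rate(self) -> float:
        # Share of cancer-free cases that were wrongly flagged.
        return self.false_positives / (self.false_positives + self.true_negatives)

    @property
    def false_negative_rate(self) -> float:
        # Share of confirmed cancers that were missed.
        return self.false_negatives / (self.false_negatives + self.true_positives)


def absolute_reduction(reader_rate: float, ai_rate: float) -> float:
    """Difference between two rates, in percentage points."""
    return (reader_rate - ai_rate) * 100


# Hypothetical counts for illustration only (NOT taken from the study).
radiologist = ReadingCounts(true_positives=60, false_positives=100,
                            true_negatives=900, false_negatives=40)
ai_system = ReadingCounts(true_positives=65, false_positives=90,
                          true_negatives=910, false_negatives=35)

fp_reduction = absolute_reduction(radiologist.false_positive_rate,
                                  ai_system.false_positive_rate)
fn_reduction = absolute_reduction(radiologist.false_negative_rate,
                                  ai_system.false_negative_rate)

print(f"Absolute reduction in false positives: {fp_reduction:.1f} percentage points")
print(f"Absolute reduction in false negatives: {fn_reduction:.1f} percentage points")
```

An absolute reduction differs from a relative one: in this made-up example, the false-positive rate falls from 10% to 9%, which is a reduction of 1 percentage point in absolute terms but 10% in relative terms.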

“In our study, we addressed a common concern that machine learning results fail to generalise to new populations by re-training the algorithm using UK data only and then testing it on US data,” adds Google Health employee Shravya Shetty. “Despite a small drop in performance, the AI system continued to demonstrate a reduction in false-positive and false-negative rates [3.5% and 8.1%, respectively].”

When used to provide a rapid second opinion as part of the double-reading process used in the UK, the accuracy of the AI system was non-inferior to serial reading by two radiologists, and the simulated workload of the second reader was reduced by 88%. Thus, AI has the potential to alleviate pressures on services in the context of a worldwide shortage of radiologists.


The survival benefits of mammographic screening, per se, continue to be debated, and overdiagnosis is a key concern. Notably, the fundamental capacity of AI to discern patterns and associations that are often imperceptible to humans might, in the future, make it possible to distinguish between clinically relevant and clinically irrelevant cancers. “Further clinical studies are required to understand how software systems inspired by this research could improve patient care,” Shetty emphasizes, concluding that “the goal is to increase the accuracy, efficacy and efficiency of screening, as well as reduce patient wait times and stress.”