  • Research Briefing

Physician–machine partnerships boost diagnostic accuracy, but bias persists

In a large-scale digital experiment on dermatology diagnosis, we found that specialists and generalists achieved diagnostic accuracy of 38% and 19%, respectively. With decision support from a fair deep learning system, the diagnostic accuracy of physicians improved by more than 33%, but the gap in accuracy of generalists widened across skin tones.

Fig. 1: Experimental design and key results.

References

  1. Rajpurkar, P., Chen, E., Banerjee, O. & Topol, E. AI in health and medicine. Nat. Med. 28, 31–38 (2022). This review article covers advances in medical image analysis, problem formulation in human–AI collaboration, and common challenges, such as data scarcity and racial bias.

  2. Liu, Y. et al. A deep learning system for differential diagnosis of skin disease. Nat. Med. 26, 900–908 (2020). This paper demonstrates the potential of AI assistance in supporting general practitioners and nurse practitioners in diagnosing common skin diseases.

  3. Daneshjou, R. et al. Disparities in dermatology AI performance on a diverse curated clinical image set. Sci. Adv. 8, eabq6147 (2022). This paper reports that state-of-the-art dermatology AI models are less accurate on dark skin tones than on light skin tones.

  4. Almaatouq, A. et al. Beyond playing 20 questions with nature: Integrative experiment design in the social and behavioral sciences. Behav. Brain Sci. https://doi.org/10.1017/S0140525X22002874 (2022). This paper proposes an integrative experimental design whereby researchers map the design space of possible experiments and test these experiments together to promote commensurability in behavioral science.

  5. Groh, M. et al. Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset. Proc. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 1820–1828 (2021). This paper presents a large dataset of clinical images annotated with the Fitzpatrick skin type scale and demonstrates that deep learning classifiers are most accurate on skin tones similar to those they were trained on.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Groh, M. et al. Deep learning-aided decision support for diagnosis of skin disease across skin tones. Nat. Med. https://doi.org/10.1038/s41591-023-02728-3 (2024).

About this article

Cite this article

Physician–machine partnerships boost diagnostic accuracy, but bias persists. Nat Med 30, 356–357 (2024). https://doi.org/10.1038/s41591-023-02733-6

  • DOI: https://doi.org/10.1038/s41591-023-02733-6
