The need for uncertainty quantification in machine-assisted medical decision making

Medicine, even from the earliest days of artificial intelligence (AI) research, has been one of the most inspiring and promising domains for the application of AI-based approaches. Equally, it has been one of the more challenging areas in which to achieve effective adoption. There are many reasons for this, chief among them the reluctance to delegate decision making to machine intelligence in cases where patient safety is at stake. To address some of these challenges, medical AI, especially in its modern data-rich deep learning guise, needs to develop a principled and formal uncertainty quantification (UQ) discipline, just as we have seen in fields such as nuclear stockpile stewardship and risk management. The data-rich world of AI-based learning and the frequent absence of a well-understood underlying theory pose their own unique challenges to the straightforward adoption of UQ. These challenges, while not trivial, also present significant new research opportunities for the development of new theoretical approaches, and for the practical application of UQ in the area of machine-assisted medical decision making. Understanding prediction system structure and defensibly quantifying uncertainty are possible and, if done, can significantly benefit both research and practical applications of AI in this critical domain.

Acknowledgements

This manuscript has been in part co-authored by UT-Battelle, LLC, under contract no. DE-AC05-00OR22725.

Author information

Corresponding author

Correspondence to Edmon Begoli.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Begoli, E., Bhattacharya, T. & Kusnezov, D. The need for uncertainty quantification in machine-assisted medical decision making. Nat Mach Intell 1, 20–23 (2019). https://doi.org/10.1038/s42256-018-0004-1
