The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care


Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals [1–3], but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients [1,4–6]. To tackle this sequential decision-making problem, we developed a reinforcement learning agent, the Artificial Intelligence (AI) Clinician, which extracted implicit knowledge from an amount of patient data that exceeds by many-fold the lifetime experience of human clinicians and learned optimal treatment by analyzing a myriad of (mostly suboptimal) treatment decisions. We demonstrate that the value of the AI Clinician’s selected treatment is on average reliably higher than that of human clinicians. In a large validation cohort independent of the training data, mortality was lowest in patients for whom clinicians’ actual doses matched the AI decisions. Our model provides individualized and clinically interpretable treatment decisions for sepsis that could improve patient outcomes.
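To make the idea of learning a treatment policy from logged decisions concrete, the following is a minimal sketch in Python, not the authors' implementation (their computations were done in Matlab and the code is available on request). It assumes a small discrete Markov decision process whose transition probabilities and rewards are estimated by counting logged `(state, action, reward, next_state)` tuples, and it then runs standard value iteration restricted to actions that were actually observed. The function names, the discount factor and the toy data are all illustrative assumptions.

```python
from collections import defaultdict


def estimate_mdp(transitions):
    """Count-based estimates of P(s'|s,a) and mean reward R(s,a).

    `transitions` is an iterable of (state, action, reward, next_state)
    tuples. Only state-action pairs seen in the data get an entry.
    """
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    P, R = {}, {}
    for (s, a), next_counts in counts.items():
        total = sum(next_counts.values())
        P[(s, a)] = {s2: c / total for s2, c in next_counts.items()}
        R[(s, a)] = reward_sum[(s, a)] / total
    return P, R


def value_iteration(P, R, n_states, n_actions, gamma=0.99, tol=1e-8):
    """Value iteration over observed actions; returns (V, greedy policy)."""

    def q_value(s, a, V):
        return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())

    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            observed = [a for a in range(n_actions) if (s, a) in P]
            if not observed:
                continue  # terminal or never-visited state keeps value 0
            best = max(q_value(s, a, V) for a in observed)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {}
    for s in range(n_states):
        observed = [a for a in range(n_actions) if (s, a) in P]
        if observed:
            policy[s] = max(observed, key=lambda a: q_value(s, a, V))
    return V, policy


# Toy data (invented): state 0 = septic, 1 = survived, 2 = died.
# Reward is +1 on survival and -1 on death, an assumption for illustration.
toy_transitions = [
    (0, 0, -1.0, 2), (0, 0, -1.0, 2), (0, 0, 1.0, 1),
    (0, 1, 1.0, 1), (0, 1, 1.0, 1), (0, 1, -1.0, 2),
]
P, R = estimate_mdp(toy_transitions)
V, policy = value_iteration(P, R, n_states=3, n_actions=2)
print(policy)  # greedy action for each state that has observed actions
```

Restricting the maximisation to observed actions is a design choice of this sketch: it avoids recommending a treatment for which the logged data provide no evidence in a given state.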

Fig. 1: Data flow of the AI Clinician.
Fig. 2: Selection of the best AI policy and model calibration.
Fig. 3: Comparison of clinician and AI policies in eRI and average dose excess received per patient of both drugs in eRI with corresponding mortality.
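The kind of analysis summarized in the Fig. 3 caption, relating average dose excess per patient to mortality, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: it assumes each patient record holds the doses actually given, the doses a policy would have recommended at the same time steps, and a mortality flag. The function name, the bin width and the toy records are hypothetical.

```python
import math
from collections import defaultdict


def mortality_by_dose_excess(patients, bin_width=1.0):
    """Group patients by average dose excess and report mortality per bin.

    `patients` is an iterable of (given_doses, recommended_doses, died).
    Dose excess is the mean of (given - recommended) over a patient's
    time steps; negative values mean the patient received less than
    recommended. Returns {bin_centre: (n_patients, mortality_fraction)}.
    """
    groups = defaultdict(list)
    for given, recommended, died in patients:
        if not given or len(given) != len(recommended):
            raise ValueError("dose sequences must be non-empty and aligned")
        excess = sum(g - r for g, r in zip(given, recommended)) / len(given)
        # Round half up to the nearest bin so bins are centred on multiples
        # of bin_width (avoids Python's round-half-to-even behaviour).
        bin_index = math.floor(excess / bin_width + 0.5)
        groups[bin_index * bin_width].append(1 if died else 0)
    return {
        centre: (len(flags), sum(flags) / len(flags))
        for centre, flags in sorted(groups.items())
    }


# Toy records (invented, arbitrary dose units).
toy_patients = [
    ([1.0, 1.0], [1.0, 1.0], False),
    ([2.0, 2.0], [2.0, 2.0], False),
    ([1.0, 2.0], [1.0, 2.0], True),
    ([3.0, 3.0], [1.0, 1.0], True),
    ([0.0, 0.0], [2.0, 2.0], True),
]
for centre, (n, mortality) in mortality_by_dose_excess(toy_patients).items():
    print(f"excess {centre:+.1f}: n={n}, mortality={mortality:.2f}")
```

With real data, the bin centred on zero corresponds to patients whose given doses matched the recommendation on average; the toy records here are invented and carry no clinical meaning.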

Data availability

MIMIC-III is openly available. Access to the eRI data is restricted to the Philips eICU Research Institute. The eICU Collaborative Research Database contains a sample of over 200,000 patient stays from the eRI database that is freely available. The databases were queried in pgAdmin 4 v 1.3, and computations were implemented in Matlab R2017a (MathWorks, Inc.). Access to the computer code used in this research is available by request to the corresponding authors. To facilitate the reproduction of our results, we provide the list of anonymous patient identifiers for both databases in Supplementary Data 1 and 2.


References

  1. Gotts, J. E. & Matthay, M. A. Sepsis: pathophysiology and clinical management. Br. Med. J. 353, i1585 (2016).
  2. Torio, C. M. & Andrews, R. M. National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2011: Statistical Brief #160. in Healthcare Cost and Utilization Project (HCUP) Statistical Briefs (Agency for Health Care Research and Quality, Rockville, MD, USA, 2013).
  3. Liu, V. et al. Hospital deaths in patients with sepsis from 2 independent cohorts. J. Am. Med. Assoc. 312, 90–92 (2014).
  4. Byrne, L. & Van Haren, F. Fluid resuscitation in human sepsis: time to rewrite history? Ann. Intensive Care 7, 4 (2017).
  5. Marik, P. E. The demise of early goal-directed therapy for severe sepsis and septic shock. Acta Anaesthesiol. Scand. 59, 561–567 (2015).
  6. Marik, P. & Bellomo, R. A rational approach to fluid therapy in sepsis. Br. J. Anaesth. 116, 339–349 (2016).
  7. Singer, M. et al. The third international consensus definitions for sepsis and septic shock (sepsis-3). J. Am. Med. Assoc. 315, 801–810 (2016).
  8. Waechter, J. et al. Interaction between fluids and vasoactive agents on mortality in septic shock: a multicenter, observational study. Crit. Care Med. 42, 2158–2168 (2014).
  9. Bai, X. et al. Early versus delayed administration of norepinephrine in patients with septic shock. Crit. Care. 18, 532 (2014).
  10. Marik, P. E., Linde-Zwirble, W. T., Bittner, E. A., Sahatjian, J. & Hansell, D. Fluid administration in severe sepsis and septic shock, patterns and outcomes: an analysis of a large national database. Intensive Care Med. 43, 625–632 (2017).
  11. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. 1st edn (MIT Press, Cambridge, MA, USA, 1998).
  12. Bennett, C. C. & Hauser, K. Artificial intelligence framework for simulating clinical decision-making: a Markov decision process approach. Artif. Intell. Med. 57, 9–19 (2013).
  13. Schaefer, A. J., Bailey, M. D., Shechter, S. M. & Roberts, M. S. Modeling Medical Treatment Using Markov Decision Processes. in Operations Research and Health Care (eds. Brandeau, M. L., Sainfort, F. & Pierskalla, W. P.) 593–612 (Springer, Boston, 2005).
  14. Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. J. Am. Med. Assoc. 316, 2402–2410 (2016).
  15. Prasad, N., Cheng, L.-F., Chivers, C., Draugelis, M. & Engelhardt, B. E. A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units. Preprint at (2017).
  16. Bothe, M. K. et al. The use of reinforcement learning algorithms to meet the challenges of an artificial pancreas. Expert. Rev. Med. Devices. 10, 661–673 (2013).
  17. Lowery, C. & Faisal, A. A. Towards efficient, personalized anesthesia using continuous reinforcement learning for propofol infusion control. in International IEEE/EMBS Conference on Neural Engineering 1414–1417 (IEEE, San Diego, CA, USA, 2013).
  18. Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).
  19. Elixhauser, A., Steiner, C., Harris, D. R. & Coffey, R. M. Comorbidity measures for use with administrative data. Med. Care 36, 8–27 (1998).
  20. Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming. (Wiley-Interscience, Hoboken, NJ, USA, 2014).
  21. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. 2nd edn (MIT Press, Cambridge, MA, USA, 2018).
  22. Thomas, P. S., Theocharous, G. & Ghavamzadeh, M. High-Confidence Off-Policy Evaluation. in Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI, Palo Alto, CA, USA, 2015).
  23. Hanna, J. P., Stone, P. & Niekum, S. Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation. Preprint at (2016).
  24. Thomas, P. S., Theocharous, G. & Ghavamzadeh, M. High confidence policy improvement. in Proceedings of the 32nd International Conference on Machine Learning 2380–2388 (PMLR, Lille, France, 2015).
  25. Acheampong, A. & Vincent, J.-L. A positive fluid balance is an independent prognostic factor in patients with sepsis. Crit. Care. 19, 251 (2015).
  26. Johnson, A. E. W. et al. Machine learning and decision support in critical care. Proc. IEEE Inst. Electr. Electron Eng. 104, 444–466 (2016).
  27. Vincent, J.-L. The future of critical care medicine: integration and personalization. Crit. Care Med. 44, 386–389 (2016).
  28. Chen, J. H. & Asch, S. M. Machine learning and prediction in medicine—beyond the peak of inflated expectations. N. Engl. J. Med. 376, 2507–2509 (2017).
  29. Gordon, A. C. et al. Levosimendan for the prevention of acute organ dysfunction in sepsis. N. Engl. J. Med. 375, 1638–1648 (2016).
  30. Ranieri, V. M. et al. Drotrecogin alfa (activated) in adults with septic shock. N. Engl. J. Med. 366, 2055–2064 (2012).
  31. Seymour, C. W. et al. Assessment of clinical criteria for sepsis: For the third international consensus definitions for sepsis and septic shock (sepsis-3). J. Am. Med. Assoc. 315, 762–774 (2016).
  32. Raith, E. P. et al. Prognostic accuracy of the SOFA Score, SIRS Criteria, and qSOFA score for in-hospital mortality among adults with suspected infection admitted to the intensive care unit. J. Am. Med. Assoc. 317, 290–300 (2017).
  33. Hug, C. W. Detecting hazardous intensive care patient episodes using real-time mortality models. PhD thesis, Massachusetts Institute of Technology. (2009).
  34. Tutz, G. & Ramzan, S. Improved methods for the imputation of missing data by nearest neighbor methods. Comput. Stat. Data. Anal. 90, 84–99 (2015).
  35. Arthur, D. & Vassilvitskii, S. K-means++: The Advantages of Careful Seeding. in Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms 1027–1035 (Society for Industrial and Applied Mathematics, Philadelphia, 2007).
  36. Jones, R. H. Bayesian information criterion for longitudinal and clustered data. Stat. Med. 30, 3050–3056 (2011).
  37. Brown, S. M. et al. Survival after shock requiring high-dose vasopressor therapy. Chest 143, 664–671 (2013).
  38. Norris, J. R. Discrete-time Markov chains. in Markov Chains (Cambridge University Press, Cambridge, MA, USA, 1997).
  39. Jiang, N. & Li, L. Doubly robust off-policy value evaluation for reinforcement learning. Preprint at (2015).
  40. Thomas, P. S. & Brunskill, E. Data-efficient off-policy policy evaluation for reinforcement learning. Preprint at (2016).
  41. Precup, D., Sutton, R. S. & Singh, S. P. Eligibility Traces for off-policy policy evaluation. in Proceedings of the Seventeenth International Conference on Machine Learning 759–766 (Morgan Kaufmann Publishers Inc., Burlington, MA, USA, 2000).
  42. Munos, R., Stepleton, T., Harutyunyan, A. & Bellemare, M. G. Safe and efficient off-policy reinforcement learning. Preprint at (2016).


We are grateful to F. Doshi-Velez and O. Gottesman for their assistance with the methodology. We are grateful for support from the National Institute of Health Research (NIHR) Comprehensive Biomedical Research Centre based at Imperial College Healthcare NHS Trust and Imperial College London. We are thankful to the Laboratory of Computational Physiology at the Massachusetts Institute of Technology and the eICU Research Institute for providing the data used in this research. M.K. and this project are funded by the Engineering and Physical Sciences Research Council and an Imperial College President’s PhD Scholarship. A.C.G. is funded by an NIHR Research Professorship award (RP-2015-06-018). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information




M.K., A.C.G. and A.A.F. conceived the overall study. M.K. and A.A.F. designed and conducted the experiments and analyzed the data. L.A.C. and O.B. contributed to the experimental design and analyses. O.B. provided key input in extracting and processing data from the eRI. All authors contributed to the interpretation of the results and M.K. drafted the manuscript, which was reviewed, revised and approved by all authors.

Corresponding authors

Correspondence to Anthony C. Gordon or A. Aldo Faisal.

Ethics declarations

Competing interests

The authors declare competing interests: A.C.G. reports that outside of this work he has received speaker fees from Orion Corporation Orion Pharma and Amomed Pharma. He has consulted for Ferring Pharmaceuticals, Tenax Therapeutics, Baxter Healthcare, Bristol-Myers Squibb and GSK, and received grant support from Orion Corporation Orion Pharma, Tenax Therapeutics and HCA International with funds paid to his institution. L.A.C. receives funding from Philips Healthcare. O.B. is an employee of Philips Healthcare. A.A.F. has received funding from Fresenius-KABI. M.K. does not have competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Tables 1–3

Reporting Summary

Supplementary Data 1

MIMIC-III patient identifiers

Supplementary Data 2

eRI patient identifiers

Cite this article

Komorowski, M., Celi, L.A., Badawi, O. et al. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nat Med 24, 1716–1720 (2018).
