The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care


Sepsis is the third leading cause of death worldwide and the main cause of mortality in hospitals [1,2,3], but the best treatment strategy remains uncertain. In particular, evidence suggests that current practices in the administration of intravenous fluids and vasopressors are suboptimal and likely induce harm in a proportion of patients [1,4,5,6]. To tackle this sequential decision-making problem, we developed a reinforcement learning agent, the Artificial Intelligence (AI) Clinician, which extracted implicit knowledge from an amount of patient data that exceeds by many-fold the lifetime experience of human clinicians and learned optimal treatment by analyzing a myriad of (mostly suboptimal) treatment decisions. We demonstrate that the value of the AI Clinician’s selected treatment is on average reliably higher than that of human clinicians. In a large validation cohort independent of the training data, mortality was lowest in patients for whom clinicians’ actual doses matched the AI decisions. Our model provides individualized and clinically interpretable treatment decisions for sepsis that could improve patient outcomes.
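The abstract frames treatment as a sequential decision-making problem solved by a reinforcement learning agent, but does not spell out the algorithm. As a rough illustration only, the sketch below assumes a small discrete Markov decision process whose transitions and rewards are estimated by counting logged (state, action, reward, next state) tuples, and then solves it with policy iteration. The function names `estimate_mdp` and `policy_iteration`, the discount factor, and the three-state toy data are all hypothetical and are not taken from the article.

```python
import numpy as np

def estimate_mdp(transitions, n_states, n_actions):
    """Count-based estimate of P[s, a, s'] and R[s, a] from logged tuples."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, r, s_next in transitions:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
    visits = counts.sum(axis=2)
    P = np.zeros_like(counts)
    for s in range(n_states):
        for a in range(n_actions):
            if visits[s, a] > 0:
                P[s, a] = counts[s, a] / visits[s, a]
            else:
                # Assumption: unobserved (s, a) pairs become zero-reward self-loops
                # so that every row of P is a valid probability distribution.
                P[s, a, s] = 1.0
    R = np.divide(reward_sum, visits, out=np.zeros_like(reward_sum), where=visits > 0)
    return P, R

def policy_iteration(P, R, gamma=0.99, max_iter=1000):
    """Alternate exact policy evaluation and greedy improvement until stable."""
    n_states = P.shape[0]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi.
        P_pi = P[idx, policy]
        R_pi = R[idx, policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q[s, a].
        Q = R + gamma * (P @ V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

# Toy example (invented, not patient data): state 0 is "unwell",
# state 1 is "recovered" (absorbing), state 2 is "died" (absorbing).
toy_transitions = [
    (0, 0, -1.0, 2), (0, 1, 1.0, 1),
    (1, 0, 0.0, 1), (1, 1, 0.0, 1),
    (2, 0, 0.0, 2), (2, 1, 0.0, 2),
]
P, R = estimate_mdp(toy_transitions, n_states=3, n_actions=2)
policy, V = policy_iteration(P, R)
```

In this toy setting the learned policy picks action 1 in state 0, because that is the only action observed to lead to the rewarded state. A learned policy of this kind can only be as good as the coverage of the logged data, which is why the abstract stresses evaluation on an independent validation cohort.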


Fig. 1: Data flow of the AI Clinician.
Fig. 2: Selection of the best AI policy and model calibration.
Fig. 3: Comparison of clinician and AI policies in eRI and average dose excess received per patient of both drugs in eRI with corresponding mortality.

Data availability

MIMIC-III is openly available. Access to the eRI data is restricted to the Philips eICU Research Institute. The eICU Collaborative Research Database, which is freely available, contains a sample of over 200,000 patient stays from the eRI database. The databases were queried in pgAdmin 4 v1.3, and computations were implemented in Matlab R2017a (MathWorks, Inc.). The computer code used in this research is available on request from the corresponding authors. To facilitate the reproduction of our results, we provide the list of anonymous patient identifiers for both databases in Supplementary Data 1 and 2.


References

1. Gotts, J. E. & Matthay, M. A. Sepsis: pathophysiology and clinical management. Br. Med. J. 353, i1585 (2016).

2. Torio, C. M. & Andrews, R. M. National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2011: Statistical Brief #160. in Healthcare Cost and Utilization Project (HCUP) Statistical Briefs (Agency for Health Care Research and Quality, Rockville, MD, USA, 2013).

3. Liu, V. et al. Hospital deaths in patients with sepsis from 2 independent cohorts. J. Am. Med. Assoc. 312, 90–92 (2014).

4. Byrne, L. & Van Haren, F. Fluid resuscitation in human sepsis: time to rewrite history? Ann. Intensive Care 7, 4 (2017).

5. Marik, P. E. The demise of early goal-directed therapy for severe sepsis and septic shock. Acta Anaesthesiol. Scand. 59, 561–567 (2015).

6. Marik, P. & Bellomo, R. A rational approach to fluid therapy in sepsis. Br. J. Anaesth. 116, 339–349 (2016).

7. Singer, M. et al. The third international consensus definitions for sepsis and septic shock (sepsis-3). J. Am. Med. Assoc. 315, 801–810 (2016).

8. Waechter, J. et al. Interaction between fluids and vasoactive agents on mortality in septic shock: a multicenter, observational study. Crit. Care Med. 42, 2158–2168 (2014).

9. Bai, X. et al. Early versus delayed administration of norepinephrine in patients with septic shock. Crit. Care 18, 532 (2014).

10. Marik, P. E., Linde-Zwirble, W. T., Bittner, E. A., Sahatjian, J. & Hansell, D. Fluid administration in severe sepsis and septic shock, patterns and outcomes: an analysis of a large national database. Intensive Care Med. 43, 625–632 (2017).

11. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. 1st edn (MIT Press, Cambridge, MA, USA, 1998).

12. Bennett, C. C. & Hauser, K. Artificial intelligence framework for simulating clinical decision-making: a Markov decision process approach. Artif. Intell. Med. 57, 9–19 (2013).

13. Schaefer, A. J., Bailey, M. D., Shechter, S. M. & Roberts, M. S. Modeling Medical Treatment Using Markov Decision Processes. in Operations Research and Health Care (eds. Brandeau, M. L., Sainfort, F. & Pierskalla, W. P.) 593–612 (Springer, Boston, 2005).

14. Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. J. Am. Med. Assoc. 316, 2402–2410 (2016).

15. Prasad, N., Cheng, L.-F., Chivers, C., Draugelis, M. & Engelhardt, B. E. A Reinforcement Learning Approach to Weaning of Mechanical Ventilation in Intensive Care Units. Preprint at (2017).

16. Bothe, M. K. et al. The use of reinforcement learning algorithms to meet the challenges of an artificial pancreas. Expert Rev. Med. Devices 10, 661–673 (2013).

17. Lowery, C. & Faisal, A. A. Towards efficient, personalized anesthesia using continuous reinforcement learning for propofol infusion control. in International IEEE/EMBS Conference on Neural Engineering 1414–1417 (IEEE, San Diego, CA, USA, 2013).

18. Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).

19. Elixhauser, A., Steiner, C., Harris, D. R. & Coffey, R. M. Comorbidity measures for use with administrative data. Med. Care 36, 8–27 (1998).

20. Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming (Wiley-Interscience, Hoboken, NJ, USA, 2014).

21. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. 2nd edn (MIT Press, Cambridge, MA, USA, 2018).

22. Thomas, P. S., Theocharous, G. & Ghavamzadeh, M. High-Confidence Off-Policy Evaluation. in Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI, Palo Alto, CA, USA, 2015).

23. Hanna, J. P., Stone, P. & Niekum, S. Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation. Preprint at (2016).

24. Thomas, P. S., Theocharous, G. & Ghavamzadeh, M. High confidence policy improvement. in Proceedings of the 32nd International Conference on Machine Learning 2380–2388 (PMLR, Lille, France, 2015).

25. Acheampong, A. & Vincent, J.-L. A positive fluid balance is an independent prognostic factor in patients with sepsis. Crit. Care 19, 251 (2015).

26. Johnson, A. E. W. et al. Machine learning and decision support in critical care. Proc. IEEE Inst. Electr. Electron Eng. 104, 444–466 (2016).

27. Vincent, J.-L. The future of critical care medicine: integration and personalization. Crit. Care Med. 44, 386–389 (2016).

28. Chen, J. H. & Asch, S. M. Machine learning and prediction in medicine—beyond the peak of inflated expectations. N. Engl. J. Med. 376, 2507–2509 (2017).

29. Gordon, A. C. et al. Levosimendan for the prevention of acute organ dysfunction in sepsis. N. Engl. J. Med. 375, 1638–1648 (2016).

30. Ranieri, V. M. et al. Drotrecogin alfa (activated) in adults with septic shock. N. Engl. J. Med. 366, 2055–2064 (2012).

31. Seymour, C. W. et al. Assessment of clinical criteria for sepsis: For the third international consensus definitions for sepsis and septic shock (sepsis-3). J. Am. Med. Assoc. 315, 762–774 (2016).

32. Raith, E. P. et al. Prognostic accuracy of the SOFA Score, SIRS Criteria, and qSOFA score for in-hospital mortality among adults with suspected infection admitted to the intensive care unit. J. Am. Med. Assoc. 317, 290–300 (2017).

33. Hug, C. W. Detecting hazardous intensive care patient episodes using real-time mortality models. PhD thesis, Massachusetts Institute of Technology (2009).

34. Tutz, G. & Ramzan, S. Improved methods for the imputation of missing data by nearest neighbor methods. Comput. Stat. Data Anal. 90, 84–99 (2015).

35. Arthur, D. & Vassilvitskii, S. K-means++: The Advantages of Careful Seeding. in Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms 1027–1035 (Society for Industrial and Applied Mathematics, Philadelphia, 2007).

36. Jones, R. H. Bayesian information criterion for longitudinal and clustered data. Stat. Med. 30, 3050–3056 (2011).

37. Brown, S. M. et al. Survival after shock requiring high-dose vasopressor therapy. Chest 143, 664–671 (2013).

38. Norris, J. R. Discrete-time Markov chains. in Markov Chains (Cambridge University Press, Cambridge, MA, USA, 1997).

39. Jiang, N. & Li, L. Doubly robust off-policy value evaluation for reinforcement learning. Preprint at (2015).

40. Thomas, P. S. & Brunskill, E. Data-efficient off-policy policy evaluation for reinforcement learning. Preprint at (2016).

41. Precup, D., Sutton, R. S. & Singh, S. P. Eligibility Traces for off-policy policy evaluation. in Proceedings of the Seventeenth International Conference on Machine Learning 759–766 (Morgan Kaufmann Publishers Inc., Burlington, MA, USA, 2000).

42. Munos, R., Stepleton, T., Harutyunyan, A. & Bellemare, M. G. Safe and efficient off-policy reinforcement learning. Preprint at (2016).


We are grateful to F. Doshi-Velez and O. Gottesman for their assistance with the methodology. We are grateful for support from the National Institute for Health Research (NIHR) Comprehensive Biomedical Research Centre based at Imperial College Healthcare NHS Trust and Imperial College London. We are thankful to the Laboratory of Computational Physiology at the Massachusetts Institute of Technology and the eICU Research Institute for providing the data used in this research. M.K. and this project are funded by the Engineering and Physical Sciences Research Council and an Imperial College President’s PhD Scholarship. A.C.G. is funded by an NIHR Research Professorship award (RP-2015-06-018). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.

Author information




M.K., A.C.G. and A.A.F. conceived the overall study. M.K. and A.A.F. designed and conducted the experiments and analyzed the data. L.A.C. and O.B. contributed to the experimental design and analyses. O.B. provided key input in extracting and processing data from the eRI. All authors contributed to the interpretation of the results and M.K. drafted the manuscript, which was reviewed, revised and approved by all authors.

Corresponding authors

Correspondence to Anthony C. Gordon or A. Aldo Faisal.

Ethics declarations

Competing interests

The authors declare competing interests: A.C.G. reports that outside of this work he has received speaker fees from Orion Corporation Orion Pharma and Amomed Pharma. He has consulted for Ferring Pharmaceuticals, Tenax Therapeutics, Baxter Healthcare, Bristol-Myers Squibb and GSK, and received grant support from Orion Corporation Orion Pharma, Tenax Therapeutics and HCA International with funds paid to his institution. L.A.C. receives funding from Philips Healthcare. O.B. is an employee of Philips Healthcare. A.A.F. has received funding from Fresenius-KABI. M.K. does not have competing financial interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Tables 1–3

Reporting Summary

Supplementary Data 1

MIMIC-III patient identifiers

Supplementary Data 2

eRI patient identifiers

About this article

Cite this article

Komorowski, M., Celi, L.A., Badawi, O. et al. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nat Med 24, 1716–1720 (2018).
