Predicting the early risk of chronic kidney disease in patients with diabetes using real-world data


Diagnostic procedures, therapeutic recommendations, and medical risk stratifications are based on dedicated, strictly controlled clinical trials. However, a plethora of real-world medical data exists, for which the increase in data volume comes at the expense of completeness, uniformity, and control. Here, a case-by-case comparison shows that the predictive power of our real-world data–based model for diabetes-related chronic kidney disease exceeds that of published algorithms, which were derived from clinical study data.

Fig. 1: Results of the predictive algorithms.
Fig. 2: Role of sample size and imputation.

Data availability

Restrictions apply to the general availability of the data because of patient agreements and the nature of patient data. Data were used under license for the study presented in this manuscript. The IBM Explorys database is run by IBM, which makes the data available for secondary use (for example, scientific research) on a commercial basis. The INPC database is owned by the participating health institutions of the INPC. Access to the INPC can be provided for research purposes through the Regenstrief Institute Data Core.



The authors thank O. Quarder, C. Ringemann, P. Stephan (Roche Diabetes Care GmbH, Germany), and H. Mikulski (Roche Diabetes Care Spain, S.L.) for their continuing contributions to this work. We are grateful to T. Beck, S. Chittajallu, and S. Weinert (Roche Diabetes Care, Inc., USA) for their consultancy in the early phase of the investigation. The support from U. Günzel as well as H. Rincker and team (Roche Diabetes Care Deutschland, Germany) is highly appreciated. We are indebted to R. Daikeler, K. Kusterer, S. Waibel, and S. Zink (Germany) for their medical advice concerning our initial results. The research described in this manuscript was funded by Roche Diabetes Care GmbH and supplemented with in-kind contributions from Eli Lilly and Company (S.M.), Indiana Biosciences Research Institute (D.R.), and Regenstrief Institute, Inc. (T.S.).

Author information

S.R., A.A., A.B., and F.F.F. generated and validated the Roche/IBM algorithm. T.H. and H.K. performed independent validation and further analysis. S.M., D.R., T.S., and teams enabled data withdrawal and assessment. B.S., L.B., and R.H. provided consultation for the overall research project, which was led by W.P.

Correspondence to Wolfgang Petrich.

Ethics declarations

Competing interests

The authors declare the following potential conflicts of interest: T.H., B.S., W.P., S.R., and A.B. are inventors of a patent application related to the work described in this manuscript. T.H., H.K., B.S., R.H., and W.P. are employees of Roche Diabetes Care GmbH. S.R., A.A., A.B., L.B., and F.F.F. are employees of IBM Switzerland Ltd. S.M. is an employee of Eli Lilly and Company. Independent of his employment at Roche, W.P. is affiliated with Heidelberg University and is a member of the Faculty of Physics and Astronomy. T.S. is affiliated with Indiana University School of Medicine.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–3 and Supplementary Tables 1–7

Reporting Summary
