Abstract
Diagnostic procedures, therapeutic recommendations, and medical risk stratifications are based on dedicated, strictly controlled clinical trials. However, a plethora of real-world medical data exists, although the increase in data volume comes at the expense of completeness, uniformity, and control. Here, a case-by-case comparison shows that our model for diabetes-related chronic kidney disease, which is based on real-world data, has greater predictive power than published algorithms derived from clinical study data.
Data availability
Restrictions apply to the general availability of the data because of patient agreements and the nature of patient data. Data were used under license for the study presented in this manuscript. The IBM Explorys database is run by IBM, which makes the data available for secondary use (for example, scientific research) on a commercial basis. The INPC database is owned by the participating health institutions of the INPC. Access to the INPC can be provided for research purposes through the Regenstrief Institute Data Core.
Acknowledgements
The authors thank O. Quarder, C. Ringemann, P. Stephan (Roche Diabetes Care GmbH, Germany), and H. Mikulski (Roche Diabetes Care Spain, S.L.) for their continuing contributions to this work. We are grateful to T. Beck, S. Chittajallu, and S. Weinert (Roche Diabetes Care, Inc., USA) for their consultancy in the early phase of the investigation. The support from U. Günzel as well as H. Rincker and team (Roche Diabetes Care Deutschland, Germany) is highly appreciated. We are indebted to R. Daikeler, K. Kusterer, S. Waibel, and S. Zink (Germany) for their medical advice concerning our initial results. The research described in this manuscript was funded by Roche Diabetes Care GmbH and supplemented with in-kind contributions from Eli Lilly and Company (S.M.), Indiana Biosciences Research Institute (D.R.), and Regenstrief Institute, Inc. (T.S.).
Author information
Contributions
S.R., A.A., A.B., and F.F.F. generated and validated the Roche/IBM algorithm. T.H. and H.K. performed independent validation and further analysis. S.M., D.R., T.S., and teams enabled data withdrawal and assessment. B.S., L.B., and R.H. provided consultation for the overall research project, which was led by W.P.
Ethics declarations
Competing interests
The authors declare the following potential conflicts of interest: T.H., B.S., W.P., S.R., and A.B. are inventors of a patent application related to the work described in this manuscript. T.H., H.K., B.S., R.H., and W.P. are employees of Roche Diabetes Care GmbH. S.R., A.A., A.B., L.B., and F.F.F. are employees of IBM Switzerland Ltd. S.M. is an employee of Eli Lilly and Company. Independent of his employment at Roche, W.P. is affiliated with Heidelberg University and is a member of the Faculty of Physics and Astronomy. T.S. is affiliated with Indiana University School of Medicine.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–3 and Supplementary Tables 1–7
About this article
Cite this article
Ravizza, S., Huschto, T., Adamov, A. et al. Predicting the early risk of chronic kidney disease in patients with diabetes using real-world data. Nat Med 25, 57–59 (2019). https://doi.org/10.1038/s41591-018-0239-8
DOI: https://doi.org/10.1038/s41591-018-0239-8