A valid machine learning model is predictive, but a predictive model is not necessarily valid. The gap between the two can be larger than many practitioners expect.
Acknowledgements
This work has been supported by ARC Discovery Project grant DP170101306 and NHMRC grant 1123042.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Li, J., Liu, L., Le, T.D. et al. Accurate data-driven prediction does not mean high reproducibility. Nat Mach Intell 2, 13–15 (2020). https://doi.org/10.1038/s42256-019-0140-2
DOI: https://doi.org/10.1038/s42256-019-0140-2
This article is cited by
- Machine learning in concrete science: applications, challenges, and best practices. npj Computational Materials (2022)
- Uncovering the roles of microRNAs/lncRNAs in characterising breast cancer subtypes and prognosis. BMC Bioinformatics (2021)
- Integration of mechanistic immunological knowledge into a machine learning pipeline improves predictions. Nature Machine Intelligence (2020)