Machine learning is a popular tool in ecology but many scientific applications suffer from data leakage, causing misleading results. We highlight common pitfalls in ecological machine-learning methods and argue that discipline-specific model info sheets must be developed to aid in model evaluations.
Acknowledgements
We were supported by a Liber Ero Postdoctoral Fellowship (A.S.) and NSERC Discovery Grant RGPIN-2020-05032 (K.M.A.C.).
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks the anonymous reviewers for their contribution to the peer review of this work.
About this article
Cite this article
Stock, A., Gregr, E.J. & Chan, K.M.A. Data leakage jeopardizes ecological applications of machine learning. Nat Ecol Evol 7, 1743–1745 (2023). https://doi.org/10.1038/s41559-023-02162-1
DOI: https://doi.org/10.1038/s41559-023-02162-1
This article is cited by
- Within-season vegetation indices and yield stability as a predictor of spatial patterns of Maize (Zea mays L) yields
Precision Agriculture (2024)