Data science systems (DSSs) are a fundamental tool in many areas of research and are now being developed by people with a myriad of backgrounds. This is coupled with a crisis in the reproducibility of such DSSs, despite the wide availability of powerful tools for data science and machine learning over the past decade. We believe that perverse incentives and a lack of widespread software engineering skills are among the many causes of this crisis, and we analyse why software engineering and building large complex systems are, in general, hard. Based on these insights, we identify how software engineering addresses those difficulties and how one might apply and generalize software engineering methods to make DSSs more fit for purpose. We advocate two key development philosophies: one should incrementally grow DSSs rather than plan and then build them, and one should use two types of feedback loop during development, one that tests the code’s correctness and another that evaluates the code’s efficacy.
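As a minimal sketch of the two feedback loops named in the abstract, the Python below pairs a correctness check with an efficacy evaluation for a toy pipeline. Everything in it is a hypothetical illustration rather than the authors' method: the function names (`standardise`, `predict`, `check_correctness`, `evaluate_efficacy`), the threshold classifier, the majority-class baseline and the choice of accuracy as the metric are all assumptions made for the example.

```python
import math


def standardise(values):
    """Scale values to zero mean and unit (population) variance."""
    mu = sum(values) / len(values)
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    if sd == 0:
        raise ValueError("cannot standardise constant data")
    return [(v - mu) / sd for v in values]


def predict(values, threshold=0.0):
    """Toy model (assumption): label 1 if the standardised value exceeds the threshold."""
    return [1 if z > threshold else 0 for z in standardise(values)]


def check_correctness():
    """Feedback loop 1: does the code do what it claims to do?"""
    z = standardise([1.0, 2.0, 3.0, 4.0])
    assert math.isclose(sum(z) / len(z), 0.0, abs_tol=1e-9)
    assert math.isclose(sum(v * v for v in z) / len(z), 1.0, rel_tol=1e-9)
    return True


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def evaluate_efficacy(values, labels):
    """Feedback loop 2: is the system useful, compared with a trivial baseline?"""
    model_acc = accuracy(predict(values), labels)
    majority = max(set(labels), key=labels.count)
    baseline_acc = accuracy([majority] * len(labels), labels)
    return model_acc, baseline_acc


if __name__ == "__main__":
    check_correctness()
    # Made-up illustrative data, not results from the article.
    model_acc, baseline_acc = evaluate_efficacy(
        [0.1, 0.4, 0.35, 0.8, 0.9, 0.75], [0, 0, 0, 1, 1, 1]
    )
    print(f"model accuracy={model_acc:.2f}, baseline accuracy={baseline_acc:.2f}")
```

The two loops are kept separate on purpose: a pipeline can pass every correctness check and still fail to beat a trivial baseline, and a strong metric can hide a bug in the code, so each loop catches failures the other would miss.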
We are grateful to the EU/EFPIA Innovative Medicines Initiative project DRAGON (101005122; S.D. and M.R., AIX-COVNET, C.-B.S.), Trinity Challenge BloodCounts! project (M.R., J.G. and C.-B.S.), EPSRC Cambridge Mathematics of Information in Healthcare Hub EP/T017961/1 (M.R., J.H.F.R., J.A.D.A. and C.-B.S.), Cantab Capital Institute for the Mathematics of Information (C.-B.S.), the European Research Council for Horizon 2020 grant no. 777826 (C.-B.S.), the Alan Turing Institute (C.-B.S.), the Wellcome Trust (J.H.F.R.), Cancer Research UK Cambridge Centre (C9685/A25177; C.-B.S.), the British Heart Foundation (J.H.F.R.), NIHR Cambridge Biomedical Research Centre (J.H.F.R.), HEFCE (J.H.F.R.), Leverhulme Trust project on ‘Breaking the non-convexity barrier’ (C.-B.S.), the Philip Leverhulme Prize (C.-B.S.), EPSRC grants EP/S026045/1 and EP/T003553/1 (C.-B.S.) and the Wellcome Innovator Award RG98755 (C.-B.S.). We are also grateful to Intel for financial support, I. Selby for creative input, and J.-C. Lohmann, S. Griffith, J. Tang and F. Zhang for comments and discussions.
The authors declare no competing interests.
Peer review information
Nature Machine Intelligence thanks Ben MacArthur and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Cite this article
Dittmer, S., Roberts, M., Gilbey, J. et al. Navigating the development challenges in creating complex data systems. Nat Mach Intell 5, 681–686 (2023). https://doi.org/10.1038/s42256-023-00665-x