The worldwide health emergency caused by the spread of the SARS-CoV-2 virus has quickly mobilized scientists, and among those stepping up are machine learning researchers. In many ways, the field has been more than ready to answer the call to action. Having advanced at a fast pace in the past two decades, the machine learning community is accustomed to prompt sharing of results, preprint posting and open sourcing of code. Moreover, the past few years have seen a growing interest from the community in putting machine learning models to use in ways that benefit society and promote sustainable development goals. Major machine learning conferences like the Conference on Neural Information Processing Systems and the International Conference on Machine Learning have run popular workshops on ‘AI for social good’ applications.

There are many ways that machine learning researchers can contribute, such as in epidemic modelling, diagnosis, predicting patient outcomes and triaging, drug discovery, detecting misinformation on social media and identifying regions where aid is most needed in low-income countries. But with this surge of attention, blind spots have become visible. Across three Comments in this issue (by Miguel Luengo-Oroz et al., Yipeng Hu et al. and Nathan Peiffer-Smadja et al.), experts highlight the challenges that need to be tackled before AI can have a beneficial global impact. As Luengo-Oroz et al. point out, a first challenge is knowing where to start with developing AI tools that can be most effective, which requires close cooperation with practitioners at the healthcare frontline. The best solutions may involve adapting already validated systems rather than building new tools from scratch. Furthermore, Hu et al. describe how clinical needs are evolving as the pandemic is moving through different stages, from early detection and anticipation, to containment and mitigation and finally eradication. During these transitions, the specific types of AI models may need to change too.

All three Comments emphasize that good prediction models need large, inclusive data collaborations. So far, many new predictive and diagnostic models have been developed based on locally available data pipelines. The generation of such new models can be valuable, but represents only a first step, and the translation to clinically useful applications in new environments requires further work and validation. A substantial, global collaborative effort is required to promote immediate sharing of well-documented, anonymized datasets to develop AI models that can be widely used.

Another challenge is that while sharing of code may be widely practised, this is not universally the case. The problem was highlighted prominently for the pandemic prediction model developed at Imperial College, which informed the UK government strategy, and whose developers acknowledged that the “thousands of lines of undocumented C” would require multiple days of training in order for others to use the code. Work quickly began to refactor and document the code given the significance of the model predictions.

This case points to a related issue, which is that even if code is available, this does not necessarily translate into reusability. To make code reproducible and useful for wider implementation, transparent documentation of model design, assumptions, inputs and hyperparameters is needed, along with hardware requirements and licensing details. However, the right incentives need to be in place for researchers to focus more on developing reusable software and constructing high-quality datasets, rather than on reporting novel performance results. The San Francisco Declaration on Research Assessment, signed by over 1,950 organizations (including Springer Nature), encourages a rethinking of how scientific research outputs are evaluated beyond conventional journal metrics.

There are many lessons to be learned for the world of scientific research from this pandemic, and that includes scientific publishing, which needs to cope with the overwhelming number of research papers that have been produced on COVID-19. Researchers in robotics had their wake-up call in 2011 after tsunami waves struck Japan’s Fukushima Daiichi nuclear power plant. Despite impressive robot demonstrations in the preceding decade, robots turned out to be of little help in the most urgent stages of the disaster, to the disappointment of many researchers. The field came together to develop robots better equipped to deal with realistic, challenging environments and scenarios, such as by organizing disaster challenges that stimulate innovation and test robots’ readiness to deal with emergency situations.

Like a nuclear or natural disaster, the pandemic is fast moving and events are difficult to predict. Approaches are needed that can be quickly adapted and are of use in various local conditions and countries. The health emergency caused by COVID-19 quickly got the attention of the machine learning field. Now that many of the challenges around data and model sharing, local adaptation and prioritizing work where clinical needs are greatest have been identified, more AI solutions can be expected that are inclusive and will make a global impact.