A health official checks for Ebola symptoms by taking the temperature of passengers arriving at Mbandaka Airport in the Democratic Republic of the Congo, May 2018. Credit: Junior Kannah/AFP/Getty

The resurgence of Ebola virus in the Democratic Republic of the Congo this May is a stark reminder that no amount of DNA sequencing can tell us when or where the next virus outbreak will appear. More genome sequence data were obtained for the 2013–16 Ebola epidemic than for any other single disease outbreak. Still, health workers in Mbandaka, the country’s northwestern provincial capital, are scrambling to contain a growing number of cases.

Over the past 15 years or so, outbreaks caused by viruses such as Ebola, SARS and Zika have cost governments billions of US dollars. Combined with a perception among scientists, health workers and citizens that responses to outbreaks have been inadequate, this has fuelled what seems like a compelling idea: that if researchers can identify the next pandemic virus before the first case appears, communities could drastically improve strategies for control, and even stop a virus from taking hold1,2. Indeed, since 2009, the US Agency for International Development has spent US$170 million on evaluating the “feasibility of preemptively mitigating pandemic threats”1.

Various experts, including the three of us, have flagged problems with this approach3,4. Nonetheless, an ambitious biodiversity-based approach to outbreak prediction, the Global Virome Project, was announced in February this year, with its proponents soliciting $1.2 billion in funding from around the world (see ‘High stakes’). They estimate that other mammals and birds contain 1.67 million unknown viruses from the families of viruses that are most likely to jump to humans, and will use the funding to conduct a genomic survey of these unknown viruses, with the aim of predicting which might infect people1.

‘High stakes’ (graphic). Sources: NIH; Global Virome Project

Broad genomic surveys of animal viruses will almost certainly advance our understanding of virus diversity and evolution. In our view, however, they will be of little practical value when it comes to understanding and mitigating the emergence of disease.

We urge those working on infectious disease to focus funds and efforts on a much simpler and more cost-effective way to mitigate outbreaks — proactive, real-time surveillance of human populations.

The public has increasingly questioned the scientific credibility of researchers working on outbreaks. In the 2013–16 Ebola epidemic, for instance, the international response was repeatedly criticized for being too slow. And during the 2009 H1N1 influenza epidemic, people asked whether the severity of the virus had been overblown, and whether the stockpiling of pharmaceuticals was even necessary5. Making promises about disease prevention and control that cannot be kept will only further undermine trust.

Forecasting fallacy

Supporters of outbreak prediction maintain that if biologists genetically characterize all of the viruses circulating in animal populations (especially in groups such as bats and rodents that have previously acted as reservoirs for emerging viruses), they can determine which ones are likely to emerge next, and ultimately prevent them from doing so. With enough data, coupled with artificial intelligence and machine learning, they argue, the process could be similar to predicting the weather6.

A Congolese child washes her hands as a preventive measure against Ebola in DRC, May 2018. People in Mbandaka are taking extra precautionary measures to stop the spread of Ebola virus. Credit: Kenny Katombe/Reuters

Reams of data are available to train models to predict the weather. By contrast, it is exceedingly rare for viruses to emerge and cause outbreaks. Around 250 human viruses have been described, and only a small subset of these have caused major epidemics this century.

Advocates of prediction also argue that it will be possible to anticipate how likely a virus is to emerge in people on the basis of its sequence, and by using knowledge of how it interacts with cells (obtained, for instance, by studying the virus in human cell cultures).

This is misguided. Determining which of more than 1.6 million animal viruses are capable of replicating in humans and transmitting between them would require many decades’ worth of laboratory work in cell cultures and animals. Even if researchers managed to link each virus genome sequence to substantial experimental data, all sorts of other factors determine whether a virus jumps species and emerges in a human population, such as the distribution and density of animal hosts. Influenza viruses have circulated in horses since the 1950s and in dogs since the early 2000s, for instance7. These viruses have not emerged in human populations, and perhaps never will — for unknown reasons.

In short, there aren’t enough data on virus outbreaks for researchers to be able to accurately predict the next outbreak strain. Nor is there a good enough understanding of what drives viruses to jump hosts, making it difficult to construct predictive models.

Biodiversity-based prediction also ignores the fact that viruses are not fixed entities. New variants of RNA viruses appear every day. This speedy evolution means that surveys would need to be done continuously to be informative. The cost would dwarf the proposed $1.2-billion budget for one-time sequencing.

Even if it were possible to identify which viruses are likely to emerge in humans, thousands of candidates could end up being identified, each with a low probability of causing an outbreak. What should be done in that case? Costs would skyrocket if vaccines and therapeutics were proposed for even a handful of these.
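The base-rate problem behind this point can be made concrete with a rough sketch. Only the figure of 1.67 million unknown viruses comes from the estimate cited above; the fraction of viruses that are genuine threats and the accuracy of the classifier are hypothetical values chosen purely for illustration, and the function name `screen` is our own.

```python
# Illustrative base-rate calculation. Every rate below is a hypothetical
# assumption for the sake of the example, not an estimate from any study.

def screen(n_candidates, prevalence, sensitivity, specificity):
    """Return (flagged, true_hits, ppv) for a hypothetical risk classifier."""
    true_threats = n_candidates * prevalence
    harmless = n_candidates - true_threats
    true_hits = true_threats * sensitivity         # real threats correctly flagged
    false_alarms = harmless * (1.0 - specificity)  # harmless viruses wrongly flagged
    flagged = true_hits + false_alarms
    # Positive predictive value: chance that a flagged virus is a real threat.
    ppv = true_hits / flagged if flagged else 0.0
    return flagged, true_hits, ppv

N_VIRUSES = 1_670_000  # estimated unknown viruses, as cited in the text

flagged, true_hits, ppv = screen(
    N_VIRUSES,
    prevalence=1e-5,    # hypothetical: 1 in 100,000 viruses is a genuine threat
    sensitivity=0.90,   # hypothetical: classifier catches 90% of real threats
    specificity=0.99,   # hypothetical: classifier clears 99% of harmless viruses
)

print(f"viruses flagged as risky: {flagged:,.0f}")
print(f"genuine threats among them: {true_hits:,.1f}")
print(f"probability a flagged virus is a genuine threat: {ppv:.3%}")
```

Under these assumed values, a classifier that is right 99% of the time about harmless viruses would still flag many thousands of candidates, of which well under 1% would be genuine threats. Different assumptions change the numbers, but not the shape of the problem when true threats are this rare.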

Screen and sequence

Currently, the most effective and realistic way to fight outbreaks is to monitor human populations in the countries and locations that are most vulnerable to infectious disease. This can be done by local clinicians, health workers in non-governmental organizations such as Médecins Sans Frontières (MSF; also known as Doctors Without Borders), and global institutions such as the World Health Organization (WHO).

We advocate the detailed screening of people who are exhibiting symptoms that cannot easily be diagnosed. Such tests should use the latest sequencing technologies to characterize all the pathogens that have infected an individual — the human ‘infectome’8. To track previous infections, investigators should also assess each person’s immune response, by analysing components of their blood using broad-scale serology9.

Emerging diseases are commonly associated with population expansions — when people encroach on habitats occupied by animals — as well as with environmental disturbances and climate change. Deforestation, for instance, can promote human interactions with animals that carry new threats, and can increase encounters with new vector species such as ticks and mosquitoes10. Animal die-offs, for example that of bar-headed geese (Anser indicus) at Lake Qinghai in China in 2005 (which was caused by the H5N1 influenza virus), can also flag problem regions or emerging pathogens. Surveillance efforts should therefore focus on communities that live and work in such environments.

Identifying which pathogen is causing an outbreak is no longer the bottleneck it once was. It took researchers two years to identify HIV as the cause of AIDS in the early 1980s using microscopy and other techniques. By contrast, in 2012 it took only weeks for investigators using genomic technologies to discover the coronavirus that caused Middle East respiratory syndrome (MERS).

Rapid identification of viruses can be achieved only if such technologies — and the people trained to use them — are globally available, including in resource-limited regions where the risk of outbreaks might be higher. Thankfully, relevant capacity-building programmes are now beginning to be established, such as the Human Heredity and Health in Africa (H3Africa) Initiative, run by the UK Wellcome Trust and the US National Institutes of Health11.

Once an emerging outbreak virus has been identified, it needs to be analysed quickly to establish what type it is; which molecular mechanisms (such as receptor type) enable it to jump between individuals; how it spreads through human populations; and how it affects those infected. In other words, at least four kinds of analysis are needed: genomic, virological, epidemiological and clinical. And the data must be passed to key stakeholders, from researchers and health workers on the ground to international agencies such as the WHO and MSF. Data must be kept as free of restrictions as possible, within the constraints of patient privacy and other ethical considerations.

This will best be achieved through an established global network of highly trained local researchers, such as the WHO Global Outbreak Alert and Response Network (GOARN). Real-time tools for reconstructing and tracking outbreaks at the genomic level, such as portable sequencing devices, are improving fast8. Information gathered during recent outbreaks has quickly had tangible impacts on public-health decisions, largely owing to data generation and analysis by many research teams within days of people being infected12.

For instance, in the 2013–16 Ebola epidemic, genome sequencing of the virus proved that a person could sexually transmit the disease more than a year after becoming infected. This prompted the WHO to increase its recommended number of tests for persistent infection in survivors of the disease.

Ultimately, the challenge is to link genomic, clinical and epidemiological data within days of an outbreak being detected, including information about how people in an affected community are interacting. Such an open, collaborative approach to tackling the emergence of infectious disease is now possible. This is partly thanks to technology, but is mainly due to a shift in perception about the importance of this approach. At least in genomic epidemiology, there is a growing move towards real-time, open-access data and analysis, aided by the use of preprint servers and wikis such as Virological (http://virological.org). This type of collaborative effort can complement the work of agencies including the WHO and MSF, which focus predominantly on providing information, isolating those who have been infected, and so on.

So far, researchers have sampled little of the viral universe. Surveys of animals will undoubtedly result in the discovery of many thousands of new viruses. These data will benefit studies of diversity and evolution, and could tell us whether and why some pathogens might jump species boundaries more frequently than others. But, given the rarity of outbreaks and the complexity of host–pathogen interactions, it is arrogant to imagine that we could use such surveys to predict and mitigate the emergence of disease.

New viruses will continue to emerge unexpectedly. There is a lot we can and must do to be better prepared.