The spread of Zika virus in Brazil and the rest of the Americas since 2015 has been deeply troubling. Previously thought of as mild, Zika has revealed a potent capacity to damage developing neurological tissues, with devastating consequences for the children of infected mothers1. Three related papers2,3,4 provide much-needed insight into when, where and how the current Zika outbreak emerged and spread. This understanding was gained thanks to a mixture of high-tech molecular-biology and evolutionary techniques, and low-tech sample-collection efforts.

The focus of two of the studies, by Faria et al.2 (page 406) and Metsky et al.3 (page 411), converges on South America. The groups sequenced Zika genomes from people and from Aedes aegypti mosquitoes, which carry the virus. Inspired by the success of real-time sequencing efforts during the Ebola virus outbreak5, Faria and colleagues obtained several samples using a mobile sequencing laboratory deployed in Brazil. Together, these efforts produced more than 100 new genomes.

The groups used these genomes, along with some existing ones, to construct phylogenetic (evolutionary) trees of Zika in the Americas. In this way, they could reconstruct Zika's spread by following a trail of mutations — accumulated by virus strains that the authors sampled at different times and places — back to the outbreak's most recent common ancestor. These trees confirm previous evidence6 that northeastern Brazil is the outbreak's hub.

The Zika strain that founded the American outbreak was evidently introduced from the Pacific islands6, but the current studies cannot prove that transmission to Brazil was direct. Indeed, Faria et al. note that some of the deepest branches and earliest samples on the American Zika tree are from the Caribbean. Nonetheless, the collected genomes show that Zika was circulating in northeastern Brazil by late 2013 or early 2014 — more than a year before the first reported case in Brazil7. They also demonstrate that northeastern Brazil was the source of onward dispersal to several other countries, with an estimated 6–12-month lag between dispersal and initial detection in those regions (Fig. 1). These lag times are not unreasonable, given that it takes time for infection numbers to build up, and that the most obvious effects are seen in babies, born months after mothers have been infected.

Figure 1: Spread of Zika virus across the Americas.

In three studies2,3,4, collaborating groups sequenced Zika genomes taken from infected humans and mosquitoes that carried the virus, at different times and in different places across the Americas. By comparing the genetic variation between these sequences, they mapped viral spread in the region. Zika arrived in Brazil from the Pacific islands, although it is not clear whether this transmission was direct. The groups estimate that local transmission in Brazil began in late 2013 or early 2014 (possible time window indicated in blue), but the virus was not detected until mid-2015. From Brazil, Zika spread to other regions in the Americas, where initial detection again lagged months behind the estimated window for the start of local transmission. Introduction to Florida from the Caribbean seems to have occurred on several occasions. (Time frames depicted in this figure are rough estimates only; more-precise dates can be found in the papers.)

It would be a mistake to dismiss these findings because of the 'small' sample sizes involved. Sample numbers in phylogenetic analyses are not the same as sample sizes in, for example, clinical trials. A single sequence can prove the presence of a viral strain at an early time. And, as in the current work, just a handful of strains showing substantial genetic differences can provide compelling evidence for years of undetected circulation.
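The reasoning that a handful of divergent genomes can imply a long period of unseen circulation can be sketched with a back-of-the-envelope molecular-clock calculation. The snippet below is only an illustration, not the method used in the papers, which rely on far richer statistical phylogenetic models. It assumes a strict clock (a constant substitution rate on both lineages) and two genomes sampled at about the same time. The function name, the 10,000-site genome length and the rate of 1e-3 substitutions per site per year are assumptions chosen for round numbers, not values reported by the authors.

```python
def years_to_common_ancestor(differences, genome_length, rate_per_site_per_year):
    """Rough divergence time for two genomes sampled at about the same time.

    Assumes a strict molecular clock: both lineages accumulate substitutions
    at the same constant rate, so the pairwise distance is split evenly
    between the two branches leading back to the common ancestor.
    """
    if genome_length <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("genome_length and rate must be positive")
    distance_per_site = differences / genome_length
    return distance_per_site / (2.0 * rate_per_site_per_year)


# Illustrative inputs only (assumptions, not values from the papers).
GENOME_LENGTH = 10_000   # round figure for a genome of roughly 10 kb
CLOCK_RATE = 1e-3        # substitutions per site per year, assumed

for n_diff in (10, 20, 40):
    years = years_to_common_ancestor(n_diff, GENOME_LENGTH, CLOCK_RATE)
    print(f"{n_diff} differences -> about {years:.1f} years of separate evolution")
```

Under these assumed inputs, the estimate scales linearly with the number of differences, which is why even two or three well-separated genomes can push the inferred start of an outbreak back by a year or more.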

Faria et al. and Metsky et al. thus provide time points from which to compare the pre- and post-Zika incidence of microcephaly — a condition in which newborns have abnormally small heads and brains — and other Zika-associated symptoms in each affected region. This comparison will allow a better understanding of the effects of the virus. The groups' work also indicates that successful jumps out of Brazil may coincide with times at which seasonal and environmental factors are optimal for viral spread by A. aegypti.

This last point resonates with Grubaugh and colleagues' paper4 (page 401). These authors set out to determine how and when local transmission of Zika arose in Florida, again using phylogenetic trees from human- and mosquito-derived Zika genomes. They found evidence that Zika was introduced into Florida at least four times, several months before its presence was detected. The virus probably entered from Caribbean countries linked to Miami by substantial air and cruise-ship travel.

Miami may be unique among US cities in having the ingredients that favour Zika transmission: not only the presence of A. aegypti, which is found in many US cities, but also large numbers of people arriving from high-incidence Zika areas at times when the mosquitoes are prevalent. Nonetheless, Grubaugh and colleagues provide evidence that each 'successful' introduction failed to sustain a permanent infection in Miami. The transmission rate was below the crucial threshold of at least one secondary infection per primary infection, on average (a secondary infection being one contracted from another person in Miami, either through a mosquito or directly). By contrast, Faria et al. estimate that three secondary infections arose per primary infection in northeastern Brazil.
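The contrast between a transmission rate below one and a rate of three secondary infections per primary infection can be made concrete with a simple expected-value sketch. It treats each introduction as a chain in which every case produces, on average, r further cases, and it ignores seasonality, depletion of susceptible people and chance effects. The function names are hypothetical, and the value of 0.5 used for a Miami-like setting is an arbitrary assumption for illustration. Only the value of 3 for northeastern Brazil comes from the text above.

```python
def expected_cases_by_generation(r, generations):
    """Expected new infections in each generation, starting from one primary case."""
    if r < 0 or generations < 0:
        raise ValueError("r and generations must be non-negative")
    return [r ** g for g in range(generations + 1)]


def expected_total_cases(r):
    """Expected final size of a chain started by one case.

    The geometric series 1 + r + r^2 + ... converges to 1 / (1 - r)
    only when r < 1; at or above 1 the expected size is unbounded
    in this simplified model.
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    if r >= 1:
        return float("inf")
    return 1.0 / (1.0 - r)


# r = 0.5 is an assumed, illustrative sub-threshold value;
# r = 3 is the estimate quoted in the text for northeastern Brazil.
for label, r in (("below threshold (assumed 0.5)", 0.5), ("northeastern Brazil (3)", 3)):
    print(label)
    print("  cases per generation:", expected_cases_by_generation(r, 5))
    print("  expected total cases:", expected_total_cases(r))
```

In this simplified picture a sub-threshold chain shrinks every generation and ends after a small, finite number of cases, whereas a chain with three secondary infections per case triples each generation until something else limits it.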

These papers, along with a report this year on Ebola8, set a new standard for what can be achieved by studying disease outbreaks tantalizingly close to real time, using rapidly obtained genome sequences analysed in a powerful computational framework9. Such work is possible mostly through the sustained efforts of a fairly small number of scientists supported by modest grants from a few enlightened funders. These breakthroughs are not only impressive in themselves, but also expose large gaps in current approaches to detecting and responding to potentially catastrophic disease outbreaks. Systematic pathogen surveillance is within our grasp, but is still undervalued and underfunded relative to the magnitude of the threat.

A virus-as-wildfire metaphor comes to mind in this context (possibly because I used to be a forest firefighter). In fire-prone areas of North America, lightning is expected, storms are tracked and each strike is pinpointed. Planes fly out at first light to look for smoke near each strike point, and firefighters are on site the same morning. This mentality needs to be applied to emerging infectious diseases. The responses to the recent Ebola and Zika outbreaks undoubtedly involved great courage and ingenuity, but they have looked too much like valiant bucket brigades organized after the fire is out of control. We should be detecting such outbreaks within days or weeks through routine, massive, sequence-based approaches — not months or years later, when clinical symptoms have accumulated.

To do this will require investment in more-comprehensive screening and archiving of animal and human biological samples (perhaps piggybacking on the millions of samples collected for other purposes worldwide each year, then discarded). It will involve developing better ways to recover and amplify viral genetic material from low-quality samples3,10,11. And it can build on the techniques deployed in the current studies. We will not put out every new fire, but we will catch some, and improve our ability to respond to the ones that get away. Any illusions that this approach would be prohibitively expensive must be dispelled by the certainty of future outbreaks that will have billion- or trillion-dollar price tags and cause unacceptable human suffering.