Introduction

Dengue virus (DENV) has spread extensively over the past decade and now poses a threat to about one-third of the global human population, primarily inhabitants of tropical and subtropical regions where mosquito vectors of the genus Aedes are widely distributed1. Increases in human mobility and population growth, unplanned urbanization, globalization, climate change, and unsuccessful vector control programs have contributed to DENV’s expansion, making it a major public health threat at a global scale2. Infection with the virus causes a wide spectrum of clinical manifestations, including asymptomatic or mild self-limiting disease (denoted dengue with and without warning signs), or life-threatening disease characterized by vascular leakage and hemorrhagic symptoms and classified as severe dengue3. DENV is a single-stranded, positive-sense RNA virus with a genome of ~11 kb belonging to the family Flaviviridae (genus Flavivirus). Its genome encodes a polyprotein that is post-translationally processed into three structural (capsid, pre-membrane or membrane, and envelope) and seven non-structural (NS1, NS2a, NS2b, NS3, NS4a, NS4b, and NS5) proteins4.

DENV is classified into four antigenically distinct and genetically related serotypes (DENV1-4), each of which harbors several phylogenetically defined genotypes not exceeding 6% nucleotide divergence and often with differing spatio-temporal distributions5. To date, 19 DENV genotypes have been identified: five in DENV1 (1I-V), six in DENV2 (2I-VI), and four each in DENV3 (3I, 3II, 3III, 3V) and DENV4 (4I-IV)6. As serotype heterotypic infections are the highest risk factor for severe clinical outcome, the surveillance of circulating serotype diversity is pivotal to public health. Recent findings suggest that immunity generated by different genotypes of the same serotype may be more complex than previously assumed7, such that the contribution of genotype heterotypic infections to clinical outcomes and public health remains largely unknown.

DENV was successfully eradicated in many regions of South America during the mid-twentieth century8, with resurgence first reported in Brazil in the early 1980s in the northern state of Roraima9. Subsequently, the southeastern state of Rio de Janeiro played a pivotal role in the sequential resurgence of each serotype: DENV1 was reported to have been introduced there in 198610, DENV2 in 1990 simultaneously with the first epidemic of severe dengue disease11, DENV3 in 200012, and finally DENV4 in 201013. Brazil can now be considered hyperendemic for dengue, reporting the highest number of dengue cases worldwide, with approximately 16 million notified infections since the 1980s14.

Different serotypes have caused unexpectedly large epidemics in Brazil over the past 20 years, with particularly problematic outbreaks in 2002, 2008, 2010 and 2012–201315. In the years following the Zika virus epidemic (2017–2018) dengue reporting was surprisingly low15. However, the re-emergence of DENV2 in 2019 was followed by the report of a staggering 1,544,987 probable cases and 782 confirmed deaths (until epidemiological week 52 [EW52] of 2019)16. Even with a long history of DENV circulation and recent experience with large outbreaks of other arthropod-borne viruses (arboviruses; e.g., Zika, chikungunya, yellow fever), Brazil continues to struggle with effective mosquito control and public health preparedness17. In contrast, over the past five years, the country has become a world leader in real-time genomic surveillance of arboviruses, contributing greatly to the amount of genetic data available in public databases, as well as to understanding the molecular evolution, spread, and persistence of such viruses (e.g., https://www.zibraproject.org and https://www.zibra2project.org). Despite this significant progress in research capacity, expertise on the methodologies involved in real-time genomic surveillance is absent from higher-education programs and is largely inaccessible to the majority of local researchers and public health workers18.

Herein, we describe a research and educational initiative that included genomic surveillance in the field and in the classroom under a two-week training program in 2019, during the post-Zika resurgence of DENV in Brazil. This initiative was attended by a large number of participants from public health and higher-education institutions from across Latin America. During this initiative we generated and analyzed 227 novel complete genome sequences of DENV1 and DENV2. Informed by a mix of public and the newly generated data, climate and epidemiological time series, we used phylogenetic and epidemiological models to infer the recent transmission history of these serotypes in Brazil. From these data we describe a complex dynamic history that supports previously proposed events of viral movement, replacement, and co-circulation. By targeting public health and higher-education institutions, generating and analyzing most of the data in real-time within the training program, we provide a proof-of-concept of the unique opportunities that portable sequencing technologies offer for local capacity building.

Results

Dengue serotypes present temporal dynamics with an oscillatory behavior characterized by recurrent peak prevalence every 8-11 years19. Due to limitations in local testing capacity, inferring the relative prevalence of DENV1-4 with high spatio-temporal resolution in Brazil is often difficult. Nonetheless, our limited data (Supplementary Table 1) suggest that between 2015 and 2016 DENV1 was the dominant serotype in the Midwest, Northeast, and Southeast regions (Supplementary Fig. 1). Replacement in dominance by DENV2 took place in the Midwest region during 2017 and in the Southeast in subsequent years. Throughout the study period, DENV1 was the dominant serotype in the Northeast region. Although data-limited, these observations were supported by reports from other studies20,21,22, highlighting that the large epidemic of 2019 was dominated by serotypes DENV1 and DENV2. As we were aware of the dominance of DENV1 and DENV2 at the time of planning for the genomic surveillance initiative in 2019, data collection and analysis were focused on those two serotypes.

A total of 248 RT-qPCR positive samples for DENV1 (n = 62) and DENV2 (n = 186) were screened. Of the samples tested, 227 (DENV1 = 57, DENV2 = 170) contained sufficient DNA ( ≥ 2 ng/µL) to proceed to library preparation. For those positive samples, PCR cycle threshold (Ct) values were on average 22 (range: 15 to 35) for DENV1 and 23 (range: 16 to 35) for DENV2. Epidemiological details of the samples processed are provided in Supplementary Table 2. We used a portable nanopore sequencing approach23 to generate the complete genome sequences from the 227 viral samples. The resulting average coverage was 89% for DENV1 and 88% for DENV2. Sequencing statistics are detailed in Supplementary Table 3.

DENV1 sequences covered the time period August 2015 to July 2019 in the Brazilian Federal District (n = 2) and five other states (Bahia = 19; Goiás = 10; Minas Gerais = 11; Pernambuco = 11; São Paulo = 4). For DENV2, samples were collected between January 2016 and August 2019 in the Federal District (n = 1) and eight other states (Bahia = 20; Goiás = 31; Minas Gerais = 47; Mato Grosso do Sul = 59; Mato Grosso = 2; Pernambuco = 3; Rio de Janeiro = 1; São Paulo = 3) (Fig. 1a). Three older DENV2 samples from the Goiás state, sampled during 2008, were also included. The median age of patients was 28 years (range: 4 to 78 years of age) for DENV1 and 37 years (range: 4 to 90 years of age) for DENV2. With respect to the clinical outcome, 77% of DENV1 (44/57) and 81% of DENV2 (138/170) samples were obtained from patients without alarming clinical signs, while 4% of DENV1 (2/57) and 2% of DENV2 (4/170) corresponded to patients with severe dengue. Finally, 19% of DENV1 (11/57) and 17% of DENV2 (28/170) were obtained from fatal cases (Supplementary Table 2).

Fig. 1: Spatial and temporal distribution of reported dengue cases in Brazil, 2015–2020.

a Map of Brazil showing the number of DENV1 and DENV2 new sequences by region and state. DF = Federal District, BA = Bahia state, GO = Goiás state, MT = Mato Grosso state, MS = Mato Grosso do Sul state, MG = Minas Gerais state, PE = Pernambuco state, RJ = Rio de Janeiro state, SP = São Paulo state. The color and size of the circles indicate the number of new genomes generated in this study (black = DENV1, white = DENV2). b Weekly notified dengue cases normalized per 100 K individuals per region in 2015–2020 (until EW06). Epidemic curves are colored according to geographical macro region: SE = Southeast, NE = Northeast, MW = Midwest, N = North, S = South. c–e Time series of weekly reported cases normalized per 100 K individuals and daily mosquito-viral suitability measure (index P) in the three macro regions for which new sequences were generated: MW = Midwest (c), NE = Northeast (d), and SE = Southeast (e). The index P of each region is obtained using average climatic data for the three largest urban centers in each region. Purple bars highlight the dates of sample collection of the genomes generated here. b–e Incidence (cases per 100 K population) is presented in log10 for visual purposes. The initial map of Brazilian regions was obtained with the function “get_brmap” of the R package “brazilmaps” (available at: https://rdrr.io/cran/brazilmaps/man/get_brmap.html). Source data are provided as a Source Data file.

Weekly reported incidence (cases normalized per 100 K individuals) notified between 2015 and 2020 was aggregated into five Brazilian macro regions: North (N), South (S), Midwest (MW), Northeast (NE), and Southeast (SE) (Fig. 1b). Epidemic curves revealed three major outbreaks in all regions during early 2015, 2016, and mid-2019, with the Southeast region playing such a prominent role that its reported incidence was close to the overall incidence for Brazil (Fig. 1b, Supplementary Fig. 2). For the macro regions with new viral sequences (Midwest, Southeast and Northeast), we estimated transmission potential using a mosquito-viral suitability index. Since it is not possible to obtain representative temperature and humidity time series for the macro regions, we instead used climatic series from the three largest urban centers of each region. While the latter neglects climate variation within a macro region (Supplementary Fig. 3), the seasonal timing of reported cases was still well captured by the suitability index (Fig. 1c–e). The years 2015, 2016, and 2019 did not show particular increases in suitability, suggesting that the corresponding larger epidemics were not driven by particular climatic trends20, but rather by factors not accounted for by the index (e.g., sociodemographic, lack of herd immunity, change in circulating serotype or genotype, size of the mosquito population). There was also a clear decrease in reported cases across the regions during the aftermath of the Zika virus epidemic (2017–2018), a phenomenon also reported elsewhere20,24,25. In contrast, reporting of other arboviruses (e.g., chikungunya, Supplementary Fig. 4) saw increases in incidence during the same period. 
As such, we found no support for either climate driven reductions in transmission potential or arboviral reporting saturation that could explain the drop in DENV reporting between 2017 and 2018, suggesting that local herd immunity (serotype-specific or induced by Zika infection) may have played a role. Finally, apart from a few exceptions, the dates of the new sequences matched time periods of both high suitability and case reporting for all regions and were thus representative of epidemic periods (Fig. 1c–e).
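The normalization and aggregation described above (weekly cases per 100 K individuals, pooled by macro region, and shown in log10) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the counts and populations are invented, and the state-to-region mapping covers only the states listed in Fig. 1a.

```python
import math
from collections import defaultdict

# Macro region of each state abbreviation listed in Fig. 1a
STATE_TO_REGION = {
    "DF": "MW", "GO": "MW", "MT": "MW", "MS": "MW",
    "BA": "NE", "PE": "NE",
    "MG": "SE", "RJ": "SE", "SP": "SE",
}

def regional_incidence(cases_by_state, population_by_state):
    """Cases per 100,000 inhabitants, pooled by macro region.

    Cases and populations are summed within a region before dividing,
    so large states weigh more than small ones.
    """
    cases = defaultdict(int)
    population = defaultdict(int)
    for state, n in cases_by_state.items():
        region = STATE_TO_REGION[state]
        cases[region] += n
        population[region] += population_by_state[state]
    return {r: cases[r] / population[r] * 100_000 for r in cases}

def log10_or_none(value):
    """log10 for plotting; zero incidence has no logarithm, so return None."""
    return math.log10(value) if value > 0 else None

# Hypothetical counts and populations for one epidemiological week
cases_by_state = {"GO": 700, "DF": 300, "BA": 150}
population_by_state = {"GO": 7_000_000, "DF": 3_000_000, "BA": 15_000_000}
incidence = regional_incidence(cases_by_state, population_by_state)
```

Pooling before dividing is one possible choice; averaging state-level incidences would give a different regional value, and the text does not say which was used.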

Between 2015 and 2019, a total of 3,180 deaths attributed to DENV were reported in Brazil. We found a clear seasonal signal in weekly reported deaths that matched the seasonality of weekly reported cases and suitability for transmission (Supplementary Fig. 5). The year 2019 has been widely reported as experiencing a substantial increase in both the number of cases and deaths. Accordingly, we found the Midwest and Northeast regions had an increase in the absolute number of deaths in 2019 compared to previous years (Supplementary Fig. 5b and d). However, when the weekly (crude) case fatality rate (CFR = deaths/cases) was calculated there was no increase in the CFR during 2019 for any of the regions (Supplementary Fig. 5c, e, g). When aggregating the CFR between 2015 and 2019, we found the Midwest to have a higher CFR at 0.00084 (95% range 1.78 × 10⁻⁵ to 2.01 × 10⁻³) compared to 0.0006 for the Northeast (0–0.0030) and 0.00037 for the Southeast regions (0–0.0010) (only the Midwest versus Southeast and Northeast comparisons were statistically different using a Wilcoxon test; p-values 5.76 × 10⁻⁹ and 3.775 × 10⁻⁵, respectively).
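As a rough illustration of the crude CFR calculation above, the sketch below computes weekly CFR = deaths/cases and summarizes it with a mean and a 95% range. The counts are invented, the helper names are hypothetical, and the choices to skip weeks without cases and to use the mean as the point estimate are assumptions; the text does not specify either.

```python
from statistics import mean

def weekly_cfr(deaths, cases):
    """Crude CFR per week (deaths / cases); weeks with no cases are skipped."""
    if len(deaths) != len(cases):
        raise ValueError("deaths and cases must have the same length")
    return [d / c for d, c in zip(deaths, cases) if c > 0]

def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

# Hypothetical weekly counts for one region (illustrative, not study data)
cases = [1200, 5000, 0, 8000, 2500]
deaths = [1, 4, 0, 8, 1]

cfr = weekly_cfr(deaths, cases)
# Point estimate with the 2.5% and 97.5% quantiles of the weekly values
summary = (mean(cfr), quantile(cfr, 0.025), quantile(cfr, 0.975))
```

Regional series produced this way could then be compared with a rank-based test. In SciPy, `scipy.stats.mannwhitneyu` handles unpaired samples and `scipy.stats.wilcoxon` handles paired ones; which variant was applied here is not stated.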

DENV1 phylodynamics in Brazil, 2015 to 2019

To explore the phylodynamics of DENV1, we combined our 57 newly generated sequences with the DENV1 genotype V (DENV1-V) genomes available on GenBank (n = 444). Phylogenetic analysis revealed that the novel isolates were organized into three distinct clades, named hereafter as clades I, II, III (Fig. 2a). Clade I appeared to have been replaced by clades II–III during 2019. The DENV resurgence in 2019 was characterized by the co-circulation of two viral lineages with no particular signatures suggesting clade-related advantages. In light of this, non-virological factors such as the level of susceptibility in the population remained a plausible driver of the expansion of both clades.

Fig. 2: Phylogenetic analysis of DENV1-V in Brazil.

a Maximum likelihood (ML) phylogenetic analysis of 57 complete genome sequences from DENV1 generated in this study plus 444 publicly available sequences from GenBank. The scale bar is in units of nucleotide substitutions per site (s/s) and the tree is midpoint rooted. Colors represent different sampling locations. b Time-scaled phylogeographic tree of Clade I (including eight new sequences plus 25 GenBank sequences), Clade II (including 27 new sequences plus seven GenBank sequences), and Clade III (including 22 new sequences plus 12 GenBank sequences). Colors represent different sampling locations (SE Brazil = Brazilian Southeast, NE Brazil = Brazilian Northeast, MW = Brazilian Midwest, N Brazil = Brazilian North). Tip circles represent the genome sequences generated in this study; colored sidebars represent the dengue clinical classification for each sequenced sample.

To investigate the evolution of clades I-III in more detail, we used smaller data sets derived from each clade individually, which only include sequences closely related to the new Brazilian isolates (n = 33 for Clade I, n = 34 for Clade II and n = 34 for Clade III). An analysis of substitution rate constancy revealed a strong correlation between the sampling time and the root-to-tip divergence in all three data sets (Supplementary Fig. 6), allowing the use of molecular clock models to infer evolutionary parameters. Phylogeographic analyses of clade I (Fig. 2b) clustered the new sequences into a single well-supported monophyletic sub-clade (posterior probability support, PPS = 1.0), including isolates sampled between 2000 and 2018. New sequences in this clade were mainly from mild dengue cases registered in the SE region. The time to the most recent common ancestor (tMRCA) of all Brazilian sequences was estimated to be between June 1998 and February 2000, and this common ancestor likely originated in the North region (PPS = 1.0), after a dispersion of a virus from the British Virgin Islands (PPS = 1.0). Viruses from this clade spread from the North region towards the Southeast and then the Northeast, as indicated by isolates from the Pernambuco (PE) state (represented by JX669461, JX669465, and JX669464). The tMRCA of all isolates from the Southeast and Northeast regions in this clade was estimated to be between May 2006 and February 2008.

Similarly, an analysis of clade II (Fig. 2b) revealed a single well-supported monophyletic group (PPS = 1.0), including isolates from the Southeast, Midwest, and Northeast regions sampled between 2015 and 2019. The majority of the new sequences were from mild dengue cases, although three isolates were recovered from fatal cases in the Northeast. The tMRCA of this group was dated to between August 2007 and May 2010, with a likely origin in the North region. However, as the PPS value was low (0.39), the place of origin remains uncertain. After its introduction into the North region, viruses from this lineage appear to have moved towards the Southeast, Midwest, and Northeast. Notably, the clade appears to have persisted locally after its introduction in the Northeast region between July 2011 and June 2014, and from there a second introduction into the Midwest region may have occurred, as suggested by a single isolate (OPAS134) sampled in 2019.

Finally, clade III (Fig. 2b) also formed a single supported monophyletic group (PPS = 0.82) that contained sequences from the Southeast, Northeast, and Midwest regions sampled between 2011 and 2019. Among our new isolates, six were from fatal cases reported in the Northeast. We estimated the age of this sub-clade to be between October 2009 and August 2011, with a most likely origin in the Southeast region (PPS = 0.99). Since its introduction, the clade has circulated in the Southeast, from where it later dispersed to Paraguay. The Southeast region has also seemingly seeded outbreaks into the Northeast (PPS = 0.88) between February 2015 and September 2017, and subsequently towards the Midwest region between June 2016 and April 2018.

DENV2 phylodynamics in Brazil, 2016 to 2019

To explore the phylodynamics of DENV2 between 2016 and 2019, we performed a phylogenetic analysis of the 170 newly generated sequences plus 450 complete genome sequences of DENV2-III available on GenBank (Fig. 3). This analysis revealed four different clades (termed hereafter BR-1 to BR-4 clades)22. Notably, BR-1 contained Brazilian sequences sampled from 1990–2000, BR-2 from 2000–2006, and BR-3 from 2006–2019. The latter included six of our new isolates (4%, 6/170) collected during previous outbreaks in 2008 (Goiás = 3) and 2016 (São Paulo = 3). Finally, BR-4 contained the other 164 new sequences (97%, 164/170), sampled between 2016 and 2019. This phylogenetic pattern suggested that between 2016 and 2019, DENV2 circulated in these Brazilian regions with a succession of different viral clades, in particular with BR-3 preceding BR-4, and with older clades (BR-1, BR-2) either not being sampled in the most recent time-points or having experienced local extinction. These results were consistent with a recent study highlighting the role of the Caribbean region in the spread of the BR-4 clade into Brazil20,22 (Fig. 3a).

Fig. 3: Phylogenetic analysis of DENV2-III in Brazil.

a A maximum likelihood tree was inferred using 170 complete genome sequences from DENV2 generated in this study and 450 publicly available sequences retrieved from GenBank. The scale bar is in units of nucleotide substitutions per site (s/s) and the tree is midpoint rooted. Tip circles represent the genome sequences generated in this study. b Time-scaled phylogeographic tree of DENV2 BR-4, including 164 new DENV2 sequences generated here and 17 publicly available sequences from the 2019 outbreak in Brazil22. Sequences are colored according to sampling location. Sidebars represent the dengue clinical classification for each sequenced sample. c Temporal fluctuation of the effective reproduction number (Re) of DENV2 BR-4L1 (blue) and BR-4L2 (magenta) estimated using the Bayesian birth-death approach. Black line is the total weekly incidence of dengue between 2015 and 2020 (until EW06), and the dotted green line is the index P (incidence is summed and index P is averaged over the three macro regions for which new sequences were generated: MW = Midwest, NE = Northeast, and SE = Southeast). Incidence (cases per 100 K population) is presented in log10 for visual purposes.

Given the substantial number of novel sequences, we examined the BR-4 clade in more detail (Fig. 3b); a linear regression of root-to-tip genetic distance against sampling date revealed sufficient temporal signal (r² = 0.60; Supplementary Fig. 7). The BR-4 clade (PPS = 1.0) included 181 isolates from the Southeast, Northeast, and Midwest regions, 92% (167/181) of which were sampled during the 2019 outbreak and 8% (14/181) sampled between 2016 and 2018. Of these, 133 were from dengue without warning signs, while three isolates were recovered from severe dengue, and 28 were from fatal dengue cases. We identified two distinct BR-4 lineages circulating between 2017 and 2019, which we termed BR-4L1 and BR-4L2 (Fig. 3b). Both lineages contained sequences from the Northeast, Southeast, and Midwest regions. The tMRCA of BR-4L1 was estimated to be between September 2014 and June 2016, while the tMRCA of BR-4L2 was dated between March 2015 and November 2016. BR-4L2 contained a monophyletic cluster of isolates from the Midwest sampled between 2018 and 2019. We also observed that the other isolates from the Northeast and Midwest regions were intermixed throughout both lineages, indicating multiple introductions of DENV2 over time. Similar to recent studies22, we found the tMRCA of BR-4 to be between November 2013 and May 2015, likely in the Southeast (PPS = 0.99), from where the virus dispersed towards the Northeast and Midwest regions. Minas Gerais state, located in the Southeast, seems to have played an important role as source location, since sequences from this region (from 2016) fell close to the root of the clade (Fig. 3b).
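The temporal-signal check referred to above, a linear regression of root-to-tip genetic distance against sampling date, can be illustrated with a minimal least-squares sketch. The tip dates and distances below are invented, and the function name is hypothetical; this is not the software used in the study, only the underlying arithmetic.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least squares of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r_squared). Under a strict clock the slope
    approximates the substitution rate (substitutions/site/year) and the
    x-intercept approximates the date of the root.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    x_intercept = -intercept / slope if slope != 0 else float("nan")
    return slope, x_intercept, r_squared

# Hypothetical tips: decimal sampling year and root-to-tip distance (s/s)
dates = [2016.1, 2017.3, 2018.5, 2019.2, 2019.6]
distances = [0.0021, 0.0029, 0.0042, 0.0046, 0.0051]
slope, root_date, r2 = root_to_tip_regression(dates, distances)
```

A positive slope with a reasonably high r² is what is usually taken as sufficient signal to justify a molecular clock analysis; the acceptable threshold is a judgment call rather than a fixed rule.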

We identified 34 single nucleotide variants between the BR-4L1 and BR-4L2 lineages, only three of which resulted in amino acid substitutions. Isolates of the BR-4L1 acquired one unique amino acid substitution A447V (ENV protein), while only a few isolates of this lineage had a second amino acid substitution K719I (NS5 protein). All isolates of BR-4L2 acquired one unique amino acid substitution V553I (NS5 protein) (Supplementary Table 4).

We used a birth-death skyline (BDSKY) model to estimate the effective reproduction number (Re) of BR-4L1 and BR-4L2 (Fig. 3c). This provided evidence for three significant seasonal oscillations in Re (although with wide credible intervals), consistent with, but generally preceding, the time windows of reported outbreaks between 2016 and 2019 (Fig. 1c–e). Mosquito-viral suitability presented the same general patterns, but the timing of its oscillations was in between that of Re and incidence (Fig. 3c). In general, our estimates of Re for both lineages peaked at the end of each year, decreasing and remaining below 1 temporarily at the start of each following year (although again, with wide credible intervals). Notably, the time period with the largest Re for both lineages, at the end of 2018 and preceding the large epidemic of 2019 (in excess of 2.5 for BR-4L1 and 1.5 for BR-4L2), did not coincide with similar increases in suitability, once again suggesting that climate-related factors were not the drivers of epidemic success.

Discussion

More than 16 million cases of dengue disease have been notified since the early 1980s in Brazil9,14. Previous studies have explored the evolution of DENV1-4 in the Americas, mainly focusing on a restricted range of countries using partial genome sequences26,27,28,29. To obtain a better understanding of DENV evolution in Brazil, in particular during its resurgence in 2019 when DENV1-2 dominated reported cases, we generated 227 new complete genome sequences of both serotypes using portable sequencing. Importantly, more than three quarters of the new sequences were processed and analyzed during a Nanopore-based genome sequencing training and surveillance program that took place in Belo Horizonte, Minas Gerais state, in 2019. The new 227 sequences generated corresponded to 55% (57/104) of DENV1-V, and 60% (170/285) of DENV2-III Brazilian complete genomes that are currently available in public databases (Supplementary Fig. 8). This highlights the large contribution of the training initiative, but also the current shortage of complete genome data for both serotypes. There is clearly a need for continued funding for genomic surveillance which, as shown here, can contribute to a better understanding of the introduction, spread, and persistence of dengue viruses in Brazil during epidemics with significant public health impact.

Time series of reported cases between 2015 and 2019 showed the typical yearly seasonal patterns of dengue transmission. Reporting was low in 2017 and 2018, coinciding with the post-epidemic period of the Zika virus in Brazil. Such trends have also been reported in other countries20,24 and are speculated to be driven by transient cross-protection from exposure to Zika, and/or temporary saturation or changes in surveillance20,30. When comparing to reported cases of chikungunya virus and estimated mosquito-viral suitability in the same period, we found no evidence of changes in capacity for arboviral surveillance or climate driven low transmission potential in favor of other mechanisms (e.g., Zika cross-immunity). In contrast to this period of low circulation, there were three particularly large DENV epidemics: in 2015 and 2016 when DENV1 was dominant across all regions, and in 2019 when DENV1 was dominant in the Northeast but DENV2 was dominant in the Midwest and Southeast regions. Due to the increased likelihood of secondary infections, serotype replacement is often associated with measurable changes in the clinical spectrum of reported cases, with increases in both disease severity and number of deaths11,29,31. While we describe an increase in absolute case and death numbers in 2019, the emergence of DENV2 in the Southeast and Midwest regions was not associated with a significant increase in the case fatality rate compared to previous years.

Our newly generated DENV1 sequences were classified as genotype V and formed three distinct clades (I–III). This supports previous reports suggesting that such clades were responsible for the latest DENV1 outbreaks in Brazil27,28,31,32. Within clade I, only eight of the new sequences were sampled between 2016 and 2018, with many isolates preceding 2015. The shortage of genomic data in the intermediate years severely hampered our capacity to draw further conclusions, such as the possibility of a temporary lineage replacement event. In contrast, most of the new sequences within clades II–III were sampled in 2019, supporting the co-circulation of two DENV1 lineages in recent epidemics, a phenomenon often described in DENV epidemics27,28,31,32. Viruses from the three clades have been identified in both the Southeast and Northeast regions, although the most recent ones only appeared in the Southeast, while viruses from clades II-III were present in the Northeast and Midwest regions. While this suggested some structure in spatio-temporal circulation, we could not ascertain if it was simply due to biased sampling. Time estimates of the tMRCA of the new Brazilian isolates (2015–2019) were between May 2006 and February 2008 for clade I, between August 2007 and May 2010 for clade II, and between October 2009 and August 2011 for clade III. Such large ranges likely reflect the lack of genomic data during this period, reinforcing the need for more effective surveillance in Brazil.

All new DENV2 complete genome sequences belong to genotype III, which has been found in previous epidemics in Brazil22,33. Our results are in line with reports of three different lineages causing outbreaks in Brazil since 199032,33, and support a recent introduction of DENV2-III20,22. Viruses from this genotype grouped in four different clades (BR-1 to BR-4) with apparent replacement over time. Specifically, the oldest lineage BR-1, including isolates from 1990–2000, was replaced by BR-2 comprising sequences from 2000–2006, itself subsequently replaced by BR-3 containing isolates from 2006–2019. Finally, BR-3 was replaced by BR-4, containing sequences sampled between 2016–2019, some closely related to Caribbean isolates sampled in 2005. In a similar manner to DENV1, DENV2 BR-3 and BR-4 isolates from 2019 demonstrated the co-circulation of at least two different lineages in recent years22. The phylogenetic relationship to Caribbean sequences suggested a possible origin in this region, although there is a large temporal gap between the sampling of the Caribbean sequences from 2005 and the early Brazilian sequences from 2016, again highlighting the need for more genomic surveillance in Brazil. The tMRCA of BR-4L1 was estimated between September 2014 and June 2016 and between March 2015 and November 2016 for BR-4L2, coinciding with the emergence and spread of the Zika34 and Chikungunya35 viruses, and a high incidence of dengue in Brazil. Six years after introduction, the lineages continue to circulate in the Southeast region and were present in the most recent large epidemic of 2019. From the Southeast, dispersion was towards the Northeast and Midwest regions, with multiple independent introductions identified.

Analysis of the 181 isolates from three macro regions (Northeast, Midwest and Southeast) allowed us to estimate the emergence of DENV2 BR-4 in the Southeast between November 2013 and May 2015, supporting previous reports20,22. After its introduction, BR-4 circulated as two distinct lineages (BR-4L1 and BR-4L2). We observed three single nucleotide variants among the lineages that resulted in amino acid substitutions: A447V and K719I were identified in BR-4L1, while V553I was identified in BR-4L2. A447V (ENV protein) and V553I (NS5 protein) appear to be conservative changes due to the interchangeable character of the respective amino acids36. In contrast, K719I in the NS5 protein represents a change from a positively charged to an aliphatic amino acid36. Further studies are required to elucidate the impact of these variants on structure and function of the associated proteins, and any potential role in both viral pathogenesis and fitness.
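The distinction drawn above between conservative and non-conservative substitutions can be made concrete with a small sketch. The side-chain grouping below is a coarse textbook classification chosen for illustration, not the scheme of reference 36, and the function names are hypothetical; residues left out of the table are reported as unclassified rather than guessed.

```python
import re

# Coarse side-chain classes (an illustrative assumption, deliberately incomplete)
SIDE_CHAIN_CLASS = {
    "A": "aliphatic", "V": "aliphatic", "I": "aliphatic", "L": "aliphatic",
    "K": "positive", "R": "positive",
    "D": "negative", "E": "negative",
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
    "F": "aromatic", "W": "aromatic", "Y": "aromatic",
}

def parse_substitution(label):
    """Split a label such as 'A447V' into (reference, position, variant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)

def is_conservative(label):
    """True if both residues share a class, False if not, None if unclassified."""
    ref, _, alt = parse_substitution(label)
    ref_class = SIDE_CHAIN_CLASS.get(ref)
    alt_class = SIDE_CHAIN_CLASS.get(alt)
    if ref_class is None or alt_class is None:
        return None
    return ref_class == alt_class

# The three substitutions reported for BR-4L1 and BR-4L2
results = {label: is_conservative(label) for label in ("A447V", "V553I", "K719I")}
```

Under this grouping A447V and V553I stay within the aliphatic class, while K719I moves from a positively charged residue (lysine) to an aliphatic one (isoleucine). A same-class result says nothing about functional impact, which would require the structural studies called for above.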

Our retrospective reconstruction of the recent transmission history of DENV1 and DENV2 revealed that the Southeast and North regions of Brazil were key to dispersal in Brazil. This is in line with studies that have highlighted both regions as important hubs for introduction and dispersion in the country, not only of DENV9,10,11,12,27,31,33, but also for yellow fever virus37. By combining genetic and epidemiological models we showed that the establishment and the co-circulation of the BR-4L1 and BR-4L2 lineages of DENV2 in several Brazilian regions occurred during a time window of sustained transmission potential measured by estimates of Re and mosquito-viral suitability. These results are consistent with sufficient ecological suitability for the virus’s main vectors (Aedes spp.) and insufficient population-level herd-immunity, supporting the expectation of continuing endemic circulation of these dengue viruses in Brazil.

The new genomic data presented here were generated using portable sequencing tools in a field surveillance initiative (ZiBRA-2 project) and a genomic surveillance training program. We present a range of research outputs describing the recent history and genomic epidemiology of DENV1 and DENV2 in Brazil during the resurgence of this virus in 2019: this both corroborates previous studies and greatly increases the number of public viral genome sequences available for analysis. We also identified gaps in existing genomic data that curtailed definite conclusions on key points of the recent history of DENV in Brazil. Importantly, epidemiological and genomic data were analyzed in real-time during the training program and subsequently during online sessions, and the participants attending the training program made a significant contribution to the research outputs generated. We call for continued funding of similar field and classroom genomic surveillance initiatives. These have the potential to build local capacity in the field of genomic surveillance and in doing so advance our understanding of the population biology of circulating arboviruses and other emerging pathogens.

Methods

Ethics statement

This project was reviewed and approved by the Pan American Health Organization Ethics Review Committee (PAHOERC) (Ref. No. PAHO-2016-08-0029) and the Oswaldo Cruz Foundation Ethics Committee (CAAE: 90249218.6.1001.5248). The availability of these samples for research purposes during outbreaks of national concern is permitted under the terms of Resolution 510/2016 of the National Ethical Committee for Research – Brazilian Ministry of Health (CONEP - Comissão Nacional de Ética em Pesquisa, Ministério da Saúde), which authorizes, without the need for informed consent, the use of clinical samples collected in the Brazilian Central Public Health Laboratories to accelerate knowledge building and contribute to surveillance and outbreak response. The samples processed in this study were obtained anonymously from material exceeding the routine diagnosis of arboviruses in Brazilian public health laboratories that belong to the public network within BrMoH.

Field genomic surveillance with a mobile laboratory

In May 2019 we implemented an arbovirus surveillance project that took place across the Midwest of Brazil using a mobile genomics laboratory (Supplementary Fig. 9). This Brazilian-driven initiative, known as the ZiBRA-2 project, was supported by the BrMoH (https://www.zibra2project.org).

Classroom genomic surveillance in a training program

In August 2019 a genomic surveillance training program organized by PAHO and BrMoH took place in Belo Horizonte (Minas Gerais state) under the title “Nanopore-based genome sequencing technology for temporal investigation and epidemiology of dengue outbreak: training, research, surveillance, and scientific dissemination”. The syllabus included practical and theoretical courses on a variety of subjects related to arbovirus research and surveillance, including mobile sequencing technologies, bioinformatics, phylogenetics, epidemiological modeling, and field epidemiology and entomology. The course was taught by experienced researchers from national and international institutions, such as the University of Oxford (United Kingdom), University of KwaZulu-Natal (South Africa), University Nova de Lisboa (Portugal), Sechenov First Moscow State Medical University (Russia), Oswaldo Cruz Foundation (Brazil), Federal University of Minas Gerais (Brazil), Federal University of Rio de Janeiro (Brazil), Federal University of Pernambuco (Brazil), University of São Paulo (Brazil), University of Brasilia (Brazil), State University of Feira de Santana (Brazil), and University of Salvador (Brazil). The course had 62 students from 34 national and international institutions (participants were aged 25–50 years). In addition to post-graduate students, course participants included laboratory technicians and health practitioners in universities and laboratories from several institutions responsible for laboratory-based surveillance of emerging and reemerging diseases, such as the Central Public Health Laboratories of the Brazilian states from the BrMoH’s network and public health laboratories from Paraguay, Argentina, Panama, Chile, Mexico, Uruguay, Costa Rica, and Ecuador. The event targeted post-graduate students, laboratory technicians, and health practitioners in universities and laboratories across the Americas and was based on the principles of Responsible Research and Innovation (RRI)38. Details on the program can be found in Supplementary Text File 1.

Sample collection and molecular diagnostic assays

Clinical samples from patients with suspected DENV infection were obtained for routine diagnostic purposes at local health services in different Brazilian municipalities. These samples were sent for molecular diagnosis to the respective local Central Laboratory of Public Health (LACEN) from the Brazilian Federal District (DF) and from the states of Bahia (BA), Goiás (GO), Mato Grosso (MT), Mato Grosso do Sul (MS), Minas Gerais (MG), Pernambuco (PE), and Rio de Janeiro (RJ). These states had some of the highest registered burdens during the 2019 DENV resurgence according to the protocol established by the BrMoH. Samples processed from the state of São Paulo (SP) were collected by the Blood Center of Ribeirão Preto from volunteer blood donors eligible for blood donation and who reported adverse effects up to 14 days after donation.

Viral RNA was extracted from all clinical samples using the QIAamp Viral RNA Mini Kit (Qiagen) and tested by RT-qPCR for detection of DENV1-4. Selected samples with previous positive diagnostic results for DENV1-2 were processed in two steps: (1) 73 samples from the states of Goiás, Mato Grosso do Sul, and Mato Grosso were processed during the ZiBRA-2 project, and (2) 175 samples from the Brazilian Federal District and Bahia, Goiás, Minas Gerais, Pernambuco, Rio de Janeiro, and São Paulo states were processed during the training program (both initiatives are described in the sections above). Samples from the 2019 outbreak, as well as available samples from previous epidemic waves in 2008 and between 2015 and 2018, were included for diagnostic screening.

cDNA synthesis and whole genome sequencing using the MinION

Samples were selected for sequencing based on the Ct value (≤35) and the availability of epidemiological metadata, such as date of symptom onset, date of sample collection, sex, age, municipality of residence, symptoms, and disease classification. For complementary DNA synthesis, the SuperScript IV Reverse Transcriptase kit (Invitrogen) was used following the manufacturer’s instructions. The cDNA generated was subjected to sequencing multiplex PCR (35 cycles) using Q5 High Fidelity Hot-Start DNA Polymerase (NEB) and a set of specific primers designed by the CADDE project (https://www.caddecentre.org/) for sequencing the complete genomes of DENV1 and DENV239 (Supplementary Table 5).
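As an illustration only, the selection rule described above can be sketched as a simple filter. The Ct cut-off of 35 comes from the text; the field names and the choice of which metadata fields are treated as mandatory are hypothetical and are not taken from the study's actual pipeline.

```python
CT_THRESHOLD = 35.0  # Ct cut-off stated in the text (Ct <= 35)

# Hypothetical field names; which metadata fields were mandatory is an assumption.
REQUIRED_METADATA = ("onset_date", "collection_date", "sex", "age", "municipality")


def is_eligible(sample: dict) -> bool:
    """Return True if the sample has Ct <= 35 and all required metadata fields."""
    ct = sample.get("ct")
    if ct is None or ct > CT_THRESHOLD:
        return False
    return all(sample.get(field) not in (None, "") for field in REQUIRED_METADATA)


# Toy records (invented for illustration, not study data).
complete = {
    "onset_date": "2019-03-01",
    "collection_date": "2019-03-04",
    "sex": "F",
    "age": 31,
    "municipality": "Goiânia",
}
samples = [
    dict(complete, id="S1", ct=22.4),                   # eligible
    dict(complete, id="S2", ct=36.1),                   # Ct too high
    dict(complete, id="S3", ct=35.0, municipality=""),  # missing metadata
    dict(complete, id="S4", ct=35.0),                   # eligible (boundary value)
]

selected = [s["id"] for s in samples if is_eligible(s)]
print(selected)
```

The boundary case (Ct exactly 35) is kept because the criterion in the text is inclusive.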

Amplicons were purified using 1x AMPure XP Beads (Beckman Coulter) and quantified on a Qubit 3.0 fluorimeter (Thermo Fisher Scientific) using the Qubit™ dsDNA HS Assay Kit (Thermo Fisher Scientific). Of the 248 samples, 227 contained sufficient DNA (≥2 ng/µL) to proceed to library preparation. DNA library preparation was performed using the Ligation Sequencing Kit (Oxford Nanopore Technologies) and the Native Barcoding Kit (NBD103, Oxford Nanopore Technologies)23. Sequencing libraries were generated from the barcoded products using the Genomic DNA Sequencing Kit SQK-MAP007/SQK-LSK208 (Oxford Nanopore Technologies) and loaded into an R9.4 flow cell (Oxford Nanopore Technologies). Each sequencing run included negative controls to check for possible contamination; these controls showed less than 2% mean coverage.

Generation of consensus sequences

Raw files were basecalled using Guppy v3.4.5 and barcode demultiplexing was performed using qcat. Consensus sequences were generated by de novo assembly using Genome Detective (https://www.genomedetective.com/)40. Briefly, Genome Detective uses DIAMOND to identify and classify candidate viral reads in broad taxonomic units, using the viral subset of the Swissprot UniRef90 protein database. Candidate reads are next assigned to candidate reference sequences using NCBI blastn and aligned using AGA (Annotated Genome Aligner) and MAFFT. Final contigs and consensus sequences are then made available as a FASTA file. More detail about Genome Detective can be found in ref. 40. The new sequences reported in this study (DENV1 n = 57 and DENV2 n = 170) were initially submitted to a genotyping analysis using the arbovirus phylogenetic subtyping tool, available at http://genomedetective.com/app/typingtool/dengue; this confirmed that the newly generated genomes belonged to the genotypes DENV1-V and DENV2-III, respectively.

Phylogenetic analysis

DENV genotyping was performed using the Dengue Virus Typing Tool (https://www.genomedetective.com/app/typingtool/dengue/)6. To investigate the evolution and population dynamics of DENV1-2 in different Brazilian regions, the DENV1 (n = 57) and DENV2 (n = 170) complete genome sequences generated in this study were combined with globally sampled and publicly available complete genome sequences from DENV1 genotype V (DENV1-V = 444) and DENV2 genotype III (DENV2-III = 450), as these represent the dominant genotypes in the Americas. The latter were retrieved from NCBI up to November 2019. We also included 17 recently published sequences from the outbreaks in the Brazilian Southeast region25. Sequences without sampling date and location were excluded, as were sequences covering less than 50% of the viral genome.
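The exclusion criteria above can be sketched as follows. This is a minimal illustration under stated assumptions: coverage is taken here as the fraction of reference positions with an unambiguous base (A, C, G, or T), the record layout is hypothetical, and the toy reference length of 10 stands in for the real genome length.

```python
def genome_coverage(seq: str, reference_length: int) -> float:
    """Fraction of reference positions covered by unambiguous bases (A, C, G, T)."""
    called = sum(1 for base in seq.upper() if base in "ACGT")
    return called / reference_length


def keep_record(record: dict, reference_length: int, min_coverage: float = 0.5) -> bool:
    """Keep a sequence only if it has a date, a location, and >= 50% coverage."""
    if not record.get("date") or not record.get("location"):
        return False
    return genome_coverage(record["seq"], reference_length) >= min_coverage


# Toy records (invented); a real DENV genome is roughly 10.7 kb long.
REFERENCE_LENGTH = 10
records = [
    {"id": "A", "date": "2019-02-10", "location": "BR-GO", "seq": "ACGTACGTAC"},
    {"id": "B", "date": "2019-02-11", "location": "BR-MG", "seq": "ACGTNNNNNN"},
    {"id": "C", "date": None, "location": "BR-SP", "seq": "ACGTACGTAC"},
    {"id": "D", "date": "2019-03-01", "location": "BR-BA", "seq": "ACGTA-----"},
]

kept = [r["id"] for r in records if keep_record(r, REFERENCE_LENGTH)]
print(kept)
```

Record B is dropped for low coverage (40%) and record C for its missing sampling date; record D sits exactly at the 50% boundary and is retained because only sequences covering less than 50% are excluded.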

Sequence alignment was performed using MAFFT41 and manually curated to remove artifacts using AliView42. Maximum Likelihood (ML) phylogenetic trees were estimated using IQ-TREE43 under the GTR nucleotide substitution model, which was inferred as the best-fit model by the ModelFinder application implemented in IQ-TREE44. The robustness of the tree topology was determined using 1,000 bootstrap replicates, and the presence of temporal signal was evaluated in TempEst45 through a regression of root-to-tip genetic distances against sampling time. Time-scaled phylogenetic trees were inferred using the BEAST package46. We employed a stringent model selection analysis using both path-sampling (PS) and stepping-stone (SS) procedures to estimate the most appropriate molecular clock model for the Bayesian phylogenetic analysis47. The uncorrelated relaxed molecular clock model was selected on the basis of the estimated marginal likelihoods and was used together with the codon-based SRD06 model of nucleotide substitution and the non-parametric Bayesian Skyline coalescent model. A discrete phylogeographic model48 was used to reconstruct the virus spatial diffusion across the sampling locations. Discrete locations were initially defined as the country of sampling. However, a different resolution was applied according to sampling availability. Phylogeographic analyses were then performed by applying an asymmetric model of location transitioning coupled with the Bayesian Stochastic Search Variable Selection (BSSVS) procedure. Markov Chain Monte Carlo (MCMC) chains were run in duplicate for 100 million iterations to ensure stationarity and an adequate effective sample size (ESS) of >200. Maximum clade credibility trees were summarized using TreeAnnotator after discarding 10% as burn-in and visualized using FigTree v1.4.4.
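To make the temporal-signal check concrete, the sketch below fits an ordinary least-squares regression of root-to-tip distance against sampling time. It is an illustration of the general idea and not TempEst's implementation; the function name and the toy values are hypothetical. Under a strict clock, the slope approximates the evolutionary rate and the x-intercept approximates the time of the most recent common ancestor.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, x_intercept, r_squared): the slope approximates the
    substitution rate, the x-intercept the date of the root, and r_squared
    the strength of the temporal signal.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    x_intercept = -intercept / slope
    return slope, x_intercept, r_squared


# Toy data (invented): tips accumulate 0.001 substitutions/site/year from a root in 2000.
sampling_dates = [2010.0, 2012.0, 2014.0, 2016.0, 2018.0]
root_to_tip = [0.010, 0.012, 0.014, 0.016, 0.018]

rate, root_date, r2 = root_to_tip_regression(sampling_dates, root_to_tip)
print(rate, root_date, r2)
```

A positive slope with a high coefficient of determination suggests enough temporal signal to proceed to a time-scaled analysis; a flat or negative slope would argue against it.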

Epidemiological data and integration with genomic data

Data on weekly notified and laboratory-confirmed cases of DENV infection in Brazil from 2015 to 2019, as well as monthly fatal cases with confirmed dengue infection, were supplied by the BrMoH. A mosquito-viral suitability measure (index P) was estimated using the MVSE R-package49. The index P measures the reproductive (transmission) potential of a single adult female mosquito in a completely susceptible host population and is informed by local temperature and humidity time trends. We used daily climatic data from the three largest cities of each macro region for which new sequences were generated (Midwest: Goiânia, Brasília, Campo Grande; Southeast: São Paulo, Rio de Janeiro, Belo Horizonte; Northeast: Salvador, Recife, Fortaleza), with data obtained from openweathermap.org.

Estimating Re from genetic sequences

We used birth-death skyline (BDSKY) models implemented in BEASTv2.450 to estimate Re. In this model, each infection may transmit at a rate λ and will become non-infectious at a rate δ. Upon becoming infected, each individual is sampled with a probability s. The model enables the piecewise estimation of Re, δ, and s through time. We assumed the sampling proportion s to be constant over time. Relaxing this assumption to allow the parameter s to be zero for the periods when no sequence data were available resulted in similar trends for Re, with wider Bayesian credible intervals. The rate δ was modeled using a lognormal prior with a mean of 14 days and a standard deviation of 0.5, which roughly corresponds to the sum of the intrinsic and extrinsic incubation period of dengue virus. The BDSKY analysis was run using independent MCMC chains of >100 steps, with parameters and trees being sampled once every 10,000 steps. After removal of 10% burn-in, sampled parameters were combined using LogCombiner.
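The relationship between the model parameters can be illustrated with a short sketch. It assumes the standard birth-death definition Re = λ/δ and reads the 14-day prior mean as the mean duration of infectiousness (1/δ); both are assumptions made for illustration, and the transmission rates below are invented rather than estimated from the data.

```python
DAYS_PER_YEAR = 365.25


def becoming_noninfectious_rate(mean_infectious_days: float) -> float:
    """Rate delta (per year) implied by a mean infectious duration in days."""
    return DAYS_PER_YEAR / mean_infectious_days


def piecewise_re(transmission_rates, delta):
    """Re for each time interval, assuming Re = lambda / delta with constant delta."""
    return [lam / delta for lam in transmission_rates]


# Assumption: the 14-day prior mean is interpreted as 1/delta.
delta = becoming_noninfectious_rate(14.0)

# Hypothetical per-year transmission rates for three consecutive intervals.
lambdas = [39.0, 26.0, 13.0]
re_values = piecewise_re(lambdas, delta)

for interval, re in enumerate(re_values, start=1):
    status = "growing" if re > 1 else "declining"
    print(f"interval {interval}: Re = {re:.2f} ({status})")
```

Values of Re above 1 indicate intervals of sustained transmission, which is the quantity compared against the index P estimates in the analysis.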

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.