Virus genomes reveal factors that spread and sustained the Ebola epidemic

Journal name: Nature
Volume: 544
Pages: 309–315
DOI: 10.1038/nature22040

Abstract

The 2013–2016 West African epidemic caused by the Ebola virus was of unprecedented magnitude, duration and impact. Here we reconstruct the dispersal, proliferation and decline of Ebola virus throughout the region by analysing 1,610 Ebola virus genomes, which represent over 5% of the known cases. We test the association of geography, climate and demography with viral movement among administrative regions, inferring a classic ‘gravity’ model, with intense dispersal between larger and closer populations. Despite attenuation of international dispersal after border closures, cross-border transmission had already sown the seeds for an international epidemic, rendering these measures ineffective at curbing the epidemic. We address why the epidemic did not spread into neighbouring countries, showing that these countries were susceptible to substantial outbreaks but at lower risk of introductions. Finally, we reveal that this large epidemic was a heterogeneous and spatially dissociated collection of transmission clusters of varying size, duration and connectivity. These insights will help to inform interventions in future epidemics.
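
The 'gravity' relationship named in the abstract can be illustrated with a minimal sketch. This is the textbook gravity form, not the phylogeographic GLM fitted in the paper: the function names, the exponents `alpha`, `beta` and `gamma`, the constant `k`, and the population and distance values are all illustrative assumptions rather than estimates from the study.

```python
import math


def gravity_rate(pop_origin, pop_dest, distance_km, *, k=1.0,
                 alpha=1.0, beta=1.0, gamma=1.0):
    """Textbook gravity model: rate = k * P_i^alpha * P_j^beta / d_ij^gamma.

    All parameter defaults are hypothetical; they are not fitted values.
    """
    if pop_origin <= 0 or pop_dest <= 0 or distance_km <= 0:
        raise ValueError("populations and distance must be positive")
    return k * pop_origin ** alpha * pop_dest ** beta / distance_km ** gamma


def log_gravity_rate(pop_origin, pop_dest, distance_km, *, k=1.0,
                     alpha=1.0, beta=1.0, gamma=1.0):
    """Same model on the log scale, where it is linear in the log predictors."""
    return (math.log(k)
            + alpha * math.log(pop_origin)
            + beta * math.log(pop_dest)
            - gamma * math.log(distance_km))


# Hypothetical regions: same populations, different separation.
near = gravity_rate(1_000_000, 500_000, 50.0)
far = gravity_rate(1_000_000, 500_000, 200.0)
print(near > far)  # closer pairs get the higher relative rate
```

On the log scale the model is a linear combination of log population sizes and log distance, which is why a gravity pattern can be read off from the signs of regression coefficients on those predictors.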

At a glance

Figures

  1. Figure 1: Summary of early epidemic events.

    a, Temporal phylogeny of earliest sampled EBOV lineages in Guéckédou Prefecture, Guinea. 95% posterior densities of most recent common ancestor estimates for all lineages (grey) and lineages into Kailahun District, Sierra Leone (SLE; blue) and to Conakry Prefecture, Guinea (GIN; green) are shown at the bottom. Posterior probabilities >0.5 are shown for lineages with >5 descendent sequences. LBR, Liberia. b, Dispersal events marked by coloured lineages and labelled by name on the phylogeny are projected on a map with directionality indicated by colour intensity (from light to dark). Lineages that migrated to Conakry Prefecture (labelled as GN-1 lineage) and Kailahun District (labelled as SL lineages) have led to the vast majority of EVD cases throughout the region.

  2. Figure 2: Transmission chains arising from independent international movements.

    a, EBOV lineages by country (Guinea, green; Sierra Leone, blue; Liberia, red), tracked until the sampling date of their last known descendants. Circles at the roots of each subtree denote the country of origin for the introduced lineage. b, Estimates of the change point probability (left y axis) and log coefficient (mean and credible interval; right y axis) for the nat./int. factor. Vertical lines represent dates that border closures were announced by the respective countries.

  3. Figure 3: The metapopulation structure of the epidemic.

    a, Kernel density estimate of distances associated with inferred EBOV dispersal events: 50% occur over distances <72 km and <5% occur over distances >232 km. b, Kernel density estimate of the number of independent EBOV introductions into each administrative region: 50% have fewer than 4.8 and <5% greater than 21.3. c, Kernel density estimate of the mean size of sampled cases resulting from each introduction with at least 2 sampled cases: 50% <5.3 cases, 95% <32 cases. d, Kernel density estimate of the persistence of clusters in days (from time of introduction to time of the last sampled case): 50% <36 days, 95% <181 days. a–d, 50% and 95% are indicated by the dashed lines.

  4. Figure 4: Predicted destinations and consequences of viral dispersal.

    a, Predicted number of EBOV imports into each of the 63 regions in Guinea, Sierra Leone and Liberia (including 7 without recorded cases in Guinea) and the surrounding 18 regions of the neighbouring countries of Guinea-Bissau, Senegal, Mali and Côte d’Ivoire. The expected number of EBOV exports from locations in the phylogeographic tree and imports to any location were calculated on the basis of the phylogeographic GLM model estimates and associated predictors that were extended to apparently EVD-free locations (see Supplementary Methods). b, Predicted EVD cluster sizes from the Bayesian GLM fitted to case data.

  5. Extended Data Fig. 1: Distribution and correlation of EVD cases and EBOV sequences.

    a, Administrative regions within Guinea (green), Sierra Leone (blue) and Liberia (red); shading is proportional to the cumulative number of known and suspected EVD cases in each region. Darkest shades represent 784 cases for Guinea (Macenta prefecture); 3,219 cases for Sierra Leone (Western Area urban district); and 2,925 cases for Liberia (Montserrado county); hatching indicates regions without reported EVD cases. Circle diameters are proportional to the number of EBOV genomes available from that region over the entire EVD epidemic, with the largest circle representing 152 sequences. Crosses mark regions for which no sequences are available. Circles and crosses are positioned at population centroids within each region. b, A plot of the number of EBOV genomes sampled against the known and suspected cumulative EVD case numbers. Regions in Guinea are denoted in green, Sierra Leone in blue and Liberia in red. Spearman correlation coefficient: 0.93.

  6. Extended Data Fig. 2: Dispersal of virus lineages over time.

    Virus dispersal between administrative regions estimated using the GLM phylogeography model (see Methods). The arcs are between population centroids of each region, show directionality from the thin end to the thick end and are coloured in a scale denoting time from December 2013 in blue to October 2015 in yellow. Countries are coloured with Liberia in red, Guinea in green and Sierra Leone in blue.

  7. Extended Data Fig. 3: Inference of GLM predictors in a ‘real-time’ context.

    For the dataset constructed from EBOV genome sequences derived from samples taken up until October 2014 (blue), the same 5 spatial EBOV movement predictors were given categorical support (inclusion probabilities = 1.0) as for the full dataset (red). Likewise, the coefficients for these predictors are consistent in their sign and magnitude.

  8. Extended Data Fig. 4: The effect of borders on EBOV migration rates between regions.

    Posterior densities for the migration rates between locations that share a geographical border and those that do not share borders for international migrations and national migrations. Where two regions share a border (right y axis), national migrations are only marginally more frequent than international migrations showing that both types of borders are porous to short local movement. Where the two regions are not adjacent (left y axis), international migrations are much rarer than national migrations.

  9. Extended Data Fig. 5: Summarized international migration history of the epidemic.

    a, b, All viral movement events between countries (Guinea, green; Sierra Leone, blue; Liberia, red) are shown split by whether they are between regions that are geographically distant (a) or regions that share the international border (b). Curved lines indicate median (intermediate colour intensity), and 95% highest posterior density intervals (lightest and darkest colour intensities) for the number of migrations that are inferred to have taken place between countries.

  10. Extended Data Fig. 6: Comparison of predicted and observed numbers of introductions and case numbers.

    a, b, Left, scatter plots show inferred introduction numbers (a) or observed case numbers (b), coloured by region as in Extended Data Fig. 1. Administrative regions that did not report any cases are indicated with empty circles on the scatter plot. Right, administrative regions on the map are coloured by the residuals (as observed/predicted) of the scatter plot. Regions are coloured grey where 0.5 < observed/predicted < 2.0 and transition into red or blue colours for overestimation or underestimation, respectively.

  11. Extended Data Fig. 7: Region-specific introductions, cluster sizes and persistence.

    Each row summarizes independent introductions and the sizes (as numbers of sequences) of resulting outbreak clusters. Clusters are coloured by their inferred region of origin (colours are the same as in Extended Data Fig. 1). The horizontal lines represent the persistence of each cluster from the time of introduction to the last sampled case (individual tips have persistence 0). The areas of the circles in the middle of the lines are proportional to the number of sequenced cases in the cluster. The areas of the circles next to the labels on the left represent the population sizes of each administrative region. Vertical lines within each cell indicate the dates of declared border closures by each of the three countries: 11 June 2014 in Sierra Leone (blue), 27 July 2014 in Liberia (red) and 09 August 2014 in Guinea (green).

  12. Extended Data Fig. 8: Kernel density estimates for inferred epidemiological statistics.

    From top to bottom: distance travelled (distance between population centroids, in kilometres); number of introductions that each location experienced; cluster size (number of sequences collected in a location as a result of a single introduction); cluster persistence (days from the common ancestor of a cluster to its last descendent; single tips have persistence of 0). Left, analysis for Sierra Leone (blue), Liberia (red) and Guinea (green). Right, analysis for before October 2014 (grey) and after October 2014 (orange). Points with vertical lines connected to the x axis indicate the 50% and 95% quantiles of the parameter density estimates. Within Sierra Leone, Liberia and Guinea, 50% of all migrations occurred over distances of around 100 km and persisted for around 25 days. Exceptions were for Sierra Leone, which experienced more introductions per location (around 12) than Guinea and Liberia (around 4); and Guinea, where migrations tended to occur over larger distances owing to the size of the country and whose cluster sizes following introductions tended to be lower (3 sequences versus Liberia and Sierra Leone, which had 5 sequences each). Between the first (grey) and second (orange) years of the epidemic there were considerable reductions in cluster persistence, cluster sizes and distances travelled by viruses, whereas dispersal intensity remained largely the same.

  13. Extended Data Fig. 9: Relationship between cluster size, introductions or persistence and population size.

    a, The mean number of introductions into each location against (log) population sizes. The Western Area (in Sierra Leone) received the most introductions, whereas Conakry and Montserrado were closer to the average. The association between population size and the number of introductions was not very strong (R² = 0.28, Pearson correlation = 0.54, Spearman correlation = 0.57). b, The mean cluster size for each location plotted against (log) population sizes. The association is weaker than for a (R² = 0.11, Pearson correlation = 0.35, Spearman correlation = 0.57). c, The mean persistence times (per cluster, in days) against population sizes. A similarly weak association is observed as in b (R² = 0.12, Pearson correlation = 0.37, Spearman correlation = 0.36). All computations were based on a sample of 10,000 trees from the posterior distribution.
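
The three association measures quoted for Extended Data Fig. 9 can be computed as in the following sketch. It is a plain standard-library illustration, not the authors' analysis code: the function names and the five population and introduction values are hypothetical. For a simple linear fit, R² is the square of the Pearson correlation, which is roughly consistent with the reported pairs (for example, 0.54² ≈ 0.29 against R² = 0.28).

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: population size and mean introductions per region.
populations = [50_000, 120_000, 300_000, 900_000, 1_500_000]
introductions = [2.0, 1.0, 6.0, 5.0, 14.0]
log_pop = [math.log10(p) for p in populations]

r = pearson(log_pop, introductions)
print(f"Pearson = {r:.2f}, R^2 = {r * r:.2f}, "
      f"Spearman = {spearman(log_pop, introductions):.2f}")
```

Because Spearman correlation depends only on ranks, it is unchanged by the log transform of population size, whereas Pearson correlation and R² are not.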

Tables

  1. Extended Data Table 1: Predictors included in the time-homogeneous GLM

Videos

  1. Video 1: Reconstructed history of the West African Ebola virus epidemic
    A map of the three most affected countries (Guinea, Liberia and Sierra Leone) is shown on the left. Colours indicate country: Guinea is green, Liberia is red and Sierra Leone is blue. Weekly incidence of EVD cases is indicated by the shading of administrative divisions within each country (darker shades correspond to more cases, on a logarithmic scale). Cases are linearly interpolated between successive reporting weeks. Inferred movements of Ebola virus are indicated with tapered projectiles, coloured by origin country (Guinea in green, Sierra Leone in blue, Liberia in red) if the lineage crosses an international border and black otherwise. Red circles at the population centroid of each administrative division indicate the number of lineages estimated to be present within that location. The phylogenetic tree in the upper right shows the relationships between sampled Ebola virus lineages, with branches coloured by location (lighter shades indicate locations further west within each country). Migrations inferred between any two locations in the tree are animated on the map on the left. The plot on the lower right shows weekly reported cases summed over administrative divisions for each country (Guinea in green, Sierra Leone in blue, Liberia in red). Weekly cases for individual administrative divisions are animated as changes in each division's colour on the map on the left.
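
The linear interpolation of case counts between reporting weeks mentioned in the video description can be sketched as follows. The function name and the example counts are hypothetical; the video's actual rendering code is not described in the source.

```python
def interpolate_cases(week_start_cases, week_end_cases, fraction):
    """Linearly interpolate a case count between two successive reporting weeks.

    fraction is the elapsed share of the week, from 0.0 (start) to 1.0 (end).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return week_start_cases + (week_end_cases - week_start_cases) * fraction


# Hypothetical division reporting 10 cases one week and 30 the next.
for step in (0.0, 0.25, 0.5, 1.0):
    print(step, interpolate_cases(10, 30, step))
```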

Author information

Affiliations

  1. Institute of Evolutionary Biology, University of Edinburgh, King’s Buildings, Edinburgh EH9 3FL, UK

    • Gytis Dudas,
    • Luiz Max Carvalho &
    • Andrew Rambaut
  2. Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA

    • Gytis Dudas &
    • Trevor Bedford
  3. WorldPop, Department of Geography and Environment, University of Southampton, Highfield, Southampton SO17 1BJ, UK

    • Andrew J. Tatem
  4. Flowminder Foundation, Stockholm, Sweden

    • Andrew J. Tatem
  5. Department of Microbiology and Immunology, Rega Institute, KU Leuven – University of Leuven, 3000 Leuven, Belgium

    • Guy Baele,
    • Filip Bielejec,
    • Simon Dellicour &
    • Philippe Lemey
  6. Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK

    • Nuno R. Faria &
    • Oliver G. Pybus
  7. Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA

    • Daniel J. Park,
    • Stephen Gire,
    • Adrianne Gladden-Young,
    • Andreas Gnirke,
    • Christine M. Malboeuf,
    • Christian B. Matranga,
    • James Qu,
    • Stephen F. Schaffner,
    • Rachel S. Sealfon,
    • Kendra West,
    • Sarah M. Winnicki,
    • Shirlee Wohl,
    • Nathan L. Yozwiak &
    • Pardis C. Sabeti
  8. Center for Genome Sciences, US Army Medical Research Institute of Infectious Diseases, Fort Detrick, Frederick, Maryland 21702, USA

    • Jason T. Ladner,
    • Jonathan D’Ambrozio,
    • Merle L. Gilbert,
    • Jeffrey R. Kugelman,
    • Suzanne Mate,
    • Mariano Sanchez-Lockhart,
    • Michael R. Wiley &
    • Gustavo Palacios
  9. Department of Pathology, University of Cambridge, Addenbrooke’s Hospital, Cambridge CB2 2QQ, UK

    • Armando Arias,
    • Sarah L. Caddy,
    • Jia Lu,
    • Luke W. Meredith,
    • Lucy Thorne &
    • Ian Goodfellow
  10. National Veterinary Institute, Technical University of Denmark, Bülowsvej 27, 1870, Frederiksberg C, Denmark

    • Armando Arias
  11. Institute of Lassa Fever Research and Control, Irrua Specialist Teaching Hospital, Irrua, Nigeria

    • Danny Asogun &
    • Ekaete Alice Tobin
  12. The European Mobile Laboratory Consortium, 20359 Hamburg, Germany

    • Danny Asogun,
    • Antonino Di Caro,
    • Sophie Duraffour,
    • Kilian Stoecker,
    • Ekaete Alice Tobin,
    • Roman Wölfel,
    • Miles W. Carroll &
    • Stephan Günther
  13. Virus Genomics, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK

    • Matthew Cotten,
    • My V. T. Phan,
    • Simon J. Watson &
    • Paul Kellam
  14. Department of Viroscience, Erasmus University Medical Centre, PO Box 2040, 3000 CA Rotterdam, the Netherlands

    • Matthew Cotten,
    • Bart L. Haagmans,
    • Suzan D. Pas,
    • My V. T. Phan,
    • Chantal B. Reusken,
    • Saskia L. Smits &
    • Marion P. G. Koopmans
  15. National Institute for Infectious Diseases ‘L. Spallanzani’—IRCCS, Via Portuense 292, 00149 Rome, Italy

    • Antonino Di Caro
  16. Naval Medical Research Unit 3, 3A Imtidad Ramses Street, Cairo 11517, Egypt

    • Joseph W. Diclaro
  17. Bernhard Nocht Institute for Tropical Medicine, 20359 Hamburg, Germany

    • Sophie Duraffour &
    • Stephan Günther
  18. National Infections Service, Public Health England, Porton Down, Salisbury, Wilts SP4 0JG, UK

    • Michael J. Elmore &
    • Miles W. Carroll
  19. Liberian Institute for Biomedical Research, Charlesville, Liberia

    • Lawrence S. Fakoli &
    • Fatorma Bolay
  20. Institut Pasteur de Dakar, Arbovirus and Viral Hemorrhagic Fever Unit, 36 Avenue Pasteur, BP 220, Dakar, Sénégal

    • Ousmane Faye &
    • Amadou Sall
  21. University of Sierra Leone, Freetown, Sierra Leone

    • Sahr M. Gevao &
    • Isatta Wurie
  22. Center for Systems Biology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA

    • Stephen Gire,
    • Shirlee Wohl,
    • Nathan L. Yozwiak &
    • Pardis C. Sabeti
  23. Viral Hemorrhagic Fever Program, Kenema Government Hospital, 1 Combema Road, Kenema, Sierra Leone

    • Augustine Goba,
    • Donald S. Grant &
    • Mohamed A. Vandi
  24. Ministry of Health and Sanitation, 4th Floor Youyi Building, Freetown, Sierra Leone

    • Augustine Goba,
    • Donald S. Grant,
    • Mohamed A. Vandi &
    • Brima Kargbo
  25. Institute of Infection and Global Health, University of Liverpool, Liverpool L69 2BE, UK

    • Julian A. Hiscox &
    • Georgios Pollakis
  26. NIHR Health Protection Research Unit in Emerging and Zoonotic Infections, University of Liverpool, Liverpool L69 3GL, UK

    • Julian A. Hiscox &
    • Miles W. Carroll
  27. University of Makeni, Makeni, Sierra Leone

    • Umaru Jah,
    • Luke W. Meredith &
    • Ian Goodfellow
  28. Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China

    • Di Liu &
    • George F. Gao
  29. University of Bristol, Bristol BS8 1TD, UK

    • David A. Matthews
  30. Institute of Microbiology and Infection, University of Birmingham, Birmingham B15 2TT, UK

    • Joshua Quick &
    • Nicholas J. Loman
  31. University of Nebraska Medical Center, Omaha, Nebraska 68198, USA

    • Mariano Sanchez-Lockhart &
    • Michael R. Wiley
  32. Department of Pediatrics, Section of Infectious Diseases, New Orleans, Louisiana 70112, USA

    • John S. Schieffelin &
    • Sarah M. Winnicki
  33. Center for Computational Biology, Flatiron Institute, New York, New York 10010, USA

    • Rachel S. Sealfon
  34. Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA

    • Rachel S. Sealfon
  35. Institut Pasteur, Functional Genetics of Infectious Diseases Unit, 28 rue du Docteur Roux, 75724 Paris Cedex 15, France

    • Etienne Simon-Loriere
  36. Génétique Fonctionelle des Maladies Infectieuses, CNRS URA3012, Paris 75015, France

    • Etienne Simon-Loriere
  37. Bundeswehr Institute of Microbiology, Neuherbergstrasse 11, 80937 Munich, Germany

    • Kilian Stoecker &
    • Roman Wölfel
  38. Viral Special Pathogens Branch, Centers for Disease Control and Prevention, 1600 Clifton Road NE, Atlanta, Georgia 30333, USA

    • Shannon Whitmer,
    • Stuart T. Nichol &
    • Ute Ströher
  39. The Scripps Research Institute, Department of Immunology and Microbial Science, La Jolla, California 92037, USA

    • Kristian G. Andersen
  40. Scripps Translational Science Institute, La Jolla, California 92037, USA

    • Kristian G. Andersen
  41. Ministry of Social Welfare, Gender and Children’s Affairs, New Englandville, Freetown, Sierra Leone

    • Sylvia O. Blyden
  42. University of Southampton, South General Hospital, Southampton SO16 6YD, UK

    • Miles W. Carroll
  43. Ministry of Health Liberia, Monrovia, Liberia

    • Bernice Dahn &
    • Tolbert Nyenswah
  44. World Health Organization, Conakry, Guinea

    • Boubacar Diallo
  45. World Health Organization, Geneva, Switzerland

    • Pierre Formenty &
    • Dhamari Naidoo
  46. Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK

    • Christophe Fraser
  47. Chinese Center for Disease Control and Prevention (China CDC), Beijing 102206, China

    • George F. Gao
  48. Department of Microbiology and Immunology, New Orleans, Louisiana 70112, USA

    • Robert F. Garry
  49. Department of Biological Sciences, Redeemer’s University, Ede, Osun State, Nigeria

    • Christian T. Happi
  50. African Center of Excellence for Genomics of Infectious Diseases (ACEGID), Redeemer’s University, Ede, Osun State, Nigeria

    • Christian T. Happi
  51. Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, the University of Sydney, Sydney, New South Wales 2006, Australia

    • Edward C. Holmes
  52. Ministry of Health Guinea, Conakry, Guinea

    • Sakoba Keïta
  53. Division of Infectious Diseases, Faculty of Medicine, Imperial College London, London W2 1PG, UK

    • Paul Kellam
  54. Integrated Research Facility at Fort Detrick, National Institute of Allergy and Infectious Diseases, National Institutes of Health, B-8200 Research Plaza, Fort Detrick, Frederick, Maryland 21702, USA

    • Jens H. Kuhn
  55. Université Gamal Abdel Nasser de Conakry, Laboratoire des Fièvres Hémorragiques en Guinée, Conakry, Guinea

    • N’Faly Magassouba
  56. Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, California 90095, USA

    • Marc A. Suchard
  57. Department of Biomathematics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, California 90095, USA

    • Marc A. Suchard
  58. Department of Human Genetics, David Geffen School of Medicine at UCLA, University of California, Los Angeles, California 90095, USA

    • Marc A. Suchard
  59. Centre for Immunology, Infection and Evolution, University of Edinburgh, King’s Buildings, Edinburgh, EH9 3FL, UK

    • Andrew Rambaut
  60. Fogarty International Center, National Institutes of Health, Bethesda, Maryland 20892, USA

    • Andrew Rambaut

Contributions

G.D., L.M.C., T.B., C.F., M.A.S., P.L. and A.R. designed the study. G.D., L.M.C., T.B., A.J.T., G.B., P.L. and A.R. performed the analysis. G.D., T.B., M.A.S., P.L. and A.R. wrote the manuscript. L.M.C., A.J.T., G.B., N.R.F., J.T.L., M.C., S.F.S., K.G.A., M.W.C., R.F.G., I.G., E.C.H., P.K., M.P.G.K., J.H.K., S.T.N., G.Pa., O.G.P., P.C.S. and U.S. edited the manuscript. The other authors were critical for the coordination, collection, processing of virus samples or the sequencing and bioinformatics of virus genomes. All authors read and approved the contents of the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to:

Reviewer Information: Nature thanks R. Biek, C. Viboud, M. Worobey and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Distribution and correlation of EVD cases and EBOV sequences. (484 KB)

    a, Administrative regions within Guinea (green), Sierra Leone (blue) and Liberia (red); shading is proportional to the cumulative number of known and suspected EVD cases in each region. Darkest shades represent 784 cases for Guinea (Macenta prefecture); 3,219 cases for Sierra Leone (Western Area urban district); and 2,925 cases for Liberia (Montserrado county); hatching indicates regions without reported EVD cases. Circle diameters are proportional to the number of EBOV genomes available from that region over the entire EVD epidemic with the largest circle representing 152 sequences. Crosses mark regions for which no sequences are available. Circles and crosses are positioned at population centroids within each region. b, A plot of number of EBOV genomes sampled against the known and suspected cumulative EVD case numbers. Regions in Guinea are denoted in green, Sierra Leone in blue and Liberia in red. Spearman correlation coefficient: 0.93.

  2. Extended Data Figure 2: Dispersal of virus lineages over time. (548 KB)

    Virus dispersal between administrative regions estimated using the GLM phylogeography model (see Methods). The arcs are between population centroids of each region, show directionality from the thin end to the thick end and are coloured in a scale denoting time from December 2013 in blue to October 2015 in yellow. Countries are coloured with Liberia in red, Guinea in green and Sierra Leone in blue.

  3. Extended Data Figure 3: Inference of GLM predictors in a ‘real-time’ context. (166 KB)

    For the dataset constructed from EBOV genome sequences derived from samples taken up until October 2014 (blue), the same 5 spatial EBOV movement predictors were given categorical support (inclusion probabilities = 1.0) as for the full dataset (red). Likewise, the coefficients for these predictors are consistent in their sign and magnitude.

  4. Extended Data Figure 4: The effect of borders on EBOV migration rates between regions. (129 KB)

    Posterior densities for the migration rates between locations that share a geographical border and those that do not share borders for international migrations and national migrations. Where two regions share a border (right y axis), national migrations are only marginally more frequent than international migrations showing that both types of borders are porous to short local movement. Where the two regions are not adjacent (left y axis), international migrations are much rarer than national migrations.

  5. Extended Data Figure 5: Summarized international migration history of the epidemic. (718 KB)

    a, b, All viral movement events between countries (Guinea, green; Sierra Leone, blue; Liberia, red) are shown split by whether they are between regions that are geographically distant (a) or regions that share the international border (b). Curved lines indicate median (intermediate colour intensity), and 95% highest posterior density intervals (lightest and darkest colour intensities) for the number of migrations that are inferred to have taken place between countries.

  6. Extended Data Figure 6: Comparison of predicted and observed numbers of introductions and case numbers. (474 KB)

    a, b, Left, scatter plots show inferred introduction numbers (a) or observed case numbers (b), coloured by region as in Extended Data Fig. 1. Administrative regions that did not report any cases are indicated with empty circles on the scatter plot. Right, administrative regions on the map are coloured by the residuals (as observed/predicted) of the scatter plot. Regions are coloured grey where 0.5 < observed/predicted < 2.0 and transition into red or blue colours for overestimation or underestimation, respectively.

  7. Extended Data Figure 7: Region-specific introductions, cluster sizes and persistence. (632 KB)

    Each row summarizes independent introductions and the sizes (as numbers of sequences) of resulting outbreak clusters. Clusters are coloured by their inferred region of origin (colours are the same as in Extended Data Fig. 1). The horizontal lines represent the persistence of each cluster from the time of introduction to the last sampled case (individual tips have persistence 0). The areas of the circles in the middle of the lines are proportional to the number of sequenced cases in the cluster. The areas of the circles next to the labels on the left represent the population sizes of each administrative region. Vertical lines within each cell indicate the dates of declared border closures by each of the three countries: 11 June 2014 in Sierra Leone (blue), 27 July 2014 in Liberia (red) and 09 August 2014 in Guinea (green).

  8. Extended Data Figure 8: Kernel density estimates for inferred epidemiological statistics. (259 KB)

    From top to bottom, distance travelled (distance between population centroids, in kilometres); number of introductions that each location experienced; cluster size (number of sequences collected in a location as a result of a single introduction); cluster persistence (days from the common ancestor of a cluster to its last descendent; single tips have persistence of 0). Left, analysis for Sierra Leone (blue), Liberia (red) and Guinea (green). Right, analysis for before October 2014 (grey) and after October 2014 (orange). Points with vertical lines connected to the x axis indicate the 50% and 95% quantiles of the parameter density estimates. Within Sierra Leone, Liberia and Guinea, 50% of all migrations occurred over distances of around 100 km and persisted for around 25 days. Exceptions were for Sierra Leone, which experienced more introductions per location (around 12) than Guinea and Liberia (around 4); and Guinea, where migrations tended to occur over larger distances owing to the size of the country and whose cluster sizes following introductions tended to be lower (3 sequences versus Liberia and Sierra Leone, which had 5 sequences each). Between the first (grey) and second (orange) years of the epidemic there were considerable reductions in cluster persistence, cluster sizes and distances travelled by viruses, whereas dispersal intensity remained largely the same.

  9. Extended Data Figure 9: Relationship between cluster size, introductions or persistence and population size. (502 KB)

    a, The mean number of introductions into each location against (log) population sizes. The Western Area (in Sierra Leone) received the most introductions, whereas Conakry and Montserrado were closer to the average. The association between population size and the number of introductions was not very strong (R² = 0.28, Pearson correlation = 0.54, Spearman correlation = 0.57). b, The mean cluster size for each location plotted against (log) population sizes. The association is weaker than for a (R² = 0.11, Pearson correlation = 0.35, Spearman correlation = 0.57). c, The mean persistence times (per cluster, in days) against population sizes. A similarly weak association is observed as in b (R² = 0.12, Pearson correlation = 0.37, Spearman correlation = 0.36). All computations were based on a sample of 10,000 trees from the posterior distribution.
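
The three summary statistics quoted in Extended Data Fig. 9 can be illustrated with a minimal Python sketch. It assumes that R² is the squared Pearson coefficient of a simple linear fit against log-transformed population size, which the caption does not state explicitly. The population sizes and introduction counts below are hypothetical placeholders, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five administrative regions (illustration only).
population = [50_000, 120_000, 300_000, 800_000, 1_500_000]
introductions = [2.0, 3.5, 3.0, 6.0, 12.0]

log_population = [math.log10(p) for p in population]
r = pearson(log_population, introductions)
rho = spearman(log_population, introductions)
r_squared = r ** 2  # equals R² of a simple linear regression (assumption above)

print(f"Pearson r = {r:.2f}, R^2 = {r_squared:.2f}, Spearman rho = {rho:.2f}")
```

Because Spearman's coefficient depends only on rank order, it is insensitive to the log transformation of population size, whereas the Pearson coefficient and R² are not.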

Extended Data Tables

  1. Extended Data Table 1: Predictors included in the time-homogeneous GLM (470 KB)

Supplementary information

Video

  1. Video 1: Reconstructed history of the West African Ebola virus epidemic (10.94 MB)
    A map of the three most affected countries (Guinea, Liberia and Sierra Leone) is shown on the left. Colours indicate country: Guinea is green, Liberia is red and Sierra Leone is blue. Weekly incidence of EVD cases is indicated by shading of administrative divisions (darker shades correspond to more cases, on a logarithmic scale) within each country. Cases are linearly interpolated between successive reporting weeks. Inferred movements of Ebola virus are indicated with tapered projectiles, coloured by origin country (Guinea in green, Sierra Leone in blue, Liberia in red) if the lineage crosses an international border, and black otherwise. Red circles at population centroids of each administrative division indicate the number of lineages estimated to be present within the location. The phylogenetic tree in the upper right shows the relationships between sampled Ebola lineages, with branches coloured by location (lighter shades indicate locations further west within each country). Migrations inferred between any two locations in the tree are animated on the map on the left. The plot on the lower right shows the sum of weekly cases reported for each administrative division, for each individual country (Guinea in green, Sierra Leone in blue, Liberia in red). Weekly cases for individual administrative divisions are animated as changes in each administrative division's colour on the map on the left.

PDF files

  1. Supplementary Table (40 KB)

    This file contains Supplementary Table 1.

Additional data