Genomic insights into the 2016–2017 cholera epidemic in Yemen


Yemen is currently experiencing, to our knowledge, the largest cholera epidemic in recent history. The first cases were declared in September 2016, and over 1.1 million cases and 2,300 deaths have since been reported [1]. Here we investigate the phylogenetic relationships, pathogenesis and determinants of antimicrobial resistance by sequencing the genomes of Vibrio cholerae isolates from the epidemic in Yemen and recent isolates from neighbouring regions. These 116 genomic sequences were placed within the phylogenetic context of a global collection of 1,087 isolates of the seventh pandemic V. cholerae serogroups O1 and O139 biotype El Tor [2,3,4]. We show that the isolates from Yemen that were collected during the two epidemiological waves of the epidemic [1] (the first between 28 September 2016 and 23 April 2017, with 25,839 suspected cases, and the second beginning on 24 April 2017, with more than 1 million suspected cases) are V. cholerae serotype Ogawa isolates from a single sublineage of the seventh pandemic V. cholerae O1 El Tor (7PET) lineage. Using genomic approaches, we link the epidemic in Yemen to global radiations of pandemic V. cholerae and show that this sublineage originated from South Asia and that it caused outbreaks in East Africa before appearing in Yemen. Furthermore, we show that the isolates from Yemen are susceptible to several antibiotics that are commonly used to treat cholera and to polymyxin B, resistance to which is used as a marker of the El Tor biotype.


Fig. 1: Geographical location of the sequenced V. cholerae O1 El Tor isolates and number of reported cholera cases.
Fig. 2: Phylogenetic relatedness of the V. cholerae O1 El Tor isolates from the 2016–2017 epidemic in Yemen.

Data availability

The whole-genome alignment for the 1,203 genomes and other files that support the findings of this study have been deposited in FigShare. Short-read sequence data were submitted to the ENA under study accession numbers PRJEB24611 and ERP021285, and the genome accession numbers are provided in Supplementary Table 1. The phylogeny and metadata can also be viewed interactively online.

Change history

  • 12 February 2019

    In the HTML version of this Letter, the affiliations for authors Andrew S. Azman, Dhirendra Kumar and Thandavarayan Ramamurthy were inverted (the PDF and print versions of the Letter were correct); the affiliations have been corrected online.


References

  1. Camacho, A. et al. Cholera epidemic in Yemen, 2016–18: an analysis of surveillance data. Lancet Glob. Health 6, e680–e690 (2018).
  2. Weill, F. X. et al. Genomic history of the seventh pandemic of cholera in Africa. Science 358, 785–789 (2017).
  3. Kachwamba, Y. et al. Genetic characterization of Vibrio cholerae O1 isolates from outbreaks between 2011 and 2015 in Tanzania. BMC Infect. Dis. 17, 157 (2017).
  4. Bwire, G. et al. Molecular characterization of Vibrio cholerae responsible for cholera epidemics in Uganda by PCR, MLVA and WGS. PLoS Negl. Trop. Dis. 12, e0006492 (2018).
  5. Mutreja, A. et al. Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477, 462–465 (2011).
  6. Naha, A. et al. Development and evaluation of a PCR assay for tracking the emergence and dissemination of Haitian variant ctxB in Vibrio cholerae O1 strains isolated from Kolkata, India. J. Clin. Microbiol. 50, 1733–1736 (2012).
  7. Chin, C. S. et al. The origin of the Haitian cholera outbreak strain. N. Engl. J. Med. 364, 33–42 (2011).
  8. Domman, D. et al. Integrated view of Vibrio cholerae in the Americas. Science 358, 789–793 (2017).
  9. Katz, L. S. et al. Evolutionary dynamics of Vibrio cholerae O1 following a single-source introduction to Haiti. mBio 4, e00398-13 (2013).
  10. Ghosh Dastidar, P., Sinha, A. M., Ghosh, S. & Chatterjee, G. C. Biochemical mechanism of nitrofurantoin resistance in Vibrio el tor. Folia Microbiol. (Praha) 24, 487–494 (1979).
  11. Sandegren, L., Lindqvist, A., Kahlmeter, G. & Andersson, D. I. Nitrofurantoin resistance mechanism and fitness cost in Escherichia coli. J. Antimicrob. Chemother. 62, 495–503 (2008).
  12. Herrera, C. M. et al. The Vibrio cholerae VprA–VprB two-component system controls virulence through endotoxin modification. mBio 5, e02283-14 (2014).
  13. Matson, J. S., Livny, J. & DiRita, V. J. A putative Vibrio cholerae two-component system controls a conserved periplasmic protein in response to the antimicrobial peptide polymyxin B. PLoS ONE 12, e0186199 (2017).
  14. Devault, A. M. et al. Second-pandemic strain of Vibrio cholerae from the Philadelphia cholera outbreak of 1849. N. Engl. J. Med. 370, 334–340 (2014).
  15. Samanta, P., Ghosh, P., Chowdhury, G., Ramamurthy, T. & Mukhopadhyay, A. K. Sensitivity to polymyxin B in El Tor Vibrio cholerae O1 strain, Kolkata, India. Emerg. Infect. Dis. 21, 2100–2102 (2015).
  16. Hasan, N. A. et al. Genomic diversity of 2010 Haitian cholera outbreak strains. Proc. Natl Acad. Sci. USA 109, E2010–E2017 (2012).
  17. Zarocostas, J. Cholera outbreak in Haiti-from 2010 to today. Lancet 389, 2274–2275 (2017).
  18. UN Office for the Coordination of Humanitarian Affairs. Humanitarian needs overview, Yemen. (2017).
  19. International Organization for Migration. Irregular migration in Horn of Africa increases in 2015. (2016).
  20. Danish Refugee Council. Mixed migration in the Horn of Africa & Yemen region. RMMS (2016).
  21. Dodin, A. & Fournier, J. M. Laboratory Methods for the Diagnosis of Cholera Vibrio and Other Vibrios 59–82 (Institut Pasteur Paris, Paris 1992).
  22. CA-SFM & EUCAST. Comité de l’Antibiogramme de la Société Française de Microbiologie Recommandations 2017. (2017).
  23. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
  24. Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455–477 (2012).
  25. Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
  26. Croucher, N. J. et al. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res. 43, e15 (2015).
  27. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
  28. Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012).
  29. Rambaut, A., Lam, T. T., Max Carvalho, L. & Pybus, O. G. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). Virus Evol. 2, vew007 (2016).
  30. Rieux, A. & Khatchikian, C. E. tipdatingbeast: an R package to assist the implementation of phylogenetic tip-dating tests using beast. Mol. Ecol. Resour. 17, 608–613 (2017).



Acknowledgements

This study was supported by the Institut Pasteur, Santé publique France, the French government’s Investissement d’Avenir programme, Laboratoire d’Excellence ‘Integrative Biology of Emerging Infectious Diseases’ (grant number ANR-10-LABX-62-IBEID), the Wellcome Trust through grant 098051 to the Sanger Institute and the Department of Biotechnology of India. The Institut Pasteur Genomics Platform is a member of the France Génomique consortium (ANR10-INBS-09-08). We thank D. Legros, A. Fadaq, A. Alsomine, F. Bazel and H. A. Jokhdar for their support; M. Musoke and S. Vernadat for technical assistance; Z. M. Eisa for providing isolates; and L. Ma, C. Fund, S. Sjunnebo and the sequencing teams at the Institut Pasteur and Wellcome Sanger Institute for sequencing the samples. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Reviewer information

Nature thanks J. Mekalanos, C. Stine, M. Waldor and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information




F.-X.W. and M.-L.Q. designed the study. A.A.A., M.N., S.S.N., A.R., A.M.A., N.C.S., S.K., M.R.P., A.A., J.Y.C., J.F.W., C.S., B.B., H.H.N.A., D.K., S.M.N., M.R.M., J.K., F.J.L., A.S.A., T.R. and M.-L.Q. collected, selected and provided characterized isolates and their corresponding epidemiological information. J.R. performed the DNA extractions and phenotypic and molecular typing experiments. T.M. analysed protein data. C.B. performed the whole-genome sequencing. F.-X.W., D.D. and E.N. analysed the genomic sequencing data. F.-X.W. and D.D. wrote the manuscript, with major contributions from N.R.T. All authors contributed to the editing of the manuscript.

Corresponding author

Correspondence to François-Xavier Weill.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Geographic location of the sequenced V. cholerae O1 El Tor isolates and number of reported cholera cases.

a, Geographic location of the 116 V. cholerae O1 El Tor isolates sequenced. The number of isolates collected per country is indicated. The three isolates collected in Jizan, Saudi Arabia (denoted by an asterisk) were from Yemeni refugees originating from Hajjah District. The map is a cropped version of a map available online. b, Number of cholera cases per country reported to the World Health Organization (WHO) between 2014 and 2016. The total number of cholera cases reported to the WHO by the countries was 268,337. The maps were created using Paintmaps, a free online map-generating tool.

Extended Data Fig. 2 Assessment of the temporal signal within the dataset.

a, Linear regression of the root-to-tip distance against sampling time obtained with TempEst [29] using a maximum-likelihood phylogeny of 81 representative seventh pandemic V. cholerae O1 isolates (that is, those used for the BEAST analysis). Bars on nodes indicate the precision of the isolation date (for example, if only the year of isolation is known, the bar spans the entire year). b, Comparison of the ucld.mean parameter estimated from 20 date-randomization BEAST experiments and the original dataset. The rate for the correctly dated tree is shown in red. The median and 95% Bayesian credible interval for the ucld.mean parameter are provided.
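The logic of a temporal-signal check can be illustrated with a minimal Python sketch. It is not the TempEst or BEAST workflow used in the study: the sampling dates and root-to-tip distances are invented, the function names are hypothetical, and the slope of an ordinary least-squares regression stands in for the ucld.mean parameter that the date-randomization test compares across BEAST runs.

```python
import random
from statistics import mean


def fit_line(dates, distances):
    """Least-squares fit of distance = rate * date + intercept."""
    mx, my = mean(dates), mean(distances)
    sxx = sum((x - mx) ** 2 for x in dates)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dates, distances))
    rate = sxy / sxx
    return rate, my - rate * mx


def root_age(rate, intercept):
    """Date at which the fitted line reaches zero root-to-tip distance."""
    return -intercept / rate


def randomized_rates(dates, distances, n_replicates=20, seed=0):
    """Refit the regression after shuffling the dates among the tips.

    Assumption: the regression slope is used here as a crude proxy for a
    clock rate; the study itself re-ran BEAST on date-randomized datasets.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_replicates):
        shuffled = list(dates)
        rng.shuffle(shuffled)
        rates.append(fit_line(shuffled, distances)[0])
    return rates


# Invented toy data (decimal years, substitutions per site), not study data.
dates = [2010.0, 2012.0, 2014.0, 2016.0, 2017.0]
distances = [0.0010, 0.0014, 0.0018, 0.0022, 0.0024]

rate, intercept = fit_line(dates, distances)
null_rates = randomized_rates(dates, distances)
print("estimated rate:", rate)
print("estimated root date:", root_age(rate, intercept))
print("largest randomized rate:", max(null_rates))
```

A dataset carries temporal signal when the rate from the correctly dated tips stands clearly apart from the rates obtained after randomization, which is the comparison summarized in panel b.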

Extended Data Fig. 3 Timed phylogeny of the ctxB7 clade.

Maximum clade credibility tree produced with BEAST [28] for a subset of 81 representative isolates of the distal part of genomic wave 3 (that is, those with the ctxB7 allele). The nodes supported by posterior probability values ≥0.5 are indicated.

Extended Data Fig. 4 Visualization of the posterior distribution of trees from the BEAST Markov chain Monte Carlo analysis.

The opacity of the branches is scaled according to the number of times a clade is seen in the distribution. There is high support for the East Africa/Yemen clade. The uncertainty in the placement of the node for the Indian/East African isolates is the reason for the low posterior support value for this node in Extended Data Fig. 3.
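The relationship between clade frequency and posterior support can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the visualization software used for the figure: each sampled tree is represented simply as a set of clades, each clade is a frozenset of tip labels, and the labels and the four-tree sample are invented.

```python
from collections import Counter


def clade_support(trees):
    """Return the fraction of sampled trees that contain each clade."""
    counts = Counter()
    for clades in trees:
        counts.update(set(clades))  # count each clade once per tree
    n_trees = len(trees)
    return {clade: count / n_trees for clade, count in counts.items()}


# Hypothetical tip labels and posterior sample (not study data).
east_africa_yemen = frozenset({"EastAfrica_1", "EastAfrica_2", "Yemen_1"})
india_with_east_africa = frozenset(
    {"India_1", "EastAfrica_1", "EastAfrica_2", "Yemen_1"}
)
india_elsewhere = frozenset({"India_1", "Other_1"})

posterior_sample = [
    {east_africa_yemen, india_with_east_africa},
    {east_africa_yemen, india_with_east_africa},
    {east_africa_yemen, india_elsewhere},
    {east_africa_yemen, india_elsewhere},
]

support = clade_support(posterior_sample)
for clade, value in support.items():
    # In a density plot, a value like this could drive branch opacity.
    print(sorted(clade), value)
```

In this toy sample the first clade appears in every tree while the placement of the hypothetical Indian tip alternates, which mirrors how a well-supported clade can sit next to a node with low posterior support.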

Extended Data Fig. 5 Multiple sequence alignment of VprA (VC1320) with two-component response regulators.

A non-synonymous mutation at position 89 of VC1320 that resulted in a D-to-N amino acid change was associated with a phenotype of polymyxin B susceptibility.
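As a minimal illustration of how such a change is read from an alignment, the Python sketch below lists residue differences between two aligned protein sequences. The sequences are placeholders built only so that position 89 differs; they are not the VprA sequence, the function names are hypothetical, and the sketch assumes an ungapped alignment so that alignment columns equal protein positions.

```python
def substitutions(reference, variant):
    """List (position, ref_residue, var_residue) for aligned sequences.

    Positions are 1-based alignment columns; columns with a gap ('-')
    in either sequence are skipped.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have equal length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var and ref != "-" and var != "-"
    ]


def format_substitution(pos, ref, var):
    """Format a change in the usual residue-position-residue notation."""
    return f"{ref}{pos}{var}"


# Placeholder sequences (NOT the real protein): 88 alanines, then D or N.
reference = "A" * 88 + "D" + "A" * 10
variant = "A" * 88 + "N" + "A" * 10

changes = substitutions(reference, variant)
labels = [format_substitution(*change) for change in changes]
print(labels)
```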

Extended Data Table 1 Summary of the Bayesian models used for BEAST [28] analyses
Extended Data Table 2 Gene alteration frequencies in isolates susceptible or resistant to certain antibiotics

Supplementary information

Reporting Summary

Supplementary Table

This file contains Supplementary Table S1, which includes the details of the Vibrio cholerae El Tor isolates and genomes used in this study.


About this article


Cite this article

Weill, F., Domman, D., Njamkepo, E. et al. Genomic insights into the 2016–2017 cholera epidemic in Yemen. Nature 565, 230–233 (2019).
