Abstract
In 2022, Bangladesh experienced one of its worst cholera outbreaks: the icddr,b Dhaka hospital treated more than 1300 patients per day and ca. 42,000 diarrheal cases from March 1 to April 10, 2022 (ref. 1). Here, we present genomic attributes of V. cholerae O1 responsible for the 2022 Dhaka outbreak and 960 7th pandemic El Tor (7PET) strains from 88 countries. Results show that strains isolated during the Dhaka outbreak cluster with 7PET wave-3 global clade strains but comprise a distinct subclade, BD-1.2, whose most recent common ancestor appears to be the lineage responsible for recent endemic cholera in India. BD-1.2 strains have been present in Bangladesh since 2016 but did not establish dominance over BD-2 lineage strains2 until 2018, since when they have been predominantly associated with endemic cholera. In conclusion, the recent shift in lineage and genetic attributes, including serotype switching of BD-1.2 from Ogawa to Inaba, may explain the increasing number of cholera cases in Bangladesh.
Introduction
Vibrio cholerae, native to the aquatic environment and the causative agent of cholera, has undergone continuous evolution in different parts of the world3. It gained mutations, genomic islands, and phages during its evolution4. According to a recent study, V. cholerae O1 El Tor strains responsible for the ongoing seventh pandemic evolved from non-pathogenic ancestors, acquiring the El Tor form of the tcpA gene, CTX prophage, vibrio seventh pandemic island I (VSP-I), and vibrio seventh pandemic island II (VSP-II), and exhibited high spreading capability in 1961 on the Indonesian island of Sulawesi5. Strains of the El Tor biotype were isolated in Bangladesh in 1963, in India in 1964, in the former U.S.S.R., Iran, and Iraq from 1965 to 1966, and in Africa from 1970 to 1971 (ref. 6). The Ganges Delta of the Bay of Bengal is the historical hotspot for the evolution of the V. cholerae pandemic clone7. According to a recent study, the 7th pandemic El Tor (7PET) strains spread out from the Bay of Bengal in at least three different but overlapping waves, with numerous transcontinental transmission events8. Genomic studies identified 13 transmission lineages (T1-T13) in Africa9 and three transmission lineages (LAT1-LAT3) in Latin America10. In recent years, two contemporary circulating lineages belonging to 7PET wave-3 were reported in Asia11. Two recently circulating lineages in Bangladesh were identified as BD-1 and BD-2, differing significantly in genomic attributes, e.g., mutant genes, heterogeneity in VSP-II, vibrio pathogenic island 1 (VPI-1), mobile genetic elements, toxin encoding elements, total gene abundance, phage-inducible chromosomal island-like element (PLE), and SXT-related integrating conjugative elements (SXT ICE)2. These genes and genomic islands have a crucial role in adaptation of the bacterium.
For example, SXT ICE elements encode resistance to multiple antibiotics12, VSP-II facilitates chemotactic responses and cell congregation13, and PLE protects against bacteriophage infection14. The BD-2 strains had more SNPs and indels (insertions-deletions) than BD-1 and greater gene abundance, including antimicrobial resistance genes, gene cassettes, and PLE conferring protection against bacteriophage infection, and were predominantly associated with endemic cholera in Bangladesh between 2013 and 2017 (ref. 2).
In 2022, Dhaka, the capital city of Bangladesh, experienced a massive cholera outbreak, the largest in 20–25 years. The icddr,b hospital treated a record number of daily patients (more than 1300) and ca. 42,000 diarrhea cases in total between March 1 and April 10, 2022 (ref. 1). It was imperative to investigate isolates of the causative agent, V. cholerae, employing whole genome sequencing and comparative genomics to understand the genetic drivers of such a massive cholera outbreak. Therefore, the genomes of 21 V. cholerae isolates from diarrhea samples collected between February and April 2022 from patients admitted to the icddr,b hospital, Bangladesh, were sequenced, and those sequences were compared with 267 genome sequences of strains from our laboratory collection (1991–2021) and 693 sequences retrieved from a public database15, which included strains isolated from 1957 to 2017. In this work, we explore the recent evolutionary changes and transmission dynamics of the V. cholerae O1 El Tor Dhaka outbreak strains.
Results
Phylogenetic relationships and genetic clusters
Whole-genome sequence analysis of V. cholerae O1 El Tor strains associated with endemic cholera in Dhaka, Bangladesh, and Kolkata, India, showed two circulating lineages, one dominant in India and the other in Bangladesh11. To understand their evolutionary dynamics, a temporal genomic study of V. cholerae associated with endemic cholera in Dhaka, Bangladesh (1991–2017) was undertaken2. Results revealed distinct genomic attributes for the two circulating lineages, BD-1 and BD-2, which were negatively associated with each other in endemic cholera in Bangladesh. In the present study, V. cholerae strains isolated between 1991 and 2022 in Bangladesh were analyzed first, and results showed that the majority of strains isolated between 2018 and 2022 clustered with BD-1 but formed a separate subclade (BD-1.2), indicating a shift back to predominance of BD-1-like strains in endemic cholera in Bangladesh (Fig. 1). V. cholerae strains phylogenetically related to BD-1 (Asian lineage 2) were predominant in India during the period when BD-2 strains were predominant in Bangladesh11. A comparative genomic analysis of Bangladeshi strains together with strains isolated in India between 1991 and 2016 showed BD-1.2 strains to be genetically similar to the Indian strains (Supplementary Fig. 1). For further investigation, a maximum-likelihood phylogenetic tree was constructed for a total of 981 genome sequences of V. cholerae O1 El Tor strains isolated from 88 countries between 1957 and 2022 (Supplementary Data 1). A total of 6399 high-quality SNPs at non-repetitive, non-recombinant core genome sites were analyzed for maximum-likelihood phylogenetic tree reconstruction. The lineages identified among strains isolated in Bangladesh were annotated in the phylogenetic tree together with strains of recently recognized transmission lineages (T9-T13) in Africa9 and Latin American transmission lineage 3 (LAT-3)10.
Three distinct subclades of strains isolated in Bangladesh, BD-1, BD-1.1, and BD-1.2, clustered with a single clade of globally distributed strains (Fig. 2a). However, BD-2, representing the recent predominant lineage in Bangladesh, belonged to a separate clade that included strains isolated in Asia. Henceforth, the first clade is referred to as the global clade and the clade containing BD-2 strains as the Asian clade. Strains of the global clade have been reported to be associated with endemic and epidemic cholera in different parts of the world, including the 2010 cholera epidemic in Haiti16 and the 2016–2017 cholera epidemic in Yemen17. In the constructed phylogenetic tree, BD-1 strains isolated during 1999–2007 were located at the bottom of the global clade (Fig. 2a), followed by strains from China, Thailand, Mozambique, Zimbabwe, India, and Nepal. BD-1.1 strains isolated from Bangladesh between 2008 and 2014 clustered below the Indian strains isolated between 2008 and 2009. BD-1.1 strains differed from BD-1 strains in certain genomic features: BD-1 strains had the ctxB1 genotype, while BD-1.1 strains had the ctxB7 genotype, similar to strains isolated from India and Nepal. Strains carrying the ctxB7 genotype were reported in the massive Haitian cholera outbreak in 2010 (ref. 16). BD-1.1 subclade strains overlapped in predominance with BD-2 strains in endemic cholera in Bangladesh. However, strains of this subclade have not been found clinically since 2015.
African strains (T12) formed a distinct subclade in the global clade, closely related to strains isolated in India and Nepal in 2010, followed by Latin American (LAT-3) strains, reported in an earlier study10 as having come from Africa. Another report suggested relatedness of strains in Africa (T13) with Indian strains isolated between 2011 and 2015 (ref. 9). Strains isolated between 2018 and 2022 in Bangladesh formed a subclade, BD-1.2, separate from T13 strains and phylogenetically close to strains isolated from India in 2016. Hence, the Bangladesh outbreak strains may represent an expansion of the global clade as subclade BD-1.2 (Fig. 2a, b).
Markov chain Monte Carlo (MCMC) phylodynamic analysis using BEAST18 placed the most recent common ancestors (MRCAs) of the BD-1.1 and BD-1.2 strains from the present study with cholera strains isolated in India, dated to 2005 (95% highest posterior density (HPD): 2004-2006) and 2015 (95% HPD: 2015-2016), respectively (Fig. 2b). The data suggest temporal shifts in predominance among the clade-specific strains isolated from Bangladesh (Fig. 2c). BD-1 and BD-2 overlapped but were negatively related in predominance from 2001 to 2007. From 2007 to 2015, strains of both the BD-1.1 subclade and BD-2 were detected; however, BD-2 comprised most of the isolates from 2013 to 2017. Interestingly, BD-1.2 strains were isolated in high numbers in 2018 and in subsequent years, suggesting a reverse shift to the global clade in Bangladesh. The results suggest that endemic and epidemic cholera in Bangladesh has progressed via evolution, transmission, and temporal transitions in predominance between lineages over the years.
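The temporal shift in predominance described above amounts to tabulating, for each year, the share of isolates assigned to each lineage. The following is a minimal Python sketch of that tabulation, not the authors' code; the function names and the toy records are hypothetical and serve only as illustration.

```python
from collections import Counter, defaultdict


def lineage_share_by_year(isolates):
    """Return {year: {lineage: fraction of that year's isolates}}.

    `isolates` is an iterable of (year, lineage) pairs.
    """
    counts = defaultdict(Counter)
    for year, lineage in isolates:
        counts[year][lineage] += 1
    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        shares[year] = {lin: n / total for lin, n in counts[year].items()}
    return shares


def predominant_lineage(shares):
    """Return {year: lineage with the largest share in that year}."""
    return {year: max(by_lin, key=by_lin.get) for year, by_lin in shares.items()}


# Hypothetical toy records (NOT data from the study).
toy_isolates = [
    (2016, "BD-2"), (2016, "BD-2"), (2016, "BD-1.2"),
    (2018, "BD-1.2"), (2018, "BD-1.2"), (2018, "BD-2"),
]
shares = lineage_share_by_year(toy_isolates)
dominant = predominant_lineage(shares)
```

With real metadata, the per-year shares could be plotted as stacked bars to visualize the transitions between lineages.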
Emergence of the contemporary lineages
Pandemic V. cholerae strains continue to evolve by acquiring mutations in their core genome, thus giving rise to various lineages. In addition to the six defined lineages reported in prior studies2,9,10, nine additional lineages/groups of the global and Asian clades were defined (Supplementary Data 2 and Fig. 3a). These include BD-1.1 and BD-1.2 in Bangladesh, IND-1, IND-1.1, IND-1.2, IND-1.3, and IND-2 from India, and AS-1 and AS-2, isolated from several Asian countries. SNPs and indels (insertions-deletions) of the lineages were compared to identify the set of mutations acquired during emergence of the lineages and genetic changes with adaptive implications. A chi-square test was used to analyze allelic diversity among the lineages and identify associated SNPs and indels (Fig. 3a). A total of 134 SNPs had alternative alleles in all strains of at least one lineage. Hierarchical cluster analysis showed several SNP groups that arose in one of the emergent parental lineages and were sustained in descendant lineages. For example, the accumulation of SNPs in cluster A was observed in strains of lineages of the globally distributed 7th pandemic El Tor clade. A total of 26 SNPs identified in subcluster A1 were common to strains of the global clade (Supplementary Data 2). Asian clade lineages had the wild-type SNP alleles of El Tor reference strain N16961. Similarly, 31 SNPs of subcluster B1 were detected in all lineages of the Asian clade, but not in the global clade. SNPs of subclusters A2 and A3 were accumulated by specific lineages of the global clade, e.g., of the 9 SNPs in subcluster A2, two were accumulated by AS-1, three by IND-1, and four by IND-1.1. Likewise, 15 SNPs in subcluster A3 were acquired by different lineages of the global clade. SNP clusters B2, B3, and B4 suggest recent transmission events, based on allele sharing of SNPs.
Asian strains (AS-1) appear to be a recent ancestor of an African transmission lineage (T11) based on SNP cluster B2, Indian strains (IND-1) a recent ancestor of BD-1.1 based on B4, and IND-1.3 a recent ancestor of BD-1.2 based on B3 (Fig. 3). Gene enrichment analysis was conducted for genes corresponding to lineage-associated SNPs (Supplementary Figs. 2–4). According to gene ontology (GO) results, the genes showing mutations in the outbreak strains were involved in important biological processes, such as cell wall organization, cell-to-cell communication, toxin transport, modulation of processes of another organism, import into the cell, DNA integration, and chemotaxis, which together constitute a complex network. Such complex networks of molecular functions, including toxin transmembrane transporter activity and porin function, may have contributed to environmental fitness and infection potential of the bacterium.
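The criterion used above, SNPs carrying the alternative allele in all strains of at least one lineage, can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the study's pipeline: the genotype encoding (0 = reference allele, 1 = alternative allele), the function name, and the toy strains are all hypothetical.

```python
def lineage_fixed_snps(genotypes, lineage_of):
    """Find SNPs whose alternative allele is fixed in at least one lineage.

    genotypes:  {snp_id: {strain: allele}}, allele 0 = reference, 1 = alternative
    lineage_of: {strain: lineage}
    Returns {snp_id: sorted list of lineages in which every strain carries
    the alternative allele}. A missing call counts as "not alternative".
    """
    members = {}
    for strain, lineage in lineage_of.items():
        members.setdefault(lineage, []).append(strain)

    fixed = {}
    for snp, calls in genotypes.items():
        hits = sorted(
            lineage
            for lineage, strains in members.items()
            if all(calls.get(s) == 1 for s in strains)
        )
        if hits:
            fixed[snp] = hits
    return fixed


# Hypothetical toy input (NOT data from the study).
toy_lineages = {"s1": "BD-1.2", "s2": "BD-1.2", "s3": "BD-2", "s4": "BD-2"}
toy_genotypes = {
    "snpA": {"s1": 1, "s2": 1, "s3": 0, "s4": 0},  # fixed in BD-1.2 only
    "snpB": {"s1": 1, "s2": 0, "s3": 0, "s4": 0},  # not fixed anywhere
    "snpC": {"s1": 1, "s2": 1, "s3": 1, "s4": 1},  # fixed in both lineages
}
fixed = lineage_fixed_snps(toy_genotypes, toy_lineages)
```

SNPs shared as fixed alleles between a candidate ancestor lineage and a descendant lineage are the kind of pattern that the allele-sharing argument for clusters B2 to B4 relies on.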
Differences among BD-1.1, BD-1.2, and BD-2 strains
Within the global clade, BD-1.1 is a recent Bangladesh lineage and BD-1.2 the current one. Phylogenetic analysis and SNP clustering provide compelling evidence that BD-1.2 strains appeared in Bangladesh recently, rather than deriving locally from BD-1.1, and displaced the locally dominant BD-2 strains responsible for endemic cholera until recently. The BD-1.2 strains responsible for the 2022 Dhaka outbreak differed from BD-2 in major genetic characteristics, antimicrobial resistance, and epidemiological behavior (Table 1). For example, BD-1.2 strains harbored the ctxB7 genotype while BD-2 strains had ctxB1. Phage-inducible chromosomal island-like elements (PLE), which protect bacteria from bacteriophage infection, were found in most of the BD-2 strains, but not in the majority of BD-1.2 strains. BD-1.2 and BD-2 strains had different types of SXT ICE element, VSP-II, VPI-1, and gyrA gene alleles. Most importantly, BD-2 strains clustered in a single expanded clade, namely the Asian clade, with strains isolated from Asian countries and associated with endemic cholera in this region2,11. BD-1.2 strains clustered in a single expanded clade, namely the global clade, with strains isolated from different parts of the world and associated with endemic and epidemic cholera in recent years9,10,17,19.
Although both BD-1.2 and BD-1.1 strains belong to the same extended global clade, BD-1.2 strains have a number of novel mutations absent in the BD-1.1 strains. Several missense SNP mutations, with different alleles, were detected in BD-1.2 strains when compared with BD-1.1 strains (Supplementary Table 1). Of these SNPs, 21 had reference alleles in BD-1.1 but alternative alleles in BD-1.2 strains. A new missense SNP mutation resulting in Arg491His in the penicillin-binding protein 3 (PBP3) domain (267-562) of the gene VC_2407 was detected in all BD-1.2 strains. A recent study showed PBP3 is essential for growth of Pseudomonas aeruginosa20. It is tempting to speculate that the mutation in PBP3 may enhance growth and adaptability of BD-1.2 strains. These strains also have an altered structure of the protein encoded by the bile salt resistance gene ompU21, which forms passive diffusion pores that allow small molecular weight hydrophilic materials to move across the outer membrane (EggNOG: COG3203). Bile salt disorganizes cell membrane structure and also triggers DNA damage22. A mutation in the membrane-bound lytic murein transglycosylase D precursor gene mltD, which is involved in the peptidoglycan metabolic process (UniProt ID: Q9KPX5), was detected in all BD-1.2 strains. E. coli mltD is involved in cell wall organization and plays a role in recycling muropeptides during cell elongation and/or cell division (UniProt ID: P0AEZ7). A mutation in this virulence-related gene in Vibrio anguillarum has been shown to enhance lethality in zebrafish23. BD-1.2 strains had a mutation in the skp gene, which encodes a molecular chaperone of gram-negative bacteria required for formation of soluble periplasmic intermediates of outer membrane proteins24. BD-1.2 strains also had a missense mutation (Ala582Gly) in the multifunctional-autoprocessing repeats-in-toxin (MARTX) holotoxin gene rtxA. All BD-1.1 strains had alternative alleles for 10 SNPs for which the BD-1.2 strains had reference alleles.
Along with missense SNPs, allelic differences in several synonymous and intergenic SNPs and indels were observed between BD-1.1 and BD-1.2 strains (Supplementary Tables 2 and 3). A total of nine synonymous SNPs and five intergenic SNPs had different alleles in BD-1.1 and BD-1.2 strains. In addition, eight indels revealed different alleles for strains of these subclades.
Antimicrobial resistance (AMR) and related phenotypes
AMR genes of strains comprising the global and Asian clades were studied using the bioinformatics pipeline ABRicate25. Results of the genome analysis revealed an essentially uniform pattern of drug resistance in lineages of the global clade, including the 2022 Bangladesh outbreak strains. African transmission lineage T13 carried three AMR genes, varG, catB9, and dfrA1, associated with resistance to carbapenem, chloramphenicol, and trimethoprim, respectively (Supplementary Table 4). All other lineages of the global clade had four additional genes: aph(6)-Id and aph(3”)-Ib, associated with streptomycin resistance, sul2 with sulfonamide resistance, and floR with chloramphenicol and florfenicol resistance. Lineages of the Asian clade revealed a different pattern of drug resistance compared to strains of the global clade. The AMR gene floR was detected in 20% of Asian clade AS-2 strains, but not in the descendant lineages IND-2 and BD-2. The tetracycline resistance gene tetA(D) was detected in 71–89% of the strains, and the trimethoprim resistance gene dfrA31 and quinolone resistance gene qnrVC1 in 17–46% of Asian clade lineage strains. The AMR gene profile suggests BD-1.2 is an expansion of the global clade in Bangladesh, differing from T13. In addition to these AMR genes, some strains carried additional genes contributing to multidrug-resistant (MDR) and potentially extensively drug-resistant (XDR) phenotypes. An XDR strain of the BD-2 lineage was isolated from a clinical specimen in Dhaka during 2019. The strain harbored 14 AMR genes and was resistant to all tested drugs available in our laboratory at the time. This MDR/XDR BD-2 strain emerged in 2019, when strains of the lineage had essentially been superseded by BD-1.2 as the cause of cholera. Although MDR/XDR strains were not subsequently detected, the data suggest that this BD-2 strain may have acquired resistance genes under intense selective pressure while competing with the widely circulating BD-1.2.
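Percentages such as "detected in 20% of AS-2 strains" come from counting, per lineage, how many strains carry each AMR gene. A minimal Python sketch of that summary follows; it assumes the per-strain gene lists have already been parsed from the screening output, and the function name, lineage labels, and toy records are hypothetical.

```python
from collections import defaultdict


def amr_prevalence(strain_genes, lineage_of):
    """Percent of strains in each lineage that carry each AMR gene.

    strain_genes: {strain: iterable of AMR gene names detected in that strain}
    lineage_of:   {strain: lineage}
    Returns {lineage: {gene: percent of the lineage's strains carrying it}}.
    """
    n_strains = defaultdict(int)
    n_carrying = defaultdict(lambda: defaultdict(int))
    for strain, genes in strain_genes.items():
        lineage = lineage_of[strain]
        n_strains[lineage] += 1
        for gene in set(genes):  # count each gene once per strain
            n_carrying[lineage][gene] += 1
    return {
        lineage: {
            gene: 100.0 * count / n_strains[lineage]
            for gene, count in n_carrying[lineage].items()
        }
        for lineage in n_strains
    }


# Hypothetical toy input (NOT data from the study).
toy_genes = {
    "a": ["varG", "catB9", "dfrA1"],
    "b": ["varG", "catB9", "dfrA1", "floR"],
    "c": ["varG", "tetA(D)"],
    "d": ["varG"],
}
toy_lineage = {"a": "global-toy", "b": "global-toy", "c": "asian-toy", "d": "asian-toy"}
prevalence = amr_prevalence(toy_genes, toy_lineage)
```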
Nucleotide BLAST was used to match contigs of representative strains of the Asian and global clades with seven publicly available sequences of integrative and conjugative elements (ICEs): ICEVchBan5 (GQ463140.1), ICEVchInd4 (GQ463141.1), ICEVchInd5 (GQ463142.1), ICEVchMex1 (GQ463143.1), ICEVflInd1 (GQ463144.1), ICETET (MK165649.1), and ICEGEN (MK165650.1). Except for a few strains, BLAST searches for global clade strains produced the highest bit scores when aligned with ICEGEN (MK165650.1), ICEVchInd5 (GQ463142.1), or ICEVchBan5 (GQ463140.1). In contrast, for the majority of Asian clade strains, the highest bit score was obtained when aligned with ICETET (MK165649.1) (Supplementary Data 3). In addition, the Ser83Ile mutation in gyrA was commonly found in both global and Asian clade strains. However, the BD-2 strains isolated since 2009 contained an additional mutation (Asp660Glu). All strains of the global and Asian clades had a mutation in ParC (Ser85Leu), with the exception of a small number of BD-1 (n = 10) and AS-2 (n = 2) strains.
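Assigning each strain to the ICE reference with the highest bit score can be sketched as below. This Python example assumes BLAST tabular output with the 12 standard columns (as produced by `-outfmt 6`), with the strain as query and the ICE accession as subject; that orientation, the function name, and the toy alignment values are assumptions made for illustration, not details taken from the study.

```python
def best_ice_hit(blast_tab_lines):
    """Keep the highest-bit-score hit per query from BLAST tabular output.

    Expects the 12 standard tab-separated columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send
    evalue bitscore
    Returns {query: (subject, bitscore)}.
    """
    best = {}
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical toy alignments; the numeric values are invented.
toy_hits = [
    "strainA\tMK165650.1\t99.9\t5000\t5\t0\t1\t5000\t1\t5000\t0.0\t9100",
    "strainA\tMK165649.1\t97.0\t4000\t120\t2\t1\t4000\t1\t4000\t0.0\t6500",
    "strainB\tMK165649.1\t99.8\t5200\t10\t0\t1\t5200\t1\t5200\t0.0\t9400",
    "strainB\tGQ463140.1\t96.5\t3000\t105\t3\t1\t3000\t1\t3000\t0.0\t4800",
]
best = best_ice_hit(toy_hits)
```

When each strain is represented by several contigs, the query identifiers would first need to be mapped back to strain names before taking the per-strain maximum.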
Serotype switching
Two BD-1.2 serotypes, Ogawa and Inaba, were encountered. Temporal analysis revealed that all BD-1.2 strains isolated between 2016 and 2019 were of the Ogawa serotype26, the same serotype predominantly associated with recent cholera outbreaks in Yemen, Tanzania, and Uganda17. The Bangladesh clinical data indicated BD-1.2 serotype switching from Ogawa to Inaba, which has been the prevalent serotype among cholera cases since September 2020 (Supplementary Table 5).
Serotype conversion from Ogawa to Inaba has been linked to mutations in the rfbT gene22,23, and it has also been shown that the gene coding for serotype can be laterally transferred5. To identify the genetic basis of the recent serotype switch in Bangladesh, sequences of rfbT were extracted from both Inaba and Ogawa serotype strains. Comparison of the sequences showed that all of the Inaba strains had an insertion (nt position: 27–28; ins113bp for 2 strains and ins112bp for 12 strains) within the rfbT gene (Supplementary Figs. 5 and 3c).
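Detecting a single contiguous insertion in an extracted gene sequence relative to a reference can be sketched as follows. This is a simplified Python illustration that assumes the query differs from the reference only by one insertion (no substitutions or deletions); real comparisons would normally rely on a sequence aligner. The function name and the toy sequences are hypothetical and are not rfbT sequences.

```python
def find_single_insertion(reference, query):
    """Locate one contiguous insertion in `query` relative to `reference`.

    Returns (position, inserted_sequence), where `position` is the number of
    reference bases preceding the insertion, or None if `query` is not
    exactly `reference` plus a single insertion. When several placements are
    equivalent (repeated bases), the right-most one is reported.
    """
    extra = len(query) - len(reference)
    if extra <= 0:
        return None
    # Length of the common prefix between reference and query.
    p = 0
    while p < len(reference) and reference[p] == query[p]:
        p += 1
    # After skipping the inserted bases, the rest of the query must equal
    # the remaining reference suffix.
    if query[p + extra:] == reference[p:]:
        return p, query[p:p + extra]
    return None


# Hypothetical toy sequences (NOT rfbT).
toy_reference = "ATGCTGAAACGT"
toy_query = toy_reference[:5] + "CCA" + toy_reference[5:]
result = find_single_insertion(toy_reference, toy_query)
```

Applied to a reference rfbT allele and an allele extracted from an Inaba strain, the same idea would report the insertion site and the inserted bases, provided the single-insertion assumption holds.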
Determination of ctxB allele and drug response pattern of additional strains
We performed comparative genomic analysis of 21 V. cholerae O1 El Tor strains isolated at the peak of the 2022 cholera outbreak to show their phylogenetic relationship with 960 whole genome sequences retrieved from the public database. Our results suggest a recent shift in the predominance of V. cholerae strains responsible for endemic cholera in Bangladesh. Results also suggest that the massive 2022 cholera outbreak was attributable to BD-1.2 strains that succeeded in replacing the BD-2 strains predominantly associated with endemic cholera in Bangladesh. To support these findings, we tested 30 additional, randomly selected V. cholerae strains isolated between March and September 2022. The ctxB genotype was determined by double-mismatch-amplification mutation assay (DMAMA) PCR and drug response patterns by disk diffusion assays27. All of the tested strains carried ctxB7 and were tetracycline sensitive, as observed for BD-1.2 (Supplementary Table 6). These results stand in sharp contrast to BD-2, for which ctxB1 and tetracycline resistance are markers, supporting the overall findings of the present study.
Discussion
The Ganges Delta of the Bay of Bengal, Bangladesh, is historically considered an ancestral home of cholera, with the disease endemic there for centuries28,29. In this study, we investigated genomic attributes of V. cholerae O1 El Tor associated with the 2022 massive cholera outbreak in Dhaka, Bangladesh. Outbreak strains isolated from patients admitted to the icddr,b hospital were analyzed and compared with sequence data from earlier studies2,9,10,11 covering 88 countries from 1957 to 2022. The results provided evidence for 7th pandemic El Tor wave-3 global clade strains forming a new subclade, designated BD-1.2, tracing to a most recent common ancestor within a globally distributed 7th pandemic El Tor lineage predominantly associated with recent endemic cholera in India. Cholera is well established as a climate-driven disease30, with V. cholerae being resident flora of aquatic environments31. Our results indicate an earlier presence of strains belonging to subclade BD-1.1 in Bangladesh around 2007. The resident BD-2 strains, predominantly associated with cholera in Bangladesh, and the new BD-1.1 strains showed a negative relationship in predominance, and BD-1.1 was not associated with any major outbreak before it was last isolated in 2015. The subsequent emergence of BD-1.2 strains in Bangladesh was striking. Both BD-1.1 and BD-1.2 are components of a global clade that includes strains associated with cholera worldwide, e.g., in Haiti16 and Yemen17. The Bangladesh BD-1.2 strains initiated a massive cholera outbreak in 2022, displacing the BD-2 strains that had been locally dominant2. Intense niche competition with BD-2 strains may account for this turn of events. In addition, the BD-1.2 strains proved to be genetically different from the recently recognized transmission lineages T11-T13 and LAT-3 (refs. 9,10) and from other circulating lineages in Bangladesh2.
They carried unique mutations in several genes playing a critical role in promoting growth, resistance to bile salt, cell wall organization, and toxigenicity, presumably selectively advantageous for this subclade. While most global clade strains are known to possess seven AMR genes, T13 strains harbored only three, a pattern also observed in 30% of related Indian strains; this provides some evidence that relatively drug-sensitive strains were transmitted from India to Africa as the T13 lineage. A BD-2 strain carrying 14 AMR genes (aac(6’)-Ib-cr5, aph(3”)-Ib, aph(6)-Id, blaOXA-1, blaPER-3, catB3, catB9, dfrA1, dfrA15, mph(A), sul1, sul2, tetA(D), and varG) and showing resistance to all eleven tested drugs (Cephalothin (KF), Streptomycin (S), Cefixime (CFM), Ceftriaxone (CRO), Nalidixic Acid (NA), Sulfamethoxazole/Trimethoprim (SXT), Cefepime (FEP), Mecillinam (MEL), Ciprofloxacin (CIP), Ampicillin (AMP), Aztreonam (ATM))27 may be considered a remnant of the lineage during the transition, while BD-1.2 was predominant, with its extreme drug resistance perhaps the result of intense selective pressure as the two lineages fought to establish their respective niches32. The recent switch in serotype of BD-1.2 from Ogawa to Inaba, the coexistence of both serotypes, and the subsequent massive cholera outbreak in Bangladesh are remarkable, since cholera outbreaks have been linked with serotype switching33. Ogawa and Inaba serotypes do not appear to differ in severity or duration of illness caused33,34. However, the Ogawa serotype offers less protective immunity than Inaba against reinfection with the heterologous serotype34. Thus, the observed genetic changes, including serotype switching from Ogawa to Inaba, may be associated with the high number of cases that occurred during the 2022 Dhaka outbreak.
Although BD-1.2 strains are a component of the global clade, the threat posed by this subclade is demonstrated clearly by the extent of the outbreak with which it is linked and by its displacement of the locally dominant BD-2 strains, which points to the risk that this new subclade may cause even more devastating epidemics.
Methods
V. cholerae strains
Vibrio cholerae O1 El Tor strains used in the present study were isolated from dipstick-positive stool samples collected from patients admitted to the icddr,b hospital during the 2022 massive cholera outbreak. Stool samples were collected according to the study protocol reviewed and approved by the icddr,b institutional review board (Research review and ethical review committees). The 21 outbreak strains subjected to whole genome sequencing in the present study were compared with 960 El Tor strains of the ongoing 7th cholera pandemic isolated between 1957 and 2021 from 88 countries. An additional 30 El Tor strains isolated from cholera patients admitted to the icddr,b hospital (March–September 2022) were tested for serotype, ctxB genotype, and drug resistance patterns27 to increase the power of the study. This study did not need participant consent since there was no risk involved, no personally identifiable information or identifiable biospecimens were collected, and no follow-up was made after collecting the stool samples.
Whole genome sequencing
Genomic DNA was extracted from pure broth culture of V. cholerae isolates using Qiagen DNA Extraction Kit as per the manufacturer’s instructions. DNA QC and quantification were performed employing a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific, United States) and Qubit 4.0 fluorometer (Life Technologies). The whole-genome sequencing libraries were prepared from 300 to 350 ng of genomic DNA using Illumina DNA Prep Library Preparation Kit (Illumina) according to the manufacturer’s instructions. Bacterial whole genome sequencing was carried out at the icddr,b Genomics Center of the International Center for Diarrheal Disease Research, Bangladesh (icddr,b). The 150 bp paired-end sequencing reads were generated on an Illumina NextSeq 500 system using a NextSeq Mid output v2.5 reagent kit.
Bioinformatics analysis
Initially, fastp v0.23.2 (ref. 35), an ultra-fast quality control program that inspects raw paired-end reads and filters out faulty ligation or adapter sections, was used to evaluate the quality of the raw shotgun paired-end sequences. The genome assemblers SPAdes v3.15.4 (ref. 36) and Ragout v2.3 (ref. 37) were used to generate contigs and reference-based scaffolds, respectively. Prokka v1.14.5 (ref. 38), a bacterial genome annotation tool, was used to annotate the whole genome. The antimicrobial resistance gene profiles for all strains were determined using ResFinder39 and ABRicate v1.0.1 (ref. 25). Phylogenetic analysis was conducted using IQ-TREE v2.2.0 (ref. 40) with 1000 bootstrap replicates and the best-fitting evolutionary model selected using ModelFinder41. Phylodynamic analysis was carried out using BEAST v2.6.7 (ref. 18) according to the parameter settings described in a prior study9. Reference sequences JX565645-JX565687 were used as queries and scaffolds of strains as subjects in BLAST searches for extracting rfbT gene sequences. Functional gene enrichment analysis was conducted using PANNZER42 and GSEA-Pro v3 (http://gseapro.molgenrug.nl), and networks of GO terms were constructed using REVIGO43. Other bioinformatic analyses used in this study are described in the supplementary methods.
Statistical analysis
A genome-wide association study using Pearson’s chi-squared test was conducted to identify lineage/group-associated SNPs and indels (degrees of freedom = 14). For each tested SNP, a contingency table of allele counts by lineage was constructed to test the significance of association between lineages and SNPs. An in-house R script was used for the analysis.
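The test described here can be illustrated with a small worked example. The sketch below is written in Python rather than the in-house R script and is not the authors' code. It assumes a table with one row per lineage and one column per allele, so that 15 lineages and 2 alleles give (15 - 1) x (2 - 1) = 14 degrees of freedom, which matches the value reported above; that table layout is an inference, and the function names and counts are hypothetical.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi_square_sf_even_dof(stat, dof):
    """Upper-tail p-value of the chi-squared distribution for EVEN dof only.

    Uses the closed form exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("closed form implemented for positive even dof only")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical counts: rows = 15 lineages, columns = (reference, alternative).
# One lineage carries only the alternative allele; the other 14 only the reference.
toy_table = [[0, 10]] + [[10, 0] for _ in range(14)]
stat, dof = chi_square_statistic(toy_table)
p_value = chi_square_sf_even_dof(stat, dof)
```

In this toy table the allele is perfectly associated with one lineage, so the statistic equals the total number of strains (150) and the p-value is vanishingly small. With many SNPs tested, a multiple-testing correction would normally be applied to the resulting p-values.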
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Newly sequenced data used in this study were submitted under BioProject accession IDs PRJDB13928 and PRJDB13857. Publicly available sequence data used in this study were downloaded from the European Nucleotide Archive (ENA), and metadata along with accession numbers are provided in Supplementary Data S1. In addition, other relevant data used are provided in the supplementary data and supplementary tables. Source data are provided with this paper.
Code availability
The source code for the analysis performed for this study is available on Github (https://github.com/mamunmonir/Vibrio_genomics)44.
References
icddr,b’s Response to the Ongoing Massive Diarrhoea Outbreak in Dhaka and Matlab. https://www.icddrb.org/news-and-events/news?id=890 (2022).
Monir, M. M. et al. Genomic characteristics of recently recognized Vibrio cholerae El Tor lineages associated with cholera in Bangladesh, 1991 to 2017. Microbiol. Spectr. 10, e0039122 (2022).
Kim, E. J., Lee, C. H., Nair, G. B. & Kim, D. W. Whole-genome sequence comparisons reveal the evolution of Vibrio cholerae O1. Trends Microbiol. 23, 479–489 (2015).
Cho, Y. J., Yi, H., Lee, J. H., Kim, D. W. & Chun, J. Genomic evolution of Vibrio cholerae. Curr. Opin. Microbiol. 13, 646–651 (2010).
Chun, J. et al. Comparative genomics reveals mechanism for short-term and long-term clonal transitions in pandemic Vibrio cholerae. Proc. Natl Acad. Sci. USA 106, 15442–15447 (2009).
Colwell, R. R. Global climate and infectious disease: the cholera paradigm. Science 274, 2025–2031 (1996).
Hossain, Z. Z. et al. Comparative genomics of Vibrio cholerae O1 isolated from cholera patients in Bangladesh. Lett. Appl. Microbiol. 67, 329–336 (2018).
Mutreja, A. et al. Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477, 462–465 (2011).
Weill, F. X. et al. Genomic history of the seventh pandemic of cholera in Africa. Science 358, 785–789 (2017).
Domman, D. et al. Integrated view of Vibrio cholerae in the Americas. Science 358, 789–793 (2017).
Morita, D. et al. Whole-genome analysis of clinical Vibrio cholerae O1 in Kolkata, India, and Dhaka, Bangladesh, reveals two lineages of circulating strains, indicating variation in genomic attributes. mBio 11, 1–9 (2020).
Burrus, V., Quezada-Calvillo, R., Marrero, J. & Waldor, M. K. SXT-related integrating conjugative element in new world Vibrio cholerae. Appl Environ. Microbiol. 72, 3054–3057 (2006).
Murphy, S. G., Johnson, B. A., Ledoux, C. M. & Dörr, T. Vibrio cholerae’s mysterious Seventh Pandemic island (VSP-II) encodes novel Zur-regulated zinc starvation genes involved in chemotaxis and cell congregation. PLoS Genet. 17, e1009624 (2021).
O’Hara, B. J., Barth, Z. K., McKitterick, A. C. & Seed, K. D. A highly specific phage defense system is a conserved feature of the Vibrio cholerae mobilome. PLoS Genet. 13, e1006838 (2017).
Leinonen, R. et al. The European nucleotide archive. Nucleic Acids Res. 39, D28 (2011).
Hasan, N. A. et al. Genomic diversity of 2010 Haitian cholera outbreak strains. Proc. Natl Acad. Sci. USA 109, E2010–E2017 (2012).
Weill, F.-X. et al. Genomic insights into the 2016–2017 cholera epidemic in Yemen. Nature 565, 230–233 (2019).
Bouckaert, R. et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15, e1006650 (2019).
Chin, C.-S. et al. The origin of the Haitian cholera outbreak strain. N. Engl. J. Med. 364, 33–42 (2011).
Chen, W., Zhang, Y. M. & Davies, C. Penicillin-binding protein 3 is essential for growth of Pseudomonas aeruginosa. Antimicrob. Agents Chemother. 61, e01651–16 (2016).
Karunasagar, I. et al. ompU genes in non-toxigenic Vibrio cholerae associated with aquaculture. J. Appl. Microbiol. 95, 338–343 (2003).
Ruiz, L., Margolles, A. & Sánchez, B. Bile resistance mechanisms in Lactobacillus and Bifidobacterium. Front. Microbiol. 4, 396 (2013).
Xu, Z., Wang, Y., Han, Y., Chen, J. & Zhang, X. H. Mutation of a novel virulence-related gene mltD in Vibrio anguillarum enhances lethality in zebra fish. Res. Microbiol. 162, 144–150 (2011).
Schäfer, U., Beck, K. & Müller, M. Skp, a molecular chaperone of gram-negative bacteria, is required for the formation of soluble periplasmic intermediates of outer membrane proteins. J. Biol. Chem. 274, 24567–24574 (1999).
Seemann, T. GitHub - tseemann/abricate: Mass screening of contigs for antimicrobial and virulence genes. https://github.com/tseemann/abricate (2020).
Angermeyer, A. et al. Evolutionary sweeps of subviral parasites and their phage host bring unique parasite variants and disappearance of a phage CRISPR-Cas system. mBio 13, e0308821 (2022).
Tuz Jubyda, F. et al. Vibrio cholerae O1 associated with recent endemic cholera shows temporal changes in serotype, genotype, and drug-resistance patterns in Bangladesh. https://doi.org/10.21203/RS.3.RS-2303715/V1 (2022).
Khan, A. I. et al. Epidemiology of cholera in Bangladesh: findings from nationwide hospital-based surveillance, 2014–2018. Clin. Infect. Dis. 71, 1635–1642 (2020).
Karlsson, E. K. et al. Natural selection in a Bangladeshi population from the cholera-endemic Ganges river delta. Sci. Transl. Med. 5, 192ra86 (2013).
Lipp, E. K., Huq, A. & Colwell, R. R. Effects of global climate on infectious disease: the Cholera model. Clin. Microbiol. Rev. 15, 757 (2002).
Alam, M. et al. Toxigenic Vibrio cholerae in the aquatic environment of Mathbaria, Bangladesh. Appl. Environ. Microbiol. 72, 2849–2855 (2006).
Ismail, E. M. et al. Ecoepidemiology and potential transmission of Vibrio cholerae among different environmental niches: an upcoming threat in Egypt. Pathogens 10, 190 (2021).
Alam, M. T. et al. Major shift of toxigenic V. cholerae O1 from Ogawa to Inaba serotype isolated from clinical and environmental samples in Haiti. PLoS Negl. Trop. Dis. 10, e0005045 (2016).
Centers for Disease Control and Prevention (CDC). Notes from the field: identification of Vibrio cholerae serogroup O1, serotype Inaba, biotype El Tor strain - Haiti, March 2012. MMWR Morb. Mortal. Wkly Rep. 61, 309 (2012).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Bankevich, A. et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19, 455 (2012).
Kolmogorov, M., Raney, B., Paten, B. & Pham, S. Ragout-a reference-assisted assembly tool for bacterial genomes. Bioinformatics 30, i302–i309 (2014).
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Bortolaia, V. et al. ResFinder 4.0 for predictions of phenotypes from genotypes. J. Antimicrob. Chemother. 75, 3491–3500 (2020).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Kalyaanamoorthy, S., Minh, B. Q., Wong, T. K. F., von Haeseler, A. & Jermiin, L. S. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
Törönen, P. & Holm, L. PANNZER—a practical tool for protein function prediction. Protein Sci. 31, 118–128 (2022).
Supek, F., Bošnjak, M., Škunca, N. & Šmuc, T. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS One 6, e21800 (2011).
Monir, M. M. mamunmonir/Vibrio_genomics: Genomic attributes of Vibrio cholerae O1 responsible for 2022 massive cholera outbreak in Bangladesh. https://doi.org/10.5281/zenodo.7567456 (2023).
Acknowledgements
This work was supported in part by icddr,b; the National Institute of Infectious Diseases (NIID), Tokyo; the Research Program on Emerging and Re-emerging Infectious Diseases (JP21fk0108139) from the Japan Agency for Medical Research and Development (AMED); the National Institute of Allergy and Infectious Diseases (NIAID); the Foreign, Commonwealth and Development Office (FCDO)/Wellcome; the National Science Foundation (NSF); and the National Institutes of Health (NIH). M.A. was supported by AMED, NIAID (R01AI039129), and FCDO/Wellcome (215704/Z/19/Z); R.R.C. by NSF (OCE1839171 and CCF1918749); and M.A. and K.S. by NIH (R01AI53303). The authors acknowledge icddr,b hospital and laboratory staff for their support. icddr,b gratefully acknowledges the following donors for providing unrestricted support: the Government of the People’s Republic of Bangladesh, Global Affairs Canada (GAC), the Swedish International Development Cooperation Agency (Sida), and the Foreign, Commonwealth & Development Office (FCDO), UK. All authors read and approved the final manuscript.
Author information
Contributions
M.A. and M.M.M. designed this study. M.T.I., K.S.N., and M.S. collected stool samples from hospitalized patients, isolated and cultured strains, and performed DNA extraction. R.M., D.M., and M.R. sequenced outbreak strains. M.M.M. performed data analyses and wrote the draft manuscript. R.R.C., T.A., K.S., N.T., F.Q., H.W., A.H., M.O., M.M., M.T.I., and M.A. edited the draft manuscript. M.A. supervised this study, and all authors contributed to the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Gildas Hounmanou and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Monir, M.M., Islam, M.T., Mazumder, R. et al. Genomic attributes of Vibrio cholerae O1 responsible for 2022 massive cholera outbreak in Bangladesh. Nat Commun 14, 1154 (2023). https://doi.org/10.1038/s41467-023-36687-7
DOI: https://doi.org/10.1038/s41467-023-36687-7