Global phylogeography and evolutionary history of Shigella dysenteriae type 1

Njamkepo, Elisabeth; Fawal, Nizar; Tran-Dien, Alicia; Hawkey, Jane; Strockbine, Nancy; Jenkins, Claire; Talukder, Kaisar A.; Bercion, Raymond; Kuleshov, Konstantin; Kolínská, Renáta; Russell, Julie E.; Kaftyreva, Lidia; Accou-Demartin, Marie; Karas, Andreas; Vandenberg, Olivier; Mather, Alison E.; Mason, Carl J.; Page, Andrew J.; Ramamurthy, Thandavarayan; Bizet, Chantal; Gamian, Andrzej; Carle, Isabelle; Sow, Amy Gassama; Bouchier, Christiane; Wester, Astrid Louise; Lejay-Collin, Monique; Fonkoua, Marie-Christine; Le Hello, Simon; Blaser, Martin J.; Jernberg, Cecilia; Ruckly, Corinne; Mérens, Audrey; Page, Anne-Laure; Aslett, Martin; Roggentin, Peter; Fruth, Angelika; Denamur, Erick; Venkatesan, Malabi; Bercovier, Hervé; Bodhidatta, Ladaporn; Chiou, Chien-Shun; Clermont, Dominique; Colonna, Bianca; Egorova, Svetlana; Pazhani, Gururaja P.; Ezernitchi, Analia V.; Guigon, Ghislaine; Harris, Simon R.; Izumiya, Hidemasa; Korzeniowska-Kowal, Agnieszka; Lutyńska, Anna; Gouali, Malika; Grimont, Francine; Langendorf, Céline; Marejková, Monika; Peterson, Lorea A.M.; Perez-Perez, Guillermo; Ngandjio, Antoinette; Podkolzin, Alexander; Souche, Erika; Makarova, Mariia; Shipulin, German A.; Ye, Changyun; Žemličková, Helena; Herpay, Mária; Grimont, Patrick A. D.; Parkhill, Julian; Sansonetti, Philippe; Holt, Kathryn E.; Brisse, Sylvain; Thomson, Nicholas R.; Weill, François-Xavier

doi:10.1038/nmicrobiol.2016.27

Download PDF

Letter
Published: 21 March 2016

Global phylogeography and evolutionary history of Shigella dysenteriae type 1

Elisabeth Njamkepo¹^na1,
Nizar Fawal¹^na1,
Alicia Tran-Dien¹^na1,
Jane Hawkey ORCID: orcid.org/0000-0001-9661-5293^2,3,4,
Nancy Strockbine⁵,
Claire Jenkins⁶,
Kaisar A. Talukder⁷,
Raymond Bercion^8,9,
Konstantin Kuleshov¹⁰,
Renáta Kolínská¹¹,
Julie E. Russell¹²,
Lidia Kaftyreva¹³,
Marie Accou-Demartin¹,
Andreas Karas¹⁴,
Olivier Vandenberg^15,16,
Alison E. Mather^17,18,
Carl J. Mason ORCID: orcid.org/0000-0002-3676-2811¹⁹,
Andrew J. Page¹⁷,
Thandavarayan Ramamurthy²⁰,
Chantal Bizet²¹,
Andrzej Gamian²²,
Isabelle Carle¹,
Amy Gassama Sow⁹,
Christiane Bouchier²³,
Astrid Louise Wester²⁴,
Monique Lejay-Collin¹,
Marie-Christine Fonkoua²⁵,
Simon Le Hello¹,
Martin J. Blaser²⁶,
Cecilia Jernberg²⁷,
Corinne Ruckly¹,
Audrey Mérens²⁸,
Anne-Laure Page²⁹,
Martin Aslett¹⁷,
Peter Roggentin³⁰,
Angelika Fruth³¹,
Erick Denamur³²,
Malabi Venkatesan³³,
Hervé Bercovier³⁴,
Ladaporn Bodhidatta¹⁹,
Chien-Shun Chiou³⁵,
Dominique Clermont²¹,
Bianca Colonna³⁶,
Svetlana Egorova¹³,
Gururaja P. Pazhani²⁰,
Analia V. Ezernitchi³⁷,
Ghislaine Guigon³⁸,
Simon R. Harris ORCID: orcid.org/0000-0003-1512-6194¹⁷,
Hidemasa Izumiya³⁹,
Agnieszka Korzeniowska-Kowal ORCID: orcid.org/0000-0001-7569-8018²²,
Anna Lutyńska⁴⁰,
Malika Gouali¹,
Francine Grimont¹,
Céline Langendorf²⁹,
Monika Marejková⁴¹,
Lorea A.M. Peterson⁴²,
Guillermo Perez-Perez²⁶,
Antoinette Ngandjio²⁵,
Alexander Podkolzin¹⁰,
Erika Souche⁴³,
Mariia Makarova¹³,
German A. Shipulin¹⁰,
Changyun Ye⁴⁴,
Helena Žemličková^11,45,
Mária Herpay⁴⁶,
Patrick A. D. Grimont¹,
Julian Parkhill ORCID: orcid.org/0000-0002-7069-5958¹⁷,
Philippe Sansonetti⁴⁷,
Kathryn E. Holt ORCID: orcid.org/0000-0003-3949-2471^2,3,
Sylvain Brisse^38,48,49,
Nicholas R. Thomson^17,50 &
…
François-Xavier Weill^1,17

Nature Microbiology volume 1, Article number: 16027 (2016) Cite this article

4413 Accesses
54 Citations
89 Altmetric
Metrics details

Subjects

An Erratum to this article was published on 03 October 2016

Abstract

Together with plague, smallpox and typhus, epidemics of dysentery have been a major scourge of human populations for centuries¹. A previous genomic study concluded that Shigella dysenteriae type 1 (Sd1), the epidemic dysentery bacillus, emerged and spread worldwide after the First World War, with no clear pattern of transmission². This is not consistent with the massive cyclic dysentery epidemics reported in Europe during the eighteenth and nineteenth centuries^1,3,4 and the first isolation of Sd1 in Japan in 1897⁵. Here, we report a whole-genome analysis of 331 Sd1 isolates from around the world, collected between 1915 and 2011, providing us with unprecedented insight into the historical spread of this pathogen. We show here that Sd1 has existed since at least the eighteenth century and that it swept the globe at the end of the nineteenth century, diversifying into distinct lineages associated with the First World War, Second World War and various conflicts or natural disasters across Africa, Asia and Central America. We also provide a unique historical perspective on the evolution of antibiotic resistance over a 100-year period, beginning decades before the antibiotic era, and identify a prevalent multiple antibiotic-resistant lineage in South Asia that was transmitted in several waves to Africa, where it caused severe outbreaks of disease.

Dissecting the molecular evolution of fluoroquinolone-resistant Shigella sonnei

Article Open access 23 October 2019

Hao Chung The, Christine Boinett, … Stephen Baker

Evolutionary histories and antimicrobial resistance in Shigella flexneri and Shigella sonnei in Southeast Asia

Article Open access 19 March 2021

Hao Chung The, Ladaporn Bodhidatta, … Maia A. Rabaa

An Escherichia coli ST131 pangenome atlas reveals population structure and evolution across 4,071 isolates

Article Open access 22 November 2019

Arun Gonzales Decano & Tim Downing

January 2016 marks one hundred years since the invasion force from Britain, Australia, New Zealand and France withdrew from the Dardanelles, in the then Ottoman Empire, only eight months after landing. Most of the more than 120,000 casualties evacuated from the Gallipoli Peninsula were suffering from epidemic bacillary dysentery⁶, caused by Shigella dysenteriae type 1^7,8 (Sd1), a bacterium producing the powerful Shiga toxin. This human-adapted clone of Escherichia coli⁹ was isolated for the first time by Kiyoshi Shiga during a dysentery outbreak in Japan, during which 90,000 cases and 20,000 deaths occurred in the last six months of 1897 alone⁵. In the second half of the twentieth century, large outbreaks of disease due to Sd1 were still being reported in Central America, with estimates of more than 500,000 cases and 20,000 deaths for the 1969–1973 epidemic^10,11, in Africa, where there were an estimated 100,000 cases and 5,000–10,000 deaths in the 1979 epidemic¹², and in Asia^13,14.

Very little is known about the origins, evolution and spread of this important human pathogen, including, in particular, the strains involved in the major outbreaks and the genetic relationships between them. We carried out a whole-genome sequence analysis on a set of Sd1 isolates selected from more than 35 international strain collections, representing the widest possible temporal and geographic distribution of available isolates, to obtain a phylogenetic framework that was robust over time and space and to infer transmission dynamics. This unique collection included 325 isolates from 66 countries spanning four continents, collected between 1915 and 2011. Sixty-seven historical isolates collected between 1915 and 1960, including 14 isolates obtained during the First World War^15,16, were included in the collection, together with several isolates from each major outbreak reported since the 1960s. Short-read sequences from six Sd1 published genomes² were also included, with S. flexneri, S. boydii, S. sonnei and E. coli genomes used as outgroups.

Single-nucleotide polymorphisms (SNPs) were detected by mapping short-read sequences against Sd1 reference genomes: Sd197 (ref. 17), which was isolated during an outbreak in China in the 1950s, and Sd1617 (ref. 18), which was isolated in Guatemala during the 1968–1969 epidemic. Maximum likelihood (ML) phylogenetic analysis was performed on 14,677 (mapping against Sd197) and 15,752 (mapping against Sd1617) chromosomal SNPs, which were randomly distributed over the non-repetitive non-recombinant core genome (85.6% of the Sd197 chromosome, Supplementary Discussion Section ‘SNP variation in S. dysenteriae type 1’ and Supplementary Fig. 12). Four genetic lineages (Fig. 1a and Supplementary Discussion Section ‘SNP variation in S. dysenteriae type 1’ and Supplementary Figs 9–11) were identified. Lineage I contained only M115, which was isolated from a case in England in 1926. Lineage II contained mostly isolates collected in Europe between 1915 and 1958. Lineage III contained isolates from around the world and could be split into four sublineages with strong geographical affinities: IIIa in eastern and southeastern Asia (with isolates collected between 1927 and 1971), IIIb in Central America (1955–1992), IIIc in West Africa (1954–2006) and IIId in southern Asia and eastern Africa (1956–1977) and then in West Africa (1979–1998). Finally, lineage IV contained most of the Sd1 isolates obtained from the Indian subcontinent and Africa in the last few decades.

**Figure 1: Geographic distribution and transmission patterns of *Shigella dysenteriae* type 1 genetic lineages.**

Ten of the 14 isolates (71%) amassed by Captain E.G.D. Murray during the First World War belonged to the European lineage, lineage II, and most were isolated at the second Western General Hospital, Manchester, which received many of the soldiers evacuated during the Gallipoli campaign (Supplementary Fig. 1). The other four isolates belonged to three of the four sublineages of the global lineage, lineage III. None of the First World War isolates belonged to sublineage IIId, which gave rise to the modern lineage, lineage IV.

The two candidate vaccine strains developed so far are derived from lineage III parental isolates: IIIb for parental strain Sd1617 of vaccine strain WRSd1 (ref. 19) and IIId for parental strain 7–87 of vaccine strain SC-599 (ref. 20).

ML phylogenetic analysis revealed a strong correlation between root-to-tip branch lengths and the known years of isolation for the sequenced Sd1 isolates, indicating a clock-like evolution (Supplementary Fig. 2). We therefore used a Bayesian phylogenetic approach to provide estimates of the nucleotide substitution rates and divergence times of the different lineages for a spatially and temporally representative subset of 125 isolates (Fig. 2). We estimated the genome-wide substitution rate at 8.7 × 10⁻⁷ substitutions site⁻¹ year⁻¹ (95% credible interval (CI) = 7.6 × 10⁻⁷ to 9.9 × 10⁻⁷), giving a most recent common ancestor (MRCA) for all the Sd1 in our collection dating from 1747 (95% CI, 1645–1822). This finding is consistent with historical data from the eighteenth to mid-nineteenth centuries, which describe cyclic dysentery epidemics in Western and Northern Europe associated with extraordinarily high mortality rates. For example, the 1738–1742 and 1779 epidemics in France killed more than 200,000 people¹, the 1770–1775 epidemic in Sweden killed almost 35,000 people (12% of all deaths during the period)³, and a large number of deaths from dysentery were also reported during the Irish Great Famine of 1846–1849 (ref. 4). The MRCA for all isolates other than M115 was dated to the mid-nineteenth century (1853; 95% CI 1831–1871), whereas the MRCAs for each of the sublineages of global lineage III were estimated to have existed between 1889 (95% CI 1881–1897) and 1903 (95% CI 1893–1913), indicating that this lineage spread worldwide over a period of less than two decades. This dating is also consistent with Shiga's observation that the dysentery outbreak of 1897 had begun in the late 1880s in the southern part of Japan²¹.

**Figure 2: Timed phylogeny of a subsample of 125 *Shigella dysenteriae* type 1 isolates.**

Our findings show that the global spread of Sd1 predates the First World War. It therefore occurred earlier than for another Shigella serogroup, S. sonnei, which has been shown to have spread to other continents from Europe during the second half of the twentieth century²². We cannot demonstrate causality between the spread of Sd1 and historical events on the basis of the results presented here, but the late 1800s coincided with a period of intense European emigration, with the colonization of various territories in Africa and Asia by European powers, facilitated by the opening of the Suez Canal in 1869 and the development of steamships.

Geographic and temporal analyses identified several intercontinental transmission events resulting in long-term establishment of the bacterium (Figs 1b,c and 2). Transmission event T1 involved the European lineage II and led to an introduction of Sd1 in Madagascar between 1915 (95% CI 1910–1921) and 1967 (95% CI 1956–1977), during French colonization. This is consistent with the first report, which unambiguously described Sd1 there in 1927²³. Transmission event T2, involving eastern Asia and Poland, is estimated to have occurred between 1910 (95% CI 1899–1925) and 1944 (95% CI 1942–1945). All other transmission waves originated in the Indian subcontinent and affected mostly East Africa. Two of these transmission waves, T5 and T8, led to major outbreaks; according to our estimates, T5 occurred between 1970 (95% CI 1963–1975) and 1979 (95% CI 1976–1981). This dating is consistent with the first reported outbreak in the northeastern part of what is now the Democratic Republic of the Congo (DRC) in 1979, 28 years after the last isolation of Sd1 in Central Africa¹². This epidemic then spread to the Great Lakes region, where it persisted until at least 1990 (ref. 12). T8 occurred between 1984 (95% CI 1978–1987) and 1987 (95% CI 1985–1989), with a first reported outbreak in Zambia in 1990–1991 (refs 12,24). The strain then spread rapidly across an Africa ravaged by civil unrest, war (for example, in Mozambique, Angola, Rwanda and Sierra Leone) and HIV infection^12,24 until 2011. With the exception of a localized outbreak in the northern part of the Central African Republic in 2004²⁵ caused by sublineage IIIc (see below), all other outbreaks in Africa since 1990 have been caused by lineage IV.

The high resolution of whole-genome sequence analysis (WGS) has significantly changed our understanding of the patterns of Sd1 transmission over time at a global scale. The classical molecular epidemiology tools (Supplementary Discussion Section ‘Correlation of S. dysenteriae type 1 phylogenetic lineages with existing typing and subtyping schemes’) used previously were unable to unravel these patterns of transmission. Furthermore, a re-evaluation of two outbreaks that occurred in the Central African Republic in 2003–2004 (ref. 25), which we had previously investigated by pulsed-field gel electrophoresis (PFGE), the current method of choice for subtyping Sd1, revealed a lack of correlation between PFGE and WGS data (Supplementary Discussion Section ‘Correlation of S. dysenteriae type 1 phylogenetic lineages with existing typing and subtyping schemes’ and Supplementary Fig. 3). In particular, PFGE grouped the isolates from the two outbreaks closely together, but they actually belonged to two different lineages, IIIc and IV, separated by ∼700 SNPs. By contrast, other African T8 lineage IV isolates (differing by 37 to 61 SNPs from the Central African Republic T8 lineage IV outbreak isolates) formed a more distant group. Thus, PFGE cannot attribute profiles from different apparently geographically restricted outbreaks to a single, longer epidemic, such as that associated with the T8 transmission wave in Africa. PFGE should, therefore, no longer be used for the assessment of phylogenetic relationships in Sd1. Instead, WGS provides a robust phylogenetic framework for the epidemiological tracking of this bacterium.

One key feature in the evolution of Sd1 is the acquisition and accumulation of antibiotic resistance genes (ARGs; Figs 3 and 4, Supplementary Discussion Section ‘Pan-genome and antibiotic resistance’ and Supplementary Fig. 4). The first antibiotic-resistant Sd1 isolates in our collection were recovered in Asia and America during the 1960s and rapidly became predominant, such that susceptible isolates had become exceptional by 1991 (100%, [67/67] susceptible isolates, between 1915 and 1960 and <1% [1/123], between 1991 and 2011). Lineage IV, the most recent of the lineages identified, is the most affected by antibiotic resistance, but almost all the contemporary circulating strains from older lineages have also become resistant to multiple antibiotics. ARGs were acquired following the first use of antibiotics in clinical practice (Fig. 4b). The first ARGs identified in Sd1 were borne on small plasmids (<10 kb), encoding resistance to streptomycin and sulfonamides. Larger plasmids (80–130 kb) of different types encoding additional resistance to tetracycline, chloramphenicol and, for some plasmids, ampicillin (via the bla_OXA-1 or bla_TEM-1 genes) were then acquired in various geographic areas, from the mid-1960s to the 1980s. These plasmids belonged to the IncK and IncF groups in Asia and to the IncB/O group in Central America. The use of cotrimoxazole, beginning in the late 1960s, led to the acquisition of dihydrofolate reductase genes, mostly dfrA1, carried by 110 kb pST186 IncI1 and 30-kb IncX4 plasmids or by the Tn7 transposon inserted into the Sd1 chromosome close to the glmS gene, as observed for S. sonnei²². Since the 1990s, the principal structure associated with multidrug resistance in Sd1 has been a 66 kb genomic element called the Shigella resistance locus pathogenicity island (SRL-PAI)²⁶. This was acquired four times in lineage IV (South Asia or the Middle East), once in sublineage IIIc (West Africa) and once in lineage II (Madagascar). Further evidence for the independent acquisition of the SRL-PAI is provided by the presence of slight differences between the different acquired SRL-PAIs (Supplementary Fig. 5). SRL-A is very similar to the first SRL-PAI to be described in S. flexneri²⁶, and it was found exclusively in lineage IV. SRL-B, found only in lineage IV African T8 isolates, was probably derived from SRL-A by insertion sequence (IS) ISSd1-mediated rearrangements rather than being independently acquired. The other SRL-PAI contained various insertions (group II introns, part of the shf operon, region replacing orf47) not present in SRL-A. Among the 149 isolates bearing the SRL-PAI, only two showed a partial deletion of the SRL-PAI, resulting in a loss of the antibiotic resistance cluster (that is, the SRL sensu stricto). This structure is therefore quite stable over time, particularly in a bacterium containing hundreds of ISs (refs 17,18). This 66 kb element encodes resistance to ampicillin, streptomycin, chloramphenicol and tetracycline, with no more resistance than the previously circulating large plasmids. Its persistence may therefore be associated with a lower fitness cost and the presence of an fec operon for the capture of iron, serving as selective advantages²⁶. Before the principal acquisition of SRL-A, the closest ancestral group (consisting initially of South Asian and then South-East and Central Asian isolates) had acquired a chromosomally encoded transposon (Fig. 2 and Supplementary Fig. 6). This 10 kb structure encodes resistance to chloramphenicol and tetracycline. The structure of the double drug-resistance module is similar to that found in the SRL and to some previously circulating large multidrug resistance IncF plasmids, such as p3099-85 and p80-547. This recent trend towards acquiring ARG-containing genomic islands or chromosomally encoded transposons rather than plasmids is also displayed by the seventh pandemic Vibrio cholerae (SXT/R391) and Salmonella enterica serotype Typhi H58 (24 kb composite transposon) strains, which also originate from the Indian subcontinent^27,28.

**Figure 3: Phenotypic and genetic characterization of antibiotic resistance in *Shigella dysenteriae* type 1.**

**Figure 4: Evolution of antibiotic resistance of *Shigella dysenteriae* type 1.**

Resistance to nalidixic acid, a quinolone, mediated by point mutations in the DNA gyrase gene, gyrA, was acquired seven times in lineage IV Sd1 isolates from South Asia and Africa (Fig. 2) from the 1980s. The gyrA mutation leading to a serine-to-leucine substitution in the amino-acid sequence S83L was the most frequently observed, but others, involving codon 87, such as D87G and D87Y, were observed in isolates from Central Africa and Thailand, respectively, during the 1990s. Interestingly, in the same geographic area of DRC and Rwanda in 1994, two different mutations were acquired (S83L and D87G). This may reflect the heavy use of nalidixic acid in the Rwandan refugee camps, which experienced outbreaks of disease caused by V. cholerae O1 and Sd1 (ref. 29).

Resistance to ciprofloxacin, a fluoroquinolone, mediated by a double mutation in gyrA (S83L and a second mutation in codon 87) and a mutation in the topoisomerase IV parC gene (S80I), was acquired only once, in a group of 20 isolates from the Indian subcontinent collected between 1995 and 2010 (Fig. 2). We observed no resistance to extended-spectrum cephalosporins, carbapenems or azithromycin in the isolates studied here, but the existence of such resistance is almost inevitable, as the area of circulation of Sd1 overlaps with that of Enterobacteriaceae possessing mobile ARGs encoding resistance to the latest generation of antibiotics, such as NDM-1 (ref. 30). However, the dramatic decrease in Sd1 isolation reported since the turn of the century, and not explained by the findings of this genomic study, may counterbalance these pessimistic predictions.

Methods

Bacterial isolates

The Sd1 isolates analysed in this study (listed in Supplementary Table 1) originated from the collections of the Centers for Disease Control and Prevention, Atlanta, GA, USA (n = 56); Institut Pasteur, Paris, France (n = 53); Public Health England, Colindale, UK (n = 29); Icddr,b, Dhaka, Bangladesh (n = 29); Central Research Institute for Epidemiology, Moscow, Russian Federation (n = 22); National Institute of Public Health, Prague, Czech Republic (n = 19); Public Health England, Porton Down, UK (n = 17); Iris-Lab, Brussels, Belgium (n = 11); National Institute of Cholera and Enteric Diseases, Kolkata, India (n = 8); Institut Pasteur de Bangui, Bangui, Central African Republic (n = 7); Norwegian Institute of Public Health, Oslo, Norway (n = 6); Hungarian National Collection of Medical Bacteria, Budapest, Hungary (n = 6); Pasteur Institute of St Petersburg, St Petersburg, Russian Federation (n = 5); National Institute of Public Health, Warsaw, Poland (n = 5); Institut Pasteur de Dakar, Dakar, Senegal (n = 4); New York University Langone Medical Center, New York, USA (n = 4); Robert Koch Institut, Wernigerode, Germany (n = 4); Institut für Hygiene und Umwelt, Hamburg, Germany (n = 3); Bégin Military Hospital, Saint-Mandé, France (n = 3); IAME, Paris, France (n = 3); Swedish Institute for Communicable Disease Control, Solna, Sweden (n = 3); Walter Reed Army Institute of Research, Silver Spring, MA, USA (n = 3); Epicentre, Maradi, Niger (n = 2); Polish Collection of Microorganisms, Wroclaw, Poland (n = 2); Ministry of Health, Jerusalem, Israel (n = 2); Centers for Disease Control, Taichung, Taiwan (n = 2); Centre Pasteur du Cameroun, Yaoundé, Cameroon (n = 2); National Institute of Infectious Diseases, Tokyo, Japan (n = 1); Public Health Agency of Canada, Winnipeg, Canada (n = 1); Istituto Pasteur-Fondazione Cenci Bolognetti, Rome, Italy (n = 1); Félix d'Hérelle reference center for bacterial viruses, Université Laval, Québec, Canada (n = 1); National Institute for Communicable Disease Control and Prevention, Beijing, China (n = 1).

Bacterial DNA samples were also received from the Armed Forces Research Institute of Medical Sciences, Bangkok, Thailand (n = 10).

The 18 Sd1 isolates from the E.G.D. Murray collection^15,16,31 included 14 isolates recovered during the First World War and four isolates obtained between 1926 and 1930. The First World War isolates were obtained from different sources (Supplementary Fig. 1) and were stored at room temperature in Douglas digest agar slant glass tubes after sealing with a gas-air burner between August 1918 and October 1919. In 1980, the 18 tubes and the 680 other cultures of Enterobacteriaceae from the entire collection were shipped to the National Collection of Type Cultures (NCTC), Porton Down, UK, opened and freeze-dried.

It was confirmed that all the isolates included belonged to Sd1, by conventional methods and serotyping at the French National Reference Center for E. coli, Shigella and Salmonella, Institut Pasteur, Paris, as described previously³².

Antibiotic susceptibility testing

Antibiotic susceptibility was determined by disk diffusion on Mueller-Hinton (MH) agar in accordance with the guidelines of the Antibiogram Committee of the French Society for Microbiology (CA-SFM 2014) (www.sfm-microbiologie.org/). The following antimicrobial drugs (Bio-Rad) were tested: amoxicillin, ceftriaxone, ceftazidime, streptomycin, kanamycin, amikacin, gentamicin, nalidixic acid, ofloxacin, ciprofloxacin, sulfonamides, trimethoprim, sulfamethoxazole-trimethoprim, chloramphenicol, tetracycline and azithromycin. E. coli CIP 76.24 (ATCC 25922) was used as a control. For strains displaying resistance to either nalidixic acid or ciprofloxacin by the disk diffusion method, this resistance was confirmed by determination of the minimal inhibitory concentration (MIC) with the corresponding Etest strips (bioMérieux). The MICs of azithromycin and nitrofurantoin were determined by Etests for 30 isolates chosen on the basis of resistance phenotype and year and country of isolation.

Determination of the mutator phenotype of strain M115

The mutation rate of M115 was estimated by monitoring the capacity of this strain to generate mutations conferring resistance to rifampin in two independent experiments including duplicates, as previously described³³. E. coli strain ECOR48 (CIP 106023) was used as a strong mutator positive control³⁴, the Sd1 97–13397 isolate was used as a putative strong mutator isolate (deletion of the mutS gene), and Sd1 M116 and Sd197 were used as putative normomutator isolates (integrity of the mutS, mutH, mutL and uvrD methyl-directed mismatch repair genes).

Total DNA extraction

Total DNA was extracted with the InstaGene matrix kit (Bio-Rad) for the PCR identification of antibiotic resistance genes, the Wizard Genomic DNA Kit (Promega) for multilocus sequence typing and Illumina sequencing and the phenol chloroform method³⁵ for Illumina sequencing and PacBio sequencing.

Multi-locus sequence typing

Conventional multi-locus sequence typing (MLST) was performed on a subset of 33 Sd1 isolates, as previously described³⁶. Sequencing was performed at the Plateforme de Génotypage des Pathogènes et Santé Publique, PF8 (Institut Pasteur). The nucleotide sequences and deduced protein sequences were analysed with EditSeq and Megalign software (DNASTAR). The BLASTN program of NCBI was used for database searches (http://www.ncbi.nlm.nih.gov/BLAST/).

PCR identification of antibiotic resistance genes

The bla_TEM, bla_SHV, bla_OXA-1,cat1, sul1, dfrA1 and aadA1 resistance genes and the class 1 and 2 integron gene cassettes were amplified by PCR, as described previously³⁷.

The presence of the Shigella resistance locus pathogenicity island (SRL-PAI) was assessed by PCR, as previously described³⁸. The structure of the SRL-PAI was assessed by PCR mapping with the primers described or with new primers designed on the basis of GenBank accession no. AF326777. Amplicons not of the expected size were sequenced.

Plasmid analyses

Plasmids were obtained from E. coli transconjugants or transformants, as previously described³⁷, except that ampicillin (50 mg l^–1) or chloramphenicol (20 mg l^–1) was used as a selective agent.

Plasmid size was determined in parental and transconjugant or transformants strains by S1 nuclease treatment and pulsed-field gel electrophoresis, as previously described³⁷. PCR-based replicon-typing analysis was performed as previously described³⁹.

Eight 30–130 kb plasmids conferring antimicrobial resistance were sequenced. Plasmid DNA was extracted with the Large-Construct Kit (Qiagen) and sequenced through services provided by GATC Biotech, using shotgun sequencing runs on a 454/Roche GS FLX Analyzer (Roche). The resulting sequences were assembled into a unique scaffold. Gap closure was carried out by PCR followed by Sanger DNA sequencing with the Big Dye Terminator V3.1 Cycle Sequencing Kit (Applied Biosystems) and a 96-capillary 3730xl DNA Analyzer (Applied Biosystems), by Eurofins MGW Operon (Cochin Platform). Automatic annotation was performed with the RAST⁴⁰ server (http://rast.nmpdr.org/), followed by manual inspection and correction. The sequences obtained have been deposited in GenBank under accession numbers KT754160 (p80-547), KT754161 (pCAR10), KT754162 (pBU53M1), KT754163 (pA5468), KT754164 (p3099-85), KT754165 (p93-531-1), KT754166 (p92-9000), KT754167 (p69-3818).

Whole-genome sequencing

High-throughput genome sequencing was carried out at the genomics platform of the Pasteur Institute, GATC Biotech, Beckman Coulter Genomics or at the Wellcome Trust Sanger Institute, on Illumina platforms generating 100 to 146 bp paired-end reads, yielding a mean of 206-fold coverage (minimum 37-fold, maximum 990-fold) (Supplementary Table 2). Short-read sequence data were submitted to the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), and the genome accession numbers are provided in Supplementary Table 1.

We optimized the resolution of the chromosome-encoded antibiotic resistance structures and ensured that representative isolates from the various lineages were included, by sequencing 10 isolates on the PacBIO RS II platform (Pacific Biosciences), as previously described²⁸. The PacBio data were submitted to the ENA and the genome accession numbers are provided in Supplementary Table 1.

Other studied genomes

Sd1 strain Sd197¹⁷ was used as the reference genome. A second Sd1 genome Sd1617 (ref. 18) was used as a second reference genome, to confirm the population structure found with Sd197.

Short-read sequences from the following six Sd1 genomes published by Rohmer et al.² were downloaded from the ENA and included in this study: 2735 (USA, 1974, SRR765065), 91R17 (Guatemala, 1991, SRR765098), 91R14 (Guatemala, 1991, SRR765104), DH03 (Central African Republic, 1996, SRR765110), DH05 (Central African Republic, 1996, SRR765112) and DH06 (Central African Republic, 1996, SRR765113).

The following genomes were used as outgroups: E. coli O157:H7 strain Sakai (GenBank accession no. NC_002695), E. coli strain K-12 MG1655 (GenBank accession no. NC_000913), S. flexneri type 2a strain 2457T (GenBank accession no. AE014073), S. boydii strain Sb227 (GenBank accession no. NC_007613) and S. sonnei strain Ss046 (GenBank accession no. NC_007384).

Read alignment and SNP detection

For the analysis of SNPs, Illumina-generated paired-end reads and the simulated paired-end reads from publicly available assembled genomes were mapped to the reference genome of Sd1 strain Sd197, including the chromosome (CP000034) and plasmids pSD1_197 (CP000035) and pSD197_spA (CP000640), with SMALT (version 0.7.4; http://www.sanger.ac.uk/resources/software/smalt/) as previously described²⁸.

De novo assembly

The reads for each strain were assembled de novo with Velvet⁴¹ version 1.2.09, with parameters optimized with VelvetOptimiser version 2.2.5 (https://github.com/tseemann/VelvetOptimiser). They were scaffolded with SSPACE⁴² version v2.0. The gaps were closed with GapFiller⁴³ version 1.11 and the sequences were annotated with Prokka⁴⁴ version 1.5, as previously described²⁸. CLC Assembly Cell version 4.2.0 (CLC bio) was also used to investigate antibiotic resistance determinants.

Phylogenetic analyses

The ML phylogenetic tree shown in Supplementary Fig. 7 was built from a 140,385-chromosomal SNP alignment generated by snp_sites software (https://github.com/sanger-pathogens/snp_sites) from all 331 short-read sequences, plus Sd1 genomes Sd197 (used as a reference) and Sd1617, together with the six E. coli and Shigella sp. genomes used as outgroups. RAxML⁴⁵ version 7.8.6 was used with the generalized time-reversible model and a Gamma distribution to model site-specific rate variation (the GTR+⌈ substitution model; GTRGAMMA in RAxML). Support for the ML phylogeny was assessed by 100 bootstrap pseudo-analyses of the alignment data and the final tree was visualized in FigTree version 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/).

The ML phylogenetic trees shown in Figs 1a and 3a,c and Supplementary Figs 1a, 3b, 4, 9, 11 and 14 were built from a 14,677-chromosomal SNP alignment of all 331 Sd1 short-read sequences, plus Sd1 genome Sd197, used as the reference. Repetitive regions (within the chromosome, between the chromosome and the virulence plasmid (VP) or the SRL-PAI) were removed manually with the Artemis⁴⁶ genome browser. Recombinogenic regions were also removed with the Gubbins⁴⁷ software. The remaining 14,677 chromosomal SNPs were randomly distributed along the non-repetitive non-recombinant core genome (3,750,125 bp), with a spacing of about one SNP per 256 bp or a nucleotide divergence of 0.39% (Supplementary Fig. 12). RAxML version 7.8.6 (GTRGAMMA substitution model) was used to construct the tree. We performed 500 bootstrap pseudoreplicate analyses to assess support for the ML phylogeny. The tree was rooted on M115, which was shown to be the most closely related to the ancestral strain of Sd1 by two different approaches (ML and Bayesian) and was visualized with MEGA⁴⁸ version 6, iTOL^49,50 or FigTree version 1.4.2.

The ML phylogenetic trees shown in Supplementary Figs 10 and 11 were built from a 15,752-chromosomal SNP alignment of all 331 Sd1 short-read sequences, plus Sd1 genome Sd1617, used as the reference. The method used was similar to that described above, except that the repetitive regions were not removed manually and phylogenetic support was assessed by 100 bootstrap pseudo-analyses.

The VP phylogenetic tree shown in Supplementary Fig. 15 was constructed similarly, from the 226 plasmid-containing isolates (>90% coverage at read depth >10×), based on 290 SNPs randomly distributed along the non-repetitive non-recombinant pSD1_197 sequence (99,704 bp, 54.6% of pSD1_197). The tree was unrooted.

Phylogenetic clustering

We clustered the isolates of Sd1 into various lineages by eye and by applying hierarchical Bayesian analysis of population structure (BAPS)⁵¹ software to the 14,677-chromosomal SNP alignment. Five iterations (L value) were run with a maximum cluster number (K value) of 6 or 10 and three iterations were run with K = 6.

Temporal analysis

We investigated the temporal signal in the ML phylogeny for Sd1, using Path-O-Gen (http://tree.bio.ed.ac.uk/software/pathogen/). The relationships between root-to-tip distances, year of isolation and lineage were analysed by linear regression methods.

We used Bayesian Evolutionary Analysis by Sampling Trees (BEAST)⁵² version 1.8 to date the important nodes. The analyses were conducted on a subsample of 125 isolates from across the ML tree, covering the full temporal and geographic range of this pathogen. The concatenated 10,798 chromosomal SNP alignments of these 125 strains were subjected to multiple BEAST analyses with both constant-size and Bayesian skyline population size change models, in combination with either a strict molecular clock or a relaxed clock, to identify the best-fit model^22,53. For the BEAST analysis, the GTR+⌈ substitution model was selected and tip dates were defined as the year of isolation. For all model combinations, three independent chains of 100 million generations each were run to ensure convergence, with sampling every 1,000 iterations. Convergence and effective sample size (ESS) values were inspected using Tracer⁵² version 1.5. A marginal likelihood estimation was carried out, with path sampling and stepping stone sampling for each run that had converged, to compare the different combinations of clock and tree models^54,55. The marginal likelihood estimation was then used to determine which model gave the best fit, by calculating the Bayes factor. The relaxed, uncorrelated lognormal clock model, which allows evolutionary rates to vary among the branches of the tree, together with the skyline demographic model, proved a much better fit for the data, as found previously for S. sonnei²² and S. flexneri⁵³. The parameter and tree estimates of the three runs were combined with LogCombiner⁵² version 1.7.5, with the first 20% of states in each chain removed as burn-in. Maximum clade credibility (MCC) trees were generated with TreeAnnotator⁵² version 1.7.5 on the combined files and visualized with FigTree version 1.4.2. Estimates are reported as median values with the 95% highest posterior density (HPD, hereafter referred to as the credible interval). The Bayesian skyline plot was calculated and visualized with Tracer⁵² version 1.5, to investigate changes in the effective population size of Sd1 over time. To confirm the dating estimates, ten other random subsamples were generated from clusters calculated using the Prosperi method⁵⁶ (code at http://figshare.com/articles/clustertree.R_Code_for_clustering_phylogenetic_trees/97225) with a threshold of 0.03. All singleton isolates were included (n = 86), and one isolate from each of the 33 clusters was randomly selected to generate the ten subsamples. These alignments were analysed in BEAST using the same model and showed similar dating for each of the lineages (Supplementary Table 2).

Genetic analyses

In silico MLST was then carried out by MLST version 1.8 (https://cge.cbs.dtu.dk/services/MLST/) on assembled sequences for all the data set. New alleles were confirmed by Sanger sequencing and submitted to the MLST database website (http://mlst.warwick.ac.uk/mlst/).

The presence and type of ARG or ARG-containing structures (Fig. 3b and Supplementary Fig. 4) were determined with ResFinder⁵⁷ version 2.1 (https://cge.cbs.dtu.dk/services/ResFinder/), BLAST analysis against defined reference sequences (plasmids or chromosomally encoded structures), PlasmidFinder⁵⁸ version 1.3 (https://cge.cbs.dtu.dk/services/PlasmidFinder/), Plasmid MLST locus/sequence definitions database (http://pubmlst.org/plasmid/) and pMLST version 1.2 (https://cge.cbs.dtu.dk/services/pMLST/) on CLC or Velvet assemblies. The new alleles and STs of IncI⁵⁹ and IncN⁶⁰ plasmids have been deposited in the PubMLST database (http://pubmlst.org/plasmid/). The presence of mutations in the quinolone resistance determining region of the DNA gyrase and topoisomerase IV genes was determined manually on de novo assembled sequences. PacBio sequences were used to analyse the structure of the SRL-PAI variants and the composite transposon inserted into the chromosome in genome CDC 87-3330. The in silico results were compared with PCR data, when available.

Pan-genome analysis

Roary⁶¹ version 3.2.4 was used on Velvet-annotated assemblies, to construct a pan-genome. The pan-genome analysis identified genome 2735² as an outlier. Further investigation revealed an extreme AT bias, so this sample was excluded from subsequent analyses. A more sensitive annotation was performed on the resulting clusters of proteins with InterPro⁶², to provide Gene Ontology⁶³ classifications for each gene.

Accession codes

Short-read sequences have been deposited at EBI-ENA, under study accession numbers PRJEB10304, PRJEB2846 and PRJEB3255. PacBio sequences have been deposited at EBI-ENA, under study accession number PRJEB7928. Plasmid, SRL-PAI and Tn87-3330 sequences have been deposited in GenBank, under accession numbers KT754160–KT754167, KT777637–KT777641 and KT777642, respectively.

References

Kohn, G. C. (ed.) Encyclopedia of Plague and Pestilence (Facts on File New York, 1995).
Google Scholar
Rohmer, L. et al. Genomic analysis of the emergence of 20th century epidemic dysentery. BMC Genomics 15, 355 (2014).
Article Google Scholar
Castenbrandt, H. K. A forgotten plague: dysentery in Sweden, 1750–1900. Scand. J. Hist. 39, 612–639 (2014).
Article Google Scholar
Creighton, C. A History of Epidemics in Britain. Volume II: from the Extinction of Plague to the Present Time (Cambridge Univ. Press, 1894).
Google Scholar
Shiga, K. Ueber den erreger der dysenterie in Japan. Vorläufige mitteilung. Zentralbl. Bakteriol. Microbiol. Hyg. 23, 599–600 (1898).
Google Scholar
Manson-Bahr, P. H. Dysentery and diarrhoea in wartime. Br. Med. J. 2, 346–348 (1942).
Article Google Scholar
Ledingham, J. C. & Penfold, W. J. Recent bacteriological experiences with typhoidal disease and dysentery. Br. Med. J. 2, 704–711 (1915).
Article Google Scholar
Tribondeau, L. & Fichet, M. Note sur les dysenteries des Dardanelles. Ann. Inst. Pasteur (Paris) 30, 357–362 (1916).
Google Scholar
Pupo, G. M., Lan, R. & Reeves, P. R. Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc. Natl Acad. Sci. USA 97, 10567–10572 (2000).
Article Google Scholar
Mata, L. J., Gangarosa, E. J., Cáceres, A., Perera, D. R. & Mejicanos, M. L. Epidemic Shiga bacillus dysentery in Central America. I. Etiologic investigations in Guatemala, 1969. J. Infect. Dis. 122, 170–180 (1970).
Article Google Scholar
Parsonnet, J. et al. Shigella dysenteriae type 1 infections in US travellers to Mexico, 1988. Lancet 2, 543–545 (1989).
Article Google Scholar
Cobra, C. & Sack, D. A. The Control of Epidemic Dysentery in Africa: Overview, Recommendations, and Checklists (SD Publication Series, Technical Paper No. 27 (USAID, Bureau for Africa, Office of Sustainable Development, 1996); http://pdf.usaid.gov/pdf_docs/pnaby890.pdf
Rahaman, M. M., Khan, M. M., Aziz, K. M., Islam, M. S. & Kibriya, A. K. An outbreak of dysentery caused by Shigella dysenteriae type 1 on a coral island in the Bay of Bengal. J. Infect. Dis. 132, 15–19 (1975).
Article Google Scholar
Taylor, D. N. et al. Introduction and spread of multi-resistant Shigella dysenteriae 1 in Thailand. Am. J. Trop. Med. Hyg. 40, 77–85 (1989).
Article Google Scholar
Murray, G. R. E. More on bacterial longevity: the Murray collection. ASM News 51, 261–262 (1985).
Google Scholar
Baker, K. S. et al. The Murray collection of pre-antibiotic era Enterobacteriaceae: a unique research resource. Genome Med. 7, 97 (2015).
Article Google Scholar
Yang, F. et al. Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery. Nucleic Acids Res. 33, 6445–6458 (2005).
Article Google Scholar
Vongsawan, A. A. et al. The genome of Shigella dysenteriae strain Sd1617 comparison to representative strains in evaluating pathogenesis. FEMS Microbiol. Lett. 362, pii: fnv011 (2015).
McKenzie, R. et al. Safety and immunogenicity of WRSd1, a live attenuated Shigella dysenteriae type 1 vaccine candidate. Vaccine 26, 3291–3296 (2008).
Article Google Scholar
Launay, O. et al. Safety and immunogenicity of SC599, an oral live attenuated Shigella dysenteriae type-1 vaccine in healthy volunteers: results of a phase 2, randomized, double-blind placebo-controlled trial. Vaccine 27, 1184–1191 (2009).
Article Google Scholar
Shiga, K. Observations on the epidemiology of dysentery in Japan. Philipp. J. Sci. 1, 485–500 (1906).
Google Scholar
Holt, K. E. et al. Shigella sonnei genome sequencing and phylogenetic analysis indicate recent global dissemination from Europe. Nature Genet. 44, 1056–1059 (2012).
Article Google Scholar
Robic, J. Une épidémie de dysenterie bacillaire à Madagascar (1927–1928). Bull. Soc. Path. Exot. 21, 709–713 (1928).
Google Scholar
Guerin, P. J., Grais, R. F., Rottingen, J. A. & Valleron, A. J. & Shigella Study Group. Using European travellers as an early alert to detect emerging pathogens in countries with limited laboratory resources. BMC Public Health 7, 8 (2007).
Article Google Scholar
Bercion, R. et al. Molecular epidemiology of multidrug-resistant Shigella dysenteriae type 1 causing dysentery outbreaks in Central African Republic, 2003–2004. Trans R. Soc. Trop. Med. Hyg. 100, 1151–1158 (2006).
Article Google Scholar
Luck, S. N., Turner, S. A., Rajakumar, K., Sakellaris, H. & Adler, B. Ferric dicitrate transport system (Fec) of Shigella flexneri 2a YSH6000 is encoded on a novel pathogenicity island carrying multiple antibiotic resistance genes. Infect. Immun. 69, 6012–6021 (2001).
Article Google Scholar
Mutreja, A. et al. Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477, 462–465 (2011).
Article Google Scholar
Wong, V. K. et al. Phylogeographical analysis of the dominant multidrug-resistant H58 clade of Salmonella Typhi identifies inter- and intracontinental transmission events. Nature Genet. 47, 632–639 (2015).
Article Google Scholar
Islam, M. S. et al. Microbiological investigation of diarrhoea epidemics among Rwandan refugees in Zaire. Trans. R. Soc. Trop. Med. Hyg. 89, 506 (1995).
Article Google Scholar
Kumarasamy, K. K. et al. Emergence of a new antibiotic resistance mechanism in India, Pakistan, and the UK: a molecular, biological, and epidemiological study. Lancet Infect. Dis. 10, 597–602 (2010).
Article Google Scholar
Murray, E. G. D. An attempt at classification of Bacillus dysenteriae, based upon an examination of the agglutinating properties of fifty-three strains. J. R. Army Med. Corps 31, 257–271 (1918).
Google Scholar
Langendorf, C. et al. Enteric bacterial pathogens in children with diarrhea in Niger: diversity and antimicrobial resistance. PLoS ONE 10, e0120275 (2015).
Article Google Scholar
Taddei, F., Matic, I. & Radman, M. cAMP-dependent SOS induction and mutagenesis in resting bacterial populations. Proc. Natl Acad. Sci. USA 92, 11736–11740 (1995).
Article Google Scholar
Picard, B. et al. Mutator natural Escherichia coli isolates have an unusual virulence phenotype. Infect. Immun. 69, 9–14 (2001).
Article Google Scholar
Grimont, F. & Grimont, P. A. D. Determination of rDNA gene restriction patterns. Methods Mol. Biol. 46, 181–200 (1995).
Google Scholar
Wirth, T. et al. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol. Microbiol. 60, 1136–1151 (2006).
Article Google Scholar
Fabre, L. et al. Chromosomal integration of the extended-spectrum beta-lactamase gene blaCTX-M-15 in Salmonella enterica serotype Concord isolates from internationally adopted children. Antimicrob. Agents Chemother. 53, 1808–1816 (2009).
Article Google Scholar
Turner, S. A., Luck, S. N., Sakellaris, H., Rajakumar, K. & Adler, B. Molecular epidemiology of the SRL pathogenicity island. Antimicrob. Agents Chemother. 47, 727–734 (2003).
Article Google Scholar
Carattoli, A. et al. Identification of plasmids by PCR-based replicon typing. J. Microbiol. Methods 63, 219–228 (2005).
Article Google Scholar
Aziz, R. K. et al. The RAST server: rapid annotation using subsystems technology. BMC Genomics 9, 75 (2008).
Article Google Scholar
Zerbino, D. R. & Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).
Article Google Scholar
Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2011).
Article Google Scholar
Boetzer, M. & Pirovano, W. Toward almost closed genomes with GapFiller. Genome Biol. 13, R56 (2012).
Article Google Scholar
Seemann, T. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069 (2014).
Article Google Scholar
Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).
Article Google Scholar
Carver, T. et al. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database. Bioinformatics 24, 2672–2676 (2008).
Article Google Scholar
Croucher, N. J. et al. Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. Nucleic Acids Res. 43, e15 (2015).
Article Google Scholar
Tamura, K., Stecher, G., Peterson, D., Filipski, A. & Kumar, S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol. Biol. Evol. 30, 2725–2729 (2013).
Article Google Scholar
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).
Article Google Scholar
Letunic, I. & Bork, P. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 39, W475–W478 (2011).
Article Google Scholar
Cheng, L., Connor, T. R., Siren, J., Aanensen, D. M. & Corander, J. Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol. Biol. Evol. 30, 1224–1228 (2013).
Article Google Scholar
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).
Article Google Scholar
Connor, T. R. et al. Species-wide whole genome sequencing reveals historical global spread and recent local persistence in Shigella flexneri. eLife 4, e07335 (2015).
Article Google Scholar
Baele, G. et al. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol. Biol. Evol. 29, 2157–2167 (2012).
Article Google Scholar
Baele, G., Li, W. L., Drummond, A. J., Suchard, M. A. & Lemey, P. Accurate model selection of relaxed molecular clocks in Bayesian phylogenetics. Mol. Biol. Evol. 30, 239–243 (2013).
Article Google Scholar
Prosperi, M. C. et al. A novel methodology for large-scale phylogeny partition. Nature Commun. 2, 321 (2011).
Article Google Scholar
Zankari, E. et al. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67, 2640–2644 (2012).
Article Google Scholar
Carattoli, A. et al. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob. Agents Chemother. 58, 3895–3903 (2014).
Article Google Scholar
García-Fernández, A. et al. Multilocus sequence typing of IncI1 plasmids carrying extended-spectrum beta-lactamases in Escherichia coli and Salmonella of human and animal origin. J. Antimicrob. Chemother. 61, 1229–1233 (2008).
Article Google Scholar
García-Fernández, A. et al. Multilocus sequence typing of IncN plasmids. J. Antimicrob. Chemother. 66, 1987–1991 (2011).
Article Google Scholar
Page, A. J. et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31, 3691–3693 (2015).
Article Google Scholar
Mitchell, A. et al. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 43, D213–D221 (2015).
Article Google Scholar
Gene Ontology Consortium. Gene Ontology Consortium: going forward. Nucleic Acids Res. 43, D1049–D1056 (2015).

Download references

Acknowledgements

This study was supported by the Institut Pasteur and the Institut Pasteur International Network, the Institut de Veille Sanitaire, the French government's Investissement d'Avenir programme, Laboratoire d'Excellence ‘Integrative Biology of Emerging Infectious Diseases’ (grant no. ANR-10-LABX-62-IBEID), the Fondation ‘Le Roch-Les Mousquetaires’, the Canetti family through the Georges, Jacques et Elias Canetti Award 2013, the Wellcome Trust through grant 098051 to the Sanger Institute, the NHMRC of Australia (grant 1061409 to K.E.H), the Victorian Life Sciences Computation Initiative (VLSCI) (grant VR0082) and the Indian Council of Medical Research, New Delhi, India. The authors thank A. Dautry-Versat, A.P. Pugsley, C. Bréchot and J. Savall for their support; T. Hieu, C. Soto, E. Bourreterre and B. Faye for technical assistance; Z. Szabó, D. Tremblay for providing isolates; L.R. Hiltzik, N. Baldwin and C. Mackenzie for their searches of the archives; M. Toucas, H. d'Hauteville, E. Aldová, S. Formal, and A.T. Maurelli for information about isolates; D. Nedelec for discussion, I. Gut, M. Gut, L. Ma, D. Harris, K. Oliver and the sequencing teams at the Institut Pasteur and Wellcome Trust Sanger Institute for sequencing the samples. The views expressed in this publication are those of the authors and do not reflect the views of the US Department of the Army or Department of Defense. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Elisabeth Njamkepo, Nizar Fawal and Alicia Tran-Dien: These authors contributed equally to this work.

Authors and Affiliations

Institut Pasteur, Unité des Bactéries Pathogènes Entériques, 75724 Paris Cedex 15, France
Elisabeth Njamkepo, Nizar Fawal, Alicia Tran-Dien, Marie Accou-Demartin, Isabelle Carle, Monique Lejay-Collin, Simon Le Hello, Corinne Ruckly, Malika Gouali, Francine Grimont, Patrick A. D. Grimont & François-Xavier Weill
Centre for Systems Genomics, University of Melbourne, Parkville, 3010, Victoria, Australia
Jane Hawkey & Kathryn E. Holt
Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, 3010, Victoria, Australia
Jane Hawkey & Kathryn E. Holt
School of Agriculture and Veterinary Science, University of Melbourne, Parkville, 3010, Victoria, Australia
Jane Hawkey
Centers for Disease Control and Prevention, Escherichia and Shigella Reference Unit, Atlanta, 30333, Georgia, USA
Nancy Strockbine
Public Health England, Gastrointestinal Bacteria Reference Unit, Colindale NW9 5HT, UK
Claire Jenkins
icddr,b, Enteric and Food Microbiology Laboratory, Dhaka 1212, Bangladesh
Kaisar A. Talukder
Institut Pasteur de Bangui, BP 923, Bangui, République Centrafricaine
Raymond Bercion
Institut Pasteur de Dakar, BP 220, Dakar, Senegal
Raymond Bercion & Amy Gassama Sow
Federal Budget Institute of Science, Central Research Institute for Epidemiology, Moscow 111123, Russia
Konstantin Kuleshov, Alexander Podkolzin & German A. Shipulin
Czech National Collection of Type Cultures (CNCTC), National Institute of Public Health, Prague 10, Czech Republic
Renáta Kolínská & Helena Žemličková
Public Health England, National Collection of Type Cultures, Porton Down SP4 0JG, UK
Julie E. Russell
Pasteur Institute of St Petersburg, St Petersburg 197101, Russia
Lidia Kaftyreva, Svetlana Egorova & Mariia Makarova
Department of Medical Microbiology, University of KwaZulu-Natal, Durban 4041, South Africa
Andreas Karas
Department of Microbiology, LHUB-ULB, Brussels University Hospitals Laboratory, 1000 Brussels, Belgium
Olivier Vandenberg
Environmental Health Research Centre, Public Health School, Université Libre de Bruxelles, 1070 Brussels, Belgium
Olivier Vandenberg
Wellcome Trust Sanger Institute, Cambridge CB10 1SA, UK
Alison E. Mather, Andrew J. Page, Martin Aslett, Simon R. Harris, Julian Parkhill, Nicholas R. Thomson & François-Xavier Weill
Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK
Alison E. Mather
Armed Forces Research Institute of Medical Sciences (AFRIMS), Bangkok 10400, Thailand
Carl J. Mason & Ladaporn Bodhidatta
National Institute of Cholera and Enteric Diseases (NICED), Kolkata, 700010, West Bengal, India
Thandavarayan Ramamurthy & Gururaja P. Pazhani
Institut Pasteur, Collection de l'Institut Pasteur (CIP), 75724 Paris Cedex 15, France
Chantal Bizet & Dominique Clermont
Polish Collection of Microorganisms, Institute of Immunology and Experimental Therapy, 53-114 Wroclaw, Poland
Andrzej Gamian & Agnieszka Korzeniowska-Kowal
Institut Pasteur, Plate-forme Génomique (PF1), 75724 Paris Cedex 15, France
Christiane Bouchier
Department of Foodborne Infections, Norwegian Institute of Public Health, Nydalen 0403, Oslo, Norway
Astrid Louise Wester
Centre Pasteur du Cameroun, BP 1274, Yaoundé, Cameroon
Marie-Christine Fonkoua & Antoinette Ngandjio
Departments of Medicine and Microbiology, New York University Langone Medical Center, New York, 10016, New York, USA
Martin J. Blaser & Guillermo Perez-Perez
Public Health Agency of Sweden, 17182 Solna, Sweden
Cecilia Jernberg
Biology Department and Infection Control Unit, Bégin Military Hospital, 94160 Saint-Mandé, France
Audrey Mérens
Epicentre, 75011 Paris, France
Anne-Laure Page & Céline Langendorf
Institut für Hygiene und Umwelt, 20539 Hamburg, Germany
Peter Roggentin
Divison of Enteropathogenic Bacteria and Legionella, Robert Koch Institut, 38855 Wernigerode, Germany
Angelika Fruth
INSERM, IAME, UMR 1137, Univ. Paris Diderot, IAME, UMR 1137, Sorbonne Paris Cité, 75018 Paris, France
Erick Denamur
Bacterial Diseases Branch, Walter Reed Army Institute of Research, Silver Spring, 20910, Maryland, USA
Malabi Venkatesan
Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem 91120, Israel
Hervé Bercovier
Center of Research and Diagnostics, Centers for Disease Control, Taichung 40855, Taiwan
Chien-Shun Chiou
Dipartimento di Biologia e Biotecnologie C Darwin, Istituto Pasteur-Fondazione Cenci Bolognetti, Sapienza Università di Roma, 00185, Roma, Italy
Bianca Colonna
Central Laboratories, Ministry of Health, Jerusalem 91342, Israel
Analia V. Ezernitchi
Institut Pasteur, Genotyping of Pathogens and Public Health Platform, 75724 Paris Cedex 15, France
Ghislaine Guigon & Sylvain Brisse
Department of Bacteriology I, National Institute of Infectious Diseases, Tokyo, 162-8640, Japan
Hidemasa Izumiya
Department of Sera and Vaccines Evaluation, National Institute of Public Health–National Institute of Hygiene, 00-791 Warsaw, Poland
Anna Lutyńska
National Reference Laboratory for E. coli and Shigella, National Institute of Public Health, Prague 10, Czech Republic
Monika Marejková
National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, R3E 3R2, Manitoba, Canada
Lorea A.M. Peterson
Institut Pasteur, Bioinformatics platform, 75724 Paris Cedex 15, France
Erika Souche
State Key Laboratory of Infectious Disease Prevention and Control, National Institute for Communicable Disease Control and Prevention, China CDC, Beijing 102206, China
Changyun Ye
Department of Clinical Microbiology, Faculty of Medicine and University Hospital, Charles University, 500 05, Hradec Kralove, Czech Republic
Helena Žemličková
Hungarian National Collection of Medical Bacteria, National Center for Epidemiology, H-1097 Budapest, Hungary
Mária Herpay
Institut Pasteur, Unité de Pathogénie Microbienne Moléculaire, 75724 Paris Cedex 15, France
Philippe Sansonetti
Institut Pasteur, Microbial Evolutionary Genomics Unit, 75724 Paris Cedex 15, France
Sylvain Brisse
CNRS, UMR 3525, 75015 Paris, France
Sylvain Brisse
London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
Nicholas R. Thomson

Authors

Elisabeth Njamkepo
View author publications
You can also search for this author in PubMed Google Scholar
Nizar Fawal
View author publications
You can also search for this author in PubMed Google Scholar
Alicia Tran-Dien
View author publications
You can also search for this author in PubMed Google Scholar
Jane Hawkey
View author publications
You can also search for this author in PubMed Google Scholar
Nancy Strockbine
View author publications
You can also search for this author in PubMed Google Scholar
Claire Jenkins
View author publications
You can also search for this author in PubMed Google Scholar
Kaisar A. Talukder
View author publications
You can also search for this author in PubMed Google Scholar
Raymond Bercion
View author publications
You can also search for this author in PubMed Google Scholar
Konstantin Kuleshov
View author publications
You can also search for this author in PubMed Google Scholar
Renáta Kolínská
View author publications
You can also search for this author in PubMed Google Scholar
Julie E. Russell
View author publications
You can also search for this author in PubMed Google Scholar
Lidia Kaftyreva
View author publications
You can also search for this author in PubMed Google Scholar
Marie Accou-Demartin
View author publications
You can also search for this author in PubMed Google Scholar
Andreas Karas
View author publications
You can also search for this author in PubMed Google Scholar
Olivier Vandenberg
View author publications
You can also search for this author in PubMed Google Scholar
Alison E. Mather
View author publications
You can also search for this author in PubMed Google Scholar
Carl J. Mason
View author publications
You can also search for this author in PubMed Google Scholar
Andrew J. Page
View author publications
You can also search for this author in PubMed Google Scholar
Thandavarayan Ramamurthy
View author publications
You can also search for this author in PubMed Google Scholar
Chantal Bizet
View author publications
You can also search for this author in PubMed Google Scholar
Andrzej Gamian
View author publications
You can also search for this author in PubMed Google Scholar
Isabelle Carle
View author publications
You can also search for this author in PubMed Google Scholar
Amy Gassama Sow
View author publications
You can also search for this author in PubMed Google Scholar
Christiane Bouchier
View author publications
You can also search for this author in PubMed Google Scholar
Astrid Louise Wester
View author publications
You can also search for this author in PubMed Google Scholar
Monique Lejay-Collin
View author publications
You can also search for this author in PubMed Google Scholar
Marie-Christine Fonkoua
View author publications
You can also search for this author in PubMed Google Scholar
Simon Le Hello
View author publications
You can also search for this author in PubMed Google Scholar
Martin J. Blaser
View author publications
You can also search for this author in PubMed Google Scholar
Cecilia Jernberg
View author publications
You can also search for this author in PubMed Google Scholar
Corinne Ruckly
View author publications
You can also search for this author in PubMed Google Scholar
Audrey Mérens
View author publications
You can also search for this author in PubMed Google Scholar
Anne-Laure Page
View author publications
You can also search for this author in PubMed Google Scholar
Martin Aslett
View author publications
You can also search for this author in PubMed Google Scholar
Peter Roggentin
View author publications
You can also search for this author in PubMed Google Scholar
Angelika Fruth
View author publications
You can also search for this author in PubMed Google Scholar
Erick Denamur
View author publications
You can also search for this author in PubMed Google Scholar
Malabi Venkatesan
View author publications
You can also search for this author in PubMed Google Scholar
Hervé Bercovier
View author publications
You can also search for this author in PubMed Google Scholar
Ladaporn Bodhidatta
View author publications
You can also search for this author in PubMed Google Scholar
Chien-Shun Chiou
View author publications
You can also search for this author in PubMed Google Scholar
Dominique Clermont
View author publications
You can also search for this author in PubMed Google Scholar
Bianca Colonna
View author publications
You can also search for this author in PubMed Google Scholar
Svetlana Egorova
View author publications
You can also search for this author in PubMed Google Scholar
Gururaja P. Pazhani
View author publications
You can also search for this author in PubMed Google Scholar
Analia V. Ezernitchi
View author publications
You can also search for this author in PubMed Google Scholar
Ghislaine Guigon
View author publications
You can also search for this author in PubMed Google Scholar
Simon R. Harris
View author publications
You can also search for this author in PubMed Google Scholar
Hidemasa Izumiya
View author publications
You can also search for this author in PubMed Google Scholar
Agnieszka Korzeniowska-Kowal
View author publications
You can also search for this author in PubMed Google Scholar
Anna Lutyńska
View author publications
You can also search for this author in PubMed Google Scholar
Malika Gouali
View author publications
You can also search for this author in PubMed Google Scholar
Francine Grimont
View author publications
You can also search for this author in PubMed Google Scholar
Céline Langendorf
View author publications
You can also search for this author in PubMed Google Scholar
Monika Marejková
View author publications
You can also search for this author in PubMed Google Scholar
Lorea A.M. Peterson
View author publications
You can also search for this author in PubMed Google Scholar
Guillermo Perez-Perez
View author publications
You can also search for this author in PubMed Google Scholar
Antoinette Ngandjio
View author publications
You can also search for this author in PubMed Google Scholar
Alexander Podkolzin
View author publications
You can also search for this author in PubMed Google Scholar
Erika Souche
View author publications
You can also search for this author in PubMed Google Scholar
Mariia Makarova
View author publications
You can also search for this author in PubMed Google Scholar
German A. Shipulin
View author publications
You can also search for this author in PubMed Google Scholar
Changyun Ye
View author publications
You can also search for this author in PubMed Google Scholar
Helena Žemličková
View author publications
You can also search for this author in PubMed Google Scholar
Mária Herpay
View author publications
You can also search for this author in PubMed Google Scholar
Patrick A. D. Grimont
View author publications
You can also search for this author in PubMed Google Scholar
Julian Parkhill
View author publications
You can also search for this author in PubMed Google Scholar
Philippe Sansonetti
View author publications
You can also search for this author in PubMed Google Scholar
Kathryn E. Holt
View author publications
You can also search for this author in PubMed Google Scholar
Sylvain Brisse
View author publications
You can also search for this author in PubMed Google Scholar
Nicholas R. Thomson
View author publications
You can also search for this author in PubMed Google Scholar
François-Xavier Weill
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

R.B., P.A.D.G., S.B., N.R.T. and F.-X.W. designed the study. N.S., C.J., K.A.T., R.B., K.K., R.K., J.E.R., L.K., A.K., O.V., C.J.M., T.R., C. Bizet, A.G.S., A.G., A.L.W., M.-C.F., S.L.H., M.J.B., C.J., A.M., A.-L.P., P.R., A.F., E.D., M.V., H.B., M.H., P.A.D.G., P.S., L.B., C.-S.C., D.C., B.C., S.E., G.P.P., A.V.E., H.I., A.K.-K., A.L., M.G., F.G., C.L., M.M., L.A.M.P., G.P.-P., A.P., G.A.S., D.T., C.Y., H.Z., P.S. and F.-X.W. selected and provided characterized isolates and their epidemiological information. E.N., M.L.-C., I.C., C.R., A.T.-D., M.A.-D. and L.B. did the phenotypic experiments and DNA extractions. A.E.M. and S.R.H. provided guidance for genomic analyses. C. Bouchier performed the whole-genome sequencing. M.A. processed the short reads. E.N., N.F., K.K., S.B., K.E.H., J.H., A.J.P., G.G., E.S. and F.-X.W. analysed the genomic sequence data. F.-X.W. wrote the manuscript with major contributions from A.E.M., P.A.D.G., E.D., J.P., P.S., K.E.H., S.B. and N.R.T. All authors contributed to manuscript editing. F.-X.W. oversaw the project.

Corresponding author

Correspondence to François-Xavier Weill.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures 1-16, Table 3, Discussion and References (PDF 4476 kb)

Supplementary Table 1

Details of Shigella dysenteriae type 1 isolates and genomes used in this study. (XLSX 181 kb)

Supplementary Table 2

Whole-genome sequences, SNPs and phylogenetic data used in this study. (XLSX 15396 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Njamkepo, E., Fawal, N., Tran-Dien, A. et al. Global phylogeography and evolutionary history of Shigella dysenteriae type 1. Nat Microbiol 1, 16027 (2016). https://doi.org/10.1038/nmicrobiol.2016.27

Download citation

Received: 10 October 2015
Accepted: 03 February 2016
Published: 21 March 2016
DOI: https://doi.org/10.1038/nmicrobiol.2016.27

This article is cited by

Population structure analysis and laboratory monitoring of Shigella by core-genome multilocus sequence typing
- Iman Yassine
- Sophie Lefèvre
- François-Xavier Weill
Nature Communications (2022)
Pathogenomic analyses of Shigella isolates inform factors limiting shigellosis prevention and control across LMICs
- Rebecca J. Bengtsson
- Adam J. Simpkin
- Kate S. Baker
Nature Microbiology (2022)
Antibiotic susceptibility and genetic relatedness of Shigella species isolated from food and human stool samples in Qazvin, Iran
- Babak Pakbin
- Abdollah Didban
- Mohammad Reza Modabber
BMC Research Notes (2021)
Twenty-year trends in antimicrobial resistance from aquaculture and fisheries in Asia
- Daniel Schar
- Cheng Zhao
- Thomas P. Van Boeckel
Nature Communications (2021)
Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods
- Sebastian Duchene
- David A. Duchene
- Kathryn E. Holt
BMC Evolutionary Biology (2018)