The phylogeography and incidence of multi-drug resistant typhoid fever in sub-Saharan Africa

There is a paucity of data regarding the geographical distribution, incidence, and phylogenetics of multi-drug resistant (MDR) Salmonella Typhi in sub-Saharan Africa. Here we present a phylogenetic reconstruction of 249 whole-genome-sequenced contemporaneous S. Typhi isolated between 2008 and 2015 in 11 sub-Saharan African countries, in the context of a global genomic framework of 2,057 S. Typhi. Despite the broad genetic diversity, the majority of organisms (225/249; 90%) belong to only three genotypes: 4.3.1 (H58) (99/249; 40%), 3.1.1 (97/249; 39%), and 2.3.2 (29/249; 12%). Genotypes 4.3.1 and 3.1.1 are confined within East and West Africa, respectively. The MDR phenotype is found in over 50% of organisms restricted within these dominant genotypes. High incidences of MDR S. Typhi are calculated in locations with a high burden of typhoid, specifically in children aged <15 years. Antimicrobial stewardship, MDR surveillance, and the introduction of typhoid conjugate vaccines will be critical for the control of MDR typhoid in Africa.

Typhoid fever is a systemic infection primarily caused by the bacterium Salmonella enterica serovar Typhi (S. Typhi). The organism only infects humans, with the disease being contracted by the ingestion of bacteria through contaminated food or water. The vast majority of the global burden of disease (21.7 million estimated cases annually with 217,000 fatalities) 1 is thought to arise in urban areas in low-middle income countries (LMICs) in South and Southeast Asia, but more recent data have shown a substantial burden of disease in urban and rural areas of sub-Saharan Africa 2 . Between 2010 and 2014, the Typhoid Fever Surveillance in Africa Programme (TSAP) conducted population-based surveillance for typhoid fever in thirteen sites in ten sub-Saharan African countries 3 . The TSAP study, which recruited 13,431 febrile patients, isolated 135 S. Typhi from nine countries and found notably high incidences of typhoid fever in Burkina Faso, Ghana, and Kenya 2 .
Many antimicrobials remain effective for the treatment of typhoid fever. However, S. Typhi that exhibit resistance to empirical antimicrobials hamper successful therapy 4 . The phenomenon of antimicrobial resistance (AMR) in S. Typhi has been well described, and resistance to the traditional first-line antimicrobials, ampicillin, chloramphenicol, and trimethoprim-sulfamethoxazole (co-trimoxazole), was associated with large outbreaks in Asia in the 1980s and 1990s 5,6 . The emergence of resistance to these first-line antimicrobials in Asia, which was dominated by the H58 genotype (now renamed 4.3.1) 7,8 , led to a change in typhoid treatment guidelines, with fluoroquinolones becoming the empirical choice for MDR infections 9,10 . However, this shift towards the more common use of fluoroquinolones was inevitably followed by a decline in susceptibility to this group of antimicrobials 4,11 .
Recent phylogenetic analyses further suggest that the multidrug resistant (MDR) S. Typhi genotype 4.3.1 dominates and circulates across Southeast (lineage I: Vietnam, Cambodia, and Laos) and South Asia (lineage II: mostly India with clusters in Nepal and Pakistan) 12 . Additionally, these 4.3.1 S. Typhi have transferred from South Asia into Eastern and Southern Africa (lineages I and II; Kenya, Tanzania, Malawi, South Africa) [12][13][14] . The characteristics of 4.3.1 S. Typhi define this genotype as a key driving force in global MDR S. Typhi, as intercontinental transmission, regional circulation, and multiple localised outbreaks over the last three decades are distinct from the evolutionary trends and population structure of other extant S. Typhi genotypes 12,15 . Despite the known circulation of 4.3.1 S. Typhi in sub-Saharan Africa, there is a paucity of data regarding the geographical distribution of AMR genotypes (MDR and reduced fluoroquinolone susceptibility), their phylogenetic structure, and the incidence of MDR typhoid fever across the African continent. Here, we aimed to investigate the phylogeography and incidence of MDR S. Typhi across sub-Saharan Africa, utilizing organisms generated through the TSAP initiative 2,3 and additional typhoid fever studies conducted in Ghana, Uganda, and The Gambia.

Results
Geographical distribution of S. Typhi genotypes in Africa. Phylogenetic analysis of 249 contemporary African S. Typhi genome sequences combined with 2,057 existing S. Typhi genome sequences (including 504 from Africa) permitted a visualisation of these new African isolates within a global S. Typhi genomic framework (Fig. 1). The primary observation was that these 249 contemporary African S. Typhi sequences were distributed throughout this framework, with multiple lineages found to be circulating simultaneously across sub-Saharan Africa in the last decade. With TSAP providing expansive sampling across the continent, we observed a substantial degree of genetic diversity, with 12 different S. Typhi genotypes represented in 11 different typhoid endemic countries (Fig. 2). This distribution of genotypes ranged from single organisms in particular countries (for example: The Gambia, Kenya, and Uganda) to numerous closely related organism clusters isolated in several countries (Supplementary Table 1).
In contrast, the 3.1.1 MDR S. Typhi from Ghana (68 isolates) represented a population that was found only in West Africa, with the resulting phylogeny showing no evidence for intercontinental transmission as observed for 4.3.1 (Table 1). Rather, 3.1.1 S. Typhi could be better defined as a repeating pattern of small country-specific population expansions with organisms being regularly transferred between countries (Fig. 3b). Phylogeographical reconstruction has not previously been performed for S. Typhi 3.1.1; therefore, we conducted a Bayesian spatiotemporal phylodynamics analysis for the subclade using BEAST2 (Fig. 3b). The results suggest that Ghana was the most likely recent source of this 3.1.1 S. Typhi population (posterior probability = 0.66), which emerged de novo, and the corresponding source of three major clusters, which then radiated into other nearby countries on multiple occasions. Notably, Ghanaian S. Typhi appear to have been the probable origin of 3.1.1 S. Typhi in Burkina Faso on at least two separate occasions. Furthermore, existing whole genome sequences of 131 S. Typhi from Nigeria, including two isolates from travellers returning to the United Kingdom from Nigeria, demonstrated that 3.1.1 S. Typhi has been introduced into Nigeria from Ghana on at least two separate occasions. One of these events, estimated to be between 2010 and 2011, formed a major population expansion encompassing the majority (76/86; 88%) of the isolates from Nigeria.
Outliers included: non-MDR S. Typhi isolates from Burkina Faso (genotype 2.2) with an IncX1 plasmid containing no resistance genes and Ghana (genotype 3. [...]
[Figure: phylogenetic tree of the S. Typhi isolates unique to this study (highlighted by the blue points) combined with 2,057 global S. Typhi isolates. The tree is adjacent to three concentric circles highlighting associated metadata: the innermost circle represents the three most predominant genotypes, the middle circle represents the geographical sub-regions of Africa from where the S. Typhi organisms were isolated, and the outer circle (blue) again highlights the organisms unique to this study. The scale bar indicates the number of substitutions per variable site.]
[...] sulfonamide (sul1 and sul2), and tetracycline (tet(A) and tet(D)) (Fig. 4). Additionally, none of the four MDR organisms from Tanzania possessed a detectable plasmid backbone. Using Bandage to investigate the location of MDR cassettes, we found that these isolates carried multiple resistance genes (aph(3'')-Ib, aph(6)-Id, TEM-95/-93, catA1, dfrA7, sul1, sul2) on a 24-kb composite chromosomal transposon (Tn2670-like element) inserted between coding sequences STY3618 and STY3619 12 .
In total, 16% (39/249) of the contemporaneous African S. Typhi exhibited reduced susceptibility against ciprofloxacin (9 from Kenya and 30 from Uganda). The Kenyan organisms exhibited the common mutation associated with reduced susceptibility to fluoroquinolones in S. Typhi, a substitution from serine to phenylalanine at codon 83 (Ser83Phe) in gyrA. The Ugandan organisms harboured an alternative serine to tyrosine gyrA mutation also at codon 83 (Ser83Tyr) (Table 1).
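The codon 83 substitutions described above can be named mechanically from a reference and an observed codon. The following is a minimal Python sketch, not the pipeline used in this study: the function name and the codon strings in the usage note are illustrative assumptions, and only the handful of standard genetic code entries needed for serine, phenylalanine, and tyrosine are included.

```python
# Illustrative sketch only: the codon strings passed in are assumptions,
# not sequences reported in this study.

# Small slice of the standard genetic code (Ser, Phe, Tyr codons only).
CODON_TO_AA = {
    "TCT": "Ser", "TCC": "Ser", "TCA": "Ser", "TCG": "Ser",
    "TTT": "Phe", "TTC": "Phe",
    "TAT": "Tyr", "TAC": "Tyr",
}


def describe_substitution(ref_codon, alt_codon, position):
    """Return a label such as 'Ser83Phe', or None for a synonymous change."""
    ref_aa = CODON_TO_AA[ref_codon.upper()]
    alt_aa = CODON_TO_AA[alt_codon.upper()]
    if ref_aa == alt_aa:
        return None  # same amino acid, so no substitution to report
    return f"{ref_aa}{position}{alt_aa}"
```

Calling `describe_substitution("TCC", "TTC", 83)` with these hypothetical codons yields the label used in the text for the Kenyan organisms, and swapping in a tyrosine codon gives the Ugandan variant.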

Discussion
Here we present a contemporary dataset of S. Typhi genome sequences and AMR data from across sub-Saharan Africa generated through a major population-based surveillance study with data augmented from further locations. We exploited these data to assess the circulation of MDR S. Typhi genotypes and to calculate the incidence of MDR typhoid infections across the continent. Our results have major implications for the use of empirical antimicrobials for treating febrile disease of presumed bacterial origin and future intervention measures for controlling typhoid in Africa.
[...] 17 . After the likely importation from South Asia within the last 20 years, the extant population of S. Typhi 4.3.1 in Kenya, Tanzania, and Uganda has been formed through multiple introductions from South Asia followed by local expansions. Conversely, S. Typhi 3.1.1, which were isolated in Ghana, Burkina Faso, and Nigeria, do not appear to have recent ancestral roots in Asia, but have undergone localised microevolution within West Africa in recent decades. We speculate that these organisms have been transferred, maintained, and selected through the sustained movement of people and antimicrobial usage in West Africa. The MDR 4.3.1 S. Typhi from Kenya and Uganda also commonly exhibited mutations in gyrA, associated with reduced susceptibility to fluoroquinolones, which has also been reported in Africa in recent years. Conversely, no gyrA mutations were found in the MDR S. Typhi 3.1.1 from Ghana. These data mirror recent reports from Nigeria 17 , and suggest that first-line antimicrobial agents (ampicillin, chloramphenicol, and co-trimoxazole) for the treatment of febrile diseases are still in common use in West Africa.
The acquisition of an MDR phenotype in S. Typhi is typically associated with IncHI1 plasmids, which have long been considered the main vehicle for resistance to first-line antimicrobials in S. Typhi 8 . The distinct MDR lineages of S. Typhi found in West and East Africa, each associated with a distinct IncHI1 plasmid sequence type, suggest that S. Typhi and its AMR plasmids have not been transferred laterally across the continent. This may be because genotype 4.3.1 MDR S. Typhi has not been circulating for a sufficient period in Africa to reach the West African region. Furthermore, the four MDR S. Typhi isolates from Tanzania did not harbour plasmid-associated sequences, suggesting that these AMR genes are inserted into the chromosome, as has been observed previously in Asia 12,18,19 and Zambia 20 . The integration of AMR genes into the S. Typhi chromosome is a worrying development, as it provides a mechanism for stable vertical transmission of the MDR phenotype without the potential fitness deficit associated with maintaining large plasmids, increasing the likelihood that MDR will be sustained during the ongoing spread of related S. Typhi across East Africa.
Here we identified specific populations that are most at risk of MDR typhoid, which particularly warrants a reconsideration of current empirical antimicrobial use for treatment of typhoid. Generally, we found that the site incidences of MDR S. Typhi corresponded largely with the overall burden of typhoid in the various study sites 2 (that is, countries with high incidences of typhoid also had high incidences of MDR S. Typhi). Consequently, Kenya and Ghana exhibited the highest incidences of MDR typhoid in the sampled countries. Notably, Burkina Faso, which had a high burden of typhoid, had no incidence of MDR S. Typhi in comparison to neighbouring Ghana. Further, we found that children aged <15 years, the highest at-risk age group for typhoid in Africa, also generally exhibited the highest incidence rates of MDR S. Typhi infections. This age distribution of typhoid caused by MDR S. Typhi was not consistent across the continent, as those aged >15 years in Tanzania exhibited a higher incidence of MDR S. Typhi than younger children. Alternatively, some sites with a high burden of typhoid in specific age groups had no MDR infections. We suggest that this distribution is likely to mirror access to, and the generic usage of, specific antimicrobial agents in these locations and age groups, warranting the need for continued country/site-specific surveillance, review of local treatment policies, and the collection of antimicrobial usage data.
The incidence of MDR typhoid varied dramatically between settings and also between age groups in some individual locations. This discrepancy may be due to differing exposures to antimicrobials in different settings and age groups, which could lead to differential selective pressures in local circulating bacterial populations. Our data additionally indicate that AMR/MDR S. Typhi are not only spread through local population movements in East and West Africa but can also arise de novo. This phenomenon can be observed within the microevolution and expansion of 3.1.1 MDR S. Typhi in West Africa. The AMR genes associated with 4.3.1 MDR S. Typhi in East Africa appear to be both plasmid and chromosomally located. This observation, coupled with the acquisition of reduced susceptibility to fluoroquinolones, transmission between East African countries, and the importation of organisms from South Asia, raises further concerns regarding the progression of drug resistant S. Typhi in Africa. 4.3.1 S. Typhi has spread successfully across South Asia and become increasingly resistant to ciprofloxacin, making treatment options more limited 4 . The pervasiveness of AMR in 4.3.1 S. Typhi in South Asia has been recently highlighted by an outbreak of a ceftriaxone-resistant 4.3.1 S. Typhi in Hyderabad, Pakistan, which appears to be resistant to commonly available antimicrobial classes 21 . We predict that new AMR phenotypes that emerge in 4.3.1 S. Typhi in Asia can be periodically introduced into East Africa. Further, the emergence of MDR S. Typhi 4.3.1 in South Africa suggests possible spread from East Africa to Southern Africa through human population movement; however, this notion requires further investigation 22 .
This study highlights locations in sub-Saharan Africa where MDR typhoid is prevalent and where future activities to control its spread from Asia into Africa and also within Africa could be focused. In addition to continuing disease surveillance and investigating the genomic characteristics and phenotypic profiles of MDR S. Typhi, compiling antimicrobial usage data that can be linked with the distribution of AMR/MDR bacterial pathogens across Africa is becoming essential. The World Health Organization (WHO) prequalified a typhoid conjugate vaccine (TCV) in January 2018 with a recommendation to introduce the vaccine for infants and children older than six months in typhoid endemic countries 23 . Targeted vaccination programs at sites with a high burden of AMR/MDR S. Typhi could also be considered and may be informed by the age-stratified MDR disease incidence data presented here. New and potentially highly efficacious S. Typhi conjugate vaccines are currently undergoing clinical trials and should become routinely available at the end of this decade 23 . Until these vaccines become available, countries in Africa with endemic typhoid should structure antimicrobial stewardship policies to control MDR S. Typhi and develop national roadmaps for their deployment.

Methods
Bacterial isolates and antimicrobial susceptibility testing. Between 2010 and 2014, population-based surveillance of invasive Salmonella infections was conducted in ten sub-Saharan countries (see Supplementary Table 2) 2 . The research methodology of this programme, including ethics approvals, sampling framework, and calculation of disease incidence, has been previously reported 3 . Briefly, over the TSAP sampling period, blood culture-based surveillance was conducted in defined catchment areas. Cultured isolates were assessed for antimicrobial susceptibilities by the disc diffusion method locally and at a central reference laboratory. TSAP recruited 13,558 patients meeting the study inclusion criteria, of whom 127 were excluded due to incomplete data. This resulted in 13,431 patients and 135 S. Typhi found in 9 countries for analysis 2 . We also included 114 additional S. [...]
Genome sequencing and SNP calling. Genomic DNA from the 249 S. Typhi isolates was extracted using the Wizard Genomic DNA Extraction Kit (Promega, Wisconsin, USA). Two μg of genomic DNA from each organism was subjected to index-tagged paired-end sequencing on an Illumina HiSeq 2000 platform (Illumina, CA, USA) to generate 100 bp paired-end reads. To identify single nucleotide polymorphisms (SNPs), raw Illumina reads were mapped to the reference sequence of S. Typhi CT18 (accession: AL513382) including plasmids pHCM1 (accession: AL513383) and pHCM2 (accession: AL513384), using SMALT version 0.7.4. Candidate SNPs were called against the reference sequence using SAMtools 24 and filtered with a minimal phred quality of 30 and minimum consensus base agreement of 75%. The allele at each locus in each isolate was determined by reference to the consensus base in that genome using SAMtools mpileup and removing low confidence alleles with consensus base quality ≤20, read depth ≤5 or a heterozygous base call.
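The thresholds stated above can be expressed as two simple predicates. This is a minimal Python sketch under stated assumptions, not the SAMtools-based pipeline itself: the `AlleleCall` record and both function names are hypothetical, and only the numeric cut-offs (phred 30, 75% agreement, quality ≤20, depth ≤5) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class AlleleCall:
    """Hypothetical per-locus summary for one isolate (names are assumptions)."""
    base_quality: float   # consensus base quality at the locus
    read_depth: int       # number of reads covering the locus
    heterozygous: bool    # True if the base call is heterozygous


def passes_snp_filter(phred_quality, consensus_agreement):
    """Candidate SNP filter: phred quality >= 30 and >= 75% base agreement."""
    return phred_quality >= 30 and consensus_agreement >= 0.75


def is_confident_allele(call):
    """Reject alleles with quality <= 20, depth <= 5, or a heterozygous call."""
    if call.base_quality <= 20:
        return False
    if call.read_depth <= 5:
        return False
    return not call.heterozygous
```

Note that the boundaries differ in direction: the candidate SNP filter keeps values at the threshold, whereas the allele filter removes them, following the wording of the two sentences above.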
SNPs in phage regions, repetitive sequences, or recombinant regions identified previously were excluded 12,25 . We further identified an additional recombinant region from the whole genome alignment produced by SNP-calling isolates using Gubbins 26 and SNPs detected within this region (~20kb from nucleotide 1,439,032-1,459,472) were removed, resulting in a final set of 4,444 chromosomal SNPs. The SNP data were used to assign all isolates to previously defined subclades in the S. Typhi genotyping framework 15 .
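Excluding SNPs that fall within a masked region amounts to a coordinate filter. The sketch below, in Python, uses the recombinant region coordinates given above; the function name is hypothetical, and treating both region boundaries as inclusive is an assumption, since the text does not specify this.

```python
# Coordinates of the additional recombinant region reported in the text
# (positions on the reference chromosome; inclusive bounds are an assumption).
RECOMBINANT_REGION = (1_439_032, 1_459_472)


def mask_positions(snp_positions, excluded_regions):
    """Return the SNP positions that lie outside every excluded region."""
    return [
        pos
        for pos in snp_positions
        if not any(start <= pos <= end for start, end in excluded_regions)
    ]


def region_length(region):
    """Length in bases of an inclusive (start, end) region."""
    start, end = region
    return end - start + 1
```

In practice the phage, repetitive, and previously identified recombinant regions would be passed in the same `excluded_regions` list, so a single pass removes all of them.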
Phylogenetic analysis. A maximum likelihood (ML) phylogenetic tree was constructed from the 4,444 SNP alignment using RAxML version 8.2.8 with a generalized time-reversible model and a Gamma distribution to model the site-specific rate variation (GTR+Γ4 model) 27 . Branch support for this tree was assessed through a bootstrap analysis with 1,000 pseudo-replicates. To investigate the molecular epidemiology of our African isolates in regional and international context, a secondary ML tree was inferred from a separate alignment of 26,479 SNPs identified across a total of 2,306 S. Typhi isolates (249 from this study, 1,830 from the global collection 12 , 128 from Nigeria 17 , and 99 travel-associated S. Typhi organisms isolated in the United Kingdom 15 ) using RAxML with the GTRGAMMA substitution model and S. Paratyphi A sequence data to outgroup root the tree. Branch support for this phylogeny was assessed through a bootstrap analysis with 100 pseudo-replicates. Annotation of this global tree was visualized using iTOL 28 . An interactive version of the global phylogeny, with organisms labeled by genotype, country of origin, year of isolation and antimicrobial susceptibility, was generated in Microreact 29 .
Evolutionary timescale and phylogeographic patterns. For genotype 3.1.1 strains, Bayesian phylogenetic analyses were conducted in BEAST2 v2.4.7 30 . The GTR+Γ4 substitution model, an uncorrelated lognormal relaxed-clock model, and the exponential-growth coalescent tree prior were used. Three independent analyses were performed with 5×10⁸ steps, recording samples every 5×10⁴ steps. We assessed sufficient sampling by combining the three independent runs and verifying that the effective sample size of all parameters was at least 500. To calibrate the molecular clock, we used the sampling year of all sequences. This analysis also included an outgroup sequence (CT18) to ensure a biologically meaningful root location. Our selected molecular clock model and tree prior have been shown to perform well even when the data display low rate variation and constant population size dynamics 31 . This model combination also allows for informal model testing via the coefficient of rate variation and the population growth rate parameters 32,33 . To determine phylogeographic patterns, we considered the country of sampling as a discrete trait in our analysis in BEAST2 34 . A potential shortcoming of this analysis is that it includes a large number of parameters (transition rates between all locations); therefore, the output of these analyses may be affected by the prior distribution. We verified that the prior distribution differed from the posterior by comparing the distributions of all transition rates.
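The chain settings above imply a fixed number of recorded samples per run, which is useful to keep in mind when interpreting the effective sample size (ESS) threshold. The following Python sketch works through that arithmetic; the function names and the optional burn-in argument are assumptions, as the text does not report a burn-in fraction, and the count ignores the initial state that a sampler may also log.

```python
def samples_recorded(chain_length, sample_every):
    """Number of samples logged when recording every `sample_every` steps."""
    return chain_length // sample_every


def combined_samples(n_runs, chain_length, sample_every, burn_in_samples=0):
    """Samples available after pooling runs and dropping burn-in from each."""
    per_run = samples_recorded(chain_length, sample_every)
    return n_runs * (per_run - burn_in_samples)


def sufficiently_sampled(ess_by_parameter, threshold=500):
    """True if every parameter's ESS meets the threshold used in the text."""
    return all(ess >= threshold for ess in ess_by_parameter.values())
```

With the reported settings, `samples_recorded(5 * 10**8, 5 * 10**4)` gives the per-run count, and `combined_samples(3, 5 * 10**8, 5 * 10**4)` gives the pooled count before any burn-in is removed.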
An important consideration when using sampling times as calibrations is that the sampling timespan should capture sufficient genetic variation to allow reliable inferences of evolutionary rates and timescales, such that the data have strong temporal structure. We verified the temporal structure in the data by using a root-to-tip regression and a date-randomisation test 35 . We conducted a root-to-tip regression for the outgroup-rooted ML tree using TempEst 36 , and obtained a positive value for the slope, an R² of 0.12, and a p-value of 3×10⁻⁶ (Supplementary Figure 1). For the date-randomisation test we repeated the analysis 20 times while randomising the sampling times. Our expectation was that the randomisations should produce evolutionary rate estimates that were lower and that did not overlap with those obtained with the correct sampling times 37 , which was the case for our data (Supplementary Figure 2). Finally, we compared our estimate of the time of origin of the 3.1.1 lineage in BEAST2 with an independent method, LSD v0.3 38 . LSD and BEAST2 produced congruent estimates of the time of origin of the 3.1.1 lineage (Supplementary Figure 3).
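The two temporal-signal checks described above can be sketched in a few lines of Python. This is an illustration of the general technique rather than the TempEst or BEAST2 implementation: all function names are hypothetical, the regression is ordinary least squares of root-to-tip distance on sampling date, and the rate intervals passed to the final check are assumed to come from separate analyses of the real and date-shuffled data.

```python
import random


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared); the slope approximates the
    substitution rate when the data carry temporal signal.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def randomised_dates(dates, n_replicates=20, seed=0):
    """Shuffle sampling dates among tips; one shuffled list per replicate."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        shuffled = list(dates)
        rng.shuffle(shuffled)
        replicates.append(shuffled)
    return replicates


def passes_date_randomisation(true_interval, randomised_intervals):
    """True if every randomised rate interval lies wholly below the real one."""
    true_low, _ = true_interval
    return all(high < true_low for _, high in randomised_intervals)
```

Each shuffled date list would be used to rerun the dated analysis, and the resulting rate intervals compared with the interval from the correctly dated data via `passes_date_randomisation`.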
Antimicrobial resistance gene and plasmid analyses. ARIBA (Antimicrobial Resistance Identifier by Assembly) 39 and CARD (https://card.mcmaster.ca/home) were used to investigate AMR gene content. ARIBA reported the AMR genes and the quality of assemblies and variants detected between the sequencing reads and the reference sequences, including mutations in the quinolone resistance-determining region (QRDR) of the gyrA, gyrB, parC, and parE genes. For plasmid identification, the sequence reads from each isolate were de novo assembled using the short-read assembler Velvet with parameters optimized by Velvet Optimizer 40,41 . Contigs that were less than 300 bp long were excluded and the assembled contigs were annotated using Prokka 41,42 . Plasmid typing was performed in silico using PlasmidFinder 43 . The presence of the IncHI1 plasmid was confirmed by BLASTN searching the assembled sequences in reference to the pHCM1 reference plasmid sequence, and comparative analyses were performed and visualized using ACT 44 . The IncHI1 plasmid sequence type was identified using SRST2 software 45 with the IncHI1 plasmid MLST scheme 46 . To investigate the isolates with MDR phenotype and without plasmid, raw sequences were subjected to de novo genome assembly using SPAdes 47 version 3.11.0, and the resulting assembly graph was visualized in Bandage 48 to inspect the location of AMR genes in the genome.
Incidence analyses of MDR S. Typhi. Incidence of MDR S. Typhi was estimated per 100,000 person-years of observation (PYO) for MDR S. Typhi isolates found in Ghana, Kenya and Tanzania. Statistical methodology used previously to calculate the incidence of S. Typhi TSAP isolates 2,3 was applied to calculate MDR S. Typhi incidence. Briefly, age-stratified PYO were estimated using available demographic data in HDSS (Health and Demographic Surveillance System) and non-HDSS sites, and the health-seeking behaviour of randomly selected individuals, representative of the study population, was factored in (denominator). The recruitment proportion was adjusted to the age-stratified crude MDR S. Typhi cases (numerator). Adjusted incidence of MDR S. Typhi per 100,000 PYO was estimated with 95% CIs using these adjustment factors and crude MDR S. Typhi case numbers. The previously established multi-country database (FoxPro software) for TSAP was used for the three countries with MDR S. Typhi. The incidence of MDR S. Typhi in Uganda could not be measured, as data regarding adjustment factors (healthcare seeking behaviour and recruitment proportion) were unavailable at the time of analysis.
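The adjustment described above can be illustrated with a short Python sketch. It is one plausible reading of the description, not the published TSAP calculation (which is given in the cited methodology 2,3): the function and parameter names are hypothetical, the crude case count is scaled up by the recruitment proportion, the PYO are scaled down by the health-seeking proportion, and confidence intervals are not computed.

```python
def adjusted_incidence_per_100k(crude_cases, recruitment_proportion,
                                pyo, health_seeking_proportion):
    """Adjusted incidence per 100,000 person-years of observation (PYO).

    Assumed form: the numerator is crude cases divided by the proportion of
    eligible patients recruited; the denominator is PYO multiplied by the
    proportion of the population that would seek care at a study facility.
    """
    if not 0 < recruitment_proportion <= 1:
        raise ValueError("recruitment_proportion must be in (0, 1]")
    if not 0 < health_seeking_proportion <= 1:
        raise ValueError("health_seeking_proportion must be in (0, 1]")
    adjusted_cases = crude_cases / recruitment_proportion
    adjusted_pyo = pyo * health_seeking_proportion
    return adjusted_cases / adjusted_pyo * 100_000
```

The function would be applied separately to each age stratum and site, using that stratum's crude MDR case count and adjustment factors.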