Introduction

The spread of antimicrobial-resistant (AMR) bacteria in humans, animals, and the environment increases morbidity and mortality and is associated with a high economic burden1. E. coli is one of the main pathogens causing hospital-acquired infections in both developed and developing countries, and is also a leading cause of infections in the community and in livestock2,3. Extended-spectrum beta-lactamase (ESBL)-producing E. coli are one of the key AMR threats to human health4. The transmission of ESBL-producing bacteria from hospitals to the community may make community-acquired infections such as urinary tract infections (UTIs) increasingly complicated to treat. One study showed that the probability of transmission of ESBL-producing bacteria in the household from an index patient to a household contact was 67%, and from one household member to another was 37%5. This risk could be exacerbated by imprudent antibiotic use in the community, as selective pressure from antibiotics leads to further dissemination of AMR6. Similarly, transmission may occur within hospital environments. Studies from Vietnam showed that hospital transmission of ESBL-producing E. coli causes a high burden of hospital-acquired infections (HAIs), mortality and healthcare costs7,8, and that about 30% of patients in 21 Vietnamese intensive care units (ICUs) had HAIs due to ESBL-producing E. coli. These studies also revealed that ESBL colonization rates increased from 13% on admission to 89% at day 15 of hospital stay8. Similar findings regarding the burden of ESBL in healthcare facilities have also been reported in Africa9,10. Such situations often lead to longer hospital stays and an increased risk of transmission of ESBL to family members, the environment, and the community upon discharge.

Reducing non-indicated antibiotic use in the community is important to reduce the spread of ESBL (and AMR in general). It is equally important to prevent the introduction of “novel” ESBL strains into the community. Recent studies from Kenya, England and The Netherlands reported sharing of multidrug-resistant (MDR) strains between humans in the community and between livestock, but failed to demonstrate close relatedness between strains from these different niches2,11,12.

There is an overall paucity of evidence, especially in developing countries, regarding the transmission of MDR ESBL-producing bacteria between healthcare and community settings. In this study, we tested the hypothesis that MDR ESBL-producing bacteria from an inpatient could be passed on to community members, including their livestock, when the patient is discharged while still colonized with ESBL strains. We further investigated whether transmission of mobile genetic elements is the source of ESBL flow from hospitals to the community. The latter part was included because the dissemination of ESBL is mainly attributed to plasmids13. We used Illumina short reads and Oxford Nanopore long reads to characterize and compare ESBL E. coli and plasmids from healthcare and community settings and to map transmission between and within the different niches.

Results

ESBL carriage

Of the 1556 samples analyzed, 81.9% (1275) yielded suspected ESBL-producing bacteria. Antimicrobial susceptibility testing of the obtained isolates showed that many of the strains were resistant to critical cephalosporin and carbapenem antimicrobials such as Cefepime, Cefotaxime, Cefoxitin, Ceftazidime, Ertapenem and Imipenem in both healthcare and community settings. Isolates from community settings showed a similar degree of resistance to several critically important drugs as the isolates from healthcare settings (Table 1).

Table 1 Resistance of ESBL E. coli from different sources in healthcare and community settings.

In total, 260 of the 268 ESBL E. coli isolates on which we performed antimicrobial susceptibility testing (AST) were multidrug-resistant, displaying resistance to at least one antimicrobial within three or more classes of antimicrobials14. The strains were selected based on their ESBL status. Accordingly, isolates showed high resistance to 4th generation cephalosporins, ranging from 65% in samples from community settings to 96% in isolates from healthcare environments (Table 1, Fig. 1).
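The multidrug-resistance criterion above (resistance to at least one antimicrobial in three or more classes) can be sketched in a few lines. This is a minimal illustration, not the study's own procedure: the drug-to-class mapping, the "R"/"I"/"S" call format and the function names are assumptions made for the example, and only calls of "R" are counted as resistant.

```python
from typing import Dict, Set

# Hypothetical drug-to-class mapping. The beta-lactams are named in the text;
# the remaining entries are illustrative and not taken from the study.
DRUG_CLASS: Dict[str, str] = {
    "cefotaxime": "cephalosporins",
    "ceftazidime": "cephalosporins",
    "cefepime": "cephalosporins",
    "ertapenem": "carbapenems",
    "imipenem": "carbapenems",
    "ciprofloxacin": "fluoroquinolones",
    "gentamicin": "aminoglycosides",
    "tetracycline": "tetracyclines",
}


def resistant_classes(ast_result: Dict[str, str]) -> Set[str]:
    """Return the classes with at least one agent called resistant ('R')."""
    return {
        DRUG_CLASS[drug]
        for drug, call in ast_result.items()
        if call == "R" and drug in DRUG_CLASS
    }


def is_mdr(ast_result: Dict[str, str], min_classes: int = 3) -> bool:
    """Apply the 'three or more classes' multidrug-resistance rule."""
    return len(resistant_classes(ast_result)) >= min_classes


# Resistant to three cephalosporins only: one class, so not MDR.
isolate_a = {"cefotaxime": "R", "ceftazidime": "R", "cefepime": "R"}
# Resistant in three different classes: MDR.
isolate_b = {"cefotaxime": "R", "imipenem": "R",
             "ciprofloxacin": "R", "gentamicin": "S"}

print(is_mdr(isolate_a), is_mdr(isolate_b))
```

Counting classes rather than drugs is the point of the rule: an isolate resistant to many agents of a single class does not qualify.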

Figure 1

Relative frequencies of resistance, intermediate and susceptible strains by niche to each antimicrobial.

ESBL genes detected and resistome sharing across niches in healthcare and community settings

Sixty-seven strains were sequenced, and genomes were analyzed for the presence of resistance genes. The 67 sequenced isolates were all confirmed to be E. coli, and their overall resistome showed the presence of 14 classes of antimicrobial resistance genes in each sample source from healthcare and community settings (Fig. 2). Details on resistance genes per strain are provided in Table S1. blaCTX-M-15 was the most common ESBL gene detected (n = 41), followed by the beta-lactamase gene blaOXA-1, present in eight samples. The least common ones were blaCTX-M-55 (n = 3), blaCTX-M-3 (n = 3), blaCMY-2 (n = 2), and then blaCTX-M-25, blaCTX-M-102, blaCTX-M-103 and blaCTX-M-137, each found in one isolate (data shown in Table S1).

Figure 2

Relative abundance of antimicrobial resistance genes (ARGs) detected by class per niche. Although the number of ARGs varies according to the number of strains in each category, the plot shows the presence of all 14 AMR classes in all the categories.

To assess sharing of resistant strains between healthcare and community settings, we first compared the AMR situation in the two compartments based on resistome richness, that is, all AMR genes detected by class of antimicrobials in the sequenced isolates. Similarity in richness would indicate similarity of strains between the two compartments and a similar selection pressure for maintenance of strains. Abundance plots showed that genes from all 14 antimicrobial classes detected in the resistome were present in every niche, but also that some niches harbored more resistance genes than others because of differences in sample size (Fig. 2).

A further analysis of the resistome revealed six shared clusters (R1–R6) based on Jaccard similarity analysis (Fig. 3). In each of the clusters, there were strains from healthcare and community settings that shared similar resistome content (Fig. 3). In resistome cluster R1, for instance, a strain from an index patient had a resistome closely related to those of strains from other samples from the healthcare environment as well as from the community settings, including household animals, household environments and family members. This indicates that strains in healthcare settings were not very different from strains in community settings with regard to resistome, as also indicated by the results in Table 1.

Figure 3

Resistome clustering based on the Jaccard similarity index between all AMR genes detected in each sample. Samples are named with the initials of the sample sources to display diversity of niches within the resistome clusters. The blue color indicates pairwise map of the same sample. HCW for healthcare worker and caretaker at hospital, HCE healthcare environment, HHA household animal, HHC household contact, HHE household environment.

Diversity of ESBL strains between healthcare and community settings

To map possible transmission routes of ESBL E. coli from the hospital to the community in detail, we selected strains from clusters where an index patient (admitted to the hospital) was linked to ESBL isolation from healthcare settings and/or household members after returning home from hospital. We sequenced strains in such clusters and found that the 67 sequenced strains belonged to 42 different sequence types. Thus, the strains belonged to almost as many sequence types as the number of strains sequenced in each niche, a sign of highly heterogeneous ESBL E. coli populations across samples from the 6 months of sampling (Table S2).

The high heterogeneity of the ESBL E. coli isolates led us to analyze cgMLST phylogeny based on allelic distances. The results of the cgMLST analysis corroborated a wide genetic diversity between the strains isolated from each niche (Fig. 4). Based on this analysis, we identified up to five sharing events in which isolates from healthcare and community settings, within strain clusters from the same hospitals, were found in the same branch and closely related, suggesting that somewhat related clades of strains are transmitted between healthcare and community settings (Fig. 4).
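As a rough illustration of what an allelic distance measures, the sketch below counts the loci at which two cgMLST profiles carry different allele numbers. It is an assumption-laden toy: the locus names, allele numbers and isolate labels are hypothetical, and ignoring loci that are missing in either profile is a choice made for this example, not a description of how EnteroBase handles missing data.

```python
from typing import Dict, Optional

# A profile maps locus name -> allele number; None marks an uncalled locus.
Profile = Dict[str, Optional[int]]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared_loci = set(a) & set(b)
    return sum(
        1
        for locus in shared_loci
        if a[locus] is not None
        and b[locus] is not None
        and a[locus] != b[locus]
    )


# Hypothetical five-locus profiles (real cgMLST schemes use thousands of loci).
hospital_isolate: Profile = {"loc1": 3, "loc2": 7, "loc3": 1, "loc4": 12, "loc5": None}
household_isolate: Profile = {"loc1": 3, "loc2": 8, "loc3": 1, "loc4": 12, "loc5": 4}
unrelated_isolate: Profile = {"loc1": 9, "loc2": 2, "loc3": 5, "loc4": 12, "loc5": 6}

print(allelic_distance(hospital_isolate, household_isolate))
print(allelic_distance(hospital_isolate, unrelated_isolate))
```

A minimum spanning tree such as the one in Fig. 4 is then built on the matrix of these pairwise distances, so isolates with few differing loci end up on neighbouring nodes.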

Figure 4

Minimum spanning tree (MST) built with allelic distances generated by cgMLST from the analysed strains. The red circles indicate instances where related isolates from healthcare and community settings are found. Within each leaf, the name of the hospital is reported.

In contrast, core-genome SNP-based phylogeny clustered the strains into several unrelated lineages. With this higher-resolution analysis, we identified one instance of strain transmission between healthcare and the community: isolate DKB841, obtained from a healthcare caretaker in Kitovu hospital, showed only two SNPs difference from strain DKB948, isolated a month later from another family member at the index patient's household (clade C4-M15, Fig. 5).

Figure 5

Core-genome SNP-based phylogeny of the ESBL E. coli strains and selected markers.

A few other closely related isolates (fewer than 50 SNPs apart) were identified with the SNP analyses, but they were all isolated either within the healthcare setting only or within the community setting only (Fig. 5, Table S3). For instance, strains DKB0435, DKB062 and DKB1190, all from community settings (including household family member, household animal and household environment) and isolated between August and September 2020, were 36–47 SNPs apart. Moreover, isolates DKB1719 vs DKB1720 as well as isolates DKB1762 vs DKB1765, isolated on the same day in Kitovu hospital settings (including from environmental samples and humans), were strictly clonal with zero SNPs, showing short-term nosocomial transmission of ESBL strains within hospitals. A similar situation was observed at community level, where strains DKB0960 vs DKB0961, isolated on the same day within the household of a returning patient, showed two SNPs difference. Similarly, strains DKB0733 and DKB0734, isolated from household contacts of a patient discharged from Masaka hospital, were only one SNP apart. In addition, strain DKB0702 from a household family member was closely related (14 SNPs difference) to strain DKB0725, isolated on the same day from a domestic animal in the household of an index patient from Kitovu hospital. Short-term strain sharing within each setting was thus found to be more common than sharing across settings.

Overall, the SNP analysis (Fig. 5) and the SNP distance data in Table S3 indicate that short-term transmissions of ESBL strains within community and within healthcare settings are common. However, besides the single event between strains DKB841 and DKB0948 (2 SNPs, clade C4-M15 of Fig. 5) and the five loosely related sharing events reported by the cgMLST analysis (Fig. 4), no other cases of strain sharing across the healthcare and community settings were observed. This suggests that either the long-term continuous transmission between the two settings is mediated by another transmission mechanism, for example mobile genetic elements, or transmission is generally not followed by persistence in the new host. Based on this, we investigated the role of mobile elements in the spread of ESBL genes in our study area.
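The distinction drawn above between within-setting and cross-setting sharing can be expressed as a small classification rule. The sketch below is illustrative only: it uses the niche codes from the figure legends (HCW, HCE, HHA, HHC, HHE) plus a hypothetical "IP" code for index patients, the 50-SNP relatedness cutoff stated in the Methods, and three isolate pairs whose SNP distances are quoted in the text. The niche assigned to each isolate is our reading of the text, and the function names are invented for the example.

```python
# Niche codes follow the figure legends; "IP" (index patient) is a
# hypothetical code added for this sketch.
HEALTHCARE = {"IP", "HCW", "HCE"}
COMMUNITY = {"HHA", "HHC", "HHE"}

RELATED_BELOW = 50  # fewer than 50 SNPs counts as genetically related


def setting_of(niche: str) -> str:
    """Map a niche code to its broader setting."""
    if niche in HEALTHCARE:
        return "healthcare"
    if niche in COMMUNITY:
        return "community"
    raise ValueError(f"unknown niche code: {niche}")


def classify_pair(niche_a: str, niche_b: str, snps: int) -> str:
    """Label a pair of isolates by relatedness and by the settings involved."""
    if snps >= RELATED_BELOW:
        return "unrelated"
    setting_a, setting_b = setting_of(niche_a), setting_of(niche_b)
    if setting_a != setting_b:
        return "cross-setting"
    return f"within-{setting_a}"


# (isolate A, niche A, isolate B, niche B, SNP distance) as quoted in the text.
pairs = [
    ("DKB841", "HCW", "DKB948", "HHC", 2),
    ("DKB0733", "HHC", "DKB0734", "HHC", 1),
    ("DKB0702", "HHC", "DKB0725", "HHA", 14),
]

for iso_a, niche_a, iso_b, niche_b, snps in pairs:
    print(iso_a, iso_b, classify_pair(niche_a, niche_b, snps))
```

Applied to the full pairwise table, a rule of this kind makes it straightforward to tally how many related pairs fall within one setting versus across settings.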

Figure 6

Overall comparison of plasmid categories hosting the bla-CTX-M-15 gene as reconstructed from hybrid assemblies in healthcare and community strains. The observed clustering represents the distinction between the main plasmid categories described in the study. Identical genes have the same color, and the black lines between plasmids indicate the degree of gene-by-gene similarity between and within plasmid clusters.

The location of ESBL genes in strains and the possible role of plasmids in disseminating ESBL genes between health care and community setting

The SNP analysis pointed to the occurrence of genetic relatedness in subclades harboring the bla-CTX-M-15 gene (Fig. 5). Isolates from different niches (in healthcare and community settings) in these clades, e.g. C1-M15 to C7-M15, harbored the main bla-CTX-M-15 gene on related types of plasmids (Fig. 5). In many cases within these clades, the bla-CTX-M-15 gene was located within a consistent genetic context on the same plasmid in samples from both healthcare and community settings (e.g., from C4-M15 and C5-M15, Fig. S1). This suggested that dissemination of ESBL genes was happening via plasmids that were circulating across strains from healthcare and community settings within these phylogenetic clades. The complete reconstruction of the plasmids from representative isolates from each clade was done using hybrid assembly of long and short-read sequences. This identified at least four categories of conjugative plasmids, based on similarities in their genetic arrangements, that were likely responsible for the long-term transmission of ESBL across settings (Fig. 6, Table S4).

As shown in Fig. 6, one category consisted of multi-replicon plasmids (IncFIA, IncFIB, IncFIC) of at least 100 kbp that hosted the main ESBL gene, bla-CTX-M-15, in a similar genetic arrangement (Fig. S1). Strains with this type of plasmid were detected in healthcare and community settings, suggesting a role in the spread of ESBL genes between the compartments.

Another category was the IncY plasmids, which also carried bla-CTX-M-15 in similar genetic contexts in the study settings. These IncY plasmids are smaller than the multi-replicon plasmids, with sizes around 90 kbp, and occur in strains mainly found in the same phylogenetic clade. Our analysis also identified a group of plasmids carrying only the IncFIB rep type, with a specific genetic context, that is also very likely responsible for the dissemination of the bla-CTX-M-15 gene across niches.

The majority of plasmids responsible for the carriage of the ESBL genes across strains from healthcare and community settings were of the IncF types (Fig. 7A, Fig. S1) with the bla-CTX-M-15 gene often located upstream of a transposon insertion sequence IS1380 and varying genetic contexts downstream depending on the rep types (Fig. 7A,B).

Figure 7

Graphical visualization of genetic features. Context of representative complete ESBL-plasmids reconstructed by hybrid assembly from strains DKB1325 (A) and DKB1765 (B). In panel (C), we show the genetic context around bla-CTX-M-15 in the strains where we observed that the gene was located on the chromosome.

Importantly, the hybrid assembly also allowed us to determine that in some cases, such as strains DKB0960 and DKB0961, bla-CTX-M-15 was located on the chromosome within a genetic context closely matching the one found on the group of IncFIB plasmids (Fig. 7C). This indicates the possibility of long-term dissemination of the IncF plasmids in the study area that led to the settling of the ESBL gene in the chromosome of these isolates within the same context, probably through the ISEc9 and IS1380 transposases upstream of bla-CTX-M-15 in these strains.

Discussion

We conducted a longitudinal study for 8 months following patients in the healthcare and household settings in two tertiary hospitals in Uganda. We investigated the hypothesis that multidrug resistant bacteria (with ESBL-producing E. coli as proxy) and/or their resistance determinants could spread within and between healthcare and community settings through discharged patients that were still colonized. After analyzing 1556 samples from different sources in the healthcare and community settings and performing different phenotypic and genotypic typing analyses, our overall findings show that the transmission of AMR between healthcare and community settings is manifold.

We confirmed a high ESBL carriage (81.9%), which falls within the range reported by similar studies in Uganda, where prevalence varied from as high as 89% and 62% to below 30%15,16,17. These differences may relate to differences in sampling frameworks, varying levels of infection control in healthcare settings, and detection methods, but they nevertheless point to ESBL as a serious health threat. We report that in both healthcare and community settings, bacterial isolates displayed a similar degree of resistance to several critically important drugs, indicating either sharing of strains or a similar degree of selection on the bacteria in the two settings. This further suggests a common epidemiological background for strains circulating in the healthcare and community settings, with the hospital being the likely source of selection of AMR, based on previous evidence that one in eight patients acquired resistant bacteria upon admission to hospitals8. This finding is further corroborated by the genotypic resistance data, where we found similarity in the resistome (resistance gene richness) of strains originating from the two compartments. It is, however, worth emphasizing that the sharing of resistome here is based on overall resistance gene content regardless of the overall dissimilarity between strains, and as such is not proof of strain sharing. Nevertheless, these data overall support the hypothesis that strains from healthcare and community settings were not very different with regard to resistome, and that they might indeed be subjected to a similar selection pressure for maintenance.

In terms of diversity, the study also showed that the sequenced strains belonged to almost as many MLST types as the number of strains sequenced from each source, a very common situation for the highly heterogeneous E. coli populations also reported in previous studies3,13,18. Within this wide diversity, we identified up to five instances of short-term sharing events across healthcare and community using cgMLST, suggesting that related clades of strains are transmitted between healthcare and community settings. This phylogenetic method, which is based on cgMLST allelic distances, although “shallow”, is often used when analyzing highly heterogeneous genomes like those of E. coli strains and has been useful in previous studies to identify likely related strains12. Using the higher-resolution phylogenetic analysis based on SNPs, we identified one instance of strain transmission between healthcare and the community, where an isolate from the caretaker of an index patient in the hospital was found to be closely related to an isolate obtained a month later from a family member at the household. Altogether, these diversity analyses show that dissemination of resistant strains does occur at a low level between the hospital and the community, but on a short-term basis.

The study also detected several instances of short-term nosocomial transmission of ESBL strains within the hospital premises. Likewise, we report cases of very closely related strains occurring within the community, indicating that short-term strain sharing within community and within healthcare settings is common. Close genetic relatedness among isolates in hospital-acquired infections and within household settings has previously been documented19; the authors observed that, in the Swiss context, household transmission outweighed nosocomial transmission of ESBL, likely because of the effects of hospital infection control measures. In contrast, our study found similar levels of transmission in hospitals and in households, which is likely explained by differences in infection prevention and control between Swiss and Ugandan healthcare facilities. This indicates that in the Ugandan case more efforts, such as improved hygiene, may be necessary to minimize nosocomial transmission of AMR within hospitals and transmission at household level.

The low level of long-term strain transmission between healthcare and community settings in our study suggested that prolonged transmission between the two settings is mediated by another mechanism, which in this case we found to be mobile genetic elements, mainly plasmids. Such a mechanism of transfer is common among ESBL producers both in clinical settings2,19 and in communities3,13,20. The study showed that the main ESBL gene circulating in the sequenced samples, bla-CTX-M-15, was located within a consistent genetic context on the same plasmid in samples from both healthcare and community settings for at least 6 months. This suggests that related plasmids were responsible for long-term ESBL transmission between healthcare and community settings. Many of the plasmids were of the conjugative IncF types common in many ESBL transmission cases3,6,13. We also found indications that the long-term transfer of the bla-CTX-M-15 gene via these plasmids may lead to settling of the ESBL gene into the chromosome of some of the isolates, probably through the transposon insertion sequences located upstream of the gene on the IncFIB plasmids. Such situations indicate an epidemiological context in which plasmids have repeatedly acted as long-term mediators of horizontal transfer between healthcare and the community. Assessing the role of whole strains in AMR transmission from healthcare settings to communities may require longer follow-up of hospitalized patients after they return home. We did not collect follow-up samples from these patients at several intervals after discharge from hospital. Despite this limitation, we demonstrated through cgMLST and resistome analysis that some patients had strains and resistomes closely related to those from household members, animals, and environments and from the hospital settings, which supports a potential role of whole strains in AMR transmission between the different niches.

Overall, the data suggest a long-term, well-established and continuous colonization of humans, animals, and the environment in the study area by these multidrug-resistant bacteria, sustained by horizontal gene transfer across niches mediated by different plasmids. As previously shown by other studies2,13, the transmission of resistance by ESBL-producing E. coli between different reservoirs might not be easy to assess at strain level due to the high heterogeneity of E. coli, but can be observed more reliably with plasmids and the genomic context of marker genes.

Methods

Sampling and sample processing

The study was conducted between July 2020 and March 2021 at and around two tertiary healthcare facilities in Uganda: Masaka Regional Referral Hospital, a government public hospital, and Kitovu Hospital, a not-for-profit faith-based healthcare facility. Rectal swabs were collected from “index patients” admitted to the two hospitals. The inclusion criteria were that they were on antibiotic treatment and kept livestock in their homes. Rectal swabs were also taken from caretakers, cleaners, healthcare workers, and caretakers of other patients on the same ward. These samples were grouped as “healthcare workers”. Samples from the hospital environment, including swabs of surfaces, sinks, and wastewater, were taken as “healthcare environment”. When discharged, patients were followed up to their homes, where samples were collected from their household contacts, livestock, and household environments (see details in Table S5). The sampling framework is summarized in Fig. S2. In total, we collected and screened 1556 samples from the different sources.

Antimicrobial susceptibility testing and short-read sequencing

All samples were processed for isolation and identification of ESBL-producing E. coli using methods described in our previous studies3. E. coli were presumptively identified based on the standard IMViC (indole test, methyl red test, Voges-Proskauer test, and citrate utilization) tests and later confirmed by sequence analysis. The presumptive colonies were obtained on MacConkey agar containing 2 µg/ml cefotaxime, purified, and confirmed as ESBL producers by the double-disk synergy method. In addition, 268 of the confirmed ESBL-producing isolates were subjected to antimicrobial susceptibility testing against 18 antimicrobials by disc diffusion assay (Table S5) as previously described3.

DNA was purified from 67 of the ESBL E. coli strains and subjected to short-read whole genome sequencing (WGS) on Illumina MiSeq. The selection of these 67 isolates (Table S6) was based on the criterion that the isolate belonged to a sampling cluster in which ESBL isolates from index patients were present together with ESBL isolates from one or more of their caretakers, associated family members, animals, healthcare workers, and hospital and community environments. DNA extraction, WGS and sequence read processing were performed following previously described protocols13,21.

Data processing for bioinformatics analysis

All bioinformatics analyses were performed using internally customized scripts run on Computerome2, the Danish National Life Science Supercomputer, on nodes each equipped with dual 20-core Intel Xeon Gold 6230 CPUs and 192 GB RAM.

Long read sequencing, hybrid assembly and plasmids reconstruction

After preliminary analyses of the short-read sequence data, we selected representatives of different phylogenetic clades identified in our dataset and carried out long-read whole-genome sequencing of these isolates (n = 15) using the Oxford Nanopore technology to obtain completely assembled genomes for comprehensive analyses. Genomic DNA was re-extracted from the selected isolates using the A&A Genomic Mini AX Bacteria+ kit (A&A Biotechnology, Gdańsk, Poland) to obtain less fragmented DNA. Long-read sequencing libraries were then generated using the Rapid Barcoding Sequencing Kit (SQK-RBK004) and sequenced on a MinION Flow Cell (R9.4.1) with MinKNOW version 22.08.9 (Oxford Nanopore Technologies, Inc., Oxford, United Kingdom). The super-accuracy (r9.4.1_450bps_sup) configuration was employed to basecall the Fast5 files generated using the nanopore-supported software Guppy v. 6.3.8. The basecalled FASTQ files were then concatenated into a single file per basecalled sequence run and demultiplexed into individual FASTQ files.

Before hybrid assembly, we first filtered out poor-quality reads using fastp22 v0.20.1 (for short reads) and Filtlong23 v0.2.1 (for long reads) at a sequence similarity threshold of 95%, retaining only the reads with a minimum length of 1 kbp and excluding the worst 5% of the reads. Long-read assembly was first performed using the Trycycler pipeline24, and the generated circularized assemblies were then polished using Medaka v1.7.2. To generate high-quality complete assemblies, the long-read assemblies were then polished with the short reads using Polypolish25. For comparison of hybrid assemblies, Unicycler23 v0.4.9b was then run on the filtered short and long reads. From the hybrid assemblies, the mob_recon command from MOB-suite26 v3.1.0 was used to separate complete chromosomal sequences from mobile elements, including typing of plasmids and associated insertion sequences.

Diversity analyses

The diversity of the strains was determined using multilocus sequence typing (MLST) on the assembled genomes. For phylogeny, two approaches were used: (1) a gene-by-gene cgMLST phylogenetic analysis was performed in EnteroBase (https://enterobase.warwick.ac.uk/) based on allelic distances; and (2) a single nucleotide polymorphism (SNP)-based phylogenetic analysis was also performed to assess the genetic relatedness of isolates across niches. For the SNP analysis, we followed previously described procedures21 in which variants were called with Snippy v4.6.0 (https://github.com/tseemann/snippy) under the following parameters: a mapping quality of 60, a minimum base quality of 13, a minimum read coverage of 4, and 75% concordance at a locus. An alignment of core genome variants was produced with snippy-core for phylogeny inference. Putative recombinogenic regions were detected and masked with Gubbins27 v2.4.1. A maximum likelihood (ML) phylogenetic tree was built with RAxML v8.2.12 under the GTR model with 1000 bootstraps28. The final tree was rooted on the reference E. coli K-12 substr. MG1655 genome and visualized with iTOL29 v5. The pairwise SNP distances were generated with snp-dists v0.8.2 (https://github.com/tseemann/snp-dists) and the data are provided in Table S2. Given the study period lasting from August to February and the general heterogeneity of E. coli genomes, strains with fewer than 50 SNPs difference were considered genetically related.
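To make the pairwise SNP distance and the 50-SNP relatedness cutoff concrete, the sketch below counts differing positions between aligned sequences and keeps the pairs that fall under a threshold. It is a simplified stand-in for what a tool such as snp-dists computes on a core-genome alignment, not the pipeline used in the study: the sequences and isolate names are toy values, and skipping positions where either sequence is not A, C, G or T is an assumption of this example.

```python
from itertools import combinations
from typing import Dict, Tuple

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both bases are A/C/G/T and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in VALID_BASES and y in VALID_BASES and x != y
    )


def related_pairs(alignment: Dict[str, str],
                  threshold: int = 50) -> Dict[Tuple[str, str], int]:
    """Return pairs whose SNP distance is strictly below the threshold."""
    result = {}
    for a, b in combinations(sorted(alignment), 2):
        distance = snp_distance(alignment[a], alignment[b])
        if distance < threshold:
            result[(a, b)] = distance
    return result


# Toy ten-column alignment with hypothetical isolate names.
toy_alignment = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",  # one difference from iso1
    "iso3": "TTTTACGTAC",  # three differences from iso1
    "iso4": "ACNTACGTAC",  # N is skipped, so zero countable differences from iso1
}

# A threshold of 2 is used only because the toy alignment is so short;
# the study's criterion corresponds to threshold=50.
print(related_pairs(toy_alignment, threshold=2))
```

On a real core-genome alignment the same comparison would be made with the default threshold of 50 after recombinant regions have been masked.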

Resistome and mobilome analyses

A systematic resistome analysis was performed using a combined database from ResFinder30 v4.1 and AMRFinderPlus31 v3.10.24. Phenotypic and genotypic resistome data were subjected to a pooled analysis, in which the relative frequencies of resistance and the abundance of resistance genes by antimicrobial class were compared according to place of isolation.

Analysis of the mobile genetic elements hosting the main ESBL gene was performed using annotated genomes with comparison and visualization of the genetic context around the main ESBL gene using clinker32 and BLAST atlas.

All genomes and assembly fragments were annotated using Bakta33 v1.5.1. With the complete assemblies, the genetic contexts of ESBL resistance genes were observed in Artemis, and genomic resistance islands were identified using the IslandViewer 4 tool34. Insertion sequence (IS) elements were identified using the BLAST tool on the ISfinder database (https://www-is.biotoul.fr/blast.php). Using GView Server (https://server.gview.ca/), we mapped the reconstructed plasmids against their closest hits from NCBI to determine representativeness, and some genetic context visualizations were performed with CLC Main Genomics workbench (Qiagen).

Plotting and statistical analysis

Data were plotted in R v4.2.1 using ggplot2. For the resistome analysis, similarity between isolates was computed using the Jaccard similarity index (JI) (between 0 and 1, where 1 is identical), and the resulting JIs were used to generate a hierarchical clustering heatmap in R v4.2.1.
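The Jaccard similarity index between two resistomes is the number of shared resistance genes divided by the number of genes present in either isolate. The sketch below illustrates the calculation in Python rather than R; the gene names are taken from the Results, but the isolate labels and the gene content assigned to each isolate are hypothetical, and treating two empty resistomes as identical is a convention chosen for this example.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity index: |A intersect B| / |A union B|."""
    if not a and not b:
        return 1.0  # assumed convention for two empty gene sets
    return len(a & b) / len(a | b)


def jaccard_matrix(resistomes: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Pairwise similarity between every pair of isolates."""
    return {
        row: {col: jaccard(genes_row, genes_col)
              for col, genes_col in resistomes.items()}
        for row, genes_row in resistomes.items()
    }


# Hypothetical isolates labelled by niche; gene content is illustrative.
resistomes = {
    "HCW_1": {"blaCTX-M-15", "blaOXA-1"},
    "HHC_1": {"blaCTX-M-15", "blaOXA-1", "blaCMY-2"},
    "HHA_1": {"blaCTX-M-55"},
}

matrix = jaccard_matrix(resistomes)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

A hierarchical clustering heatmap can then be drawn from this matrix, typically after converting similarity to distance as 1 - JI.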

Ethics approval

Ethical approval for the study was obtained from Makerere University School of Public Health Higher Degrees Research and Ethics Committee (HDREC #760) and Uganda National Council for Science and Technology (UNCST) (#HS649ES). The study also obtained administrative approval from Uganda Ministry of Health (#ADM.185/130/01), while written informed consent was obtained from each eligible study participant with assurance for confidentiality of information. All methods in this study were performed in accordance with the relevant guidelines and regulations.