Emergence of carbapenem, beta-lactamase inhibitor and cefoxitin resistant lineages from a background of ESBL-producing Klebsiella pneumoniae and K. quasipneumoniae highlights different evolutionary mechanisms

Klebsiella pneumoniae is recognised as a major threat to public health, with increasing emergence of multidrug-resistant lineages including strains resistant to all available antibiotics. We present an in-depth analysis of 178 extended-spectrum beta-lactamase (ESBL)-producing Klebsiella strains, with a high background diversity and two dominant lineages, as well as several equally resistant lineages with less prevalence. Neither the overall resistance profile nor the virulence factors explain the prevalence of some lineages; we observe several putative hypervirulence factors across the population, including a reduced virulence plasmid, but this does not correlate with expansion of one or few highly virulent and resistant lineages. Phenotypic analysis of the profiles of resistance traits shows that the vast majority of the phenotypic resistance profiles can be explained by detailed genetic analyses. The main discrepancies are observed for beta-lactams combined with beta-lactamase inhibitors, where most, but not all, resistant strains carry a carbapenemase or ampC. Complete genomes for six selected strains, including three of the 21 carbapenem-resistant ones, are reported, which give detailed insights into the early evolution of the bla-NDM-1 enzyme, a carbapenemase that was first reported in 2009 and is now globally distributed. Whole-genome based high-resolution analyses of the dominant lineages suggests a very dynamic picture of gene transfer and selection, with phenotypic changes due to plasmid acquisition and chromosomal changes, and emphasize the need to monitor the bacteria at high resolution to understand the rise of high-risk clones, which cannot be explained by obvious differences in resistance profiles or virulence factors. Importance Carbapenem-resistant and extended-spectrum beta-lactamase (ESBL) carrying Enterobacteriaceae were recently highlighted as critical priority fo the development of new treatments by the WHO. Klebsiella pneumoniae is a member of the Enterobacteriaceae and has seen a dramatic rise in clinical relevance due to its uncanny ability to accumulate multidrug-resistance plasmids. We present a detailed analysis of a set of ESBL-resistant K. pneumoniae clinical isolates, and our high-resolution whole-genome sequence analyses highlight that acquisition of drug resistances is not a one-way street in K. pneumoniae, but a highly dynamic process of gain and loss, and that the most successful lineages in the clinic are not necessarily the most resistant or most virulent ones. Analysis of the virulence potential also shows that these strains harbour some, but not all, hallmarks of hypervirulent strains, emphasizing that it is not a clear distinction between hypervirulent and other strains, but equally in flux.


Importance 45
Carbapenem-resistant and extended-spectrum beta-lactamase (ESBL) 46 carrying Enterobacteriaceae were recently highlighted as critical priority fo the 47 development of new treatments by the WHO. Klebsiella pneumoniae is a 48 member of the Enterobacteriaceae and has seen a dramatic rise in clinical 49 relevance due to its uncanny ability to accumulate multidrug-resistance 50 plasmids. We present a detailed analysis of a set of ESBL-resistant K. 51 pneumoniae clinical isolates, and our high-resolution whole-genome 52 sequence analyses highlight that acquisition of drug resistances is not a one-53 way street in K. pneumoniae, but a highly dynamic process of gain and loss, 54 and that the most successful lineages in the clinic are not necessarily the 55 most resistant or most virulent ones. Analysis of the virulence potential also 56 shows that these strains harbour some, but not all, hallmarks of hypervirulent 57 strains, emphasizing that it is not a clear distinction between hypervirulent and 58 other strains, but equally in flux. 59

Introduction 86
The past four decades have seen a continuous escalation of bacterial 87 pathogens acquiring resistance mechanisms against antimicrobials, and 88 especially antimicrobial resistance determinants associated with mobile 89 elements have spread with exponentially increasing speed across the globe 90 (1, 2). A particularly successful pathogen in this group is Klebsiella 91 pneumoniae, which was formerly known as a major cause of infections in 92 neonates, especially in developing countries (3-6) and community-acquired 93 and nosocomial infections in immunocompromised patients (7-9). The 94 acquisition of extended-spectrum beta-lactamases (ESBLs) rapidly increased 95 in Klebsiella spp. from the 1990s, particularly in hospital isolates (10, 11). In 96 2009 there was first description of NDM-1, a metallo-betalactamase which 97 hydrolyzes carbapenems and can escape beta-lactamase inhibitors, which 98 were until then the main alternative treatments for third-generation 99 cephalosporin-resistant bacteria (12). NDM-1 appears to have originated as a 100 chromosomal gene fusion in Acinetobacter (13), but rapidly disseminated 101 around the globe as well as across several species (e.g. E. coli, Provatella; 102 there is less diversity of LPS O-antigens with two main types in our collection; 161 however these are not the usually reported main types (i.e. not O1, O2 and 162 O3; 20, 22) but include less common types. LPS serotype 05 was the second 163 most common, present in 68 of our isolates across 3 sequence types (Fig. 1). 164

165
With the collection of strains there is an almost ubiquitous presence of 166 heavy metal resistance (silver, sil; copper, pco) as well as the type 3 fimbriae 167 (i.e. mrk; Fig. S1). The genes encoding the siderophores aerolysin and 168 yersiniabactin are distributed only partially throughout the collection ( Fig. S1; 169 23), and no salmochelin or colibactin encoding genes were identified. Our 170 collection encodes for six different types of yersiniabactin in eight sequence 171 types (23), including the less common combination of ybt8 with ICEKp3 172 (ST423), and one sequence type (ST48) with three different variants (Fig. S1), 173 highlighting the wide distribution of the different siderophore types even within 174 a confined setting already in these historical isolates. 175 176 We identified a high number of genes predicted as pseudogenes by 177 ariba (Fig. S2A), including the virulence gene rmpA2, the capsule upregulator 178 (Fig. S2B), as well as the fluoroquinolone resistance gene qnrB (Fig. S2C). 179 RmpA is inactivated through a frameshift in either the poly-G and poly-A 180 regions (Fig. S2B), as described previously (24). RmpA and aerobactin are 181 indicative of the presence of the Klebsiella virulence plasmid (25-28), and we 182 therefore compared several virulence plasmids found with our data, and we 183 performed comparisons against the plasmid pSGH10, which is present in a 184 representative model strain of the hypervirulent clade CG23 (28). The plasmid 185 is not evenly spread across the sequence types from Lahore and lacks 186 salmochelin; both when compared mapping ( Fig. 2A) or assemblies (Fig. 2B). 187 The presence or absence of these putative key virulence factors (15) could 188 not be correlated with clinical outcome (details Table 2 Closer investigation of the potential loss of fluoroquinolone resistance 212 via the predicted qnrB pseudogenes (Fig. S2A) showed that it might reflect a 213 functional gene with an alternative start codon (Fig. S2C), which however is 214 recognised as truncated due to its slightly shorter sequence at the N-terminus. 215 The full-length sequence found in the CARD and ARGANNOT database 216 (accession ABC86904.1) contains 12 amino acids at the N-terminus. A 217 frameshift mutation present in the qnrB coding sequences of our strain set 218  identical with the reference sequences. We also note that the sequence in 224 Genbank has been updated (ABC86904.2), which renders it the same length 225 as the more recently described qnrB2 (35; APU91821.1, 5 amino acids 226 difference), and we therefore assume the putatively shortened QnrB to be 227 functional and to confer quinolone resistance in the strains reported here. Klebsiella spp. harbour two intrinsic enzymes related to low-level beta-233 lactam resistance; AmpH, an AmpC-related enzyme functioning as penicillin-234 binding protein; and a chromosomally integrated beta-lactamase of the SHV, 235 OKP or LEN family for K. pneumoniae, quasipneumoniae and variicola, 236 respectively (36). In addition to these chromosomal enzymes, we detected 237 CTX-M-15, three OXA and three TEM variants which however result in the 238 identical amino acid sequence, VEB-5; the AmpC enzymes CMY-6 and CMY-239 18, and NDM-1 carbapenemases as well as additional copies of SHV (Fig. 3). however the only clear genetic difference between HE205 and HE206 is an 253 iron uptake system (fec) that is absent from the two former strains (Fig. S4). 254 255 It is recognised that sensitivity against piperacillin/tazobactam (TZA) 256 often coincides with lower MIC values against amoxicillin/clavulanate ( Fig. S4; 257 Table S2), even though, based on a phenotypic sensitive/resistant evaluation, 258 these strains are assessed as resistant. More detailed analysis of the 259 encoded SHV/OKP bla genes revealed that several K. quasipneumoniae 260 strains (HE031, HE029, HE025, HE034, HE023; ST334) gained a bla-SHV 261 gene in addition to the intrinsic bla-OKP, and the contigs encoding the bla-262 SHV copies are similar to plasmid sequences based on Blast data from public 263 websites, and a plasmid location is also supported by the unique plasmid 264 replicon profile of these strains (Fig. S1). These putatively plasmid-borne 265 blaSHV genes encode for a sequence with the mutations G238S and E240K, 266 which correlate with TZA resistance, or TZA-intermediate sensitivity ( Fig. 3; 267 37-39). We also observed several sequences in our dataset encoding the 268 L35Q mutation; however this only provides a subtle increase in TZA MICs (37, 269 40, 41), and no mutation was found at S130 (42). The MIC data highlights that the same gene (NDM-1) might cause 273 different resistance phenotypes in different genetic backgrounds (Table 1) unstable nature of NDM-1 has been observed previously in several 286 independent reports, where loss can occur already after two generations of 287 re-growth, and even under meropenem selection (43-45). The loss of the 288 entire plasmid carrying NDM-1 in part of our samples, or at least a larger 289 transposable cassette, is further supported by the presence of several 290 resistance genes predicted at low abundance and absent in the PacBio 291 assemblies, potentially indicating the unstable nature of the plasmid carrying 292 the resistance (Fig. 4). 293 In addition, replicon analysis and mapping of the resistance and 294 virulence genes back to the plasmids across all strains where the genomes 295 were resolved with PacBio indicates the involvement of several different 296 plasmids, showing highly dynamic profiles between strains even within closely 297 related isolates (Fig. 4). Almost fixed in the population across all sequence 298 types are as expected the incompatibility groups F and H; with only few 299 isolates showing rarer types such as N and R (Fig. S1, Fig. 4).

Comparison ST15 isolates from Lahore with outbreak in Nepal 307
To gain further insights into the diversity within dominant lineages in 308  (Table S3). This included assessing the 310 presence or absence of genes via a pan-genome analysis of a subset of 311 isolates, as well as mapping of the short read data for the same isolate highlighting the likely close relationship of this virulent lineage that seems to 319 be spreading throughout this area of the world. 320 The phylogeny indicated additional structure in the isolates from 321 Lahore, which was confirmed by analysing the mapping results in more detail 322 The Klebsiella isolates represent routinely collected samples over a 360 protracted period. We observed sporadic single-isolate lineages; smaller, 361 related clusters of 5-10 isolates per lineage; in addition to two larger clusters 362 of strains. One of the latter two represents members of K. quasipneumoniae 363 subsp. similipneumoniae, a subspecies that was previously thought to be less 364 virulent ( Fig. 1; 15). The major K. quasipneumoniae sequence type (334)  Ag, that is generally far less variable than the capsule. A recent large-scale 384 analysis of several Klebsiella datasets spanning a global collection identified 385 serotypes O1, O2 and O3 in 80% of hospital infections (20); contrary to this 386 previous finding, we observed a high number of non-O1/O2 types in our 387 collection of Klebsiella; these are mainly, but not exclusively, contributed by 388 the large number of K. quasipneumoniae (Fig. 1D, Table S1). This diversity, 389 and the lack of apparent correlation to disease severity (20), indicate that  We report a strain set that is a hybrid type; either encoding the virulence 417 plasmid but with rmpA2 inactivated, or the yersiniabactin, but no strains with bacteria, but it seems clear that acquisition of an NDM-1 plasmid alone is 456 neither a sign that the plasmid will now spread stably with the population, nor 457 that this lineage has an undisputable advantage over all carbapenemase-458 negative lineages. 459 The genomics generally predicted the phenotype with respect to drug 461 resistance. This is important as technology is being developed to identify 462 resistance genes at the point-of-care. Subtle differences in the sensitivity that 463 are overlooked by evaluating strains only below or above a specified drug 464 concentration, and might indicate an additional mechanism to deactivate beta-465 lactamase inhibitors, especially since we could not detect any genomic The samples as indicated in the respective experiments were assembled and 518 annotated as described above, and the GFF3 files generated by PROKKA 519 v1.11 (83) were used as input for the pan-genome pipeline Roary (v3.7.0; 520 85) with a BLASTp percentage identity of 90% and a core definition of 99%. 521 This gives a core gene alignment of 3486 genes for all strains from this study 522 (Figure 1, 2). The core gene alignment was generated with mafft (86); SNPs 523 were first extracted using snp-sites v2.3.2 (87), and then a maximum 524 likelihood tree with RAxML (88) was calculated. For the ST15 comparison, the 525 pan-genome was calculated as described above, and gene presence/absence 526 analysed further using R and the resulting presence/absence matrix and 527 phandango (89). 528 529

Phylogenetic analyses 530
Trees were calculated using RAxML (v8.2.8; 88) with the time-reversible GTR 531 model and 100 bootstrap repeats. Tree demonstrations were prepared in itol 532 (90) as well as in house python and R scripts. For whole-genome mapping 533 trees, recombinant regions were removed using Gubbins (v1.4.9; 91) and a 534 maximum likelihood tree calculated with RAxML to obtain bootstrap support 535 values, as described above, and sites with more than 5% N were not 536 considered in the tree calculation.  The assignment to O3 subgroups was updated based on recent nomenclature 558 (O3l/s; now a/b/c) using the online tool captive (http://kaptive.holtlab.net/, 22). 559 560