Characterization of a SPM-1 metallo-beta-lactamase-producing Pseudomonas aeruginosa by comparative genomics and phenotypic analysis

Pseudomonas aeruginosa is one of the most common pathogens related to healthcare-associated infections. The Brazilian isolate, named CCBH4851, is a multidrug-resistant clone belonging to the sequence type 277. The antimicrobial resistance mechanisms of the CCBH4851 strain are associated with the presence of the bla SPM-1 gene, encoding a metallo-beta-lactamase, in combination with other exogenously acquired genes. Whole-genome sequencing studies focusing on emerging pathogens are essential to identify key features of their physiology that may lead to the identification of new targets for therapy. Using both Illumina and PacBio sequencing data, we obtained a single contig representing the CCBH4851 genome with annotated features that were consistent with data reported for the species. However, comparative analysis with other Pseudomonas aeruginosa strains revealed genomic differences regarding virulence factors and regulatory proteins. In addition, we performed phenotypic assays that revealed that CCBH4851 is impaired in motility and biofilm formation. On the other hand, the CCBH4851 genome contained acquired genomic islands that carry transcriptional factors and virulence and antimicrobial resistance-related genes. The presence of single nucleotide polymorphisms in the core genome, mainly those located in resistance-associated genes, suggests that these mutations may also influence the multidrug-resistant behavior of CCBH4851. Overall, characterization of the complete genome of Pseudomonas aeruginosa CCBH4851 revealed the presence of features that strongly relate to the virulence and antibiotic resistance profile of this important infectious agent.

Currently, the main carbapenem resistance mechanism associated with Brazilian isolates is the production of an MBL denominated São Paulo metallo-beta-lactamase (SPM-1). In 2002, the SPM-1-encoding gene (bla SPM-1) was first identified in a clinical isolate recovered from a blood culture of a patient admitted to a hospital located in the state of São Paulo. Over the past years, several isolates carrying bla SPM-1 were recovered from multiple P. aeruginosa hospital infection outbreaks spread across the Brazilian territory. Multilocus sequence typing placed most of these isolates within the sequence type 277 (ST-277), which contains P. aeruginosa strains from different countries. However, bla SPM-1 was only detected in isolates originating from Brazil, and it is unknown why SPM-1 is restricted to a specific region whereas other MBLs tend to spread worldwide. Nevertheless, the risk of SPM-1 worldwide dissemination should not be neglected 5. In 2008, an isolate was recovered from the catheter tip of a patient admitted to a hospital located in the state of Goiás. The strain, named CCBH4851, was subjected to antimicrobial susceptibility assays. Among the agents tested, the strain was resistant to aztreonam, amikacin, gentamicin, ceftazidime, cefepime, ciprofloxacin, imipenem, meropenem, and piperacillin-tazobactam, being susceptible only to polymyxin B 6. Multidrug resistance scenarios like the one presented by CCBH4851 highlight the urgent need for the discovery of new therapeutic options. Accordingly, recent whole-genome sequencing studies have focused on emerging MDR P. aeruginosa strains in order to identify key aspects of their physiology that may lead to the identification of new targets for drug development. This study was designed to thoroughly characterize the complete closed genome of P. aeruginosa CCBH4851 through the detection of genomic features, genome comparison with other P. aeruginosa strains, and phenotypic analysis.
We believe this study will serve as a useful reference for future research using CCBH4851 as a model organism.

Results
Genome features of P. aeruginosa CCBH4851. A complete genome sequence was obtained by the combined assembly of Illumina short and PacBio long reads. The genome consisted of a single circular chromosome with 6,834,257 bp and a G+C content of 66.07%. The number of coding sequences (CDSs) was 6,236, with an average length of 973 bp, which represents 88.78% of the genome. Families of Clusters of orthologous groups (COG) were attributed to 5,247 CDSs comprising 20 functional categories distributed along the chromosome (Fig. 1). A summary of the genomic features found in the CCBH4851 complete genome is listed in Table 1.
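As a quick consistency check on the figures above, the coding fraction can be recomputed from the CDS count, the average CDS length, and the chromosome size. This is only a back-of-the-envelope sketch: it assumes that CDSs do not overlap and that the reported average length is exact, and the function and variable names are ours, not part of the original analysis.

```python
# Values reported in the text for the CCBH4851 chromosome.
GENOME_LENGTH_BP = 6_834_257
CDS_COUNT = 6_236
MEAN_CDS_LENGTH_BP = 973  # rounded average, so the result is approximate


def coding_fraction(n_cds: int, mean_len_bp: float, genome_bp: int) -> float:
    """Fraction of the genome covered by CDSs, ignoring any overlaps."""
    return n_cds * mean_len_bp / genome_bp


fraction = coding_fraction(CDS_COUNT, MEAN_CDS_LENGTH_BP, GENOME_LENGTH_BP)
print(f"Coding fraction: {fraction:.2%}")
```

Under these assumptions the result should land close to the 88.78% stated in the text.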
Core, accessory and unique genomes. The gene repertoire of CCBH4851 was compared to that of other strains divided into three groups (Table 4): (A) the reference strains commonly used in research studies, (B) other ST-277 strains also carrying the bla SPM-1 gene, and (C) other MDR strains belonging to different STs. A search for orthologs among these genomes identified genes that were shared by all genomes (core), genes specific to each genome (unique), and genes shared by two or more (but not all) genomes (accessory). In group A, 5,068 orthologous genes were identified as the core genome, 251 as the accessory genome, and 697 genes were unique to CCBH4851 (Supplementary Table S1 online). Apart from genes included in the "Function unknown" COG class, the majority of unique genes were distributed among the "Replication, recombination, and repair" and "Transcription" COG classes (Fig. 2A). In group B, the core genome comprised 5,645 genes. The accessory and unique genomes comprised 293 and 195 genes, respectively, again with "Replication, recombination, and repair" and "Transcription" as the major functional categories once the "Function unknown" family was excluded (Fig. 2B). Group C, including MDR strains of different STs, presented 3,680 genes as the core genome, 1,745 as the accessory genome, and 508 as unique genes of CCBH4851. The most evident functional categories remained the same (Fig. 2C). The comparative analysis also revealed that the CCBH4851 genome has 205 genes lacking or presenting only partial homology when compared to the PAO1 genome (Supplementary Table S2 online).
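The core/accessory/unique partition described above can be illustrated with plain set operations. The following is a minimal sketch, not the pipeline used in the study: it assumes each genome has already been reduced to a set of ortholog-group identifiers, it only classifies the groups present in the focal strain, and the strain names and identifiers in the toy input are hypothetical.

```python
from collections import Counter


def partition_pangenome(focal, genomes):
    """Split the focal genome's ortholog groups into core, accessory, unique.

    `genomes` maps a strain name to the set of ortholog-group IDs found in
    that strain and must include the focal strain itself.
    """
    n_genomes = len(genomes)
    # Number of genomes in which each ortholog group occurs.
    counts = Counter(g for groups in genomes.values() for g in groups)
    focal_groups = genomes[focal]
    core = {g for g in focal_groups if counts[g] == n_genomes}
    unique = {g for g in focal_groups if counts[g] == 1}
    accessory = focal_groups - core - unique  # shared with some, not all
    return core, accessory, unique


# Hypothetical toy input for illustration only (not data from the study).
toy = {
    "CCBH4851": {"og1", "og2", "og3", "og4"},
    "strainX": {"og1", "og2", "og5"},
    "strainY": {"og1", "og3", "og5"},
}
core, accessory, unique = partition_pangenome("CCBH4851", toy)
print(sorted(core), sorted(accessory), sorted(unique))
```

Because the classification depends on which genomes are in the comparison, the same focal genome yields different core sizes for groups A, B, and C.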
The predominant functional COG categories of the non-orthologous genes were "Replication, recombination, and repair", "Transcription", "Intracellular trafficking, secretion, and vesicular transport", "Inorganic ion transport and metabolism", "Energy production and conversion", "Secondary metabolites biosynthesis, transport, and catabolism", "Defense mechanisms", and "Coenzyme transport and metabolism". It is noteworthy that oprD (porin), mexZ, liuR, PA0207, PA0306, PA0306a, PA2100, PA2220, PA2547, PA3067, PA3094, and PA3508 (transcriptional regulators), opdE (membrane protein), pilA (type IV fimbrial protein), algP (alginate regulatory protein), and others were among these genes.
of the genes located in the PAGIs had no function assigned. The remainder were mainly classified in the following COG categories: "Replication, recombination, and repair", "Transcription", "Intracellular trafficking, secretion, and vesicular transport", "Energy production and conversion", "Secondary metabolites biosynthesis, transport, and catabolism", "Inorganic ion transport and metabolism" and "Defense mechanisms". A total of 13 insertion sequence (IS) families were found in the CCBH4851 genome with 46 predicted interspersed open reading frames (ORFs), including complete and partial sequences (Supplementary Table S4 online). A search for clustered regularly interspaced short palindromic repeats (CRISPR) revealed the presence of two CRISPR loci in the CCBH4851 genome: one had only 1 spacer and direct repeats of 25 bp, and the other had 39 spacers and direct repeats of 32 bp. The first CRISPR was located at position 1,878,269 to 1,878,370 in the genome, between two hypothetical protein-encoding genes. The DNA sequence corresponding to this first CRISPR array was a variant of the intergenic region between the PA3230 and PA3231 genes in PAO1 (homologous to AL347_00365 and AL347_00370).
The second CRISPR was located at 5,576,114 to 5,578,733, partially overlapping the hypothetical protein-encoding gene AL347_11857. In addition, the second CRISPR array was located adjacent to an intact CRISPR-associated gene cluster, cas2 (AL347_11860), cas1 (AL347_11865), cas4 (AL347_11870), cas7 (AL347_11875), cas8c (AL347_11880), cas5 (AL347_11885), cas3 (AL347_11890). These cas genes were related to the type I-C CRISPR-Cas system and were located inside PAGI-34, as previously described in other ST-277 clones 7 .

Regulatory proteins.
Comparative genome analysis revealed strong evidence of horizontal gene transfer events in the CCBH4851 genome. Despite the predominance of genes encoding hypothetical proteins in these acquired regions, several CDSs were classified into the "Transcription" COG functional class. Regulatory protein analysis was performed to predict the presence of two-component systems, transcription factors, and other DNA-binding proteins in the CCBH4851 genome. Table 2 summarizes a comparison between CCBH4851 and PAO1 regulatory proteins.
In accordance with the comparative analysis, the CCBH4851 genome lacks the histidine kinase PA2583, the response regulator PA0034, as well as the transcriptional regulators PA3508, PA3067, and LiuR (PA2016). However, the CCBH4851 genome possessed 28 additional regulatory proteins, mostly transcriptional regulators with Xre- and LysR-type domains, distributed along PAGIs (Supplementary Table S5 online).
Antimicrobial resistance factors. Protein sequences from the CCBH4851 genome were used for searches against The Comprehensive Antibiotic Resistance Database (CARD). As previously described 6, the CCBH4851 genome possessed additional genes conferring multidrug resistance: two copies of bla SPM-1 (AL347_32235, AL347_32285), two copies of sul1 (AL347_30655, AL347_30700), rmtD (AL347_30670), blaOXA-56 (AL347_30635), aac(6')-I (AL347_30630), aadA7 (AL347_30640), cmx (AL347_30710), and three copies of bcr (AL347_32216, AL347_32266, AL347_32316). As expected, all these genes were located in PAGIs. Apart from mexZ, genes listed as belonging to the PAO1 resistome were also present in the CCBH4851 genome (Supplementary Table S6 Table S7 online). Synonymous variants were left aside, as they have no presumable impact on cellular processes. Apart from intergenic regions, the remaining affected genes were classified into COG families to assess whether variant types were common to a few functional categories or were randomly distributed among all of them. Mutations occurred mainly in genes belonging to the "Inorganic ion transport and metabolism", "Amino acid transport and metabolism", "Cell wall, membrane, and envelope biogenesis", "Transcription", and "Energy production and conversion" COG categories (Fig. 3). In addition, Table 3 summarizes a list of virulence and antimicrobial resistance-associated genes with their respective amino acid substitutions. The hypothetical structural effect of new substitutions was predicted and is listed in detail in Supplementary Table S8 online.
Phylogenomic analysis of P. aeruginosa CCBH4851. The relatedness of the CCBH4851 genome was assessed by phylogenetic inference among the isolates of groups A, B, and C, using the whole genome to perform SNP calling and multiple sequence alignment of the resulting core genome. P. aeruginosa PAO1 was used as reference genome and had 92.9% of its sequence covered by all isolates. Interestingly, isolates clustered according to their sequence types in the same phyletic clade (Fig. 4). Also, phylogenomic analysis showed that CCBH4851 was more closely related to isolates of ST-244, sharing about 2-fold fewer SNP variants with them than with ST-235 isolates. P. aeruginosa PA7, as expected, was excluded from the resulting tree since it is known to be a taxonomic outlier 10.
Swarming, swimming and twitching motilities. Swarming, swimming and twitching assays showed that CCBH4851 is impaired in all three types of motility. Swimming and swarming zones of CCBH4851 were about 5 millimeters (diameter) smaller when compared to PAO1 (Figs. 5 and 6, respectively). Both differences were statistically significant (swimming: P value = 0.0048, two-tailed, unpaired t test; swarming: P value < 0.001, two-tailed, unpaired t test). Comparisons of twitching motility showed no differences concerning the diameter of migration zones (Fig. 7A). However, after removal of unattached cells and staining with crystal violet, we observed diminished attachment of CCBH4851 cells to the polystyrene surface when compared to PAO1 (Fig. 7B). Observation of the stained area showed that CCBH4851 cells produced an expanding circular adherent zone, indicating that only cells on the outer side of the twitching area were attached, whereas PAO1 cells seemed to adhere to the polystyrene surface throughout the entire area, with only a few bacteria released from the center of the colony. Absorbance measurements of solubilized crystal violet confirmed the significant difference seen between the two strains. In order to assess if the diminished attachment observed would affect the ability of CCBH4851 to form biofilms, 96-well plate biofilm formation assays were performed. Under the conditions tested, CCBH4851 formed about fourfold less biofilm than PAO1 (Fig. 7C, P value < 0.0001, two-tailed, unpaired t test).
Antimicrobial susceptibility. Antimicrobial susceptibility testing was performed using one drug from each of four major antimicrobial classes. Tests were performed using the broth microdilution method, which allowed us to determine a minimal inhibitory concentration (MIC) for each drug. CCBH4851 was resistant to all antimicrobial agents tested except for colistin. Colistin susceptibility tests presented more than one skipped well, with no pattern. According to the European Committee on Antimicrobial Susceptibility Testing (EUCAST) guidelines, tests with more than one skipped well should not be reported 11. However, the majority of replicates showed no growth above the resistance breakpoint for colistin. The MICs for gentamicin, imipenem and ciprofloxacin were >1024 µg/mL, 128 µg/mL and 16 µg/mL, respectively. In addition, an E-test strip containing ceftolozane/tazobactam (a cephalosporin in combination with a beta-lactamase inhibitor) was also employed and CCBH4851 showed no inhibition zone (Supplementary Fig. S2 online).

Discussion
Pseudomonas aeruginosa CCBH4851 is a clinical isolate originating from Brazil, which belongs to the endemic group ST-277. Previous studies demonstrated that ST-277 strains have a highly conserved DNA sequence and share several virulence and antimicrobial resistance-related features. However, to date, the presence of the bla SPM-1 gene conferring resistance to carbapenems is a singular trait of Brazilian clones 5,7. The purpose of this study was to reassemble and to re-annotate the CCBH4851 genome in order to perform a thorough characterization and a genomic comparison with other P. aeruginosa strains. The reassembly resulted in a single contig representing the complete closed genome of CCBH4851. Together with the provided annotation, data such as chromosome size, G+C content, number of CDSs, structural RNAs, and tRNAs were consistent with general features reported for P. aeruginosa strains 12. In addition, the CCBH4851 genome alignment with PAO1, PA14, and ST-277 strains revealed a strong synteny (Supplementary Fig. S1 online), indicating an accurate assembly and annotation. The analyses performed in this work showed that the core genome of CCBH4851, PAO1, and PA14 strains (group A, Table 4) covered more than 80% of the CCBH4851 genome. A similar percentage was found when the core genome was defined comparing CCBH4851 with other isolates belonging to the ST-277 also carrying bla SPM-1 (group B, Table 4). However, when comparing MDR strains belonging to different STs (group C, Table 4), the core genome was smaller and an increase in the number of genes comprised in the accessory genome was observed.
This result could be attributed to different factors: (i) some strains used in this analysis had over one hundred contigs and this lack of continuity could affect genome annotation; (ii) the strains used in this analysis are from different countries and had distinct resistance profiles, which could suggest a variation of mutation patterns and acquired regions, depicting interstrain differences. The larger accessory genome among these strains suggests variability in pathogenicity and environmental flexibility. The comparative genome analysis revealed total or partial absence of homology between several PAO1 genes and the CCBH4851 genome. Among them is oprD, which carries a deletion of 2 nucleotides at positions 380-381, causing a frameshift that introduces a premature stop codon. OprD is an outer-membrane porin important for the diffusion of carbapenems, particularly imipenem. Disruption of OprD is a common resistance mechanism, mainly when combined with overexpression of AmpC and efflux pump systems 13. As a result, CCBH4851 showed a MIC of 128 µg/mL for imipenem. Comparison with PAO1 also revealed the mutation of mexZ in the CCBH4851 genome, suggesting the overexpression of at least one efflux pump system, MexXY, whose transcription is repressed by MexZ. A deletion of 17 nucleotides from position 439 in the mexZ gene sequence caused a frameshift resulting in the loss of the stop codon.  Aminoglycoside resistance is known to be produced not only by the overexpression of MexXY but also by the presence of aminoglycoside-modifying enzymes (AMEs) and rRNA methylases 15.
Table 3. SNPs found in virulence and antimicrobial resistance-associated genes of P. aeruginosa CCBH4851 using P. aeruginosa PAO1 as reference.
In addition to the possible efflux mechanism, CCBH4851 acquired two AME-encoding genes, aac(6')-I and aadA7, and a 16S rRNA methylase-encoding gene, rmtD, resulting in a MIC over 1024 µg/mL for gentamicin. Other differences revealed by the comparative analysis were the mutations of the algP and pilA genes, both related to biofilm development. AlgP is a histone-like protein which activates the transcription of algD, responsible for the alginate precursor synthesis. Alginate is an important molecule for the structural stability of the biofilm matrix. The amino acid sequence of AlgP contains several repeated KPAA motifs, which are frequent mutation targets in clinical isolates. Consistent with this, the algP gene sequence of CCBH4851 had an insertion of 12 nucleotides, adding one KPAA motif to the AlgP protein sequence. The repeated 12-bp sequences of algP are considered a hot spot for DNA rearrangements and could provide a reversible switching mechanism between the nonmucoid and mucoid phenotypes, thus turning on and off virulence factors important for a successful infection 16,17. On the other hand, the pilA sequence is completely absent from the CCBH4851 genome. PilA is a pilin protein involved in the biogenesis of the type IV pilus (TFP), a major surface adhesin. Previous work demonstrated that the absence of pilA in carbapenem-resistant clinical isolates is not uncommon 18.
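The contrast between the frameshifting deletions (oprD, mexZ) and the in-frame 12-nucleotide insertion in algP discussed above can be illustrated with a toy translation. The sketch below is illustrative only: the sequence is an invented 33-nt open reading frame containing two KPAA-encoding repeats, not the real algP, oprD, or mexZ sequence, positions are 0-based, and the helper names are ours. It uses the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna):
    """Translate from position 0; return (protein, reached_stop_codon)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            return "".join(protein), True
        protein.append(aa)
    return "".join(protein), False  # ran off the end without a stop


def delete(dna, start, length):
    """Remove `length` bases beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]


def insert(dna, pos, fragment):
    """Insert `fragment` before 0-based index `pos`."""
    return dna[:pos] + fragment + dna[pos:]


# Hypothetical toy ORF: ATG, two KPAA repeats, GAT, stop.
KPAA_REPEAT = "AAGCCGGCCGCG"
toy_orf = "ATG" + KPAA_REPEAT * 2 + "GAT" + "TAA"

wild_type = translate(toy_orf)
in_frame = translate(insert(toy_orf, 3, KPAA_REPEAT))   # 12 nt: frame kept
frameshift = translate(delete(toy_orf, 4, 2))           # 2 nt: frame lost

print("wild type :", wild_type)
print("+12 nt    :", in_frame)
print("-2 nt     :", frameshift)
```

In this toy case the 12-nt insertion adds one KPAA unit and leaves the rest of the protein intact, whereas the 2-nt deletion rewrites every downstream codon and the original stop codon is no longer read in frame. Whether a real frameshift creates a premature stop or removes the stop depends on the actual downstream sequence.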
P. aeruginosa is capable of switching between three types of motility depending on the environment: swimming, swarming and twitching. Swimming is mediated by flagella in aqueous environments. Swarming is mediated by both flagella and TFP in semisolid environments. Twitching is TFP-dependent and allows the cells to move across surfaces by extension, attachment and retraction of the pilus filament. Also, surface attachment mediated by TFP is required for biofilm initiation [19][20][21]. As expected, our in vitro analyses showed that CCBH4851 is impaired in swarming, twitching, and biofilm development. P. aeruginosa CCBH4851 was also defective in its ability to swim. Although CCBH4851 lacked none of the known genes directly involved in flagellar biosynthesis, several SNPs were located in those genes (Supplementary Table S7 online), including flhF, fliA, fleN, motB, and others. During acute infection, motility is required for the colonization of environments, attachment to surfaces and initiation of biofilm formation, mechanisms strongly related to virulence. Motile bacteria often display a more virulent phenotype. Loss of motility is important for the switch to a sessile lifestyle and the establishment of chronic infection. Also, it is noteworthy that virulence and multidrug resistance are often inversely correlated, i.e. MDR strains are less virulent or non-virulent whereas non-MDR strains are hypervirulent 22,23. In that sense, the phenotype displayed by CCBH4851 suggests a less virulent behavior in favor of a survival strategy under selective environmental pressures.
The acquisition of exogenous material by horizontal gene transfer is a common adaptive mechanism among P. aeruginosa strains, often related to the presence of genomic islands. Genomic islands are clusters of genes often encoding virulence factors, antimicrobial resistance proteins, toxins, secretion system proteins, transcriptional regulators, and other proteins 24. The CCBH4851 genome presented a high number of islands, all sharing homology with previously described PAGIs, except for PAGI-41. Although the majority of proteins encoded by these PAGIs are hypothetical proteins, some of them could be clustered into functional categories due to the presence of conserved domains. The most prominent categories comprised genes involved in mechanisms of replication, recombination and repair, and transcription. Indeed, the newly annotated PAGI-41 carries the gene AL347_09405, which encodes an XRE-type HTH domain-containing protein, suggesting a function as a transcriptional repressor. Downstream, the gene AL347_09410 encodes a protein containing a RelE/ParE toxin superfamily domain. In response to environmental stress, such as antimicrobial exposure, this type of toxin can inhibit replication or translation mechanisms, leading the cell to a dormant state. Cells in a dormant state, also called persister cells, become multidrug-tolerant. The repressor acts as an antitoxin by binding to the toxin to form an inactive complex, and is also able to bind its own promoter region, preventing toxin-antitoxin expression. Antitoxins could also regulate the transcription of several virulence factors, including those involved in biofilm formation 25,26. CCBH4851 possesses three additional toxin-antitoxin systems in other PAGIs. Moreover, all acquired genes conferring resistance to a broad range of antimicrobial agents were located in PAGIs, including the two copies of the carbapenem resistance gene bla SPM-1.
Our in vitro analyses showed that CCBH4851 is not only resistant to several antimicrobial agents, but also presented MIC values over the resistance breakpoints.
In addition to the presence of PAGIs, distinct IS families were also detected in the CCBH4851 genome. The importance of ISs is not restricted to their role in horizontal gene transfer: IS movement along the chromosome can also affect antibiotic resistance by activating gene expression 27. Another player in shaping the bacterial genome is the CRISPR-Cas system. The type I-C system found in the CCBH4851 genome was previously described in other ST-277 bacteria as well as in members of ST-235 7,28. A recent study suggests the protective effect of CRISPR-Cas systems is more evident at the population level than at an evolutionary scale 29. This could explain the intraclonal genome conservation already observed among Brazilian isolates belonging to ST-277 in previous work 7 and corroborated here by our phylogenomic analysis.
During the infection course, P. aeruginosa strains tend to adapt to the selective pressures of the environment, often through mutations in intergenic and/or coding sequence regions. A classical example of adaptive mutation is the MexT protein, responsible for activation of the mexEF-oprN operon. MexEF-OprN is quiescent in wild-type cells and its expression occurs following mutations in MexT or MexT-related genes. Indeed, the mexT gene sequence of CCBH4851 had an 8-bp deletion known to render MexT active, causing the induction of mexEF-oprN transcription and a decrease in OprD levels, which characterizes the so-called nfxC-type mutants 30. However, later work showed that this 8-bp deletion alone was not sufficient to activate the transcription of mexE, and the generation of nfxC-type mutants seems to be multifactorial. Indeed, inactivation of mexS, a gene upstream of mexT, seems to be one of these additional factors 31. The mexS gene of CCBH4851 carries a deletion of 1 nucleotide at position 22, causing a frameshift which resulted in the loss of the stop codon. The gene product of the mexS pseudogene showed an alignment of only 7 aa with the MexS wild-type sequence of PAO1. This could suggest the absence of MexS in the CCBH4851 proteome, leading to the overexpression of mexE and other phenotypes related to nfxC-type mutants.
in GyrA and S87L in ParC are well known to play an important role in fluoroquinolone resistance as well as the overexpression of efflux pump systems 32. Once again, the antimicrobial susceptibility testing revealed a MIC of 16 µg/mL for ciprofloxacin, corroborating the observed genotype. Quantification of efflux pump systems expression is required to confirm altered transcription levels, but amino acid substitutions in NalC (repressor of mexA gene), ArmZ (MexZ anti-repressor), PA2018 and PA2019 proteins suggest additional mechanisms leading to the overexpression of mexA and mexX.
In addition to these factors, the amino acid substitutions R79Q and T105A found in AmpC (Table 3) characterize this protein as a variant type named PDC-5. Clinical isolates carrying this variant presented AmpC overexpression, increased beta-lactamase activity and reduced susceptibility to ceftazidime, cefepime, cefpirome, aztreonam, imipenem, and meropenem 14,[32][33][34]. Although SPM-1 is an enzyme capable of hydrolyzing a broad range of beta-lactams, it is known that MBLs have little or no effect against monobactams 2,3. However, CCBH4851 was resistant to aztreonam 6, a monobactam widely available for clinical use. This phenotype can be explained by the presence of PDC-5, depicting a synergistic effect between AmpC and SPM-1 on the beta-lactam resistance mechanisms of CCBH4851.
Not all SNPs found in this work were previously described; however, different mutations in the same genes listed in Table 3 were observed in other clinical isolates, suggesting a recurrence in the mutations' location caused by selective pressure. The variants found in CCBH4851 could be intraclonal, but could still change gene function, thus contributing to the enhanced resistance of ST-277 clones. In fact, the in silico analysis suggests some of these SNPs could affect protein function due to the introduction of amino acids with different properties, such as size, charge, hydrophobicity, often located in protein domains. The prediction indicates noteworthy mutations occurring in genes such as ampD (negative regulator of ampC expression), mucD (repressor of algU alternative sigma factor transcription), oprJ (member of mexCD efflux system), and others, which could affect protein folding and/or function (Supplementary Table S8 online).
The features found in the P. aeruginosa CCBH4851 complete genome appear to be related to its pathoadaptive behavior, since some of them are common among clinical isolates and some are unique to ST-277 clones. The number of unique genes, the number of genes contained in the acquired genomic islands, and the larger size of the CCBH4851 chromosome are consistent with this observation. However, most of the genome is shared with more susceptible strains, suggesting that the high number of mutations found in conserved genes is contributing to the success of this MDR strain. Further validation of the uncharacterized polymorphisms revealed by this study should help increase the understanding of the phenotype of MDR strains. The results provided in the present study should stimulate and contribute to future work leading to the discovery of new antimicrobial targets and to the development of effective therapeutic choices against P. aeruginosa infections.

Methods
Bacterial strains. The focus of the present study was the bacterial strain Pseudomonas aeruginosa CCBH4851. The isolate is available in the Culture Collection of Hospital-Acquired Bacteria (CCBH), located at Fundação Oswaldo Cruz (WDCM947; CGEN022/2010). Other strains used to perform comparative genome analysis are listed in Table 4.
Genome sequencing, reassembly and re-annotation. Whole-genome sequencing of CCBH4851 was previously performed using the Illumina MiSeq platform 6. Genome assembly resulted in a draft genome comprising 150 contigs, which were annotated as described in previous work. Later, an additional whole-genome sequencing was performed using the PacBio platform. Here, de novo assembly was carried out using the MaSuRCA assembler, which allowed the combination of Illumina short reads with PacBio long reads, resulting in an improvement of the original assembly 35. A single contig was obtained and was annotated using a customized pipeline. Briefly, annotation from the closely related strain P. aeruginosa PAO1 was transferred to the CCBH4851 genome using the Rapid Annotation Transfer Tool software 36. Then, CDSs were predicted by GeneMarkS 37. Prediction of rRNAs and tRNAs was performed with the RNAmmer and tRNAscan-SE software, respectively 38,39. The predicted CDSs were functionally annotated based on homology searches against public databases, such as RefSeq 40, COG 41, KEGG 42, UniProt 43 and InterPro 44. Finally, the annotations from previous steps were compared, combined and manually curated. Additionally, PRIAM software was used to assign an enzyme commission number to CDSs 45. All genome visualization was done using the Artemis software 46. The circular plot of the CCBH4851 genome was produced with the Circos software 47.
Comparative genome analysis. Whole-genome comparison was performed to identify similarities and differences between CCBH4851 and each group of strains listed in Table 4. The genomes were analyzed using the bidirectional best-hit (BBH) clustering method based on homology searches using the BLAST algorithm to identify pairs of corresponding genes that are each other's best hit when different genomes are compared 48. All-against-all BLAST alignments were performed using the following parameters: ≥ 90% coverage, ≥ 90% similarity and E value cut-off of 1e-10.
A Python algorithm was applied over BLAST results to identify the BBHs 49. Core, accessory, and unique genomes were analyzed using a MySQL database created with the data generated in previous steps. The CDSs were classified into Clusters of Orthologous Groups (COG) functional categories using the eggNOG-Mapper web application 50. The identification of genomic islands was carried out by IslandViewer 51. Insertion sequences were identified using the ISSaga web application 52. Presence of a CRISPR-Cas system was assessed by the CRISPRCasFinder web application 53. In order to detect genes involved with antimicrobial resistance mechanisms, protein sequences of the CCBH4851 genome were used to perform BLAST searches against CARD using the Resistance Gene Identifier web application 54
value cut-off of 1e-10. Regulatory proteins were predicted using the Predicted Prokaryotic Regulatory Proteins web application 56. Detection of SNPs was performed using the Snippy software based on the alignment of unassembled reads with the P. aeruginosa PAO1 genome sequence using default parameters 9. The hypothetical effect of non-synonymous variants was analyzed using the HOPE web application 57.
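The bidirectional best-hit step can be sketched in a few lines. This is not the algorithm cited above (ref. 49), whose details are not given here; it is a minimal illustration that assumes BLAST hits have been parsed into (query, subject, bitscore) tuples and that bitscore is used to rank hits. The filter applies the thresholds quoted in the text, and all gene identifiers in the toy data are hypothetical.

```python
def passes_filters(coverage_pct, similarity_pct, evalue):
    """Thresholds quoted in the text: >=90% coverage, >=90% similarity, E <= 1e-10."""
    return coverage_pct >= 90 and similarity_pct >= 90 and evalue <= 1e-10


def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples that have
    already passed the filters above.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy hits for illustration only.
a_vs_b = [("a1", "b1", 500), ("a1", "b2", 300), ("a2", "b2", 400), ("a3", "b1", 450)]
b_vs_a = [("b1", "a1", 500), ("b2", "a2", 400), ("b2", "a1", 300)]
pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In the toy data, a3 finds b1 as its best hit, but b1 prefers a1, so a3 is left without an ortholog; genes in that situation are the ones that end up in the accessory or unique sets.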
Phylogenomic analysis. Genome sequences of CCBH4851 and the strains listed in Table 4 were used to perform whole-genome phylogeny analysis. The analysis was performed with the CSI Phylogeny web application using default parameters 58. A core genome phylogeny tree was inferred and edited with FigTree (available at http://tree.bio.ed.ac.uk/software/figtree/).
Antimicrobial susceptibility testing. Antimicrobial susceptibility testing was performed according to the EUCAST guidelines using the broth microdilution method 11. Four major antimicrobial classes were assessed: aminoglycosides (gentamicin), quinolones (ciprofloxacin), beta-lactams (imipenem) and lipopeptides (polymyxin E or colistin). The MIC was determined as the lowest concentration of antimicrobial agent that prevented visible growth of the microorganism. Classification as susceptible, intermediate or resistant was based on EUCAST criteria 59. Briefly, resistance breakpoints for ciprofloxacin, imipenem and colistin are >0.5 µg/mL, >4 µg/mL and >2 µg/mL, respectively. Gentamicin has an "IE" status, meaning that there is insufficient evidence that the organism is a good target for therapy with the agent; an MIC may be reported, but without an S, I or R assignment. For the broth microdilution method, a total of six biological replicates with two technical replicates each were performed across three independent experiments. With the exception of colistin susceptibility testing, which showed growth in skipped wells with no pattern, the same visual reading was observed in all experiments.
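The MIC reading and the resistance breakpoints quoted above can be expressed as a short Python sketch. This is an illustration under stated assumptions, not the procedure used in the study. Only the resistance breakpoints given in the text are encoded, so anything at or below a breakpoint is reported as "not R" rather than split into S and I. Flagging skipped wells is one possible way to mark an ambiguous reading, and the well readings in the example are invented.

```python
# Resistance breakpoints (µg/mL) quoted in the text; None marks the "IE" status.
RESISTANCE_BREAKPOINTS = {
    "ciprofloxacin": 0.5,
    "imipenem": 4.0,
    "colistin": 2.0,
    "gentamicin": None,
}


def read_mic(wells):
    """Read an MIC from (concentration, visible_growth) pairs.

    Returns (mic, skipped): mic is the lowest concentration without visible
    growth (None if every well grew); skipped is True when growth reappears
    at a concentration above that well, i.e. a skipped-well pattern.
    """
    ordered = sorted(wells)  # ascending concentration
    clear = [conc for conc, growth in ordered if not growth]
    if not clear:
        return None, False
    mic = clear[0]
    skipped = any(growth for conc, growth in ordered if conc > mic)
    return mic, skipped


def classify(agent, mic):
    """Return 'IE', 'R' or 'not R' for an MIC in µg/mL (hypothetical labels)."""
    limit = RESISTANCE_BREAKPOINTS[agent]
    if limit is None:
        return "IE"  # MIC may be reported, but no S, I or R category is assigned
    return "R" if mic > limit else "not R"


# Invented readings for illustration only; these are not results from the study.
EXAMPLE_WELLS = [(1, True), (2, True), (4, True), (8, True), (16, False), (32, False)]

if __name__ == "__main__":
    mic, skipped = read_mic(EXAMPLE_WELLS)
    print(mic, skipped, classify("imipenem", mic))
```

Returning the skipped-well flag alongside the MIC keeps an ambiguous plate visible instead of silently reporting the lowest clear well.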
Motility assays. Motility assays were performed as described previously with slight modifications 60. Luria-Bertani (LB) agar was used as growth medium. Swimming plates (0.3% agar) were inoculated with a sterile toothpick and incubated for 16 h at 37 °C. Swarming plates (0.6% agar) were inoculated with a sterile toothpick and incubated for 24 h at 37 °C. For twitching, cells were inoculated with a toothpick through an LB layer (1% agar) to the bottom of the petri dish. Plates were incubated for 24 h at 37 °C. After incubation, the agar was discarded, and the plates were washed to remove unattached cells and stained with crystal violet (1% [wt/vol] solution). All motilities were measured as the circular zone formed by bacterial migration away from the inoculation point. Additionally, the stain in the twitching plates was solubilized in ethanol (95% [vol/vol] solution) and the absorbance was measured at 570 nm using an ethanol solution as blank. With the exception of the stain quantification in the twitching plates, at least three biological replicates with three technical replicates each were performed across three independent experiments. For stain quantification in the twitching plates, a total of five biological replicates with two technical replicates each were performed across two independent experiments.
Biofilm formation assay. The ability of P. aeruginosa to form biofilms was assessed as described previously 61. Briefly, cells were inoculated in 96-well plates containing LB or Mueller-Hinton medium. Plates were incubated for 20-24 h at 37 °C. After incubation, the medium was discarded, and the plates were washed with water to remove unattached cells and stained with crystal violet (1% [wt/vol] solution). Then, the crystal violet was solubilized in ethanol (95% [vol/vol] solution), and the absorbance was measured at 570 nm using an ethanol solution as blank. P. aeruginosa PAO1 was used as positive control. A total of seven biological replicates with at least three technical replicates each were performed across two independent experiments.
Statistical analysis. Statistical analyses were performed in GraphPad Prism software. Student's t test was used to test whether the results of phenotypic experiments were significantly different (P value < 0.05). Data were expressed as mean ± standard error of the mean. In the figures, * equals P value < 0.05, ** equals P value < 0.01, *** equals P value < 0.001 and **** equals P value < 0.0001.
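The summary statistics and the significance notation above can be sketched in Python using only the standard library. The study used GraphPad Prism, so this is an illustration and not the authors' workflow. It assumes an unpaired test with pooled variance (the text does not state which variant was used), the "ns" label for non-significant results is an added convention, and all function names are hypothetical. Obtaining the P value from the t statistic requires the t distribution, which a statistics package such as SciPy provides.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an unpaired, pooled-variance t test."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_stat, dof


def significance_stars(p_value):
    """Map a P value to the asterisk notation used in the figures."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label; the text defines only the starred levels


if __name__ == "__main__":
    # Arbitrary numbers for illustration only; these are not measurements from the study.
    print(mean_sem([1.0, 2.0, 3.0]))
    print(student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
    print(significance_stars(0.003))
```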

Data availability
The genome sequence and annotation data generated in this study have been submitted to the GenBank database (https://www.ncbi.nlm.nih.gov/genbank/) under accession number CP021380.