Contact-dependent growth inhibition systems in Acinetobacter

In bacterial contact-dependent growth inhibition (CDI) systems, CdiA proteins are exported to the outer membrane by cognate CdiB proteins. CdiA binds to receptors on susceptible bacteria and subsequently delivers its C-terminal toxin domain (CdiA-CT) into neighbouring target cells. Whereas self bacteria produce CdiI antitoxins, non-self bacteria lack antitoxins and are therefore inhibited in their growth by CdiA. In silico surveys of pathogenic Acinetobacter genomes have enabled us to identify >40 different CDI systems, which we sorted into two distinct groups. Type-II CdiAs are giant proteins (3711 to 5733 residues) with long arrays of 20-mer repeats. Type-I CdiAs are smaller (1900–2400 residues), lack repeats and feature central heterogeneity (HET) regions, that vary in size and sequence and can be exchanged between CdiA proteins. HET regions in most type-I proteins confer the ability to adopt a coiled-coil conformation. CdiA-CT and pretoxin modules differ significantly between type-I and type-II CdiAs. Moreover, type-II genes only have remnants of genes in their 3′ end regions that have been displaced by the insertion of novel cdi sequences. Type-I and type-II CDI systems are equally abundant in A. baumannii, whereas A. pittii and A. nosocomialis predominantly feature type-I and type-II systems, respectively.

also promote social interactions between isogenic CDI+ cells, facilitating biofilm formation. The genes involved in the process are likely activated by contact-dependent signalling pathways 14 . Biofilm growth plays a crucial role in the persistence of pathogenic strains in infected hosts. Indeed, specific CdiA proteins are key determinants of bacterial virulence in some species 15,16 . CDI systems are accessory genome components acquired by lateral gene transfer events and are conserved in a relatively small number of strains within a species. CDI systems that mediate growth inhibition of non-immune sister cells have recently been identified in a few strains of Acinetobacter baumannii 17,18 . Although A. baumannii is the most clinically important Acinetobacter species 19 , the related species 3 and 13TU, now recognized as A. pittii and A. nosocomialis, respectively 20 , have also been frequently associated with nosocomial infections 21 . These three species, as well as the environmental species A. calcoaceticus, are closely related at the genomic level and are collectively referred to as the A. calcoaceticus-A. baumannii (ACB) complex. Furthermore, the complex was recently expanded to include the pathogenic species A. seifertii 22 and A. dijkshoorniae 23 .
In Acinetobacter, several virulence factors act at the bacterial surface 24 . Recently, we described two A. baumannii surface proteins that stimulate biofilm formation and adhesion to epithelial cells 25 . By wiping out non-self cells and simultaneously stimulating the aggregation of self cells, CdiA proteins may contribute to making Acinetobacter a successful pathogen. Two CdiA-like proteins of 2000 (CdiA2784) and 3711 (CdiA940) amino acids, found in the non-pathogenic Acinetobacter baylyi ADP1 strain, were both shown to inhibit, in a contact-dependent manner, the growth of ADP1 cells lacking the corresponding CdiI immunity proteins 26 .
In this study, systematic in silico analyses revealed that pathogenic Acinetobacter also feature CdiA proteins that significantly differ in size and structural organization. The distribution of the corresponding CDI systems differs among the species of the ACB complex.

Results
Acinetobacter cdi genes are located at different chromosomal sites. Acinetobacter proteins annotated as haemagglutinins in the KEGG (Kyoto Encyclopedia of Genes and Genomes) database were used as queries to search for CdiA-encoding genes in ACB complex genomes deposited in GenBank. Most sequenced bacterial genomes are incomplete, and many are unannotated. Moreover, giant proteins, such as CdiA, are often overlooked, with the corresponding genes annotated as pseudogenes 25 . To circumvent both problems, CdiAs were searched for using tBLASTn. All the CDI systems identified are listed in Supplementary File 1. For each Acinetobacter strain, the sequence type (ST), which was determined with the Acinetobacter Pasteur Multi Locus Sequence Typing (MLST) system 27 , is also provided. In the adopted annotation scheme, all CdiA proteins carry a prefix that identifies the species (bau, pit, nos, cal and bay denote A. baumannii, A. pittii, A. nosocomialis, A. calcoaceticus and A. baylyi proteins, respectively). Thoroughly analysed CDI+ strains are listed in Table 1.
As in other Gram-negative bacteria, the Acinetobacter cdi operons include (in the 5′−3′ order) three genes, cdiB, cdiA, and cdiI, which encode the transporter CdiB, CdiA, and the CdiI immunity protein that antagonizes the CdiA toxin, respectively (Fig. 1). CdiA proteins vary extensively in length and can be roughly sorted into large proteins (~2000 amino acids) and giant proteins (>4000 amino acids) that feature long repeat arrays, herein referred to as type-I and type-II CdiA, respectively. CDI genes identified in species of the ACB complex are located on genomic islands inserted at five chromosomal sites. In particular, sites A, B, Cu, and Cd host type-I genes, whereas site D hosts only type-II genes (Fig. 1). In the islands inserted at sites A and B, the cdi operons are flanked by genes of unknown function. Intriguingly, in the islands located at site Cu, the cdi operons are instead flanked by genes adjacent to the border of site D (Supplementary File 2). This observation suggests both the earlier insertion of an ancestral type-I cdi cluster at site D and the capture of target sequences during the excision process. Terminal repeats, corresponding to target site duplications (TSDs), mark the ends of several type-II CDI islands (Supplementary File 2). Type-I CDI islands do not feature TSDs.
Type-I cdi genes. Type-I cdi genes are plausibly derived from two ancestor gene clusters. This hypothesis is supported by two observations: i) the cdiB-cdiA gene distance is 24 base pairs (bp) in the A and C genes and 46 bp in the B genes; ii) the A and C transporter genes are much closer to each other than they are to B homologs (95% vs 65% similarity, respectively; Supplementary File 3). Alignments of CdiA proteins revealed a more articulated branching of type-I CDI systems (Fig. 2). Type-I CdiA proteins were marked by a prefix that identifies the species and by a letter that denotes the chromosomal site of insertion of the corresponding gene cluster. All CdiA proteins display the same backbone, which is characterized by four main features: i) a 24-residue ESPR (extended signal peptide region, PF13018) motif at the NH2 terminus, which is recognized by the Sec-translocation machinery and is cleaved during export through the inner membrane; ii) a ~140-residue region that is recognized in the NCBI Conserved Domain Database (CDD) as a Haemagg_act domain (haemagglutination activity domain, PF05860), corresponding to the TPS domain involved in CdiA-CdiB interactions; iii) a domain of unknown function (DUF637, PF04830) present in a subset of CdiA proteins from other bacterial species; and iv) a pretoxin PT-VENN module (PF04829) demarcating the variable CdiA-CT region (Fig. 2). In pair-wise comparisons, the similarity between CdiA proteins ranges from 50 to 95%. Type-I CdiA proteins are aligned in Supplementary File 4.
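As a rough illustration of how a pair-wise figure of this kind can be obtained from an alignment, the Python sketch below computes percent identity between two pre-aligned sequences. The function name, the gap convention (columns in which either sequence has a gap are skipped) and the toy sequences are assumptions made for this example only. The values reported in the text are similarities taken from the published alignments, and a similarity score would additionally require a substitution matrix.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are ignored (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from the Supplementary Files.
example = percent_identity("MKV-LA", "MKIALA")
```

Other conventions exist (for example, dividing by the full alignment length or by the shorter sequence), so values from different tools are not always directly comparable.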
Type-I CdiA proteins may adopt a coiled-coil conformation. Aside from the CT region, the primary source of variation among type-I CdiA proteins is the central heterogeneity (HET) region (Fig. 2a). HET regions, the length of which ranges from 28 to 176 residues, are conserved in a few proteins but vary extensively in all others (Supplementary File 5). HET and CT regions vary independently. For instance, although similar proteins, such as cal-C11 and pit-C7 (92% identity) or bau-C2 and pit-C6 (83% identity), have the same CT region, they have different HET regions (Fig. 2). In contrast, the closely related pit-A5 and pit-C5 (87% identity) feature the same HET region but have different CT domains. Secondary structure predictions obtained using PAIRCOIL2 28 and MARCOIL 29 revealed that HET regions may adopt a coiled-coil conformation (Fig. 2b). Coiled coils, which are structural protein motifs in which two or more alpha-helices are coiled together, typically contain a repeated heptameric pattern of hydrophobic and charged amino acids 30 . Most CdiA proteins may adopt a coiled-coil conformation in the HET region, as evidenced by the height of the peaks in the MARCOIL profiles, which are indicative of coiled-coil formation (Supplementary File 6). Only bau-B2, pit-B6, bau-B3 and pit-C6 are not predicted to adopt a coiled-coil conformation.
Scientific Reports | (2019) 9:154 | DOI:10.1038/s41598-018-36427-8
Swapping of HET modules. We identified A. pittii variants of pit-A5 carrying the HET region of A. pittii (either pit-A4 or pit-C7) or A. nosocomialis (nos-C8) CdiA proteins, as well as a single pit-A4 variant carrying the HET region of pit-C7 (Fig. 2c). Variants featuring HET regions of an unknown source (n1-HET and n2-HET) were also identified (Fig. 2c). Of these variants, only pit-A5/n2-HET adopts a coiled-coil conformation (see Supplementary File 6). In most CdiA protein variants, exchanges between "donor" and "recipient" genes were limited to the HET region. A few chimeric proteins featuring the COOH region of pit-B5 and both the NH2 and the HET regions of either bau-B3 or pit-B7 were also identified in A. pittii and A. baumannii isolates (Fig. 2c). DNA alignments revealed that switching from either bau-B3 or pit-B7 to pit-B5 sequences occurred in the same region (see Supplementary File 5).
Multiple type-I CDI systems coexist in A. pittii. Most A. baumannii isolates host a single cluster of type-I cdi genes, but isolates assigned to the ST52 genotype, such as the reference ATCC 19606 strain, carry both B and C type-I genes.
In contrast, more than half of the A. pittii CDI+ strains (69/126) carry multiple cdi gene clusters (see Supplementary File 7). Some combinations are observed more often than others. For instance, pit-B5 genes are associated with pit-C6 genes in 39 strains, and 11 of these strains possess an additional cdi gene, A5/HET-n2, while other partners are present in 6 additional strains. All bau-B3/ and pit-B7/pit-B5 chimeric genes coexist with other cdi genes, mostly with pit-A4 and pit-C5.
Our data do not reflect an over-representation of particular groups of strains. Indeed, the A. pittii strains featuring multiple cdi genes belong to 20 different STs and were isolated from different geographical areas (see Supplementary File 7).
Type-II CDI systems. Type-II cdiB and cdiA genes in the ACB complex strains are separated by 63 bp, suggesting that they are derived from a single ancestor. In A. baumannii and A. nosocomialis, type-II cdi genes map to site D (see Fig. 1), adjacent to a tRNA-trp gene. In contrast, type-II cdi genes are inserted adjacent to a type-3 fimbrial gene cluster in A. pittii. Type-II genes present in the soil-living A. baylyi ADP1 strain and in the A. baumannii SDF strain (isolated from a louse) differ from those present in the ACB complex. In A. baylyi ADP1, the cdiB and cdiA genes are separated by 85 bp. In the SDF strain, the two genes are separated by cdiC, a gene involved in the maturation of CdiA proteins that has been identified in different gamma-proteobacteria 31 . For the sake of simplicity, all type-II CdiA proteins are referred to as D proteins and numbered according to the CT profile.
The analysed type-II CdiA proteins are shown in Fig. 3, and protein alignments are provided in Supplementary File 8. The length of these giant proteins, ranging from 3723 to 5733 residues, is correlated with the size of large repetitive (R) regions consisting of 20-mer repeats similar to those described in B. pertussis haemagglutinins 32 . Within repeats, branched-chain amino acids, glutamine and glycine residues are periodically reiterated in a few specific sequence combinations (Supplementary File 9). The repeat pattern is altered by indels and mutations and is discontinuous, as repeat regions occur in clusters. A detailed analysis of the R regions is beyond the scope of this report. In type-II CdiAs, the R region is located between the TPS domain at the NH2 terminus and a conserved 270-residue region that is partly related to the FhaB domain (COG3210) present in many haemagglutinins (Fig. 3 and Supplementary File 9). Downstream of this domain, CdiA proteins diverge in sequence, accounting for the class 1-3 subdivision. Type-II CdiB transporters do not vary significantly and share 96% similarity.
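The periodic reiteration of specific residue combinations within repeats can be made concrete by counting short overlapping words in a sequence, in the spirit of the word-composition analysis mentioned in the Methods. The following Python sketch is a minimal illustration under stated assumptions: the function names and the toy sequence are hypothetical, and it does not reproduce the output format or options of any particular tool.

```python
from collections import Counter


def count_words(sequence: str, k: int) -> Counter:
    """Count every overlapping k-residue word in a protein sequence."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def most_frequent_words(sequence: str, k: int, top: int = 3):
    """Return the `top` most frequent k-residue words with their counts."""
    return count_words(sequence, k).most_common(top)


# Hypothetical toy sequence, not an actual CdiA repeat.
toy_counts = count_words("GQVGQVGQ", 3)
```

Words that recur far more often than expected from the overall amino acid composition are candidates for the reiterated combinations described above; a proper enrichment test would compare observed against expected frequencies.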
Toxin and pre-toxin modules in type-I and type-II proteins. CT regions differ between type-I and type-II proteins. For instance, although three pairs of proteins from the two groups (i.e., bau-C3 and bau-D10, nos-C9 and bau-D16, and pit-A5 and bau-D19) have the same toxic activity, the toxin domains are embedded within different sequence contexts. Consequently, the corresponding CT modules, as well as the cognate co-expressed CdiI immunity proteins, differ in all pairs. Moreover, PT-VENN modules also differ between type-I and type-II proteins, and PT-VENN modules are replaced by novel pretoxin modules called PT-LPEN in type-II proteins of class 3 (Supplementary File 9).
In most type-II CT regions, the C-terminal toxic domains are preceded by long (150-200 residues) N-upstream modules, which are shared by multiple CT regions (Fig. 3). Upstream modules of E. coli CdiA-CTs regulate toxin transport in the cytoplasm of targeted cells 33 . Remarkably, upstream components are missing in the CT regions of type-I CdiAs, and only bau-B2 and pit-A4 exhibit partial homology downstream of the PT-VENN module (Supplementary File 9). In class-3 type-II CdiA, 20-40 residue modules called SWR (switch regions) connect different upstream sequences to the same toxin module in proteins derived either from different species (bau-D15 and bay-D15; bau-D16 and bau-leaf, a type-II CdiA identified in the Acinetobacter strain Leaf130 that was isolated from Arabidopsis thaliana, see ref. 34 ) or from the same species (bau-D21J and bau-D21L). SWR vary in sequence and are composed of 1 or 2 modules. Sequences homologous to the N-upstream toxin and SWR modules present in bau-D19 and bau-D2 were identified in type-II CdiA proteins from Acinetobacter junii and Acinetobacter seifertii (Fig. 4). Complete CT sequences of class-3 CdiA are shown in Supplementary File 9.
Orphan cdi genes. The 3′ regions of most type-II cdi operons host orphan cdi sequences that encode CdiA-CT regions and/or CdiI immunity proteins (Fig. 5). These segments are remnants of cdi genes displaced by the insertion of novel cdi sequences. We hypothesize that the replacement of the CT region in Acinetobacter CDI systems takes place in a way similar to that described for the T6SS gene clusters in Vibrio cholerae 35 : incoming DNA forms a heteroduplex with homologous cdiA sequences, which promotes the integration of novel non-homologous CT sequences at the 3′ end via illegitimate recombination 36 . Orphan sequences are retained, plausibly because of their potential usefulness, a hypothesis supported by the predominance of antitoxin cdiI genes. Accumulation of cdi gene remnants in the 5′−3′ direction, resulting from a step-wise remodelling of the cdi locus, is evident when comparing the large orphan region of bau-D19 with those of the other class 3 genes (Fig. 5). It appears that the cdi locus was occupied early on by bau-D17 sequences and was subsequently remodelled by bau-D15, bau-D16 and eventually bau-D19 sequences (Fig. 5).
Some orphan genes may be derived from recombination/insertion events, as indicated by an inspection of the 3′ end region of the bau-D7 gene (Fig. 5). Others, instead, may be derived from tandem duplication events, such as the cdi orphans in the nos-D4 and nos-D8 clusters, which are copies of the nos-D4 CT/I region. Similarly, the bau-D12 cdiI gene and its orphan copies, located downstream of the bau-D13 and bau-D20 genes, are flanked at the 3′ end by sequences encoding a CdiA tract conserved in all class 2 proteins (Fig. 5).

Distribution of CDI systems in
Among A. baumannii isolates, 16 and 23 different genotypes feature only type-I or type-II CDI systems, respectively, whereas 28 genotypes feature both systems (Fig. 6). Except for a few type-II cdi clusters restricted to single STs, all CDI systems were observed in multiple genotypes. The distribution of CDI systems differs significantly in A. nosocomialis and A. pittii, which host predominantly type-II and type-I systems, respectively (Fig. 6).
Cladograms in Fig. 6 indicate that CDI systems are not restricted to subsets of A. baumannii strains. This result is consistent with the absence of phylogenetic structuring in this species 27 . Unrooted neighbour-joining phylogenetic analyses of the entire population, which are unfeasible for A. baumannii because the species includes >1000 STs, revealed the absence of phylogenetic structuring in both A. nosocomialis and A. pittii, suggesting a random distribution of cdi genes in both species (Supplementary File 10).
CDI systems were also identified in other Acinetobacter species (Supplementary File 11). The Acinetobacter genus consists of >50 distinct species (http://www.bacterio.net/Acinetobacter.html). A phylogenetic tree, based on the alignment of core-genome proteins 37 , separates Acinetobacter species into two large groups: one group includes A. soli, A. junii, A. baylyi and the species of the ACB complex, while the other includes A. lwoffii, A. johnsonii, A. gerneri, and the closely related species A. guillouiae and A. bereziniae 38 . The identification of type-I and type-II CdiA proteins in half of the limited number of sequenced A. guillouiae and A. bereziniae strains, respectively (Supplementary File 11), suggests that CDI systems are spread throughout the entire genus.

Discussion
Since the first description of CDI systems in E. coli 4 , they have been increasingly recognized as relevant accessory genome components in proteobacteria. In this study, we provide the results of a genome-wide survey of the CDI systems present in pathogenic Acinetobacter genomes. Although Acinetobacter spp. have been primarily isolated from soil, their occurrence in clinical settings has been intensively investigated because they are a major source of nosocomial infections. Unsurprisingly, knowledge of the organization of Acinetobacter genomes is mostly derived from analyses of clinical isolates. All Acinetobacter isolates of medical interest belong to the A. calcoaceticus-A. baumannii (ACB) complex. The relative abundances of sequenced A. baumannii, A. pittii and A. nosocomialis strains (2450, 157, and 72 to date, respectively) largely mirror the frequency of infections caused by each species. Despite the predominance of A. baumannii, the number of sequenced A. pittii and A. nosocomialis genomes is sufficiently high to warrant meaningful comparisons. Monitoring of the distribution of CDI systems among species of the ACB complex is biased both by the over-representation of strains belonging to epidemic lineages and by errors in species identification. Consequently, in this study, we adopted an ST-driven identification system for a coherent classification of CDI-positive clones.
Estimates based on ST profiles revealed CDI systems in roughly 40% of the genotypes identified as A. nosocomialis (17/43 genotypes) and A. pittii (42/97 genotypes) according to the Acinetobacter Pasteur MLST system. In contrast, far fewer CDI systems were identified in the A. calcoaceticus (2/12 genotypes) and A. baumannii (66/943 genotypes) strains. We speculate that the paucity of cdi-positive genotypes may reflect the limited number of sequenced A. calcoaceticus strains and a reduced dissemination/maintenance of CDI systems in A. baumannii.
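For readers who wish to compare the four species directly, the genotype counts quoted above can be converted into percentages. The short Python sketch below does only this arithmetic; the counts are those given in the text, whereas the variable and function names are chosen for this example.

```python
# CDI-positive genotypes / total genotypes per species, as quoted in the text.
counts = {
    "A. nosocomialis": (17, 43),
    "A. pittii": (42, 97),
    "A. calcoaceticus": (2, 12),
    "A. baumannii": (66, 943),
}


def cdi_positive_percent(positive: int, total: int) -> float:
    """Percentage of genotypes carrying at least one CDI system, to one decimal."""
    return round(100.0 * positive / total, 1)


percentages = {
    species: cdi_positive_percent(positive, total)
    for species, (positive, total) in counts.items()
}

for species, value in percentages.items():
    print(f"{species}: {value}%")
```

The resulting figures are about 39.5% and 43.3% for A. nosocomialis and A. pittii, against 16.7% and 7.0% for A. calcoaceticus and A. baumannii.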
Acinetobacter features two distinct types of CdiA proteins that differ significantly in size and organization and exhibit limited homology (40% identity) in the ESPR and TPS domains at the NH2 terminus. Type-II CdiA proteins range in size from 3700 to 6000 residues and, like most CdiA proteins characterized in other bacterial species 5–8 , feature long arrays of 20-mer repeats. The repeat region is not essential for contact-dependent growth inhibition, as supported by the analysis of type-I and type-II CdiAs in A. baylyi 26 . Body heterogeneities allowed us to sort type-II CdiA proteins into three classes. As in many CdiA proteins from other species, in most class-1 proteins and in all class-2 proteins, CdiA-CT regions are flanked upstream by PT-VENN modules. However, in class-3 proteins, PT-VENN modules are replaced by PT-LPEN, i.e., novel pretoxins that are unrelated to PT-VENN and to the alternative pretoxin modules identified in Burkholderia 39 , Neisseria 40 , and Pseudomonas 8 . Most CdiA-CT regions are bipartite and feature upstream N-modules that are common to multiple CdiA proteins and are plausibly involved in toxin trafficking inside targeted cells 33 . CdiA-CT regions located downstream of PT-LPEN pretoxins have a more complex structure, in which the profile of each region results from the combinatorial assembly of multiple modules (Fig. 4). Similar mosaic structures may not be unique to Acinetobacter, and the construction and assay of site-directed mutants will eventually elucidate the role of the various CT modules in class 3 CdiA proteins.
As in other proteobacteria, Acinetobacter cdi genes are flanked by the remnants of cdi genes that were destroyed by the insertion of novel cdiA/cdiI sequences. Indeed, we observed that repeated insertions sequentially dislodged orphan modules in a 5′−3′ manner (Fig. 5). Effector (E) and immunity (I) genes were similarly displaced by the insertions of novel E/I pairs in T6SS gene clusters in V. cholerae 35 . Although effector genes were lost, the I module genes were retained, allowing for protection against bacteria that were still producing the old effector 35 . Orphan gene fragments retain sequences that may enable them to fuse with the upstream cdiA gene by homologous recombination. Orphan CT/I pairs have been reported to be able to change the toxicity profile of CdiA proteins 5,41 . CT/I reprogramming is undoubtedly harmful because of the loss of immunity against neighbouring unrearranged cells producing the old toxin. However, a few mutants equipped with the new toxin may survive and turn into predators. In contrast, recombination events driven by body or SWR orphan modules might be lethal, leading to the formation of I− strains that are rapidly eliminated.
Type-I CdiA proteins differ from type-II proteins in many respects. Repeats present in type-II proteins fold into right-handed parallel beta-helices and may form structures protruding 40-140 nanometres from the surface of CDI+ bacteria 6 . Type-I proteins lack repeat sequences of any length and composition, suggesting that type-I and type-II proteins may be differently exposed on the cell surface and that their interaction with targeted cells may also differ. Type-I CdiA-CT regions lack the long upstream components present in many type-II proteins. Type-I CDI systems also differ from type-II CDI systems in that they lack orphan cdi sequences. There is no obvious explanation for this discrepancy. Type-II cdi genes may have recombination hotspots that make them prone to recombination events, eventually leading to the formation of orphan sequences. Alternatively, type-I CDI clusters may lack orphan sequences because, being evolutionarily younger than type-II CDI clusters, they have not yet experienced cycles of de novo insertions. The two CDI systems seem to have evolved independently of each other upon speciation of taxa within the ACB complex. The relative abundances of type-I and type-II systems are comparable in A. baumannii. In contrast, type-I systems prevail in A. pittii, while type-II systems prevail in A. nosocomialis (Fig. 6). The different type-I CdiA proteins are largely similar, although a sequence alignment highlighted a non-homologous central region in these proteins. The HET region varies in size and sequence content and may differ in otherwise similar proteins. Secondary structure prediction showed that most HET regions can adopt a coiled-coil conformation, while analogous regions were not identified in type-II CdiA proteins. Coiled-coil domains are structural motifs used to facilitate protein oligomerization, separate functional domains, and modulate interactions with partner proteins 42 . Surprisingly, derivatives of the A. pittii CdiA proteins pit-A5 and pit-A4 carry HET regions that were "stolen" from other A. pittii or A. nosocomialis CdiA proteins, or even from unknown proteins. The changes occurring in all variants are due to site-specific recombination events that selectively replace HET regions. In some pit-A5 chimaeras, both the NH2 and HET regions are derived from other CdiA proteins. The need for all these changes, as well as the function of HET regions, is unknown. We hypothesize that the HET region may modulate protein-protein interactions by influencing cell surface presentation of type-I CdiA proteins and that the variety of HET regions may be associated with a similar variety of interacting protein partners.
We searched other proteobacteria for haemagglutinin-like proteins that are approximately 2000 amino acids in length and feature DUF637 and PT-VENN modules. Intriguingly, proteins matching these search criteria were identified in Neisseria meningitidis FAM18 (GenBank AM421808, gene 444), Serratia plymuthica AS9 (GenBank CP002773, gene 3742), and Moraxella catarrhalis 25239 (GenBank CP007669, gene 760). Equally noteworthy is that all these proteins featured regions able to adopt a coiled-coil conformation. Future work is warranted to assess the occurrence of type-I CDI systems in other bacterial species, to further characterize HET modules, to identify their partners, and to ascertain the role of HET regions in the activity of CdiA proteins.
The results of this work further our knowledge of the intricate and fascinating world of CDI systems and pave the way for functional studies aimed at understanding the role of CdiA proteins in Acinetobacter.

Methods
Acinetobacter FHA-like proteins identified in the KEGG database were used as queries for homology searches in GenBank. tBLASTn searches were carried out against both complete and draft genomes classified as Acinetobacter (taxid:469), A. baumannii (taxid:470), A. nosocomialis (taxid:106654), and A. pittii (taxid:48296). In unannotated contigs, the proteins of interest were identified with ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/) or with EMBOSS Sixpack (https://www.ebi.ac.uk/Tools/st/emboss_sixpack/). Protein alignments were generated using MultAlin 43 , and protein domains were searched for in the NCBI Conserved Domain Database 44 .
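The idea behind locating proteins in unannotated contigs can be sketched as a six-frame translation followed by a search for the longest stop-free stretch that begins with a methionine. The Python code below is a simplified stand-in written for illustration, not the algorithm of ORFfinder or EMBOSS Sixpack. Its assumptions are that only ATG is treated as a start codon (bacterial alternative starts are ignored), that a stretch running to the end of the contig without a stop codon is accepted, and that ambiguous codons are translated as X. All function names and the toy sequence are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Codon table built from the conventional TCAG ordering of the genetic code.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict:
    """Translate the three forward and three reverse frames of a sequence."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_orf(dna: str) -> str:
    """Longest stop-free stretch starting with M across all six frames."""
    best = ""
    for protein in six_frame_translations(dna).values():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


# Hypothetical toy contig.
toy_orf = longest_orf("ATGAAAGTTTAA")
```

For giant proteins such as CdiA, frameshifts caused by sequencing errors can split one gene into several apparent ORFs, which is one reason why protein-to-nucleotide searches were preferred over existing annotations.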
Coiled-coil structures in type-I CdiAs were searched for with the programs PAIRCOIL2 and MARCOIL 28,29 . The sequence type of CDI+ strains was determined by querying either the genome or the pool of contig sequences of the strain of interest in FASTA format against the A. baumannii MLST database 27 .
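The heptad pattern that coiled-coil predictors look for can be illustrated with a deliberately naive score: the fraction of 'a' and 'd' heptad positions occupied by hydrophobic residues, maximized over the seven possible registers. This Python sketch is not the PAIRCOIL2 or MARCOIL algorithm (both rely on trained probabilistic models); the residue set, the function names and the toy sequences are assumptions made for this example.

```python
# Residues counted as hydrophobic in this sketch (an assumption, not a standard).
HYDROPHOBIC = set("AILMVFWY")


def heptad_score(window: str, register: int) -> float:
    """Fraction of 'a' and 'd' positions that are hydrophobic for one register."""
    hits = 0
    total = 0
    for i, residue in enumerate(window.upper()):
        position = (i + register) % 7  # 0 corresponds to 'a', 3 to 'd'
        if position in (0, 3):
            total += 1
            if residue in HYDROPHOBIC:
                hits += 1
    return hits / total if total else 0.0


def best_heptad_score(window: str) -> float:
    """Highest heptad score over the seven possible registers."""
    return max(heptad_score(window, register) for register in range(7))


# Hypothetical idealized heptads (L at 'a' and 'd') and a glycine-only control.
ideal = best_heptad_score("LKELEEKLKELEEK")
control = best_heptad_score("GGGGGGGGGGGGGG")
```

A score of this kind ignores charge patterning and helix propensity, so it should only be read as a picture of the periodicity, not as a prediction.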
The organization of repeat sequences in type-II CdiA was investigated with RADAR (Rapid Automatic Detection and Alignment of Repeats; https://www.ebi.ac.uk/Tools/pfa/radar/). The enrichment in particular amino acid strings was detected with COMPSEQ (http://www.hpa-bioinfotools.org.uk/pise/compseq.html). A. baumannii and A. nosocomialis ST dendrograms were generated from ClustalW alignments of concatenated allele sequences of the cpn60, fusA, gltA, pyrG, recA, rplB, and rpoB gene segments of the STs of interest, extracted from the Acinetobacter baumannii MLST Pasteur database.
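The concatenation step that precedes the alignment can be sketched as follows. The locus order is the one listed above; the dictionary layout, the function names, the toy allele sequences and the simple proportion-of-differences distance are assumptions introduced for this illustration. The dendrograms in the study were derived from ClustalW alignments, not from this distance.

```python
# The seven loci of the Pasteur MLST scheme, in the order given in the text.
LOCI = ("cpn60", "fusA", "gltA", "pyrG", "recA", "rplB", "rpoB")


def concatenate_alleles(alleles: dict) -> str:
    """Join the allele sequences of one ST in a fixed locus order."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise KeyError(f"missing loci: {', '.join(missing)}")
    return "".join(alleles[locus].upper() for locus in LOCI)


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        return 0.0
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical toy alleles: two STs that differ at a single rpoB position.
st_x = {locus: "ACGT" for locus in LOCI}
st_y = dict(st_x, rpoB="ACGA")
distance = p_distance(concatenate_alleles(st_x), concatenate_alleles(st_y))
```

Keeping the locus order fixed matters because every ST must contribute the same positions to the concatenated alignment.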

Data Availability
All data generated or analysed during this study are included in this published article and the Supplementary Information files.