Introduction

Acinetobacter baumannii is an emergent nosocomial pathogen of increasing interest due to its widespread resistance to antimicrobials1. A. baumannii is truly a global pathogen, with isolates collected from hospitals around the world2,3, including injured soldiers from Iraq4 and Afghanistan5. The concern is the emergence of multidrug-resistant (MDR)6 and extremely drug-resistant (XDR)7 isolates that are resistant to most currently used therapeutics. Genes that confer resistance in A. baumannii have been documented, including class D beta-lactamases8, such as blaOXA-51-like, which appears to be highly conserved across A. baumannii9. The insertion element ISAba1 is required for carbapenem resistance in blaOXA-51-like positive isolates10.

The genome of A. baumannii is highly plastic11, with much of the evolution characterized by recombination12 and horizontal gene transfer13. The core genome phylogeny of A. baumannii demonstrates highly divergent genomes, with the emergence of a few highly successful clonal lineages12,14. While the evolution of these lineages is anticipated to be clonal, no in depth evolutionary studies have been performed to look at the fine scale evolution, recombination and gene composition of these clades.

Infections caused by A. baumannii are increasing worldwide, possibly due to the rapid expansion of a selected number of genetically distinct lineages12,14. Three of these lineages, known as international clones I to III, represent globally distributed and ubiquitous clades15. Other successful lineages, which spread in single institutions and/or worldwide, have been identified in the population structure of A. baumannii using different genotyping methods, including sequence type ST2514. A. baumannii strains assigned to ST25 were responsible for epidemics in different European countries16,17,18,19,20 and the United Arab Emirates21 and were isolated as endemic or sporadic isolates in South America22 and Asia18, respectively. ST25 genomes are of increasing interest due to increasing antimicrobial resistance14 found within novel genomic resistance elements23.

The aim of the current study was to analyze the genomic epidemiology of 19 A. baumannii strains belonging to the ST25 lineage according to Pasteur’s MLST scheme15. Understanding the composition and evolution of one successful global lineage may help in understanding the genetic basis for the emergence and proliferation of global clones of A. baumannii.

Methods

Isolates

The collection of ST25 isolates analyzed in this study includes 19 strains: three sporadic strains from Leiden’s collection isolated during 1985, 2000 and 2002; 13 strains representative of epidemics or endemic circulation in different countries; three additional sporadic isolates selected because of their antimicrobial susceptibility profile and mechanisms of antimicrobial resistance15,16,17,18,19,20,21,22 (Table 1). Seven of these isolates were chosen for sequencing.

Table 1 Metadata associated with ST25 isolates analyzed in this study.

Pulsed-field gel electrophoresis (PFGE) typing and dendrogram analysis

ApaI DNA macrorestriction and PFGE of A. baumannii isolates were performed as previously reported24. PFGE profiles were compared using the GelCompar II v. 4.6 software package (Applied Maths, Sint-Martens-Latem, Belgium). Clustering was based on the un-weighted pair-group method with arithmetic averages (UPGMA). The Dice correlation coefficient was used to analyze the similarities of the banding patterns with a tolerance of 1%. Interpretation of chromosomal DNA restriction patterns was based on the criteria of Tenover et al.25 and also on a similarity of >85% at dendrogram analysis, to indicate strain relatedness.

MLST typing

Multi-locus sequence typing (MLST) analysis was performed using the Institut Pasteur’s MLST scheme as previously described15. Allele sequence and MLST profile definitions were assigned using the sequence and profile definitions available at http://pubmlst.org/abaumannii/. The MLST results were confirmed from the whole genome sequence analysis using a publically available script: https://github.com/Victorian-Bioinformatics-Consortium/mlst.

DNA extraction, sequencing, assembly

DNA was extracted with the GenElute DNA extraction kit (Sigma-Aldrich, Milan, Italy). Sequence libraries were generated from extracted DNA as reported previously9. Genomes were sequenced to high depth on the IlluminaMiSeq platform. Resulting reads were adapter trimmed with Trimmomatic26, error corrected with Hammer27 and assembled with SPAdes v3.128. The read coverage across each contig was evaluated and contigs of an anomalous coverage, due to read crossover in multiplexed runs, were manually removed. The assembly stats for each genome are shown in Supplementary Table S1. All assemblies and raw reads were deposited in public databases (accession numbers in Supplementary Table S1). Annotation was performed with the NCBI PGAP pipeline.

Antimicrobial susceptibility testing

Antimicrobial susceptibility testing was performed using the Vitek 2 system (bioMérieux, Marcy l’ Étoile, France). Imipenem, meropenem and colistin minimum inhibitory concentrations (MICs) were determined by agar dilution and Etest (bioMérieux) and interpreted using the EUCAST29 and CLSI 201230 interpretative criteria.

in silico antimicrobial susceptibility profiling

To identify previously characterized genes associated with antimicrobial resistance in our dataset, raw reads were mapped to the ResFinder database31 with the SRST2 pipeline32; raw reads were used to determine the percentage of the reference gene covered, but also could identify variants compared to the reference database. SRST2 produces a table of all positive hits identified in each genome.

Biofilm formation

Biofilm formation was determined as previously described55. Three independent experiments, each one performed in triplicate, were conducted for each strain. Biofilms were grown in the presence and absence of 0.5 mg/L imipenem.

Cell adhesion assays

Adherence of A. baumannii strains to A549 cells (human type 2 pneumocytes) was determined as described previously55, with minor modifications. In brief, ~105 A549 cells were infected with ~107 bacterial CFU and incubated for 60 min at 37 °C in 5% CO2 (v/v) atmosphere. Non-adherent bacterial cells were removed by washing with PBS. Infected cells were lysed by the addition of 1 ml distilled water and serial 10-fold dilutions were plated on LB agar to determine the number of CFU of adherent bacteria. To determine adherent and invading bacteria, A549 cells were infected with A. baumannii strains as described above. The monolayers were then treated with 1 ml of fresh culture medium containing 5 mg/L of colistin sulfate (Sigma-Aldrich, Milan, Italy) for 30 min, the shortest time point that resulted in the killing of all extracellular bacteria added to the monolayers. Afterwards, the cells were washed with PBS, harvested with trypsin and lysed with sterile distilled water. Dilutions from harvested samples were inoculated on LB agar plates and bacterial colony counts were estimated after overnight incubation at 37 °C. Each experiment was performed in triplicate.

Statistical analysis

Data were analyzed using a Statistical Package for the Social Sciences Version 13.0 (SPSS Inc., Chicago, IL, USA). Differences between mean values were tested for significance by performing either unpaired, two-tailed Student’s t-tests or one-way ANOVA analysis followed by Tukey’s multiple-comparison test, when appropriate. A P value < 0.05 was considered to be statistically significant. Correlations were evaluated by regression analysis using the Pearson’s correlation coefficient (r).

Single nucleotide polymorphism (SNP) identification and phylogenetics

For ST25 comparisons, all SNPs were identified by mapping raw reads against A. baumannii AB307-0294 (NC_011595)33 with BWA-MEM34 and calling SNPs with the UnifiedGenotyper method in GATK35. For external genome assemblies, whole genome alignments were generated with nucmer36 and variants were identified by direct mapping of each query to the reference. These methods were wrapped by the Northern Arizona SNP Pipeline (NASP) (http://tgennorth.github.io/NASP/)37. A phylogeny was inferred from the resulting concatenated SNP alignment with a maximum likelihood algorithm in RaxML v838. The Retention Index (RI) value39, which demonstrates how consistent the nucleotide character states are with the phylogeny, was calculated with Phangorn40.

For the global A. baumannii phylogeny, a set of 572 reference genomes (Supplementary Table S2) were downloaded from Patric41. Genome assemblies were aligned against AB307-0294 with NASP. A maximum likelihood phylogeny was inferred on this alignment with RaxML. Genomes were pruned from the phylogeny to only reflect the major sequence types. Clades were collapsed in ARB42.

LS-BSR analysis

To look for differential gene conservation, the Large-Scale Blast Score Ratio (LS-BSR) pipeline43 was employed. In this method, all coding regions (CDSs) predicted by Prodigal44 are clustered with USEARCH45 at an ID of 90%. Each resulting centroid, which is the most representative sequence of each cluster, is then aligned against itself with BLAT46 to obtain the reference bit score. Each centroid is then separately aligned against each genome assembly with BLAT to obtain the query bit score. Dividing the query bit by the reference bit score returns the BLAST Score Ratio (BSR)47. Unique genomic regions were identified by comparing all CDSs between groups and considering a region to be unique if it had a BSR value > 0.8 in target genomes and a BSR value < 0.4 in all non-target genomes.

Recombination analysis

The FASTA output of NASP was converted to NEXUS using Readseq48. The Retention Index for each base was then calculated with Paup v4a14049; the specific Paup commands are publicly available (https://gist.github.com/jasonsahl/a66afa55371d7d916a0e). The SNP density (SD), or number of parsimony-informative (PI) SNPs across a genomic interval, was calculated across 1-Kb, non-overlapping windows, compared to the reference genome of A. baumannii AB307-0294; PI SNPs are those that that contain at least two types of nucleotides and occur in a minimum of two genomes. The number of homoplasious SNPs, based on a per-base RI value < 0.5, was also calculated across the same window. The Homoplasy Density (HD) value was calculated by dividing the number of homoplasious SNPs (those SNPs that are inconsistent with the tree topology) by the total number of PI SNPs; a script to perform these functions has been published previously50. Visualization was performed by Circos51. Core genome regions, or those regions conserved across all genomes tested, were identified from the NASP output, where a call was made in all genomes.

Plasmid analysis

CDSs predicted by Prodigal for 40 plasmids identified in A. baumannii (Supplementary Table S3) were mapped across ST25 genomes with LS-BSR and BLAT. Following manual curation of screened CDSs, the conservation of genes in associated plasmids was visualized with the interactive tree of life52. Only a subset of CDSs was selected to demonstrate the variability in plasmid content across isolates.

Gene screen

The distribution of several genes associated with virulence or antimicrobial resistance was determined across all ST25 genomes. This included AbaR1, which has previously been shown to be missing from A. baumannii 419053. This region (Coordinates 3702770-3602770 in A. baumannii AYE) was parsed out of A. baumannii AYE and coding regions were predicted with Prodigal. Other resistance islands, including AbaR4 (JN107991) and the G7 plasmid that contains AbaR3 (KF669606) were also screened. All CDSs were then compared against all ST25 genomes with LS-BSR and BLAT. An additional set of genes previously associated with virulence in A. baumannii (Supplementary Table S4) was also screened against ST25 genomes with LS-BSR.

Results

Isolates analyzed

In addition to ST25 genomes deposited in public databases, we selected a set of isolates to expand the diversity of this global lineage. The following seven isolates in our collection were selected for WGS analysis in addition to WGS data of strain 4190 already available in GenBank54: strains RUH1486 and NM3 are susceptible and MDR epidemic isolates, respectively; carbapenem-resistant strains 4390 and 741019 carry different carbapenemase genes (blaOXA-58 versus blaOXA-23) but both belong to the major PFGE type E (Fig. 1), isolated during epidemics in different countries (Table 1); strain 161/07 contains a distinct carbapenemase (NDM-1); strain LUH6220 has a MDR phenotype but is susceptible to carbapenems; strain LUH7841 is susceptible to most antimicrobials (Supplementary Table S5).

Figure 1
figure 1

Genotypic analysis of PFGE profiles of A. baumannii strains included in the study.

Percentage of similarity at dendogram analysis and position and tolerance values of the DICE correlation coefficient used in clustering are shown. Sizes in kilobases (kb) of lambda DNA molecular mass markers are indicated above the PFGE profiles. Strain number, year/country of isolation, PFGE types and subtypes and Multi-locus Sequence types are shown on the right of each profile.

PFGE Analysis

PFGE analysis identified ten unrelated PFGE types (A-J), with six PFGE subtypes within these types (Fig. 1) (C1, E1-E4, I1). Interestingly, PFGE type A included the susceptible first isolate of our collection (RUH1486) and the NM3 MDR strain isolated during an epidemic in United Arab Emirates; six strains from Greece, Argentina, Sweden or Abu Dhabi Emirates were assigned to PFGE type E or PFGE subtypes E1-E4, while two XDR isolates from Argentina and Sweden showed identical PFGE type H (Table 1 and Fig. 1).

MLST

MLST with the Pasteur system15 was performed on 19 ST25 genomes, although only 7 were subjected to whole genome sequencing (WGS). Of all ST25 isolates analyzed, including a set of reference genomes from GenBank, eighteen were assigned to ST25, while one to ST402 (LUH7841), which is a single-locus variant of ST25. In silico MLST confirmed sequence type assignments for all genomes where WGS data exists (Supplementary Table S1).

Antimicrobial susceptibility testing

The antimicrobial susceptibility profiles of 19 ST25 A. baumannii strains included in the study are shown in Supplementary Table S5. Two strains were classified as susceptible, six and eleven as MDR and XDR, respectively, according to Magiorakos et al., 201229. Five out of six MDR strains and all 11 XDR strains showed resistance to carbapenems and contained class D or class B carbapenemases.

In silico antimicrobial resistance profiling

As a complement to the laboratory-determined antimicrobial susceptibility profiles, in silico profiles were generated for each sequenced genome, using the ResFinder database31 in conjunction with the SRST2 pipeline32. The results demonstrate that resistance mechanisms were identified in the genomes tested for only a few classes of antimicrobials (Table 2), demonstrating limitations in predicting the resistance phenotype from the genotype. Carbapenemase genes were found in the genomes of 161/07 (blaNDM-1, blaOXA-64), 741019 and NM3 (blaOXA-64, blaOXA-23) and 4390 and LUH6220 (blaOXA-64). The blaOXA-64 gene (AY750907), which is also known as blaOXA-51-like, is conserved in all ST25 genomes tested (Table 2). However, the ISAba1 insertion sequence is missing in carbapenem-susceptible isolates that are blaOXA-64 positive, while present in resistant isolates, which confirms published results that this sequence is required for carbapenem resistance10.

Table 2 in silico analysis of genes associated with antimicrobial resistance.

Biofilm formation and pneumocyte adherence

Our previous results demonstrated that the ability to form biofilm and adherence to cultured pneumocytes was significantly higher for A. baumannnii strains assigned to ST25 and ST2 compared to other STs55. The biofilm growth on abiotic surfaces and adherence/invasion to cultured A549 pneumocytes were assessed for the 19 strains included in this study. As demonstrated in Fig. 2 panel A, the 19 strains assigned to ST25 and ST402 and strain ACICU assigned to ST2 generally demonstrated significantly (p < 0.05) greater biofilm growth than strain AYE assigned to ST1, although variability in biofilm growth was observed among ST25 strains. Exposure to sub-inhibitory concentrations of imipenem significantly (p < 0.05) stimulated biofilm growth in strains AYE and ACICU assigned to ST1 and ST2, respectively, but not in the strains assigned to ST25 or ST402 (Fig. 2, panel B). We next investigated the ability of A. baumannii strains to adhere to A549 human alveolar epithelial cells. All A. baumannii strains assigned to ST25 and ST2 (ACICU) strain showed a significantly higher adherence to A549 human bronchial cells compared with ST1 strain AYE (Supplementary Fig. S1) (p < 0.01). On the other hand, ST25 A. baumannii strains were not able to invade A549 cells human alveolar cells. Also, a similar number of bacteria adhered to A549 cells when the monolayers were incubated with A. baumannii strains for 60 min at 4 °C, i.e. under conditions that do not allow for tissue invasion.

Figure 2
figure 2

Biofilm variation between ST25 genomes, a ST1 genome (AYE) and a ST2 genome (ACICU).

Error bars represent the standard deviation between biological replicates. Differences in biofilm production were calculated with a two-tailed t-test. Isolates were grown in the absence (A) or presence (B) of imipenem (0.5 mg/L).

Sequencing and comparative genomics

WGS of seven strains was performed and compared to whole genome sequences of 12 A. baumannii strains assigned to ST25 available in GenBank (Supplementary Table S1) and to 572 non-ST25 A. baumannii reference genomes (Supplementary Table S2). The core genome phylogeny based on 1.15Mb of conserved sequence demonstrated the position of the ST25 lineage (Fig. 3) in relation to other globally-relevant lineages. The retention index (RI) of the concatenated SNP alignment was 0.85, demonstrating significant homoplasy likely due to recombination and introducing uncertainty in the phylogenetic placement, especially with regards to deeply branching nodes and long branches56. A phylogeny of just the ST25 genomes (Fig. 4) also demonstrated homoplasy (RI = 0.84), which demonstrates that although ST25 is a lineage with closely related genomes, the evolution of this group has also been partially driven by recombination; the core genome size of ST25 genomes in relation to AB307-0294 was 3Mb. As anticipated, the core genome phylogeny demonstrated much different relationships than were obtained by the PFGE cluster dendrogram (Figs 1 and 4). For example, genomes RUH1486 and NM3 both share the same PFGE type (Fig. 1), but are significantly different based on the core genome phylogeny (Fig. 4).

Figure 3
figure 3

A core genome single nucleotide polymorphism (SNP) phylogeny of 597 A. baumannii genomes.

The phylogeny was inferred with RAxML38 from a concatenation of ~104,000 SNPs compared to the reference genome of AB307-0294. Sequence types were identified from genome assemblies. Genomes without close relatives in established sequence types were manually pruned from the tree and groups were collapsed with ARB42. The phylogeny was rooted by first including an outgroup from A. nosocomialis, then re-running the analysis with only A. baumannii and rooting on the most basal genome from the original analysis.

Figure 4
figure 4

A core genome single nucleotide polymorphism (SNP) phylogeny of ST25 genomes.

The phylogeny was inferred with RAxML from a concatenation of ~24,000 SNPs compared to the reference genome of AB307-029433 with 100 bootstrap replicates. Each genome was annotated with its antimicrobial susceptibility information, where available. The tree was rooted according to the most basal genome isolated from the global phylogeny (Fig. 3).

Recombination in the ST25 lineage

To demonstrate both the extent and location of recombination in the ST25 lineage in relation to the genome of AB307-0294, a homoplasy density analysis was performed50. Considering all of the A. baumannii genomes (n = 597), the homoplasy appears to be distributed equally across the reference chromosome, with no isolated regions of recombination (Supplementary Fig. S2). When only considering the ST25 genomes, clear regions have likely been recombined between isolates (Fig. 5, panel A). However, fragments in the core genome, using AB307-0294 as the reference, still generally give a strong phylogenetic signal (Fig. 5, panel B). The annotation of selected regions associated with recombination is shown in Supplementary Table S6. A comprehensive list of HD values across all regions in the reference chromosome is also available (https://gist.github.com/5e1cab0b85c73de7c6d6.git).

Figure 5
figure 5

Homoplasy density (HD) ratio analysis of ST25 genomes.

(Panel A): Parsimony informative (PI) single nucleotide polymorphisms (SNPs) were identified across 1-Kb, non-overlapping windows (SNP density, or SD), compared to the reference genome of AB307-0294. Homoplasious SNPs were identified by a Retention index value < 0.05. The HD was calculated by dividing the number of homoplasious SNPs by the total number of PI SNPs. The SD and HD values were visualized with Circos51. Core genome regions were identified where there was a call in all query genomes compared to the reference genome. (Panel B): The distribution of 1-kb, non-overlapping regions, based on the HD values.

Plasmid composition

The horizontal gene transfer of plasmids was analyzed in ST25 genomes, using the composition of 40 previously characterized A. baumannii plasmids (Supplementary Table S3). Coding regions were predicted for all the plasmids using Prodigal and they were compared against 19 ST25 genomes using LS-BSR. The results demonstrate that the plasmid content is highly variable across sequenced genomes (Fig. 6), although this method fails to discriminate between genes present on plasmids or the chromosome. The susceptible genome RUH1486 appeared to not contain any annotated plasmid.

Figure 6
figure 6

The ST25 phylogeny associated with a heatmap of coding regions predicted from plasmids identified in A. baumannii genomes.

The heatmap was generated from LS-BSR43 output and was visualized with the interactive tree of life52. GenBank accession numbers are listed for each queried coding region.

Unique genomic regions

The complete genetic content for 597 A. baumannii genomes was compared using LS-BSR. By using default values in LS-BSR, a single coding region was found to be present in ST25 genomes and absent from all others; this region corresponds to a large (~14600 nucleotides) hemagglutinin repeat protein (WP_002016208.1). While portions of this gene are conserved in other A. baumannii genomes, the complete gene structure is unique to ST25 genomes and could potentially serve as a diagnostic tool for the surveillance of this global lineage.

Distribution of virulence associated genes

Virulence associated genes (Supplementary Table S4) were screened against ST25 genomes with LS-BSR and BLAT. The results demonstrate variability in composition across four genes. Perhaps the most striking difference is in ompF, which is highly conserved in only three genomes, each on the same branch of the phylogeny (Supplementary Fig. S3).

Distribution of resistance islands

Coding regions from three previously characterized resistance islands were screened against ST25 genomes with LS-BSR. The results demonstrate that AbaR1 and AbaR4 were sparsely distributed across ST25 genomes (Supplementary Fig. S4). However, the AbaR3 resistance island was highly conserved across two lineages in the ST25 phylogeny, likely demonstrating independent acquisition.

Discussion

A. baumannii is a globally distributed nosocomial pathogen associated with clinical infections that are difficult to treat due to widespread antimicrobial resistance. WGS has begun to demonstrate the phylogenetic diversity of this pathogen, which seems largely driven by homologous recombination12. And although phylogenetic diversity has been documented, many genomes sequenced to date fall into clearly defined clonal lineages, such as ST1 and ST2, which have been identified worldwide14. Genetic diversity within each of these sequence types has largely focused on the diversity of antimicrobial resistance islands57,58,59,60 or individual loci61. The focus of this study was to perform a comprehensive genomics analysis of the ST25 lineage, which was isolated in different countries and was responsible for epidemics worldwide16,17,18,19,20,21,22, to better understand the genotypic and phenotypic properties behind the worldwide distribution and evolution of a successful lineage of A. baumannii.

Phenotypic diversity was observed within ST25, including differences in biofilm formation, antimicrobial susceptibility and pneumocyte adherence. In terms of biofilm formation, ST25 isolates produce a significantly higher amount of biofilm than a single ST1 representative in the absence of antimicrobials (Fig. 2, panel A). Although the relationship between biofilm formation and virulence has not been solidified in A. baumannii62, biofilms have been associated with resistance to antimicrobials63, pathogenensis64 as well as resistance to host factors65. The presence of imipenem did not affect biofilm production in ST25 isolates, suggesting that these mechanisms are constitutively expressed.

Antimicrobial susceptibility testing demonstrated varied susceptibility profiles within ST25 isolates (Supplementary Table S5). In silico profiles using WGS data against the ResFinder database could predict laboratory resistance for only a subset of antimicrobials (Table 2). This demonstrates that although antimicrobial resistance databases are useful for predicting resistance for some classes of antimicrobials, additional experimentation is required to fully understand the genetic basis for antimicrobial resistance in A. baumannii. In terms of adherence, ST25 genomes were demonstrated to adhere better than a representative from the ST1 clonal lineage. Although this may be due to the presence of a unique haemagglutinin identified in ST25 and absent from all other A. baumannii genomes, additional experimentation is required.

The genetic diversity of ST25 was demonstrated in multiple ways, including the visualization of a core genome single nucleotide polymorphism (SNP) phylogeny (Figs 3 and 4); a similar topology was observed compared to a recent CRISPR-subtyping analysis66. A homoplasy density approach demonstrated that much of the SNP density that defines the phylogenetic structure of ST25 is due to homoplasy (Fig. 5, panel A), most likely resulting from homologous recombination. The extent of homoplasy observed in the global phylogeny suggests that A. baumannii does not evolve in a tree like manner and different methods may better represent the evolution in this highly recombinant pathogen.

In addition to SNP analyses, comparative genomics demonstrated a much different gene content between ST25 genomes, primarily between mobile genetic elements. In the case of plasmids, the gene content was significantly different (Fig. 6), which was anticipated due to the movement of mobile genetic elements and has been demonstrated previously in Acinetobacter67. However, a screen of genes associated with virulence also demonstrated differences between closely related genomes (Supplementary Fig. S3). The variable distribution of genes across the ST25 dataset, including those associated with antimicrobial resistance islands (Supplementary Fig. S4), may help explain the variable phenotypes. Overall, these results demonstrate the problem with assuming that isolates have similar gene content or phenotypes based solely on MLST or PFGE type analyses. In terms of either assigning isolates to outbreaks or understanding the evolution of clonal lineages, WGS offers the resolution to untangle the relationships between seemingly related isolates.

Additional Information

How to cite this article: Sahl, J. W. et al. Phylogenetic and genomic diversity in isolates from the globally distributed Acinetobacter baumannii ST25 lineage. Sci. Rep. 5, 15188; doi: 10.1038/srep15188 (2015).