Complete genome sequence of hypervirulent and outbreak-associated Acinetobacter baumannii strain LAC-4: epidemiology, resistance genetic determinants and potential virulence factors

Acinetobacter baumannii is an important human pathogen due to its multi-drug resistance. In this study, the genome of an ST10 outbreak A. baumannii isolate, LAC-4, was completely sequenced to better understand its epidemiology, antibiotic resistance genetic determinants and potential virulence factors. Compared with 20 other complete genomes of A. baumannii, the LAC-4 genome harbors at least 12 copies of each of five distinct insertion sequences. It contains 12 and 14 copies of two novel IS elements, ISAba25 and ISAba26, respectively. Additionally, three novel composite transposons were identified: Tn6250, Tn6251 and Tn6252, two of which contain resistance genes. The antibiotic resistance genetic determinants on the LAC-4 genome correlate well with observed antimicrobial susceptibility patterns. Moreover, twelve genomic islands (GIs) were identified in the LAC-4 genome. Among them, the 33.4-kb GI12 contains a large number of genes which constitute the K (capsule) locus. LAC-4 harbors several unique putative virulence factor loci. Furthermore, LAC-4 and all 19 other outbreak isolates were found to harbor a heme oxygenase gene (hemO)-containing gene cluster. The sequencing of the first complete genome of an ST10 A. baumannii clinical strain should accelerate our understanding of the epidemiology, mechanisms of resistance and virulence of A. baumannii.

Recognized as one of the most problematic bacterial pathogens due to the emergence of multidrug-resistant (MDR) strains 1 , Acinetobacter baumannii has been responsible for a significant proportion of healthcare-acquired infections worldwide, including ventilator-associated pneumonia, surgical site and urinary tract infections and septicemia [2][3][4][5] . Additionally, MDR clinical isolates of this species have been found to infect military personnel wounded in combat zones 6,7 .
Hospital outbreaks of A. baumannii infection are often associated with multidrug resistance in the causative strains 8,9 . Besides their intrinsic resistance to certain antibiotics due to the presence of native β-lactamase genes, poor permeability and efflux systems, A. baumannii strains have acquired a large array of antibiotic resistance mutations and genes, located either on the chromosome or on plasmids 10,11 . Furthermore, clinical strains of A. baumannii resistant to carbapenems, the last-resort antibiotics for treating infections caused by A. baumannii, have been found to possess acquired bla genes encoding several groups of carbapenem-hydrolyzing β-lactamases, such as blaOXA-23 12,13 , blaOXA-24/33/40 14,15 , and blaOXA-58 16 . These bla genes appear to have been transferred via mobile genetic elements such as insertion sequences, transposons or plasmids 12,13,15,16 .
Despite the clinical significance of A. baumannii infections, the molecular basis of virulence and of the acquisition of multidrug resistance by A. baumannii remains largely unknown. As of December 31, 2014, the complete genomes of 21 A. baumannii strains, including LAC-4, had become available [17][18][19][20][21][22][23][24][25][26][27][28][29][30] , supporting efforts to better understand the genome plasticity, natural history, epidemiology and acquisition of resistance and pathogenicity islands/genes of A. baumannii. In addition, the genomes of hundreds of A. baumannii strains have been sequenced to scaffold or contig levels (http://www.ncbi.nlm.nih.gov/genome/genomes/403). Most of these sequenced A. baumannii genomes are divided into 31 groups on the basis of their sequence similarity. These efforts and other incomplete Whole Genome Shotgun (WGS) projects involving A. baumannii strains and strains of other Acinetobacter species 31,32 will likely offer additional insights into the epidemiology, phylogenetics, evolution of pathogenic strains and gene flows among Acinetobacter species including A. baumannii.
Several years ago, we described antibiotic resistance patterns and clonal relationships of 20 MDR clinical isolates of A. baumannii obtained from nosocomial outbreaks in Los Angeles County hospitals (LAC-1 to LAC-20) 5 . Our pulsed-field gel electrophoresis (PFGE) fingerprinting analysis indicated that these isolates appeared to have originated from eight epidemiologically distinct lineages 5 . More significantly, we identified the LAC-4 strain as hypervirulent in a mouse model of intranasal infection 33 in comparison to other clinical isolates and laboratory strains of A. baumannii, including the eight representative LAC isolates and the widely studied clinical strain AYE 22,34 . The LAC-4 strain reliably reproduces the most relevant features of human pulmonary A. baumannii infection, including significant extrapulmonary dissemination and bacteremia 33 . Subsequent studies showed that LAC-4 exhibits high serum resistance, expresses a highly efficient heme utilization system 35 and contains some unique structure and composition in its surface polysaccharide 36 , which may contribute to its hypervirulence. However, the precise mechanism of the hypervirulence of LAC-4 remains to be determined. Here we describe the complete genome sequence of the LAC-4 strain with special emphasis on the comparative genomics analyses to identify genomic regions that may contribute to the acquisition of antibiotic resistance and establishment of superior colonization and invasion by this hypervirulent strain.

Results and discussion
Phylogenetic lineages based on trilocus multiplex PCR and MLST. To understand the epidemiology and phylogenetics of 20 clinical isolates of A. baumannii (including LAC-4) obtained from four apparent nosocomial outbreaks, we first attempted to determine the clonal relationships among these isolates by trilocus multiplex PCR (TLM-PCR) analyses. Only four isolates (LAC-11, LAC-12, LAC-13 and LAC-14) could be typed by this method, all belonging to Global Clone (GC) II (Table 1). Since the TLM-PCR method failed to resolve the phylogenetic relationships of most of these A. baumannii isolates, a multilocus sequence typing (MLST) scheme based on the Pasteur Institute approach was subsequently employed. Our results showed that the previously non-typable isolates belong to unusual sequence types (such as ST10, ST241 and ST417) (Table 1). Previously, PFGE profiling divided these 20 outbreak isolates into eight distinct clonal groups: LAC-1 to LAC-3; LAC-4; LAC-5 and LAC-8; LAC-6; LAC-7, LAC-9, and LAC-10; LAC-11 to LAC-14; LAC-15; and LAC-16 to LAC-20 5 . In accordance with the PFGE grouping 5 , MLST typed LAC-5 and LAC-8 to a rare ST241 (Table 1). Since these two strains were isolated during two separate outbreaks in a single hospital, it appears that the ST241 lineage persisted for at least four years (1997-2001) in the same facility. More interestingly, we found that LAC-4, which was much more virulent than other LAC isolates in mice 33 , belongs to ST10 (Table 1). Most importantly, two series of outbreaks were caused by ST10 strains in two separate hospitals (LAC-1 to LAC-4 in Hospital A; LAC-16 to LAC-20 in Hospital C) in Los Angeles County, California, during the late 1990s (Table 1), suggesting that ST10 strains were quite dominant in causing nosocomial outbreaks in Los Angeles County at the time, with LAC-4 being their representative.
While LAC-1 to LAC-4 were all typed to ST10, the PFGE profile of LAC-4 diverged from those of LAC-1 to LAC-3 sufficiently to be grouped as a separate clone 5 , indicative of possible divergent evolution of LAC-4 from its original clone.
There have been few reports describing clinical isolates of A. baumannii belonging to ST10. Among 1237 A. baumannii strains with assigned STs in Pasteur Institute's MLST database as of Oct 23 2014 (the most recent update), only three A. baumannii ST10 strains were listed. Recently, an ST10 strain of A. baumannii was isolated from a  40,41 or ST15 40 . To our knowledge, in this report we describe the first nosocomial outbreaks caused by unusual ST10 and ST417 A. baumannii strains (Table 1).
LAC-4 genome sequences and general features. Our MLST analysis showed that LAC-4 and eight other outbreak LAC strains belong to the unusual ST10 (Table 1). Previous studies have shown that LAC-4 exhibits several distinctive attributes (iron utilization and unusual repeating unit composition of surface polysaccharides) that may contribute to its hypervirulence in mice 33,36 . Since none of the A. baumannii strains whose genomes had been completely sequenced belongs to ST10, we decided to completely sequence the genome of LAC-4 using a combination of Roche 454 FLX+, Illumina HiSeq 2000 and Illumina MiSeq platforms, with gaps and uncertainties filled or clarified by genomic PCR and sequencing of PCR products. Our analysis showed that the LAC-4 genome consists of a circular chromosome of 3,954,354 base pairs and two circular plasmids, one of 8,006 base pairs and the other of 6,076 base pairs. The general features of the LAC-4 genome are summarized in Table 2.
The LAC-4 genome contains multiple copies of five types of IS elements: ISAba1, ISAba13, ISAba125, and the novel ISAba25 and ISAba26, whose names were recently assigned (see Table 2 and Supplementary Table S1 online). There are 19 copies of ISAba1 in the LAC-4 genome. Three pairs of ISAba1 elements constitute three novel composite transposons (Tn6250, Tn6251 and Tn6252), as indicated by identical target site duplication (TSD) sequences found next to the external (outward-facing) inverted repeats (see Fig. 1A; Supplementary Table S1 online), while one ISAba1 element is linked to an ampC gene (Fig. 1B). In addition, we identified in the LAC-4 genome 12 copies of a novel IS element (ISAba25) whose closest hit in the IS Finder database is ISAba16 (see Supplementary Table S1 online). This IS element is 2,491 bp in length and contains three ORFs, whereas ISAba16 is 2,552 bp. Importantly, the largest ORF of ISAba25 has a 63-bp deletion compared to that of ISAba16, resulting in a predicted polypeptide of 527 amino acids, 21 amino acids shorter than its counterpart in ISAba16 (which has 548 amino acids). Two copies of ISAba25 are associated with genes of an RND-type efflux pump system (AdeIJK) (Fig. 1B), probably contributing to LAC-4's resistance to a number of antimicrobial agents. Furthermore, 14 copies of another novel IS element (ISAba26) are located on the LAC-4 chromosome (see Supplementary Table S1 online). This IS is 1,318 bp in length and contains one gene encoding a transposase of 402 amino acids. While BLAST analysis of the transposase protein sequence indicates that ISAba26 belongs to the mutator family of transposases, a BLASTp search of the IS Finder database identifies ISEc39 as its closest homologue, sharing 72% transposase amino acid identity. ISAba26 produces a TSD of 8 bases in length (see Supplementary Table S1 online) and its inverted repeats (IRL and IRR) are 26 bp in length. 
Interestingly, two copies of ISAba26 flank a large number of continuous loci (15 genes) which are predicted to be involved in copper resistance (Fig. 1B). Moreover, 22 copies of ISAba13 are scattered around the chromosome of LAC-4 (see Supplementary Table S1 online). Among these elements, 14 copies were found to create TSD sequences of 9 bases in length as expected, while one element has an 8-base TSD (see Supplementary Table S1 online). Finally, we found 14 ISAba125 elements in LAC-4 genome, each with a TSD of 3 bases (see Supplementary Table S1 online).
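The TSD criterion used above can be illustrated with a short sketch. This is not the procedure used in the study, which is not described at this level of detail; the function name, the linear 0-based coordinates and the toy sequence are all assumptions made for illustration. The sketch simply checks whether the k bases immediately flanking an element are identical on both sides.

```python
def has_tsd(genome: str, start: int, end: int, k: int) -> bool:
    """Return True if the k bases on each side of genome[start:end] are identical.

    Assumes a linear sequence with 0-based, end-exclusive coordinates; a circular
    chromosome would need wrap-around handling, which is omitted here.
    """
    if k <= 0 or start < k or end + k > len(genome):
        return False  # not enough flanking sequence to compare
    left_flank = genome[start - k:start]
    right_flank = genome[end:end + k]
    return left_flank == right_flank


# Toy sequence (hypothetical, not taken from LAC-4): an element flanked by a
# 9-base direct repeat, the TSD length reported in the text for ISAba13.
element = "TTTTCCCCGGGG"
toy_genome = "AAGC" + "ACGTACGTA" + element + "ACGTACGTA" + "TTAG"
element_start = 4 + 9
element_end = element_start + len(element)
print(has_tsd(toy_genome, element_start, element_end, 9))
```

For a candidate composite transposon, the same check would be applied with `start` set to the beginning of the left IS copy and `end` to the end of the right IS copy, so that only the outer flanks are compared.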
The LAC-4 genome also includes two plasmids (Table 2). Plasmid pABLAC1 contains nine predicted ORFs (ABLAC_p100010-ABLAC_p100090). Among these loci, ABLAC_p100020 (rep) encodes a replicase (Aci3) belonging to group 3 of the replicases defined previously 42 . Fifty-six base pairs upstream of the rep gene, four copies of iterons were found with the sequence 5′-TAAAACGAGGTTTACCTTGCAT-3′, which is identical to those observed in replicons Ab203-Aci3 and Ab736-Aci7 42 . Iterons have been shown to be involved in controlling plasmid replication via interaction with replicase proteins 43 . Additionally, an AT-rich sequence (5′-AAAAATAT-3′) identical to that found in the pRAY plasmid 44 was located 37 bp downstream of the fourth iteron and 11 bp upstream of the rep start codon. The AT-rich element and the iterons presumably serve as the oriV site of the plasmid. Furthermore, ABLAC_p100050 and ABLAC_p100060 were found to encode homologues of the RelE toxin of a toxin/antitoxin system and a Cro/C1 transcriptional regulator, respectively. However, no homologue of an antitoxin partner gene was found on the plasmid. Many of the predicted proteins of plasmid pABLAC1 loci (ABLAC_p100020 through ABLAC_p10002060) share identical amino acid sequences with those encoded by genes harbored by the plasmid (pD1279779) in A. baumannii strain D1279779 23 . On the other hand, plasmid pABLAC2 shares nearly 100% identity with a series of plasmids related to pRAY 44,45 , in particular pRAY*. Of particular note, two pABLAC2 loci (ABLAC_p200050 and ABLAC_p200060) encode homologues of MobA and MobC, respectively, which are known to be involved in plasmid mobilization; it is tempting to speculate that these mobilization genes may contribute to the transfer of the aminoglycoside resistance gene (ABLAC_p200010, ant(2″)-Ia or aadB) carried on this plasmid. Moreover, two copies of the AT-rich sequence 5′-AAAAATAT-3′ were found within the coding region of a predicted ORF (ABLAC_p200080). 
However, no potential replication gene (rep) or iterons were found on pABLAC2. Similar to the pRAY series of plasmids 44,45 , no plasmid partitioning or restriction/modification systems were identified in either plasmid of LAC-4.   baumannii genome groups are considered, LAC-4 was found to belong to Genome Group 3, represented by the genome of BJAB0715, as verified by a phylogenetic tree composed of 45 completely or partially sequenced A. baumannii genomes (see Supplementary Fig. S1 online). A total of 12 genomic islands were identified (Fig. 2B and Table 3). GI1 (Table 3) consists of a novel transposon (Tn6250) composed of two ISAba1 elements at the ends, and an IS1006 internally next to a partial ISVsa3 element (Fig. 1A). Additionally, this genomic island harbors resistance genes for streptomycin (strA and strB) and for sulphonamides (sul2); thus this GI is a resistance island (Table 3). The 34-kb GI2 contains two novel IS elements (ISAba26) near the ends (Table 3 and Fig. 1B). Notably, these two IS elements sandwich a long 15-gene cluster (ABLAC_05330 to ABLAC_05470) coding for various proteins involved in copper resistance (Fig. 1B), representing the second resistance island found in LAC-4. Most of the genes of this gene cluster are also present in the genome of A. baumannii ATCC 17978 and, to a lesser extent, in those of A. baumannii strains AB0057 and AYE (Fig. 3A). Outside one of the ISAba26 elements (ABLAC_05500), another novel IS element (ISAba25) and a phage integrase gene (ABLAC_05570) are found (Table 3 and Fig. 3A), implicating mobile elements in the transfer of this GI. Furthermore, GI3 is bounded by two copies of a novel IS element (ISAba25; transposase ORFs: ABLAC_07480-07500 and ABLAC_0758-07600) and harbors a partial ISAba1 element and the genes for AdeIJK, the components of an RND-type efflux pump system involved in resistance to multiple classes of antibiotics (Fig. 1B).
Several GIs (GI5, GI6, GI7, GI8, GI9 and GI10) contain many genes encoding phage-derived proteins of unknown function; the significance of these GIs remains to be elucidated (Table 3). Among these GIs, nearly all of the phage-derived genes of GI8 have no homologues in any of the 20 other completely sequenced A. baumannii genomes (except several transposase genes and a Zn-dependent protease gene) (data not shown). Moreover, GI11 contains a number of genes encoding putative proteins/enzymes that function in the detoxification of a variety of xenobiotic compounds (Fig. 3B); for example, glutathione S-transferase (ABLAC_32340, gstB), S-formylglutathione hydrolase (ABLAC_32430, frmB) and S-(hydroxymethyl)glutathione dehydrogenase (ABLAC_32440, frmA) are involved in resistance to and detoxification of formaldehyde and chlorine 46 . Interestingly, other than the frmA gene, most of the GI11 genes are unique to LAC-4 among the 21 completely sequenced A. baumannii genomes (data not shown). Finally, GI12 contains the K (capsule) locus, which encodes enzymes responsible for biosynthesis of the surface polysaccharide of LAC-4 (Fig. 4). Gene product homology analyses indicate that the K locus gene cluster harbors genes encoding two series of enzymes (FnlA, FnlB, FnlC and Leg1-6) which apparently catalyze biosynthesis of α-L-fucosamine and α-8-epi-legionaminic acid, respectively. These are the two precursor sugars (other than α-D-glucosamine, which is supplied by cellular metabolism) for the biosynthesis of the three-sugar repeating unit of the LAC-4 surface polysaccharide, recently determined to consist of α-L-fucosamine, α-D-glucosamine and α-8-epi-legionaminic acid 36 . Interestingly, six ORFs (orfK1-6, ABLAC_36970-ABLAC_37020) in the middle of the gene cluster encode protein products (two of which are oxidoreductases) sharing no homology with any proteins in the 20 completely sequenced A. baumannii genomes (Fig. 4). Among the 20 A. baumannii strains whose genomes were completely sequenced, several have acquired resistance islands of various sizes 18,25,34,47 . For example, strain AYE harbors an 86-kb resistance island in which 45 resistance genes are located 34 . Similarly, MDR-TJ contains a 42-kb resistance island with four drug resistance genes 47 , while MDR-ZJ06 has a 38-kb resistance island (AbaR22) containing 40 genes, five of which are involved in antimicrobial resistance 25 .
Antibiotic resistance of LAC-4 and resistance genetic determinants. The recent emergence and rapid dissemination of multidrug- and pandrug-resistant A. baumannii has placed a significant burden on the clinical management of infections worldwide. While we previously baumannii strains whose genomes were completely sequenced, the MDR LAC-4's level of resistance is moderate (e.g., susceptible to imipenem and meropenem, intermediate to amikacin), which reflects the relatively early isolation of this strain (in 1997) from a hospital in Los Angeles. Nevertheless, the availability of complete genome sequences of A. baumannii strains with varied degrees of antimicrobial resistance (including the moderately resistant LAC-4) is highly useful for the scientific research community, especially for studying the emergence and dissemination of antimicrobial resistance genes.
To better understand the repertoire and organization of MDR genetic determinants, an in-depth analysis of the LAC-4 genome sequences was performed. Consistent with the fact that LAC-4 was not among the most resistant strains analyzed in our 2008 report 5 , the LAC-4 genome harbors a moderate number of genetic determinants, some of which are linked to mobile genetic elements (IS or Tn), with the potential to encode the resistance functions observed in this bacterial strain (Table 4). For example, genes potentially encoding all four classes of β-lactamases (Classes A, B, C and D) have been found (Table 4). Two such loci (ampC and blaOXA-236) are closely associated with ISAba1 elements (Fig. 1B and 1A, respectively), which may provide exogenous promoter functions, consistent with LAC-4 being resistant to piperacillin, older versions of the penicillins, and several cephalosporin antibiotics (see Supplementary Table S2 online). ISAba1 has also been found associated with antibiotic resistance genes (including ampC) in the genomes of other A. baumannii strains, most likely driving robust expression of the resistance phenotypes 17,47,48 . While a blaOXA-51-like gene (ABLAC_23600) is found in the LAC-4 genome (Table 4) as expected, the absence of blaOXA-23-like, blaOXA-40-like and blaOXA-58-like genes (data not shown) explains its susceptibility to imipenem and meropenem (see Supplementary Table S2 online). In contrast, in MDR strains AB0057, MDR-TJ, MDR-ZJ06, BJAB07104, BJAB0868 and BJAB0715, the presence of blaOXA-23 could account for their carbapenem resistance 17,18,25,47 . In addition, aminoglycoside 6-phosphotransferase [aph(6)-Id, i.e., strB] and/or aminoglycoside 3″-phosphotransferase [aph(3″)-Ib, i.e., strA] are known to contribute to streptomycin resistance 49 . 
The existence of several aminoglycoside modification enzyme genes [ABLAC_01120 (strA), ABLAC_01130 (strB), and ABLAC_p200010 (ant(2″)-Ia)] in the LAC-4 genome (Table 4) and the association of an ISAba1 with two of these loci (strA and strB) in the context of the novel transposon Tn6250 (Fig. 1A) correlate well with the strain's resistance to streptomycin, gentamicin, kanamycin and tobramycin (see Supplementary Table S2 online). Finally, genes encoding a series of proteins for the major facilitator superfamily (MFS), resistance-nodulation-division (RND) family, and multidrug and toxic compound extrusion (MATE) family of efflux pumps were also located in the LAC-4 genome (Table 4). Notably, there are at least five complete sets of RND-type efflux pump systems in LAC-4 (Table 4). Besides the AdeIJK genes linked with several IS elements described above in GI3 (Table 3 and Fig. 1B), four other sets of RND-type efflux pump systems include AdeABC (encoded by ABLAC_26430-26450), AdeFGH (encoded  (Table 4). The last two sets have been shown to participate in transporting toxic metals such as cobalt, zinc and cadmium 50,51 . These efflux pump systems and other membrane-associated transporters (Table 4) confer resistance to diverse classes of antibiotics and other toxic compounds/metals in bacteria.
Many A. baumannii isolates have been reported to possess a series of efflux pump gene clusters in their genomes, which encode several types of chemical extrusion apparatuses rendering cells resistant to a large variety of antibiotics and toxic compounds 52,53 . For instance, there are three sets of RND efflux pump systems in A. baumannii strains (AdeABC, AdeIJK and AdeFGH) [54][55][56] . While the LAC-4 genome harbors three gene clusters, each of which encodes one of the above RND-type efflux pumps, no apparent homologues of the two-component system genes adeRS were found upstream of the adeABC genes. Such two-component systems (AdeRS) were found to be critical in regulating expression of AdeABC efflux pumps in several strains of A. baumannii 57,58 . Interestingly, Sun and colleagues observed that the insertion of an ISAba1 into the adeS gene resulted in over-expression of the adeABC operon and hence the non-susceptible phenotypes of clinical strains to tigecycline 59 . Additionally, AdeIJK appears to be regulated by AdeN, a TetR-type regulator, in some strains 60 . In contrast, LAC-4's adeIJK genes exist in an unusual genomic context. These genes are not linked with any recognizable regulatory gene(s). Upstream of the adeIJK genes, there are a locus of unknown function (ABLAC_07540) and an apparently nonfunctional copy of the ISAba1 element (Fig. 1B). All these genes are flanked by two copies of the novel IS element ISAba25 (Fig. 1B). To our knowledge, this is the first report of a gene cluster encoding an RND efflux pump system being flanked by two ISAba25 elements, which may modify the expression of the efflux pump genes, thus conferring resistance to a number of antimicrobial agents.
Potential virulence factors. Because our recent studies have shown that the LAC-4 strain is much more virulent than other A. baumannii strains in the mouse model of intranasal infection 33 , the LAC-4 genome was searched against entries in VFDB for genes encoding potential virulence factors. A total of 615 gene hits were generated, which could facilitate the identification, evaluation and validation of virulence factors in this strain and the species (see Supplementary Table S3 online). Potential virulence factor loci that are unique to LAC-4 include ABLAC_03200 (which may encode a fimbrial protein, PilE), ABLAC_37000 and ABLAC_37010 [two genes of the K locus gene cluster (Fig. 4)] (see Supplementary Table S3 online). In addition, several other potential virulence factor loci are found in only 1 to 4 of the 20 completely sequenced A. baumannii genomes: ABLAC_05370 (encoding a transcriptional activator protein CusR/CopR), ABLAC_05450 (encoding an Fe2+ transport system protein FeoB) and K locus gene cluster genes ABLAC_36940-39960 [encoding UDP-N-acetylglucosamine 2-epimerase (FnlC), reductase (FnlB) and 4,6-dehydratase, 3- and 5-epimerase (FnlA)] (see Supplementary Table S3 online and Fig. 4). Additional genes of the K locus gene cluster (ABLAC_36850, and ABLAC_36870-ABLAC_36910) were also identified as potential virulence factor genes (see Supplementary Table S3 online). Recently, Russo and colleagues identified, within the K locus gene cluster of A. baumannii strain AB307-0294, two loci, ptk and epsA (encoding a putative protein tyrosine kinase and a putative polysaccharide export outer membrane protein, respectively), that are required for a capsule-positive phenotype, for survival in human serum and for survival in a rat soft tissue infection model 61 ; gene knock-out mutants of either gene were completely cleared from animals in the in vivo experiments. 
The encoded proteins of these two genes shared 90% and 95% identity over the entire protein lengths with the products of loci ABLAC_37120 and ABLAC_37100, respectively, suggesting the importance of K locus gene cluster genes to the virulence of A. baumannii. More interestingly, as described above (Fig. 4), LAC-4's K locus gene cluster also contains a series of genes (leg1-leg6, ABLAC_37030-37080) apparently encoding enzymes necessary for biosynthesis of legionaminic acid, the precursor for an uncommon sugar (α-8-epi-legionaminic acid) found in the repeating unit of the surface polysaccharide of this strain 36 and others 62,63 . It has been proposed that, since legionaminic acid structurally resembles the sialic acid on the surface glycoproteins of mammalian host cells, bacterial pathogens may utilize this unusual sugar to mimic the host cell surface, thus escaping host immune surveillance and facilitating their colonization and invasion [64][65][66] . Since the legionaminic acid biosynthesis genes are not common among the 20 completely sequenced A. baumannii genomes (well conserved in only three of the 20 genomes), it is tempting to hypothesize that these genes may contribute to LAC-4's hypervirulence in mice. Furthermore, there are six ORFs (orfK1-6, ABLAC_36970-ABLAC_37020) in LAC-4's K locus that are absent from the other 20 completely sequenced genomes. It would be exciting to determine whether any of these ORFs is involved in the pathogenesis of this strain. The ability of bacterial pathogens to obtain scarce iron from their mammalian hosts is critical for survival and infectivity 67,68 . Consequently, proteins that participate in the uptake and utilization of iron or heme have been recognized as crucial virulence factors [69][70][71][72] . 
Not surprisingly, additional loci that were identified as putative virulence factor genes include genes from heme utilization cluster 1 (ABLAC_24390 and ABLAC_24450) and the hemO cluster (ABLAC_16780, ABLAC_16790 and ABLAC_16800) (see Supplementary Table S3 online). Also identified as potential virulence factor genes are many linked genes (ABLAC_10700, ABLAC_10710, ABLAC_10730, ABLAC_10750-ABLAC_10810, ABLAC_10840-ABLAC_10880) which are major constituents of a large 20-gene cluster (ABLAC_10690-ABLAC_10880) for the biosynthesis and transport of the acinetobactin siderophore in the LAC-4 strain (see Supplementary Table S3 online); this gene cluster shares the same gene organization as that described for the corresponding gene clusters in several other A. baumannii genomes 73 . Similarly, a second siderophore biosynthesis and transport gene cluster (ABLAC_24780-ABLAC_24890) was identified in the LAC-4 genome based on the presence of six of its loci (ABLAC_24780-ABLAC_24820, ABLAC_24850 and ABLAC_24890) on the list of potential virulence genes in LAC-4 (see Supplementary Table S3 online). The proteins encoded by the genes of this cluster and the gene organization are identical to those of cluster 5 described previously 73 for a putative hydroxamate siderophore.
Our previous results 35 and those of others 73 suggested that a heme utilization gene cluster including a gene encoding heme oxygenase (thus called the hemO cluster) was present only in some A. baumannii strains and may account for their enhanced virulence. Bioinformatics analysis confirmed the presence of a hemO gene cluster in the LAC-4 genome (ABLAC_16780 to ABLAC_16850). Furthermore, comparative genomics analyses using BLASTp searches combined with hit colocation analysis showed that 11 of the other 20 complete genomes of A. baumannii strains also contain the hemO cluster (see Supplementary Fig. S2 online). Another heme utilization gene cluster (without the hemO gene; here called heme utilization cluster 1) was reported to be present in all strains of A. baumannii tested 73 . As predicted, LAC-4 also harbors heme utilization cluster 1 (ABLAC_24350-ABLAC_24460), which is present in all of the other 20 A. baumannii genomes (see Supplementary Fig. S3 online). The presence of the hemO cluster in LAC-4 and the 19 other outbreak isolates 5 was further investigated by PCR. All 20 outbreak isolates (including LAC-4) were found to contain the hemO cluster (Table 1; 33 . Taken together, our data suggest that the presence of the hemO cluster per se may not entirely account for the hypervirulence of LAC-4. Indeed, the LAC-4 genome analysis identified four gene clusters for iron/heme utilization (the hemO cluster, heme utilization cluster 1 and two gene clusters for siderophore biosynthesis and transport). It is highly likely that these gene clusters provide redundant functions to ensure efficient acquisition of iron for cellular use. Further genome-wide molecular genetics studies (such as transposon mutagenesis) will be needed to decipher the relative contributions of genes or gene clusters of the iron/heme utilization and other processes to the virulence of this strain.
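The text names a BLASTp-based hit colocation analysis without describing how it was carried out. One plausible reading, sketched below purely as an assumption, is that a gene cluster is scored as present in a target genome when enough of its query proteins have hits whose genes lie close together. The function name, the use of ordinal gene positions and both thresholds are hypothetical.

```python
from typing import Dict


def hits_are_colocated(hit_positions: Dict[str, int], min_hits: int, max_span: int) -> bool:
    """Return True if at least `min_hits` query genes have hits within `max_span` genes.

    `hit_positions` maps each query gene (hypothetical names) to the ordinal
    position of its best hit in the target genome. Queries without a hit are
    simply absent from the mapping. Circular wrap-around is ignored.
    """
    positions = sorted(hit_positions.values())
    # Slide a window of `min_hits` consecutive sorted positions and test its span.
    for i in range(len(positions) - min_hits + 1):
        if positions[i + min_hits - 1] - positions[i] <= max_span:
            return True
    return False


# Hypothetical example: three of four query genes hit neighbouring genes.
example_hits = {"query1": 100, "query2": 101, "query3": 103, "query4": 900}
print(hits_are_colocated(example_hits, min_hits=3, max_span=5))
```

Using gene order rather than base-pair coordinates keeps the sketch independent of gene lengths; a real analysis might prefer nucleotide distances and identity or coverage cut-offs on the BLASTp hits.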
In summary, we report here the complete genome sequence of a hypervirulent, multidrug-resistant clinical outbreak isolate (LAC-4) of A. baumannii with an extremely rare MLST sequence type, ST10 (MLST Pasteur scheme). Among the 20 strains of A. baumannii whose complete genomes were available before this report, there are three strains of ST1, nine strains of ST2, two of ST79, and one each of ST17, ST23, ST267, ST437, ST638 and ST639 (see Supplementary Table S4 online). Thus, the LAC-4 genomics study reported here represents the first complete genome sequence of an ST10 A. baumannii strain, making an important addition to the growing list of complete A. baumannii genomes available to the scientific research community worldwide. Additionally, molecular tests and comparative genomics analyses offer insight into the mechanisms of resistance and virulence in this important bacterium.

Methods
Bacterial strains. The 20 outbreak strains (LAC-1 to LAC-20) were collected from four apparent clinical nosocomial outbreaks in three hospitals in Los Angeles County between 1996 and 2004 and were obtained from the Los Angeles County Public Health Laboratory 5 . The antimicrobial susceptibility and genetic profiles of these nonduplicate isolates were described in detail previously 5 . Specifically, strains LAC-1 through LAC-5 were isolated from an outbreak in Hospital A that lasted for several years (1996-1999), while LAC-6 to LAC-10 were from the 2001 outbreak in the same hospital 5
Clonal relationships and sequence typing. Tri-locus multiplex PCR as described by Turton and colleagues 74 was used to determine the clonal relationships of these A. baumannii clinical isolates [Global Clones (GC) I, II or III, also known as International Clones I, II or III]. Multi-locus sequence typing (MLST) was performed based on the methods of Diancourt et al 75 . PCR amplification was performed in separate reactions with a final volume of 50 µl. After amplification, aliquots of the PCR reactions were subjected to agarose gel electrophoresis. Successfully amplified PCR products were purified using a QIAquick PCR purification kit (Qiagen, Valencia, CA) and then sequenced on an ABI 3730 automated fluorescent sequencer. The sequence type was determined using details of the MLST scheme at www.pasteur.fr/mlst. To obtain MLST ST assignments for A. baumannii strains whose genomes are complete, genome FASTA sequences were retrieved from NCBI and uploaded in the "batch sequence query" mode of the MLST (Pasteur) database on the Acinetobacter baumannii MLST Databases website (http://pubmlst.org/abaumannii/), which generates allelic profiles and STs.
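The final step of sequence typing, mapping an allelic profile to an ST, amounts to a table lookup. The sketch below illustrates the idea only; it is not the PubMLST implementation. The seven locus names are those of the Pasteur scheme, while the allele numbers and ST labels are placeholders invented for illustration and do not correspond to real database entries.

```python
from typing import Dict, Optional, Tuple

# Loci of the Pasteur MLST scheme for A. baumannii, in a fixed order.
LOCI = ("cpn60", "fusA", "gltA", "pyrG", "recA", "rplB", "rpoB")


def assign_st(profile: Dict[str, int],
              st_table: Dict[Tuple[int, ...], str]) -> Optional[str]:
    """Return the ST label whose allelic profile matches exactly, or None."""
    key = tuple(profile[locus] for locus in LOCI)
    return st_table.get(key)


# Placeholder table: allele numbers and labels are hypothetical, not real STs.
toy_table = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (2, 2, 2, 2, 2, 2, 2): "ST-B",
}
toy_profile = dict(zip(LOCI, (2, 2, 2, 2, 2, 2, 2)))
print(assign_st(toy_profile, toy_table))
```

A profile with no exact match returns `None`, which would correspond to a novel allele combination that needs to be submitted to the database curators for a new ST assignment.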
Genomic DNA sequencing, assembly and gap closing. The LAC-4 strain was grown overnight at 37 °C to stationary phase in LB medium, and total DNA was isolated from the harvested cells. The genome sequence of LAC-4 was first determined using Roche 454 FLX+ and assembled with the GS De Novo Assembler, resulting in sequences with 43.5-fold coverage. Then 105 large contigs (>500 kb) were obtained by a combination of re-sequencing on an Illumina HiSeq 2000 (paired-end sequencing of a 400-bp library, with 255.6-fold coverage) and an Illumina MiSeq (mate-pair sequencing of a 3000-bp library, with 408-fold coverage). Finally, gaps between these contigs were closed by genomic PCR and sequencing of the PCR products using the conventional Sanger method (Applied Biosystems 3730 Genetic Analyzer). An integrity check that performs a BLASTp analysis on neighboring pairs of proteins identified 37 pairs of proteins, which may represent single genes that either have gained mutations or have been split into two open reading frames (ORFs) by sequencing errors. Additional PCR and DNA sequencing analyses either corrected the sequencing mistakes (likely resulting from 454 sequencing errors around homopolymer nucleotides) or led to the gene pairs being combined into single pseudogenes.
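The integrity check on neighboring protein pairs can be illustrated with a minimal Python sketch. The text does not state the exact criteria used, so the rule below is an assumption: two adjacent proteins are flagged when their best BLASTp hits are to the same reference protein and cover largely non-overlapping parts of it. The names BestHit, candidate_split_pairs and max_overlap, and all identifiers in the example, are hypothetical.

```python
from typing import NamedTuple


class BestHit(NamedTuple):
    subject_id: str  # identifier of the best-matching reference protein
    s_start: int     # alignment start on the subject (1-based)
    s_end: int       # alignment end on the subject (inclusive)


def candidate_split_pairs(proteins, best_hits, max_overlap=10):
    """Flag adjacent proteins whose best hits cover different parts of one subject.

    proteins: protein IDs in genome order.
    best_hits: dict mapping protein ID -> BestHit (proteins without a hit are absent).
    max_overlap: tolerated overlap, in residues, between the two subject ranges
                 (an assumed threshold, not taken from the text).
    """
    flagged = []
    for left, right in zip(proteins, proteins[1:]):
        a, b = best_hits.get(left), best_hits.get(right)
        if a is None or b is None or a.subject_id != b.subject_id:
            continue
        # Length of the shared subject interval; zero or negative if disjoint.
        overlap = min(a.s_end, b.s_end) - max(a.s_start, b.s_start) + 1
        if overlap <= max_overlap:
            flagged.append((left, right))
    return flagged


# Hypothetical example: p1 and p2 match the two halves of refA.
proteins = ["p1", "p2", "p3", "p4"]
hits = {
    "p1": BestHit("refA", 1, 150),
    "p2": BestHit("refA", 148, 400),
    "p3": BestHit("refB", 1, 300),
    "p4": BestHit("refB", 1, 290),
}
print(candidate_split_pairs(proteins, hits))
```

Pairs flagged this way are only candidates; as described above, each one still has to be resolved by PCR and re-sequencing.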
Genome annotation and comparative genomics analysis. The replication origin (oriC) in the LAC-4 genome was predicted by OriFinder 76 . The assembled genome sequences were annotated using Glimmer 3.02 for identification of protein-coding sequences (CDSs) 77 , with manual validation of the predicted CDSs on the basis of annotations of the BJAB7015 genome 17 and the CBMAR database 78 , tRNAscan-SE for tRNA genes 79 , and RNAmmer for rRNA genes 80 . CDS functions were predicted using BLASTp searches 81 of the NCBI non-redundant database followed by manual curation using the annotations of A. baumannii ATCC 17978 (NCBI accession no. NC_009083.1) 19 and BJAB7015 (NC_021733.1) 17 as references.
The virulence factor genes were predicted by BLASTp searches against the virulence factor database (VFDB) 82 . The mobile genetic elements in the LAC-4 genome sequences were detected using the following online tools and/or open-access databases, together with manual examination: MobilomeFINDER for the tRNA/tmRNA gene-related genomic islands (GIs) 83 , IslandViewer for the island-like regions 84 and IS Finder for insertion sequence (IS) elements 85 . New IS names were provided by the curators of the IS database 85 , while transposon (Tn) numbers were assigned by the Tn Number Registry 86 . The gene clusters carried by GIs or involved in heme utilization in the sequenced A. baumannii genomes were aligned using the standalone package MultiGeneBlast 87 , an approach also known as "BLASTp searches + hit collocation".
The genome sequence comparisons of LAC-4 with the other 20 completely sequenced A. baumannii strains were performed with the rapid multiple genome alignment tool mGenomeSubtractor 88 . All annotated LAC-4 protein-coding genes (serving as queries) were examined by mGenomeSubtractor-facilitated BLASTn searches against the other A. baumannii genomes (serving as subjects), using an H-value cut-off of > 0.42 for conserved genes. The H-value (0 ≤ H-value ≤ 1.0) reflects the degree of similarity, in terms of the length of the match and the level of identity at the nucleotide level, between the matching gene in the subject genome and the query gene examined 88 .
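As a rough illustration of the H-value filter, the following Python sketch assumes the definition H = i × (lm / lq), where i is the identity of the best-matching region as a fraction, lm is the length of the match and lq is the length of the query gene. This formula is an assumption consistent with the description above rather than something stated in this section; only the 0.42 cut-off comes from the text. The function names and the example numbers are hypothetical.

```python
def h_value(identity, match_length, query_length):
    """Homology score H = identity * (match_length / query_length), kept within [0, 1].

    identity is a fraction between 0 and 1; lengths are in nucleotides.
    Assumed definition, used here for illustration only.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    h = identity * (match_length / query_length)
    return max(0.0, min(1.0, h))


def is_conserved(h, cutoff=0.42):
    """A query gene counts as conserved when its H-value exceeds the cut-off."""
    return h > cutoff


# Hypothetical example: a 1,000-nt gene matched over 900 nt at 95% identity,
# and the same gene matched over only 300 nt at 80% identity.
strong = h_value(0.95, 900, 1000)
weak = h_value(0.80, 300, 1000)
print(is_conserved(strong), is_conserved(weak))
```

Under this definition, a short but near-identical match and a long but divergent match can both fall below the cut-off, which is why the score combines match length and identity.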
GenBank accession numbers. The genome sequences of the LAC-4 chromosome and the two plasmids pABLAC1 and pABLAC2 have been deposited in GenBank under accession numbers CP007712, CP007713 and CP007714, respectively.
Antimicrobial susceptibility testing. Antimicrobial susceptibility of the clinical isolates or strains was determined using the broth microdilution protocols of the Clinical and Laboratory Standards Institute 89 according to methods described previously 5 . Escherichia coli (ATCC #25922) and Pseudomonas aeruginosa (ATCC #27853) were used as quality control strains in the testing.
Table S5 online. PCR amplification of the hemO gene cluster was performed for the clinical isolates using genomic DNA as a template. Each PCR was carried out in a 50 μl reaction mixture containing 1× 5 PRIME MasterMix, 200 nM each of the forward and reverse primers, and 1 μl of genomic DNA template. Amplification was carried out with 5 min at 95 °C; 36 cycles of 30 s at 95 °C, 30 s at 55 °C, and 1 min at 72 °C; and 10 min at 72 °C. The resulting PCR products were visualized and imaged by agarose gel electrophoresis.
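The cycling program above can be written down as a small data structure, which makes it easy to check the total hold time. This is a minimal Python sketch: the temperatures, durations and cycle number are taken from the text, while the step labels (denaturation, annealing, extension) are an interpretation added here, and ramp times between steps are ignored.

```python
# Thermocycling program as (label, temperature in degrees C, hold time in seconds).
# Labels are assumed; temperatures and times come from the protocol text.
INITIAL_DENATURATION = ("initial denaturation", 95, 5 * 60)
CYCLE_STEPS = [
    ("denaturation", 95, 30),
    ("annealing", 55, 30),
    ("extension", 72, 60),
]
N_CYCLES = 36
FINAL_EXTENSION = ("final extension", 72, 10 * 60)


def total_hold_seconds():
    """Sum of all programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


print(total_hold_seconds() / 60, "minutes")  # 87.0 minutes of programmed holds
```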