Comparative genomic analysis of a naturally competent Elizabethkingia anophelis isolated from an eye infection

Elizabethkingia anophelis has emerged as an opportunistic human pathogen, but its mechanisms of transmission remain unexplained. Comparative genomic (CG) analysis of an E. anophelis endophthalmitis strain, unexpectedly isolated from a patient with an eye infection, against twenty-five other E. anophelis genomes revealed its potential to participate in horizontal gene transfer. CG analysis revealed that the study isolate has an open pan genome and has undergone extensive gene rearrangements. We demonstrate that the strain is naturally competent, a property hitherto not reported in any member of Elizabethkingia. The presence of competence-related genes, mobile genetic elements, Type IV and Type VI secretion systems and a unique virulence factor, arylsulfatase, suggests a different lineage of the strain. Deciphering the genome of E. anophelis, which harbors a reservoir of antibiotic resistance genes and virulence factors associated with diverse human infections, may open up avenues to deal with the myriad infections it causes and to devise strategies to combat the pathogen.


Results
Characteristics of the patient. The patient was diagnosed with post-operative endophthalmitis (anterior chamber hypopyon, dense vitreous haze with yellow reflex). There was no view of the retina and his vision was only perception of light. Subsequently, vitrectomy was performed to debulk the infection and pus was removed from the infected eye. Empirical treatment for endophthalmitis was immediately initiated.
Identification and antibiotic susceptibility profile. The isolate was initially identified as Elizabethkingia meningoseptica by Vitek-2. Subsequently, with the availability of additional Elizabethkingia genomes and 16S rDNA sequence analysis, the study isolate was re-identified as Elizabethkingia anophelis. MIC analyses of 13 antibiotics revealed that the pathogen was resistant to penicillins, cephalosporins, monobactam, carbapenems, aminoglycosides and trimethoprim/sulfamethoxazole, and sensitive to levofloxacin and minocycline. The organism exhibited intermediate resistance to tigecycline and ciprofloxacin (Table 1).
Comparative analysis of core and pan genome. Twenty-six genomes of E. anophelis, including the genome of the new E. anophelis endophthalmitis study strain, were retrieved from the NCBI database for CG analysis. The average genome size is 4.03 Mb and the average G + C content is 35.61%. The largest genome is that of NUHP1 (4.36 Mb) and the smallest is that of As1 (3.59 Mb) (Table 2). The predicted protein sequences of all 26 E. anophelis genomes were used as input for the core-pan genome analysis. CG analysis revealed that 1404 (40.79%) genes were shared between all 26 strains, and these may be considered the "core genome". The accessory genome varied from 656 to 2240 genes (avg. 59.2%) across the strains (Fig. 1). Most notably, excluding the study isolate's genome from the analysis added 844 genes to the core genome of the remaining 25 genomes. The study strain had the highest number of exclusively absent (846) and unique (156) genes (Table 3), indicating that it has an open pan genome. The pan-genome of all 26 genomes was annotated to assess the enrichment of COG categories (Fig. 2) and KEGG pathways (Fig. 3).
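The four gene categories used in this analysis can be illustrated with a small presence/absence example. The sketch below is not the BPGA implementation; it assumes each genome has already been reduced to a set of gene-family identifiers, and the genome names, family names and the treatment of accessory genes as "present in at least two but not all genomes" are illustrative assumptions.

```python
from typing import Dict, Set, Tuple

def classify_pan_genome(
    genomes: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Split gene families into core, accessory, unique and exclusively absent sets."""
    n = len(genomes)
    # Count in how many genomes each gene family occurs.
    counts: Dict[str, int] = {}
    for families in genomes.values():
        for fam in families:
            counts[fam] = counts.get(fam, 0) + 1

    core = {fam for fam, c in counts.items() if c == n}
    # Assumption: accessory = present in more than one genome but not in all.
    accessory = {fam for fam, c in counts.items() if 1 < c < n}
    unique = {
        name: {fam for fam in families if counts[fam] == 1}
        for name, families in genomes.items()
    }
    # Exclusively absent: found in every genome except this one.
    exclusively_absent = {
        name: {fam for fam, c in counts.items() if c == n - 1 and fam not in families}
        for name, families in genomes.items()
    }
    return core, accessory, unique, exclusively_absent

# Hypothetical toy data: "S" plays the role of a divergent study isolate.
toy = {
    "A": {"f1", "f2", "f3"},
    "B": {"f1", "f2", "f3"},
    "C": {"f1", "f2", "f3", "f5"},
    "S": {"f1", "f4"},
}
core, accessory, unique, exclusively_absent = classify_pan_genome(toy)

# Re-running the analysis without "S" shows how its exclusively absent
# families move into the core genome of the remaining genomes.
core_without_s = classify_pan_genome({k: v for k, v in toy.items() if k != "S"})[0]
```

In this toy case the core of the remaining genomes is the original core plus the families that were exclusively absent from "S", which is the same effect described for the study isolate.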
Core genome based phylogenetic analysis. A phylogenetic tree was constructed to infer the relationship among these genomes using the concatenated sequences of the 1,404 core proteins identified from the pan/core genome analysis. The tree separated the 26 E. anophelis genomes into two distinct branches. In the lower branch, the genomes of 502, B2D and Endophthalmitis were found to be closely related. The genomes of E. anophelis from Singapore clustered together (NUH6, NUH11, NUH1, NUHP1, NUHP2, NUHP3), with the exception of NUH4. Further, the genomes of the four Wisconsin outbreak isolates CSID300521207, CSID3015183678, CSID3015183681 and CSID3015183684 were found to cluster together 13 . The other branch was found to have two sub clusters. The 12012-2PRCM, As1, Ag1, R26, 0422, PW2806 and PW2809 genomes grouped into one, while the EM361-97, Po0527107, LDVH-AR107, V0378064, FMS007 and NUH4 genomes formed the second sub cluster (Fig. 4). The genome of the study isolate, placed in the lower branch, is phylogenetically closest to strains B2D and 502. However, whereas the study isolate was associated with endophthalmitis, strains B2D and 502 were isolated from dental plaque and a traumatic wound respectively.
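Concatenating core proteins means joining the per-family alignments end to end so that every genome contributes one long sequence to the tree. A minimal sketch of that step follows; it assumes each core family has already been aligned, and the function name, genome labels and sequences are hypothetical rather than taken from the pipeline used in the study.

```python
from typing import Dict, List

def concatenate_alignments(alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-family alignments into one concatenated sequence per genome."""
    if not alignments:
        raise ValueError("at least one alignment is required")
    genomes = sorted(alignments[0])
    parts: Dict[str, List[str]] = {g: [] for g in genomes}
    for aln in alignments:
        # A core family must be present in every genome.
        if set(aln) != set(genomes):
            raise ValueError("every core family must contain all genomes")
        # Aligned sequences within a family share one length (gaps included).
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError("sequences within a family must have equal length")
        for g in genomes:
            parts[g].append(aln[g])
    return {g: "".join(chunks) for g, chunks in parts.items()}

# Hypothetical aligned fragments for two core families in three genomes.
family_1 = {"g1": "MKT-", "g2": "MKTA", "g3": "MRTA"}
family_2 = {"g1": "AL", "g2": "AL", "g3": "VL"}
supermatrix = concatenate_alignments([family_1, family_2])
```

The resulting dictionary can be written out as a FASTA file and passed to any tree-building program.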
Resistance to antibiotics and toxic compounds. Screening of all the study genomes revealed the presence of the beta-lactamase genes blaGOB, blaB and blaCME (excepting strain As1) in most of the strains. Further, vancomycin (VanW) resistance was predicted in all 26 strains. A gene encoding a bile salt hydrolase, choloylglycine hydrolase (EC 3.5.1.24), was found in all the analysed genomes except As1. This enzyme has previously been reported to protect Brucella abortus in the host gut from the toxic and antimicrobial activity of bile salts 14 . Genes associated with CzcCBA, a membrane-bound protein complex aiding in heavy metal resistance 15 , were identified in all 26 genomes. Genes coding for proteins conferring resistance to several heavy metals (copper, zinc, cadmium, cobalt) were found in all the study isolates. In addition, genes coding for proteins conferring arsenic resistance were predicted in all the genomes. Further, some of the strains (NUH1, NUH11, NUH4, NUH6, NUHP1, NUHP2 and NUHP3) were found to possess an arsenic resistance operon repressor, suggesting that resistance in these strains is inducible. Several genes encoding multidrug efflux pumps were found in the study genomes, including RND pumps (CmeB in 26 strains, CmeC in 14 strains), a multi antimicrobial extrusion protein (Na(+)/drug antiporter, 26 strains) belonging to the MATE family of MDR efflux pumps and an acriflavine resistance protein (RND efflux pump transporter, 26 strains) 16 (Supplementary Table 2).
Putative virulence and anti-virulence genes. Many genes that may be associated with invasion and intracellular resistance in humans have been identified. Homologs of the gene encoding agmatine deiminase were identified in all 26 isolates. Agmatine deiminase has been reported to aid growth at low pH and biofilm formation and to confer acid tolerance, in addition to being a potential adherence factor in the colonization of the vagina 17 . A putative hemolysin and a hemolysin secretion protein were predicted in all 26 genomes. Hemolysin has been implicated as a virulence factor in several gram-negative and gram-positive pathogens 18 . It has also been reported as a potential ocular virulence factor in Bacillus cereus and Staphylococcus aureus, leading to endophthalmitis 19,20 and keratitis 21 respectively. However, among the 26 genomes included in this study, arylsulfatase was identified only in the current study isolate. Arylsulfatase has previously been implicated in E. coli infection of the brain microvascular endothelial cells (BMEC) of the host. The presence of arylsulfatase may contribute to the ability of the pathogen to cross the blood-brain barrier, leading to meningitis 17
Table 3). These genes are considered to have potential anti-virulence function, as their activation was reported to inhibit invasion and intracellular spread among Shigella species 23 .
Table 4). A total of 107 Genomic Islands (GIs) were identified in 25 of the study genomes, the exception being B2D. The smallest predicted GI was 8.2 kb in strain EM361-97 and the largest was 31.8 kb in strains NUH1 and NUHP3. NUH1, NUH4 and NUHP2 were found to harbor 8 GIs. The highest number of coding sequences (66) was found in the GI (region III) of strain 422. Several virulence factors, antibiotic resistance genes, pathogenicity islands, insertion sequences, prophage-related genes and genes of the secretion systems were identified in these GIs (Supplementary Table 5). Further, remnants of several Integrative and Conjugative Elements (ICEs) were found in many of the study genomes.

Analysis of Mobile Genetic Elements (MGEs)
Type VI and Type IV Secretion systems. T6SS plays an important role in bacterial pathogenesis by allowing the transport of virulence factors, targeting host cells and helping bacteria compete with others in their niche 24 . These systems are widely distributed in genomes of the phylum Bacteroidetes, of which Flavobacteriaceae is a family. More specifically, T6SS iii has been reported to be prevalent among members of the Flavobacteriaceae 25 . Consistent with earlier reports 25,26 , several of the study genomes were found to possess T6SS iii . Genes annotated as TssN, TssO and TssP, proteins unique to T6SS iii , along with genes coding for other core components such as TssB and TssC and the extracellular components VgrG and Hcp, were identified in twenty-four genomes. Only strains NUH6 and As1 did not appear to harbor any genes coding for T6SS components (Supplementary Table 6).
T4SS are established components of bacterial conjugation and virulence. T4SS genes are also acquired as part of Integrative and Conjugative Elements (ICEs). Our analysis revealed that 23 of the genomes possess genes associated with T4SS, whereas these genes are absent in strains As1, B2D and R26 (Supplementary Table 7).
Defence and repair systems. Bacteria employ a host of mechanisms to protect themselves from invading genetic elements. These include Restriction Modification systems (RMs), which are considered to be innate immune systems, and Clustered Regularly Interspaced Short Palindromic Repeat sequences (CRISPRs), which are considered to be adaptive immune systems 27 . Our analysis revealed the presence of Type I and Type II RM systems. Of all the genomes analyzed in the study, E. anophelis endophthalmitis was found to possess the highest number of RM system-associated genes. Its genome was found to possess 12 RM genes, while strains 502 and B2D did not possess any genes coding for RM systems (Supplementary Table 8). Analysis for CRISPRs indicated that only four of the study genomes (FMS007, LDVH-AR107, Po0527107 and V0378064) possess confirmed CRISPRs (Supplementary Table 9). Analysis for the presence of anti-restriction systems 28 led to the detection of the anti-restriction gene ArdA in 19 of the study genomes. The ArdA protein has been reported to help MGEs evade Type I RM systems and to augment the spread of resistance determinants 29 (Supplementary Table 10).
CG analysis indicated that most of the DNA repair pathways are represented in all the study genomes. Most of the repair pathways appear to be intact. In the majority of the genomes, the exception being those of the Wisconsin strains, the mutY (adenine DNA glycosylase) gene was not disrupted. In the genomes of the four Wisconsin strains, the 1,029 bp mutY gene was found to be disrupted by the insertion of a 62,212 bp ICEEa1 (Supplementary Fig. 1) 13 . A total of 32 protein coding genes (excluding hypothetical genes) involved in transposition, excision of the conjugative transposon, heavy metal resistance and tetracycline resistance were found inside the ICE.

E. anophelis endophthalmitis is naturally competent. Although a number of Elizabethkingia strains have been identified and characterized, natural transformation has not yet been reported in any of them. Given the presence of a considerable number of GIs and other gene clusters possibly acquired through HGT in the strain, we investigated the capability of the organism to carry out HGT. In the absence of any well characterized bacteriophages for this group of bacteria, we resorted to studying natural transformation.
Natural transformation was observed after exposing exponentially growing cells to plasmid DNA. PCR analysis for the presence of the BDNF gene in plasmids isolated from the transformants confirmed that E. anophelis is naturally competent (Fig. 5). However, natural competence was observed only when the OD600 reached 0.84. Genome analysis revealed three genes involved in DNA internalization and transformation: (a) DNA internalization-related competence protein ComEC/Rec2, (b) competence protein F homolog and (c) competence/damage-inducible protein CinA. These genes are present in the majority of the analyzed genomes, indicating that natural transformation may also occur in other Elizabethkingia (Supplementary Table 11).

Table 3. Core, accessory, unique and exclusively absent genes in the 26 E. anophelis genomes after pan-core genome analysis using the BPGA pipeline. Core genes: number (No.) of genes that are shared by all the study genomes. Accessory genes: genes that are not shared by all the genomes. Unique genes: genes that are found exclusively in a particular genome. Exclusively absent genes: genes that are absent from a particular genome but found in all the other genomes.

Discussion
In this manuscript, we have described the uncommon features of a new strain of E. anophelis isolated from a patient with post-operative endophthalmitis. This is the first isolate of the species from a patient suffering from this infection. Initial Vitek-2 analysis of the vitreous fluid of the patient identified the organism as Elizabethkingia meningoseptica 2 . The availability of several Elizabethkingia genomes due to the advent of whole genome sequencing, together with 16S rDNA analysis, led to the unambiguous identification of the pathogen as Elizabethkingia anophelis, a species closely related to E. meningoseptica. Due to the unusual nature of the pathogen, the patient's eye drops and the sinks and cubicle in the ward where the patient stayed were investigated for the presence of the pathogen. However, culture results did not indicate the presence of E. anophelis, ruling out the possibility of nosocomial acquisition. This raised the possibility that alternate modes were employed in the transmission of the bacterium leading to postoperative endophthalmitis. Our findings suggest that E. anophelis is a slow-growing bacterium compared to E. coli and is capable of natural transformation during a narrow window of the exponential phase of growth. CG analysis with 25 other sequenced strains of E. anophelis showed several differences. Pan and core genome analysis revealed that the strain E. anophelis endophthalmitis has undergone massive gene rearrangements, as indicated by the high number of unique and exclusively absent genes predicted in its genome. Exclusion of the E. anophelis endophthalmitis genome from the pan-core genome analysis led to an increase in the core genome size by 37%, indicating the strain's divergence. The presence of a large number of MGEs and horizontally acquired genes would have contributed to genome rearrangement in the organism, causing it to diverge from the other analyzed genomes.
These data along with others described below suggest the possibility of the recent emergence of the strain.
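The core genome figures reported in the Results can be related to the 37% quoted above with a short calculation. This is a sketch of one possible reading and not a statement of how the authors derived the percentage: it assumes the 844 additional genes are expressed as a share of the enlarged 25-genome core, and it also shows the increase relative to the 26-genome core for comparison.

```python
# Values taken from the Results section.
core_26 = 1404   # core genes shared by all 26 genomes
gained = 844     # additional core genes once the study isolate is excluded

core_25 = core_26 + gained               # core genome of the remaining 25 genomes

# Assumed reading: gained genes as a fraction of the 25-genome core.
share_of_core_25 = gained / core_25
# Alternative reading: growth relative to the original 26-genome core.
relative_increase = gained / core_26

print(f"25-genome core: {core_25} genes")
print(f"gained genes as share of 25-genome core: {share_of_core_25:.1%}")
print(f"increase relative to 26-genome core: {relative_increase:.1%}")
```

Under the first reading the result is close to the 37% stated in the text; under the second the relative growth of the core genome is considerably larger.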
Survival in diverse environments such as the mosquito midgut and human tissues requires the bacterium to adapt to each niche. In such a scenario, possession of MGEs would help it adjust and thrive. Analysis for MGEs revealed the presence of GIs, remnants of phage genomes and ICEs in several of the study genomes, indicating that the organisms with incomplete prophages and ICEs have undergone further gene gain or loss, which may be an evolutionary requirement for the pathogen to succeed in diverse ecological niches. Interplay between the bacterial defence systems (RMs, CRISPRs) and anti-RM proteins are known to  Table 12). Taken together, these observations highlight the potential of the organism to be involved in more robust HGT. The presence of T6SS iii and T4SS secretion systems further supports the view that HGT is the norm and not an exception in E. anophelis. Given this scenario, it is not surprising to find several genes associated with antibiotic resistance, efflux pumps, virulence factors and competence. The finding of an exclusive virulence gene, arylsulfatase, in the organism, along with other genes with potential for pathogenesis and disease in humans, may provide further insights into the mechanisms by which the pathogen adapts to thrive in diverse ecological niches.
Notably, the virulence factor arylsulfatase, which has the potential to cause meningitis, was not found in any of the other analysed genomes, including those from the Central African Republic outbreak (Po0527107 and V0378064) associated with neonatal meningitis 8 . This indicates that the current study strain may also have the potential to cause meningitis, and that there may be other meningitis-associated virulence factors that are yet to be identified and characterized.
From the features described, it is evident that E. anophelis endophthalmitis is a novel human pathogen with distinct characteristics. It is tempting to speculate that the organism was transmitted by the mosquito vector to cause an eye infection, a condition hitherto not associated with E. anophelis. Different mosquito species are highly prevalent in India, Africa and elsewhere. Thus, it is not unrealistic to suggest that the pathogen was transmitted to the human host by the mosquito vector, in whose gut E. anophelis was first discovered. We speculate that a mosquito may have bitten the patient near the eye after surgery, thus transmitting the bacterium into the host's ocular system. However, there could be alternate explanations that account for the transmission. Further investigations are needed to confirm the transmission routes, including zoonotic transmission, of this enigmatic pathogen.
Given the enhanced antimicrobial resistance observed in E. anophelis and its capacity to thrive in many ecological habitats, it is critical to implement best healthcare practices when in contact with the pathogen and to initiate appropriate surveillance measures before the pathogen becomes involved in the next outbreak.

incubated for 90 minutes at 30 °C. Untransformed E. anophelis endophthalmitis was found to exhibit resistance to ampicillin (250 µg/mL). Hence, transformation mixtures were plated on LB-amp agar plates at a higher concentration (600 µg/mL) in three dilutions, i.e., 1:2, 1:20 and 1:100. PCR was performed for the presence of the BDNF gene in the plasmids isolated from the transformed colonies using specific primers (forward primer: 5′-GGATCCATGACCATCCTTTTCCTTACTATGG-3′; reverse primer: 5′-AAGCTTCTATCTTCCCCTTTTAATGGTCAGT-3′) in a final volume of 20 μl containing 10 μl of PCR Master Mix (Takara; includes dNTPs, MgCl2, Taq DNA polymerase and PCR buffer), 0.5 μM of forward and reverse primers, template DNA (2 μl) and nuclease-free water (4 μl). The PCR conditions employed for the amplification of the BDNF gene were 94 °C for 2 minutes, followed by 30 cycles of 94 °C for 15 seconds, 55 °C for 30 seconds and 72 °C for 47 seconds, and a final elongation at 72 °C for 7 minutes. PCR amplicons were analysed on a 1.2% agarose gel. E. coli DH5α and untransformed E. anophelis endophthalmitis served as negative controls.
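The primer sequences quoted above can be checked with a few lines of code. The sketch below computes basic properties and looks for restriction sites at the 5′ ends; GGATCC and AAGCTT are the recognition sequences of BamHI and HindIII, but the text does not state that these tails were added for cloning, so that interpretation is an assumption. The helper names are illustrative.

```python
FORWARD = "GGATCCATGACCATCCTTTTCCTTACTATGG"
REVERSE = "AAGCTTCTATCTTCCCCTTTTAATGGTCAGT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")
# Recognition sequences of two common restriction enzymes.
SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def five_prime_site(seq: str):
    """Name of the restriction site at the 5' end, or None if absent."""
    for name, site in SITES.items():
        if seq.startswith(site):
            return name
    return None

for label, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(label, len(primer), "nt",
          f"GC={gc_fraction(primer):.2f}",
          "5' site:", five_prime_site(primer),
          "binds template strand:", reverse_complement(primer))
```

A check of this kind is mainly useful for catching transcription errors in primer sequences, such as stray spaces or missing bases.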

Case presentation and strain characterization.
Growth kinetics. To determine the growth curve of E. anophelis endophthalmitis, a 1% inoculum from an overnight culture was added to 100 ml of fresh LB medium and incubated at 37 °C. Absorbance at 600 nm was measured every 30 minutes. Growth kinetics revealed that the generation time of the isolate is 78 minutes at 37 °C (Supplementary Fig. 2).
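A generation time can be estimated from two absorbance readings taken during exponential growth, using the standard relation g = Δt / log2(OD_end / OD_start). The sketch below illustrates that relation only; the text does not state how the 78-minute value was calculated, and the OD values used here are hypothetical.

```python
import math

def generation_time(od_start: float, od_end: float, elapsed_min: float) -> float:
    """Generation time in minutes from two OD600 readings in exponential phase."""
    if od_start <= 0 or od_end <= od_start:
        raise ValueError("readings must be positive and increasing")
    # Number of doublings that occurred between the two readings.
    doublings = math.log2(od_end / od_start)
    return elapsed_min / doublings

# Hypothetical readings: the culture doubles twice in 156 minutes.
example = generation_time(od_start=0.1, od_end=0.4, elapsed_min=156)
print(f"generation time: {example:.0f} min")
```

In practice the estimate is more robust when it is taken from the slope of log2(OD600) against time over several readings in the exponential phase.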
Genome characterization. Draft genome sequencing and assembly of the E. anophelis endophthalmitis genome was recently reported 12 . The contigs from the draft assembly were subjected to gene prediction using the PRODIGAL tool 31 with default parameters, as recommended. Predicted protein sequences were annotated using BLASTp against the UNIPROT bacterial proteins database with E-value ≤ 0.001, query coverage ≥ 70% and identity ≥ 30%. The E. anophelis NUHP1 genome (gi Number: 675102482) was used to construct the genome map of E. anophelis endophthalmitis. This led to the successful organization of 111 of the 167 contigs. A total of 3,729 ORFs encoding 2302 proteins were predicted from the assembled genome. All available (twenty-five other) E. anophelis genome sequences (as of February 02, 2017) were obtained from the NCBI Genomes database for CG analysis, and RAST (version 2.0) annotation was repeated to obtain unambiguous results (Supplementary Table 1).
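The three annotation thresholds can be applied to tabular BLASTp output as a simple filter. The sketch below assumes the search was written with a custom tabular layout of five columns (query id, subject id, percent identity, E-value, query coverage), as produced by BLAST+ with `-outfmt "6 qseqid sseqid pident evalue qcovs"`; the actual command line is not given in the text, and the identifiers in the example are hypothetical.

```python
import csv
import io
from typing import List, Tuple

def filter_hits(
    tabular_text: str,
    max_evalue: float = 1e-3,
    min_qcov: float = 70.0,
    min_identity: float = 30.0,
) -> List[Tuple[str, str]]:
    """Keep (query, subject) pairs that pass all three thresholds."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        # Assumed column order: qseqid, sseqid, pident, evalue, qcovs.
        qseqid, sseqid, pident, evalue, qcovs = row[:5]
        if (float(evalue) <= max_evalue
                and float(qcovs) >= min_qcov
                and float(pident) >= min_identity):
            kept.append((qseqid, sseqid))
    return kept

# Hypothetical hits: only the first passes all three thresholds.
example_hits = "\n".join([
    "orf1\tP1\t85.0\t1e-50\t95",
    "orf2\tP2\t25.0\t1e-10\t90",   # identity too low
    "orf3\tP3\t45.0\t0.01\t80",    # E-value too high
    "orf4\tP4\t40.0\t1e-5\t60",    # query coverage too low
])
passing = filter_hits(example_hits)
```

Keeping the thresholds as function parameters makes it easy to test how sensitive the annotation is to each cut-off.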
Bacterial Pan Genome Analysis (BPGA) 32 was used, with default parameters, for comprehensive pan/core genome analysis and for functional annotation of the core, accessory and unique genes to Clusters of Orthologous Groups (COG) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Phage genomes were identified by PHASTER 33 . Antibiotic resistance genes were predicted by RAST (version 2.0) 34 , Resfinder (version 3.0) 35 and VRprofile (version 2.0) 36 . Virulence factors, genomic islands, insertion sequences and T6SS were analyzed by VRprofile (version 2.0). T4SS genes were identified by SecReT4 (version 1.0) 37 . ICEberg (version 1.0) 38 was used to screen for ICEs. Restriction Modification (RM) and anti-restriction systems were identified by RAST (version 2.0). CRISPRfinder 39 was used to predict potential CRISPR gene clusters. Unless otherwise mentioned, all of the above analyses were performed using default parameters.