Genomic features, phylogenetic relationships, and comparative genomics of Elizabethkingia anophelis strain EM361-97 isolated in Taiwan

Elizabethkingia anophelis has become an emerging infection in humans. Recent research has shown that previous reports of E. meningoseptica infections might in fact be caused by E. anophelis. We aimed to investigate the genomic features, phylogenetic relationships, and comparative genomics of this emerging pathogen. Elizabethkingia anophelis strain EM361-97 was isolated from the blood of a cancer patient in Taiwan. The total length of the draft genome was 4,084,052 bp. The whole-genome analysis identified the presence of a number of antibiotic resistance genes, which corresponded with the antibiotic susceptibility phenotype of this strain. Based on the average nucleotide identity, the phylogenetic analysis revealed that E. anophelis EM361-97 was a sister group to E. anophelis FMS-007, which was isolated from a patient with T-cell non-Hodgkin’s lymphoma in China. Knowledge of the genomic characteristics and comparative genomics of E. anophelis will provide researchers and clinicians with important information to understand this emerging microorganism.

Isolate of E. anophelis. Elizabethkingia anophelis strain EM361-97 was isolated from the blood of a 46-year-old male patient with advanced nasopharyngeal carcinoma and lung cancer. During admission, the patient suffered from pneumonia, respiratory failure, and profound shock. He initially received empirical antibiotics with levofloxacin. Unfortunately, the patient died several days after this infection. One blood culture from the patient yielded a gram-negative bacillus that was initially identified as E. meningoseptica using API/ID32 GN (bioMérieux S.A., Marcy l'Etoile, France) by the clinical microbiology laboratory. This isolate was named strain EM361-97 and was stored at −80 °C as a glycerol stock for further experiments. We re-identified this isolate as E. anophelis using 16S ribosomal RNA (rRNA) gene sequencing as previously published 15 . The minimum inhibitory concentration (MIC) of this isolate was examined using the broth microdilution method. The susceptibilities were determined according to the interpretive standards for "other non-Enterobacteriaceae" as suggested by the Clinical and Laboratory Standards Institute (CLSI) guidelines 16 . Whole-genome sequencing and genome annotation of E. anophelis EM361-97. The deoxyribonucleic acid (DNA) of this isolate was prepared using a Wizard Genomic DNA Purification Kit according to the manufacturer's instructions (Promega, WI, USA). The genome was sequenced using an Illumina HiSeq. 2000 Sequencing Platform (Illumina, CA, USA). The short reads were assembled and optimized according to paired-end and overlap relationship via mapping reads to contig using SOAP de novo v. 2.04 17 . The assembled genome was then submitted to the NCBI Prokaryotic Genome Annotation Pipeline 18 and the Rapid Annotations based on Subsystem Technology (RAST) Prokaryotic Genome Annotation Server (http://rast.nmpdr.org/) for gene function annotations 19,20 . The graphical map of the circular genome was generated using the CGView Server (http://stothard.afns.ualberta.ca/cgview_server/) 21 . The virulence factors of strain EM361-97 were analysed using the Virulence Factor Database (VFDB, http://www.mgc.ac.cn/VFs/) 22,23 . Antibiotic resistance genes were searched using the Antibiotic Resistance Genes Database BLAST Server (https://ardb.cbcb.umd.edu/) 24 , RAST Server 19,20 , and UniProtKB/Swiss-Prot database via OrthoVenn (http://probes.pw.usda.gov/OrthoVenn/) 25 . Comparative genomic analysis. For comparison, the genome sequences of 34 available, nonduplicated, different genome sequences of E. anophelis in GenBank were downloaded from the National Center for Biotechnology Information (NCBI) genome sequence repository (https://www.ncbi.nlm.nih.gov/genome/). The genome-wide comparison and annotation of clusters of orthologous groups (COGs) were generated using the web server OrthoVenn 25 . The average nucleotide identity (ANI) values between two genome sequences were calculated using the original ANI function of OrthoANI 26 . The heat maps were generated using CIMminer (https:// discover.nci.nih.gov/cimminer/). The in silico DNA-DNA hybridization (DDH)-analogous values between different strains were calculated using the Genome-to-Genome Distance Calculator (GGDC) 2 (http://ggdc.dsmz. de/distcalc2.php) 27 . A 70% similarity of in silico DDH value represents the cut-off value for species boundaries. The phylogenetic tree was constructed using CIMminer (https://discover.nci.nih.gov/cimminer/) based on ANI values. Data Availability. The names of organisms, strains, biosample numbers, bioproject numbers, assembly numbers, isolated origins, and release dates of bacteria used in this study are shown in Supplementary Table S1.

Results and Discussion
General genome description of E. anophelis EM361-97. The statistics of assembly and annotation are shown in Table 1. The total length of the draft genome was 4,084,052 bp, with a mean GC content of 35.7%. This genome contained 3,774 genes that made up 87.9% of genome. The genomic features of E. anophelis EM361-97 are shown in Fig. 1. The number of tandem repeat sequence was 108. The assembly contained 18 scaffolds, 27 contigs, 3,743 coding sequences (CDSs), 53 minisatellite DNAs, 26 microsatellite DNAs, 51 transfer RNAs (tRNAs), and 15 rRNAs (Fig. 1A).
The genomic features of microorganism could be investigated according to the subsystem, a cluster of genes that function with a specific biological process or structural complex 19,20 . The genome of E. anophelis strain EM361-97 analysed by the RAST Server revealed 356 subsystems that could be classified into 27 categories (Fig. 1B). Among these, the "amino acid and derivatives" subsystem accounted for the largest number of 319 CDSs, followed by carbohydrate metabolism (268 CDSs), protein metabolism (220 CDSs), and RNA metabolism (121 CDSs). Regarding the 88 CDSs in the "virulence, disease, and defense" subsystem, 12 were related to invasion and intracellular resistance, and 76 were associated with resistance to antibiotics and toxic compounds. The high number of antibiotic resistance-associated CDSs suggests that E. anophelis EM361-97 might be resistant to multiple antibiotics.
Orthologous genes. Orthologous genes are clusters of genes in different species that have evolved by vertical descent from a single ancestral gene. A genome-wide comparison of orthologous clusters in different isolates provides insight into the gene structure, gene function, and molecular evolution of genomes 25 . The COGs analysis of strain EM361-97 was compared with the other four genomes isolated from the USA (strains CSID_3015183681 and 3375), Africa (strain V0378064 [E18064]), and Singapore (strain NUHP1) (Fig. 2). The analysis shows that E. anophelis EM361-97 contained 3,611 proteins, 3,324 COGs, and 234 singletons. Among the 3,324 COGs in strain EM361-97, 2,988 COGs were shared by all five strains, and 11 COGs were only present in the strain EM361-97 genome. The unique COGs existing in EM361-97 involved genes functioning with transferase activity, cofactor binding, oxidoreductase activity, nucleotide binding, fatty acid elongation, and 3-oxoacyl-[acyl-carrier-protein] reductase (NADPH) activity. The representative meanings of these singular genes in E. anophelis EM361-97 are not clear. Further investigations to understand the features of these unique genes in E. anophelis EM361-97 are warranted.    Fig. 3A. The genome of E. anophelis EM361-97 was apparently closer to that of E. anophelis R26 T than the other Elizabethkingia species.
The evolutionary relatedness among these strains was measured by in silico DDH-analogous values (Fig. 3B) 28 . However, the strain of E. endophytica was re-identified as an additional strain of E. anophelis based on in silico DDH of whole-genome sequencing (77% DDH value with regard to E. anophelis strain R26 T ) 29 . Recently, Nicholson et al. 30 proposed three novel Elizabethkingia species, Elizabethkingia bruuniana sp. nov., Elizabethkingia ursingii sp. nov., and Elizabethkingia occulta sp. nov. Our study showed that strain EM361-97 anophelis based on ANI is shown in Fig. 4. The phylogenetic analysis revealed that E. anophelis EM361-97 was a sister group to E. anophelis FMS-007, which was isolated from a patient with T-cell non-Hodgkin's lymphoma in China. The sister group of E. anophelis strains EM361-97 and FMS-007 was a clade sister of strains Po0527107 (E27017) and V0378064 (E18064) isolated from two neonates with meningitis in the Central African Republic 7 . The seven strains isolated from Singapore were divided into two clusters (NUHP1, NUHP2, NUHP3, NUH1, NUH4; and NUH6, NUH11). The 13 strains isolated from the USA clustered in four groups, and the four strains that caused the outbreak of E. anophelis infection in Wisconsin (strains CSID_3015183678, CSID_3015183681, CSID_3015183684, CSID_3000521207) were in the same clade.
Virulence factors. Elizabethkingia anophelis infections in humans have shown a mortality rate of 24% to 60% 5,6,11 , and this high mortality rate may be in part correlated with the virulence of this pathogen and also the preexisting conditions of the patients (e.g., old age, neonates, and immunosuppression). In this study, homologs of 25 virulence factors were identified in E. anophelis EM361-97 using VFDB 22,23 (Supplementary Table S2). These virulence factors included products of the capsule, lipopolysaccharide, endopeptidase, lipid biosynthesis and metabolism, magnesium transport protein, macrophage infectivity, heat shock protein, catalase, peroxidase, superoxide dismutase, two-component regulatory system, and others.
According to the VFDB classification scheme, virulence factors are divided into offensive, defensive, nonspecific, and virulence-associated regulatory genes 22 . In our study, 13 of 25 pathogen-associated virulence factors homologs were identified to play offensive functional roles, eight were associated with defensive functions, three were nonspecific virulence factors, and one was related to regulation of virulence-associated genes. In strains Po0527107 (E27017) and V0378064 (E18064), Breurec et al. 7 identified several offensive virulence factors that were found in strain EM361-97, including clpC, kdtB, pilR, sodB, galE, bplC, katA, clpP, fleQ, and htpB. These virulence factors were also detected in the Wisconsin strains 12 .
Pathogenic genomes were identified to have more offensive virulence factors, such as toxin and type III/IV secretion systems, than non-pathogenic genomes. In contrast, defensive, nonspecific, and regulatory virulence factors, such as iron uptake, motility, and antiphagocytosis, were found more frequently in non-pathogenic genomes than in pathogenic genomes 31 . Ho Sui et al. 32 carried out a large-scale study to analyse the virulence factors of multiple bacteria and found over-presentation of offensive virulence factors, such as type III/IV secretion systems or toxins, within genomic islands of invasive pathogens. The manifestation of many offensive virulence factors in E. anophelis suggests this microorganism may severely damage the host. However, this hypothesis lacks validity. More experiments are warranted to test the hypothesis of offensive virulence factors in E. anophelis.
Antimicrobial resistance and associated genes of E. anophelis EM361-97. The MIC and susceptibility of E. anophelis EM361-97 are shown in Table 2. This isolate was only susceptible to piperacillin-tazobactam and minocycline. The MIC of tigecycline was 2 mg/L. However, there are no interpretive criteria of the susceptibility for E. anophelis to tigecycline in the CLSI 16 and European Committee on Antimicrobial Susceptibility Testing 33 .
Little information is known about the antimicrobial susceptibility of E. anophelis. Han et al. 34 reported the susceptibilities of 51 E. anophelis isolates from South Korea. The susceptibility rates to piperacillin-tazobactam, piperacillin, levofloxacin, ciprofloxacin, gentamicin, and trimethoprim-sulfamethoxazole were 92%, 82%, 29%, 22%, 22%, and 22%, respectively. All the isolates were resistant to ceftazidime and imipenem. However, the MICs of minocycline and tigecycline were not examined in that study 34 . Perrin et al. 12 used the disk diffusion method to examine antimicrobial susceptibilities of 29 E. anophelis isolates in the Wisconsin outbreak. Most of these isolates were resistant to ceftazidime, imipenem, amikacin, tobramycin, gentamicin, but susceptible to cefepime, piperacillin, piperacillin-tazobactam, ciprofloxacin, and levofloxacin. Minocycline was also not tested in the study of Perrin et al. 12 . The antibiogram of isolates in the Wisconsin outbreak was different from that of isolates in Singapore by macrolides and isepamycin 10,35 .
Gene functions annotated using the RAST/SEED Server recognised 76 genes of E. anophelis EM361-97 that were related to antibiotic resistance, including 12 for β-lactamase resistance, one for vancomycin resistance (vanW), four for fluoroquinolone resistance (parC, parE, gyrA, gyrB), nine for the membrane component of the tripartite multidrug resistance system, and 16 for multidrug resistance efflux pumps (six CmeB, one TolC, two MATE family efflux pumps, five OML, and two AcrB) ( Table 2). The protein function annotations based on UniProtKB/Swiss-Prot demonstrated a number of proteins that played the role of antibiotic resistance, including multidrug resistance proteins (MdtA, MdtB, MdtC, MdtD, MdtE, MdtK, MdtL), probable multidrug resistance protein EmrK, multidrug export protein EmrA, macrolide export protein MacA, macrolide export ATP-binding/ permease protein MacB, multidrug resistance outer membrane protein MdtQ, outer membrane efflux protein BepC, carbapenem antibiotics biosynthesis protein CarD, β-lactamase (BRO-1, 2), multidrug efflux pump subunit AcA, lincomycin resistance protein, DNA gyrase subunit A and subunit B, erythromycin resistance ATP-binding protein MsrA, and vancomycin B-type resistance protein VanW (Table 2). A replacement of serine by isoleucine at position 83 of DNA gyrase subunit A (Ser83Ile; AGC → ATC) was identified in E. anophelis strain EM361-97. Perrin et al. 12 also found the same mutation of DNA gyrase subunit A in the Wisconsin outbreak strain CSID_3000521792. These findings are in agreement with the resistance of these two strains to fluoroquinolones.   Table 2. The minimum inhibitory concentration, susceptibility, and genes associated with antibiotic resistance in E. anophelis EM361-97. MIC, minimum inhibitory concentration. † Associated with multidrug resistance: membrane component of tripartite multidrug resistance system, multidrug resistance efflux pumps (CmeB, TolC, MATE family efflux pump, OML, AcrA, AcrB), outer membrane efflux protein BepC, multidrug resistance proteins (MdtA, MdtB, MdtC, MdtD, MdtE, MdtK, MdtL), multidrug resistance protein EmrK, multidrug export protein EmrA, multidrug resistance outer membrane protein MdtQ, ABC transporter, MFS transporter, transcription-repair coupling factor, acriflavin resistance protein, and isoleucine-tRNA ligase. * Susceptibility was determined according to the interpretive standards for other non-Enterobacteriaceae of CLSI.