Genome insight and description of antibiotic producing Massilia antibiotica sp. nov., isolated from oil-contaminated soil

An ivory-coloured, motile, Gram-stain-negative bacterium, designated TW-1T, was isolated from oil-contaminated experimental soil at Kyonggi University. Phylogenetic analysis based on the 16S rRNA gene sequence revealed that strain TW-1T formed a lineage within the family Oxalobacteraceae and clustered with members of the genus Massilia. The closest members were M. pinisoli T33T (98.8% sequence similarity), M. putida 6NM-7T (98.6%), M. arvi THG-RS2OT (98.5%), M. phosphatilytica 12-OD1T (98.3%) and M. niastensis 5516S-1T (98.2%). The sole respiratory quinone is ubiquinone-8. The major cellular fatty acids are hexadecanoic acid, cis-9,10-methylenehexadecanoic acid, summed feature 3 and summed feature 8. The major polar lipids are phosphatidylethanolamine, diphosphatidylglycerol and phosphatidylglycerol. The DNA G + C content of the type strain is 66.3%. The average nucleotide identity (ANI) and in silico DNA–DNA hybridization (dDDH) relatedness values between strain TW-1T and its closest relatives were below the threshold values for species demarcation. The genome is 7,051,197 bp in size, with 46 contigs and 5,977 protein-coding genes. The genome contained 5 putative biosynthetic gene clusters (BGCs) responsible for different secondary metabolites. Cluster 2 was a thiopeptide BGC with no hit in the known-cluster BLAST comparison, indicating that TW-1T might produce a novel antimicrobial agent. The antimicrobial assessment also showed that strain TW-1T possessed inhibitory activity against Gram-negative pathogens (Escherichia coli and Pseudomonas aeruginosa). This is the first report of a species in the genus Massilia that produces antimicrobial compounds. Based on the polyphasic study, strain TW-1T represents a novel species in the genus Massilia, for which the name Massilia antibiotica sp. nov. is proposed. The type strain is TW-1T (= KACC 21627T = NBRC 114363T).

Antimicrobial resistance (AMR) is a major global public health threat 1 . The continuously rising number of multidrug-resistant (MDR) strains hampers the effective treatment of bacterial infections 2 . Infections caused by MDR strains are extremely difficult to treat and may require last-resort antibiotics 1,3 . Moreover, bacteria have developed resistance to all antibiotics discovered to date 1 , and no novel antibiotics have been reported for a long time. A review by Jim O'Neill estimated 700,000 deaths annually due to MDR bacterial infections 4 . Furthermore, a study by Naylor et al. estimated healthcare system costs of more than $90 million per year globally 5 . These consequences show that AMR is not only a public health problem but also an economic burden. Hence, the search for new and potent antibiotics for the treatment of MDR bacterial infections is urgently required.
We are continuously exploring previously uncultivated bacteria in the hope that they might produce new bioactive molecules with pharmaceutical applications and might bioremediate recalcitrant hydrocarbons. While searching for oil-degrading bacteria, we unexpectedly isolated a novel candidate of the genus Massilia that produces an antimicrobial agent inhibiting the growth of Pseudomonas aeruginosa and Escherichia coli. Bioactive molecules (antimicrobial agents) from Gram-negative bacteria are scarcely reported. In contrast, almost all antibiotics have been reported from Gram-positive bacteria such as Streptomyces 6 . In this context, the report of this novel strain with antimicrobial activity is valuable.
In this study, we describe the phylogenetic and taxonomic position of a novel member of the genus Massilia, isolated from oil-contaminated experimental soil, that has promising antimicrobial activity against the Gram-negative pathogens E. coli and P. aeruginosa. In addition, the whole genome of strain TW-1 T was analysed to provide deeper insights into its metabolic products.

Materials and methods
Isolation and preservation. Strain TW-1 T was isolated unexpectedly during a bioremediation experiment on oil-contaminated natural soil. The oil-contaminated soil was collected from an industrial site located near Jeonju City, Republic of Korea. Isolation, maintenance and preservation of strain TW-1 T were carried out as described in a previous study 23 .
Phylogenetic analysis. Genomic DNA of strain TW-1 T was isolated using the InstaGene Matrix kit (Life Science Research; Bio-Rad) following the manufacturer's instructions. The 16S rRNA gene was amplified by PCR (Bio-Rad) with the forward and reverse primers 27F and 1492R, respectively 24 . An Applied Biosystems 3770XL DNA analyzer was used with a BigDye Terminator Cycle Sequencing Kit v.3.1 (Applied Biosystems, USA) for gene sequencing. After sequencing, the nearly complete 16S rRNA gene sequence was assembled using SeqMan software (DNASTAR Inc., USA). Phylogenetically closest neighbours were identified using the EzBioCloud server 25 and NCBI megablast. The 16S rRNA gene sequences of all phylogenetically closest neighbours were retrieved from the NCBI GenBank database. The retrieved sequences, along with that of TW-1 T , were aligned in silico using the SILVA aligner (https://www.arb-silva.de/aligner/). Neighbor-joining (NJ), maximum-likelihood (ML) and maximum-parsimony (MP) phylogenetic trees were reconstructed using MEGA (v7.0.26) software 26 .
Genome analyses. For genome sequencing, genomic DNA was extracted using DNeasy Blood and Tissue kits (Qiagen). Whole-genome shotgun sequencing of strain TW-1 T was performed at Macrogen (Republic of Korea) on the Illumina HiSeq 2500 platform using a 150-bp × 2 paired-end kit. The whole-genome sequences were assembled with SPAdes (v3.2) 27 . The authenticity of the assembled genome was checked by comparing the 16S rRNA gene sequence of strain TW-1 T using the NCBI Basic Local Alignment Search Tool (blastn) 28 . 
Potential contamination of the genome assembly was examined in silico with the ContEst16S algorithm on the EzBioCloud server (https://www.ezbiocloud.net/tools/contest16s) 29 . The whole-genome sequence of strain TW-1 T was then annotated using the NCBI PGAP (Prokaryotic Genome Annotation Pipeline; https://www.ncbi.nlm.nih.gov/genome/annotation_prok) 30 and the RAST (Rapid Annotation using Subsystem Technology; https://rast.nmpdr.org) server 31 . The genome sequences of all reference strains were retrieved from the NCBI database. The DNA G + C contents of strain TW-1 T and the reference strains used in this study were calculated from their respective whole-genome sequences. Genome-based relatedness between TW-1 T and its phylogenetically closest neighbours was determined from the ANI (Average Nucleotide Identity) calculated in silico with the OrthoANIu algorithm (https://www.ezbiocloud.net/tools/ani) 32 . The phylogenomic tree was reconstructed in silico from a concatenated alignment of 92 core genes with the UBCGs software 33 . Digital DNA-DNA hybridization (dDDH) was calculated in silico with the Genome-to-Genome Distance Calculator (GGDC 2.1) using the blast method 34 . Conventional DNA-DNA hybridization (DDH) was measured fluorometrically using photobiotin-labelled DNA probes and microdilution plates as recommended by Ezaki et al. 35 . A graphical circular map was constructed using the CGView (http://cgview.ca) server 36 . Transfer RNAs (tRNAs) and ribosomal RNAs (rRNAs) were analysed using the tRNAscan-SE (http://lowelab.ucsc.edu/tRNAscan-SE) 37 and RNAmmer (http://www.cbs.dtu.dk/services/RNAmmer) 38 servers. CRISPR genes and Cas clusters were determined in silico using the CRISPRCasFinder (https://crisprcas.i2bc.paris-saclay.fr) server. The antiSMASH server was used to identify the biosynthetic gene clusters (BGCs) for various secondary metabolites 39 . 
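The statement that G + C content was calculated from whole-genome sequences, and the assembly statistics reported for the draft genome, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (parse_fasta, gc_content, n50) and the toy contigs are hypothetical, and the sketch assumes a plain multi-FASTA assembly in which ambiguous bases such as N are excluded from the G + C denominator.

```python
from typing import Dict, Iterable


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {contig_name: uppercase sequence}."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            # Fall back to a positional name if the header is empty.
            name = parts[0] if parts else f"record{len(records) + 1}"
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def gc_content(seqs: Iterable[str]) -> float:
    """G + C percentage over all unambiguous bases (A, C, G, T)."""
    seqs = list(seqs)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    acgt = sum(s.count(base) for s in seqs for base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


def n50(lengths: Iterable[int]) -> int:
    """Length of the contig at which the cumulative sum of lengths,
    taken from longest to shortest, first reaches half the assembly."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Toy contigs for illustration only; not data from strain TW-1 T.
    demo = ">contig1\nGGCCGGCCAT\n>contig2\nATGC\n>contig3\nGC\n"
    contigs = parse_fasta(demo)
    print(f"contigs: {len(contigs)}")
    print(f"G+C content: {gc_content(contigs.values()):.1f}%")
    print(f"N50: {n50(len(s) for s in contigs.values())} bp")
```

Applied to a real draft assembly, the same three functions would take the contig FASTA file as input; dedicated assembly-assessment tools report these values as well and are preferable for production use.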
The Clusters of Orthologous Group (COG) functional categories were assigned by searching against the KEGG (Kyoto Encyclopedia of Genes and Genomes) database 40 .
Physiological analyses. The cell morphology of strain TW-1 T , grown on R2A agar plates for 4-5 days at 28 °C, was observed by TEM (transmission electron microscopy; Talos L120C; FEI). The colony morphology of strain TW-1 T was examined using a Zoom Stereo Microscope (SZ61; Olympus, Japan). Gram staining was performed as described previously 41 . Growth at different temperatures (0-45 °C) on R2A agar plates was monitored for 10 days. Growth was observed on various media including brain heart infusion agar (BHI; Oxoid), Luria-Bertani agar (LBA; Oxoid), marine agar 2216 (Becton), nutrient agar (NA; Oxoid), R2A agar, sorbitol MacConkey agar (MA; Oxoid), tryptone soya agar (TSA; Oxoid), and veal infusion agar (VIA; Becton). DNase activity of strain TW-1 T was examined using DNase agar (Oxoid). Salt tolerance was checked in R2A broth supplemented with NaCl [Duksan Chemicals, Republic of Korea; 0-5% (w/v) at 0.5% intervals]. The pH range for growth was determined at 28 °C in R2A broth (pH 4-12 in increments of 0.5 pH units). Testing of pH after sterilization showed only minor changes. To determine the optimum temperature, pH and NaCl concentration, growth curves were obtained by measuring absorbance at 600 nm using a spectrophotometer (Biochrome Libra S4). Hydrolysis of Tweens 80, 60 and 40 was analysed as described by Smibert & Krieg 42 . Anaerobic growth of strain TW-1 T was observed for 10 days on R2A agar at 28 °C with the BD GasPak™ EZ Gas Generating Pouch System (BD). Hydrolysis of casein, CM-cellulose, starch and tyrosine was assessed as described in a previous study 43 . Production of H2S and indole was checked in SIM (sulfide indole motility medium; Oxoid). Malachite green was used for spore staining. Other physiological characteristics were examined using API 20NE and API ID 32GN kits (bioMérieux). 
The enzyme activities of strain TW-1 T and other references were examined by using an API ZYM kit (bioMérieux) following the manufacturer's instructions. All the biochemical tests including API were performed in duplicate.
For the determination of fatty acids, cells of the reference strains and TW-1 T were harvested from identical culture conditions (28 °C for 4 days). Fatty acid methyl esters (FAMEs) were extracted from the harvested cells using MIDI protocol technical note #101 (http://midi-inc.com/pdf/MIS_Technote_101.pdf). Extracted FAMEs were analysed using an HP 6890 Series GC System (gas chromatograph; Hewlett Packard; Agilent Technologies), and the FAME compositions (percentages of totals) were identified with the TSBA6 database of the Microbial Identification System 44 . The polar lipids and isoprenoid quinones were extracted from freeze-dried cells following the protocol of Minnikin et al. 45 . Isoprenoid quinones were analysed by HPLC (Agilent 1200 series) under the following conditions: solvent system, acetonitrile:isopropanol (65:35); flow rate, 1.2 mL/min; detection, 270 nm; run time, 20 min; and injection volume, 20 µL. Appropriate reagents for spot detection were used as given by Komagata and Suzuki 46 .
Antimicrobial activities of strain TW-1 T . Antimicrobial activities of strain TW-1 T against E. coli KACC 10185 and P. aeruginosa KEMB 121-234 were examined by the disc-diffusion and spotting methods. Screening against these Gram-negative pathogens was done by spotting colonies of strain TW-1 T on R2A agar plates and incubating at 28 °C for 48 h. For the disc-diffusion test, a crude extract was prepared from the culture supernatant of strain TW-1 T . Strain TW-1 T was cultured in 300 mL of R2A broth at 28 °C (180 rpm for 5 days) in a 500 mL Erlenmeyer flask. The culture supernatant was extracted twice with an equal volume of ethyl acetate, at pH 2.0 and pH 10, respectively 47 . The organic layer collected from the extraction was completely evaporated in a rotary evaporator (Eyela), and the remaining residue of crude product was dissolved in 500 µL of methanol. 
Then, 15 µL of crude product was applied to a paper disc (6 mm, Whatman), and antimicrobial activities against P. aeruginosa and E. coli KEMB were checked by measuring the inhibition zones. Trimethoprim/sulfamethoxazole (15 µg) and methanol alone (15 µL) were used as positive and negative controls, respectively.
Ethics approval. This study does not describe any experimental work involving humans.
Genomic analysis. The ContEst16S analysis showed that the genome belonged to strain TW-1 T and was not contaminated. The whole-genome shotgun sequence has been deposited at DDBJ/ENA/GenBank under the accession JAAQOM000000000. The genome size and N50 value of strain TW-1 T are 7,051,197 bp and 309,394 bp, respectively. The genome has 46 contigs and a coverage of 89.0 × (Table S1). The graphical genomic map revealed the presence of 12 rRNAs (Fig. 2). The DNA G + C content of strain TW-1 T is 66.3%, which is within the range for Massilia species 8 . The recommended ANI threshold for species delineation is 95-96% 48 , and the ANIu values between strain TW-1 T and its phylogenetically closest neighbours are ≤ 87.8% (Table 1). The dDDH values of ≤ 34.4% are much lower than the threshold of 70% recommended for species demarcation 34 (Table 1). Moreover, the DDH relatedness values between strain TW-1 T and M. pinisoli KACC 18748 T and M. arvi KACC 21416 T were 40.2 ± 2.6 and 24.1 ± 2.1%, respectively. These data clearly show that strain TW-1 T represents a novel member of the genus Massilia 34 . Furthermore, the phylogenomic tree constructed using UBCGs (concatenated alignment of 92 core genes) also supported strain TW-1 T as a novel member of the genus Massilia (Fig. 3).
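The species-delineation reasoning above amounts to comparing each relatedness value with its cut-off. The following sketch restates that comparison using only the values reported in this section; the class and function names are hypothetical, the lower bound (95%) of the 95-96% ANI range is used as the cut-off, and applying the 70% cut-off to conventional DDH as well as dDDH is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Relatedness:
    """One genome relatedness measurement against a closest relative."""
    metric: str
    value: float      # highest reported value, in percent
    threshold: float  # species-level cut-off, in percent

    def below_threshold(self) -> bool:
        return self.value < self.threshold


def supports_novel_species(measurements: Sequence[Relatedness]) -> bool:
    """True only if every measurement falls below its species cut-off."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    return all(m.below_threshold() for m in measurements)


# Values taken from the text: ANIu <= 87.8%, dDDH <= 34.4%,
# DDH 40.2% (M. pinisoli) and 24.1% (M. arvi).
# Assumptions: 95.0 is the lower bound of the 95-96% ANI range;
# 70.0 is applied to both dDDH and conventional DDH.
REPORTED = [
    Relatedness("ANIu (maximum)", 87.8, 95.0),
    Relatedness("dDDH (maximum)", 34.4, 70.0),
    Relatedness("DDH vs M. pinisoli KACC 18748", 40.2, 70.0),
    Relatedness("DDH vs M. arvi KACC 21416", 24.1, 70.0),
]

if __name__ == "__main__":
    for m in REPORTED:
        status = "below" if m.below_threshold() else "at or above"
        print(f"{m.metric}: {m.value}% is {status} the {m.threshold}% cut-off")
    print("All values below cut-offs:", supports_novel_species(REPORTED))
```

Requiring every metric to fall below its cut-off is the conservative choice here; a single value at or above a threshold would call for closer comparison with that relative.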

Results and discussion
The RAST analysis revealed the presence of 339 subsystems and 4 secondary metabolism features, all four assigned to auxin biosynthesis (Fig. S3). The COG functional classification of proteins showed that the highest and lowest numbers of genes were assigned to unknown function (1347) and extracellular structures (1), respectively (Fig. 4). The antiSMASH analysis revealed that the genome of strain TW-1 T contains 5 putative BGCs (terpene, siderophore, bacteriocin, acyl_amino_acid and thiopeptide) ( Table 2). The core and additional biosynthetic genes were predicted, along with flanking gene similarities, for the secondary metabolite biosynthetic gene clusters (smBGCs) ( Table 2). The thiopeptide BGC showed only 33% gene similarity with a region of Massilia putida (NZ_CP019038; 4,695,767-4,724,913) and matched no known cluster. Additionally, each gene cluster identified by antiSMASH was compared against the NCBI database using protein-protein blast (BlastP) (Fig. S4). The core biosynthetic genes in the thiopeptide BGC were identified as encoding a nuclear transport factor 2 family protein and OsmC domain/YcaO domain-containing proteins (Fig. S4). As no known cluster was predicted for the thiopeptide cluster, strain TW-1 T might produce unique natural products (Fig. S4). In addition, the genome contained three antibiotic biosynthesis monooxygenase (ABM) genes (WP_166855736, WP_166862851 and WP_166864744) that are possibly involved in antibiotic biosynthesis. Although strain TW-1 T was isolated at 28 °C, it could grow well at 4 °C. Genome mining of strain TW-1 T revealed genes (CspA, CspC) encoding cold shock proteins and cold-shock domain-containing proteins (WP_036166698, WP_166858078, WP_166858204, WP_056448893 and WP_03616538). These proteins help the organism adapt to cold temperatures. The genome contained the arsenic resistance gene arsH (WP_166857685) and a chromate resistance gene (WP_166864147), suggesting that the strain could tolerate arsenic and chromate. 
The genome contained various protease genes, such as those encoding rhomboid family intramembrane serine protease (WP_166860501), site-2 protease (S2P) family protein (WP_166861383), ATP-dependent Clp protease ATP-binding subunit ClpX (WP_166861695), ATP-dependent protease ATPase subunit HslU (WP_166858142), ATP-dependent protease subunit HslV (WP_166857747), trypsin-like serine protease (WP_166857912, WP_166858974), protease HtpX (WP_166861379, WP_166859769), DJ-1/PfpI/YhbO family deglycase/protease (WP_166859796, WP_166864477, WP_166859797), ATP-dependent Clp protease adapter ClpS (WP_027864722), FtsH protease activity modulator HflK (WP_166865455), and protease modulator HflC (WP_166865458). The presence of these protease-encoding genes indicates the potential industrial and medical significance of strain TW-1 T . Bacterial proteases are widely used in industry for various enzymatic activities, and these enzymes are currently also regarded as valuable antimicrobial drug targets 49,50 .
The indole test is negative and strain TW-1 T is non-spore-forming. Strain TW-1 T hydrolysed DNA, CM-cellulose, casein, aesculin, starch, and Tweens 40 and 60. Chitin was not hydrolysed, whereas Tween 80, gelatin and tyrosine were weakly hydrolysed. Red diffusible pigmentation was also observed during tyrosine hydrolysis. Strain TW-1 T grew well at 4 °C, whereas the reference strains were unable to grow at this temperature. Other differential physiological characteristics of strain TW-1 T and the phylogenetically closest species of the genus Massilia are given in Table 3.
Antimicrobial activities. Strain TW-1 T showed antimicrobial activities against Gram-negative pathogens.
The zones of inhibition for P. aeruginosa KACC 10185 and E. coli KEMB 121-234 were 17 and 18 mm, respectively (Fig. S7). It is notable that a bacterial strain with potent antimicrobial effects against Gram-negative pathogens was isolated from oil-contaminated soil, and to our knowledge this is the first report of a Massilia species producing an antimicrobial compound. Determination of the MIC value, together with identification and characterization of the bioactive compound, is under investigation. However, based on the antiSMASH analysis, strain TW-1 T is likely to yield a novel antimicrobial compound, as it harbours a thiopeptide smBGC with no hit among known clusters (Fig. S4).