Introduction

The gut microbiota is linked to human health and disease through its roles in maintaining intestinal homeostasis and regulating metabolism1,2,3. Since the first large-scale 16S rRNA sequencing study to identify intestinal bacteria in 2005, information on the diversity of gut ecology has increased exponentially4,5. However, bacterial cultivation has been strongly biased toward a few phylogenetic groups, and the quantitative gap between the number of newly cultured species and the diversity revealed by culture-independent methods has widened6,7. Microbial culturomics, introduced in 2012, is a research concept that combines high-throughput cultivation of microorganisms under various culture conditions with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) and 16S rRNA gene sequencing8. The culturomics approach has enabled high-throughput microbial cultivation and identification, leading to reports of many previously unexplored bacterial lineages9,10,11.

Nevertheless, a significant fraction of the bacteria detected by 16S rRNA sequencing has yet to be cultured. The number is expected to be higher when considering the bacteria not detected by metagenomics for reasons such as low abundance and sequencing bias12,13. Since 2021, we have conducted a culturomics study focusing on the intestines of healthy individuals and inflammatory bowel disease (IBD) patients to contribute to the discovery of novel gut bacteria. As part of this work, we isolated a novel bacterial strain, ICN-92133T (= KCTC 25622T = JCM 36070T), from a stool sample of a patient with Crohn's disease. Using taxonogenomics, a research concept that establishes the novelty of a species based on its whole genome sequence together with polyphasic properties14, we present properties that distinguish strain ICN-92133T from its phylogenetic neighbors and thereby propose strain ICN-92133T as the type strain of Selenobaculum gbiensis gen. nov., sp. nov., the type species of a new genus within the family Selenomonadaceae.

Materials and methods

Sample collection under ethical approval

Before sample collection, this study was approved by the Institutional Review Board of Soon Chun Hyang University Bucheon Hospital in South Korea under the number SCHBC 2021-01-028-002. The protocol was carried out in accordance with relevant guidelines and regulations approved by our ethics committee. The donor was a 26-year-old male diagnosed with Crohn’s disease who gave signed informed consent. The fecal sample was kept below 4 °C under anaerobic conditions immediately after defecation and was processed in the laboratory within 24 h.

Strain isolation and identification by MALDI-TOF MS

The sample was homogenized, resuspended to 0.25 g/L in 0.85% sodium chloride saline, and immobilized on polysaccharide gel beads for long-term pre-incubation15. The stool gel beads were inoculated into an anaerobic culture bottle (BioMérieux, Marcy l’Etoile, France) supplemented with 0.2 μm-filtered rumen fluid16 and defibrinated sheep blood and then incubated at 37 °C for 30 days. The culture broth was sampled regularly and plated onto modified Gifu Anaerobic Medium (mGAM; Nissui Pharmaceutical, Tokyo, Japan). Bacterial colonies were isolated and initially identified using MALDI-TOF MS on a Biotyper Sirius system (Bruker Daltonics, Bremen, Germany)17. The process was conducted in an anaerobic chamber with an atmosphere of 5% CO2, 10% H2, and 85% N2.

Whole genome sequencing and analysis

Genomic DNA of strain ICN-92133T was extracted using the Quick-DNA HMW MagBead kit (Zymo Research Corporation, Irvine, CA, USA) with lysozyme chloride (Sigma-Aldrich, St. Louis, MO, USA) and quantified with the QuantiFluor dsDNA System (Promega, Madison, WI, USA) on a Victor Nivo Multimode Microplate Reader (PerkinElmer, Waltham, MA, USA). Whole genome sequencing was performed on a PacBio Sequel (Pacific Biosciences, Menlo Park, CA, USA) and a NovaSeq 6000 system (Illumina, San Diego, CA, USA) by Macrogen Corporation (Seoul, South Korea). PacBio long reads from approximately 12 kb SMRTbell templates were assembled de novo with the Microbial Assembly application18, and the assembly was then corrected using Illumina paired-end (2 × 150 bp) reads to obtain a high-quality genome sequence. The genome was annotated using the PATRIC RAST tool kit (RASTtk)19 on the BV-BRC server20. The circular genome map was drawn with the CGView tool21. The virulence factor database (VFDB)22 and the Comprehensive Antibiotic Resistance Database (CARD)23 were used to predict potential virulence factors and antimicrobial resistance genes.

Phylogenetic analysis based on 16S rRNA gene sequences

The NCBI BLASTn search24 and the EzTaxon database25 were used to classify the strain based on 16S rRNA gene sequence similarities with the phylogenetically closest strains. The previously established threshold of 94.5% 16S rRNA sequence identity26,27 was applied to delineate a new genus. A phylogenetic tree was constructed using the maximum-likelihood method28 based on the Tamura-Nei model29 after alignment of the 16S rRNA sequences using MUSCLE30 in the MEGA11 program31.
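As a minimal illustration of how a pairwise 16S rRNA identity value can be derived from an alignment, the Python sketch below counts matching columns between two pre-aligned sequences. The function name `percent_identity`, the gap handling (columns with a gap in either sequence are skipped), and the toy sequences are assumptions made for illustration; BLASTn and EzTaxon use their own alignment and scoring procedures, so their reported values may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap ('-') are skipped.
    This gap handling is an assumption; tools differ on this point.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not taken from the study).
query = "AGAGTTTGATCCTGGCTCAG-GACG"
subject = "AGAGTTTGATCATGGCTCAGAGATG"
identity = percent_identity(query, subject)
print(f"identity: {identity:.2f}%")
```

Skipping gapped columns keeps the sketch simple; counting gaps as mismatches would give lower values for alignments with many indels.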

Morphological, phenotypical, and biochemical assays

Gram-stained morphology was observed using a phase contrast microscope (Eclipse Ci-L, Nikon, Tokyo, Japan). The outer and inner structures of strain ICN-92133T were observed using a scanning electron microscope (SEM; Regulus 8100, Hitachi, Tokyo, Japan) and an energy-filtered transmission electron microscope (EFTEM; LIBRA 120, Carl Zeiss, Germany), respectively. The main characteristics of strain ICN-92133T, such as oxidase, catalase, nitrate reduction, and sporulation activities, were compared with those of the phylogenetically closest type strains. The optimal growth conditions were determined by culturing at different temperatures (20, 26, 30, 37, and 45 °C), pH values (5, 5.5, 6, 6.5, 7, 7.5, and 8.5), and osmotic conditions (0, 1, 3, 5, 10, 15, and 20 g of NaCl/L). Enzymatic activities and substrate utilization properties were investigated using API ZYM, API 20A, and API 50CH kits (BioMérieux) at 30 °C according to the manufacturer's instructions.

Cellular fatty acid methyl ester analysis

Cellular fatty acid methyl esters (FAMEs) were extracted from colonies grown on mGAM agar and analyzed on an Agilent 8890 gas chromatography system (Agilent Technologies, Inc., Palo Alto, CA, USA) at the Korean Collection for Type Cultures (KCTC; Jeonbuk, South Korea). The cellular fatty acid profiles were determined using the Sherlock MIDI software version 6.5 based on the ANAER6 method32.

Minimum inhibitory concentration assay

The minimum inhibitory concentrations (MICs) for strain ICN-92133T were determined on mGAM agar at 30 °C using ETEST strips (BioMérieux) for nine antibiotics: ampicillin, gentamicin, kanamycin, streptomycin, erythromycin, clindamycin, vancomycin, tetracycline, and chloramphenicol.

Genomic comparison

Whole genome-based phylogeny was inferred using the Codon Tree pipeline based on cross-genus protein families (PGFams)33 and the Type (Strain) Genome Server (TYGS)34. The genomic features of the strains compared were obtained from the BV-BRC server20. To calculate in silico genome-to-genome similarity, digital DNA-DNA hybridization (dDDH)35 and Orthologous Average Nucleotide Identity (OrthoANI)36 were applied.

Results

Strain isolation, morphology, and identification of strain ICN-92133T

A colony of strain ICN-92133T was first isolated on mGAM agar plates after 21 days of pre-incubation. Colonies were creamy white, convex, and circular, with a diameter of less than 1 mm. The spectrum of strain ICN-92133T did not match any bacterial spectrum in the MALDI Biotyper database, with all scores lower than 1.5 (Fig. 1). Cells were gram-negative rods, 0.4–0.5 × 1.6–3.2 μm in size, although surface imaging showed that some cells had blunt ends (Fig. 2a). In addition, structures presumed to be granules were observed inside the cells (Fig. 2b). The 16S rRNA gene sequence of strain ICN-92133T was 1557 bp in length. Strain ICN-92133T exhibited sequence similarity with Massilibacillus massiliensis strain DSM 102838T (93.91%)37, Propionispira raffinosivorans strain DSM 20765T (93.08%)38, Propionispira arcuata strain KCTC 15499T (92.73%)39, Propionispira arboris strain DSM 2179T (92.51%)40, Anaerosinus glycerini strain DSM 5192T (92.54%)41,42, and Propionispira paucivorans strain DSM 20756T (91.85%)38. Phylogenetic analysis based on 16S rRNA placed strain ICN-92133T within the family Selenomonadaceae (Fig. 3). Because the highest identity value (93.91%) is lower than the 94.5% threshold for delineating a new genus, strain ICN-92133T was a candidate for a new species of a new genus and required further characterization.
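The genus-level decision described above can be written out as a short Python sketch. It takes the 16S rRNA similarities reported in this section and compares the best hit with the 94.5% genus threshold stated in the Methods. The function name `best_hit_below_threshold` and the dictionary layout are assumptions of this sketch, not part of the study's workflow.

```python
GENUS_THRESHOLD = 94.5  # % 16S rRNA identity, as stated in the Methods

# 16S rRNA similarities (%) to the closest type strains, as reported in the text.
similarities = {
    "Massilibacillus massiliensis DSM 102838T": 93.91,
    "Propionispira raffinosivorans DSM 20765T": 93.08,
    "Propionispira arcuata KCTC 15499T": 92.73,
    "Anaerosinus glycerini DSM 5192T": 92.54,
    "Propionispira arboris DSM 2179T": 92.51,
    "Propionispira paucivorans DSM 20756T": 91.85,
}


def best_hit_below_threshold(hits, threshold=GENUS_THRESHOLD):
    """Return (name, identity, flag); flag is True if the best hit is below the threshold."""
    name, value = max(hits.items(), key=lambda item: item[1])
    return name, value, value < threshold


best_name, best_value, candidate_new_genus = best_hit_below_threshold(similarities)
print(f"closest type strain: {best_name} ({best_value}%)")
print(f"below genus threshold of {GENUS_THRESHOLD}%: {candidate_new_genus}")
```

A 16S rRNA identity below the threshold only flags a candidate; the genomic comparisons reported later are still needed to support the proposal.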

Figure 1

MALDI-TOF MS spectra of Selenobaculum gbiensis strain ICN-92133T.

Figure 2

Morphology of Selenobaculum gbiensis strain ICN-92133T. Images were obtained using (a) a scanning electron microscope (SEM) and (b) an energy-filtered transmission electron microscope (EFTEM). Scale bars, 1 μm.

Figure 3

Phylogeny of Selenobaculum gbiensis strain ICN-92133T and closely related species. The 16S rRNA sequences were aligned using MUSCLE with default parameters, and the phylogenetic tree was calculated using the maximum-likelihood method in MEGA software version 11.0. GenBank sequence accession numbers are given in parentheses. Bootstrap values higher than 90%, based on 500 replications, are indicated at branch points. Veillonella dispar was used as the outgroup.

Comparison of phenotypic and biochemical characteristics

Strain ICN-92133T was isolated on mGAM medium at 37 °C in an anaerobic atmosphere. Strain ICN-92133T is motile with peritrichous flagella, but oxidase, nitrate reduction, and sporulation activities were not observed under the growth conditions (Table 1). The optimal growth conditions for strain ICN-92133T were 26 to 37 °C, pH 6.5 to 8.5, and less than 0.5% salinity. For comparison of the main characteristics, four strains belonging to the genera Massilibacillus, Propionispira, and Anaerosinus were selected based on the phylogeny results. Strain ICN-92133T showed enzymatic activity for pyroglutamyl aminopeptidase, alkaline phosphatase, esterase C4, esterase lipase C8, leucine arylamidase, acid phosphatase, and naphthol-AS-BI-phosphohydrolase. Moreover, the strain utilized glycerol, d-ribose, d-adonitol, d-glucose, d-fructose, d-mannose, and d-mannitol. Because A. glycerini strain DSM 5192T utilizes glycerol as a primary carbon source, some reactions were not observed under the test conditions. The primary cellular fatty acids of strain ICN-92133T were C17:1,cis-9 (17.6%) and C18:1,cis-9 (19.5%) (Table 2).

Table 1 Biochemical characteristics of strain ICN-92133T and its phylogenomically related type strains.
Table 2 Cellular fatty acid compositions (%).

Antibiotic susceptibility of strain ICN-92133T

Strain ICN-92133T showed MICs (μg/mL) of 0.094 for ampicillin, 4 for gentamicin, 4 for kanamycin, 6 for streptomycin, 6 for erythromycin, 0.032 for clindamycin, 0.19 for tetracycline, 0.5 for chloramphenicol, and more than 256 for vancomycin.

Genomic properties of strain ICN-92133T

The complete genome of strain ICN-92133T consisted of a circular chromosome of 2,679,003 bp with a GC content of 35.5% (Fig. 4a). The genome contained 2526 protein-coding sequences (CDS), 86 tRNA genes, and 18 rRNA genes. The annotated proteins included 1016 hypothetical proteins and 1510 proteins assigned to 237 RAST subsystems (Fig. 4b). The clusters of orthologous groups (COG) functional categories containing the most genes were amino acids and derivatives (19.6%), protein metabolism (13.5%), cofactors, vitamins, prosthetic groups, pigments (13.4%), carbohydrates (8.3%), and DNA metabolism (6.4%). In addition, virulence factor candidates and antibiotic resistance genes were predicted within the genome of strain ICN-92133T to evaluate the potential risk to the host. A total of 38 sequences matched virulence genes in the VFDB related to adherence (pilA, tcpl, groEL, tufA, and htpB), exotoxin (hlyB), chemotaxis (acfB), immune modulation (cpsB/cdsA, pseC, rfaD, gmd, and manC), motility (flhA, tlpB, gleQ, and pseB), stress survival (clpE and clpC), and the Type III secretion system (T3SS; ysaV, escV, and invA) (Table 3). In addition, five genes related to antibiotic efflux pumps (adeF and qacJ), antibiotic target alteration (vanT and vanW), and antibiotic inactivation (fosXCC) were detected in CARD.
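For readers who wish to recompute summary statistics of this kind from a genome sequence, the following Python sketch shows one way to calculate GC content and to check that the two annotation classes reported above add up to the CDS total. The function name `gc_content`, the choice to ignore ambiguous bases, and the toy sequence are assumptions of this sketch; the values in the text were obtained from the annotation servers, not from this code.

```python
def gc_content(sequence: str) -> float:
    """GC content (%) computed over unambiguous bases (A, C, G, T) only."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (counts["G"] + counts["C"]) / total


# Toy sequence (hypothetical, for illustration only).
toy_sequence = "ATGCGATATTAACGTTAAAT"
print(f"GC content of toy sequence: {gc_content(toy_sequence):.1f}%")

# Annotation tally as reported in the text.
hypothetical_proteins = 1016
subsystem_assigned_proteins = 1510
total_cds = hypothetical_proteins + subsystem_assigned_proteins
print(f"hypothetical + subsystem-assigned proteins = {total_cds} CDS")
```

The tally confirms that the 1016 hypothetical proteins and the 1510 subsystem-assigned proteins account for all 2526 CDS reported.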

Figure 4

Complete genome and RAST subsystem category distributions of Selenobaculum gbiensis strain ICN-92133T. (a) A circular chromosome map showing annotated proteins, GC content, forward GC skew, reverse GC skew, CDS, tRNA, rRNA, and repeat regions from the outside to the center. (b) Distribution of COG categories analyzed using the RAST server.

Table 3 Prediction of virulence factors and antimicrobial resistance genes in the genome of Selenobaculum gbiensis strain ICN-92133T based on VFDB and CARD databases, respectively.

Comparative genomic characteristics

The whole genome of strain ICN-92133T was compared with the available genomes of seven phylogenetically neighboring strains (Table 4). Strain ICN-92133T had the third shortest genome and the lowest GC content and CDS count. In addition, strain ICN-92133T showed a G+C content difference of 2.23 to 17.21% from the type strains, excluding Pectinatus brassicae strain DSM 24661T. The subsystems annotated using RASTtk are described in Table 5. The whole genome-based phylogenetic tree constructed with the PGFams Codon Tree pipeline showed that strain ICN-92133T formed a clade with M. massiliensis strain DSM 102838T and the Propionispira type strains (Fig. 5a). Strain ICN-92133T exhibited dDDH values lower than 21.9% and OrthoANI values ranging from 71.9% with M. massiliensis strain DSM 102838T to 64.6% with S. montiformis strain DSM 106892T (Fig. 5b). The dDDH and OrthoANI values were lower than the species thresholds (70% and 95%, respectively), indicating that strain ICN-92133T represents a novel species43,44. Taken together, we propose that strain ICN-92133T represents the type species of a new genus within the family Selenomonadaceae.
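The species-level criterion used above can be sketched in a few lines of Python. The thresholds (70% dDDH and 95% OrthoANI) and the input values are those given in the text; the function name `below_species_thresholds` and the requirement that both criteria be met at once are assumptions of this sketch.

```python
DDDH_SPECIES_THRESHOLD = 70.0  # % dDDH, as stated in the text
ANI_SPECIES_THRESHOLD = 95.0   # % OrthoANI, as stated in the text


def below_species_thresholds(dddh: float, ani: float) -> bool:
    """True if both values fall below the species thresholds.

    Requiring both criteria together is an assumption of this sketch.
    """
    return dddh < DDDH_SPECIES_THRESHOLD and ani < ANI_SPECIES_THRESHOLD


# Highest values reported for strain ICN-92133T against its neighbors:
# all dDDH values were lower than 21.9%, and the highest OrthoANI was 71.9%.
dddh_upper_bound = 21.9
highest_orthoani = 71.9

distinct = below_species_thresholds(dddh_upper_bound, highest_orthoani)
print(f"distinct from all compared species: {distinct}")
```

Because even the closest neighbor falls well below both thresholds, the more distant strains do not need to be checked individually for the species-level conclusion.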

Table 4 General genomic features of Selenobaculum gbiensis ICN-92133T and related species.
Table 5 Comparison of subsystems based on PATRIC annotation results.
Figure 5

Whole genome-based phylogeny and pairwise comparison. (a) Phylogenetic tree based on the PGFams Codon Tree pipeline in BV-BRC. GenBank sequence accession numbers are given in parentheses. Bacillus alcalophilus was used as the outgroup. (b) Comparison of ANI and dDDH values between Selenobaculum gbiensis strain ICN-92133T and closely related species. ANI values were obtained with the OAT software, and dDDH values were calculated with formula 2 (DDH estimates based on identities/high-scoring segment pair length) in GGDC.

Discussion

As the gut microbiome and its association with host health and disease have attracted attention, intensive efforts are being made to cultivate gut microorganisms in order to elucidate microbe-microbe and microbe-host interactions9,45,46,47. The culturomics approach can contribute to acquiring diverse gut microbes, including new species that have never been cultured8,9,10,11,12. We obtained novel strains from the gut microbiota of healthy individuals and IBD patients during the culturomics study. Among them, the characteristics of strain ICN-92133T are presented here using the taxonogenomic strategy.

The family Selenomonadaceae, whose members are strictly anaerobic, gram-negative bacilli, was first classified as a new family within the order Selenomonadales in the class Negativicutes in 201548. The family currently contains 43 species in 8 genera: Anaerovibrio, Centipeda, Megamonas, Mitsuokella, Pectinatus, Propionispira, Schwartzia, and Selenomonas. The phylogenetic analysis based on 16S rRNA gene sequences placed strain ICN-92133T among species belonging to the families Selenomonadaceae, Veillonellaceae, and Sporomusaceae (Fig. 3). To confirm its taxonomic classification, a phylogeny based on both amino acid and nucleotide sequences was inferred in the Codon Tree pipeline using the currently available genomes within the three families (see Supplementary Fig. S1). Although strain ICN-92133T showed the highest 16S rRNA sequence similarity (93.91%) with M. massiliensis strain Marseille-P2411T (= DSM 102838T), which belongs to the family Veillonellaceae, and formed a clade with it in the whole genome-based phylogeny, it shared a common ancestor with the other species of the family Selenomonadaceae. Therefore, we classified strain ICN-92133T as a new genus belonging to the family Selenomonadaceae.

The complete genome of strain ICN-92133T was smaller than those of M. massiliensis DSM 102838T and P. arboris DSM 2179T, which could explain its relatively narrow range of enzymatic and fermentation activities (Tables 1, 4). Electron microscopy revealed granular cytoplasmic inclusions in strain ICN-92133T (Fig. 2b). Although their contents were not verified in this study, they may serve as a form of energy storage, given that P. arboris DSM 2179T has been reported to contain carbohydrate-containing granules40. In addition, because the strain was isolated from a fecal specimen of a patient with Crohn’s disease, we investigated potential virulence and antibiotic resistance genes within its complete genome to predict its pathogenicity (Table 3). Given the high vancomycin MIC of strain ICN-92133T in vitro, the resistance might derive from efflux pumps and reduced binding affinity for vancomycin. The VFDB results also included several housekeeping proteins, such as EF-Tu and GroEL, which may have moonlighting functions in certain species49,50. However, further research is needed to confirm whether the genes detected in strain ICN-92133T are expressed in the host and contribute to pathogenicity.

In conclusion, we propose strain ICN-92133T, obtained through culturomics, as the type strain of Selenobaculum gbiensis sp. nov., the type species of the new genus Selenobaculum within the family Selenomonadaceae, based on taxonogenomic results including MALDI-TOF MS spectra, morphology, phenotypic properties, phylogenetic analysis, FAME composition, ANI, and dDDH calculation.

Description of Selenobaculum gen. nov.

Selenobaculum (Se.le.no.ba.cu.lum. N.L. fem. pl. n. Selenomonadaceae, a bacterial family name; L. neut. n. baculum, a stick, staff, rod; N.L. neut. n. Selenobaculum, a rod-shaped bacterium that belongs to the family Selenomonadaceae). Cells are gram-negative rods with rounded ends, obligately anaerobic, motile by means of flagella, and oxidase- and catalase-negative. The type species is Selenobaculum gbiensis.

Description of Selenobaculum gbiensis gen. nov., sp. nov.

Selenobaculum gbiensis (g.bi.en’sis. N.L. masc. adj. gbiensis for the acronym of Green-Bio Institute, where the type strain was isolated). Cells are 0.4–0.5 × 1.6–3.2 μm in size and grow at 26–37 °C, pH 6.5–8.5, and a salinity of less than 0.5%. After 2 days of incubation at 37 °C under anaerobic conditions, colonies on mGAM agar are creamy, circular, and less than 1 mm in diameter. Among the 19 enzymatic activities tested in the API ZYM system, positive reactions are observed for pyroglutamyl aminopeptidase, alkaline phosphatase, esterase (C4), esterase lipase (C8), leucine arylamidase, acid phosphatase, and naphthol-AS-BI-phosphohydrolase. In API 20A and API 50CH tests, glycerol, d-ribose, d-adonitol, d-glucose, d-fructose, d-mannose, and d-mannitol are utilized as carbon sources, whereas utilization of other substrates is not observed. The primary cellular fatty acids are C17:1,cis-9 and C18:1,cis-9. The type strain, ICN-92133T, was isolated from the stool of a patient with Crohn’s disease; its genome is 2,679,003 bp in length with a GC content of 35.5 mol%. Strain ICN-92133T has been deposited at the Korean Collection for Type Cultures (KCTC) and the Japan Collection of Microorganisms (JCM) under the numbers KCTC 25622T and JCM 36070T, respectively.