Introduction

Methanogenic archaea colonize a variety of anoxic habitats including the gastrointestinal tracts of warm-blooded animals, where they are widely acknowledged to contribute to host digestive function via their key role in coordinating interspecies hydrogen transfer and promoting more efficient growth of heterotrophic fermentative bacteria [1,2,3]. Generally, the principal members of these gut methanogenic communities are the Methanobacteriales (i.e., Methanobrevibacter and Methanosphaera spp.). Methanobrevibacter spp. are often the numerically dominant members of these taxocenes, favoring hydrogen-dependent reduction of carbon dioxide and/or formate to methane. In contrast, much less is known about the methylotrophic archaea, such as Methanomassiliicoccales and Methanosphaera spp. Draft genomes have been produced for several members of the Methanomassiliicoccales, and their role in livestock methane emissions [4, 5] and methylamine metabolism established [6,7,8,9,10].

Despite their discovery in the mid-1980s, Methanosphaera spp. have received relatively scant attention, with virtually all our knowledge derived from a single isolate recovered from human stool (Methanosphaera stadtmanae DSMZ3091T, [11]). Thauer et al. [12] proposed that energy conservation and hydrogen-dependent methanol reduction to methane by Methanosphaera stadtmanae are achieved without cytochromes, and involve electron bifurcation between the MvhADG-HdrABC and Ehb complexes that creates a coupling between ferredoxin and CoM-S-S-CoB reduction; this mode of electron bifurcation has recently been established biochemically by Yan et al. [13]. Hydrogen-fueled methanogenesis is also conserved in Methanosphaera cuniculi DSMZ4103T [14, 15]. The physiological consequences of this mechanism of methanogenesis and energy conservation are consistent with recent findings that populations of Methanosphaera spp. are remarkably high in “low hydrogen/methane” producing ruminants [16]. However, their metabolic versatility may be more expansive than previously appreciated, as we recently reported the isolation and characterization of a new member of the genus Methanosphaera (strain WGK6) from the foregut microbiome of the Western Grey Kangaroo (Macropus fuliginosus) that is capable of using either hydrogen or ethanol as a source of reducing power for methanogenesis and growth [17].

In humans, Methanosphaera spp. also appear to be implicated in the onset of digestive diseases and, furthermore, to stimulate various arms of the immune system. Blais Lecours et al. [18] reported greater absolute counts and prevalence of Methanosphaera spp. in patients with inflammatory bowel disease (IBD) compared with healthy non-IBD subjects. The IBD patients were also found to possess high titers of Methanosphaera stadtmanae DSMZ3091T cross-reactive IgG antibodies, and the archaeon was shown to be more immunogenic than Methanobrevibacter smithii in a murine model of respiratory disease [19, 20]. In summary, we propose that the genus Methanosphaera remains underexplored and poorly defined relative to its evolutionary origins, ubiquity and abundance in the gastrointestinal tracts of warm-blooded animals, and thereby, so too are its contributions to gut function, health and disease.

Here we describe the isolation and genomic characteristics of two new isolates of Methanosphaera sp., from the bovine rumen (BMS, the first reported isolate of its type) and human stool (PA5). Based on these initial findings, we produced seven Methanosphaera spp. population genomes from human, ovine and bovine metagenomic datasets and undertook the first pan-genomic analysis of this genus.

Materials and methods

Enrichment and isolation of Methanosphaera sp. BMS and PA5

Rumen fluid was collected using previously described procedures [17] from a fistulated Brahman steer grazing on native grass/legume forage at the Gatton campus of the University of Queensland’s School of Agriculture and Food Sciences. The procedures were approved by the Department of Employment, Economic Development and Innovation (DEEDI; AEC Proposal Reference Number SA 2011/08/365). A subsample of the strained liquid (~2 ml) was mixed with an equal volume of a pre-reduced, anoxic, and sterilized solution of 30% (vol/vol) glycerol, and then stored at −80 °C. An inoculum was also prepared from the stool samples of a healthy adult Australian subject, recruited as part of a larger nutritional study conducted through Monash University under MU-HREC approval CF14/2904 - 2014001593 and UQ-HREC 2015000317. The raw stool samples were subsampled (~2 g) under anaerobic conditions and mixed with 3 ml of sterile glycerol preservation medium as described above.

Enrichment cultures of digesta and stool samples were initiated with BRN-RF30 medium [21] supplemented with either a combination of methanol and ethanol (each 1% vol/vol), or the same concentration of methanol but with H2 gas added to a final pressure of 150 kPa. The enrichments were placed within a shaking incubator cabinet at 37 °C and agitated at 100 r.p.m. After 2–7 days of incubation, 2 mL of the headspace gases were collected and analyzed for methane gas formation, and culture fluids were examined by UV fluorescence microscopy for autofluorescent cells. The bovine-derived methanol:H2 enrichment cultures that were positive for both criteria were then used to inoculate BRN-RF30 medium supplemented with methanol:H2 and 50 µg/mL penicillin, with subsequent rounds of enrichment using the same basal medium with either 100 µg/mL penicillin, 50 µg/mL tetracycline, or 100 µg/mL novobiocin added. For the human stool-derived enrichment cultures, both ampicillin (500 μg/mL) and streptomycin (80 μg/mL) were added to the initial transfers and vancomycin (50 μg/mL) was also included to further suppress bacterial growth. For each round of enrichment, the inoculum was first diluted to extinction, and the culture inoculated with the greatest dilution that still produced autofluorescent cells and methane in the headspace gas was used for the next round. The candidate axenic cultures were then serially diluted and used to inoculate BRN-RF30 agar roll tubes supplemented with methanol and H2 gas. Following growth, individual colonies were aseptically picked and propagated using the same medium. One isolate from each enrichment (strain BMS from bovine rumen and PA5 from human stool) was selected for preservation based on microscopy and headspace gas analysis, and then archaea-targeting primers (86F/1340R, [22]) were used for 16S rRNA gene sequencing to confirm strain purity and affiliation with the Methanosphaera genus, following methods described by Hoedt et al. [17].

Methanosphaera spp. substrate utilization and growth kinetics

Substrate affinities for H2 and/or ethanol, and maximal growth rates for Methanosphaera sp. BMS, WGK6 and DSMZ3091T were determined using BRN-RF30 medium, with the headspace purged with oxygen-free N2 gas to remove any carryover H2, following the methods outlined in [17, 23]. Briefly, triplicate cultures consisting of 10 mL BRN-RF30 in Balch tubes of each strain were provided with 1% (vol/vol) methanol and either varying concentrations of ethanol (0, 3, 10, 30, 50, and 171 mM with strain WGK6) or H2 (0, 1.3, 1.8, 2.6, 3.5, and 4.6 mM for all three strains) pressurized to 150 kPa. The cultures were placed within a shaking incubator cabinet at 37 °C and agitated at 100 r.p.m. Growth was monitored by longitudinal measurement of optical density at 600 nm (OD600) as described by Hoedt et al. [17]. Strains DSMZ3091T (human) and BMS (bovine) were also cultured using BRN basal medium prepared to contain either human fecal water (FW) or bovine rumen fluid (RF) at different concentrations. The FW extract was prepared by resuspending 10 g of human fecal material in 100 mL of ddH2O, followed by two rounds of centrifugation (20,000×g for 30 min at 4 °C) and filter sterilization of the supernatant (0.45 μm Thermo Scientific Nalgene Rapid-Flow Filter unit). The basal BRN medium was then supplemented with either 1 or 5% (vol/vol) FW, or 5 or 10% (vol/vol) RF prior to autoclaving. Triplicate tubes of the culture media were inoculated with actively growing cultures of the respective strains, the tubes aseptically pressurized with H2:CO2 (80:20 vol/vol) to 150 kPa, and 1% (vol/vol) methanol added. The cultures were placed within a shaking incubator cabinet at 37 °C and agitated at 100 r.p.m. Growth was monitored by longitudinal measurement of optical density at 600 nm (OD600) as described above.

Specific growth rates for each substrate concentration and combination were calculated and plotted as double-reciprocal plots, with substrate affinity and maximal growth rate estimated from the x- and y-intercepts, respectively. GraphPad Prism 7 was used to plot the optical densities over time and determine the specific growth rate using linear regression of the exponential growth phase for each condition. An unpaired Student’s t-test was used for the statistical analysis, and P-values <0.05 were considered statistically significant. Growth rates and statistics were calculated as described above.
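The two calculations described above can be illustrated with a minimal sketch. It assumes Monod-type kinetics, so that a plot of 1/μ against 1/S is linear with a y-intercept of 1/μmax and an x-intercept of −1/KS. The analysis in this study was performed in GraphPad Prism; the function names and the synthetic parameter values below are illustrative assumptions only and are not data from this work.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def specific_growth_rate(hours, od600):
    """Slope of ln(OD600) versus time; pass only exponential-phase points."""
    slope, _ = linear_fit(hours, [math.log(od) for od in od600])
    return slope


def double_reciprocal(substrate_mM, growth_rates):
    """Fit 1/mu against 1/S and return (Ks, mu_max).

    y-intercept = 1/mu_max and x-intercept = -1/Ks, hence Ks = slope * mu_max.
    The zero-substrate condition must be excluded (1/0 is undefined).
    """
    inv_s = [1.0 / s for s in substrate_mM]
    inv_mu = [1.0 / mu for mu in growth_rates]
    slope, intercept = linear_fit(inv_s, inv_mu)
    mu_max = 1.0 / intercept
    return slope * mu_max, mu_max


# Synthetic check: growth rates generated from a Monod curve with arbitrary,
# made-up parameters (NOT values measured in this study).
TRUE_KS, TRUE_MU_MAX = 0.5, 0.08
concentrations = [1.3, 1.8, 2.6, 3.5, 4.6]  # mM, the non-zero H2 levels used
rates = [TRUE_MU_MAX * s / (TRUE_KS + s) for s in concentrations]
ks_est, mu_max_est = double_reciprocal(concentrations, rates)
print(f"Ks ~ {ks_est:.3f} mM, mu_max ~ {mu_max_est:.3f} per hour")
```

Because reciprocals amplify error at low substrate concentrations, a double-reciprocal fit weights those points heavily; this is a property of the linearization rather than of this particular sketch.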

Methanosphaera spp. genome sequencing

The methods and sequencing platform used to produce the draft genome of strain WGK6 were described by Hoedt et al. [17]. Strains BMS and PA5 were cultured using BRN-RF30 medium supplemented with methanol and hydrogen and harvested by centrifugation (17,000×g). The cell pellets were resuspended in RBB+C lysis buffer [24] and subjected to 15 freeze-thaw cycles and repeated phenol:chloroform extraction, with a final additional ethanol precipitation included. The genomic DNA was quantified using the Quant-iT dsDNA BR Assay Kit according to the manufacturer’s instructions (Invitrogen) and its integrity determined by agarose gel electrophoresis. Aliquots (20 µg) of BMS genomic DNA were then sheared with the Hydroshear® Shearing Device (Pacific Biosciences, CA, USA) to produce fragments ~20 Kb in length, which were then used for library construction and SMRT-cell sequencing with the PacBio RS2 “continuous long read” platform and the SMRT portal assembly (Pacific Biosciences). The genome assembly was performed using HGAP2 with a seed cutoff length of 10 Kb, and quality was assessed via adjacency graph construction and visualization with Contiguity [25]. Finally, genome closure was achieved using primers (Table S1) to “walk” across the determined ends of the single BMS contig, and the amplicons were sequenced using an Applied Biosystems 3130xl Genetic Analyser. For strain PA5, a high-quality draft genome was prepared from 100 ng genomic DNA by the Australian Centre for Ecogenomics (ACE, www.ecogenomic.org) using their in-house workflows for the Illumina NextSeq platform. The raw data files were quality checked, filtered, and assembled using the CLC Genomics Workbench 10 (www.qiagenbioinformatics.com).

The quality of the genome assemblies for strains BMS and PA5 was assessed using CheckM [26]. The BMS genome has been deposited at JGI IMG/ER under the accession 2626541600, and at the DDBJ/EMBL/GenBank whole-genome submission portal under the accession CP014213. The PA5 genome has been deposited under the JGI IMG/ER accession 2754412591 and DDBJ/EMBL/GenBank accession NGJK00000000.

Comparative analyses of the Methanosphaera sp. isolate genomes

We first attempted to improve the assignment of functions to “unknown” genes, by using RAST, Prokka, NCBI, KEGG, and JGI IMG/ER annotation tools [27,28,29,30]. Additionally, the COG gene profiles, mobile genetic elements (MGE), CRISPR-Cas functions and fused protein coding genes for Methanosphaera sp. DSMZ3091T, BMS, PA5, and WGK6 genomes were produced via JGI IMG/ER. The Methanosphaera stadtmanae DSMZ3091T, BMS, WGK6, PA5, DSMZ4103T genomes and the population genomes were uploaded to the software platform EDGAR [31] to determine a core, shared and accessory genome. The Methanosphaera DSMZ3091T, WGK6 and BMS genomes were also compared using barcodeByMers (https://github.com/minillinim/mikezbioscripts) across 2 Kb windows of each genome, and the resulting heat map was generated in RStudio (www.rstudio.com) using the package ggplot2 (http://ggplot2.org/).
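To illustrate the kind of comparison made across 2 Kb windows, the following minimal sketch computes tetranucleotide frequency profiles for consecutive windows of a sequence and a distance between two profiles. It is not the barcodeByMers implementation, whose internals are not described here; the function names, the use of non-overlapping windows, and the Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetramer_profile(seq):
    """Relative frequency of each tetranucleotide in seq.

    K-mers containing ambiguous bases (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def windowed_profiles(genome, window=2000):
    """Profiles for consecutive non-overlapping windows.

    A trailing partial window shorter than `window` is dropped.
    """
    return [
        tetramer_profile(genome[start:start + window])
        for start in range(0, len(genome) - window + 1, window)
    ]


def profile_distance(p, q):
    """Euclidean distance between two tetranucleotide profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


# Toy input: a 4 kb repeat, so both windows have identical composition.
toy_genome = "ACGT" * 1000
profiles = windowed_profiles(toy_genome)
print(len(profiles), profile_distance(profiles[0], profiles[1]))
```

Windows whose profiles sit far from the genome-wide average are the ones that would stand out in a heat map, which is why a uniform profile argues against a chimeric assembly.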

Retrieval of Methanosphaera population genomes from metagenomic data

The NCBI Sequence Read Archives (SRA) were queried with the 16S rRNA from Methanosphaera stadtmanae DSMZ3091T to identify datasets most likely to contain sequences associated with Methanosphaera spp. (Table S2). Metagenomic datasets identified as possessing Methanosphaera spp. associated sequences included 10 stool samples from the study by Karlsson et al. [32] of elderly European women with glucose control profiles considered either normal, impaired, or diabetic; and the ovine rumen metagenomic datasets from Shi et al. [16]. Bovine rumen metagenomic datasets (Illumina HiSeq 2000 platform; Macrogen Inc., Korea) produced from 10 Brahman steers consuming Rhodes grass (Chloris gayana) with or without supplementation of Leucaena (Leucaena leucocephala) as part of an unpublished CSIRO-led study were also used. The datasets were quality checked with Trimmomatic v0.32 [33] to remove adapters and trim low-quality bases, and bbmerge v34.49 (http://sourceforge.net/projects/bbmap/) was used to merge overlapping pairs. The quality checked, paired-end and merged reads were assembled using CLC Genomics Workbench 8 (http://www.clcbio.com), and MetaBAT v0.25.4 [34] was used to recover population genomes from the three metagenomic studies. An estimate of population genome completeness and contamination was performed with CheckM v1.0.3, and their estimated size was calculated from the product of bin size (in nucleotides) and CheckM completeness score [26]. Annotations for the 7 population genomes were generated through the Rapid Annotation using Subsystems Technology (RAST) server and Prokka.

The population genomes have been deposited at DDBJ/ENA/GenBank under the following accession numbers: MUZW00000000, MVJJ00000000, MVJK00000000, MVAA00000000, MUZZ00000000, MUZY00000000, and MUZX00000000. The version described in this paper is version XXXX01000000.

Genome-based nucleotide identity and phylogenetic analyses

Mauve [35] was used to examine genome synteny among the 12 isolate and population genomes, with Methanosphaera stadtmanae DSMZ3091T used as the reference genome. The phylogenetic trees of the amino acid sequences for methyl-coenzyme M reductase component A2 (MrtA) and energy conserving hydrogenase (EhaRST) were constructed with MEGA7 [36]. ClustalW [37] was first used to align the sequences, and the stability of the Jones–Taylor–Thornton modelled Maximum-likelihood tree was evaluated by 1000 bootstrap replications.

The average nucleotide identity (ANI) scores between the orthologs present in a “core genome” of 293 genes from all 12 Methanosphaera genomes were also calculated using EDGAR, and a matrix of ANI scores generated using RStudio. The carbohydrate-active enzyme (CAZYme) profiles were produced using the dbCAN database (http://csbl.bmb.uga.edu/dbCAN/). Whole-genome based phylogenies were produced using either the Genome Trees Database (GTDB) which uses the concatenation of 122 universal marker genes to perform the analysis [38] or the “core genome” of 293 genes common to all 12 Methanosphaera genomes, using EDGAR [31]. The GTDB tree inference was performed with FastTree v2.1.7 [39] and included all genomes in IMG v4.510 [29] with the resulting tree imaged using ARB [40]. Finally, the number of core, non-core and singleton genes for the two genome groups were estimated using only the isolate genomes and those population genomes with CheckM scores >90% (n = 8).
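The partition into core, non-core and singleton genes can be sketched as a simple set operation, assuming each genome has already been reduced to a set of ortholog-family identifiers. EDGAR applies its own orthology criteria, which are not reproduced here; the function name, genome labels and family identifiers below are hypothetical.

```python
def classify_pangenome(gene_families):
    """Split ortholog families into core, non-core and singleton sets.

    gene_families maps a genome label to the set of ortholog-family IDs
    found in that genome. A family is "core" if present in every genome,
    a "singleton" if present in exactly one, and "non-core" otherwise.
    """
    genomes = list(gene_families)
    all_families = set().union(*gene_families.values())
    core, non_core, singletons = set(), set(), set()
    for family in all_families:
        n_present = sum(family in gene_families[g] for g in genomes)
        if n_present == len(genomes):
            core.add(family)
        elif n_present == 1:
            singletons.add(family)
        else:
            non_core.add(family)
    return core, non_core, singletons


# Hypothetical toy input (labels and family IDs are invented for illustration).
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam5"},
}
core, non_core, singletons = classify_pangenome(toy)
print(sorted(core), sorted(non_core), sorted(singletons))
```

Restricting the input to genomes with high completeness, as done here for the >90% set, matters because a gene missing from an incomplete bin would otherwise be wrongly removed from the core.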

Results

Methanosphaera sp. BMS and PA5 are obligate hydrogen-dependent methylotrophs

Our enrichment protocols from bovine rumen digesta and human stool produced axenic cultures of methanol-dependent methanogenic prokaryotes. Amplicons of the 16S rRNA gene of these isolates were only produced by archaeal-targeting primers (86F/1340R) and were found to be 97–99% identical with the 16S rRNA gene sequence of Methanosphaera stadtmanae DSMZ3091T. Based on these results, we concluded that these isolates were members of the genus Methanosphaera, and refer to them as Methanosphaera sp. BMS and PA5, respectively. Growth and methane production for BMS was supported by methanol and H2, but not methylated amines, acetate, formate, propanol or a mixture of CO2:H2; nor could strain BMS use ethanol as an alternative to H2 to support methanogenesis. Based on the PA5 genome analysis and its extensive synteny and ANI scores with strain DSMZ3091T, as well as the absence of genes encoding dehydrogenases for CO2 or ethanol utilization (see below), we conclude that PA5 is not capable of autotrophic growth, nor utilization of ethanol as a carbon source or source of reductant, like that reported for Methanosphaera sp. WGK6 [17]. For these reasons, we only compared the growth of BMS with isolates from other hosts, i.e., WGK6 [17] and DSMZ3091T, using methanol as carbon source and varying concentrations of H2 or ethanol as the reductant. The specific growth rates of both DSMZ3091T and WGK6 were found to be maximal with the lower concentrations of H2 added, whereas the growth rate of strain BMS was significantly increased at higher H2 concentrations (supplementary Table S3). The [KS] values for H2 were calculated to be 0.09 mM, 0.11 mM, and 1.58 mM for strains WGK6, DSMZ3091T, and BMS, respectively. As expected, only strain WGK6 was capable of growth with methanol:ethanol and interestingly, its growth rate was slower with a high concentration of H2 compared to when ethanol was provided (Fig. 1 and Table S3).
Based on these results, we conclude that all three strains from the different animal hosts are capable of growth with relatively low H2 thresholds, but the isolates recovered from monogastric hosts are perhaps better adapted to these conditions, and in the case of strain WGK6, have acquired accessory functions in support of methanol reduction, methanogenesis and growth.

Fig. 1

Longitudinal monitoring of the growth of Methanosphaera stadtmanae DSMZ3091T (a), strain BMS (b) and strain WGK6 when provided with either methanol:H2 (c) or methanol:ethanol (d). The colored lines and symbols denote different concentrations of either H2 (0, 1.3, 1.8, 2.6, 3.5 and 4.6 mM) or in the instance of (d), ethanol (0, 3, 10, 30, 50 and 171 mM). The growth of the individual cultures is plotted until the time point at which maximal OD600 was measured. Individual values represent the mean (±SD) produced from triplicate cultures

The Methanosphaera sp. BMS genome is remarkable in terms of its size

The Methanosphaera sp. BMS genome was assembled into a single molecule with 60x coverage using the PacBio RS2 platform, and is comprised of 2,874,170 bp, nearly twice the size of the genomes of the Methanosphaera isolates derived from monogastric hosts (Table 1). The Mauve alignments confirmed a large amount of genomic synteny among the isolates, with the “additional” content within the BMS genome demarcated from the smaller genomes via a contiguous region of ~1 Mbp (Fig. 2). Next, both CheckM and BarcodeByMers were used to assess whether the BMS genome might be a chimeric assembly, but the contamination estimates (Table 1 and Table S4), together with the uniformity in tetramer nucleotide profiles for the BMS, WGK6 and DSMZ3091T genomes (Figure S1), suggest the differences in genome size are not the result of a chimeric assembly nor other sequencing artefacts.

Table 1 Summary statistics of the Methanosphaera isolate and population genomes, with respect to population genome size, completeness and contamination estimates, and the predicted complete genome size, based on these metrics
Fig. 2

The degree of genome synteny evident between the Methanosphaera stadtmanae DSMZ3091T, WGK6, and BMS genomes using Mauve. In brief detail, the alignment algorithm facilitates the depiction of genome synteny (illustrated by the size and the coloring scheme used for the local collinear blocks identified by Mauve), regions of genome rearrangement (illustrated via the lines connecting locally collinear blocks between the two genome models) and any xenologous regions (illustrated by blank spaces). The xenologous regions present in the BMS genome are not ubiquitous, as illustrated by the high degree of synteny between all three genomes

The general genome features as well as the COG based assignments for the DSMZ3091T, PA5, WGK6, and BMS genomes are shown in Tables S5 and S6, respectively. In general terms, the total gene count in most COG categories is similar across the genomes despite their different sizes, further establishing the relatedness among the strains with respect to core functions. However, only 37% of the additional gene content from the BMS genome could be assigned to functional COG categories such as signal transduction mechanisms (8 genes), secondary metabolite biosynthesis, transport and catabolism (10 genes), or replication, recombination and repair (16 genes). Instead, the additional genome content was predominantly comprised of either CDSs with no assigned COG functions (976 genes for strain BMS c.f. 576, 525 and 534 genes for strains WGK6, DSMZ3091T and PA5, respectively) or COGs with “function unknown” (105 genes for strain BMS c.f. 78, 78 and 81 genes for strains WGK6, DSMZ3091T and PA5, respectively). The BMS genome was also remarkable for its greater number of mobile genetic elements (MGE), with only a single gene annotated as a transposase in the DSMZ3091T genome and none in WGK6, compared to 30 putative transposases in the BMS genome. Our comparisons of the CRISPR-Cas associated proteins are limited to those isolates available for JGI IMG/ER analysis (Table S7). The PA5 genome contains no Cas annotations, while the BMS genome is predicted to possess only a duplicated copy of Cas1. In contrast, both the WGK6 and DSMZ3091T genomes possess CRISPR-associated proteins Cas1–7. As such, it would appear the extra genome content in strain BMS does not include a more expansive CRISPR-Cas system.
Based on these findings, we concluded that the bovine-derived isolate BMS is a member of the Methanosphaera genus, and it is the first isolate of its kind—and distinct from the human and kangaroo isolates—because of its substantially larger genome, which is comprised of largely cryptic functions.

Methanosphaera sp. population genome recovery and analyses expand the representation of the genus and confirm its divergence relative to genome size

We were able to retrieve seven Methanosphaera spp. population genomes from two publicly available metagenomic datasets and an ongoing CSIRO project, and their predicted genome sizes are estimated from the completeness and contamination scores calculated using CheckM. They range in size from 1.6 (DEW79) to 2.7 Mbp (rholeuAM74) and 5/7 are closer in size to the BMS genome described above (Table 1). Despite some of the population genomes falling below the 90% completeness threshold for core genome calculations, sequence conservation across all 12 Methanosphaera genomes is extensive. First, we subjected the “large” and “small” genomes to Mauve alignments separately, which confirmed there is a high level of synteny between the sequenced isolate genomes and the bioinformatically recovered population genomes assigned to each group (Figures S2 and S3). Importantly, the Mauve alignments also confirmed the existence of syntenic blocks extending across the BMS 1 Mbp “unique” region present in the larger genomes. Next, we specifically examined the hydrogenase gene operon (EhaRST), which is universally present in all 12 genomes, and its flanking regions (Figure S4). Here, the degree of gene conservation and organization is again extensive and remarkable. Furthermore, the carbohydrate-active enzyme (CAZymes) family profiles of the 12 Methanosphaera genomes are all very similar, suggesting their evolutionary conservation and provision of key housekeeping functions (Table S8). As might be expected, there is a relatively scant representation of glycoside hydrolases (GH), with the notable exception of all genomes containing a single GH109, which would encode a presumptive α-N-acetylgalactosaminidase [41, 42]. Conversely, both the carbohydrate esterase (CE) and glycosyltransferase (GT) families were numerically more abundant across all 12 genomes, and in particular the GT2 and GT4 families, most likely involved with surface protein glycosylation [43, 44].
Interestingly, the dbCAN analysis predicts that all the genomes contain a similar number of CAZymes with “auxiliary activities” (i.e., AA3 and AA6), which in other microbes are presumptive 1,4-benzoquinone reductases (E.C. 1.6.5.6) and involved with the conversion of NADPH, H+ and p-benzoquinone to NADP+ and hydroquinone. The presence of a putative benzoquinone interacting enzyme within organisms known to have methanophenazine instead of quinones was unexpected [12]. Neither KEGG Koala nor NCBI BLAST searches identified other genes encoding “ubiquinone and other terpenoid-quinone biosynthesis”. Taken together, while these AA3- and AA6-containing genes appear to be conserved across the current representatives of the Methanosphaera genus, their respective functions remain cryptic. Finally, we produced a “core genome” from all 12 Methanosphaera genomes, which is comprised of 293 genes (Table S9) and used these to construct the ANI matrix among the 12 genomes (Figure S5). The ANI scores among DSMZ3091T, PA5, and DEW79 exceed 95%, which is the generally accepted threshold for species-level grouping [45], and across all 12 genomes, range between 77 and 85%, exceeding the threshold for genus-level grouping [46]. Collectively, these results show that the population genomes are truly derived from Methanosphaera sp. present within the sampled microbial communities and that representatives with large genomes are more numerous than currently appreciated, but appear to be largely restricted to ruminant animals.

Phylogenetic analyses establish the monophyletic origin and a host-specific separation of Methanosphaera spp. with large genomes

We recovered all Methanosphaera-affiliated 16S rRNA gene sequences from the NCBI database for phylogenetic comparison and found there was a clear bifurcation of the available sequences between monogastric and ruminant hosts (Fig. 3). In relative terms, the BMS 16S rRNA gene represents a deep branch within the ruminant-derived clones, especially in comparison to the 16S rRNA genes from those isolates with a smaller genome (i.e., Methanosphaera stadtmanae DSMZ3091T, Methanosphaera cuniculi DSMZ4103T, and Methanosphaera spp. WGK6 and PA5), which all cluster with other sequences derived from monogastric gut microbiomes. The phylogenetic trees produced from BLASTp alignments of the methyl-coenzyme M reductase component A2 (MrtA) and the energy conserving hydrogenase (EhaR) coding sequences also confirm their monophyletic origin and separation on the basis of genome size (Figures S6 and S7) and recapitulate the phylogenetic tree produced using the GTDB universal marker gene approach (Figure S8). Interestingly though, the phylogenetic tree produced using EDGAR, from the 293 genes shared by all 12 Methanosphaera spp. genomes, results in all the representatives with a small genome being clustered together as a distinct lineage relative to those representatives with large genomes (Fig. 4a). Taken together, these results suggest that the Methanosphaera genus is monophyletic in origin and those representatives with large genomes represent more ancestral lineages that are principally adapted to ruminant hosts.

Fig. 3

Maximum-likelihood phylogenetic analysis of Methanosphaera spp. isolate and rrs gene clone sequences recovered from NCBI. There is a clear separation of the available Methanosphaera spp. rrs sequences between monogastric and ruminant hosts. Bootstrap values are shown and the scale bar represents 1% sequence divergence, with Methanobrevibacter ruminantium M1 used as the outgroup

Fig. 4

a The genome based phylogenetic tree of the isolate and population genomes recovered for Methanosphaera spp. inferred from the concatenation of 293 genes from the core genome and using EDGAR. Those representatives with “small” genomes (<1.8 Mbp; blue/purple circles) appear to have evolved more recently and were more readily found in monogastric hosts, whereas representatives with “large” genomes (>2.0 Mbp; red circles) appear restricted to ruminant hosts and the deeper branching suggestive of their ancestral age. b Estimates of the core, non-core, and singleton genes present in the isolate and population genomes with >90% completeness, when expressed as a per genome equivalent. The total number of core genes is relatively stable between the “large” (red) and “small” (blue) genome groups, with the key differences between groups represented by the substantially larger counts of predicted non-core and singleton genes

The differences in growth of Methanosphaera isolates BMS and DSMZ3091T using habitat simulating medium suggests host-specificity

To biologically assess the apparent differences in host specificity predicted by the phylogenetic analyses described above, extracts prepared from either bovine rumen fluid (RF) or human faecal water (FW) were added to the basal culture medium to produce a bovine and human habitat-simulating medium, respectively. The results of these growth studies are shown in Fig. 5 and Table S10. Both the growth rate and final yield of the human isolate DSMZ3091T were greater than those observed for the bovine strain BMS and were similar in both RF- and FW-containing media, irrespective of the amount added. Conversely, the growth kinetics for strain BMS were maximal in RF-containing media and substantially decreased with FW-containing medium, in a concentration-dependent manner. These results suggest that the composition of RF and FW has limited effects on the growth kinetics of the human isolate, whereas the bovine isolate (BMS) requires growth factors which are either absent or reduced in FW compared to RF.

Fig. 5

Growth of human Methanosphaera stadtmanae DSMZ3091T and bovine Methanosphaera sp. BMS when cultured using BRN basal medium supplemented with either 5 or 10% (vol/vol) rumen fluid (RF5 and RF10, respectively) or 1 or 5% (vol/vol) human faecal water (FW1 and FW5, respectively). Notably, the growth rate and yield of strain DSMZ3091T were only marginally affected by the different RF and FW concentrations; however, both the growth rate and yield of strain BMS were reduced in the presence of FW in comparison to RF-containing cultures. Individual values represent the mean (±SD) produced from triplicate cultures

Discussion

Here we have expanded the genomic representation of the genus Methanosphaera from 3 to 12 members, with the addition of two new axenic isolates of ruminant and human origin, as well as the reconstruction of seven population genomes from both human and ruminant metagenomic datasets. Despite being a well-recognized member of the Methanobacteriales with niche specificity first confirmed in human subjects and subsequently, the gastrointestinal tracts of other warm-blooded animals, the physiological capacity of Methanosphaera spp. has been for the most part overlooked. Methanosphaera spp. have been reported to be enriched in “low methane” emitting ruminants [16], with at least some isolates employing alcohol-fueled methanogenesis [17].

Our isolation of a ruminant-derived isolate of Methanosphaera sp. (BMS) has provided the first evidence for the existence of representatives with different genome sizes: a large genomotype (~2.9 Mbp) compared to those Methanosphaera isolates from human and macropodid hosts (~1.7 Mbp). The regions “unique” to the BMS genome were used for BLASTn analyses against the NCBI databases, and no significant matches to large blocks of other bacterial or archaeal genomes were recovered. The BarcodeByMers analysis also showed that the tetranucleotide profiles of three fully sequenced genomes from human, macropodid, and ruminant origin are virtually identical. Furthermore, the ANI scoring and the Mauve alignments of the BMS and large population genomes revealed the existence of large syntenic blocks extending across the 1 Mbp “unique” region present in these genomes. Taken together, these results suggest that the BMS genome is not a chimeric assembly and the extra genome content is unlikely to be the result of recent gene transfer events. Based on these findings, we propose that the Methanosphaera genus is monophyletic, but actually comprised of two genomotypes. The phylogenetic trees constructed from the genomic data of our isolates and the Methanosphaera-affiliated population genomes also support our hypothesis that the Methanosphaera spp. with a large genome represent lineages restricted to ruminant hosts (Figs. 3, 4 and supplementary Figures S6, S7 and S8). In that context, we recovered seven Methanosphaera-affiliated population genomes, and 5/7 of these are large, and all 5 were recovered from ruminant-derived datasets (Table 1). The EDGAR phylogenetic tree for all 12 Methanosphaera genomes, constructed using a concatemer of “core genes” from each representative, produced a clear separation of the Methanosphaera spp. with the smaller genomotypes (i.e., DSMZ3091T, PA5, DEW79, WGK6, DSMZ4103T, and SHI1033) from the large genomotypes (Fig. 4).

In general terms, the total gene counts in most defined COG categories are similar between the two genomotypes despite their different sizes, further establishing the relatedness among the strains with respect to core functions. An example of the extent of sequence conservation across the genus is illustrated for the energy-conserving hydrogenase genes (EhaRST) and flanking regions (Figure S3). Remarkably, the carbohydrate-active enzyme (CAZy) profiles of the respective genomes, and in particular the total gene count and representation of different glycosyltransferase (GT) families, appear to be strongly conserved across the two genomotypes. The GT2 and GT4 families are the most numerous across all the genomes (Table S8), and an unusual NAD-dependent α-N-acetylgalactosaminidase (GH109) is also conserved across the Methanosphaera-affiliated genomes, along with multiple copies of GT28 genes, which are infrequently observed in Archaea but are proposed to be part of the minimal 3-gene set for peptidoglycan metabolism in other organisms [47]. Collectively, these findings offer new insights into cell wall biosynthesis and decoration in methanogenic archaea, both of which may play substantive roles in coordinating host-microbe interactions and immunomodulation.

Although the pan-genome analyses revealed a conserved set of functions coordinating the hydrogen-dependent reduction of methanol to methane, and emphasized the limited nutritional versatility of the Methanosphaera genus, some notable differences were also observed. The SHI1033 population genome was recovered from the rumen metagenomic data of “low methane” emitting sheep [16] and is most similar (in size and content, Figure S9) to strain WGK6, isolated from a macropodid (kangaroo) host. Interestingly, both the WGK6 and SHI1033 genomes possess the alcohol and aldehyde dehydrogenase genes that are considered key to the hydrogen-independent growth of strain WGK6 [17]. The macropodids are recognized as being “low methane emitters” compared to ruminant livestock [48], but more recently “low methane” emitting ruminants have also been identified. These animals have a shorter retention time of feed within the rumen, and a bacterial “ruminotype” associated with reduced levels of ruminal hydrogen production [49]. As such, the capacity for alcohol-fueled methanogenesis by some Methanosphaera spp. appears to be an adaptation to survival in the low-hydrogen environments of “low methane emitting” animals.

Our gene- and genome-based phylogenetic analyses support the bifurcation of Methanosphaera spp. on the basis of genome size and host specificity, with strain BMS and the other representatives with a large genome size recovered only from ruminant hosts. We therefore conducted growth studies with the human and bovine-derived strains of Methanosphaera using media prepared to simulate the growth habitats of the bovine rumen and human gut, by the provision of either rumen fluid (RF) or human fecal water (FW), respectively. Interestingly, the growth kinetics of the human isolate were largely unchanged by the provision of either RF or FW at different concentrations, suggesting that its growth is not rate-limited by the nutrient milieu present in either type of extract. Conversely, while the growth kinetics of bovine-derived strain BMS were essentially the same when cultured with RF added at 5 or 10% (vol/vol), the substitution of RF with FW did result in a marked decrease in both growth rate and yield. Furthermore, the reductions in strain BMS growth kinetics observed when FW was reduced from 5 to 1% (vol/vol) suggest that the FW nutrient milieu is rate-limiting for growth, rather than that FW possesses non-nutrient factors antagonistic to BMS growth (e.g., anti-microbial proteins). Taken together, these findings suggest that the nutrient requirements of Methanosphaera spp. with larger genomes are more complex than those with smaller genomes, and that this reflects their additional gene content. Such findings, while limited in scope, are also consistent with our inability to recover representatives of Methanosphaera spp. with large genomes from monogastric hosts, and with the recovery of a representative of Methanosphaera spp. with a small genome from a ruminant host (SHI1033).
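As an illustration of how growth rates of the kind compared above can be quantified, the sketch below estimates a specific growth rate from optical density readings taken during exponential growth, using a least-squares fit of ln(OD) against time. How growth kinetics were actually calculated is not stated here, so the method, the function names, and the example readings are all assumptions for illustration only.

```python
from math import log


def specific_growth_rate(times_h, od_values):
    """Estimate the specific growth rate (per hour) as the least-squares
    slope of ln(OD) versus time, assuming exponential-phase readings."""
    if len(times_h) != len(od_values) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD readings")
    log_od = [log(v) for v in od_values]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    return sxy / sxx


def doubling_time(mu):
    """Doubling time in hours for a specific growth rate mu (per hour)."""
    return log(2) / mu


# Hypothetical readings (not data from this study): hours and OD values.
hours = [0, 6, 12, 18, 24]
od = [0.05, 0.08, 0.13, 0.21, 0.34]
mu = specific_growth_rate(hours, od)
td = doubling_time(mu)
```

Under these assumptions, comparing the fitted rate and the final OD (as a proxy for yield) between RF- and FW-supplemented cultures would express the "marked decrease in both growth rate and yield" in numerical terms.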

The evolutionary origins of the gut microbiota, particularly in humans, remain enigmatic. While progress has been made in establishing that some of the major lineages of hominid gut bacteria diverged coordinately (co-speciated) with their host [50], scant information is available for methanogenic archaea, and in particular the genus Methanosphaera. Here, our combined use of cultivation- and metagenomics-based methods has provided an increased representation of the Methanosphaera genus that is sufficient to reveal novel insights into its phylogeny, and to suggest that the variation in genomic content across the genus likely contributes to host specificity. The genome-scale phylogenetic and comparative analyses confirmed the monophyletic origin of the genus Methanosphaera, which can be bifurcated into two genomotypes, with the smaller genomotype enriched in animal hosts with a “monogastric” digestive system and the large genomotype restricted to ruminant hosts.