Introduction

Bangladesh is experiencing rapidly increased energy consumption over the past two decades. Being one of the world’s most densely populated and least urbanized countries, around 72% of population of this country live in rural areas where there is no supply of natural gas, the main source of energy1. The access to clean and affordable energy is one of the prerequisites to achieve the sustainable development in rural areas. Bangladesh is currently endowed with 25.7 million cattle, 0.83 million buffaloes, 14.8 million goats, 1.9 million sheep, 118.7 million chicken and 34.1 million ducks, and the manure from these animals could provide a reliable source of clean energy2. The production of biogas from CD through anaerobic digestion process cannot only provide fuel, but is also important for reduction of fertilizer nutrient utilization and rural forest conservation in rural areas3,4. The fecal materials of cattle or CD harbors a rich microbial diversity, containing different species of bacteria (Bacillus spp., Corynebacterium spp. Citrobacter koseri, Enterobacter aerogenes, Escherichia coli, Klebsiella oxytoca, K. pneumoniae, Kluyvera spp., Morgarella morganii, Pasteurella spp., Providencia alcaligenes, Providencia stuartii, Pseudomonas spp., and Lactobacillus spp.), protozoa and yeast (Saccharomyces and Candida), which makes them suitable for microbial degradation of wastes5,6. The slurry from CD was maintained in the ratio of 1:10 or 1:25 is able to degrade the rural, urban and hospital wastes, including oil spillage to five basic elements5,6. Recent research findings indicate that CD can supply nutrients and energy required for microbial growth thereby resulting in the bioremediation of pollutants, and energy production5,7.

Biogas (CH4), produced from an AD is an emerging renewable source of energy. This is a relatively high-value fuel and continuous source of energy supply to solve the environmental and energy challenges to replace natural gas or transportation fuel, and an insurance for future energy in a sustainable and environment-friendly manner8. Anaerobic environments play critical roles in the global carbon cycle through the digestion of organic agricultural waste, manure, municipal waste, digester materials, sewage, green waste or food. Moreover, anaerobic digestion of livestock manure improves organic fertilizer quality compared with undigested manure9, and the load of the pathogenic microorganisms and related antimicrobial resistance is also decreased through the biological process of anaerobic degradation10. The rising energy prices and increasing concern of emission of greenhouse gases are the major concern for the people and agro-industries worldwide to consider the wider application of AD technology. This sustainable technology has been viewed as a way to address environmental concern through the generation of CH4 within engineered bioreactors, and thereby reducing the human dependence on fossil fuels11,12. Furthermore, environment friendly renewable energy produced from locally available raw materials and recycled waste could thus contribute to climate change mitigation8. The renewable energy (CH4) produced from AD which is independent of weather conditions could serve for the production of electricity, heat and fuels13. Anaerobic transformation of organic wastes in the AD is carried out by different bacterial and archaeal species, such as hydrolytic, acid forming, acetogenic, and methanogens which produce CO2 and CH4 as the main products of the digestion process11. Though biogas production is directly influenced by the composition of the AD microbiomes11,14, the genomic potentials of the microbiomes favoring anaerobic metabolism to control the level of energy production is thermodynamically dependent on environmental parameters of the AD15.

Diverse microbial communities are associated with biomass decomposition and CH4 production through the metabolic activities of substrate hydrolysis, acidogenesis, acetogenesis and methanogenesis16. A clear understanding of the structure, composition and diversity of the multifarious microbial community involved in biogas production is crucial for the optimization of their performance and stable operational process of the AD. Moreover, a detailed insight into relevant microbial metabolic pathways involved in CH4 synthesis and syntropy is essential to upsurge the yield of biogas. However till to date, only few studies have focused on the taxonomic and functional characterization of microbiomes originating from both laboratory-scale11 and full-scale11,17 biogas reactors under different prevailing ecological conditions14,18. The conventional culture-based techniques19,20 for characterization of the microbiotas in different niches including controlled anaerobic chambers13,21 has been replaced during the last decade by the rapid advances in high-throughput NGS technology and bioinformatics tools22,23. Despite, the 16S rRNA partial gene sequencing approach remained the most widely used genomic approach to study the microbiomes of the AD24,25, several inherent limitations including the polymerase chain reaction (PCR) bias, inability to detect viruses, lower taxonomic resolution (up to genus level only), and limiting information on gene abundance and functional profiling have made this technique questionable22,23. Conversely, the shotgun WMS approach which can identify the total microbial components of a sample (including viruses, bacteria, archaea, fungi, and protists), is being used prudently to decipher the phylogenetic composition, microbiome structure and diversity including profiling of their functional characteristics and interconnections24,26. The current accelerated pace of genomic approaches identified more than 150 species of microorganisms27 such as Clostridium bornimense, Herbinix hemicellulosilytica, Herbinix luporum, Herbivorax saccincola, Proteiniphilum saccharofermentans, Petrimonas mucosa, Fermentimonas caenicola, and Proteiniborus indolifex or even their genomic features to be associated with increased production of biogas11,14,17,27,28,29. To address the changes in microbiome diversity and composition associated with different level of CH4 production, we present a comprehensive deep metagenomic (WMS) analysis of sixteen (n = 16) samples collected from AD periodically loaded with CD under different pH, CO2, O2, H2S and temperature level. Using a homogeneous mapping and annotation workflow associated with a de-replication strategy, our analyses identified ~ 2800 distinct bacterial and archaeal species along with their co-presence networking, antimicrobial resistance and metabolic functional profiling. This study therefore provides an opportunity to investigate the microbiome composition and diversity, and their genomic potentials under controlled anaerobic condition through shotgun metagenomic sequencing, and also to relate their functional activities to generate renewable energy under changing environmental condition and processing parameters.

Results

Physicochemical properties of substrate and digesta

We observed distinct differences in the physicochemical properties of the digester feedstock throughout the experiment which, thus, could have affected the digester performance. The physicochemical properties of the digester feedstock before and after the anaerobic digestion of CD are shown in Fig. 1. The fermentation process of the AD was run for 45 days (Day 0 to Day 44) with a periodic loading of raw CD. The mean loading dose of the raw CD was 90.56 kg/load (highest input volume = 375 kg and lowest input volume = 35 kg) (Table S1). Periodic increments of the raw CD resulted in increased biogas production (Fig. 1A, Table S1). This trend was observed until Day 35, after which biogas production began to decline gradually, although the CD loading was continued. The log phase of methane (CH4) production started around the second day of the experiment, and reached its maximum percentage (74.1%) at Day 35 of the digestion process. It should be pointed out that after Day 35, the CH4 percentage started to decrease, reaching 59.2% on Day 44 (Table 1). Average concentration (%) CO2 was observed 39.52 (minimum = 27.7, maximum = 56) throughout 44 days of the digestion process. Concentration of H2S was maximum (938 ppm) at Day 3, and later on the concentration fluctuated based on feeding (Fig. 1A, Table S1). The overall environmental temperature, AD temperature, AD pressure and humidity were 34.75 °C (maximum = 38.8 °C, minimum = 32.0 °C), 34.46 °C (maximum = 51.0 °C, minimum = 0.0 °C), 22.52 mb (maximum = 56.41 mb, minimum = 0.0 mb), and 55.5% (maximum = 94.0%, minimum = 42.0%) (Fig. 1B, Table S2). On Day 35 of the digestion, when maximum methanogenesis was observed, the concentration of organic carbon (OC) and total nitrogen (TN) in the fermentation pulp were 15.48% and 1.22%, respectively, whereas the concentration of OC and TN in the slurry (CD + seed sludge; Day-0) were 34.39% and 1.96%, respectively (Table 2). The overall C/N ratio of the feedstock also gradually decreased with the advent of anaerobic digestion process, and found lowest (12.7:1) at Day 35. Similarly, the amount of non-metallic element (phosphorus and sulfur) and heavy metals (chromium, lead and nickel) content significantly decreased at the Day 35 of the digestion process (Table 2). However, the amount of zinc and copper did not vary significantly throughout the digestion period (Table 2). As shown in Fig. 2, number of operational taxonomic units (OTUs) were significantly positively correlated with AD bacterial α-diversity (Observed, Chao1 and Shannon indices; r were 1.00, 1.00, and 0.57, respectively, p < 0.05). Similarly, we found significant positive correlation between CH4 concentration (%) and pH (r = 0.92, p < 0.001), CH4 concentration (%) and O2 level (%) (r = 0.54, p < 0.05), and CH4 concentration (%) and environmental temperature (°C) (r = 0.57, p < 0.05) (Fig. 2). We also found positive correlation between the AD CO2 (%) and H2S (%), environmental temperature and digester temperature, environmental temperature and digester pressure, digester temperature and digester pressure (r = 0.64, p < 0.01; r = 0.71, p < 0.01; r = 0.72, p < 0.01; r = 0.89, p < 0.01, respectively). Interestingly, we also found a negative correlation between CO2 and humidity, O2 and humidity, environmental temperature and humidity, and digester temperature and humidity (Fig. 2).

Figure 1
figure 1

Dynamic changes in the physicochemical parameters of the anaerobic digester (AD) over the study period. The R package, ggplot2 (https://cran.r-project.org/web/packages/ggplot2/index.html) was used to visualize the line graphs.

Table 1 Metagenome samples and their groupings in the AD according to experiment time and methane (CH4) concentration up to Day 35 of digestion (highest methane production level).
Table 2 Physicochemical properties of raw cow dung, slurry and active sludges.
Figure 2
figure 2

Pairwise Pearson’s correlation between each of the two physicochemical and microbial alpha diversity characteristics. Black and blue indicate positive and negative correlation, respectively. The figures demonstrate the scale of correlation. pH, CH4 (%), CO2 (%), O2 (%), Others (%), H2S (ppm), Env_Temp (°C) (Environmental_Temperature), Dig_Temp (°C) (Digester_Temperature), Dig_pre (mb) (Digester_pressure), Humidity (%) are different physicochemical parameters. Num_OTUs are the numbers of microbial OUTs. Observed, Chao1, Shannon, and Simpson are the α-diversity indices. *Significant level (■ p < 0.1; *p < 0.05; **p < 0.01; ***p < 0.001). The R package, chart. Correlation function of PerformanceAnalytics (https://cran.r-project.org/web/packages/PerformanceAnalytics/index.html) was used to analyze and visualize the plot.

Microbiome composition and diversity in the AD

The samples (n = 16) were categorized into four groups (Group-I, Group-II, Group-III and Group-IV) based on the level of energy production (CH4%) (Table 1). Group-I (n = 2) belonged to initial time of energy production, Group-II (n = 5) and Group-III (n = 5) had 21–34% and 47–58% of CH4 concentration, respectively. The samples having the highest 71–74% of CH4 was categorized into Group-IV (n = 4). The WMS of 16 sample libraries resulted in 380.04 million reads passing 343.26 million reads quality filters, which corresponded to 90.32% total reads (individual reads per sample are shown in Data S1). The major microbial domain in all samples was Bacteria with an abundance of 81.80%, followed by Archaea (15.43%), and Viruses (2.77%) (Data S1). From these WMS reads, 380–1384 OTUs (average = 850 ± 22) were identified representing the distribution of microbial taxa cross four groups of the AD. Remarkably, the Group-IV metagenome always had higher number of OTUs compared to other groups (Table 1). The alpha diversity (i.e., within-sample diversity) of the AD microbiomes was computed using the Shannon and Simpson estimated indices (i.e., a diversity index accounting both evenness and richness) at the strain level. In this study, both Shannon and Simpson indices estimated diversity significantly varied across the four sample groups (p = 0.03541, Kruskal–Wallis test). The pair-wise comparison of the within sample diversity revealed that the microbiomes of the Group-II significantly differed with those of Group-III and Group-IV (p = 0.048, Wilcoxon rank sum test for each) compared to Group-I (p = 0.91, Wilcoxon rank sum test) (Fig. 3A, B). The rarefaction analysis of the observed species showed a plateau after, on average, 21.45 million reads (Fig. S2, Data S1)-indicating that the coverage depth for most samples was sufficient to capture the entire microbial diversity. We also observed significant differences in the microbial community structure among the four metagenome groups (i.e., beta diversity analysis). Principal coordinate analysis (PCoA) at the strain level (Fig. 3C), showed a distinct separation of samples by the experimental groups. Besides, we found significant (p = 0.032, Kruskal Wallis test) differences in the abundance of ARGs and metabolic functional genes/pathways (DataS2) which could strongly modulate the level of energy production through microbiome dysbiosis in the AD.

Figure 3
figure 3

Microbiome diversity, composition and distribution in four metagenomic groups of the anaerobic digestate samples. (A-B) Box plots showing significant differences in observed species richness in AD associated microbiome. Alpha diversity, as measured by PathoScope (PS) analysis using the Shannon and Simpson diversity indices, revealed distinct microbiome diversity across four metagenome samples (p = 0.03541, Kruskal–Wallis test). (C) The experimental groups were clearly separated by principal coordinate analysis (PCoA), which was measured using non-metric multidimensional scaling (NMDS) ordination plots. The different shapes represent the assigned populations in four metagenomes. As the day progresses, the group color becomes lighter. Values in parentheses represent the fraction of the total variance explained by each PCoA axis. (D) Venn diagram showing unique and shared bacterial strains, (E) Venn diagram showing unique and shared archaeal strains. (F) Stacked bar plots showing the relative abundance and distribution of the total microbial community (phyla level), The archaeal phylum, Euryacaeota showed an increasing trend from initial stage to the final stage (Group-IV) of digestion. (G) Dominant microbial community (≥ 1% reads at least one sample) at species level were visualized in heatmap. Forty-three dominant species were found in this study. Red, blue, and white colors represent different mapped reads to specific taxa from higher to lower respectively. FunRich (http://funrich.org/) tool was used to draw Venn diagram. Data were processed through Phyloseq (https://www.bioconductor.org/packages/release/bioc/html/phyloseq.html) R package, visualized by using ggplot2 (https://cran.r-project.org/web/packages/ggplot2/index.html). Heatmap was generated by using Pheatmap (https://cran.r-project.org/web/packages/pheatmap/pheatmap.pdf) R package.

In this study, on an average 0.43% WMS reads (assigned for r RNA genes) mapped to 28, 110 and 552 bacterial phyla, orders and genera respectively, and relative abundance of the microbiome differed significantly (p = 0.034, Kruskal–Wallis test) across the metagenome groups (Data S1). We observed significant shifts/dysbiosis in the microbiome composition at strain level. The PS analysis detected 2,513 bacterial strains across the four metagenomes, of which 768, 1421, 1819 and 1774 strains were found in Group-I, Group-II, Group-III and Group-IV metagenomes, respectively. Only, 18.34% bacterial strains were found to be shared across the four energy producing metagenomes (Fig. 3D, Data S1). The archaeal fraction of the AD microbiomes was represented by 5, 17, 61 and 343 archaeal phyla, orders, genera and strains, respectively, and the relative abundance of these microbial taxa also varied significantly among the four metagenome groups (Fig. 3E). Remarkably, 95.90% (329/343) of the detected archaeal strains shared across these metagenomes (Fig. 3E, Data S1). In addition, 472, 536, 535 and 536 strains of bacterial viruses (bacteriophages) were identified in Group-I, Group-II, Group-III and Group = IV metagenomes, respectively (Data S1).

Microbial community dynamically changed over time in the AD

Significant changes in the abundances of core microbial groups were observed under anaerobic condition of the AD. At phylum level, the AD metagenome was dominated by Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Spirochaetes and Fibrobacteres comprising > 93.0% of the total bacterial abundances. Among these phyla, Firmicutes was the most abundant phylum with a relative abundance of 41.94%, 37.99%, 40.40% and 38.96% in Group-1, Group-II, Group-III and Group-IV, respectively. The relative abundance of Bacteroidetes (from 37.87% in Group-I to 22.40% in Group-IV) and Actinobacteria (from 3.94% in Group-I to 3.30% in Group-IV) gradually decreased with the advance of AD digestion time. Conversely, relative abundance of Proteobacteria (from 8.08% in Group-I to 18.92% in Group-IV) and Spirochaetes (from 1.28% in Group-I to 3.70% in Group-IV) gradually increased with the increase of anaerobic digestion time in AD. The rest of phyla also differed significantly across these four groups keeping comparatively higher relative abundances during highest CH4 producing stage (Group-IV) of the AD (Fig. 3F). The relative abundances of archaeal phylum Euryarchaeota, steadily increased with the increasing demand of energy (lowest relative abundance in Group-I and highest relative abundance in Group-IV) (Fig. 3F). Similarly, Clostridiales and Bacteroidales were identified as the top abundant order in Group-1, Group-II, Group-III and Group-IV with a relative abundance of 32.37%, 27.81%, 29.22% and 27.87%, and 32.49%, 27.42%, 27.53% and 14.94%, respectively (Data S1).

The structure and relative abundances of the bacteria at the genus level also showed significant differences (p = 0.031, Kruskal–Wallis test) across the study groups. In Group-I, Group-II and Group-III metagenomes, Bacteroides was the most abundant bacteria with a relative abundance of 18.10%, 14.90% and 15.16%, respectively, but remained lower (8.31%) in Group-IV samples. Clostridium was found as the second most predominant bacterial genus, and the relative abundance of this bacterium was 11.92%, 11.13%, 11.73% and 12.15% in Group-1, Group-II, Group-III and Group-IV, respectively. The relative abundance of Ruminococcus, Eubacterium, Parabacteroides, Fibrobacter, Paludibacter, Porphyromonas and Bifidobacterium gradually decreased with the increase of energy (CH4) production rate, and remained lowest in Group-IV. Conversely, Candidatus, Bacillus, Treponema and Geobacter showed an increasing trend in their relative abundances gradually with the advance of digestion time and remained lowest in relative abundances in Group-IV. The rest of the bacterial genera had lower relative abundances in four metagenomes of the AD (Fig. S3, Data S1). In our present study, Methanosarcina was the most abundant archaeal genus, and the relative abundance of this genus remained two-fold higher in Group-III (35.84%) and Group-IV (36.53%) compared to Group-I (17.52%) and Group-II (18.32%). Notably, the relative abundance of Methanoculleus was found higher in Group-II (11.59%) and Group-IV (13.80%) and lowest in Group-I (3.46%). Likewise, Methanobrevibacter was predominantly abundant at the initial phage of digestion (highest in Group-I; 19.35%) and remained lowest in abundance in the top CH4 producing metagenome (Group-IV; 5.01%). Besides these genera, Methanothermobacter (5.30%), Methanosaeta (5.16%), Methanococcus (4.74%), Thermococcus (2.96%), Methanocaldococcus (2.53%), Pyrococcus (2.35%), Methanosphaera (2.32%) Methanococcoides (2.10%) and Archaeoglobus (2.01%) were the predominantly abundant archaeal genera in Group-I samples and their relative abundances gradually decreased with the increase of energy production (Fig. S4, Data S1). On the other hand, Methanoregula (6.43%), Methanosphaerula (2.99%), Methanoplanus (2.37%) and Methanohalophilus (1.39%) were the most abundant archaeal genera in Group-IV metagenome. The rest of the genera remained much lower (< 1.0%) in relative abundances but varied significantly across the four metagenomes (Fig. S4, Data S1).

The strain-level composition, diversity and relative abundances of the microbiomes across four metagenomes revealed significant variations (p = 0.011, Kruskal–Wallis test) (Data S1). In this study, 2,513 bacterial and 343 archaeal strains were detected, of which 18.35% (461/2513) bacterial and 95.92% archaeal strains shared across the study metagenomes (Fig. 3, Data S1). Most of the bacterial strains detected were represented by the phylum Firmicutes followed by Bacteroidetes, Gammaproteobacteria and Betaproteobacteria (Fig. S5, Data S1). Of the detected strains, methanogenic archaeal strains were more prevalent (higher relative abundances) compared to bacterial strains, and this stain-level microbiome profiling was more evident in highest energy producing metagenome group (Group-IV). The most prevalent energy producing archaeal strains in Group-IV were Methanosarcina vacuolata Z-761 (17.31%), Methanosarcina sp. Kolksee (16.63%), Methanoculleus marisnigri JR1 (5.0%), Methanothrix soehngenii GP6 (4.61%), Methanobacterium formicicum DSM 1535 (3.60%), Methanoculleus sp. MAB1 (2.07%) and Methanoculleus bourgensis DSM 3045 (2.07%), and rest of strains had lower (< 2.0%) relative abundances (GData S1). Moreover, the relative abundances of these strains gradually increased with the increase of energy production (lowest relative abundance in Group-I and highest relative abundance in Group-IV) (Data S1). Conversely, the relative abundances of most of the bacterial strains identified gradually decreased with the advance of digestion time (increase of energy production), and mostly remained higher in relative abundances in Group-I (Data S1). Of the top abundant bacterial strains, Bifidobacterium pseudolongum subsp. globosum DSM 20,092 (12.0%), Phocaeicola dorei DSM 17,855 (6.61%), Fibrobacter succinogenes subsp. succinogenes S85 (4.57%), Faecalibacterium prausnitzii M21/2 (2.89%), Clostridiales bacterium CCNA10 (2.78%), and Flintibacter sp. KGMB00164 (2.07%) were found in Group-I, and their abundances gradually decreased with the increase of level of CH4 production. In addition, Dysosmobacter welbionis J115 (5.48%) remained more prevalent in Group-II (Data S1). The rest of the bacterial strains were less abundant (< 2.0%) across the four metagenomes (Data S1). The viral fraction of the microbiomes mostly dominated by different strains of bacteriophages such as Gordonia phage Secretariat (16.12%), Streptomyces phage Bing (5.33%) and Arthrobacter phage Gordon (5.05%) in Group-I, Megavirus chiliensis (1.81%), Acanthamoeba polyphaga moumouvirus (1.60%) and Orpheovirus IHUMI-LCC2 (1.50%) in Group-II, Stenotrophomonas phage Mendera (4.88%), Choristoneura fumiferana granulovirus (3.0%) and Gordonia phage Secretariat (2.55%) in Group-III and Stenotrophomonas phage Mendera (2.58%), Choristoneura fumiferana granulovirus (2.37%) and Bacillus phage Mater (1.47%) in Group-IV (Data S1).

Dominant microbial consortia in the AD

In the present study, we detected 43 dominant species including 31 bacterial and 12 archaeal species belonged of 9 phyla (8 bacterial and 1 archaeal) (Fig. 3G, Data S1). In the digestate of AD, Firmicutes, Bacteroidetes, Fibrobacteres, Spirochaetes, Actinobacteria, Candidatus Cloacimonetes, Proteobacteria, and Thermotogae were the dominant bacterial phyla while Euryarchaeota was the only archaeal phylum dominating in the AD. Methanogenic archaeal species were dominant in Group-III and Group-VI (M. vacuolata, Methanosarcina sp. Kolksee, M. marisnigri, M. soehngenii, M. formicicum and Natrialbaceae archaeon XQ-INN 246) (Fig. 4A). Most of the dominant bacterial species had negative correlation with digester pH, CH4, humidity and pressure. Of the detected dominant species, most of the dominating methanogenic archaeal species such as M. vacuolata, Methanosarcina sp. Kolksee, M. marisnigri, M. soehngenii, M. formicicum, Methanoculleus sp. MAB1, M. bourgensis, M. labreanum, M. barkeri, M. formicica and M. congolense showed strongest positive correlation with pH, CH4 (%), digester pressure (%) and temperature (°C) (Spearman correlation; r > 0.6, p < 0.01) (Fig. 4B). On the other hand, bacterial species including Roseburia hominis, Flintibacter sp., P. dorei, Alistipes shahii, B. thetaiotaomicron, C. butyricum, Megasphaera elsdenii, R. intestinalis, C. bacterium CCNA10, B. pseudolongum, B. fragilis, L bacterium GAM79, F. prausnitzii and I. butyriciproducens, and archaea species N. archaeon XQ-INN 246 had significant negative correlation with digester pH, CH4 concentration (%) and humidity (%) (Spearman correlation; r > 0.6, p < 0.01) (Fig. 4B).

Figure 4
figure 4

Microbial abundance and the correlation between different physicochemical parameters and microbial relative abundance. (A) The species-level taxonomic abundance of microbiome. Stacked bar plots showing the relative abundance and distribution of the dominant abundant species (43 taxa), with ranks ordered from bottom to top by their increasing proportion among the four metagenomics groups. Each stacked bar plot represents the abundance of each strain in each sample of the corresponding category. The relative abundances of archaeal species (red color legends) steadily improved as energy demand increased (lowest relative abundance in Group-I and highest relative abundance in Group-IV). In contrast, the relative abundances of most of the known bacterial species gradually decreased with the passage of time (increased energy production), and mostly remained higher in Group-I and lower in Group-IV. (B) Spearman’s correlation analysis between different physicochemical parameters [pH, CH4 (%), CO2 (%), O2 (%), Others (%), H2S (ppm), Env_Temp (°C) (Environmental_Temperature), Dig_Temp (°C) (Digester_Temperature), Dig_pre (mb) (Digester_pressure), Humidity (%)] and dominant microbial relative abundance at species level. The numbers display the Spearman’s correlation coefficient (r). Blue and red colors indicate positive and negative correlation, respectively. The color density, circle size, and numbers reflect the scale of correlation. *Significant level (*p < 0.05; **p < 0.01; ***p < 0.001). The R packages, Hmisc (https://cran.r-project.org/web/packages/Hmisc/index.html) and corrplot (https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html) were used respectively to analyze and visualize the data.

Identification of potential indicator species and their co-occurrence

To identify microbial taxa (bacteria and archaea) that could discriminate across the four metagenome groups of the AD in terms of energy production (% CH4), the indicator species analysis (ISA) was performed both in individual group and combination basis, as shown in (Fig. 5). Indicator species were those which were significantly more abundant and present in all samples belonging to one group, and also absent or low abundance in the other group (Fig. 5, Data S1). The core taxa were selected based on their relative frequency (≥ 75% occurrence in each of the four groups) (Data S1). Although, 26, 3 and 19 indicator species were found in Group-I, Group-II and Group-IV, respectively, and no indicator species were identified in Group-III (Fig. 5A, Data S1). Higher indicator values (IVs) suggested better performances in the microbial signature of the assigned taxa. Desulfosporosinus youngiae, Treponema caldarium, Pseudoclostridium thermosuccinogenes, Dehalobacterium formicoaceticum, Methanofollis liminatans, Methanoregula boonei, Syntrophomonas wolfei, Hungateiclostridium clariflavum, Candidatus Cloacimonas acidaminovorans and Methanocorpusculum labreanum were highly specific for energy production in Group-IV (highest CH4 production rate; 74.1%), with IVs of 0.983, 0.978, 0.949, 0.907, 0.887, 0.885, 0.882, 0.851, 0.795 and 0.786, respectively (Fig. 5A; Data S1). Considering the combined group effects of the indicator species associated with energy production, our analysis revealed that Methanosarcina vacuolate, Dehalococcoides mccartyi, Methanosarcina sp. Kolksee and Methanosarcina barkeri in Group-III + Group-IV (top CH4 producing groups) having IVs of 0.88, 0.887, 0.879 and 0.879, respectively were highly specific for energy production (Fig. 5B, Data S1). All of the indicator phylotypes displayed reduced abundance in the initial stage of biogas production (Group-I and Group-II, lower CH4 production rate) compared to their increased relative abundance up to Day 35 of the experiment (in Group-III and Group-IV) (Data S1).

Figure 5
figure 5

Indicator species analysis of AD microbiome within four metagenomics groups. (A) Individual group effects of the indicator species associated with energy production, (B) combined group effects of the indicator species associated with energy production. Indicator values (IndVal) plotted next to the name of taxa range from zero to one, with larger values denoting greater specificity. Indicator values (IndVal) are shown next to the taxonomic information for the indicator taxa as indicated by Indicator. Size of symbol ((circle size) is proportional to the mean relative abundance in that group of AD and p-values (circle color) were plotted. Circles colored in grey indicate insignificant taxa. Higher indicator values (IVs) suggested better performances in the microbial signature of the assigned taxa. The R package, Indicspecies (https://cran.r-project.org/web/packages/indicspecies/index.html) was used to analyze the data, and ggplot2 (https://cran.r-project.org/web/packages/ggplot2/index.html) was used to draw these bubble groups.

We then visualized networks within each metagenome group of the AD for both positive and negative co-occurrence relationships (Fig. 6, Data S1). The correlation networks analysis was performed based on the significantly altered species (n = 106) in different groups as revealed by indispecies analysis. This network analysis explored significant association (p = 0.021, Kruskal–Wallis test) in the co-occurrence patterns of the energy producing microbial taxa (species and/or strains) based on their relative abundances in four metagenome groups. In the correlation network of four metagenomes; Group-I to Group-IV), Firmicutes and Bacteroidetes exhibited strongest relation. The resultant network consists of 106 nodes (17 in Group-I, 58 in Group-II, 5 in Group-III and 26 in Group-IV) which were clearly separated into four modules/clusters (Fig. 6). Taxa in the same group may co-occur under the same AD conditions (temperature, O2 and H2S percentage, pressure and humidity). Across different metagenome groups of AD, Firmicutes, Bacteroidetes, Actinobacteria and Proteobacteria were the top abundant phyla in Group-I and Group-II with a cutoff of 1.0 while Bacteroidetes and Chlorhexi in Group-III, and Euryarcheota and Firmicutes in Group-IV were designated as the top abundant phyla with a cutoff of 1.0 (Fig. 6). However, when moving down to the species-level in microbiome co-occurrence in the AD, keystone taxa were much more consistent between networks with different correlation cutoffs. These results reveal that applying the same conditions in the AD for energy production, network elements must happen under careful consideration of the parameters used to delineate co-occurrence relationships. The positive correlations between Group-I and Group-II were observed among the microbiomes of the AD while Group-III and Group-IV showed negative correlations in terms of energy production with the microbial taxa of other two groups (Fig. 6). These findings therefore suggest that different strains of Euryarcheota and Firmicutes phyla were negatively correlated but associated with highest level of energy production (highest % of CH4; Group-IV).

Figure 6
figure 6

Microbiome co-occurrence in the AD within the four metagenomic groups. Correlation networks of significantly altered species based on the indicator species analysis. Spearman’s absolute correlation coefficients ≥ 0.5 and p-values < 0.05 were retained. The node size is proportional to the mean abundance of the species. Nodes are colored by taxonomy with labelled genera names. The positive correlation is represented by the green line, while the negative correlation is represented by the red line. The nodes were grouped according to their higher abundance in each group (It means that a species can be found in all or at least one group. But they were kept in such a group where it was highly abundant). The microbiomes of the AD showed positive associations between Groups I and II, while Groups III and IV showed negative correlations in terms of energy production with the microbial taxa of the other two groups. The R packages, Hmisc (https://cran.r-project.org/web/packages/Hmisc/index.html) package was used to analyze correlation and Gephi (https://gephi.org/) software was utilized to visualize the network plot.

Genomic functional potentials of the anaerobic microbiomes

In this study, there was a broad variation in the diversity and composition of the antimicrobial resistance genes (ARGs) (Fig. 7, Data S2). The results of the present study revealed significant correlation (p = 0.0411, Kruskal–Wallis test) between the relative abundances of the detected ARGs and the relative abundance of the associated bacteria found in four metagenomes (Data S2). ResFinder identified 45 ARGs belonged to eight antibiotic classes distributed in 2,513 bacterial strains (Data S2). The Group-III microbiomes harbored the highest number of ARGs (42), followed by Group-II (38), Group-IV (29) and Group-I (22) microbes (Fig. 7, Data S2). The tetracyclines (doxycycline and tetracycline) resistant gene, tetQ had the highest relative abundance (23.81%) in Group-I associated bacteria followed by Group-II (22.85%), Group-III (16.49%) and Group-IV (6.73%)–microbes. Macrolides (erythromycin and streptogramin B) resistant genes such as mefA (16.80%), mefB (15.32%) and msrD (11.10%) had higher relative abundances in highest CH4 producing metagenome compared to other metagenome groups (Fig. 7A). The broad-spectrum beta-lactams resistant gene, cfxA2-6 was found as the common ARG among the microbiomes of four metagenomes, displaying the highest relative abundance (35.58%) in inoculum (Group-I) microbiota followed by Group-II (23.09%), Group-III (8.02%) and Group-IV (0.14%) microbiomes. The rest of the ARGs also varied in their expression levels across the four metagenomes, being more prevalent in the Group-III microbiomes (Fig. 7A, Data S2).

Figure 7
figure 7

Antibiotics resistance genes (ARGs) detected in anaerobic digestion driving microbiome and their Spearman’s correlation with physicochemical parameters and dominant microbial community. (A) The heatmap illustrates the distribution of 45 ARGs belonged to 8 antibiotic classes and 16 types of mechanisms found across the four metagenomes. The red, blue, white color determine the number of mapped reads and the number also illustrate the same things. (B) Spearman’s correlation between physicochemical parameters and ARGs represents the plot. (C) The correlation analysis between physicochemical parameters and ARGs classes. (D) The correlation analysis between physicochemical parameters and ARGs mechanisms. The numbers display the Spearman’s correlation coefficient (r). Blue and red indicate positive and negative correlation, respectively. The color density, circle size, and numbers reflect the scale of correlation. *Significant level (*p < 0.05; **p < 0.01; ***p < 0.001).

Most of the dominant archaeal species had stronger positive correlation with sul4, tet(W), tet(44), tet(C), Inu(B), tet(32), catQ, tetM) and Inu(D) (Spearman correlation; r > 0.5, p < 0.01), and were negatively correlated with tet(Q), tet(40), Inu(C), cfxA, cfxA2, cfxA3, cfxA4, cfxA5 and cfxA6 (Spearman correlation; r > 0.6, p < 0.01) (Fig. 7B, Data S2, Fig. S6). In contrast, the dominating bacterial species of the AD had significant positive correlation with cfxA, cfxA2, cfxA3, cfxA4, cfxA5, cfxA6, tet(Q), blaACI-1, tet(40) and Inu(C) (Spearman correlation; r > 0.6, p < 0.01) (Fig. 7B, Data S2, Fig. S6). Analyzing the correlation between physicochemical parameters and ARGs classes, we found that the digester pH, CH4 concentration, pressure and temperature were positively correlated with macrolide-lincosamide-streptogramin (MLS), tetracyclines and phenicol while most of the physicochemical parameters were negatively correlated with sulfonamides class of antibiotics (Spearman correlation; r > 0.6, p < 0.01) (Fig. 7C, Data S2, Fig. S7A). Likewise, the digester pH, CH4 concentration, pressure and temperature showed strongest positive correlation with MLS resistance MFS and ABC efflux pumps associated mechanisms (Spearman correlation; r > 0.5, p < 0.01) followed by chloramphenicol acetyltransferases and tetracycline resistance ribosomal protection proteins related functions (Fig. 7D, Data S2). Conversely, Class A betalactamases and sulfonamide resistance dihydropteroate synthases were the negatively correlated mechanisms with digester physicochemical parameters like pH and CH4 (Fig. 7D, Data S2). The mechanisms of the antimicrobial resistance also varied between the dominant bacterial and archaeal species (Fig. S7B). For instance, most of the bacterial species showed highest positive correlation Class A betalactamases (Spearman correlation; r > 0.5, p < 0.01) while the enriched archaeal species showed negative correlation with this functional mechanism of antibiotics. The methanogenic archaeal species predominantly identified however had significant positive correlation with tetracycline resistance ribosomal protection proteins related functions, 23S rRNA methyltransferases and MLS resistance MFS efflux pumps (Fig. S7B, Data S2). In addition to these ARGs, the highest CH4 producing microbiomes were enriched with the higher relative abundance of genes coding for cobalt-zinc-cadmium resistance (18.85%), resistance to chromium compounds (12.17%), arsenic (6.29%), zinc (4.96%) and cadmium (3.26%) resistance compared to the microbes of other three metagenomes (Fig. S8, Data S2). By comparing the possible mechanisms of the detected ARGs, we found that antibiotic efflux pumps associated resistance had the highest level of expression in the anaerobic microbiomes of the AD followed by antibiotic inactivation, enzymatic inactivation and modification, antibiotic target protection/alteration, and folate pathway antagonist-attributed resistance mechanisms (Fig. S8, Data S2). Notably, the physicochemical properties of the AD including pH, CH4 concentration (%), pressure, temperature and environmental temperature were found to be positively correlated with metal such as arsenic, copper, cadmium, mercury, chromium compounds and zinc resistance pathways (Spearman correlation; r > 0.5, p < 0.01). (Fig. 8A, Data S2). Remarkably, most of the methane producing dominant archaeal species showed significant positive correlation with these metal resistance pathways (Spearman correlation; r > 0.7, p < 0.01). (Fig. 8B, Data S2) while the predominant bacterial species to be associated with methane production showed stronger negative correlation with most of the metal resistance pathways (Spearman correlation; r > 0.7, p < 0.01). (Fig. 8B, Data S2).

Figure 8
figure 8

Pairwise Spearman’s correlation with metal resistances (from MG-RAST Seed subsystem) with phytochemical parameters and dominant microbial community. (A) Correlation with phytochemical parameters and (B) correlation with dominant microbial community. The numbers display the Spearman’s correlation coefficient (r). Blue and red indicate positive and negative correlation, respectively. The color density, circle size, and numbers reflect the scale of correlation. *Significant level (*p < 0.05; **p < 0.01; ***p < 0.001).

Functional metabolic profiling of the gene families of the same KEGG pathway for AD microbiomes revealed significant differences (p = 0.012, Kruskal–Wallis test) in their relative abundances, and positive correlation with level of energy production (Fig. 9, Data S2). Among the detected KO modules, genes coding for CHO metabolism and genetic information and processing were top abundant, however did not vary significantly across the metagenome groups. Remarkably, the relative abundance of genes coding for energy metabolism, xenobiotics biodegradation and metabolism, butanoate metabolism, citrate synthase (gltA), succinyl-CoA synthetase subunits (sucC/D), pyruvate carboxylase subunits (pycA) and nitrogen metabolism gradually increased with the increasing rate of CH4 production, and had several-fold over expression among the microbiomes of Group-IV. Conversely, fumarate hydratase (fumA/B), malate dehydrogenase (mdh) and bacterial secretion system associated genes were predominantly overexpressed in Group-I related microbiomes which gradually decreased with advance of digestion process, and remained more than two-fold lower expressed in the peak level of CH4 production (lowest in Group-IV) (Fig. 9A, Data S2). We also detected 41 statistically different (p = 0.033, Kruskal–Wallis test) SEED functions in the AD microbiomes. Overall, the top CH4 producing microbiomes (Group-III and Group-IV) had higher relative abundances of these SEED functions compared lower CH4 producing microbiomes (Group-I and Group-II), except for regulation of virulence (highest in Group-I microbes; 17.08%), gluconeogenesis (highest in Group-I microbes; 16.27%) and transposable elements (highest in Group-I microbes; 17.28%) (Fig. 9B, Data S2). The Group-IV-microbiomes (highest CH4 producing) were enriched in genes coding for tetrapyrroles (17.42%), one carbon (10.29%) and biotin (4.55%) metabolism, oxidative (18.76%) and osmotic (9.94%) stress, proteolytic pathway (7.74%), MT1-MMP pericellular network (6.45%), acetyl-CoA production (5.33%) and motility and chemotaxis (3.13%) compared to the microbes of the other metagenomes. The Group-I microbiomes however had a higher abundance of SEED functions involved in protection from ROS (16.28%), heat shock (18.31%) and NAD and NADP (19.03%) (Fig. 9B, Data S2). Analyzing correlation between AD physicochemical parameters and microbial genomic functions, we found that SEED subsystems such as cofactors, vitamins, prosthetic groups, pigments, membrane transport, motility and chemotaxis, respiration, stress response, dehydrogenase_complexes, dehydrogenase_ kinases, glycolysis and gluconeogenesis including archaeal enzymes, glyoxylate bypass, pentose phosphate pathway, pyruvate:ferredoxin oxidoreductase, pyruvate metabolism I and II had significant positive correlation with digester pH and CH4 concentration (%) (Spearman correlation; r > 0.6, p < 0.01). (Fig. 10A, Data S2). Simultaneously, the Kos like ascorbate and aldarate metabolism, citrate synthase (gltA), energy, glyoxylate, dicarboxylate, methane and nitrogen metabolism, xenobiotics biodegradation and metabolism, succinyl-CoA synthetase subunits (sucC/D) and oxidative phosphorylation were significantly positively correlated with digester pH, CH4 concentration (%), pressure and temperature (Spearman correlation; r > 0.6, p < 0.01). (Fig. 10B, Data S2).

Figure 9
figure 9

Functional genomic potentials of the anaerobic digestion associated microbial community through KEGG and SEED Pathways analysis. (A) Heatmap depicting the distribution of the 40 genes associated with the identified metabolic functional potentials detected by KEGG Pathways analysis within the four metagenomic groups of the AD microbiome. (B) Heatmap showing the distribution of the 41 functional gene composition and metabolic potential detected by SEED Pathways analysis within the four metagenomic groups of the AD microbiome. The color code indicates the presence and completeness of each gene, expressed as a value (Z score) between −2 (low abundance), and 2 (high abundance). The red color indicates the highest abundance whilst light green cells account for lower abundance of the respective genes in each metagenome. FunRich (http://funrich.org/) tool was used to these heatmaps.

Figure 10
figure 10

The correlation between different physicochemical parameters and genomic functional potentials of the AD microbiomes. The numbers display the Spearman’s correlation coefficient (r). Blue and red indicate positive and negative correlation, respectively. The color density, circle size, and numbers reflect the scale of correlation. *Significant level (*p < 0.05; **p < 0.01; ***p < 0.001). (A) Correlation analysis of different physicochemical parameters and SEED subsystems. (B) Correlation analysis of different physicochemical parameters and KEGG pathways. The R packages, Hmisc (https://cran.r-project.org/web/packages/Hmisc/index.html) and corrplot (https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html) were used respectively to analyze and visualize the data.

Discussion

This study is the first ever approach to reveal the dynamic shifts in microbiome composition and abundances in different levels of biogas production under the anaerobic digestion system using the state-of-the-art WMS technology along with analysis of the physicochemical parameters in Bangladesh. Anaerobic digestion of organic wastes is favored by the metabolic activities of different types of microorganisms including bacteria and archaea30. The physicochemical parameters assessment of the AD before and after digestion revealed that biogas production was in increasing trend up to Day 35 with periodic loading of slurry (Day 1, 2, 3, 8, 10, 14, 16, 24 and 35) at 1:1 ratio of raw CD and active sludge (Table S1). After 30 days of incubation, we found that cow wastes produced highest amount of biogas (on Day 35, CH4; 74.1%), and corresponds to an average pH of 7.01. Earlier studies showed that optimum pH range in an AD is 6.8 to 7.2 supporting our present findings31. The CH4 production level thereafter decrease gradually reaching 59.2% on Day 44 of processing (Fig. 1). Moreover, with the gradual increase of AD temperature and pH value (~ 7; neutral), the CO2 and CH4 concentration positively increased up to Day 35 which further correlated with the higher abundance of methanogenic bacteria. Biogas production chiefly depends on the content and chemical nature of biodegradable matter. Correlation analysis between the relative abundances of specific microbial taxa and digesters’ physicochemical parameters provides a qualitative analysis method to explore the possible roles of and the interactions among these microbes. Although, most of the dominant bacterial species had negative correlation with digester pH, CH4 concentration, humidity, temperature and pressure, the methanogenic archaea showed strongest positive correlation with these parameters. The biochemical parameter of cow waste (slurry) reflects the presence of high content of readily biodegradable organic matter in the first phases (up to Day 35) of anaerobic digestion30. The CO2 concentration (%) found to be varied throughout the digestion process keeping an average value of 39.52%. Our analysis revealed that OC and TN content was higher at the time of loading of slurry in the AD compared to that of highest biogas production stage (at Day 35). The amount of carbon available of the substrate determines the maximum amount of CH4 and CO2 that can be formed by anaerobic digestion9. Conversely, C/N ratio remained lowest in this peak stage of CH4 production. The OC is essential for bacterial growth, and determining of the C/N ratio is essential for optimal biogas production32. Moreover, the total content of phosphorus, Sulphur and heavy metals (chromium, lead and nickel) also remained lowest at this highest stage of biogas production (Day 35). The lowest chromium, lead and nickel concentration during highest CH4 producing stage might be associated with their higher abundances of heavy metal resistance genes, and small concentrations of these metals found in the process are essential for microbial maintenance33,34. Certain specific metals such as Sulphur, cobalt and nickel serve as cofactors in the enzymes involved in the formation of CH4 during anaerobic processing35. However, the minerals (e.g. zinc and copper) content of the AD did not vary throughout the digestion process revealing their important roles in various metabolic pathways of anaerobic digestion33. Since, CH4 gradually declined after Day 35 of digestion irrespective differences in other physicochemical parameters, we collected samples for microbiome analysis only up to peak level CH4 production (Day 35). The limitation of the study is that a complete correlation between heavy metal concentration and microbial community in terms of CH4 is missing, so we actually do not know the effect of the heavy metals on the single microorganism, but to perform this kind of analysis, the methods should be also changed significantly. Energy production in the AD is carried out by a series of biochemical reactions including hydrolysis, acidogenesis, and acetogenesis performed by fast-growing bacteria through which complex organic macromolecules into volatile fatty acids (VFA), CO2, and H2. However, in many cases, these reactions occur at low pH, while methanogenesis is performed by slower-growing bacteria that thrive best at neutral pH36. Remarkably, the pH, CH4 concentration, pressure and temperature were the main driving factors for variations in microbial community structure in the AD as also reported in previous studies25,30,36.

The within (alpha) and between (beta) sample diversity of the AD microbiomes showed that that microbial dysbiosis in the AD is closely linked to different levels of biogas production. Compared to loading phase of AD (Group-I), increased microbial diversity and species richness was observed in the later phases (Group-II, Group-III and Group-IV) of anaerobic digestion. Beta diversity also revealed a substantial microbial disparity in different levels of biogas production, and segregated the samples accordingly. Despite having higher taxonomic resolutions, the microbiomes of the AD remained inconsistent and fluctuates more in Group-II, Group-III and Group-IV than those of Group-I metagenome. The taxonomic annotations of the four groups of AD showed that they were a reservoir of bacteria, followed by archaea and viruses, which corroborated the findings of other studies37,38. Among the identified domains, bacteria dominated in abundance, comprising 81.80% of the total microbial populations, followed by archaea (15.43%), while viruses (2.77%) comprised the least abundant population. The observed high bacterial abundances suggest their crucial metabolic roles in biomass conversion and other reactions within the reactor systems38. The identified affiliates of archaea were mostly consumers of smaller substrates that were generated by the bacterial taxa. The archaeal species are able to use different methanogenic routes to convert the substrates into methane gas. Nevertheless, the main roles of the identified less abundant bacteriophages were unclear, though the strains could have been active in degrading other microbial cells in the AD systems37,38.

In this study, the most of the dominant methanogenic archaeal species (n = 12) were belonged to phylum Euryarchaeota and class Methanomicrobia and these findings are consistent with several previous studies17,38 on biogas plants where phylum Euryarchaeota (class Methanomicrobia) was reported as the dominant archaea contributing to the energy production.

The methanogenic archaea are obligate anaerobes that can reduce simple substrates to methane (CH4) to produce cellular energy. Most of the dominant archaeal species belonged to Methanosarcinales and Methanobacteriales orders which are syntrophic acetate oxidization digesters39. M. vacuolata can play roles in energy production by utilizing methylamines, and resistance to harsh conditions40. Methanoculleus sp. are acetate-oxidising bacteria, which convert acetate directly to methane and of considerable importance in ammonia-rich engineered biogas processes41. Several members of Firmicutes and Bacteroidetes phyla are well known as fermentative and syntrophic bacteria, and can ferment cellulose and cellobiose to produce acetic acid, ethanol, CO2/H2 and CH442. Moreover, the enriched consortia of Fibrobacteres, Spirochaetes, Actinobacteria, Proteobacteria, and Thermotogae phyla had negative correlations with physicochemical parameters, indicating some members working on acetate and butyrate syntrophic oxidation might be dominant in the AD42. The Candidatus Cloacimonetes phylum is reported to play a role in hydrolysis of cellulose and/or fermentation of hydrolysis products in AD43. In this study, Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Spirochaetes and Fibrobacteres were the most abundant bacterial phyla, and their relative abundance also varied according to the level of energy production in the corresponding sample groups. The bacterial phyla Firmicutes and Bacteroidetes appeared to dominate biogas communities in varying abundances depending on the apparent process conditions corroborating with earlier studies43. The observed community composition at phylum level, with dominance of Firmicutes, Bacteroidetes and Proteobacteria, is also in line with previous findings for biogas reactors34. The first three phases (hydrolysis/cellulolysis, acidogenesis/fermentation and acetogenesis) of the anaerobic digestion are solely performed by fermentative Bacteria. In this study, the bacterial phyla Firmicutes (37.0–42.0%) and Bacteroidetes (22.0–38.0%) appeared to dominate biogas communities in varying abundances depending on the apparent process conditions. The initial phase of the AD digestion process involves hydrolytic reactions that convert large macromolecules into smaller substrates37. In addition, the nature and composition of the substrates, availability of nutrients and ammonium/ammonia contents can affect both the composition and diversity of the methanogenic archaea37. Only certain methanogenic archaea are able to synthesize CH4 from the end products of bacterial fermentation. The performance and efficiency of these processes depend to a large extent on the presence of appropriate and adequate microorganisms along with the physicochemical conditions of the digester and quality of the substrates (organic materials) etc. The degree and rate of degradation (hydrolysis, fermentation, acetogenesis, and methanogenesis) and the biogas yield depend not only on the chemical and physical characteristics of the substrates, but also on the chosen process parameter such as temperature, humidity and retention time, that shape the composition of different microbial groups and communities that active in the process34. During the anaerobic process Clostridiales and Bacteroidales were identified as the top abundant order in all of the four metagenome groups. The large number and proportion of members of the Clostridiales order are indicative of the important role in the proper functioning of the microbial community in an AD fed with complex substrates. Numerous members of the Clostridiales order, and members of the Clostridiaceae family are capable of performing diverse fermentation pathways and known to use lactate and acetate as electron sources for the reduction of iron and cobalt under anaerobic condition. They primarily ferment sugars to organic acids through the reductive acetyl-CoA pathway, which is typical in acetogenic bacteria and in some Archaea4. In this process, CO2 is reduced to CO and then converted to acetyl-CoA, H2 serving as electron donor. It should additionally be noted that a large number of Clostridia actively produce H2, an important substrate for the hydrogenotrophic methanogens4,44. In anaerobic environments, Clostridiales has been reported as the main cellulose degrader45, and play important role in the hydrolysis step34. These findings are in line with many of the previous reports37,46 who reported the potential roles of these bacterial taxa in the efficient and increased production of biogas under anaerobic condition47. In addition, bacteria belonging to the order Bacteroidales have been suggested to be involved in the degradation of lignocellulose materials, such as straw and hay, the chief component of cattle feed47,48. The identified affiliates of Bacteroidales belong to Bacteroidetes, and are majorly known to ferment carbohydrates and proteins, concomitantly releasing H233. Moreover, Bacteroides, the predominating genus in the AD coming from Bacteroidales order were observed to co-exist with methanogenic archaea possibly to increase energy extraction from indigestible plant materials33, and we hypothesize that they are the key drivers of the observed β-diversities since the relative abundance of this genus gradually decreased with the digestion process.

The digestion process of the AD was carried out by the integrated cross-kingdom interactions since both bacteria and archaea were simultaneously detected in this WMS-based study corroborating with several earlier reports22,23,26. The strain-level taxonomic profiling revealed that methanogenic archaeal strains were more prevalent than bacterial strains. The archaeal domain of the AD microbiomes was composed of different strains of methanogenic, hydrogenotrophic and thermophilic genera of Methanoculleus, Methanosarcina, Methanothrix, Methanobacterium and Methanobrevibacter genera. The current findings are corroborated with many of the earlier studies who reported that these genera to be predominantly abundant in the AD of manures30, and associated with biogas production under anaerobic conditions49. These methanogenic genera might reside in the microenvironments appropriate for anaerobic metabolism26 and their presence has been reported in microbial communities producing biogas50. Members of Methanoculleus are hydrogenotrophic methanogens51, while Methanosarcina species or strains are mostly acetoclasic but also able to use H224,52. In addition, Methanosarcina spp. has been reported to have higher growth rates and tolerance to pH changes and could potentially lead to stable methanogenesis in the AD10,53. Methanobrevibacter was predominant in initial phase of digestion (Day 2 and 15) in the bioreactor, and are known to be hydrogenotrophic, by using CO2 and H2 as substrates to generate biomethane24,50. These archaeal genera are suggested to play vital role in hydrogenotrophic methanogenesis, and maintaining methanogenic community diversity21,25.

We found significant association across different methanogenic species with increased level of CH4 production54, with particular emphasis in the highest CH4 producing metagenome (Group-IV) of the AD. Mounting evidence suggested that, under anaerobic condition, the functional composition of the energy producing microbial communities are closely related to the physicochemical and environmental factors such as pH, temperature and pressure of the digester10,30,38,55. The ecological functions of microorganisms living in similar environments remained more similar, but the composition of microbial species (especially, the dominant species) performing the functions differed greatly corroborating with the results of several previous studies10,24,56. The methanogenic archaea are obligate anaerobes that can reduce simple substrates to methane (CH4) to produce cellular energy. Most of the dominant archaeal species belonged to Methanosarcinales and Methanobacteriales orders which are syntrophic acetate oxidization digesters39. M. vacuolata can play roles in energy production by utilizing methylamines, and resistance to harsh conditions40. Methanoculleus sp. are acetate-oxidising bacteria, which convert acetate directly to methane and of considerable importance in ammonia-rich engineered biogas processes41. Several members of Firmicutes and Bacteroidetes phyla are well known as fermentative and syntrophic bacteria, and can ferment cellulose and cellobiose to produce acetic acid, ethanol, CO2/H2 and CH442. Moreover, the enriched consortia of Fibrobacteres, Spirochaetes, Actinobacteria, Proteobacteria, and Thermotogae phyla had negative correlations with physicochemical parameters, indicating some members working on acetate and butyrate syntrophic oxidation might be dominant in the AD42. The Candidatus Cloacimonetes phylum is reported to play a role in hydrolysis of cellulose and/or fermentation of hydrolysis products in AD43.

The differences in ARGs profiles throughout different stages of anaerobic digestion were greatly affected by the taxonomic composition of microbiomes (archaea and bacteria), and the physicochemical parameters of the AD. The observed differences in microbiome composition and AD physicochemical properties are considered to be co-selection factors for antibiotic resistance57. Moreover, we also found that cross-resistance (where a single microbe is related to resistance to both antibiotics and metals) and coregulation (a shared regulatory system for antibiotic and metal resistance) may also lead to the increases in ARGs55. Therefore, to understand the mechanism that allows heavy metals to influence ARGs, the microbial community, heavy metal resistance genes, and their relationships with ARGs should be research forward. Overall, the diversity and abundance of the ARGs determined in this study were high, which clearly indicates that the unmonitored use of antibiotics and metals in dairy farms can subsequently increase the abundance of ARGs in AD systems. Our results also showed that common indicator bacteria such as E. coli, Salmonella and Staphylococcus species were not found in indicator species analysis, and these findings are supported by several previous reports of the absence of common indicator bacteria after 30 days digestion in the experimental AD at different temperatures (25–45 °C)58. Spearmen correlation analysis showed negative correlation between Group-II and Group-IV microbiomes. These findings are in line with the metabolic functional potentials of the AD microbiomes since fumarate hydratase (fumA/B), malate dehydrogenase (mdh) and bacterial secretion system associated genes were predominantly overexpressed in Group-I and Group-III microbiomes, which gradually decreased with advance of digestion process, and remained more than two-fold lower expressed in highest CH4 production stage (Group-IV). The microbial communities present in the early phage of digestion process (Group-I) increased both composition and abundances in the second phase of digestion (Group-II metagenome) in ambient growth conditions of the AD. With the advance of digestion time, the increasing efficiency of anaerobic digestion creates a favorable environment for the methanogens, and thus found in higher composition and abundances in Group-III and Group-IV metagenomes21,25. In addition, the Group-IV-microbiomes showed higher genomic functional activities related to tetrapyrroles, one carbon and biotin metabolism, oxidative and osmotic stress, proteolytic pathways, MT1-MMP pericellular network, acetyl-CoA production, and motility and chemotaxis compared to the microbes of the other groups. The energy and one-carbon metabolism are associated with certain common central pathways which are universally present across in many bacterial species, despite the wide variety of growth substrates and conditions they utilize59. Evidence of environmental stress conditions prevailing within the bioreactors suggested that microorganisms underwent to oxidative stress corroborating our present findings60. Gut microbiota has considerable proteolytic power, converting ingested dietary protein. Identified bacterial species particularly Clostridia, Streptococci, Staphylococci, and Bacillus showed higher proteolytic activity which differed both in quantity and quality of protein degradation in different stage anaerobic digestion, and these findings are in line with many earlier studies61. Though, these findings also support the taxonomic dysbiosis of microbiomes in Group-IV metagenome, however further comprehensive study is needed to elucidate the modulation of microbiome shifts, their functional potentials and genomic expression using a larger dataset. It is therefore imperative to carryout future research on biogas-producing microbiomes which will further help in the development and improvement of anaerobic digestion models and enhance AD efficiency and stability. Such research will benefit greatly from the continued improvement of the omics technologies, particularly metaproteomic and metametabolomics, and standardized methods of bioinformatics analyses.

Conclusions

The level of biogas production increased gradually up to Day 35 (highest CH4 concentration), and declined thereafter under controlled environment of the AD. The physicochemical analysis showed significant positive correlation with the level of energy production and digester pH, O2 level, and environmental temperature. In the AD microbiomes, Firmicutes, Bacteroidetes, Fibrobacteres, Spirochaetes, Actinobacteria, Candidatus Cloacimonetes, Proteobacteria, and Thermotogae were the dominant bacterial phyla while Euryarchaeota was the only predominant archaeal phylum. The indicator species analysis revealed that microbiomes of Group-III and Group-IV, for instance, M. vacuolate, D. mccartyi, Methanosarcina sp. Kolksee and M. barkeri were highly specific species for energy production. The correlation network analysis of the indicator species showed that different strains of Euryarcheota and Firmicutes phyla were negatively correlated but associated with highest level of energy production. The AD microbiomes harbored different ARGs of which genes coding for tetracyclines (doxycycline and tetracycline), macrolides (erythromycin and streptogramin B) and beta-lactams resistance remained over expressed. Additionally, the physicochemical properties of the digester including pH, CH4 concentration (%), pressure, temperature and environmental temperature were found to be positively correlated with these genomic functional potentials and distribution of ARGs and metal resistance pathways. Taken together, a high-level abundance of the dominant and indicator microbial communities, associated genomic functional potentials, and their correlation with the physicochemical parameters of the AD were found to be driving factors for high proportion of the energy production.

Methods

Digester setup and experiment design

The experiment was conducted using an AD plant prepared with provisions to measure temperature, slurry and gas sample collection and substrate charging. The biogas plant consisted of a digester, inlet-chamber, three slurry outlet pipes, gas outlet pipe and thermometer (Fig. S1). The volume of the AD was 3000 L and made of flexible polyvinyl chloride (PVC) fabric with thickness: 1.2 Mm. There was a specialized ball valve which ensures the anaerobic condition within the digester and control the flow of substrates. The temperature of the AD was monitored using a probe. The experiment was conducted for 45 days (Day 0 to Day 44). The AD was loaded 12 times throughout the experiment period with a total of 1,087 kg raw CD (highest input volume = 375 kg and lowest input volume = 35 kg) (Table S1). Initially, the digester was started with the highest loading dose of 375 kg feedstock where the ratio of raw CD and active sludge was 1:1. The raw semi-solid CD was mixed with seed sludge from previous biogas plant (slurry) before charged into the AD. The AD was portable, light in weight, low-priced and retained more heat inside.

Sample collection and physicochemical parameters analysis

The representative samples (n = 16) including CD, slurry and active sludge (AS) were collected and stored for subsequent analysis. The samples were categorized into four groups (Group-I, Group-II, Group-III and Group-IV) based on collection time (Day 0 to Day 35) and level of energy production (CH4%). The samples of Group-I (n = 2) were collected at Day 0 (day of first input) when the CH4 concentration of the digesta was 0.0% with an average pH of 5.44. Likewise, Group-II samples (n = 5) were collected at Day 2 and Day 7 of the digestion process when the CH4 concentration and pH of the digesta were 21 and 5.44%, and 34 and 5.56%, respectively. The sampling of the Group-III (n = 5) was done at Day 10 and Day 27 of the digestion process having the CH4 concentration and pH of the digesta 47.4 and 5.97%, and 58.2 and 6.87%, respectively. The Group-IV included samples (n = 4) collected at Day 34 and Day 35 of digestion when the CH4 concentration and pH of the digesta were 71.4 and 6.99%, and 74.1 and 7.01%, respectively (Table 1). Therefore, highest CH4 production (74.1%) was recorded at Day 35 of the digestion process. Thereafter, the CH4 concentration gradually declined (up to Day 44), and thus, no samples were collected during this stage (Day 36–Day 44) for metagenomic analysis. The TN content was measured by micro-Kjeldahl method62 while phosphorus, potassium, heavy metals (Lead, Zinc, Nickel, Cadmium, Chromium), OC, Sulphur and moisture content were determined by spectrophotometric molybdovanadate63, flame photometric64, atomic absorption spectrophotometric63, wet oxidation, turbidimetric and gravimetric methods from the Department of Soil Science, University of Dhaka. The detection limit of metals was of the order of 0.1 μg L−1 30.

Metagenomic DNA extraction, sequencing and bioinformatics analysis

We extracted the total genomic DNA from 16 samples (Data S1) using an automated DNA extraction platform with DNeasy PowerSoil Kit (QIAGEN, Germany) according to manufacturer’s instructions. Extracted gDNA was quantified and purity checked through NanoDrop (ThermoFisher, USA) with an absorbance ratio of 260/280. Shotgun metagenomic (WMS) libraries were prepared with Nextera XT DNA Library Preparation Kit (Hoque et al., 2019), and paired-end (2 × 150 bp) sequencing was performed on a NovaSeq 6000 sequencer (Illumina Inc., USA) from the Macrogen Inc. (www.macrogen.com) Seoul, Republic of South Korea. The gDNA from sixteen samples generated 380.04 million reads, and the average reads per sample was 23.75 million (maximum = 24.79 million, minimum = 20.75 million) (Data S1). The low-quality reads from the generated FASTQ files were filtered and removed through BBDuk (with options k = 21, mink = 6, ktrim = r, ftm = 5, qtrim = rl, trimq = 20, minlen = 30, overwrite = true) (Stewart et al. 2018). In this study, on an average 21.45 million reads per sample (maximum = 23.75 million, minimum = 18.85 million) passed the quality control step (Data S1).

Microbiome analysis and AMR profiling

The taxonomic assignment of the generated WMS data was performed using both mapping based and open-source assembly-based hybrid methods of PahtoScope 2,0 (PS)65 and MG-RAST 4.0 (MR)66. In PS analysis, the NCBI Reference Sequence Database (NCBI RefSeq Release 201) for bacteria and archaea was prepared by Kraken267. The reads were then aligned against the target (RefSeq) libraries using Minimap268, and filtered to remove the reads aligned with the cattle genome (bosTau8) and human genome (hg38) using BWA69 and samtools70. In PS analysis, we employed the PathoID module to get exact read counts for downstream analysis65. We simultaneously uploaded the raw sequences to the MR server with proper metadata. In MR analysis, the uploaded raw reads were subjected to optional quality filtering with dereplication and host DNA removal, and finally annotated for taxonomic assignment. Less than 100 hits from PS taxonomy were filtered for the downstream analysis. Read normalization in each sample was performed using median sequencing depth through Phyloseq (version 4.0) package in R. The within sample (alpha) diversity of microbial communities was calculated using the Shannon and Simpson diversity indices71 through the “Vegan” package in R. To evaluate alpha diversity in different groups, we performed the non-parametric test Kruskal–Wallis rank-sum test. The diversity across the sample groups (Beta-diversity) was measured with the principal coordinate analysis (PCoA) using Bray–Curtis dissimilarity matrices, and permutational multivariate analysis of variance (PERMANOVA) with 999 permutations to estimate a p-value for differences among groups72. Phyloseq and Vegan packages were employed for these statistical analyses73. Dominant microbial community were also determined where ≥ 1% read in at least one samples were considered. After filtering there were 43 taxa remained in 16 samples. Spearman’s correlation analyses between dominant microbiotas and physicochemical parameters were performed in rcorr function of Hmisc74 and corrplot function of corrplot75 R package.

Indicator species specific to a given sample group (having ≥ 1000 reads assigned to a taxon) were identified based on the normalized abundances of species using the R package indicspecies76, and the significant indicator value (IV) index was calculated by the 999-permutation test. Larger IV indicates greater specificity of taxa and p < 0.05 was considered statistically significant76. Network analysis was used to explore the co-occurrence patterns of the energy producing bacterial and archaeal taxa across the metagenome groups. In addition, the Spearman’s correlation coefficient and significance tests were performed using the R package Hmisc. A correlation network was constructed and visualized with Gephi (ver. 0.9.2). We detected the antibiotics resistance genes (ARGs) among the microbiomes of four metagenomes through the ResFinder 4.0 database (https://bitbucket.org/genomicepidemiology/resfinder/src/master/). The ResFinder database was integrated within AMR++ pipeline (https://megares.meglab.org/amrplusplus/latest/html/v2/)77 to identify the respective antimicrobial resistance genes and/or protein families26.

Functional profiling of the microbiomes

In addition to taxonomic annotations, the WMS reads were also mapped onto the Kyoto Encyclopedia of Genes and Genomes (KEGG) database78, and SEED subsystem identifiers, respectively, on the MR server66 for metabolic functional profiling. The functional mapping was performed with the partially modified set parameters (e-value cutoff: 1 × 10−30, min. % identity cutoff: 60%, and min. alignment length cutoff: 20) of the MR server26.

Statistical analysis

To evaluate differences in the relative percent abundance of taxa in AD (Group-I, Group-II, Group-III and Group-IV) for PS data, we used the non-parametric test Kruskal–Wallis rank sum test. Pairwise Pearson’s correlation was performed to reveal the association between each of the two physicochemical parameters of the AD and microbial alpha diversity characteristics. We normalized the gene counts by dividing the number of gene hits to individual taxa/function by total number of gene hits in each metagenome dataset to remove bias due to differences in sequencing efforts. The non-parametric test Kruskal–Wallis rank sum tests were also performed to identify the differentially abundant SEED or KEGG functions (at different levels), and antimicrobial resistance (ARGs) in four metagenomes. All the statistical tests were carried out using IBM SPSS (SPSS, Version 23.0, IBM Corp., NY USA). To calculate the significance of variability patterns of the microbiomes (generated between sample categories), we performed PERMANOVA (Wilcoxon rank sum test using vegan 2.5.1 package of R 3.4.2) on all four sample types at the same time and compared them pairwise. A significance level of alpha = 0.05 was used for all tests. Spearman’s correlation of dominant microbial community, KEGG pathways and SEED subsystem functional data to physical parameters of the AD were performed in Hmisc74 and corrplot75 R packages as described above.