Permafrost, perennially frozen ground ranging in age from two to over 700,000 years old [1], is a significant carbon reservoir, storing 1330–1580 Pg or nearly half the world’s soil carbon [2, 3]. However, rising global temperature is leading to rapid permafrost thaw and release of greenhouse gases (GHGs), resulting in a positive feedback to warming [2]. Current thaw projections under the IPCC’s (Intergovernmental Panel on Climate Change) representative concentration pathway 8.5 predict that between 30 and 99% of near surface (<3.5 m) permafrost will disappear by 2100 [4]. Within this century, 37–174 Pg of carbon is projected to be released to the atmosphere as carbon dioxide (CO2) and the more potent greenhouse gas methane (CH4), due to the microbial degradation of newly thawed soil [2].

Methane flux in thawing permafrost is controlled by microbial methane producers (methanogens) and consumers (methanotrophs); however, the environmental factors controlling the distribution of these microorganisms are poorly understood. The next key step towards improving methane emissions models is the incorporation of microbial community data [5,6,7], which is currently limited by incomplete knowledge of the population structure and activity of these communities in the changing natural environment. Aerobic and anaerobic methanotrophs can prevent a large proportion of methane from reaching the atmosphere, with oxidation estimates of 20–60% recorded in permafrost-associated environments [8, 9]. In these systems, aerobic methanotrophs are typically responsible for ameliorating methane release [10], as anaerobic methanotroph abundances are generally below detection levels with very low rates of methane oxidation [11].

Aerobic methanotrophs oxidise methane to methanol using the particulate (pMMO) and/or soluble (sMMO) methane monooxygenase enzymes [12], and are currently restricted to four major lineages, including well-studied members of the Gamma- and Alphaproteobacteria, the intra-aerobic Candidatus Methanomirabilis oxyfera within the NC10 phylum [13], and the verrucomicrobial genera Methylacidiphilum [14, 15] and Methylacidimicrobium [16]. Gammaproteobacterial methanotrophs are organised into 18 characterised genera within the families Methylococcaceae and Methylothermaceae [17,18,19], whereas alphaproteobacterial methanotrophs are less diverse comprising five genera within the Methylocystaceae and Beijerinckiaceae [18, 20]. The environmental distribution of aerobic methanotrophs is typically examined using the pmoA gene, which encodes subunit A of pMMO. However, Methyloferula and Methylocella within the Beijerinckiaceae lack pMMO, so mmoX, encoding subunit A of the sMMO, must be used as a complementary marker gene [21]. Environmental surveys of pmoA and mmoX sequences have revealed a large diversity of potential methanotrophs outside the cultivated lineages [18]. This novel diversity includes the alphaproteobacterial pmoA group, upland soil cluster alpha (USCα), which is found predominantly in aerobic soils [18], and has recently been identified in permafrost-associated systems [22, 23]. Members of this group have a high affinity for methane and are capable of oxidising methane from the atmosphere [24,25,26]. Atmospheric methane oxidisers are responsible for the uptake of ~30 Tg of methane (~5% of the total sink) per year, and are the only known biological sink [27,28,29].

Our understanding of permafrost methanotrophs has focused on intact permafrost where activity is low [10, 22, 30, 31], or artificially thawed incubations [22, 32, 33], with no naturally thawing gradient sites studied to date. Here, the methanotroph communities at Stordalen Mire are examined through metagenomics, metatranscriptomics and paired biogeochemical data, across an environmental thaw gradient, peat depths (surface, mid, deep; spanning the top ~50 cm), and time (September 2010–August 2012). Characterising the presence, genomic potential and activity of methanotrophs across the Stordalen Mire thaw gradient is a significant step towards elucidating the wider role of methanotrophs in the changing Arctic and subarctic environments, and their impact on the global carbon cycle.

Materials and methods

Study site and sample preparation

Sampling of the thaw chronosequence at Stordalen Mire (68° 21′ N, 19° 03′ E, 359 m a.s.l.), porewater and methane flux measurements, and nucleotide extractions and sequencing of DNA were conducted as previously described [34,35,36,37]. Metatranscriptome sequencing was performed for select 2010, 2011 and 2012 samples using 240 ng of RNA in the ScriptSeq Complete (Bacteria) low-input library preparation kit (Epicentre). These samples were run on 1/8th of an Illumina NextSeq lane, with initial shallow runs conducted on 1/11th of a HiSeq (Illumina) and MiSeq (Illumina) lane (see Supplementary Information for details).

Genome assembly and binning

All methanotroph population bins were within the set of 1529 bins recovered in ref. [37], except MB1 (see Supplementary Information for details). Population genome MB1 was derived from a co-assembly of 78 palsa samples. Assembly of these samples was conducted using CLC bio’s de novo assembler (QIAGEN, CA), mapping was conducted using BamM v1.3.8-1.50 (Imelfort & Lamberton et al., http://ecogenomics.github.io/BamM), and differential coverage binning was conducted using MetaBAT [38] v3127e20aa4e7 using a 3000 bp contig cut-off. CheckM [39] v1.0.4 determined completeness and contamination of the population genome bins through the identification and quantification of lineage-specific single copy marker genes following the lineage workflow (lineage_wf).
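As a rough illustration of how single copy marker genes translate into completeness and contamination estimates, the sketch below counts markers in a deliberately simplified way. It is not CheckM's implementation, which works with lineage-specific sets of collocated markers; the function name, marker identifiers and counts are hypothetical, and the thresholds are those quoted in this study for medium to high-quality bins (>70% completeness, <10% contamination).

```python
from collections import Counter


def estimate_quality(marker_hits, expected_markers):
    """Simplified estimate: percentage of expected markers found at least once
    (completeness) and percentage of extra copies (contamination)."""
    if not expected_markers:
        raise ValueError("expected_markers must not be empty")
    counts = Counter(m for m in marker_hits if m in expected_markers)
    n = len(expected_markers)
    present = sum(1 for m in expected_markers if counts[m] >= 1)
    extra = sum(counts[m] - 1 for m in expected_markers if counts[m] > 1)
    return 100.0 * present / n, 100.0 * extra / n


# Hypothetical bin: 20 expected markers, 16 detected, one of them in two copies.
expected = {f"marker_{i}" for i in range(1, 21)}
hits = [f"marker_{i}" for i in range(1, 17)] + ["marker_1"]

completeness, contamination = estimate_quality(hits, expected)
passes = completeness > 70.0 and contamination < 10.0
```

In this toy case the bin would be retained, since 16 of 20 markers are present and only one is duplicated.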

Genome bins were placed in a reference genome tree containing genomes from NCBI (database 2015-11-27) for taxonomic classification and evolutionary analysis using an in-house pipeline, GTDB v0.0.3 (Chaumeil & Parks, https://github.com/Ecogenomics/GTDBNCBI) (Supplementary Information). ARB [40] v6.0.6 was used to visualise the tree, and iTOL [41] was used to refine it, with additional cosmetic amendments made in Inkscape (http://inkscape.org). The reported genome relative abundance is the abundance of each lineage within the total community (Supplementary Information).

Methane monooxygenase phylogenetic analysis and data searching

The gene-centric methanotroph analysis using pmoCAB genes and mmoX genes was conducted using Mingle v0.0.10 for tree creation (Parks et al., https://github.com/Ecogenomics/mingle) and GraftM v0.8.1 for data searches [42] (Supplementary Information). The methanotrophy protein sequences and manually annotated protein trees were used as input for the function ‘graftM create’, producing GraftM packages by creating hidden Markov models (HMMs) from an alignment of the sequences. GraftM ‘graft’ used this annotated tree for classification of gene sequences based on translated query read placement. The protein trees were made for phylogenetic analysis using Mingle and incorporated the identified Stordalen proteins, a selection of sequences greater than 160 aa that were translated from sequences compiled in Knief [18], along with the contigs for novel groups from Lau et al. [22], Ricke et al. [26] and He et al. [43] when applicable (Supplementary Tables 12–15). Additional protein trees for MxaF/XoxF, RuBisCO (CbbL/RbcL) and NifH were created as above using Mingle, NCBI sequences and the IMG release 4.1 database (Supplementary Tables 16–18), respectively. GraftM gene packages for pmoA and mmoX were run on the raw metagenomes and metatranscriptomes and the results were normalised by total library size and HMM length. GraftM ‘graft’ was run on the NCBI nr database (downloaded in February 2016), and 24,636 SRA files (including amplicon and metagenome files, downloaded March 2017) in order to determine the geographic distribution of the Hyphomicrobiaceae pmoA genes. Where possible, GNU parallel [44] was used to speed data processing.
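The normalisation of marker gene hits by library size and HMM length can be illustrated with a minimal sketch. The exact formula is given only in the Supplementary Information, so the form used here (hits as a percentage of total reads, divided by HMM length) is an assumption, and the function name and example numbers are hypothetical.

```python
def normalised_abundance(hits, total_reads, hmm_length):
    """Marker gene hits as a percentage of library reads, per HMM position."""
    if total_reads <= 0 or hmm_length <= 0:
        raise ValueError("total_reads and hmm_length must be positive")
    return 100.0 * hits / total_reads / hmm_length


# Hypothetical library of 10 million reads with the same raw hit count for
# two marker genes whose HMMs differ in length.
pmoa = normalised_abundance(hits=500, total_reads=10_000_000, hmm_length=250)
mmox = normalised_abundance(hits=500, total_reads=10_000_000, hmm_length=500)
```

Dividing by HMM length keeps a longer marker gene, which recruits more reads per genome copy, from appearing more abundant than a shorter one in the same library.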

Metatranscriptomic read processing

Metatranscriptomes were processed using part of the TranscriptM (Frouin et al., https://github.com/elfrouin/transcriptM) pipeline, using Trimmomatic [90] and SortMeRNA [45] for in silico data cleaning (Supplementary Information). Dirseq v0.0.2 (Woodcroft et al., https://github.com/wwood/dirseq) (parameter --ignore-directions) determined the average transcript coverage of the methanotroph genes. Metatranscriptome expression per genome for each sample was calculated using ‘BamM parse -m counts’ (Supplementary Information).

Biogeochemical analysis

The pmoA and mmoX gene abundances from the metagenome data were averaged across sampling dates into 5 or 10 cm depth categories and compared to biogeochemical data from the porewater collected on site at the time of sampling, except for the palsa site, which lacked porewater. Measurements included dissolved methane (mM) and methane isotope data (δ13C-CH4), following a previously described procedure [34, 35]. The percentage of time that each depth category spent below the water table was calculated from manual water table measurements made three to five times per week during the sampling season. Sulphate concentrations were measured at Florida State University (Tallahassee, FL, USA) by ion chromatography on a Dionex ICS-1100 fitted with a 4-mm IonPac AS22 column, with an eluent of 4.5 mM carbonate/1.4 mM bicarbonate and flow rate of 1.2 mL/min (Supplementary Table 5). Correlations between methanotroph abundance and water table depth, and differences in sulphate concentrations and methanotroph community composition between sites, were assessed using ANOVA and pairwise t-tests in R [46].
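The percentage of time a depth category spends below the water table can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: depths are taken as centimetres below the peat surface (positive downwards), every manual measurement is weighted equally, and a category is counted as submerged when its upper boundary lies at or below the water table. The function name and measurements are hypothetical.

```python
def percent_time_below_water_table(category_top_cm, water_table_depths_cm):
    """Percentage of measurements in which the water table sits at or above
    the upper boundary of the depth category (depths positive downwards)."""
    if not water_table_depths_cm:
        raise ValueError("at least one water table measurement is required")
    submerged = sum(1 for wt in water_table_depths_cm if wt <= category_top_cm)
    return 100.0 * submerged / len(water_table_depths_cm)


# Hypothetical season: water table depth (cm below peat surface) on five visits.
measurements = [8, 12, 15, 20, 25]
pct = percent_time_below_water_table(category_top_cm=15,
                                     water_table_depths_cm=measurements)
```

With these values the category starting at 15 cm is submerged on three of five visits.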

Methanotroph population metabolism reconstruction and average amino acid identity

Metabolic reconstructions were facilitated using PROKKA [47] annotations, and submission through the IMG ER genome annotation pipeline [48]. In-house pipelines were used to BLAST the coding regions of each genome against the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology database [49], and key methanotrophy and other metabolic genes were highlighted for incorporation into the metabolic models. BLASTP [50] was used to validate annotation of the gene sequences detected by GraftM. Hydrogenases were found and annotated using HydDB [51]. Metabolic models were created in Inkscape (http://inkscape.org). Average amino acid identity (AAI) facilitated evolutionary distance comparisons with the closest cultured isolates, or between Stordalen genomes. AAI was determined using the CompareM (Parks et al., https://github.com/dparks1134/CompareM) v0.0.17 ‘aai_wf’ on the genome bins, and genomes from IMG. Average nucleotide identity (ANI) was determined using calculate_ani.py (Pritchard, https://github.com/widdowquinn/scripts).
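The idea behind AAI can be sketched as the mean amino acid identity over reciprocal best hits between two genomes. The code below is a simplified illustration and not the CompareM workflow, which applies its own homology search and filtering thresholds; the tuple layout (query, target, percent identity, bit score), the function names and the example hits are all hypothetical.

```python
def best_hits(hits):
    """Keep, for each query gene, the target with the highest bit score."""
    best = {}
    for query, target, identity, bitscore in hits:
        if query not in best or bitscore > best[query][2]:
            best[query] = (target, identity, bitscore)
    return {q: (t, i) for q, (t, i, _) in best.items()}


def aai_from_reciprocal_best_hits(hits_ab, hits_ba):
    """Mean identity of gene pairs that are each other's best hit."""
    best_ab = best_hits(hits_ab)
    best_ba = best_hits(hits_ba)
    identities = []
    for gene_a, (gene_b, identity) in best_ab.items():
        back = best_ba.get(gene_b)
        if back is not None and back[0] == gene_a:
            identities.append(identity)
    if not identities:
        raise ValueError("no reciprocal best hits found")
    return sum(identities) / len(identities)


# Hypothetical hits between genome A (a1-a3) and genome B (b1-b2).
hits_ab = [("a1", "b1", 70.0, 200), ("a1", "b2", 50.0, 90),
           ("a2", "b2", 65.0, 150), ("a3", "b1", 40.0, 60)]
hits_ba = [("b1", "a1", 70.0, 200), ("b2", "a2", 65.0, 150)]

aai = aai_from_reciprocal_best_hits(hits_ab, hits_ba)
```

Here a3 is excluded because its best hit, b1, pairs reciprocally with a1, so only two gene pairs contribute to the mean.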

Results and discussion

Identification and distribution of canonical and novel methanotrophs across the thaw gradient

To investigate methanotroph diversity and the succession of the community as thaw progresses from palsa, to bog and fen, 188 raw metagenomes were searched for pMMO (pmoA gene) and sMMO (mmoX gene) reads using GraftM [42]. GraftM classified the reads into taxonomic groups following placement in a curated pmoA or mmoX gene tree (Fig. 1). The bog had the highest abundance of methanotrophs and was dominated by the well-characterised Methylocystaceae. Several members of the Methylocystaceae are suited to soil and bog environments exhibiting low pH and nitrogen, oligotrophy, or variable methane levels where the ability to oxidise atmospheric methane using a high affinity isozyme of the pMMO (pMMO2) is advantageous [52, 53]. In the minerotrophic and pH neutral fen, members of the Methylococcaceae comprised the largest fraction of the low abundance but taxonomically diverse methanotroph community. Methylocystaceae and Methylococcaceae methanotrophs typically possess pMMO, with the canonical operon structure of pmoCAB, and some also encode a divergent monooxygenase of undetermined function, pXMO, encoded by a reordered pxmABC operon [54]. The pxmA gene was detected in the palsa, fen and bog, indicating the presence of multiple methanotrophic lineages encoding the divergent pXMO. Beijerinckiaceae possessing mmoX appeared in low abundance in both fen and bog (Fig. 1).

Fig. 1

Methanotroph diversity across the Stordalen Mire thaw gradient. a Heatmap of the relative abundance of methanotrophs as a proportion of the total metagenome based on the particulate methane monooxygenase (pMMO; left) and the soluble methane monooxygenase (sMMO; right). Abundances are indicated by the coloured scale (from white, to blue, to red). The trees used by GraftM to classify the reads are shown for pmoA (b) and mmoX (c). The colour of the clades indicates the environment where these clades are most often found (green = bog, blue = fen). Asterisks indicate significantly different abundances based on one-way ANOVA tests and Bonferroni corrected pairwise t-tests (p value < 0.05; see Supplementary Table 6)
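The Bonferroni correction named in the caption multiplies each raw pairwise p value by the number of comparisons and caps the result at 1. The sketch below shows only that adjustment step; the raw p values are invented for illustration and are not results from this study.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


# Hypothetical raw p values from pairwise t-tests between the three sites.
raw = {("palsa", "bog"): 0.004, ("palsa", "fen"): 0.030, ("bog", "fen"): 0.400}

adjusted = bonferroni_adjust(raw)
significant = sorted(pair for pair, p in adjusted.items() if p < 0.05)
```

With three comparisons, a raw p value of 0.030 no longer passes the 0.05 threshold after correction, whereas 0.004 still does.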

Of the uncharacterised methanotroph clades, USCα had greatest abundance in the bog and was the predominant methanotroph in the palsa. Minimal methane is produced in the oxic palsa environment [35], so methane is likely taken up from the atmosphere. Reads from two additional novel groups, ‘novel pmo’ (pmoA) and ‘novel smo’ (mmoX), were found only in the fen (Fig. 1). The ‘novel pmo’ reads clustered alongside a collection of sequences recently recovered from a Sacramento-San Joaquin Delta wetland metagenome [43]. The ‘novel smo’ group has not previously been recognised.

Interestingly, methanotrophs were often present below the water table in the deep (bog: >5 cm below water table, fen: >10 cm below peat surface) and extra deep (>30 cm below peat surface) metagenomes (Fig. 1), suggesting adaptations to cope with low oxygen conditions. The palsa remains mainly oxic to the depths sampled, whereas the bog includes an oxic surface and then likely becomes primarily anoxic below the water table (maximum depth 20 cm) [55]. Although the fen water table is above the peat surface, sedge roots are known to transport oxygen, which likely supports methanotrophs near live roots [56, 57].

Recovery of methanotroph population genomes

To recover the genomes of canonical and novel methanotrophs, assemblies of the 214 Stordalen Mire metagenomes (188 active layer, and 26 palsa core series) were binned into population genomes using differential coverage binning [37]. This led to the recovery of 1529 medium to high-quality population genomes (>70% completeness, and <10% contamination [37]). Of these genomes, 12 were found to encode pMMO and/or sMMO. An additional genome was recovered from a co-assembly of the palsa samples (n = 78). Phylogenetic analysis of the genomes using a genome tree created from the concatenated alignment of 120 single copy marker genes classified all 13 as proteobacterial. Three of these genomes belonged to the Methylococcaceae (Methylobacter-like MB1-2; Methyloglobulus-like MG1), seven to the Methylocystaceae (Methylocystis-like MC), two to the USCα (USC1 and USC2) and surprisingly one to the Hyphomicrobiaceae (HYP1) (Fig. 2). Six of the Methylocystaceae genomes were recovered in the mid or deep core layers of different bog samples. These genomes had ANI of >99% indicating that they represent the same Methylocystis-like population and will be referred to as MC1 accordingly (Supplementary Fig. 1). One Methylocystis-like genome (MC2) was recovered from a palsa sample, and represented a distinct population at 94% ANI to MC1.
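Grouping genome bins into populations by ANI can be sketched with single-linkage clustering at the >99% threshold mentioned above. The text does not state which clustering procedure was used, so single linkage is an assumption, and the bin names and ANI values are hypothetical apart from the 94% separation reported between MC2 and MC1.

```python
from itertools import combinations


def group_by_ani(genomes, ani, threshold=99.0):
    """Single-linkage grouping: bins are joined when pairwise ANI > threshold."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical bins: three near-identical bog bins and one distinct palsa bin.
bins = ["bog_bin_a", "bog_bin_b", "bog_bin_c", "palsa_bin"]
ani = {("bog_bin_a", "bog_bin_b"): 99.6, ("bog_bin_b", "bog_bin_c"): 99.3,
       ("bog_bin_a", "bog_bin_c"): 99.1, ("bog_bin_a", "palsa_bin"): 94.0,
       ("bog_bin_b", "palsa_bin"): 94.0, ("bog_bin_c", "palsa_bin"): 94.0}

populations = group_by_ani(bins, ani)
```

The three bog bins collapse into one population, analogous to MC1, while the palsa bin remains separate, analogous to MC2.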

Fig. 2

Genome tree of the methanotroph population genomes recovered from Stordalen Mire. Solid circles represent bootstrap support of over 70%. Heatmap bars indicate average relative abundance per environment (P = palsa (brown), B = bog (green), F = fen (blue)) as a percentage of the total metagenome library (Supplementary Information). Raw values and standard deviation are presented in Supplementary Table 8. The highest abundances are observed for MC1 (1.85% in a deep bog sample) and USC1 (1.6% in a mid-depth bog sample)

USC1 and USC2 are the first genomic representatives of the USCα lineage [25], as former representation of this group was restricted to partial pmoA sequences and a collection of short (<43 kb) genome fragments [22, 26]. The previous hypothesis that these microorganisms fall within the Beijerinckiaceae [26] was confirmed by placement of USC1 and USC2 in a genome tree (Fig. 2). Average amino acid identity (AAI) of USC1 and USC2 was 72% to their closest isolated taxonomic neighbours, the Methylocapsa spp. [26] (Supplementary Table 1), making it likely that these populations belong to a novel species or potentially a novel genus within the Beijerinckiaceae [58]. The USCα PmoCAB protein sequences derived from the genomes (USC1 and USC2), and additional sequences from unbinned contigs, clustered with both partial and full length translated USCα pmoA, pmoB and pmoC gene sequences recovered from the active layer of mineral cryosols in the Canadian high Arctic [22] and an acidic forest soil [26] (Fig. 3; Supplementary Fig. 2).

Fig. 3

Phylogenetic tree of PmoA proteins recovered from Stordalen Mire. This tree is constructed from isolate-derived protein sequences (colour strip = black), Stordalen sequences (colour strip: palsa = brown; bog = green; fen = blue) and translated environmental sequences compiled by Knief [18] (colour strip = grey), with additional sequences from He et al. [43], Ricke et al. [26] and Lau et al. [22] (colour strip = grey). The asterisks indicate sequences that are within population genomes

An additional potential methanotroph genome (HYP1) was recovered from a fen sample, and was distinct from all canonical methanotroph families. Phylogenetic analysis revealed Rhodomicrobium spp. within the Hyphomicrobiaceae as the closest isolated taxonomic neighbours with an AAI of 67.5% (to both R. vannielii and R. udaipurense), indicating likely taxonomic distinction at the genus level (Fig. 2; Supplementary Table 1) [58]. Interestingly, HYP1 encodes novel methanotrophy operons for both pMMO and sMMO. Contamination from poor binning was discounted as these genes were also found in an additional medium-quality Hyphomicrobiaceae genome (HYP2; 60.53% completeness, 5.54% contamination) recovered from a deep fen sample with 80% AAI to HYP1 (Supplementary Figs. 3 and 4). Protein trees showed that HYP1 PmoA and PmoB sequences, and several additional PmoAB sequences from partial genomes and unbinned contigs, clustered closely with the wetland ‘novel pmo’ clade that appears basal to the Methylocystaceae and Beijerinckiaceae (Fig. 3; Supplementary Fig. 2a). Full-length PmoC sequences of this ‘novel pmo’ group have not been recovered until now, and were found to be similarly basal to the Methylocystaceae and Beijerinckiaceae PmoC proteins (Supplementary Fig. 2b). The MmoX from HYP1 and additional unbinned MmoX sequences clustered within the ‘novel smo’ group that formed a monophyletic clade with the Beijerinckiaceae (Supplementary Figs. 5 and 6). Recovery of methanotrophy marker genes in HYP1 suggests that the Hyphomicrobiaceae should be added to the recognised alphaproteobacterial methanotrophic families, which were previously limited to the Beijerinckiaceae and Methylocystaceae. To investigate other possible habitats of HYP1, publicly available SRA (Sequence Read Archive) datasets (24,636 sequence files) and the NCBI nr database were searched for pmoA sequences falling within the HYP1 clade. Only one sequence from NCBI nr was identified as belonging to HYP1, and was from a Finnish peat soil (NCBI accession: CAC84777). However, HYP1 was detected (with more than one read hit) in 34 SRA metagenomes from a variety of habitats, including Sacramento-San Joaquin Delta wetland, New York City MTA subway, peat, bog and sand metagenomes (Supplementary Table 2), suggesting that HYP1 is a rare but widely distributed group.

Metabolic characterisation of Stordalen Mire methanotrophs

Metabolic reconstruction of the Stordalen Mire methanotroph genomes revealed diverse functional potential. Methane oxidation pathways, carbon assimilation, dissimilation and other potentially habitat-specific metabolisms were examined in the gammaproteobacterial Methylobacter-like (MB) populations MB1, MB2 and Methyloglobulus-like (MG) population MG1 recovered from the fen samples. MB1 and MB2 encode pMMO (pmoCAB); MB2 additionally encodes sMMO (mmoXYZBCD). MG1 has only pxmA and pxmB, which appear consecutively at the end of a contig, indicating that pxmC and the pmoCAB operon are likely in the unrecovered portion (21%) of the genome. MG1 also encodes a hydroxylamine oxidoreductase (HAO) allowing for the conversion of hydroxylamine to nitrite, which is likely a toxicity response measure for this population [59]. Methane monooxygenases can co-oxidise ammonia to hydroxylamine, an intermediate toxic to cells unless further processed by an enzyme such as HAO [60]. MB1, MB2 and MG1 all possess the ribulose monophosphate (RuMP) pathway for carbon assimilation, the tetrahydromethanopterin (H4MPT) carbon dissimilation pathway for the oxidation of formaldehyde to formate, and nitrogen fixation genes (nifHDK). MB1 also has dissimilatory nitrate reduction, which has been identified in the gammaproteobacterial methanotrophs Methylomicrobium, Methylomonas and Methylobacter tundripaludum [61]. This ability appears to be widespread within the Methylococcaceae, and has been linked to survival in low oxygen conditions where nitrate or nitrite can be used instead of oxygen as terminal electron acceptors in order to direct all available oxygen to methane oxidation [61]. Metabolic reconstruction of MB1-2 and MG1 and the observed distribution of Methylococcaceae in the metagenomes suggest that these microorganisms can function in the microaerobic environment of the deeper permafrost thaw layers (Fig. 1).

The alphaproteobacterial Stordalen methanotrophs had larger genomes than their gammaproteobacterial counterparts, and had an expanded metabolic diversity indicative of a facultative, rather than obligate, methanotrophic lifestyle (Supplementary Table 4). Metabolic analyses were conducted on the most complete representatives of the alphaproteobacterial lineages, USC1, HYP1 and MC1 (Fig. 4). Similar to USC1 and HYP1, MC1 encodes pMMO genes (pmoCAB) but further possesses a pMMO2 (pmoCAB2) variant that is associated with high affinity methane oxidation [52]. MC1 also encodes the pXMO (pxmABC) homologue, as does USC1. This discovery adds Beijerinckiaceae to the list of pXMO-encoding methanotroph families, and explains the high relative abundance of pxmA genes observed in the palsa environment (Fig. 1).

Fig. 4

Metabolic reconstruction of the alphaproteobacterial population genomes MC1, HYP1 and USC1. Colours indicate the genome or combination of genomes (Venn diagram) in which the cycle or enzymes are found. Abbreviations: H4F tetrahydrofolate pathway, H4MPT tetrahydromethanopterin pathway, TCA tricarboxylic acid cycle, EMC ethylmalonyl-CoA pathway, EMP Embden–Meyerhof–Parnas pathway (glycolysis), CBB Calvin–Benson–Bassham cycle, PHB polyhydroxybutyrate pathway, LPS lipopolysaccharide, CODH carbon monoxide dehydrogenase, NiFe nickel iron hydrogenase, nitrogenase (NifHDK), pMMO particulate methane monooxygenase, pMMO2 particulate methane monooxygenase isozyme II, sMMO soluble methane monooxygenase, pXMO homologue of particulate methane monooxygenase, CH3OH methanol, I complex I NADH dehydrogenase, II complex II succinate dehydrogenase, III complex III cytochrome bc1, IV cytochrome c oxidase, IV cbb3 complex IV cytochrome cbb3 oxidase, cyd complex IV cytochrome bd oxidase, FDH formate dehydrogenase, EHR energy converting hydrogenase related (part of formate hydrogenlyase complex), SO₄²⁻ sulphate, MoO₄²⁻ molybdate, Zn zinc, PO₄³⁻ phosphate, CHOH formaldehyde, sulphate adenylyltransferase (SatAB), APS adenosine 5'-phosphosulphate, SO₃²⁻ sulphite, adenylylsulphate reductase (AprAB), dissimilatory sulphite reductase (DsrAB), H2S hydrogen sulphide, dissimilatory sulphite reductase electron transport complex (DsrK representing the DsrKMJOP complex), SoxYZ/SoxAX/SoxB/SoxCD = sulphur oxidising proteins. See Supplementary Fig. 7 for detailed gene presence/absence and Supplementary Table 3 for the list of additional abbreviations

The mxaF gene, which encodes the large subunit of the calcium-dependent methanol dehydrogenase (mxaFI-MDH) complex that catalyses the second step of methane oxidation, could not be identified in any of the Stordalen Mire methanotroph genomes. Instead, at least one copy of xoxF, a homologue of mxaF [62], was identified in each genome except MG1. This gene differentiates into five clades (xoxF1-5) [62, 63] and encodes the lanthanide-dependent xoxF–MDH complex known to oxidise methanol in several methanotrophs [62, 64]. The H4MPT carbon dissimilation pathway and the tetrahydrofolate (H4F) carbon assimilation pathway, which facilitates entry into the serine cycle [63], are present in USC1 and MC1. In contrast, HYP1 appears to have a thiol-dependent (glutathione (GSH)-linked) pathway for formaldehyde oxidation to formate using glutathione-dependent formaldehyde-activating enzyme (gfa), S-(hydroxymethyl)glutathione dehydrogenase (frmA) and S-formylglutathione hydrolase (frmB) genes [63, 65]. This pathway is present as a detoxification and energy generation mechanism in the methylotroph Paracoccus denitrificans [63, 66]. Similar to P. denitrificans and several other alphaproteobacterial methylotrophs [62, 67], in HYP1 these genes are located directly adjacent to a PQQ-dependent alcohol dehydrogenase identified as the xoxF5 type (Supplementary Fig. 8; Supplementary Fig. 9) [62]. This suggests that formaldehyde is produced by the xoxF-MDH of HYP1 and then oxidised to formate using the thiol-dependent pathway.

The ethylmalonyl-CoA (EMC) pathway is used to regenerate glyoxylate from acetyl-CoA for use in the serine cycle [68], and was present in MC1 and HYP1 but not USC1 (Fig. 4). USC1, similar to the Beijerinckiaceae methanotrophs [20], possesses isocitrate lyase (icl), which forms part of the glyoxylate shunt with malate synthase and allows the formation of glyoxylate from isocitrate [68]. The glyoxylate cycle enables the assimilation of acetate, and acetyl-CoA can be produced from acetate using acetate kinase (ack) and phosphotransacetylase (pta) or acetyl-CoA synthetase (acs), which are present in USC1. These genes, in conjunction with demonstrated acetate uptake in isotope studies [69], implicate USCα populations as likely facultative methanotrophs. Additionally, MC1 and HYP1 possess genes involved in transforming acetate to acetyl-CoA (ack, pta, acs), suggesting these populations are also facultative. Under methane-limited conditions, a facultative methanotrophic lifestyle likely allows USCα and MC1 to survive on acetate, which can diffuse freely across the cell membrane in the acidic conditions of the bog [70].

Although USC1, HYP1 and MC1 contain a near complete EMP pathway, the lack of glucose transporters suggests these populations do not use glucose as a carbon and energy source. Glucose transporters are likewise missing in known methanotrophs and growth of methanotrophs on glucose has not been recorded [20], indicating that this pathway is anabolic in the Stordalen Mire genomes. Poly-β-hydroxybutyrate (PHB) is a storage compound commonly used as a carbon and energy source by methanotrophs under nutrient limiting conditions [71]. Genes for the PHB storage pathway (phaABCZ, bdh, acsA) are present in all three genomes.

HYP1 may have additional and unusual means of energy generation. A complete dissimilatory sulphate reduction pathway was identified, including the canonical markers of dissimilatory sulphite reductase subunits A and B (dsrAB) (Fig. 4). These reductases have high AAI to oxidative dsr genes of the purple non-sulphur photolithotrophic bacteria Rhodomicrobium vannielii (83% WP_013418283) and R. udaipurense (86% KAI93440) [72, 73]. Like Rhodomicrobium spp., HYP1 possesses an incomplete sulphur oxidation system (SOX; soxAXYZD); however, sulphur oxidation could still be possible as Rhodomicrobium spp. can use sulphide and thiosulphate [74]. The potential for sulphur oxidation in a methanotroph was recently described in a Methylococcaceae genome from a deep-sea hydrothermal plume [75]. Biogeochemical measurements from the site show that the fen, particularly the fen surface, has significantly higher sulphate concentrations than the bog (Supplementary Table 5; fen surface average = 10.44 µM, average fen = 6.25 µM, bog = 1.83 µM; p value < 10⁻⁴), potentially indicating increased sulphur oxidation activity. In the fen, it may be possible for HYP1 to oxidise sulphur compounds that have been reduced during anaerobiosis in the deeper layers [76].

In addition to sulphur, Rhodomicrobium spp. can use hydrogen as an electron donor [74]. HYP1, USC1 and MC1 encode numerous hydrogenases, indicating potential hydrogen use (Fig. 4). Hydrogen uptake and production have been recorded in several verrucomicrobial, alpha- and gammaproteobacterial methanotrophs, and have been linked to nitrogen fixation, which produces the required hydrogen as a by-product [77, 78]. Both HYP1 and USC1 also encode coxLMS for carbon monoxide (CO) oxidation. The ability of these microorganisms to perform CO oxidation could allow CO to be used as both an energy and carbon source [79]. Stable-isotope probing or successful culturing would be required to confirm this metabolism. However, atmospheric CO levels, or CO produced by the photochemical degradation of matter, are sufficient for carboxydotrophs in many soil environments [79]. Several hydrogenotrophic [80], methanotrophic [81] and carbon monoxide oxidising [79] bacteria are known to use these respective pathways to generate energy for carbon fixation using the Calvin–Benson–Bassham (CBB) cycle.

Ribulose bisphosphate carboxylase (RuBisCO) is the key enzyme and genetic marker for the CBB cycle. HYP1 encodes RuBisCO form IV, a RuBisCO-like protein of unknown function [82], and form II (cbbM). Form II is known to operate under high CO2 and low O2 concentrations [82, 83], which would suit fixation in the deeper fen layers (Supplementary Fig. 10). Of the other genomes, MC1 possesses only form IV, whereas USC1, similar to the characterised methanotrophic Beijerinckiaceae [84], encodes the catalytic form I (both cbbL and S; Supplementary Fig. 11). HYP1, MC1 and USC1 are also predicted to be capable of nitrogen fixation, as phylogenetic analysis of the NifH proteins suggests they belong to the functional Type 1D clade (Supplementary Fig. 12). Given the wide distribution of nitrogen fixation genes in the Stordalen Mire methanotrophs, it appears that this metabolism may be a selective advantage for methanotrophs in the nitrogen limited thaw environment.

Metatranscriptomic analysis of in situ activity

Gene presence in metagenomes reveals the metabolic potential of the community but does not indicate which genes are active at the time of sampling. In order to investigate the activity of methanotrophs in the system, expression of key genes for methane oxidation (pmoA and mmoX) was examined in 24 metatranscriptomes spanning different sites and depths. Despite high abundance in the bog metagenomes, low relative transcript expression was observed for the Methylocystaceae (Fig. 5). Low activity of Methylocystaceae has been recorded in other environments [85], and is likely a consequence of harsher environmental conditions in the low pH and ombrotrophic bog that would favour stress-tolerance over productivity [23, 86]. In the fen, the diversity of methanotrophs was not reflected in the metatranscriptomes, which were dominated by Methylococcaceae even in the deeper layers (Fig. 5). This unexpected prevalence of Methylococcaceae activity at depth has also been observed in Arctic soils, which further supports hypotheses that these microorganisms are active under microaerophilic conditions [61, 87]. Methylococcaceae pmoA transcripts in the fen also comprised a greater proportion of the total metatranscriptome reads than in the bog or palsa, indicating that methanotrophs make up more of the active community in the fully thawed site. This suggests that Methylococcaceae are oxidising a substantial amount of CH4 and/or responding to increased CH4 availability in the fen, which has the highest CH4 flux of all three sites [35].

Fig. 5

Methanotroph abundance and activity in the 24 samples with paired metagenomes and metatranscriptomes from Stordalen Mire. For spatial orientation, distance from the water table and peat surface is shown in a. The metagenome abundances are indicated in b and the transcript expression in c. Methanotroph pmoA and mmoX read abundances are presented as a percentage of total reads normalised by HMM length for both metagenomes and metatranscriptomes
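The normalisation described in the caption can be sketched as follows. This is a minimal illustration only: it assumes that "percentage of total reads normalised by HMM length" means the fraction of library reads matching a marker gene, expressed as a percentage and divided by the length of the HMM used for the search. The function name, the read counts and the HMM lengths are all hypothetical and are not taken from the study.

```python
def normalised_abundance(hit_reads: int, total_reads: int, hmm_length: int) -> float:
    """Percentage of library reads hitting a marker gene, divided by HMM length.

    Assumed formula (not confirmed by the source):
        100 * hit_reads / total_reads / hmm_length
    """
    if total_reads <= 0 or hmm_length <= 0:
        raise ValueError("total_reads and hmm_length must be positive")
    return 100.0 * hit_reads / total_reads / hmm_length


# Hypothetical read counts for illustration only; not values from the study.
samples = {
    "sample_a": {"pmoA": 1200, "mmoX": 150, "total": 20_000_000},
    "sample_b": {"pmoA": 300, "mmoX": 600, "total": 10_000_000},
}

# Hypothetical HMM lengths (model positions); real values depend on the HMMs used.
hmm_lengths = {"pmoA": 250, "mmoX": 500}

for name, counts in samples.items():
    for gene, length in hmm_lengths.items():
        value = normalised_abundance(counts[gene], counts["total"], length)
        print(f"{name} {gene}: {value:.3e}")
```

Dividing by both library size and HMM length is what allows pmoA and mmoX values to be compared across samples of different sequencing depth and across markers of different length.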

Consistent with previous findings from isolates [68] and similar environments [23, 43, 85], very few pxmA or mmoX transcripts were detected in the Stordalen Mire metatranscriptomes. The ‘novel pmo’ and ‘novel smo’ of the HYP1 group had minimal or no transcript expression (12 ‘novel pmo’ reads detected in one surface fen sample, and 25 ‘novel smo’ reads in one mid fen sample). USCα had very low pmoA expression in the palsa core mid depth sample only (112 reads detected), which fits with low expression observed in samples of an Arctic cryosol environment [22]. Transcript expression of the second step of methanotrophy, methanol oxidation to formaldehyde, was high for xoxF, with virtually no representation in the metatranscriptome reads of mxaF (Supplementary Fig. 13). This strongly suggests that the xoxF form of MDH is the most important complex for methanol oxidation at Stordalen Mire, which supports recent findings that this form is likely dominant in lanthanide non-limiting environments [64]. Further, research into methylotroph and methanotroph co-cultures has revealed that high mxaF expression over xoxF can be linked to cross-feeding and syntrophy between these microorganisms [88]. The dominance of xoxF–MDH expression at Stordalen suggests that lanthanide is non-limiting in this environment, and potentially that methanol is not being secreted for use by syntrophic partners.

Mapping metatranscriptomes to the population genomes enabled transcription analysis of entire metabolic pathways for specific lineages, and revealed the functionality of methanotrophs beyond the expression of marker genes for primary methane oxidation. The highest relative transcript expression was 0.27% for the MC1 population genome in a bog mid-depth sample, and 0.24% for MC2 in a deep palsa sample (calculated as proportion of total transcript reads in the metatranscriptome; Supplementary Table 7). MC1 and MC2 had high transcript expression for a variety of genes in the bog, and some expression in the fen and one palsa sample, confirming active carbon assimilation (Supplementary Fig. 14). Activity through transcript expression was also observed for MB1 and MB2 in the fen, which had high transcript coverage of pmoCAB genes and downstream methane processing genes (Supplementary Fig. 14). xoxF was consistently expressed in MC1-2 and MB1-2, indicating use of the xoxF–MDH for methanol oxidation in these populations. Surprisingly, MC1-2 and MB1-2 appeared to be actively fixing nitrogen through high expression of nifH, reaffirming the importance of this function in environments with low nitrogen availability [23]. The novel metabolic inferences could not be confirmed for MG1, HYP1 and USC1 due to limited transcript expression across all metatranscriptome samples. It is likely that these populations were inactive, or below detection, at the time of sampling.

Relationship between methanotrophs and biogeochemistry

Previous work at the site revealed that average CH4 flux increases as thaw progresses, from minimal emission in the palsa, to ~1.46 mg CH4 m−2 h−1 and ~8.75 mg CH4 m−2 h−1 in the bog and fen, respectively [35]. Here, methanotroph metagenome abundances, determined using the pmoA and mmoX genes, were analysed alongside depth resolved biogeochemical data from the sites to evaluate relationships between community and methane oxidation. At the fen and bog sites the concentration and δ13C signature of dissolved CH4 were analysed in porewater samples collected in parallel with the peat samples. The δ13C signature of CH4 is influenced by the combined effect of the methanogenic pathway (acetoclastic versus hydrogenotrophic) that produced the CH4, and the activity of the methanotrophs [35, 89]. The highest abundance of methanotrophs occurred in the region of the peat profiles where there was maximum dissolved CH4, but still periodic oxygen availability either due to varying water tables in the bog [57] or root transport in the fen [56] (bog = 15–20 cm, fen = 10–15 cm; Fig. 6).

Fig. 6

Depth profiles of palsa, bog and fen methanotroph community abundances (as pmoA and mmoX gene reads normalised by total library and HMM length) and relationship to water table, dissolved CH4 concentration and δ13C signature (no porewater present at the palsa site). Metagenome and porewater chemistry data from 2011 and 2012 were averaged across all dates by depth category; error bars represent ± 1 standard error. This shows a link between methanotroph abundances and dissolved CH4 concentration across the thaw gradient. At the bog site, δ13C-CH4 patterns track the depth distribution of the methanotroph community, with the heaviest (most oxidised) CH4 occurring just above the maximum water table depth where methanotroph populations are highest and the lightest (least oxidised) CH4 occurring in permanently inundated peat where methanotroph abundances are low

Since hydrogenotrophic methanogens were dominant in the bog samples [35], most of the variation in δ13C-CH4 at this site was likely due to variation in methane oxidation (not variation in production pathway), with less negative δ13C-CH4 values indicating the preferential use of lighter 12CH4 by methanotrophs, and consequently greater CH4 oxidation. While methanotroph transcript expression in the bog revealed no depth associated trends (Fig. 5), comparison of methanotroph abundances and δ13C-CH4 suggests that methanotrophy increased with depth across the region of peat that is periodically above the water table and decreased in the permanently inundated peat (Fig. 6). The heaviest (most oxidised) CH4 and the highest methanotroph abundances occurred in peat that is inundated >90% of the time (15–20 cm) (Fig. 6). This was surprising, as CH4 oxidation was expected to be greatest nearer the theoretical optimal oxygen and CH4 conditions of the oxic–anoxic boundary at the average water table depth (6 cm) [85]. Instead, it appears that CH4 concentration, which increases with depth, is the key driver of methanotroph community patterns and that a highly variable water table in the bog allows infrequent oxic events that provide sufficient oxygen to support specialised methanotroph populations.
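The isotopic reasoning above, that preferential consumption of 12CH4 leaves the residual CH4 with a less negative δ13C, can be illustrated with a short sketch. It uses the standard delta notation together with a closed-system Rayleigh fractionation model, which is a textbook simplification and not the method used in this study. The reference ratio assigned to R_VPDB, the starting δ13C of −70‰ and the fractionation factor of 1.02 are assumed values chosen purely for illustration.

```python
# Assumed 13C/12C ratio of the VPDB reference standard (illustrative value).
R_VPDB = 0.0112372


def delta13c(r_sample: float, r_standard: float = R_VPDB) -> float:
    """Delta notation in per mil: (R_sample / R_standard - 1) * 1000."""
    return (r_sample / r_standard - 1.0) * 1000.0


def ratio_from_delta(delta: float, r_standard: float = R_VPDB) -> float:
    """Inverse of delta13c: recover the 13C/12C ratio from a delta value."""
    return r_standard * (1.0 + delta / 1000.0)


def residual_delta13c(initial_delta: float, fraction_remaining: float,
                      alpha: float) -> float:
    """delta13C of the CH4 left after partial oxidation (closed-system Rayleigh).

    alpha is the kinetic fractionation factor k12/k13 (> 1 when the light
    isotopologue is consumed faster). Residual ratio: R = R0 * f ** (1/alpha - 1).
    """
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction_remaining must be in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    r0 = ratio_from_delta(initial_delta)
    r = r0 * fraction_remaining ** (1.0 / alpha - 1.0)
    return delta13c(r)


# Hypothetical inputs: starting delta13C of -70 per mil, alpha of 1.02.
for f in (1.0, 0.8, 0.5, 0.2):
    print(f"fraction remaining {f:.1f}: "
          f"delta13C = {residual_delta13c(-70.0, f, 1.02):.1f} per mil")
```

Under these assumptions, the residual δ13C-CH4 becomes progressively less negative as a larger fraction of the CH4 pool is oxidised, which is the direction of change interpreted in the text as greater CH4 oxidation.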

In the fen, methanotroph abundances correlated with distance from the water table (Supplementary Fig. 15); however, low variability in δ13C-CH4 indicated no clear isotopic evidence for activity of methanotrophs even though the highest methanotroph abundances were found between 10 and 15 cm, and Methylococcaceae were active (Figs. 5 and 6). This disconnection of abundances, activity and δ13C-CH4 may be related to the capacity of plant roots to provide substrates for methanogenesis within the peat and to act as a conduit for oxygen into the peat and CH4 out of the peat, which could allow CH4 to bypass oxidation and disrupt clear methane oxidation gradients [57].

Conclusion

The methanotrophs found within the Stordalen Mire thaw gradient include canonical and novel populations that encode diverse metabolisms. The low CH4 environment of the palsa is populated by the likely facultative and high affinity atmospheric methane-oxidisers USCα. The Methylocystaceae (MC1 and MC2) in the bog can cope with fluctuating CH4 conditions due to the pMMO isozymes (pMMO and pMMO2) and capability for acetate uptake. In the high CH4 fen, the obligate Methylococcaceae (MB1 and MB2) are dominant and active, while the metabolically diverse HYP1 are in low abundance and appear relatively inactive. It is evident that an evolutionarily complex, diverse and shifting methanotroph community is at the forefront of climate change as permafrost thaws. The low abundance of USC1 and HYP1 precluded their enrichment and visualisation; however, it is hoped that the metabolic inferences determined from these genomes will guide future efforts to target and eventually isolate these elusive microorganisms, and experimentally confirm the extent of their ecological impact.

Data availability

Data used in this manuscript are submitted under NCBI BioProject accession number PRJNA386568.