Introduction

Molecular hydrogen (H2) has several physical properties desirable for biological systems, notably its redox potential (E°’=−0.42 V) and diffusion coefficient (4 × 10⁻⁹ m² s⁻¹). Microorganisms are able to harness these properties by consuming and producing H2 using specialised metalloenzymes called hydrogenases (Schwartz et al., 2013). There are three phylogenetically unrelated classes of hydrogenase distinguishable based on the metal content of their H2-binding sites: the [NiFe]-, [FeFe]- and [Fe]-hydrogenases (Volbeda et al., 1995; Peters et al., 1998; Shima et al., 2008). H2 oxidation by such enzymes yields low-potential electrons that are transduced through respiratory chains or used to fix inorganic carbon. In contrast, H2 evolution efficiently dissipates excess reductant as a diffusible gas during microbial fermentation and photobiological processes (Schwartz et al., 2013). Certain hydrogenases are also part of low-potential ion-translocating complexes that use protons as terminal electron acceptors (Buckel and Thauer, 2013). Since the discovery of microbial H2 oxidation in the 1900s (Kaserer, 1906; Stephenson and Stickland, 1931), H2 metabolism has been observed in multiple bacterial, archaeal and eukaryotic phyla. It is increasingly recognised that H2 metabolism is important for a wide range of microorganisms: lithotrophs and phototrophs, respirers and fermenters and aerobes and anaerobes alike (Vignais and Billoud, 2007; Schwartz et al., 2013; Peters et al., 2014). Furthermore, it is widely hypothesised that H2 was the primordial electron donor, suggesting early and sustained evolutionary importance (Lane et al., 2010).

Several recent studies demonstrated that microbial H2 metabolism is more widespread than previously reported. It was recently shown that some aerobic soil actinobacteria and acidobacteria persist by scavenging H2 from the lower atmosphere (Constant et al., 2010; Greening et al., 2014, 2015a, b), overturning long-held beliefs that hydrogen metabolism is restricted to low O2, high H2 environments and highlighting the importance of H2 for survival in addition to growth (Greening and Cook, 2014). Biochemists have simultaneously elucidated mechanisms dependent on reversed electron flow that enable certain hydrogenases to function in the presence of oxygen (traditionally an inhibitor of their active sites) (Fritsch et al., 2011; Shomura et al., 2011; Horch et al., 2015). In anaerobic systems, ultra-minimalistic hydrogenase-containing respiratory chains have been described that efficiently generate energy within oligotrophic environments (Kim et al., 2010; Lim et al., 2014). In parallel, the discovery of electron bifurcation has expanded our understanding of how energy is conserved in anaerobic processes such as cellulolytic fermentation, acetogenesis and methanogenesis (Schut and Adams, 2009; Kaster et al., 2011; Buckel and Thauer, 2013; Schuchmann and Müller, 2014). Other themes, including H2 sensing within anaerobes (Zheng et al., 2014) and H2 fermentation in aerobes (Berney et al., 2014), are emerging.

Despite this progress, there remains much to be discovered about microbial H2 metabolism on both the microscopic and macroscopic levels. Most studies on microbial H2 metabolism focus on only a few branches of the hydrogenase phylogenetic tree and a small subset of organisms within the universal tree of life. Physiological and biochemical characterisations have focussed on model organisms from within five phyla, Proteobacteria, Firmicutes, Cyanobacteria, Euryarchaeota and Chlorophyta (Schwartz et al., 2013; Lubitz et al., 2014). Furthermore, detailed biochemical information and atomic-resolution structures are available for only a subset of hydrogenases (Volbeda et al., 1995; Peters et al., 1998; Shima et al., 2008; Fritsch et al., 2011; Mills et al., 2013). Although the contribution of H2 metabolism to total ecosystem processes is recognised in some environments (for example, anoxic sediments, animal guts and hydrothermal vents; Vignais and Billoud, 2007; Schwartz et al., 2013), the role of hydrogenases in general soil and aquatic ecosystems remains largely unresolved (Barz et al., 2010; Constant et al., 2011; Beimgraben et al., 2014; Greening et al., 2015b). Consequently, the influence of H2 evolution and consumption on community structuring and global biogeochemical cycling requires further investigation (Schwartz et al., 2013; Greening et al., 2015b).

Hydrogenase gene surveys are vital for understanding microbial H2 metabolism at the global scale. Current knowledge on the evolution and diversity of hydrogenases relies heavily on the progressive surveys conducted by Wu and Vignais (Wu and Mandrand, 1993; Vignais et al., 2001; Vignais and Billoud, 2007); these studies revealed that the primary sequences and subunit architectures of [NiFe]- and [FeFe]-hydrogenases have diversified to enable them to adopt a wide range of physiological roles (whereas the [Fe]-hydrogenase is constrained to a single function). In the eight years following these studies (Vignais and Billoud, 2007), the emergence of sequencing technologies has resulted in the rapid expansion of genome and metagenome sequence data. Genomes are now available for a far greater range of organisms, spanning model laboratory specimens, representatives of dominant environmental phyla, and poorly described ‘Microbial Dark Matter’ (Wu et al., 2009; Rinke et al., 2013). Furthermore, metagenomes enable the metabolic capability of entire communities to be described in silico (Tringe et al., 2005; Morales and Holben, 2011; Wrighton et al., 2012). In this work, we used publicly available genome and metagenome resources to comprehensively analyse the distribution of hydrogenases. Our findings suggest that H2 metabolism is more diverse and widespread on both the taxonomic and community levels than previously reported.

Materials and methods

Hydrogenase sequence retrieval

Amino acid sequences of all non-redundant putative hydrogenase catalytic subunits represented in the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) (Pruitt et al., 2007) and Joint Genome Institute (JGI) Microbial Dark Matter (MDM) (Rinke et al., 2013) databases were retrieved by Protein BLAST (Altschul et al., 1990) during August 2014. The retrieved sequences were verified as hydrogenase encoding through screening for the presence of conserved cysteine residues required to ligate H2-binding metal centres (L1 and L2 motifs for [NiFe]-hydrogenases (Vignais and Billoud, 2007); P1, P2 and P3 motifs for [FeFe]-hydrogenases (Vignais and Billoud, 2007); and Cys176 in [Fe]-hydrogenases (Shima et al., 2008)). The analysis omitted protein families homologous to [NiFe]-hydrogenases (Ehr/Mbx, NuoD), [FeFe]-hydrogenases (Narf/Nar1p) and [Fe]-hydrogenases (HmdII) that appear to lack the capacity to metabolise H2. Our analysis did not include nitrogenases, alkaline phosphatases, formate dehydrogenases and carbon monoxide dehydrogenases that have been shown to catalyse side-reactions resulting in H2 oxidation or evolution (Schwartz et al., 2013).
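The motif-based verification step can be illustrated with a minimal sketch. The patterns below are simplified placeholders standing in for the cysteine-containing L1 and L2 motifs; they are assumptions made for illustration and are not the motif definitions used in this study. The function names are likewise hypothetical.

```python
import re

# Placeholder patterns (assumptions for illustration only): simplified
# stand-ins for cysteine-containing motifs, NOT the study's L1/L2 definitions.
PLACEHOLDER_MOTIFS = {
    "L1": re.compile(r"C.{2}C"),
    "L2": re.compile(r"C.{2}C.{2}[HR]"),
}


def has_all_motifs(seq, motifs=PLACEHOLDER_MOTIFS):
    """Return True if every motif is found, in order, along the sequence."""
    pos = 0
    for pattern in motifs.values():
        match = pattern.search(seq, pos)
        if match is None:
            return False
        pos = match.end()  # the next motif must occur downstream of this one
    return True


def screen(records, motifs=PLACEHOLDER_MOTIFS):
    """Keep only the sequences (name -> amino acid string) with all motifs."""
    return {
        name: seq
        for name, seq in records.items()
        if has_all_motifs(seq.upper(), motifs)
    }
```

Requiring the motifs to occur in order is a design choice of this sketch; a real screen would use the published motif definitions for each hydrogenase class.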

Hydrogenase classification and analysis

Protein sequences encoding the catalytic subunits ([NiFe]-hydrogenases, [Fe]-hydrogenases) or catalytic domains ([FeFe]-hydrogenases) of hydrogenases were aligned using the ClustalW (Larkin et al., 2007) and MUSCLE (Edgar, 2004) algorithms. Evolutionary relationships were analysed using neighbour-joining phylogenetic trees (Saitou and Nei, 1987) constructed with MEGA6 (Tamura et al., 2013). All trees were bootstrapped using 500 replicates and were rooted with ancestral sequences where available. The robustness of the analysis was confirmed by varying the number of ingroup sequences tested and the nature of the outgroup sequence used. The Microbial Genomic Context Viewer (MGcV) was used to compare genome regions encoding homologous hydrogenases (Overmars et al., 2013). Domains were predicted by searching the Conserved Domain Database (CDD) (Marchler-Bauer et al., 2011) and using multiple sequence alignments to identify signature conserved residues. WebLogo (Crooks et al., 2004) was used to visualise conserved metal-binding motifs forming the active sites and redox centres. Using a combination of the information derived from these methods, the [NiFe]- and [FeFe]-hydrogenases were further divided into phylogenetically distinct groups and subgroups.
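As a rough illustration of the bootstrapping idea (not of the MEGA6 implementation, which was used as a packaged tool), the sketch below resamples alignment columns with replacement to generate pseudo-alignments and computes a simple p-distance of the kind a distance-based tree method could take as input. The function names and the choice of p-distance are assumptions for illustration.

```python
import random


def bootstrap_alignments(alignment, n_replicates=500, seed=0):
    """Yield pseudo-alignments made by sampling columns with replacement.

    `alignment` maps sequence names to equal-length aligned strings.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have equal length")
    n_cols = lengths.pop()
    rng = random.Random(seed)
    for _ in range(n_replicates):
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        yield {
            name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()
        }


def p_distance(a, b):
    """Proportion of differing sites, skipping columns gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns shared by the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)
```

The support for a clade is then the fraction of replicate trees in which that clade reappears, which is why the values in this work are reported on a 0 to 1 scale.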

Metagenome analysis

Metagenome sequence libraries derived from 10 ecosystems including soil (farmland, forest, permafrost, bog), gut (termite, human) and water (fresh water, hot spring, coastal upwelling, deep ocean) environments were identified from publicly available databases. For each ecosystem, two libraries were selected. All selected libraries were sequenced with paired-end reads on an Illumina (San Diego, CA, USA) platform and represented read sizes of between 201 and 280 nucleotides. BLAST (Altschul et al., 1990) analyses were performed using a local BLAST database containing the protein sequences of the catalytic subunit ([NiFe]-hydrogenases, [Fe]-hydrogenases) or catalytic domain ([FeFe]-hydrogenases) of all sequenced hydrogenases (curated as described above). Low complexity regions for all reference sequences were masked using the SEG algorithm (Wootton and Federhen, 1996) of BLAST+ (Camacho et al., 2008) and a reference BLAST database was created. All metagenome libraries were randomly subsampled to an equal depth (1 million reads) and read length >201 nucleotides before analyses. A translated BLAST screening of all subsamples was performed using blastx (word size 3 and e-value 10). To minimise false positives, hits within the initial screen were sieved by retaining only results with a minimum percentage identity of 60% and a minimum query coverage of 40 amino acids. Identified reads for each class were recorded as relative percentage abundance.
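The subsampling, sieving and abundance steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used here: it assumes blastx results in the standard 12-column tabular layout (query, subject, percentage identity, alignment length, ..., bit score), treats alignment length as the coverage measure, and keeps only the best-scoring passing hit per read. The function names and the subject-to-class lookup are hypothetical.

```python
import random
from collections import Counter


def subsample(reads, depth, min_length=201, seed=0):
    """Randomly draw `depth` reads from those longer than `min_length`."""
    eligible = [r for r in reads if len(r) > min_length]
    return random.Random(seed).sample(eligible, depth)


def filter_hits(tabular_lines, min_identity=60.0, min_length=40):
    """Return {read: subject} for the best hit per read passing both cut-offs."""
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])   # percentage identity
        length = int(fields[3])       # alignment length
        bitscore = float(fields[11])
        if identity < min_identity or length < min_length:
            continue
        # Assumption: one read is counted once, via its highest bit score.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def relative_abundance(hits, subject_to_class, total_reads):
    """Percentage of all subsampled reads assigned to each hydrogenase class."""
    counts = Counter(subject_to_class[s] for s in hits.values())
    return {cls: 100.0 * n / total_reads for cls, n in counts.items()}
```

With a depth of 1 million reads, a class represented by 40 passing reads would be reported as 0.004% of the metagenome.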

Results

An expanded hydrogenase classification scheme predictive of biological function

The first aim of this work was to identify and classify all putative hydrogenases represented in public databases. Initially, we retrieved non-redundant sequences encoding the catalytic subunits of all hydrogenases in the NCBI database and verified that 3286 of them contained the residues required to bind their metal centres (Supplementary Tables S1 and S2). In order to develop the classification scheme (Table 1), we correlated the phylogenetic clustering of the hydrogenases with functional information and predictors. For all hydrogenases, we analysed: (1) primary phylogeny to determine their evolutionary relationships (Figure 1), (2) metal- and cofactor-binding motifs to predict redox centres (Table 2), (3) genetic organisation to identify probable partner proteins (Figure 2) and (4) previous literature reports to probe biochemical characteristics and physiological roles (Table 1). Integrating this information, we were able to classify hydrogenases into multiple groups and subgroups/subtypes likely to have distinct cellular functions.

Table 1 New scheme for the classification of hydrogenases
Figure 1

Classification and phylogeny of hydrogenases. These neighbour-joining skeleton trees show the phylogenetic relationships of all 3286 hydrogenases identified in this work. The trees are colour coded by [NiFe]-hydrogenase subgroup and [FeFe]-hydrogenase group. The nodes separating the major clades are encircled and coloured according to their bootstrap values, that is, black circles for well-supported nodes (bootstrap values >0.75) and red circles for unsupported nodes (bootstrap values <0.75). Group A [FeFe]-hydrogenases cannot be reliably subdivided phylogenetically and can only be classified into subtypes based on their genetic organisation. The expanded trees, including taxon names and bootstrap values, are shown in Supplementary Figures S1 to S6.

Table 2 Metal centres of hydrogenases
Figure 2

Genetic organisation of hydrogenases. The genes surrounding the catalytic subunit of representatives of each subtype/subclass are shown to-scale. Genes/domains are colour coded as follows: green=catalytic site; blue=small subunit; yellow=electron acceptor or donor; red=redox subunit; light orange=maturation factor; dark orange=ion-translocation module; purple=regulatory module; grey=conserved hypothetical. Redox-active centres are shown in circles, where: orange=heme; red=[4Fe4S] cluster; yellow=[2Fe2S] cluster; green=[3Fe4S] cluster; purple=[4Fe3S] cluster. Genes are named according to nomenclature if previously defined. There are often variations in the genetic organisation within subgroups, for example, cytochrome c subunits replace cytochrome b subunits in most group 1a and 1b [NiFe]-hydrogenases in δ-Proteobacteria. However, the organisations depicted reflect the most common organisation, as inferred using the Microbial Genomic Context Viewer.

All hydrogenases could be classified into eight previously described major lineages (Vignais and Billoud, 2007; Calusinska et al., 2010): groups 1 to 4 [NiFe]-hydrogenases, groups A to C [FeFe]-hydrogenases and [Fe]-hydrogenases. However, pre-existing classification schemes did not sufficiently reflect the variety in the functions of the experimentally studied hydrogenases within most of these groups. For example, existing schemes do not account for the great heterogeneity in primary sequence phylogeny, genetic organisation and physiological roles of the group 1 and group 4 [NiFe]-hydrogenases (Vignais and Billoud, 2007), as well as the group A [FeFe]-hydrogenases (Calusinska et al., 2010). In addition, the group 2 [NiFe]-hydrogenases of recently sequenced Aquificae and methanotrophs form distinct lineages from the presently recognised 2a and 2b subgroups (Vignais and Billoud, 2007). We therefore expanded the [NiFe] enzymes into 22 functionally distinct subgroups (groups 1a to 1h, 2a to 2d, 3a to 3d and 4a to 4f) and the group A [FeFe]-hydrogenases into four subtypes (groups A1 to A4). A description of the hydrogenase subgroups/subtypes defined here is provided in Table 1.
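The subgroup ranges quoted above can be expanded mechanically to confirm the totals. The snippet below is only a bookkeeping sketch; the variable and function names are illustrative.

```python
from string import ascii_lowercase

# Last letter of each [NiFe]-hydrogenase group, as quoted in the text
# (groups 1a to 1h, 2a to 2d, 3a to 3d and 4a to 4f).
NIFE_RANGES = {"1": "h", "2": "d", "3": "d", "4": "f"}


def expand(ranges):
    """Expand {'1': 'h', ...} into ['1a', '1b', ..., '1h', ...]."""
    labels = []
    for group, last in ranges.items():
        stop = ascii_lowercase.index(last) + 1
        labels.extend(group + letter for letter in ascii_lowercase[:stop])
    return labels


nife_subgroups = expand(NIFE_RANGES)
fefe_a_subtypes = ["A" + str(i) for i in range(1, 5)]  # A1 to A4

print(len(nife_subgroups), len(fefe_a_subtypes))  # prints "22 4"
```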

Functional diversity of hydrogenases is reflected in phylogenetic clustering, genetic organisation and metal-binding motifs

The hydrogenase classification scheme was developed primarily on the basis of amino acid sequence phylogeny (Figure 1). Our analysis shows that the [NiFe]-hydrogenases are the most diverse and widespread of the hydrogenases, and can be initially divided into H2-uptake (groups 1 and 2), bidirectional (group 3) and H2-evolving (group 4) clades. On the basis of phylogeny, these enzymes can be further divided into 22 subgroups (Figure 1; Supplementary Figures S1–S4) predicted to have distinct physiological roles described in Table 1. These subgroups are monophyletic, well populated (with more than 15 unique sequences) and statistically supported (with bootstrap values above 0.75), with the exceptions of certain group 4 clades (see Supplementary Figure S4 legend). On the basis of phylogeny, three [FeFe]-hydrogenase groups were defined: a main group represented by fermentative and bifurcating hydrogenases (group A), an ancestral group of unknown function (group B) and a group containing putative sensory hydrogenases (group C) (Figure 1). However, we were unable to subdivide these enzymes further by phylogeny alone, owing to poor bootstrapping of the group A subclades (Supplementary Figure S5) and lack of functional information on the group B and C enzymes.

The genetic organisation of the hydrogenases also serves as a reliable indicator of function between subgroups. As detailed in Figure 2, genes encoding hydrogenase structural components are proximal to those encoding diverse electron-transfer proteins, ion-translocating subunits, regulatory components, maturation factors, hypothetical proteins and partner enzymes. Although many of these associations have been previously described (Vignais and Billoud, 2007; Schwartz et al., 2013), novel findings included the association of surprising regulatory components (for example, diguanylate cyclases/phosphodiesterases) with putative group C [FeFe]-hydrogenases and group 2c [NiFe]-hydrogenases. Genome architecture is well conserved within subgroups of [NiFe]-hydrogenases, but varies extensively between enzymes of different functions (Figure 2). Genome architecture is also able to discriminate the sometimes poorly bootstrapped lineages of the group 4 enzymes (Supplementary Figure S4) and therefore is a valuable hydrogenase classification tool. Variations in the domain organisation and probable quaternary structure of the [FeFe]-hydrogenases were also apparent. On this basis, group A [FeFe]-hydrogenases can be subdivided into four functionally distinct subtypes: stand-alone enzymes (group A1) and those associated with putative glutamate synthases (group A2), NADH dehydrogenases (group A3) and formate dehydrogenases (group A4) (Figure 2).

As detailed in Table 2, the content of the hydrogenase metal centres also differed between subgroups, further confirming phylogenetic placements. Cysteine residues that bind the metal ions of the catalytic centres of the three types of hydrogenase are conserved, suggesting the chemical structures of the active site do not vary. However, the neighbouring residues in the motifs binding the [NiFe]-centre (L1, L2) and [FeFe]-centre (P1, P2, P3) vary between subgroups, which may influence the catalytic behaviour of the enzymes. The number, configuration and ligands of the iron–sulphur clusters of the [NiFe]-hydrogenase small subunits differ between subgroups (Table 2). Nonstandard ligands for iron–sulphur clusters, that is, Asp, Glu, Asn and His, were common in the clusters proximal and distal to the active site. The number of iron–sulphur clusters associated with the [FeFe]-hydrogenase catalytic domain also varies (Table 2 and Figure 2).

The determinants of hydrogen metabolism are widely distributed in bacterial and archaeal genomes

By curating the sequences of all hydrogenases in public sequence databases, we were able to comprehensively map the distribution and diversity of hydrogenases across sequenced microorganisms. Genes encoding putative hydrogenases were detected in 1397 species (Supplementary Table S3) across 55 phyla (Supplementary Table S4) (note that not all species within these phyla contain hydrogenases). The [NiFe]-hydrogenases were the most widespread of the enzymes, occurring in 36 bacterial and 6 archaeal phyla. Consistent with our current knowledge, they were widespread in all classes of Proteobacteria, as well as Firmicutes, Cyanobacteria, Aquificae, Euryarchaeota and Crenarchaeota. [NiFe]-hydrogenases were also common in multiple phyla where hydrogenases have yet to be described, notably Bacteroidetes, Chlorobi, Chloroflexi, Planctomycetes and Verrucomicrobia. Putative group A1 [FeFe]-hydrogenases were detected in 12 phyla of anaerobic bacteria, 5 phyla of unicellular eukaryotes and, surprisingly, the single-amplified genome associated with the archaeal candidate phylum Diapherotrites (pMC2A384). Other types of [FeFe]-hydrogenases, including bifurcating, ancestral and sensory varieties, were exclusive to anaerobic bacteria such as Firmicutes, Bacteroidetes, Spirochaetes, Thermotogae and Fusobacteria. Both [NiFe]- and [FeFe]-hydrogenase genes were also identified within most genomes of newly characterised phyla, including the databases of the MDM project (Figure 3). In contrast, genes encoding the functionally restricted [Fe]-hydrogenases were exclusively found within 25 methanogen genomes (Supplementary Figure S6). Phylogenetic distribution correlates with oxygen preference: obligate anaerobes encoded O2-sensitive [NiFe]- and [FeFe]-hydrogenases; obligate aerobes encoded O2-tolerant [NiFe]-hydrogenases (groups 1d, 1h, 2a, 3b, 3d); and genomes of facultative anaerobes contained a diverse range (Figure 3).
Many hydrogenases have a mosaic distribution that poorly reflects 16S rRNA gene sequence phylogeny, for example, the abundant group 1d and group 3b [NiFe]-hydrogenases (Supplementary Figures S1 and S3), suggesting strong pressure for lateral acquisition of these enzymes.

Figure 3

Distribution of hydrogenases in microorganisms. (a) Distribution by hydrogenase type. (b) Distribution by phyla. The cells are shaded by the number of hydrogenases detected in each phylum (light=few hydrogenases, dark=many hydrogenases, grey=no hydrogenases). Hydrogenases were subdivided into the following seven types based on their determined or predicted functions: [NiFe] aerobic uptake (groups 1d, 1h, 2a), [NiFe] anaerobic uptake (groups 1a, 1b, 1c, 1e, 1f, 1g, 3a), [NiFe] bidirectional (groups 3b, 3c, 3d), [NiFe] evolving (groups 4a, 4b, 4c, 4d, 4e, 4f), [FeFe] evolving (groups A, B), [NiFe] regulatory (groups 2b, 2c) and [FeFe] regulatory (group C).

The determinants of microbial hydrogen metabolism are widely distributed in ecosystems

We subsequently gained insight into the distribution of hydrogenases at the ecosystem level through identifying and analysing hydrogenase sequence reads in 20 publicly available metagenomes (Supplementary Table S5). Sequence reads corresponding to the catalytic subunits/domains of hydrogenases were detected in all metagenome samples analysed, and all 22 [NiFe]-hydrogenase subgroups and 3 [FeFe]-hydrogenase groups (but not the methanogen-specific [Fe]-hydrogenases) were detected in at least two samples each. The normalised abundance of hydrogenase reads ranged approximately 50-fold, from <0.001% of the total sequence reads in lake and coastal waters to >0.04% reads in permafrost soils and hot springs (Supplementary Table S5). The distribution of hydrogenase-encoding genes in the sequence reads varies depending on the aeration state of the samples: oxic agricultural and forest soils were dominated by aerobically adapted uptake and bidirectional [NiFe]-hydrogenase reads (groups 1d, 1h, 3b, 3d); anoxic termite and human guts contained a high abundance of fermentative and putative sensory [FeFe]-hydrogenase reads (groups A, B, C); and the bog soils, permafrost soils and hot springs contained diverse [NiFe]- and [FeFe]-hydrogenase reads (Figure 4 and Supplementary Figure S7). Hydrogenase-encoding genes were far less abundant within the aquatic ecosystems tested compared with soil and enteric systems; nevertheless, numerous sequence reads homologous to the Robiginitalea biformata hydrogenase of the still-uncharacterised group 1g [NiFe]-hydrogenases were detected in deep ocean samples. The quantity and distribution of hydrogenase reads were consistent between paired metagenome samples, including samples from equivalent ecosystems taken at different locations (for example, Atlantic Ocean vs Indian Ocean) (Figure 4).
An interesting exception was the termite gut samples, with hydrogenases predicted to be 10 times more abundant in a gut sample of an African termite (Cubitermes sp.) compared with an American termite (Nasutitermes sp.) (Supplementary Table S5).

Figure 4

Distribution of hydrogenases in ecosystems. The distribution of different hydrogenase types was analysed in 20 metagenomes. Hydrogenases were subdivided into seven types as described in the legend of Figure 3. Metagenomes were screened using the sequences of the catalytic subunits ([NiFe]-hydrogenases, [Fe]-hydrogenases) or catalytic domains ([FeFe]-hydrogenases) listed in Supplementary Table S1. (a) Percentage of sequence reads for each hydrogenase type identified within 1 million random metagenome reads. (b) Percentage of sequence reads for each hydrogenase type compared with total hydrogenase sequence reads. Supplementary Figure S7 shows the metagenome distribution by [NiFe]-hydrogenase subgroup and [FeFe]-hydrogenase group. Note that no [Fe]-hydrogenases were detected in these metagenomes.
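The two normalisations used in panels (a) and (b) can be sketched as below. The type-to-group mapping is transcribed from the legend of Figure 3; groups not named in that legend are counted under an "unassigned" label, which is an assumption of this sketch, as are the function name and the example read counts.

```python
from collections import Counter

# Seven hydrogenase types and their member groups, as listed in the
# legend of Figure 3.
TYPE_GROUPS = {
    "[NiFe] aerobic uptake": ["1d", "1h", "2a"],
    "[NiFe] anaerobic uptake": ["1a", "1b", "1c", "1e", "1f", "1g", "3a"],
    "[NiFe] bidirectional": ["3b", "3c", "3d"],
    "[NiFe] evolving": ["4a", "4b", "4c", "4d", "4e", "4f"],
    "[FeFe] evolving": ["A", "B"],
    "[NiFe] regulatory": ["2b", "2c"],
    "[FeFe] regulatory": ["C"],
}
GROUP_TO_TYPE = {g: t for t, groups in TYPE_GROUPS.items() for g in groups}


def summarise(group_counts, total_reads=1_000_000):
    """Return (percentage of all reads, percentage of hydrogenase reads) per type."""
    type_counts = Counter()
    for group, n in group_counts.items():
        type_counts[GROUP_TO_TYPE.get(group, "unassigned")] += n
    total_hydrogenase = sum(type_counts.values())
    pct_of_reads = {t: 100.0 * n / total_reads for t, n in type_counts.items()}
    pct_of_hydrogenases = {
        t: (100.0 * n / total_hydrogenase if total_hydrogenase else 0.0)
        for t, n in type_counts.items()
    }
    return pct_of_reads, pct_of_hydrogenases


# Hypothetical read counts for one subsampled metagenome.
panel_a, panel_b = summarise({"1h": 30, "1d": 10, "A": 60})
```

Panel (a) values are comparable between metagenomes because every library was subsampled to the same depth, whereas panel (b) values describe only the composition of the hydrogenase reads within each sample.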

Discussion

Molecular hydrogen is a major electron donor for respiration in both anoxic and oxic ecosystems

Molecular hydrogen occurs ubiquitously in the environment, as a result of production from biological, geothermal and atmospheric sources (Schwartz et al., 2013). Our analysis suggests microorganisms are capable of respiring this fuel source in a wide variety of ecosystems, ranging from the hypoxic H2-enriched environments of animal guts and bog soils to aerated soils and waters containing trace concentrations of H2. Group 1 and 2 [NiFe]-hydrogenases that mediate respiratory H2 uptake were encoded in 19 of the 20 metagenomes surveyed (Figure 4), and were widely distributed in the bacterial and archaeal phyla (Figure 3). As summarised in Table 1, these enzymes have differentiated into multiple subgroups that differ in their redox couplings, oxygen tolerance, affinities and cellular interactions. This enables these enzymes to support life across a wide range of ecological niches (Pandelia et al., 2012; Schwartz et al., 2013).

Integrating our analysis with the wider literature, we suggest oxygen partial pressure (pO2) principally drove the evolution and distribution of respiratory hydrogenases. Phylogenetic analysis reveals the deepest-branching forms of these enzymes (groups 1a, 1b) are O2 sensitive and mediate anaerobic respiration in strict anaerobes (Supplementary Figures S1 and S2). Such enzymes are abundant in hypoxic soils (for example, bog soils, permafrost soils) (Supplementary Figure S7), and are predominantly found in the genomes of anaerobic Firmicutes and δ-Proteobacteria (Supplementary Figure S1) capable of H2-dependent sulphate reduction, metal reduction and dehalorespiration (Schwartz et al., 2013). The mid-branching heterotetrameric hydrogenases (groups 1c, 1e, 1g) were more taxonomically dispersed (Supplementary Figure S1) and appear to support roles in fumarate and nitrate respiration, anoxygenic photosynthesis and chemolithoautotrophy across a diversity of taxa (Pandelia et al., 2012). In contrast, the more recently branching lineages (groups 1d, 1h and 2a) appear to be oxygen-tolerant enzymes that mediate respiration in aerobes and facultative anaerobes (Table 1). These enzymes were predominant in aerated samples (Figure 4), with 0.004 to 0.009% of total metagenome sequence reads in agricultural and forest soils corresponding to the group 1h enzyme (Supplementary Table S5). Our analysis suggests that these subgroups independently developed mechanisms to tolerate O2 following the emergence of oxygenic photosynthesis (Figure 1); consistently, the well-reported proximal 6Cys[4Fe3S] cluster (Fritsch et al., 2011) is exclusive to 1d enzymes (Table 2), indicating the 1h and 2a subgroups use alternative mechanisms (possibly using other nonstandard clusters) to prevent or reverse formation of O2-inhibited states.

Oxygen-tolerant uptake hydrogenases are significantly more widespread than the literature currently reports. Aerobic H2 uptake has only been reported in three dominant soil phyla to date: many α-, β- and γ-Proteobacteria (for example, Ralstonia eutropha; Schwartz et al., 2013) can grow chemolithoautotrophically using biologically evolved H2, whereas certain model Actinobacteria (for example, Mycobacterium smegmatis; Greening et al., 2014) and Acidobacteria (that is, Pyrinomonas methylaliphatogenes; Greening et al., 2015a) enhance their persistence by scavenging atmospheric H2. However, Figure 3 reveals that the group 1d, 1h, and 2a hydrogenases mediating such processes are also encoded in some 17 bacterial and archaeal phyla, among them representatives of all 9 of the most dominant phyla in global soils (Janssen, 2006). Most significantly, the group 1h [NiFe]-hydrogenases that mediate tropospheric H2 oxidation are encoded in multiple representatives of undercultured, slow-growing phyla (that is, Acidobacteria, Verrucomicrobia, Chloroflexi and Planctomycetes). These findings are consistent with our recent hypothesis that H2 serves as an energy source for the maintenance of dormant soil bacteria (Greening et al., 2015b). Hydrogenase-encoding genes were also identified in the genomes of multiple seemingly obligate methane oxidisers, ammonia oxidisers and nitrite oxidisers (Supplementary Table S1), suggesting H2 may serve as a fuel source for growth or survival of these bacteria and archaea. In line with this, it was recently demonstrated that Nitrospira moscoviensis of the phylum Nitrospirae is capable of hydrogenotrophic growth using a group 2a [NiFe]-hydrogenase (Koch et al., 2014). Aerobic H2 oxidation may therefore provide hitherto-unrecognised metabolic flexibility in microorganisms controlling the methane and nitrogen cycles.

Determinants of fermentative hydrogen production are universally distributed

Since its discovery in the early twentieth century (Stephenson and Stickland, 1932), it has been widely believed that fermentative H2 evolution occurs exclusively in anaerobic microorganisms. This notion was recently challenged by the discovery that the obligately aerobic soil bacterium Mycobacterium smegmatis evolves H2 using a tightly regulated hydrogenase to maintain redox balance under hypoxia (Berney et al., 2014). The surveys presented in this work demonstrate that homologues of the group 3b [NiFe]-hydrogenase mediating mycobacterial H2 evolution actually have the most extensive distribution (phylum level) of all the subgroups in our database. Once thought to be confined to anaerobic archaea (Ma et al., 1993), these enzymes actually occur in at least 27 bacterial and archaeal phyla, among them multiple representatives of the ‘MDM’ (Rinke et al., 2013) (Figure 3). In addition to the versatile group 3d [NiFe]-hydrogenases (Burgdorf et al., 2005), these oxygen-tolerant enzymes are proposed to serve as redox valves that interconvert electrons between NAD(P)H and H2 depending on the availability of exogenous electron acceptors (Greening and Cook, 2014). These enzymes are also abundant at the metagenome level, constituting dominant groups in aerated soils and hot spring ecosystems (Figure 4). These findings may help to explain why the communities of such ecosystems are relatively stable despite pO2 fluctuations. Unsurprisingly, the classical determinants of H2 fermentation were abundant in anoxic ecosystems (Figure 4). Figure 3 shows that formate hydrogenlyases (group 4a [NiFe]-hydrogenases) are widespread in enteric bacteria that adopt a facultatively fermentative lifestyle. 
The group A1 [FeFe]-hydrogenases, which mediate ferredoxin-dependent H2 production (Peters et al., 1998), are distributed in numerous obligately fermentative bacteria (for example, clostridia), eukaryotes containing hydrogenosomes (for example, Trichomonas vaginalis) and unicellular algae mediating photobiological H2 production (for example, Chlamydomonas reinhardtii). On the basis of domain conservation and phylogenetic similarity, we predict the still-uncharacterised group B [FeFe]-hydrogenases serve a similar function.

Energy-converting and electron-bifurcating complexes enhance the efficiency and flexibility of anaerobe-type hydrogenases

Although group 4 [NiFe]-hydrogenases are traditionally known for their roles in fermentation, the majority of these enzymes have a respiratory function. They associate into complexes comprising primary dehydrogenases and terminal hydrogenases and conserve the energy liberated during electron transfer as a proton- or sodium-motive force (Buckel and Thauer, 2013). Our analysis shows these enzymes have retained roles in anaerobic microorganisms, especially Firmicutes, Proteobacteria (γ, δ and ɛ classes) and methanogens (Figure 3), and contribute to hydrogenase diversity in metagenomes (Figure 4). They appear to have diverse physiological roles, as reflected by their wide-branching phylogeny (Figure 1) and highly modular genetic organisation (Figure 2). This enables them to liberate electrons from low-potential donors, namely formate (group 4a, 4b, possibly 4f), carbon monoxide (group 4b, 4c) or ferredoxin (group 4d, 4e), whereas protons serve as the terminal electron acceptor. Though minimalistic, the respiratory chains they form are often highly efficient and may provide a primary strategy for energy generation within particularly oligotrophic environments; this was emphasised by the recent discovery of a complex in the deep-sea vent archaeon Thermococcus onnurineus that sustains growth across a narrow energy bracket by transferring electrons from formate to protons (Kim et al., 2010; Lim et al., 2014). Others are highly flexible, as demonstrated by the multifaceted roles of the physiologically reversible Ech hydrogenase (group 4e) in hydrogenotrophic vs aceticlastic methanogenesis (Meuer and Kuettner, 2002). Our phylogenetic analyses suggest that the ancestral forms of the group 4 enzymes—and likely [NiFe]-hydrogenases as a whole—may have been formate-oxidising, H2-evolving, energy-transducing complexes. 
We discovered a deep-branching lineage of these enzymes in Firmicutes (candidate group 4f [NiFe]-hydrogenases) that align closely with the functionally cryptic Ehr complexes (homologues of group 4 [NiFe]-hydrogenases lacking Ni-binding cysteine residues; elaborated on in Marreiros et al., 2013) and yet possess the critical cysteine residues required for [NiFe]-centre ligation.
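
As a rough illustration of why formate-dependent H2 evolution operates within such a narrow energy bracket, the free energy change of the reaction HCOO− + H2O → HCO3− + H2 can be estimated from ΔG = ΔG°′ + RT ln Q. The sketch below is not taken from our analysis. It assumes a standard free energy change of about +1.3 kJ mol−1 (a commonly quoted value, treated here as an assumption and not corrected for temperature), a temperature of 80 °C, and purely hypothetical metabolite concentrations chosen only to show the trend.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def reaction_gibbs(delta_g0_kj, temp_k, formate_m, bicarbonate_m, p_h2_atm):
    """Return dG (kJ/mol) for HCOO- + H2O -> HCO3- + H2.

    Uses dG = dG0' + RT ln(Q) with Q = [HCO3-] * pH2 / [HCOO-].
    Water activity is taken as 1 and activities are approximated by
    concentrations (mol/L) and partial pressure (atm).
    """
    q = (bicarbonate_m * p_h2_atm) / formate_m
    return delta_g0_kj + R * temp_k * math.log(q) / 1000.0


# ASSUMPTION: commonly quoted standard value, not corrected for temperature.
DELTA_G0_KJ = 1.3
# ASSUMPTION: 80 degrees C, typical of a hyperthermophile culture.
TEMP_K = 353.15

# Standard conditions (all activities equal to 1): Q = 1, so dG = dG0'.
dg_standard = reaction_gibbs(DELTA_G0_KJ, TEMP_K, 1.0, 1.0, 1.0)

# HYPOTHETICAL conditions: abundant formate, H2 kept low by diffusion.
dg_low_h2 = reaction_gibbs(DELTA_G0_KJ, TEMP_K, 0.1, 0.01, 1e-3)

print(f"standard conditions: {dg_standard:+.1f} kJ/mol")
print(f"low pH2 conditions:  {dg_low_h2:+.1f} kJ/mol")
```

Under these assumptions the reaction is slightly endergonic at standard state and becomes exergonic only when H2 and bicarbonate are kept low relative to formate, which is consistent with the idea that such complexes depend on efficient removal of H2.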

Many of the anaerobe-type hydrogenases we identified are predicted to mediate electron bifurcation, a recently discovered third mode of energy conservation. Electron-bifurcating hydrogenases are bidirectional enzymes that drive the endergonic reduction of ferredoxin by H2 by coupling it to the simultaneous, exergonic reduction of a relatively electropositive acceptor (for example, heterodisulphide, NAD, NADP) (Buckel and Thauer, 2013). The group 3c [NiFe]-hydrogenase in functional complex with heterodisulphide reductase, for example, simultaneously reduces ferredoxin and heterodisulphide during H2 oxidation (Kaster et al., 2011); these enzymes complete the recently elucidated Wolfe cycle of methanogenesis (Thauer, 2012), and are also distributed in some bacteria (for example, δ-Proteobacteria) (Figure 4). The group A3 [FeFe]-hydrogenases reversibly bifurcate electrons from H2 to ferredoxin and NAD using trimeric or tetrameric complexes; in the reverse reaction, energy conserved during the oxidation of ferredoxin is used to drive the thermodynamically unfavourable production of H2 from NADH (Schut and Adams, 2009; Schuchmann and Müller, 2012). A subtype of the group A4 [FeFe]-hydrogenases can also bifurcate electrons from H2 to NADP and ferredoxin, and acts physiologically in hexameric complexes with formate dehydrogenase (Wang et al., 2013). We analysed the genetic organisation of the 705 group A [FeFe]-hydrogenases represented in our database in order to identify putative electron-bifurcating complexes (Figure 2 and Supplementary Table S1). This analysis suggested that whereas putative NADP-dependent bifurcating complexes are rare (7 sequences), putative NAD-dependent bifurcating complexes are abundant in anaerobic bacteria (391 sequences).
The group A3 [FeFe]-hydrogenases are highly flexible, capable of both dissipating excess reductant during fermentation (for example, cellulose fermentation) and generating low-potential electrons for carbon fixation (for example, acetogenesis) and respiration (via the sodium-motive ferredoxin-NAD oxidoreductase complex) (Buckel and Thauer, 2013; Schuchmann and Muller, 2014). Consistent with previous PCR amplicon sequencing (Zheng et al., 2013), our metagenome analysis (Figure 4) demonstrates that group A [FeFe]-hydrogenases, including probable bifurcating hydrogenases, are abundant in termite guts.
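
The energetic logic of NAD-dependent electron bifurcation can be sketched with the relation ΔG = −nFΔE. The example below is illustrative only and is not part of our analysis. It uses the H2 redox potential given in the Introduction (−0.42 V), the textbook standard potential of the NAD+/NADH couple (−0.32 V), and an assumed physiological ferredoxin potential of −0.45 V (ferredoxin potentials vary between organisms and conditions, so this value is a placeholder).

```python
FARADAY = 96485.332  # Faraday constant, C mol^-1


def transfer_gibbs(e_donor_v, e_acceptor_v, n_electrons=2):
    """Return dG (kJ/mol) for moving n electrons from donor to acceptor.

    dG = -n * F * (E_acceptor - E_donor); a negative value is exergonic.
    """
    return -n_electrons * FARADAY * (e_acceptor_v - e_donor_v) / 1000.0


E_H2 = -0.42   # V, H+/H2 couple, value quoted in the Introduction
E_NAD = -0.32  # V, NAD+/NADH couple, textbook standard value
E_FD = -0.45   # V, ASSUMED physiological ferredoxin potential

# Half of the electrons from H2 go "uphill" to ferredoxin (endergonic).
dg_ferredoxin = transfer_gibbs(E_H2, E_FD)
# The other half go "downhill" to NAD (exergonic).
dg_nad = transfer_gibbs(E_H2, E_NAD)
# Bifurcation couples the two branches, so only the sum must be negative.
dg_net = dg_ferredoxin + dg_nad

print(f"H2 -> ferredoxin: {dg_ferredoxin:+.1f} kJ/mol")
print(f"H2 -> NAD:        {dg_nad:+.1f} kJ/mol")
print(f"coupled reaction: {dg_net:+.1f} kJ/mol")
```

With these values the ferredoxin branch is unfavourable on its own, yet the coupled reaction is exergonic overall. Reversing the signs describes the confurcating direction, in which ferredoxin oxidation drives H2 production from NADH.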

Hydrogen sensing may be more important than previously recognised

Our exploration of hydrogenase sequences across a diversity of environments also uncovered evidence that hydrogen-based signal transduction cascades are more significant than previously anticipated. The only characterised sensory hydrogenases to date are the group 2b [NiFe]-hydrogenases (for example, Ralstonia eutropha, Rhodobacter capsulatus) (Lenz and Friedrich, 1998; Vignais et al., 2005); these enzymes have adapted the [NiFe] active site to sense high partial pressures of H2 and in turn activate two-component regulatory cascades that control expression of respiratory hydrogenases (Greening and Cook, 2014). Our analysis shows these enzymes are restricted to Proteobacteria (α, β and γ classes) (Figure 3 and Supplementary Figure S2) and are present in soil environments characterised by logarithmic variations in pH2 (Figure 4 and Supplementary Figure S7). We identified a sister lineage, the group 2c [NiFe]-hydrogenases, that appear to be co-transcribed with diguanylate cyclases/phosphodiesterases (Figure 2). Through modulation of cyclic di-GMP production, we hypothesise these enzymes regulate global cellular functions during adaptation to H2-rich vs H2-deprived environments; currently, however, H2-dependent signal cascades have only been shown to regulate the expression of other hydrogenases. Our genome and metagenome surveys suggest these enzymes are rare (Figures 3 and 4), and are primarily found in methane-oxidising bacteria and sulphate-reducing bacteria that inhabit aquatic environments (Supplementary Figure S2). The presence of helix-turn-helix protein-encoding genes immediately downstream of group 2 [NiFe]-hydrogenase genes in some Aquificae and Crenarchaeota (Supplementary Figure S2) is also suggestive of a regulatory role and requires further study.

Looking more widely, it is probable that the group C [FeFe]-hydrogenases of anaerobic bacteria have a sensory role. Our conserved domain analysis suggests these enzymes are co-expressed with, or fused to, putative regulatory components, namely serine/threonine phosphatases, histidine kinases, AAA+-type transcriptional activators, methyl-accepting chemotaxis proteins and again diguanylate cyclases/phosphodiesterases (Figure 2). As with the group 2b and 2c [NiFe]-hydrogenases, the operons encoding these hydrogenases encode predicted PAS domains (Figure 2) that likely transduce the signal of hydrogenase activity to downstream components via a redox-active heme. There is some transcriptional evidence that the putative phosphatase-linked sensory hydrogenases of Thermoanaerobacterium saccharolyticum (Shaw et al., 2009) and Ruminococcus albus (Zheng et al., 2014) regulate the transcription of group A [FeFe]-hydrogenases, but it has yet to be biochemically confirmed that these enzymes have a regulatory role. Other sensory hydrogenases may regulate wider cellular functions (for example, motility) in response to changes of pH2 in anoxic environments. Group C [FeFe]-hydrogenases are abundant in strictly anaerobic bacteria of the phyla Firmicutes, Bacteroidetes, Spirochaetes and Thermotogae. These hydrogenases are highly abundant in termite guts and strongly associated with group A [FeFe]-hydrogenases (Supplementary Figure S7).

Conclusions

The surveys reported here suggest that hydrogenases are highly diverse, ancient and widespread. Our work collectively supports the hypothesis that H2 serves as a widely utilised energy source for microbial growth and survival. Before this study, it was already well established that H2 metabolism played major roles in certain specific microorganisms and ecosystems (Schwartz et al., 2013). However, by comprehensively surveying the distribution of hydrogenases, we have provided evidence that microbial H2 metabolism is significantly more extensive and elaborate than previously anticipated. Integrating analysis of primary phylogeny, genetic organisation and metal-binding motifs, we demonstrate that hydrogenases have evolved into numerous functionally distinct subgroups/subtypes. This diversification has enabled the primordial process of H2 metabolism to sustain roles in most major phyla and ecosystems. We showed that some 51 bacterial and archaeal phyla have the genetic capacity to oxidise or evolve H2, vastly more than the 13 phyla experimentally shown to metabolise H2 (Schwartz et al., 2013; Greening et al., 2015a), and emphasised through metagenome analysis that microbial H2 metabolism is likely to be highly important in both oxic and anoxic environments. Our analysis has emphasised that the evolution and distribution of hydrogenases are particularly influenced by pO2; other factors such as pH2, pH, temperature and metal ion availability are also likely to be profoundly significant (Schwartz et al., 2013; Greening and Cook, 2014). Nevertheless, experimental studies are required to gain a deeper understanding of the ecological significance of H2 oxidation and evolution.

In light of this work, there are now numerous new avenues to investigate microbial hydrogen metabolism at the microscopic and macroscopic levels: What are the functions of the multiple newly defined types of [NiFe]-hydrogenases (groups 1e, 1g, 2c, 2d, 4f) and [FeFe]-hydrogenases (groups A2, B, C)? Why are hydrogenases found in the genomes of microorganisms as diverse as Acidobacteria, Chlorobi, Crenarchaeota and Bacteroidetes? What environmental and physiological signals regulate the genetic determinants of hydrogen metabolism? How does microbial H2 metabolism influence anthropogenic ecosystems (for example, wastewater treatment) and how can the reported diversity of hydrogenases be exploited for bioremediation, biofuel production and fuel cell development? How does microbial H2 metabolism influence community structuring and biogeochemical cycling in soil and aquatic environments? Some of these research questions will be addressed as we further investigate the physiological roles of the hydrogenases described here and the influence of H2 metabolism in different ecosystems.