Introduction

The DsrAB-type dissimilatory (bi)sulfite reductase is a key microbial enzyme in both the reductive and the oxidative steps of the biogeochemical sulfur cycle. Utilized by microorganisms that catalyze redox reactions involving sulfur-containing compounds as components of energy metabolism, it catalyzes the reduction of sulfite to sulfide during anaerobic respiration with sulfate, sulfite or organosulfonates as terminal electron acceptor, and functions in reverse during sulfide oxidation. DsrAB enzymes are heterotetramer proteins with an α2β2 structure and possess iron-sulfur clusters and siroheme prosthetic groups (Dahl et al., 1993). The α and β subunits are encoded by the paralogous genes dsrA and dsrB, respectively, which are organized in a single copy operon with dsrA preceding dsrB (Dahl et al., 1993; Karkhoff-Schweizer et al., 1995; Wagner et al., 1998; Larsen et al., 1999; Pereira et al., 2011). Given the presumed antiquity of siroheme and the proposed existence of DsrAB before the separation of the domains Bacteria and Archaea (Wagner et al., 1998; Dhillon et al., 2005; Loy et al., 2009), DsrAB enzymes are considered very ancient and might have had a fundamental role in mediating biological conversions of sulfur compounds by some of the first microorganisms in the anoxic, reduced atmosphere environments of the primordial Earth (Wagner et al., 1998; Canfield and Raiswell, 1999; Huston and Logan, 2004). It is now recognized that the distribution of dsrAB among extant microorganisms was driven by a combination of divergence through speciation, functional diversification and lateral gene transfer (LGT) between unrelated taxa (Loy et al., 2008b).

DsrAB enzymes are best known from sulfate-reducing microorganisms (SRMs) because of their global relevance for biogeochemical cycling of sulfur and carbon (Pester et al., 2012; Bowles et al., 2014). DsrAB catalyzes the last and main energy-conserving step in the dissimilatory sulfate reduction pathway that is conserved in all cultivated SRM, which are distributed in four bacterial (Proteobacteria—class Deltaproteobacteria, Nitrospirae, Firmicutes, Thermodesulfobacteria) and two archaeal phyla (Euryarchaeota, Crenarchaeota). The canonical pathway essentially consists of the enzymes ATP sulfurylase (Sat), adenosine 5′-phosphosulfate reductase (Apr) and dissimilatory (bi)sulfite reductase (Dsr). However, a new, yet unresolved pathway for sulfate reduction was suggested to operate in a syntrophic microbial consortium that mediated the anaerobic oxidation of methane coupled to sulfate reduction and polysulfide disproportionation (Milucka et al., 2012). Surprisingly, sulfate was reduced by an unknown mechanism in the archaeal partner resulting in the formation of disulfide and not by the deltaproteobacterial partner that harbors the canonical DsrAB-based pathway.

DsrAB genes are also present in some microorganisms that are unable to use sulfate as a terminal electron acceptor including sulfite-reducing microorganisms (e.g., Desulfitobacterium, Desulfitibacter and Pyrobaculum) (Simon and Kroneck, 2013), sulfur-disproportionating bacteria (e.g., Desulfocapsa sulfexigens) (Finster, 2008) and in organisms that metabolize organosulfonates to internally produce sulfite for respiration (e.g., the taurine-consuming gut bacterium Bilophila wadsworthia) (Devkota et al., 2012). The physiological role of DsrAB in anaerobic syntrophs of the spore-forming Firmicutes genera Pelotomaculum and Sporotomaculum, which possess and transcribe dsrAB but are incapable of reducing sulfite, sulfate or organosulfonates (Brauman et al., 1998; Imachi et al., 2006), is unknown.

Some but not all sulfur-oxidizing bacteria (SOB) carry a reversely operating DsrAB that is homologous, yet phylogenetically clearly distinct from DsrAB enzymes that function in sulfite reduction (Schedel and Trüper, 1979; Loy et al., 2009). Unlike most SRM, SOB do not share a common sulfur metabolism pathway, but exploit various, partially redundant enzyme systems for the oxidation of a range of reduced sulfur compounds with intermediate oxidation states (Kelly et al., 1997; Kletzin et al., 2004; Friedrich et al., 2005). DsrAB is essential for the oxidation of sulfur globule repositories (Pott and Dahl, 1998; Dahl et al., 2005) and might thus provide these SOB with an advantageous backup sulfur metabolism in environments with varying concentrations of reduced sulfur compounds. Thus far, dsrAB have been detected in free-living and symbiotic sulfur-storing SOB of the phyla Proteobacteria (classes Alpha-, Beta-, Gamma- and Deltaproteobacteria) and Chlorobi (Loy et al., 2009; Swan et al., 2011; Sheik et al., 2014).

With a few significant exceptions that are indicative of LGT of dsrAB among major SRM taxa, DsrAB and 16S rRNA phylogenies are largely congruent (Klein et al., 2001; Zverlov et al., 2005; Loy et al., 2009). Consequently, dsrAB have been frequently exploited as phylogenetic marker genes in amplicon sequencing-based environmental studies (Dhillon et al., 2003; Nakagawa et al., 2004; Leloup et al., 2006; Loy et al., 2009; Moreau et al., 2010; Mori et al., 2010; Pester et al., 2010; Lenk et al., 2011). Application of the dsrAB approach (Wagner et al., 2005) in diverse environments has uncovered an extensive hidden diversity of dsrAB sequences that are not closely related to dsrAB from any recognized organisms. New sequencing techniques have opened up opportunities for large-scale α- and β-diversity studies of dsrAB. However, the currently available dsrAB sequence set is largely uncharacterized, which poses considerable problems in identifying and classifying newly obtained environmental sequences. A comprehensive classification framework for streamlined computational analyses of large dsrAB sequence data sets is thus urgently needed. A first step toward a dsrAB classification system has been made by a meta-analysis of dsrAB diversity that focused on freshwater wetland SRM (Pester et al., 2012). This study highlighted the existence of at least 10 major monophyletic lineages that were only composed of environmental sequences and similar in intralineage diversity to known SRM families. 
Furthermore, several primers targeting reductive and oxidative dsrAB types have been published and applied for PCR-based environmental monitoring of the diversity and abundance of sulfur-cycling microorganisms (Wagner et al., 1998; Kondo et al., 2004; Geets et al., 2005; Loy et al., 2009; Mori et al., 2010; Lenk et al., 2011; Steger et al., 2011; Lever et al., 2013), but it is unclear how well these primers cover the currently known dsrAB diversity and thus how suitable they are for such purposes.

In the present study, we established a comprehensive, manually aligned and curated database of nucleic acid and inferred amino-acid sequences of dsrAB that are available in public sequence repositories, and provided a robust, taxonomically and phylogenetically informed classification system for the entire environmental dsrAB diversity. This allowed us to classify and systematically quantify the uncharted dimensions of dsrAB diversity and to reveal its distribution across various environments. We further used the database to determine the in silico coverage of all published dsrA- or dsrB-targeted primers to provide guidance for future PCR-based studies.

Materials and methods

Construction of a comprehensive dsrAB/DsrAB reference database

A dsrAB/DsrAB reference database (Zverlov et al., 2005; Loy et al., 2009), implemented with the ARB software package (Ludwig et al., 2004), was updated to contain all publicly available dsrAB sequences (as of August 2013). Sequences were retrieved by manually searching the NCBI nucleotide and genome databases using appropriate keywords (e.g., ‘dsrAB’, ‘dsrA’, ‘dsrB’, ‘dissimilatory (bi) sulfite reductase’, ‘dissimilatory sulfite reductase’) and by tblastx analysis (Supplementary Materials and methods) (Camacho et al., 2009). Of more than 13 000 retrieved sequences, we retained 7695 sequences with <1% ambiguous nucleotides. This sequence assemblage comprises a core data set of 1292 sequences that fully covered the 1.9 kb region amplified by the most widely used primer variants DSR1F and DSR4R (which corresponds to 77% of the entire 2.5 kb-long dsrAB) and 6403 shorter sequences that covered at least 300 nucleotides in this region. Sequences were assigned to broad environmental categories (Supplementary Materials and methods). Alignments of nucleotide and inferred amino-acid sequences were manually corrected. The curated and annotated dsrAB/DsrAB database (Supplementary File S1) is available as an ARB database in the download section at http://www.microbial-ecology.net and is additionally provided as FASTA files of classified and environmentally annotated nucleotide (Supplementary File S2) and amino-acid sequences (Supplementary File S3).
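The two retention criteria stated above (fewer than 1% ambiguous nucleotides and at least 300 nucleotides) can be illustrated with a minimal Python sketch. It is not the procedure used to build the ARB database. The function names are hypothetical, ambiguity is assumed to mean any character other than A, C, G, T or U, and length is simplified to the ungapped sequence length rather than coverage of the DSR1F/DSR4R region in the alignment.

```python
# Hypothetical helper names; thresholds are taken from the text above.
UNAMBIGUOUS = set("ACGTU")
GAP_CHARS = set("-.")


def ungapped(seq: str) -> str:
    """Return the sequence in upper case with alignment gap characters removed."""
    return "".join(b for b in seq.upper() if b not in GAP_CHARS)


def ambiguous_fraction(seq: str) -> float:
    """Fraction of residues that are not A, C, G, T or U (assumed definition)."""
    bases = ungapped(seq)
    if not bases:
        return 1.0  # treat an empty sequence as fully ambiguous
    return sum(b not in UNAMBIGUOUS for b in bases) / len(bases)


def passes_quality_filter(seq: str, min_len: int = 300,
                          max_ambiguous: float = 0.01) -> bool:
    """Keep sequences with >= min_len nucleotides and < max_ambiguous ambiguity."""
    return len(ungapped(seq)) >= min_len and ambiguous_fraction(seq) < max_ambiguous
```

A 400-nucleotide sequence with a single N would pass this filter, whereas a 200-nucleotide sequence or one with exactly 1% ambiguous positions would not.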

Comparative sequence analyses and classification of dsrAB diversity

DsrAB phylogeny was calculated based on core data set sequences and by using alignment filters that exclude sequence regions with insertions and deletions (indel filters). Maximum-likelihood, maximum parsimony and neighbor-joining trees were used to construct consensus trees (Supplementary Materials and methods). Shorter dsrAB sequences (300 to <1590 nucleotides in the region used for treeing) were phylogenetically classified by adding each inferred amino-acid sequence separately to the consensus tree using the EPA algorithm (Berger et al., 2011) in RAxML-HPC 7.5.6 (Stamatakis, 2006).

Environmental DsrAB sequences of the core data set that were not affiliated with recognized taxonomic families were assigned into individual lineages of approximate family-level diversity (Supplementary Materials and methods). Lineages were further summarized to superclusters if two or more known families and/or uncultured DsrAB lineages formed a monophyletic cluster with a bootstrap support of >70% in at least one treeing method.
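The supercluster rule described above can be written as a simple predicate. The following Python sketch is illustrative only: the function name, the argument names and the dictionary keys for the treeing methods are assumptions, and monophyly is taken as an input rather than computed from a tree.

```python
def qualifies_as_supercluster(n_lineages: int,
                              bootstrap_support: dict,
                              is_monophyletic: bool = True,
                              threshold: float = 70.0) -> bool:
    """Apply the rule from the text: a monophyletic cluster of two or more
    families/lineages with bootstrap support above the threshold in at
    least one treeing method.

    bootstrap_support maps a method label (hypothetical keys such as
    "ML", "MP", "NJ") to the bootstrap percentage of the cluster's branch.
    """
    supported = any(value > threshold for value in bootstrap_support.values())
    return is_monophyletic and n_lineages >= 2 and supported
```

For example, a monophyletic cluster of three lineages with support values of 65%, 72% and 50% would qualify, because one method exceeds 70%.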

Indications for LGT were obtained using a phylogenetic approach (Klein et al., 2001; Koonin et al., 2001). DsrAB and 16S rRNA consensus trees were manually compared for topological inconsistencies under the assumption that 16S rRNA genes were not subject to LGT and thus are markers for inferring the phylogeny of the analyzed species.

In silico coverage and specificity of dsrA- and dsrB-targeted primers

To obtain comparable coverage values and to avoid basing coverage estimates of primers on sequences that were obtained with the very same primers, we used a data set comprised of 177 full-length dsrAB sequences (the majority of which derive from genomes; 115 reductive and 62 oxidative bacterial-type dsrAB) for the evaluation of primers that bind at the (r)DSR1F or (r)DSR4R primer target region, and primer pairs that use at least one such primer. To test primers that target the region amplified by the (r)DSR1F/(r)DSR4R primer pair, we additionally used 1110 reductive- and 159 oxidative bacterial-type dsrAB sequences of the core data set that completely cover this region (Supplementary Tables S1–S4). Primer coverage was determined with the ARB Probe Match tool using perfect match and one weighted mismatch (standard base-pairing and positional weight settings in ARB). Target positions of primers are numbered relative to the Desulfovibrio vulgaris Hildenborough DSM 644 dsrAB sequence (NC_002937, 449 888…452 365) for reductive bacterial-type dsrAB sequences and the Allochromatium vinosum DSM 180 dsrAB sequence (NC_013851, 1 439 735…1 442 113) for oxidative bacterial-type dsrAB sequences.
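The notion of in silico coverage used here can be sketched in Python as follows. This is a simplified stand-in for the ARB Probe Match tool, not a reimplementation: mismatches are counted without ARB's base-pairing and positional weights, the primer is assumed to be given in the same orientation as the target sequence, and the function names and toy sequences are hypothetical.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def min_mismatches(primer: str, target: str):
    """Smallest number of unweighted mismatches over all primer-length
    windows of the target; None if the target is shorter than the primer."""
    primer, target = primer.upper(), target.upper()
    best = None
    for start in range(len(target) - len(primer) + 1):
        window = target[start:start + len(primer)]
        mismatches = sum(base not in IUPAC.get(code, "")
                         for code, base in zip(primer, window))
        if best is None or mismatches < best:
            best = mismatches
    return best


def coverage(primer_mix, targets, max_mismatches: int = 0) -> float:
    """Fraction of targets matched by at least one primer of the mix
    with no more than max_mismatches mismatches."""
    if not targets:
        return 0.0
    hits = 0
    for target in targets:
        counts = [min_mismatches(p, target) for p in primer_mix]
        counts = [c for c in counts if c is not None]
        if counts and min(counts) <= max_mismatches:
            hits += 1
    return hits / len(targets)


# Toy data (not real dsrAB primers or sequences).
toy_primer_mix = ["ACRTGY"]
toy_targets = ["GGACATGCGG", "GGACGTGTGG", "GGACCTGTGG", "GGATCTGTGG"]
perfect = coverage(toy_primer_mix, toy_targets, max_mismatches=0)
one_mismatch = coverage(toy_primer_mix, toy_targets, max_mismatches=1)
```

Reporting both values mirrors the two coverage figures used in this study: perfect match, and match with up to one mismatch.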

Results and discussion

The DsrAB consensus tree provides a robust phylogenetic framework for environmental studies

For a dsrAB census, we created a comprehensive database of 7695 sequences of at least 300 nucleotides in length and sufficient quality that derived from 530 amplicon sequencing, metagenome or genome studies. For more reliable phylogenetic inferences, we constructed a DsrAB consensus tree using a core data set of 1292 sequences that fully cover the 1.9 kb region amplified by the DSR1F and DSR4R primer variants (Figure 1 and Supplementary Figures S1–S3). The DsrAB tree has four main basal branches, three of which delineate the major DsrAB protein families, namely the reductive bacterial type, the oxidative bacterial type and the reductive archaeal type. The fourth branch is so far only represented by the second dsrAB copy of Moorella thermoacetica (Pierce et al., 2008; Loy et al., 2009). Through paralogous rooting analysis, we show that this copy and the reductive archaeal-type DsrAB family represent the deepest branches in the DsrAB tree and add support to the previously proposed early evolution of DsrAB as a reductive enzyme (Wagner et al., 1998) (Supplementary Results and discussion and Supplementary Figure S4).

Figure 1

Consensus phylogeny of DsrAB sequences. Trees for reconstruction of the consensus tree (extended majority rule) were calculated using an alignment of 911 representative DsrAB sequences (clustered at 97% amino-acid identity) and an indel filter covering 530 amino-acid positions between the target sites of the most commonly used DSR1F and DSR4R primer variants. Remaining core sequences (n=381) of the clusters were subsequently added to the consensus tree without changing its topology. Scale bar indicates 10% sequence divergence. Bootstrap support (100 resamplings) is shown by split circles (top: maximum parsimony; bottom left: maximum likelihood; bottom right: neighbor joining) at the respective branches, with black, gray and white/absence indicating ≥90%, 70%–90% and <70% support, respectively. DsrAB-carrying phyla are labeled in different background colors; gray background represents lineages with no closely related cultured representatives. Black arrows indicate the possible locations for the root of the tree according to paralogous rooting analysis. LA-dsrAB, laterally acquired dsrAB. Moorella thermoacetica dsrAB copy 1 clustered with the LA-dsrAB Firmicutes group.

To assess the minimum number of dsrAB-containing species that are currently represented in the 1292 sequence core data set, we initially inferred a species-level sequence identity cutoff from the linear regression in a plot of corresponding pairs of 16S rRNA gene and non-laterally acquired reductive- and oxidative bacterial-type dsrAB identities (Supplementary Figure S5) (Kjeldsen et al., 2007). A dsrAB nucleic acid sequence identity of 92% over the 1.9 kb fragment corresponds to a 16S rRNA sequence identity of 99%, which is a frequently used threshold for delineating species-level OTUs (Stackebrandt and Ebers, 2006). We recommend using a more conservative threshold of 90% dsrAB sequence identity, because two organisms with <90% dsrAB identity generally have <99% 16S rRNA identity (Supplementary Figure S5) and will likely represent two different species, given that dsrAB is usually present as a single copy per genome. Application of the 90% threshold showed that the core data set alone already represents a minimum of 779 species-level OTUs, of which 647 are of the reductive and 118 of the oxidative bacterial DsrAB type. For comparison, 240 species of SRM are currently present in the List of Bacterial Names with Standing in Nomenclature (Euzéby, 1997).
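The way a dsrAB identity cutoff can be read off a regression line is sketched below in Python. The identity pairs are made-up toy values, chosen to lie on a straight line that reproduces the 99% to 92% correspondence stated above; they are not the data of Supplementary Figure S5, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def dsrab_cutoff(identity_16s, identity_dsrab, species_16s_threshold=99.0):
    """dsrAB identity predicted by the regression at the 16S rRNA threshold."""
    slope, intercept = fit_line(identity_16s, identity_dsrab)
    return slope * species_16s_threshold + intercept


# Toy identity pairs in percent (illustrative only, not measured values).
toy_16s = [95.0, 97.0, 99.0, 100.0]
toy_dsrab = [60.0, 76.0, 92.0, 100.0]
cutoff = dsrab_cutoff(toy_16s, toy_dsrab)
```

With real data the points scatter around the line, which is one reason for preferring a threshold somewhat below the regression estimate, as recommended above.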

The reductive bacterial-type DsrAB family cluster mostly contains sequences from bacteria that use sulfate, sulfite or organosulfonates as terminal electron acceptors (Loy et al., 2008b), and also sequences from sulfate/sulfite-reducing archaea that received dsrAB via LGT from ancestral bacterial donors (see section below). Two hundred and ninety-nine environmental sequences of the core data set were not affiliated with members of described taxonomic families and clustered in 13 stable, monophyletic lineages, which were designated ‘uncultured DsrAB lineages 1 to 13’ (note that lineages 1 to 10 were defined previously; Pester et al., 2012) using a combination of dsrAB sequence identity-based and phylogenetic criteria (Figure 1, Supplementary Figure S1 and Supplementary Materials and methods). Each of these 13 lineages could represent a taxonomic family whose members are yet uncultured or not known to possess dsrAB, illustrating the enormous unexplored diversity of dsrAB-harboring microorganisms in the environment. Our phylogenetic analysis even provided indications for further lineages of environmental dsrAB sequences (Figure 1), but these did not meet our conservative criteria to label them as an ‘uncultured family-level DsrAB lineage’. Importantly, only very few sequences (n=4) of the uncultured family-level lineages contain internal stop codons, which are not confirmed and might result from sequencing errors. Furthermore, nonsynonymous/synonymous substitution rate ratios of the branches that lead to the 13 uncultured family-level lineages are clearly below one (ω=0.05–0.37), which indicates strong purifying selection and suggests that these dsrAB variants are being expressed as functionally active proteins (Yang, 1997; Yang et al., 2000).
Although a very recent loss of function will not be evident in the DsrAB sequence record, it is nevertheless unlikely that this vast environmental dsrAB diversity is primarily caused by uncontrolled mutation rates owing to the lack of or reduced selective pressure, for example, in viruses (Anantharaman et al., 2014) or microorganisms that received dsrAB via LGT yet do not make use of them.
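For readers less familiar with the notation, ω denotes the ratio of the nonsynonymous to the synonymous substitution rate. Its standard interpretation, given here as general background rather than as a result of this study, is:

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive (diversifying) selection}
\end{cases}
```

The range of ω=0.05–0.37 reported for the uncultured family-level lineages falls in the first case.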

At a higher phylogenetic level, we could reproduce three previously described ‘superclusters’ (Pester et al., 2012), namely the Deltaproteobacteria supercluster, the Nitrospirae supercluster, which was previously named Thermodesulfovibrio supercluster (Supplementary Results and discussion), and the environmental supercluster 1, which each comprise at least two uncultured DsrAB family-level lineages and/or known SRM families (Figure 1). DsrAB of the euryarchaeal genus Archaeoglobus and related sequences from thermophilic environments form a separate branch in the reductive bacterial-type DsrAB family tree. All remaining sequences, namely those that are not affiliated with the three superclusters and the Archaeoglobus cluster, did not group consistently at a higher phylogenetic level (Steger et al., 2011; Pester et al., 2012), and we have thus not designated them as a supercluster but as the Firmicutes group sensu lato. These high-order groups/superclusters are named after the main phylum/class that they affiliate with but do not necessarily imply a taxonomic affiliation. Similar to the Deltaproteobacteria supercluster, the highly diverse Firmicutes group contains dsrAB from cultivated members of different phyla and many environmental sequences (Supplementary Results and discussion).

Oxidative-type DsrAB sequences from SOB form a monophyletic enzyme family that is phylogenetically distinct from all other DsrAB sequences (Figure 1 and Supplementary Figure S4). The branching pattern of the tree suggests that oxidative DsrAB evolved by an ancient functional adaptation from an ancestral reductive DsrAB before the diversification into extant DsrAB-carrying phyla. Sequences from known SOB of the classes Alpha-, Beta- and Gammaproteobacteria and the phylum Chlorobi form separate clusters in the DsrAB tree that are generally in accordance with the organismal taxonomy (Figure 1, Supplementary Figure S2 and Supplementary Materials and methods). Only Thioalkalivibrio nitratireducens branches outside the Gammaproteobacteria cluster. Its DsrAB sequence is remarkably different (67–71% amino-acid identity) from the DsrAB of three other Thioalkalivibrio species (as opposed to 87–95% DsrAB identity among these three species). Metagenomic (Sheik et al., 2014) and single-cell genome (Swan et al., 2011) analyses have recently identified reverse dsrAB and accessory genes for sulfur oxidation in members of the deltaproteobacterial SAR324 clade. These deltaproteobacterial reverse DsrAB branch with DsrAB of Chlorobi and Magnetococcus marinus, a species that has been provisionally included in the Alphaproteobacteria (Bazylinski et al., 2013). Interestingly, the root of the oxidative-type DsrAB branch is not located between the Proteobacteria and the Chlorobi. Instead, Chlorobi form a monophyletic cluster with M. marinus and the putative sulfur-oxidizing deltaproteobacterium (Figure 1), which provides phylogenetic support for the acquisition of dsrAB by Chlorobi via LGT from a sulfide-oxidizing proteobacterial donor. Such a scenario has been previously postulated based on the absence of dsrAB in the deep-branching Chlorobi member Chloroherpeton thalassium (Frigaard and Bryant, 2008).

Archaeal-type dsrAB sequence diversity is mainly represented by sequenced genomes and metagenomes because PCR primers commonly used for amplification of dsrAB do not bind to archaeal-type dsrAB. So far, three genera within the hyperthermophilic family Thermoproteaceae (order Thermoproteales) of the phylum Crenarchaeota, namely species of Pyrobaculum (n=7), Vulcanisaeta (n=2) and Caldivirga (n=1), are known to harbor this type of dsrAB, and each genus is represented by a distinct monophyletic group in the archaeal DsrAB tree (Figure 1 and Supplementary Figure S3). Members of all three genera of archaeal-type DsrAB-carrying organisms are able to reduce thiosulfate and elemental sulfur (Molitor et al., 1998; Itoh et al., 1999, 2002). So far, sulfate reduction has been shown only for Caldivirga maquilingensis (Itoh et al., 1999); however, Vulcanisaeta species might also be capable of sulfate reduction (Itoh et al., 2002), as genes for the complete canonical sulfate reduction pathway are present in the genomes of Vulcanisaeta distributa (Mavromatis et al., 2010) and V. moutnovskia (Gumerov et al., 2011).

dsrAB are robust phylogenetic marker genes for sulfur compound-dissimilating microorganisms

Phylogenetic signal is blurred in genes that are subject to (i) frequent LGT between unrelated organisms and (ii) duplication and subsequent functional diversification (Koonin et al., 2001; Gogarten et al., 2002). The identification of dsrAB in members of bacterial (Actinobacteria, Caldiserica, oxidative-type dsrAB in Deltaproteobacteria) and archaeal (Aigarchaeota; formerly known as pSL4 or hot water crenarchaeotic group I candidate division (Nunoura et al., 2011); note that it is under debate whether Aigarchaeota members represent their own phylum or belong to the Thaumarchaeota (Brochier-Armanet et al., 2011)) phyla previously not known to harbor these genes necessitates the re-evaluation of dsrAB as phylogenetic markers. Using an established phylogenetic approach (Zverlov et al., 2005; Loy et al., 2009), we directly compared consensus trees of DsrAB and 16S rRNA sequences originating from 254 pure cultures and genome sequences for topological incongruences that are indicative of LGT. Owing to the apparent functional adaptation of DsrAB, this analysis was carried out separately for organisms using the reductive (Figure 2) and the oxidative sulfur energy metabolism (Figure 3). Our analysis confirms that reductive-type DsrAB and 16S rRNA branching patterns are generally similar and reproduces known topological inconsistencies regarding (i) a group of Firmicutes that most likely acquired dsrAB from deltaproteobacterial ancestors of the Desulfatiglans anilini (formerly Desulfobacterium anilini; Suzuki et al., 2014) lineage (Figure 2) (Klein et al., 2001; Zverlov et al., 2005), (ii) members of the phylum Thermodesulfobacteria and (iii) members of the euryarchaeotal genus Archaeoglobus that possess bacterial-type dsrAB. Besides these documented cases, we have obtained evidence for further possible dsrAB LGT events (Supplementary Results and discussion and Supplementary Figure S6).
The Aigarchaeota member clearly has a reductive-type dsrAB that was received either directly by LGT from a bacterial donor, possibly a member of the phylum Firmicutes, or indirectly from a yet unknown, bacterial dsrAB-containing archaeon (Figures 1 and 2). The presence of bacterial dsrAB in the Aigarchaeota member and members of the genus Archaeoglobus seems to be the result of at least two independent LGT events. The stable monophyletic grouping of the actinobacterium Gordonibacter pamelaeae with the Firmicutes genera Desulfosporosinus and Desulfitobacterium in the DsrAB tree (Figures 1 and 2) suggests LGT from an unknown donor. Although the deep, independent position of the Caldiserica phylum member in both the DsrAB tree and the 16S rRNA tree is inconclusive regarding LGT, complementary analyses also indicate a foreign origin of its dsrAB (Supplementary Results and discussion and Supplementary Figure S6). These results provide a first view into the possible evolutionary paths that led to the presence of a reductive bacterial-type dsrAB in the bacterial phyla Actinobacteria and Caldiserica, and the archaeal candidate phylum Aigarchaeota, but in-depth insights can only be obtained when more dsrAB sequences from members of these phyla are available.

Figure 2

Comparison of 16S rRNA and reductive DsrAB trees. The strict consensus trees are based on corresponding sequence pairs of 16S rRNA and reductive DsrAB from 254 pure cultures and genomes. 16S rRNA and DsrAB trees were calculated using a 50% conservation filter for bacteria (1222 nucleotide positions) and an indel filter for reductive-type DsrAB (530 amino-acid positions), respectively. Scale bars indicate 10% sequence divergence. Both trees are collapsed at the family, genus or (in case of Desulfotomaculum) subcluster level. Sequences that branch inconsistently between the trees are marked with an asterisk. Bootstrap support (100 resamplings) is shown by split circles (top: maximum parsimony; bottom left: maximum likelihood; bottom right: neighbor joining) at the respective branches, with black, gray and white/absence indicating ≥90%, 70%–90% and <70% support, respectively.

Figure 3

Comparison of 16S rRNA and oxidative DsrAB trees. The strict consensus trees are based on corresponding sequence pairs of 16S rRNA and oxidative DsrAB from 51 pure cultures and genomes. 16S rRNA and DsrAB trees were calculated using a 50% conservation filter for bacteria (1222 nucleotide positions) and an indel filter for oxidative-type DsrAB (552 amino-acid positions), respectively. Scale bars indicate 10% sequence divergence. Sequences that branch inconsistently between the trees are marked with an asterisk. Bootstrap support (100 resamplings) is shown by split circles (top: maximum parsimony; bottom left: maximum likelihood; bottom right: neighbor joining) at the respective branches, with black, gray and white/absence indicating ≥90%, 70%–90% and <70% support, respectively.

Based on larger sequence data sets, we confirm that branching patterns of DsrAB and 16S rRNA trees of SOB are largely congruent (Loy et al., 2009), with one exception (Figure 3). In the DsrAB tree, T. nitratireducens is not related to three other species of the genus Thioalkalivibrio, but branches independently from other Proteobacteria. One possible explanation for the phylogenetic position of DsrAB of T. nitratireducens is xenologous replacement of its orthologous dsrAB with dsrAB from an unknown and unrelated proteobacterial donor.

A robust DsrAB consensus tree and knowing the discrepancies in 16S rRNA and DsrAB-based phylogenies of described taxa are important for a phylogenetically well-informed interpretation of dsrAB diversity in environmental samples. The detection of reverse dsrAB in a metagenome bin (Sheik et al., 2014) and single-cell genomes (Swan et al., 2011) of the deltaproteobacterial SAR324 clade, whose sulfide-oxidizing members are related to deltaproteobacterial SRM in the 16S rRNA tree, illustrates that inferring general physiological traits such as sulfate/sulfite reduction or sulfur oxidation from 16S rRNA phylogeny can be problematic. In contrast, DsrAB phylogeny clearly distinguishes oxidative versus reductive sulfur metabolism. Despite some limitations, dsrAB also remain useful phylogenetic markers because an environmental dsrAB sequence is identified with high certainty as a member of a recognized taxon if it clusters unambiguously within this taxon. In contrast, the taxonomic identity of an organism represented by an environmental dsrAB sequence that branches outside a recognized taxon, such as members of the 13 uncultured family-level DsrAB lineages, is uncertain.

Environmental distribution of dsrAB-carrying organisms

We further examined the environmental distribution of the 1292 core dsrAB sequences and of 6403 partial dsrA or dsrB sequences with a minimum length of 300 nucleotides. Owing to the many non-overlapping sequences, partial dsrA and dsrB sequences could not be jointly clustered into sequence identity-based species-level OTUs. Instead, they were phylogenetically placed into the consensus tree (Figure 1) without changing its topology (Figure 4). The majority of the 6403 shorter sequences is affiliated with described families and uncultured family-level lineages (n=5893; 92%) or unclassified environmental sequences of the core data set (n=409; 6%) (Figure 1). Only few sequences (n=101; 2%) do not branch within sequence clusters defined by the core data set. A large proportion (n=2349; 35%) of the 6686 sequences on the reductive bacterial DsrAB branch are not affiliated with known taxa (i.e., families, genera) that are represented by cultured organisms. For example, uncultured family-level lineage 9 (n=559) contains a similar number of sequences as the family Desulfovibrionaceae (n=531) that harbors many, taxonomically well-described Desulfovibrio species (Loy et al., 2002; Muyzer and Stams, 2008).

Figure 4

Environmental distribution of dsrAB diversity. Environmental classification of 7695 dsrAB sequences from 530 amplicon sequencing, metagenome or genome studies. Numbers within parentheses indicate the number of sequences/number of studies per lineage. Unclassified environmental sequences (n=594) are only shown as part of DsrAB types/superclusters.

We additionally grouped the 7695 dsrAB sequences into eight broad environmental categories (i.e., marine, estuarine, freshwater, soil, industrial, thermophilic, alkali-/halophilic and symbiotic) (Supplementary Materials and methods) to gain insights into the environmental distribution patterns of members of major phylogenetic DsrAB lineages. Most sequences derive from marine environments (31%), followed by freshwater (24%), industrial (16%) and soil environments (11%). Members of most major DsrAB lineages are widely distributed among various environments with starkly contrasting biogeochemical properties, which provides limited indications of the possible ecological factors that gave rise to evolution of the many, phylogenetically distinct lineages at the approximate taxonomic rank of families (Figure 4). However, there are notable exceptions that are indicative of environmental preference. Members of the uncultured family-level lineages 2, 3 and 4 are almost exclusively found in marine environments. Not surprisingly, sequences affiliated with the deltaproteobacterial families Desulfohalobiaceae and Desulfonatronumaceae, which include known halophilic and alkaliphilic SRM (Ollivier et al., 1991; Pikuta et al., 2003; Jakobsen et al., 2006; Sorokin et al., 2008), derive predominately from high-salt and/or high-pH environments. Oxidative bacterial-type dsrAB sequences of Chlorobi are most often detected in freshwater habitats. This is, however, possibly a biased representation, as two studies of freshwater environments have provided 94% of all available Chlorobi dsrAB sequences. Microorganisms with archaeal-type dsrAB seem to be restricted to hot environments, but this also needs to be interpreted with caution, because of the low number of available environmental sequences from this DsrAB enzyme family.

Analogous to marker genes for other functional guilds (Mussmann et al., 2011), detection of reductive and oxidative dsrAB or their transcripts in an environmental sample should not be mistaken for the actual physiological capability for dissimilatory sulfate/sulfite reduction and sulfur oxidation, respectively (Pester et al., 2012). In addition to the presence of dsrAB in syntrophic bacteria, which are apparently incapable of using sulfate, sulfite or organosulfonates, environmental fragments of dsrAB might derive from viruses or virus-like particles that infect SRM (Rapp and Wall, 1987; Walker et al., 2006; Stanton, 2007) or SOB (Anantharaman et al., 2014), and thus possibly serve as vectors for LGT or supplement the sulfur metabolism of their microbial hosts. Although DsrAB has thus far been shown to function exclusively in dissimilation, it is conceivable that some organisms use it for detoxification of sulfite (Johnson and Mukhopadhyay, 2005; Lukat et al., 2008).

A list of in silico-evaluated primers allows selection of best primer combinations for future environmental dsrAB surveys

We evaluated the in silico coverage (i.e., the fraction of sequences in the target group that is matched by the primer) of 103 published, individual dsrAB-targeted primers and primer mixtures and 28 primer pairs against the updated dsrAB sequence database (Supplementary Results and discussion and Supplementary Tables S1–S4).
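The coverage definition used here can be illustrated with a minimal sketch. It is not the procedure used for the evaluation: it assumes ungapped matching of a degenerate primer (IUPAC codes) against uppercase, unambiguous target sequences, and it counts every mismatch equally, whereas the coverage values reported in this study rely on weighted mismatches. The function names, the toy primer and the toy target sequences are hypothetical and were invented for illustration only. The primer is compared with the strand as written, so a reverse primer would first have to be reverse-complemented.

```python
# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(1 for p, s in zip(primer, site) if s not in IUPAC[p])


def best_mismatch(primer, sequence):
    """Return the lowest mismatch count over all ungapped windows.

    Returns None if the sequence is shorter than the primer.
    """
    k = len(primer)
    if len(sequence) < k:
        return None
    return min(
        count_mismatches(primer, sequence[i:i + k])
        for i in range(len(sequence) - k + 1)
    )


def coverage(primer, targets, max_mismatches=0):
    """Fraction of target sequences matched with at most max_mismatches.

    Assumption: mismatches are unweighted (each counts as 1).
    """
    if not targets:
        return 0.0
    hits = 0
    for seq in targets:
        mm = best_mismatch(primer, seq)
        if mm is not None and mm <= max_mismatches:
            hits += 1
    return hits / len(targets)


# Toy data, invented for illustration (not real dsrAB primers or sequences).
toy_primer = "ACSCAYTGG"
toy_targets = [
    "TTACGCACTGGAA",  # perfect match (S = G, Y = C)
    "GGACCCATTGGTT",  # perfect match (S = C, Y = T)
    "CCACGCACTGCAA",  # one mismatch at the last primer position
    "AAAAAAAAAAAAA",  # no close match
]

perfect = coverage(toy_primer, toy_targets, max_mismatches=0)
relaxed = coverage(toy_primer, toy_targets, max_mismatches=1)
print(f"perfect-match coverage: {perfect:.2f}")
print(f"coverage with up to one mismatch: {relaxed:.2f}")
```

In this toy set, allowing one mismatch adds the third sequence to the matched group, which mirrors the distinction between perfect-match coverage and coverage with up to one mismatch.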

Although most primers are highly specific for dsrAB sequences, only a few primers or primer mixes target 50% (coverage of perfectly matched sequences) and/or 90% (coverage of sequences with up to one weighted mismatch) of reductive- or oxidative bacterial-type dsrAB sequences (Supplementary Tables S1 and S2 and Supplementary Figure S7). The forward primers DSR1Fmix a–h, DSR1728Fmix A–E, DSR67F, dsrB F2a–i and reverse primers DSR4Rmix a–g, DSR698R, dsrB 4RSI1a–f have the highest coverage values for reductive bacterial-type dsrAB and are recommended for future use. Oxidative bacterial-type (reverse) dsrAB sequences are best covered by forward primers rDSR1Fmix a–c, rDSRA240F, DSR1728Fmix A–E, DSR874F, dsrB F1a–h and reverse primers rDSR4Rmix a–b, rDSRB808R, PGdsrAR and dsrB 4RSI2a–h. It is noteworthy that DSR1728Fmix A–E and dsrB F1a–h have relatively high coverage for both reductive- and oxidative bacterial-type dsrAB.

Of the 28 previously published primer pair combinations (Supplementary Tables S3 and S4), only nine have a good coverage of >75% when hits with up to one weighted mismatch are considered (Table 1) and are recommended for further use. Primer pairs DSR1Fmix a–h/DSR4Rmix a–g (1.9 kb dsrAB PCR product) and DSR1728Fmix A–E/DSR4Rmix a–g (0.4 kb dsrB PCR product) have the highest perfect-match coverage of 47% and 70%, respectively, for reductive bacterial-type dsrAB. Primer pairs rDSR1Fmix a–c/rDSR4Rmix a–b and DSR1728Fmix A–E/rDSR4Rmix a–b, which amplify the homologous regions in SOB, have even higher coverage of 97% and 90%, respectively. Separate coverage values obtained for the five major groups within the reductive bacterial-type DsrAB tree indicate that the DSR1F/DSR4R primer pair mix is biased against sequences of the Firmicutes group sensu lato and the Nitrospirae supercluster. Although new primer variants should be designed to improve in silico coverage, many environmental sequences belonging to these two groups have already been obtained by using the DSR1F/DSR4R primer variants (Kaneko et al., 2007; Martinez et al., 2007; Leloup et al., 2009; Wu et al., 2009).
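Coverage of a primer pair is necessarily bounded by the coverage of its individual primers, because a target sequence counts only if both binding sites are present in the correct orientation. The following minimal sketch illustrates this for perfect matches only; it is an illustration under simplifying assumptions (uppercase, unambiguous targets, no mismatches, no weighting) and not the method used to generate the values above. All function names, primers and sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (for example, R = A/G pairs with Y = C/T).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}


def reverse_complement(primer):
    """Reverse complement of a (possibly degenerate) primer."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


def to_regex(primer):
    """Compile a degenerate primer into a regex of character classes."""
    return re.compile("".join("[" + IUPAC[b] + "]" for b in primer))


def amplicon_length(forward, reverse, sequence):
    """Product length for a perfectly matching pair, or None.

    The reverse primer site is searched, as its reverse complement,
    downstream of the first forward primer site.
    """
    fwd = to_regex(forward).search(sequence)
    if fwd is None:
        return None
    rev = to_regex(reverse_complement(reverse)).search(sequence, fwd.end())
    if rev is None:
        return None
    return rev.end() - fwd.start()


def pair_coverage(forward, reverse, targets):
    """Fraction of targets that contain both sites in amplifiable order."""
    if not targets:
        return 0.0
    hits = sum(
        1 for seq in targets
        if amplicon_length(forward, reverse, seq) is not None
    )
    return hits / len(targets)


# Toy data, invented for illustration (not real dsrAB primers or sequences).
toy_forward = "ACGTY"
toy_reverse = "TTCAR"  # binds the site YTGAA on the strand as written
toy_targets = [
    "ACGTCAAAACTGAAGG",  # both sites present
    "GGACGTTCCTTGAACC",  # both sites present
    "ACGTCAAAAAAAAAAA",  # forward site only
    "CTGAAACGTC",        # reverse site lies upstream of the forward site
]

print(pair_coverage(toy_forward, toy_reverse, toy_targets))
```

In this toy set the forward primer alone matches all four sequences, but the pair covers only two of them, which is why pair coverage can be considerably lower than the coverage of either primer.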

Table 1 Recommended primers/primer pairs for the amplification of dsrAB

To improve coverage of the environmentally occurring dsrAB diversity by amplicon sequencing, it is therefore recommended to apply the aforementioned primer pairs at low stringency (e.g., low annealing temperature) to allow for binding of non-perfectly matching target sequences. This also promotes more uniform amplification by the different primers in a degenerate primer mixture (Higuchi et al., 1993). Unfortunately, amplification of complex environmental DNA extracts with highly degenerate primers (Supplementary Tables S1–S4) at low stringency increases the likelihood of nonspecific PCR products (Wagner et al., 2005). This is particularly a problem if degenerate primers are applied for denaturing gradient gel electrophoresis, terminal restriction fragment length polymorphism or real-time PCR analyses. Hence, PCR performance and biases must be carefully evaluated for each primer combination individually, and specificity of amplification should additionally be confirmed by sequencing of the environmental PCR product or the extracted denaturing gradient gel electrophoresis bands.

To assist researchers during the evaluation of existing and development of new dsrAB-targeted oligonucleotides, we have incorporated our database into the probeCheck webserver for straightforward in silico testing of primer specificity and coverage (Loy et al., 2008a).