Introduction

Microbial diversity, abundance and composition change along environmental gradients. Understanding the nature of these changes may help us identify a set of environmental conditions characteristic to particular groups of taxa (Feris et al., 2009; Logares et al., 2012). Ultimately, these relationships could be used to develop microbial indicators (Feris et al., 2009) and help predict microbial community responses to environmental conditions (Paerl et al., 2003; Sims et al., 2013). Communities of other organisms including macroinvertebrates and periphyton are widely and routinely used as bioindicators of localized chemical and physical conditions (Barbour et al., 1999; Hering et al., 2003; Solimini et al., 2006). Microbial communities, with their great genetic diversity and now rapid identification process, also hold promise as bioindicators. There is reason to believe that microbial community indexes or indicator taxa could be developed, as a variety of studies demonstrate that diversity and composition shift considerably along gradients of pH (Fierer and Jackson, 2006; Fierer et al., 2007; Lauber et al., 2009; Lear et al., 2009; Rousk et al., 2010; Griffiths et al., 2011), trace metal concentration (Baker and Banfield, 2003; Feris et al., 2003; Giller et al., 2009; Lami et al., 2013), salinity (Lozupone and Knight, 2007; Auguet et al., 2010) and substrate carbon-to-nitrogen ratio (Bates et al., 2011).

Because microbial composition can be affected strongly by pH, salinity and metal concentrations, we speculated that exposure to alkaline mine drainage (AlkMD) would drive important shifts in stream microbial assemblages. AlkMD results from surface coal mining, the dominant form of land cover change in Central Appalachia (Townsend et al., 2009). Effluent is produced during rock weathering of surface coal mines that contain carbonate rock strata in addition to coal layers (Palmer et al., 2010; Bernhardt and Palmer, 2011). The carbonate matrix buffers sulfuric acid produced from weathered pyrite minerals, increasing base cation (Ca2+ and Mg2+), HCO3− and SO42− concentrations in receiving waters (Rose and Cravotta, 1998; Kirby and Cravotta, 2005). AlkMD is thus characterized by increased alkalinity, ionic strength and pH and often has elevated metals that reflect parent geology (Lindberg et al., 2011; USEPA, 2011; Griffith et al., 2012). Recent regional analyses suggest that AlkMD generated from surface mines led to significant chemical and biological degradation of at least 22% of southern West Virginia rivers (Bernhardt et al., 2012).

AlkMD pollution offers an interesting contrast to the large body of literature exploring responses to acid mine drainage, as it increases salinity and trace metals but enhances alkalinity rather than reducing pH. In this study, we compared stream bacterial communities between mined and unmined catchments within the largest surface coal mine complex in Central Appalachia. We asked the following questions. (1) Does AlkMD significantly alter community composition and do these changes manifest themselves as changes in α-diversity? (2) What bacterial taxa are responsible for changes in community composition and can these taxa be used as indicators of AlkMD? (3) What insight regarding functional response can be gained by examining taxa lost and gained because of the influence of AlkMD?

We expected bacteria’s compositional response to AlkMD would mirror that observed for macroorganisms (Pond, 2010; Bernhardt et al., 2012). Specifically, compositional shifts would be due to both decreased α-diversity and decreased evenness across the AlkMD gradient. We hypothesized that many taxa found in the low solute waters of unmined sites would be absent or rare in sites downstream of surface mines and that taxa known to metabolize ions released from mining would increase downstream of mining. We anticipated that taxa with distinct metabolic repertoires could indicate sites exposed to AlkMD.

Materials and methods

Study site

Sampling sites are in Mud River, a Central Appalachian surface coal mining region that lies in West Virginia’s Lower Guyandotte watershed (Figure 1). Mud River has two forks. Upper Mud River passes through the Hobet Mine complex, the largest surface coal mine in Central Appalachia, which includes active and reclaimed mines within 40 km2 of permitted mines. Left Fork Mud River is unmined but has similar geology and low-density residential housing. Sampling locations spanned a gradient of AlkMD contamination. Sites affected by mining included 9 along Upper Mud River’s mainstem and 8 within tributaries draining mines (6 active and 2 reclaimed). The six unmined sites comprised one in Upper Mud River upstream of surface mines, one unimpaired tributary and four Left Fork Mud River sites.

Figure 1

Sample sites on Mud River and Left Fork Mud River in Boone and Lincoln Counties, West Virginia (WV). Hydrologic unit codes (HUCs) 12-050701020301 and 12-050701020104 are outlined in gray. Gray tributary streams run through mined areas, whereas black tributaries are unmined. The mainstem of Mud River and Left Fork Mud River shown in bold black from headwaters to confluence. Arrows show flow direction. Inset of US mid-Atlantic states shows Appalachian Coalfield Region as gray-shaded area with relative location of study site in WV in red (not to scale).

Water chemistry

Water chemistry and temperature at each site were measured during deployment and collection of biofilm substrates and 1 month later (December 2010 and April and May 2011). We measured in-stream conductivity and pH and analyzed samples for a suite of major and trace elements (Supplementary Table S1). Water sampling followed USGS protocols (USGS). See Lindberg et al. (2011) for details.

Stream biofilms

Biofilms were grown on substrates suspended under water near the shaded streambank at each sampling site. To minimize variability (De Beer and Stoodley, 2006; Sabater et al., 2007) and use environmentally relevant substrates, we used wood veneers cut from the same tree (Acer saccharum) and enclosed veneers in mesh aquaculture bags (Pentair Aquatic Eco-Systems, Apopka, FL, USA). Four sterilized veneers were deployed under water at each site and incubated for 4 months. Veneers were removed in April 2011; two were transported to the lab on dry ice and stored at −80 °C until DNA extraction. Two remaining veneers were used for metal analysis and carbon content.

Biofilm scraped from two veneers was oven-dried at 78 °C, homogenized, digested with trace metal grade HNO3 and heated at 80 °C. Samples were analyzed for metal content with inductively coupled plasma mass spectrometry as detailed in Lindberg et al. (2011). Remaining biofilm was used to determine ash-free dry mass via combustion at 500 °C. Then, 53% of the difference between pre- and post-combustion dry mass was calculated as carbon content (Wetzel, 1983).
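The carbon estimate described above is a one-line calculation, sketched here in Python for illustration. The function name, argument names and the example masses are hypothetical and are not taken from the study; only the 53% conversion factor comes from the text.

```python
def biofilm_carbon_g(dry_mass_g: float, ash_mass_g: float,
                     carbon_fraction: float = 0.53) -> float:
    """Estimate carbon content as a fixed fraction of ash-free dry mass.

    dry_mass_g: oven-dried mass before combustion.
    ash_mass_g: mass remaining after combustion at 500 °C.
    carbon_fraction: 0.53, the conversion factor stated in the text.
    """
    if ash_mass_g > dry_mass_g:
        raise ValueError("ash mass cannot exceed pre-combustion dry mass")
    ash_free_dry_mass = dry_mass_g - ash_mass_g
    return carbon_fraction * ash_free_dry_mass


# Hypothetical example: 2.0 g dry mass leaving 1.0 g of ash.
print(biofilm_carbon_g(2.0, 1.0))
```

Dividing the result by the veneer surface area would give carbon per unit area in the units reported in the Results.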

Biofilm community 16S rRNA gene analysis

See detailed methods in Supplementary Information. Briefly, we extracted DNA from homogenized biofilm using PowerBiofilm DNA isolation kit (MO BIO, Carlsbad, CA, USA) and amplified the 27–338 region of 16S ribosomal RNA (rRNA). Sequencing was unidirectional using Roche 454 Lib-L kit (Branford, CT, USA). Replicate PCR samples were pooled, purified and normalized before being sent to the Genome Sequencing and Analysis Core Resource at Duke University (Durham, NC, USA) for pyrosequencing with a Roche 454 Life Sciences Genome Sequencer Flex Titanium instrument.

Bacterial community analyses

The QIIME 1.6.0 software pipeline (Caporaso et al., 2010) was used for downstream sequence processing: reverse primer and chimera removal, phylotype binning, operational taxonomic unit (OTU) assignment and 10-base MID (multiplex identifier) sample grouping. USEARCH was used to filter noisy sequences, check for chimeras and pick OTUs from demultiplexed sequences (Edgar, 2010). OTUs containing <3 sequences were removed. Remaining OTUs were picked at 97% sequence similarity and identified using the RDP (Ribosomal Database Project) classifier retrained with Greengenes. The NAST algorithm was used for alignment and Greengenes (http://greengenes.secondgenome.com) supplied core representative sequences (version October 2012). Sequences were quality filtered and rarefied to 1543 sequences per sample. To select 1543 sequences for analysis, each OTU’s abundance at a site was divided by the total sequence count for that site, and then multiplied by 1543 to retain the relative abundance of that OTU out of 1543 sequences. The resulting decimals were floored and remaining sequences needed for the site to contain 1543 sequences were selected using the distribution of OTUs at each site (Beevers, 2006). One unmined site (MRUl2) had only 825 sequences. Data from this site were used in environmental data correlations, but excluded from diversity calculations. Rarefaction curves were generated for Chao1 richness (Chao, 1984), Margalef’s index, Shannon diversity, Simpson’s index and evenness.
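The scaling step described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and OTU labels are hypothetical, and the text does not specify how the remaining sequences were drawn, so the sketch assumes a random draw (with replacement) weighted by each OTU's original abundance at the site.

```python
import random


def scale_to_depth(counts: dict, depth: int = 1543, seed: int = 0) -> dict:
    """Scale one site's OTU counts to a fixed depth.

    Each count is multiplied by depth / total and floored; the shortfall
    is then filled by draws weighted by the original OTU counts
    (assumption: the fill-in method is not detailed in the text).
    """
    total = sum(counts.values())
    if total < depth:
        raise ValueError("site has fewer sequences than the target depth")
    # Integer arithmetic gives the exact floor of count * depth / total.
    scaled = {otu: (n * depth) // total for otu, n in counts.items()}
    shortfall = depth - sum(scaled.values())
    rng = random.Random(seed)
    otus = list(counts)
    weights = [counts[otu] for otu in otus]
    for otu in rng.choices(otus, weights=weights, k=shortfall):
        scaled[otu] += 1
    return scaled


# Hypothetical site with 5001 sequences across three OTUs.
site = {"otu1": 3000, "otu2": 2000, "otu3": 1}
rarefied = scale_to_depth(site)
print(sum(rarefied.values()))
```

A site with fewer sequences than the target depth raises an error here, which mirrors the exclusion of the 825-sequence site from diversity calculations.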

Multivariate analysis was guided by Anderson and Willis (2003), who advocate following these approaches: (1) an ordination (robust and unconstrained), (2) statistical testing of the hypothesis and (3) identification of taxa driving the observed patterns. We visualized differences in OTU-based community composition with nonmetric multi-dimensional scaling (NMDS) ordinations based on Bray–Curtis (Bray and Curtis, 1957) and generalized UniFrac (GUniFrac) distance matrices (item 1). GUniFrac distances measure community phylogenetic relatedness, but cover a series of distances from weighted to unweighted by adjusting the weight of the branches in the UPGMA tree (Chen et al., 2012). Alpha controls the weight on lineages with common taxa and was set to 0.5 to provide the best overall power (Chen et al., 2012). For analysis, we grouped mined sites in two different ways: a priori (mainstem mined, active valley fill and reclaimed valley fill) and post hoc (Ward’s method cluster analysis of Bray–Curtis distances separated sites into group A and group B; Supplementary Figure S2). Finally, we partitioned variation in our community distance matrix among these a priori and post hoc groupings with permutational multivariate analysis of variance (item 2) (Anderson, 2001). See detailed methods in Supplementary Information.
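For readers unfamiliar with the Bray–Curtis measure underlying the first ordination, a minimal sketch of the pairwise calculation is given below. The function name and the two example sites are hypothetical; the study's distance matrices were produced by the software cited above, not by this code.

```python
def bray_curtis(site_a: dict, site_b: dict) -> float:
    """Bray–Curtis dissimilarity between two sites' OTU count tables.

    Computed as sum(|a_i - b_i|) / sum(a_i + b_i) over all OTUs:
    0 means identical composition, 1 means no shared OTUs.
    """
    otus = set(site_a) | set(site_b)
    numerator = sum(abs(site_a.get(o, 0) - site_b.get(o, 0)) for o in otus)
    denominator = sum(site_a.values()) + sum(site_b.values())
    if denominator == 0:
        raise ValueError("both sites are empty")
    return numerator / denominator


# Hypothetical counts for two sites rarefied to the same depth.
unmined = {"otu1": 6, "otu2": 4}
mined = {"otu1": 4, "otu2": 6}
print(bray_curtis(unmined, mined))
```

Because the measure uses only OTU counts, two closely related OTUs contribute as much to the dissimilarity as two distantly related ones, which is the property that GUniFrac relaxes.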

Bacteria taxa and environmental analysis

To understand community composition and environmental variable associations, we examined correlations between NMDS ordination scores and the first two component scores derived from a principal components analysis (PCA) of transformed environmental variables. We used correlation to look for trends between relative abundance of all phyla and classes and PCA axes, pH gradient and percent of watershed area mined. Because examining linear relationships between taxonomic groups at high hierarchical levels can obscure taxa-specific patterns at lower levels, we also used generalized linear models (Quasi-Poisson regression; McCullagh and Nelder, 1989) to identify genera with positive (slope>0, P<0.10), negative (slope<0, P<0.10) and no response (P>0.10) across the gradient of area mined. Finally, we characterized taxa driving multivariate patterns using indicator species analysis (Dufrêne and Legendre, 1997; De Cáceres and Legendre, 2009; De Cáceres et al., 2010) with PC-ORD software (McCune and Mefford, 2011) (Supplementary Information).
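The indicator value at the core of indicator species analysis can be illustrated with a short sketch of the Dufrêne and Legendre (1997) statistic, which multiplies a taxon's specificity to a group by its fidelity within that group. The function name, group labels and abundances are hypothetical, and the permutation test that PC-ORD uses to assess significance is omitted.

```python
def indicator_value(abundances_by_group: dict, target: str) -> float:
    """Indicator value (0-100) of one taxon for the target site group.

    specificity: mean abundance in the target group divided by the sum
                 of mean abundances across all groups.
    fidelity:    fraction of target-group sites where the taxon occurs.
    """
    means = {g: sum(v) / len(v) for g, v in abundances_by_group.items()}
    total_of_means = sum(means.values())
    if total_of_means == 0:
        return 0.0  # taxon absent everywhere
    specificity = means[target] / total_of_means
    sites = abundances_by_group[target]
    fidelity = sum(1 for x in sites if x > 0) / len(sites)
    return 100.0 * specificity * fidelity


# Hypothetical taxon found at every unmined site and nowhere else.
taxon = {"unmined": [4, 6, 5], "group A": [0, 0, 0], "group B": [0, 0, 0]}
print(indicator_value(taxon, "unmined"))
```

A taxon scores highly only when it is both concentrated in one group and present at most sites of that group, which is why rare taxa seldom qualify as indicators.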

Predicted functional profiles

To predict functional responses to the mining gradient, we used PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States; http://picrust.github.com; Langille et al., 2013) to generate a functional profile using our 16S rRNA data. We followed the suggested methods for OTU picking with Greengenes 13.5 using Galaxy (http://huttenhower.sph.harvard.edu/galaxy/). Predicted gene family abundances were rarefied to 1078 sequences per site, analyzed at KEGG (Kyoto Encyclopedia of Genes and Genomes) Orthology group levels 2 and 3 and used in correlation analysis (Pearson’s) with percent watershed mined and the PCA axes. The mean nearest sequenced taxon index was lower (0.14±0.002) than that reported for soil communities (0.17±0.02) (Langille et al., 2013).
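The Pearson correlation used to relate predicted gene family abundances to the mining gradient can be sketched as below. The function name and the example vectors are hypothetical and do not reproduce any value from the study; significance testing is not included.

```python
import math


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical example: percent watershed mined versus the relative
# abundance (%) of one predicted functional category at five sites.
percent_mined = [0, 10, 30, 60, 90]
category_abundance = [4.4, 4.3, 4.1, 3.9, 3.6]
print(pearson_r(percent_mined, category_abundance))
```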

Results

Environmental characterization of streams

Our data set included sites that were unmined (n=5), Mud River tributaries draining active (n=6) and reclaimed (n=2) mines and sites within the mainstem of Mud River both upstream and downstream of mined tributaries (n=9). All mined sites had higher concentrations of typical AlkMD constituents (SO42−, Ca2+, Mg2+, Se and Mn) than unmined sites (Table 1 and Supplementary Figure S1), leading to a distinct chemical composition in a PCA (Figure 2). The majority of environmental variables strongly correlated with component 1 are classically associated with AlkMD, with SO42−, Ca2+ and Mg2+ all highly correlated with this first axis (Supplementary Table S3). Component 1 also strongly correlated with landcover such that the percentage of watershed mined that drained to a sampling location explained 96% of site variance (r=0.96, P<0.001). We also observed unexpected increases in non-purgeable organic carbon (r=0.79, P<0.001) and mean nitrate (r=0.72, P<0.001) along component 1, the AlkMD axis (Supplementary Table S3). Component 2 positively correlated with increasing biofilm Cd, Mn, Ni and Zn concentrations (r>0.6, P<0.01).

Table 1 Average water chemistry values from December 2010 and April and May 2011 that differed significantly between sites without AlkMD (unmined, n=6) and sites draining mines (n=17)

Figure 2

PCA of selected environmental variables. Component 1 and component 2 explain 56.8±3.8% and 16.9±2.1% variance, respectively. Sites with mining split into group A (heavily mined; symbols outlined) and group B (moderately mined; symbols not outlined).

Biofilm biomass as DNA per unit surface area ranged from 6 to 274 mg m−2 (mean=57±12.07 mg m−2) and did not differ between mined and unmined sites (log-transformed, Student’s t-test, P=0.25). Biofilm C content ranged from 10 to 127 g C m−2 (mean=42±6.17 g C m−2) and was not different between mined and unmined sites (log-transformed, Student’s t-test, P=0.24).

Sequencing and taxa identification

The sequencing run of 16S rRNA amplicons yielded 145 138 raw reads. Filtering and removing nontarget sites left 23 sites with 102 772 sequences. Maximum reads per site was 7555 with mean 3543 (s.e. 393). Final sequence clustering gave 1846 OTUs. Each sample had a mean of 391 OTUs (s.e. 9). Identification of OTUs at different taxonomic levels yielded 304 species, 298 genera, 203 families, 128 orders, 72 classes and 25 phyla. Raw sequences are available at MG-RAST (accession numbers 4498070.3–4498093.3, http://metagenomics.anl.gov/linkin.cgi?project=1572; Meyer et al., 2008).

Across all sites, Proteobacteria was the dominant phylum (66.7%), followed by Bacteroidetes (20.8%), Acidobacteria (4.7%) and Actinobacteria (2.0%). All other phyla had abundances of <1%. At the phylum level, <0.02% of reads were unclassified. The most common classes were Alphaproteobacteria (39.0%), Betaproteobacteria (19.3%) and Sphingobacteria (12.5%). At the class level, 2% of reads were unclassified. The most abundant genera were Flavobacterium (6.7%) and Novosphingobium (4.9%). Of the OTUs, 1.1% were assigned to known species.

Overall bacterial community structure

We determined compositional differences among the following site categories: unmined, within valley filled tributaries and in Mud River’s mainstem downstream of valley filled tributaries using Bray–Curtis distance and GUniFrac NMDS (Table 2). With the Bray–Curtis distance matrix, mined sites separated into two groups using hierarchical cluster analysis (Ward’s method), leading us to reclassify mined sites into two post hoc groupings: group A and group B (Figure 3a). Community composition of these two groups differed significantly (permutational multivariate analysis of variance, F2, 19=4.61, P<0.001, Table 2). Group A was characterized by higher concentrations of biofilm Ca, Cd, Mn, Ni, Sr, Th and Zn, and water column Ca, Ni, Se, SO42− and TN (Student’s t-test, P≤0.05; Figure 4). Group A sites occurred in stream reaches draining watersheds with 25–96% of watershed area occupied by mines, whereas sites in group B had 16–51% of their watershed mined (Student’s t-test, P=0.03).

Table 2 Overall and pair-wise comparisons of Mud River bacteria community composition analyzed with perMANOVA using Bray–Curtis and GUniFrac distances for a priori groups and Bray–Curtis distance for post hoc groups

Figure 3

NMDS ordination (a) using Bray–Curtis distance matrix and (b) using GUniFrac distance matrix (with α=0.5). Mined sites categorized as group A (symbols outlined) and group B (symbols not outlined) resulting from cluster analysis of Bray–Curtis distance NMDS. Distance matrices based on 16S rRNA pyrosequences. The r2 values are in parentheses. Stress: (a) 0.192 and (b) 0.190. Rarefied to 1543 sequences per site.

Figure 4

Water and biofilm chemistry variables that differ significantly between groups A and B, shown as proportion of change from average concentration in unmined sites (P≤0.05).

Based on Bray–Curtis distances, which do not incorporate phylogenetic relatedness in community differences, bacterial community composition differed significantly overall (F3, 18=1.94, P=0.002) and between sites with and without AlkMD (F3, 18=3.18, P<0.001; Figure 3a and Table 2). There were no significant differences in community composition between streams draining active and reclaimed valley fills (F1, 18=1.04, P=0.42). NMDS axes 1 and 2 had similar degrees of explanatory power (25.9% and 26.4%, respectively). Configuration stress was 0.192.

We also used GUniFrac to compare composition across sites. GUniFrac analyses include the phylogenetic relatedness of taxa. Results were similar to the Bray–Curtis analysis, although separation between sites in ordination space was less distinct (F3, 18=1.55, P=0.02; Figure 3b and Table 2). Contrasts between mined and unmined sites showed significant differences in bacterial community composition (F1, 18=2.36, P=0.002), but again we found no difference between communities in streams below reclaimed and active valley fills (F1, 18=0.72, P=0.75). Axis 1 in the GUniFrac distance NMDS ordination explained the most compositional differences between sites (31.0%), whereas axis 2 explained 21.3% and stress was 0.190.

Bacterial diversity along the mining gradient

We examined correlations between α-diversity and environmental variables. Component 2, a PCA axis capturing variation in biofilm metals, was the single strongest correlate of richness estimated by multiple α-diversity metrics. Chao1 richness, Margalef’s index and evenness negatively correlated with component 2 (Chao1: P=0.002, r=−0.63; Margalef’s index: P=0.006, r=−0.57; evenness: P=0.006, r=−0.57). Across all sites we did not observe significant correlations between percent of watershed mined and α-diversity metrics (Figure 5, all P>0.05). However, Chao1 richness estimator and Margalef’s index of only mined sites were significantly negatively correlated with the percent watershed mined (both P=0.004, r=−0.66).

Figure 5

The α-diversity (Chao1 richness and Shannon diversity index (H’) shown here) across a range of watersheds with different percentages of their area that had been mined (Observed, P=0.44; Chao1, P=0.74; and Shannon, P=0.37). The post hoc designations of sites (group A, group B and reference) are indicated in key.

We examined diversity variation among site categories using common biotic indices (Table 3). The α-diversity of bacterial OTUs did not differ between a priori designated site types using any of these indices (Kruskal–Wallis, P>0.05). However, post hoc categories did differ in Chao1 richness, Margalef’s index and evenness (Kruskal–Wallis, P=0.04, 0.04 and 0.002, respectively), all of which were lower in group A (heavily mined) than group B (moderately mined) (Table 3).
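For reference, the indices compared here can be computed from a single site's OTU counts as in the sketch below. Several choices are assumptions rather than details confirmed by the text: the bias-corrected form of Chao1 is used, evenness is taken to be Pielou's J (Shannon diversity divided by the log of observed richness), and the function name and example counts are hypothetical.

```python
import math


def alpha_diversity(counts: list) -> dict:
    """Common α-diversity indices from one site's OTU counts."""
    present = [c for c in counts if c > 0]
    s_obs = len(present)           # observed richness
    n = sum(present)               # total sequences
    singletons = sum(1 for c in present if c == 1)
    doubletons = sum(1 for c in present if c == 2)
    # Bias-corrected Chao1 (assumed form); defined even with no doubletons.
    chao1 = s_obs + singletons * (singletons - 1) / (2 * (doubletons + 1))
    margalef = (s_obs - 1) / math.log(n) if n > 1 else 0.0
    shannon = -sum((c / n) * math.log(c / n) for c in present)
    # Pielou's J (assumed evenness measure); set to 0 for a single OTU.
    evenness = shannon / math.log(s_obs) if s_obs > 1 else 0.0
    return {"observed": s_obs, "chao1": chao1, "margalef": margalef,
            "shannon": shannon, "evenness": evenness}


# Hypothetical site with four OTUs, two of them singletons.
print(alpha_diversity([5, 1, 1, 2]))
```

Chao1 and Margalef's index respond to the number of OTUs, whereas evenness responds to how equally sequences are spread among them, so the three indices can diverge across the same set of sites.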

Table 3 The α-diversity using 1543 sequences

Indicator taxa and predicted functions

We performed indicator species analysis of OTUs, orders and families to determine which taxa reliably indicated particular environmental conditions (Table 4). Comparing heavily mined, moderately mined and unmined sites, we found 174 OTU-based taxa strongly associated with one of these three groups. Most OTU indicators (n=156) were closely associated with unmined sites, only 1 was strongly associated with the moderately affected group B sites and 17 were associated with the heavily affected group A. Of the OTUs assigned to a taxon identifier, we found 20 orders (of 128 total), 34 families (of 203 total) and 28 genera (of 255 total) that were indicator taxa for one of the three post hoc groups.

Table 4 Taxa identified at the order or family level as indicators using Indicator Taxa Analysis

Out of all described taxa, percent of watershed mined explained significant linear trends in abundance for 9 of 72 classes, 18 of 128 orders and 12 of 203 families (Supplementary Table S4). Whereas the Acidobacteria-5 and Betaproteobacteria classes, YCC11 and Ellin329 orders, and Bacteriovoracaceae and EB1003 families correlated negatively with percent of watershed mined (all r>−0.6), the Acidimicrobiia class, Acidimicrobiales, SBR1032, and Rhodobacterales orders, and Phyllobacteriaceae, Methylophilaceae and Desulfobacteraceae families increased in relative abundance in streams of more heavily mined watersheds (all r>0.5).

Because cross-gradient patterns of taxa are often analyzed at coarse levels of taxonomic resolution, we explored genera responses within the two most abundant classes: Betaproteobacteria, which negatively correlated with percent of watershed mined, and Alphaproteobacteria, which did not correlate with percent of watershed mined. For each genus within the class, we assessed abundance patterns across the gradient of percent mining using Quasi-Poisson regression (McCullagh and Nelder, 1989). The negative correlation between Betaproteobacteria relative abundance and percent of watershed mined did not hold for all genera within the class (Figure 6). Whereas 9 genera did show a negative response, 6 responded positively and 18 had no response. In Alphaproteobacteria, which showed no response to mining at the class level, 13 genera responded positively, 11 responded negatively and 28 showed no significant response (Figure 7). We referenced each responding genus with KEGG Organism modules and Bergey’s Manual (Garrity, 2005) to identify energy metabolisms that might respond to AlkMD constituents (Table 5). The majority of sulfur and nitrogen metabolism pathways were shared by both positive and negative responders. However, the only responsive genus reported to include a denitrifier increased with mining, whereas assimilatory sulfate reduction pathways were identified only for genera that responded negatively to mining.

Figure 6

Response to percent area of watershed mined by (a) class Betaproteobacteria; and genera within Betaproteobacteria: (b) Hydrogenophaga, a representative positive responder; (c) Rubrivivax, a representative non-responder; and (d) Polaromonas, a representative negative responder, using Quasi-Poisson regression.

Figure 7

Response to percent area of watershed mined by (a) class Alphaproteobacteria; and genera within Alphaproteobacteria: (b) Rhodobacter, a representative positive responder; (c) Sphingobium, a representative non-responder; and (d) Bradyrhizobium, a representative negative responder, using Quasi-Poisson regression.

Table 5 Energy metabolism functions for Alpha- and Beta-Proteobacteria that respond to the mining gradient

In the functional profile predicted with PICRUSt, gene families in three level-2 functional categories correlated negatively with percent watershed mined. Relative abundances of gene families ranged from <1% to 11%. The negatively correlated categories were ‘Signaling Molecules and Interaction’ (Environmental Information Processing category) (P=0.005, r=−0.59, 4% abundance), ‘Xenobiotics Biodegradation and Metabolism’ (Metabolism category) (P=0.01, r=−0.53, 0.4% abundance) and ‘Transport and Catabolism’ (Cellular Processes category) (P=0.02, r=−0.50, 0.2% abundance).

Discussion

Despite receiving extensive AlkMD contamination from the largest surface coal mine in Appalachia, microbial communities exposed to exceptionally high levels of AlkMD constituents in Mud River are no less diverse than nearby reference communities. Diversity in these streams best correlates with a multivariate factor that incorporates elevated biofilm Cd, Mn, Zn and Ni. Contrary to our predictions, overall bacterial diversity was not strongly correlated with the extent of upstream mining, although within mined sites, mining intensity correlated negatively with taxonomic richness. Despite only modest changes in α-diversity, we detected significant compositional differences between microbial communities of unmined and mining-affected streams. These compositional shifts resulted from changes in relative abundance rather than turnover between species at mined and unmined sites, as only 9% of OTUs differed enough between mined and unmined sites to serve as indicator taxa. There was limited evidence that these compositional shifts were driven by responses to nitrate and sulfate availability for use in energy metabolism. Rather, taxa shifts may result from stressors that affect cellular processes and signaling.

Although our data suggest that bacterial community composition shifted with AlkMD exposure, we do not observe linear trends in α-diversity along the mining gradient because α-diversity varies widely at unexposed sites. However, within mined sites, richness decreased as more watershed area was mined. When an ecosystem undergoes extreme environmental alteration, such as mountaintop mining, we expect organisms favored by the changes to flourish and sensitive taxa to fail. This subsidy–stress response (Odum et al., 1979) can shift community composition across environmental gradients as well as increase diversity at intermediate exposure levels where sensitive and tolerant (or subsidized) taxa overlap (Niyogi et al., 2007). This is a possible explanation for microbial taxa richness and diversity responses to the AlkMD gradient. It is also consistent with a transplant experiment in the Clark Fork River drainage containing Butte copper mine where Feris et al. (2009) found bacterial taxa richness was greatest at low and moderate levels of metal contamination (As, Cd, Cu, Pb and Zn) and lowest in uncontaminated and highly contaminated sediments.

Many studies reporting microbial community responses to environmental contaminants occur in acid mine drainage systems (Baker and Banfield, 2003). Communities in these very acidic, metallic waters typically are less diverse than bacteria from neutral or alkaline streams (Lear et al., 2009; Kuang et al., 2013) and pH strongly correlates with phylogenetic diversity, richness and UniFrac distance (Kuang et al., 2013). Water column pH during our study period was not statistically distinct between mined and unmined sites and had no correlation with percent watershed mined, although it spanned nearly two orders of magnitude (6.9–8.6). AlkMD characteristically elevates pH (Griffith et al., 2012), and previous work at this field site found a strong positive correlation between percent watershed mined and pH (Lindberg et al., 2011). Unlike that earlier study of Mud River, this study occurred during winter/early spring, when high flows dilute the pH effects of mine drainage. In contrast to prior studies (Fierer and Jackson, 2006; Lauber et al., 2009; Rousk et al., 2010; Griffiths et al., 2011), pH was not an important correlate of composition or diversity metrics during this sampling period. This may be explained by the pHs we examined or the limited pH range in our study (2 orders of magnitude) relative to prior studies (4 orders of magnitude). Yet, our results also contrast with a study of diversity in streams of Hubbard Brook Experimental Forest in New Hampshire, USA, in which a similar range in pH (4–6.3) was shown to be the best correlate of microbial taxa richness across streams (Fierer et al., 2007). The lack of a pH–diversity correlation in Mud River suggests that other chemical constituents were stronger determinants of bacterial community structure than pH, and this is perhaps unsurprising given the large difference in alkalinity, conductivity and numerous trace elements associated with surface mining.

Community composition differed significantly between unmined sites and sites downstream of surface mines. In mining-affected sites, we detected two distinct post hoc groups associated with different levels of AlkMD exposure. Bacterial communities in moderate AlkMD exposure sites were more diverse than communities with high contaminant exposure. The shift in composition between unmined and mined sites was best explained by elevated AlkMD constituents Ca2+, Li, SO42−, Se and Mg2+ and overall ionic strength. Composition differences between high and moderate AlkMD groups were best explained by greater biofilm Cd, Mn and Zn concentrations in the high-mining-affected sites. Earlier work shows that salinity and trace metals generate significant changes in microbial community composition (Baker and Banfield, 2003; Feris et al., 2003; Lozupone and Knight, 2007; Giller et al., 2009; Auguet et al., 2010; Lami et al., 2013). A number of studies have found that Cd and Zn in particular alter microbial community composition (Ganguly and Jana, 2002; Sverdrup et al., 2006; Bouskill et al., 2010; Xie et al., 2011).

As previous studies recognize (Lozupone et al., 2007; Kuczynski et al., 2010), analysis methods influence conclusions about composition differences. Although both distance matrices that we used for creating NMDS ordinations yielded identical differences between site types, they also revealed unique patterns of community composition within site types. GUniFrac incorporates phylogenetic relatedness into the distance matrix by more heavily weighting closely related taxa (Kuczynski et al., 2010; Chen et al., 2012). Because site-type differences in composition were weaker when using a GUniFrac rather than a Bray–Curtis distance matrix, shifts in composition may be due to closely related taxa responding quite differently to AlkMD. This point is bolstered by the genera-level analyses, which revealed that genera within the same class had varied responses to the mining gradient.

Ultimately, as our knowledge of microbial life history and physiology grows, we hope to map individual microbial traits onto phylogenies. Such knowledge would improve chemical pollution monitoring and predictions of microbial responses to ecosystem degradation. At present, mapping microbial traits to identities is highly limited, obscuring causes or consequences of specific compositional shifts in bacteria communities. Yet, examining taxa that respond strongly to AlkMD and comparing associated compositional changes with those across other environmental gradients may illuminate important microbial indicator taxa.

Taxonomic data sets are informative, but poorly resolved at genus and species levels (only 1.1% of OTUs were assigned to known species). Thus, it is a challenge to select appropriate taxonomic levels for best understanding microbial responses. At the class level, the strongest AlkMD responders were within Proteobacteria, Acidobacteria and Actinobacteria phyla. Similar to Feris et al. (2003), we found that Betaproteobacteria relative abundance decreased across a contamination gradient. Betaproteobacteria indicator taxa also correlate negatively with river and estuary water organic carbon content (Fortunato et al., 2013), and this may play a role in structuring AlkMD communities as dissolved organic carbon had a strong positive correlation with mining. In contrast, Feris et al. (2009) also found that in hyporheic sediments sourced from alkaline streams (pH 7.9–8.3), Alpha- and Gammaproteobacteria relative abundance increased with a metal contamination index. Although we used surface biofilms, not hyporheic sediments, we observed no such relationship. In our study, the strongest positive correlation with mining occurred in Actinobacteria that responded linearly rather than in a threshold manner.

After identifying taxa responsive to this gradient, the next step is investigating mechanisms and corresponding ecological implications. We anticipated that two ions that substantially increased with mining, sulfate and nitrate, would link to changes in taxa composition as these can be used in energy metabolism. Indeed, several AlkMD-tolerant taxa included Gamma- and Deltaproteobacteria, Nitrospira, Bacilli and Sphingobacteria, several of which perform biogeochemical transformations involving nitrogen cycling. These include nitrite oxidation to nitrate by Nitrospirae (Wakelin et al., 2008), methanol oxidation linked with denitrification by Methylophilaceae (Kalyuhznaya et al., 2009) and nitrate reduction and aromatic compound degradation by Rhodocyclales (Hesselsoe et al., 2009). However, KEGG Organism modules revealed similar nitrogen pathways for species in genera that increased and decreased with mining. Moreover, predicted gene family abundances for nitrogen metabolism were not positively correlated with mining. Sulfur metabolism had a similar outcome: predicted sulfur metabolism gene family abundances were not correlated with mining (and sulfate), yet the sulfate-reducing Desulfobacteraceae family increased in relative abundance in streams of more heavily mined watersheds. Because of the regime shift in many AlkMD-associated chemicals, it is likely that no single mechanism is responsible for the taxa patterns we observe. Rather, the multivariate nature of AlkMD is best represented by the percentage of watershed mined. This chemical regime shift may affect cellular processes and signaling as the predicted functional profile suggests. At the ecosystem level, it is possible that these effects could alter energy requirements, thus influencing carbon use efficiency and carbon cycling if energy is shunted toward cellular processes and away from growth. Nonetheless, it seems that the majority of functional categories are predicted to be redundant between bacterial communities spanning the mining gradient.

In conclusion, our study shows that stream biofilm bacterial composition in the Mud River system significantly differed between sites receiving AlkMD and unexposed sites. Average taxonomic richness in sites receiving moderate levels of AlkMD constituents exceeded that for unexposed or heavily exposed sites, creating a nonlinear relationship between exposure and diversity. At most taxonomic levels, few taxa were statistically dissimilar enough between exposure categories to indicate habitat specialization. The small number of strongly responding taxa and disparity in compositional similarity between GUniFrac and Bray–Curtis ordinations suggest that community shifts occur through families, genera and species rather than further up the hierarchy. Such results contrast with macrofaunal responses to AlkMD exposure, where entire orders of aquatic insects are lost from heavily AlkMD-affected streams (Pond et al., 2008; Pond, 2010, 2012). Testing microbial community functional responses is the next logical step toward understanding ecologically relevant links between compositional shifts and the strong chemical gradient AlkMD establishes in Central Appalachian streams.