Introduction

Composed of glacially carved valleys separated from the Polar Plateau by the Transantarctic Mountains, the McMurdo Dry Valleys form the largest inland ice-free area of the Antarctic continent (Bockheim and McLeod, 2008). Widely considered the coldest, driest desert on Earth (Hopkins et al., 2006) and used by NASA as an analog for Mars, the Dry Valleys often receive <10 cm per year water-equivalent precipitation, most of which arrives in the form of snow that sublimes almost instantly (Marchant and Head, 2007; Cary et al., 2010). High salt content in Dry Valley soils further reduces the amount of available water and imposes additional osmotic barriers on Dry Valley biota (Wynn-Williams, 2000). The diverse glacial geomorphology (that is, the history of geological processes that shaped the environment) of the Dry Valleys is manifested in high degrees of physicochemical heterogeneity in the region, including altitude, surface geology, soil types and weathering stages, and soil salt content (Bockheim, 2002; Barrett et al., 2004; Marchant and Head, 2007; Bockheim and McLeod, 2008; Block et al., 2009).

Because of the absence of vascular plants and macrofauna, the Dry Valleys have long been considered highly hostile to life (Scott, 1905; Wynn-Williams and Edwards, 1990; Vishniac, 1993). Furthermore, the limitation of culture-based techniques (Rappé and Giovannoni, 2003; Janssen, 2006), the lack of direct data on bacterial and archaeal diversity (Adams et al., 2006), and the belief that aeolian transport is the prevailing mechanism for microbiota distribution within the region (Moorhead et al., 1999; Hopkins et al., 2006; Nkem et al., 2006) collectively contributed to a prevailing notion that the microbial ecology of Dry Valley mineral soils is dominated by a few cosmopolitan species homogeneously distributed across the Dry Valley landscape (Horowitz et al., 1972; Johnson et al., 1978; Friedmann et al., 1993; Vishniac, 1993). Recent studies utilizing molecular genetic methods have challenged this view and shown that Dry Valley soils can harbor bacterial communities far more active and diverse than previously expected (de la Torre et al., 2003; Smith et al., 2006; Niederberger et al., 2008; Wood et al., 2008a; Babalola et al., 2009; Cary et al., 2010), but the forces driving this apparent microbial diversity remain unknown (Adams et al., 2006; Cary et al., 2010).

It has been shown that abiotic elements are the dominant factors determining biodiversity in extremely simplified food webs (Convey, 1996; Doran et al., 2002; Hogg et al., 2006; Poage et al., 2008). The lack of clear biotic interactions (Barrett et al., 2006a) and diverse abiotic conditions make the Dry Valley soil ecosystem a highly, if not the most, tractable model for resolving the roles of abiotic factors in structuring soil microbiota (Wall and Virginia, 1999; Hogg et al., 2006; Hopkins et al., 2006). Furthermore, the importance of abiotic factors as the dominant driver of biodiversity in the Dry Valleys has long been recognized for microfauna and microflora (Adams et al., 2006; Convey et al., 2009), and although some reports have hinted at heterogeneous microbial communities in disparate parts of the Dry Valleys (Smith et al., 2006; Niederberger et al., 2008; Smith et al., 2010), no systematic survey of Dry Valley habitats across well-defined physicochemical heterogeneities has been undertaken (Adams et al., 2006; Cary et al., 2010).

Here we present findings from a coordinated comparative survey of soil bacterial communities from four topographically and geomorphologically diverse Dry Valleys, coupling molecular genetic techniques with biogeochemical analyses. In addition to describing the microbial communities associated with four distinct habitats and establishing an ecological framework of Dry Valley soils, we aim to uncover potential links between microbial ecology and environmental variables, which are potentially reflective of historical and ongoing geological processes (Virginia and Wall, 1999).

Materials and methods

Sample collection

Mineral soils representative of local soil types were collected from four study sites in the McMurdo Dry Valleys: Miers Valley (78°05.486′S, 163°48.539′E, elevation 171 m), Beacon Valley (77°52.321′S, 160°29.725′E, elevation 1376 m), Upper Wright Valley (77°31.125′S, 160°45.813′E, elevation 947 m) and Battleship Promontory (76°54.694′S, 160°55.676′E, elevation 1028 m). Detailed descriptions of each valley are available in Supplementary Information. Samples were collected in December 2006 (Miers Valley and Beacon Valley) and January 2008 (Battleship Promontory and Upper Wright Valley). The sampling sites were all located on a south-facing, 0–20° slope and consisted of two 50 m transects crossing in the middle, with the intersection being the central sampling point (X) and four sampling points (A–D) at the end points of the transects (A being the southernmost point and the remaining points in an anti-clockwise order). At each point, a 1 m² sampling area was identified, and from each corner one scoop of soil was collected from the top 2 cm, after soil pavement pebbles were removed. All reasonable precautions were taken to avoid anthropogenic and intersite contamination, and samples were stored in a sterile Whirl-Pak (Nasco International, Fort Atkinson, WI, USA). Samples were stored at −20 °C at the earliest opportunity and transported back to our laboratory in New Zealand for further analysis.

DNA extraction and community fingerprinting analysis

DNA was extracted using the bead-beating method (Coyne et al., 2001), which has been shown to be highly effective for Dry Valley soil samples (Niederberger et al., 2008) (see also Supplementary Information). Molecular genetic analysis was carried out using automated ribosomal intergenic spacer analysis (ARISA), which resolves community diversity through the hyper-variability of the 16S–23S internal transcribed spacer (ITS). PCR for ARISA was carried out as previously described (Wood et al., 2008a) using both bacterial and cyanobacterial universal primers. The bacterial primers were ITSF (5′-HEX-GTCGTAACAAGGTAGCCGTA-3′) and ITSReub (5′-GCCAAGGCAATCCCACC-3′) (Jones et al., 2007), and the cyanobacterial primers were CY-ARISA-F (5′-FAM-GYCAYRCCCGAAGTCRTTAC-3′) and 23S30R (5′-CHTCGCCTCTGTGTGCCWAGGT-3′) (Wood et al., 2008a). PCR components and conditions, and quality control procedure for PCR amplicons are described in Supplementary Information. For ARISA fragment length (AFL) analysis, 2 μl of diluted PCR product was mixed with 0.25 μl of ROX-labeled genotyping internal size standard (ETR900-R, GE Healthcare, Buckinghamshire, UK) and diluted to 10 μl final volume using 20% Tween-20. AFLs were determined by capillary electrophoresis using the MegaBACE 500 DNA Analysis system (GE Healthcare). ARISA was performed twice for each sample.

Soil physicochemical analysis

pH and conductivity of soil samples were measured on a CyberScan PC 510 Bench Meter (Eutech Instruments Pte Ltd, Singapore) using the slurry technique, which consists of mixing 2 g of soil with 5 ml of deionized water and allowing the samples to settle. Soil gravimetric water content was determined by drying 40 g of air-dried soil at 105 °C until the sample reached a constant weight.

For total percentage carbon and nitrogen measurements, air-dried soil samples were ground to fine powders using a ball grinder and precisely weighed out to 0.2 g aliquots. The samples were then analyzed using a LECO Truspec Elemental Determinator (LECO Corporation, St Joseph, MI, USA). The ground soil samples were also acid digested following the United States EPA 200.2 method (see Supplementary Information) for elemental analysis by inductively coupled plasma mass spectrometry (ICP/MS).

454 pyrosequencing of PCR amplicons

Four sets of barcoded PCR primers were designed to allow direct 454 tag sequencing of four different samples within a single region of a 454 GS FLX Standard (that is, non-titanium) PicoTiterPlate (454 Life Sciences, Branford, CT, USA). The PCR primers consisted of three components: 5′-[454 GS FLX adapter A/B]+[4 nt barcode (aka MID)]+[forward/reverse gene-specific PCR primer]-3′. The 454 GS FLX adapter sequences were 5′-GCCTCCCTCGCGCCATCAG-3′ (adapter A) and 5′-GCCTTGCCAGCCCGCTCAG-3′ (adapter B). The barcode sequences were 5′-AGTC-3′ (Battleship Promontory), 5′-ACAT-3′ (Beacon Valley), 5′-AGAC-3′ (Miers Valley) and 5′-ACGT-3′ (Upper Wright Valley). The forward PCR primer was 338F (5′-ACTCCTACGGGAGGCAGCAG-3′) and the reverse PCR primer was 806R (5′-GGACTACCAGGGTATCTAAT-3′) (Huws et al., 2007). Adapter A was always coupled with 338F, and adapter B with 806R. The predicted amplicon size for these primers was 511 bp.
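The three-part primer layout described above can be illustrated with a short Python sketch. The adapter, barcode and gene-specific sequences are those listed in the text; the function and dictionary names are illustrative only and are not part of the original workflow.

```python
# Sequences are taken from the text; all names are illustrative.
ADAPTERS = {"A": "GCCTCCCTCGCGCCATCAG", "B": "GCCTTGCCAGCCCGCTCAG"}
BARCODES = {
    "Battleship Promontory": "AGTC",
    "Beacon Valley": "ACAT",
    "Miers Valley": "AGAC",
    "Upper Wright Valley": "ACGT",
}
PRIMERS = {"338F": "ACTCCTACGGGAGGCAGCAG", "806R": "GGACTACCAGGGTATCTAAT"}


def fusion_primers(site):
    """Return (forward, reverse) fusion primers, written 5' to 3', for one site."""
    barcode = BARCODES[site]
    # Layout: [454 adapter] + [4 nt barcode] + [gene-specific primer]
    forward = ADAPTERS["A"] + barcode + PRIMERS["338F"]  # adapter A pairs with 338F
    reverse = ADAPTERS["B"] + barcode + PRIMERS["806R"]  # adapter B pairs with 806R
    return forward, reverse


forward, reverse = fusion_primers("Miers Valley")
```

Each fusion primer built this way is 43 nt long (19 nt adapter, 4 nt barcode and 20 nt gene-specific primer).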

Genomic DNA samples extracted from individual sampling points at a study site (that is, valley) were pooled at equal weight ratio and used as PCR template. PCR components and conditions, and quality control procedure for PCR amplicons are described in Supplementary Information. The prepared PCR amplicons were sent to the University of Otago High-Throughput DNA Sequencing Unit (Dunedin, New Zealand), where they were pooled at equal weight ratios and used to generate two emPCR libraries (for sequencing from adapter A and adapter B, respectively). The resulting emPCR products were pooled at equal weight ratios and sequenced using 1/16th region of a 454 GS FLX Standard PicoTiterPlate.

Sequence analysis

454 PCR amplicon pyrosequencing data were analyzed using AmpliconNoise v1.0 (Quince et al., 2011). Briefly, raw flowgrams (sff files) with perfectly matching primer and barcode sequences were filtered for a minimum flowgram length of 360 cycles (including primer and barcode sequences) before the first noisy signal (that is, 0.5–0.7 or no signal in all four nucleotides). All flowgrams were then truncated at 360 cycles and clustered to remove sequencing noise using PyroNoise (Quince et al., 2009, 2011). PCR noise was removed using SeqNoise (Quince et al., 2011), and PCR chimeras were removed using Perseus (Quince et al., 2011). The resulting dereplicated sequences were aligned using ClustalW 2.0.12 (Thompson et al., 1994), NAST (DeSantis et al., 2006b) and PRANK+F v.090707 (Löytynoja and Goldman, 2008). Default parameters were used for ClustalW, and the +F (for robustness against indel errors) and -uselogs (to enable alignment of large numbers of sequences) options were used for PRANK+F. NAST alignment was performed using the Greengenes reference alignment and default parameters (DeSantis et al., 2006a, 2006b). Mothur 1.10.2 (Schloss et al., 2009) was used to calculate sequence distance matrices and cluster sequences into operational taxonomic units (OTUs) defined at the furthest neighbor Jukes–Cantor distance of 0.03 (OTU0.03), the most stringent OTU definition allowing for intragenomic 16S rRNA variation and PCR/sequencing errors (Schloss and Handelsman, 2006; Kunin et al., 2009). Rank-abundance data were generated for each study site and groups of study sites, and rarefaction curves, collector's curves, and other population diversity indices (for example, Chao1 and ACE) were calculated. Community similarity trees, Venn diagrams and taxonomic heat maps were also generated using Mothur, and libshuff tests were performed.

For phylogenetic assignments, representative sequences of all identified OTU0.03 were analyzed using the Classifier function provided by the Ribosomal Database Project Release 10, Update 15 (Wang et al., 2007). Taxonomic assignment threshold was set at 80%.
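For readers unfamiliar with the distance underlying the OTU0.03 definition, the following Python sketch computes the standard Jukes–Cantor correction, d = −(3/4) ln(1 − (4/3)p), where p is the fraction of mismatched positions between two aligned sequences. It is not Mothur's implementation; the gap handling (columns with a gap in either sequence are skipped) and the function name are assumptions made for illustration.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned, equal-length sequences.

    Columns with a gap ('-') in either sequence are skipped. This is one
    common convention and is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed mismatch fraction
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

Because the correction inflates the raw mismatch fraction slightly, two sequences differing at 3 of 100 compared positions have a Jukes–Cantor distance just above 0.03.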

Statistical analysis

A one-way analysis of variance was performed on the soil physicochemical data using Statistica (Version 7, StatSoft Inc., Tulsa, OK, USA) to identify significant differences in soil chemistry between sampling sites. Soil physicochemical data were also imported into the PRIMER 6 package (PRIMER-E Ltd, Ivybridge, UK), where the data were first transformed using a square-root transformation, followed by a log (X+1) transformation, and normalized. A distance matrix was generated based on Euclidean distances, with which hierarchical clustering (group average), non-metric multidimensional scaling (MDS) and one-way analysis of similarities (ANOSIM) analyses were performed.
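The data preparation described above (square-root transformation, log (X+1) transformation, normalization and Euclidean distance) can be sketched in Python as follows. This is not the PRIMER 6 implementation: the use of the natural logarithm and of the sample standard deviation for normalization are assumptions, and the input values are hypothetical.

```python
import math


def transform(value):
    """Square-root transform followed by log(x + 1); natural log assumed."""
    return math.log(math.sqrt(value) + 1.0)


def normalize_columns(rows):
    """Centre each variable (column) on zero and scale it to unit sample SD."""
    scaled = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        scaled.append([(x - mean) / sd if sd > 0 else 0.0 for x in col])
    return [list(row) for row in zip(*scaled)]


def euclidean_matrix(rows):
    """Pairwise Euclidean distances between samples (rows)."""
    return [[math.dist(a, b) for b in rows] for a in rows]


# Hypothetical values for three samples and two variables.
raw = [[7.9, 120.0], [8.4, 450.0], [9.6, 60.0]]
prepared = normalize_columns([[transform(v) for v in row] for row in raw])
distances = euclidean_matrix(prepared)
```

Normalizing after transformation places variables measured on very different scales (for example, pH and elemental concentrations) on a common footing before distances are calculated.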

ARISA run data were processed using Genetic Profiler (Version 2, GE Healthcare) to yield all detectable peaks without applying any arbitrary cutoff. The resulting data were processed using a modified version of a previously described program (Abdo et al., 2006) (see Supplementary Information for detailed parameters), and AFL profiles were then imported into PRIMER 6 and converted into presence/absence data. Data from multiple ARISA runs were averaged for each sample, and a similarity matrix was generated based on Bray–Curtis similarity, on which hierarchical clustering, MDS and ANOSIM analyses were performed. Biota-Environmental STepwise and LINKTREE analyses (Clarke et al., 2008) were performed using the transformed and normalized soil physicochemical data as environmental variables, and the ARISA similarity matrices as biotic factors. For Biota-Environmental STepwise analysis, the BIOENV method with Spearman ranking was used, and 99 permutation tests were run (none were ≥ρ).
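As an illustration of the similarity measure used for the ARISA profiles, the Python sketch below computes Bray–Curtis similarity for presence/absence data, for which the index reduces to twice the number of shared AFLs divided by the total number of AFLs in the two profiles. The function name and the fragment lengths are hypothetical, and the value is returned as a fraction between 0 and 1.

```python
def bray_curtis_similarity(profile_a, profile_b):
    """Bray-Curtis similarity between two presence/absence profiles.

    Each profile is a set of ARISA fragment lengths (AFLs) detected in a sample.
    For binary data the index reduces to 2 * shared / (n_a + n_b).
    """
    total = len(profile_a) + len(profile_b)
    if total == 0:
        return 0.0  # convention chosen here for two empty profiles
    return 2.0 * len(profile_a & profile_b) / total


# Hypothetical AFL profiles (fragment lengths in bp) for two samples.
sample_1 = {410, 452, 518, 603}
sample_2 = {452, 518, 700}
similarity = bray_curtis_similarity(sample_1, sample_2)  # 2 * 2 / 7
```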

Results

Soil geochemistry

Overall, soil physicochemical properties varied significantly among valleys (Table 1). One-way ANOSIM reported a global R-value of 0.771 (R-value of one indicates that the most similar samples are all within the same study site) and an associated P-value of 0.001, indicating significant physicochemical differences between the study sites. When analyzed separately, the ICP/MS profiles for all four valleys were also statistically different from each other (ANOSIM, global R-value=0.691, P-value=0.001). Results from ANOSIM pairwise tests show that Miers Valley was most different from all other valleys (Supplementary Table 1). Copper concentration in Beacon Valley was considerably higher than all other valleys, corroborating previous reports (Wood et al., 2008a).
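To make the ANOSIM statistic concrete, the Python sketch below computes R = (mean between-group rank − mean within-group rank) / (M/2), where the ranks are those of the pairwise distances and M is the number of sample pairs, together with a permutation P-value. It is a minimal illustration rather than the PRIMER 6 implementation; the function names, the tie handling (average ranks) and the toy data are assumptions.

```python
import itertools
import random


def rank_values(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R for a square distance matrix and one group label per sample."""
    pairs = list(itertools.combinations(range(len(groups)), 2))
    ranks = rank_values([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (len(pairs) / 2.0)


def anosim_p(dist, groups, permutations=999, seed=0):
    """Permutation P-value: share of label shuffles with R >= observed R."""
    rng = random.Random(seed)
    observed = anosim_r(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (permutations + 1)


# Toy example: four samples on a line, two per site, sites far apart.
positions = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
toy_r = anosim_r(toy_dist, ["MV", "MV", "BV", "BV"])
```

In the toy example every within-site distance is smaller than every between-site distance, so R equals 1. Under the P-value convention used in this sketch, 999 permutations cannot give a value below 0.001.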

Table 1 Soil geochemical properties

The relationship between physicochemical profiles of all individual samples from four valleys is represented in a non-metric MDS plot (Figure 1), in which Miers Valley is clearly an outlier to the remaining valleys. This observation is supported by the pairwise ANOSIM results (Supplementary Table 1). Soil physicochemical properties are largely consistent within individual study sites, with the exception of sampling point C from Beacon Valley (BV_C) (Figure 1). This is likely because of its high Ag and Hg content (3–5-fold higher than average, Supplementary Table 2).

Figure 1

MDS plot of soil physicochemical profiles. A Euclidean distance matrix of soil physicochemical profiles was calculated using soil physicochemical properties that had been square-root and log (X+1) transformed and normalized. An MDS plot was then generated using the distance matrix to represent the relative distances between individual samples. The samples are represented using the following symbols: Miers Valley (MV), gray triangles; Beacon Valley (BV), upturned black triangles; Battleship Promontory (BP), gray squares; Upper Wright Valley (UW), gray diamonds. Contour lines enclose samples with distances below certain thresholds (0.22, solid line; 0.34, dashed line).

Community profiling using ARISA

Significantly more bacterial AFL groups (AFLs, each representing at least one phylotype) were identified in Miers Valley and Battleship Promontory samples than Beacon Valley and Upper Wright Valley (Supplementary Table 3), an observation supported by one-way global analysis of variance performed on the number of bacterial AFLs (F-statistic=9.4, P-value <0.001). More bacterial AFLs were identified than cyanobacterial ones for all the study sites (Supplementary Table 3) and Miers Valley appears to harbor a moderately higher level of cyanobacterial diversity (one-way global analysis of variance, F-statistic=3.6, P-value=0.037) relative to other study sites (Supplementary Table 3).

Global ANOSIM analysis of the ARISA data revealed that the cyanobacterial communities (R-value=0.775, P-value=0.001) and overall bacterial communities (R-value=0.709, P-value=0.001) of the four study sites were significantly different from each other, suggesting localized microbial communities. MDS plots of ARISA data (Figures 2a and b) show that Miers Valley and Battleship Promontory samples strongly group within respective valleys. Meanwhile, samples from Beacon Valley and Upper Wright Valley appear to be more poorly resolved, an observation reflected in the pairwise ANOSIM data (Supplementary Table 4). The MDS plot of bacterial ARISA data shows that bacterial communities in the four study sites fall into two groups: one consisting of Miers Valley and Battleship Promontory, and the other containing Beacon Valley and Upper Wright Valley. This is different from the relationship shown in the MDS plot of cyanobacterial ARISA data, in which the cyanobacterial community of Miers Valley can be seen as an outlier to the other three study sites.

Figure 2

MDS plots of bacterial (a) and cyanobacterial (b) ARISA profiles. Bray–Curtis similarity matrices of bacterial and cyanobacterial ARISA profiles were calculated based on presence/absence of AFLs. MDS plots were then generated using the similarity matrices to represent the relative distances between individual samples. The samples are represented using the following symbols: Miers Valley (MV), gray triangles; Beacon Valley (BV), upturned black triangles; Battleship Promontory (BP), gray squares; Upper Wright Valley (UW), gray diamonds. Contour lines enclose samples with similarities above certain thresholds (bacterial ARISA: 0.20, solid line; 0.40, dashed line) (cyanobacterial ARISA: 0.20, solid line).

Sequencing results

To identify dominant bacteria in the four study sites, PCR amplicons of partial bacterial 16S rRNA gene (hypervariable regions V3–V4) were bidirectionally sequenced, and 4092 raw pyrosequencing reads were obtained. The flowgrams were processed using AmpliconNoise (Quince et al., 2011) for quality filtering, denoising and chimera removal. In total, 2108 reads from adapter B (that is, the reverse PCR primer) passed the quality control and chimera check pipeline, and no forward reads were obtained. Of the 999 reads that contained incorrect barcode or primer sequences and failed the quality control, 831 were attributed to forward reads (81%). Only reverse reads (hereafter referred to as high quality reads, see Materials and methods for details) were used for downstream analysis, and the average read length of the high quality reads was 259 nt (including PCR primer but not barcode sequences), slightly above the 250 nt average length typical of 454 GS FLX runs (Mardis, 2008). The lower-than-expected read counts were likely the result of a strong secondary structure formed between the 454 forward fusion primer and the 5′ end of the amplicon, which interfered with the emPCR and sequencing processes.

Biodiversity and richness measures

A non-redundant set of high quality reads was aligned using ClustalW, PRANK+F and NAST, and the resulting alignments were independently analyzed using Mothur. OTUs were defined as furthest neighbor clusters within a Jukes–Cantor distance of 0.03 (OTU0.03). Alpha diversity indices for the three different multiple sequence alignments showed that the results are reproducible across alignment algorithms (Supplementary Figure 1). All further analyses were based on the PRANK+F alignment.

The terminal slopes of rarefaction curves (that is, the increase in number of OTUs per sequence added) indicate that Beacon Valley and Upper Wright Valley have been sampled comprehensively (Supplementary Figure 2). For Miers Valley and Battleship Promontory, the slopes are 0.13 and 0.18, respectively, much higher than those for Beacon Valley (0.02) and Upper Wright Valley (0.03). These findings are consistent with the higher levels of observed diversity and AFLs seen in Miers Valley and Battleship Promontory (Supplementary Figure 1 and Supplementary Table 3, respectively).
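The notion of a terminal rarefaction slope can be illustrated with the Python sketch below, which uses the analytical (hypergeometric) expectation of OTU richness in a subsample drawn without replacement. This is an assumption made for illustration: Mothur estimates rarefaction curves by random resampling, and the OTU abundances used here are hypothetical.

```python
from math import comb


def rarefied_richness(otu_counts, n):
    """Expected number of OTUs in a random subsample of n reads (no replacement)."""
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    # An OTU with c reads is missed with probability C(total - c, n) / C(total, n).
    return sum(1.0 - comb(total - c, n) / comb(total, n) for c in otu_counts)


def terminal_slope(otu_counts):
    """Increase in expected OTU count contributed by the final read added."""
    total = sum(otu_counts)
    return rarefied_richness(otu_counts, total) - rarefied_richness(otu_counts, total - 1)


# Hypothetical abundances: four OTUs, two of them singletons, ten reads in total.
toy_counts = [5, 3, 1, 1]
slope = terminal_slope(toy_counts)
```

Under this formulation the terminal slope equals the number of singleton OTUs divided by the total number of reads (2/10 in the toy example), which is why a steep terminal slope signals many rare OTUs still being discovered.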

Between 33 and 126 OTUs0.03 were obtained for the four study sites from between 262 and 838 reads (Table 2). Miers Valley, which is the most diverse site (126 OTUs0.03), had 542 reads, whereas only 33 OTUs0.03 were observed for Beacon Valley, which had 838 reads. Singleton and duplet OTUs0.03 account for large portions of the observed OTUs0.03, and the proportion of ‘rare’ OTUs0.03 (that is, singletons and duplets) appears to increase with overall diversity (Table 2). With the exception of Miers Valley, which is dominated by one OTU0.03, all other study sites contain several dominant OTUs0.03 (Supplementary Figure 3). Although there is considerable variation in sampling depths among the samples, the observations above are not dependent on sampling depth, and we therefore forwent resampling the more comprehensively covered samples to maximize our ability to observe rare OTUs.

Table 2 Distribution and composition of OTUs0.03

Phylogeny and distribution of diversity

To assess the phylogenetic distribution and diversity present in the sequences, high quality sequences were classified using the Ribosomal Database Project Classifier at 80% confidence threshold (Wang et al., 2007). The overall phylum-level compositions of all four sites are highly similar (Figure 3), as confirmed by a repeated-measures analysis of variance of phylum-level abundances (P-value ≈1). Sequences affiliated with Actinobacteria account for 62–73% of sequences in all sites, followed by Bacteroidetes-affiliated sequences at 5–21% (Figure 3). With the exception of Miers Valley, Proteobacteria (mostly α-Proteobacteria) account for <3% of sequences in the collections. Gemmatimonadetes are absent in Battleship Promontory and range from 1% to 15% in other valleys. Acidobacteria account for <2% of sequences in the Miers Valley and Battleship Promontory collections and are completely absent in the other two communities. The remaining phyla were only present at <1% abundance except TM7, which was only found in Battleship Promontory. Between 0.2% and 11.5% of the sequences could not be confidently classified, and no archaeal sequences were found. Surprisingly, no cyanobacterial sequences were identified despite the cyanobacterial ARISA results and several reports of well-known cyanobacterial species in the Dry Valleys (Friedmann et al., 1993; de la Torre et al., 2003; Smith et al., 2006; Wood et al., 2008a; Pointing et al., 2009).

Figure 3

Phylum-level distribution of bacterial OTUs0.03. Representative sequences of OTUs0.03 for each study site were phylogenetically assigned using the Ribosomal Database Project Classifier, and phylum-level assignments with >80% confidence threshold were retrieved and plotted.

A Venn diagram illustrating the distribution of unique OTUs0.03 among the valleys (Figure 4) shows a highly localized distribution pattern. Battleship Promontory and Miers Valley appeared to share a large number of OTUs0.03, whereas Beacon Valley and Upper Wright Valley share a very different collection of OTUs0.03. This supports the grouping of microbial communities demonstrated by the bacterial ARISA data (Figure 2a). Surprisingly, the number of OTUs0.03 shared by all four valleys (that is, cosmopolitan OTUs0.03) is exceptionally low (2 out of 214 OTUs0.03), and despite the universal dominance of Actinobacteria (Figure 3), both cosmopolitan OTUs0.03 were only dominant in Beacon Valley and Upper Wright Valley. Conversely, OTUs0.03 affiliated with Actinobacteria and dominant in Battleship Promontory and Miers Valley tend to be absent or in very low abundance in Beacon Valley and Upper Wright Valley (Table 3).

Figure 4

Venn diagram for bacterial OTUs0.03 found in the four Dry Valleys.

Table 3 Phylogenetic assignment and sequence distribution of select OTUs0.03

Biogeochemical analysis

Using Biota-Environmental STepwise analysis in the PRIMER 6 package, the bacterial and cyanobacterial ARISA data (as multivariate biotic patterns) were linked to physicochemical data (all 38 variables listed in Table 1 plus altitude) to investigate potential relationships between microbial community structure and the underlying soil geochemistry (Supplementary Table 5). Variables most consistently explanatory of both bacterial and cyanobacterial ARISA patterns were altitude and conductivity. The amount of copper also appears to be significantly correlated with cyanobacterial distribution.

LINKTREE analysis (Clarke et al., 2008) was also performed to examine how physicochemical factors affect specific ARISA patterns (Supplementary Figure 4 and Supplementary Table 6). For the bacterial ARISA data, conductivity and Pb separate Miers Valley and Battleship Promontory from Beacon Valley and Upper Wright Valley, and have the same role for the cyanobacterial ARISA data. Multiple factors, including gravimetric water content and altitude (Supplementary Table 6), signify the split of Miers Valley and Battleship Promontory sites. The factors Ca, Sr and Mn clearly separate Beacon Valley and Upper Wright Valley samples, and may reflect differences in the underlying soil geochemistry.

Discussion

The Dry Valleys encompass a high degree of physicochemical heterogeneity, which is reflected in the four valleys chosen for this study: Beacon Valley, Upper Wright Valley, Battleship Promontory and Miers Valley. In addition to well-characterized differences in altitude (a proxy for annual mean temperature (Körner, 2007)) and soil salt content (a proxy for water availability (Lamsal and Paudyal, 1999) and age of the soil (Bockheim, 1979)), the study sites all possess distinct soil elemental compositions, reflecting the variability in local soil mineralogy (Table 1 and Supplementary Table 1). Overall, geochemical measurements of multiple samples taken from each individual 2500 m² sampling area in four valleys formed robust clusters on the MDS plot, establishing a framework for inter-valley comparisons (Figure 1). Miers Valley is a clear outlier because of its high C/N ratio, total percentage C content and low gravimetric water content (Table 1). Miers Valley soils are also substantially more alkaline than those of all other valleys, which may have resulted from greater influence from salts of marine origin (Campbell and Claridge, 1987). Furthermore, the elevated soil pH and high C/N ratio of Miers Valley are potentially reflective of photosynthetic activity by cyanobacteria in the system (Wood et al., 2008a, 2008b). However, Battleship Promontory's average C/N ratio is similar to that of drier valleys (Table 1) despite considerable cyanobacterial presence in lithic habitats (Johnston and Vestal, 1991; Friedmann et al., 1993; Wynn-Williams, 2000; de la Torre et al., 2003). Miers Valley and Battleship Promontory samples have much lower conductivity values than the other two valleys, reflecting higher precipitation levels and the lack of reverse leaching of ground salts (McLeod et al., 2009).

Community DNA profiling of bacteria and cyanobacteria using ARISA revealed four very distinct communities with varied levels of diversity (Supplementary Tables 3 and 4). Miers Valley contains the most cyanobacterial AFLs (Supplementary Table 3) and is identified as an outlier to the remaining three valleys by ANOSIM (Supplementary Table 4) and MDS plot (Figure 2b), an outcome likely linked to the presence of a permanent lake that supports significant cyanobacterial biomass (Wood et al., 2008a). Despite having the highest copper concentration of all four sampling sites, Beacon Valley contains the second most cyanobacterial AFLs (Supplementary Table 3), which is unexpected given that previous studies have suggested that Miers Valley contains markedly higher cyanobacterial diversity than Beacon Valley (Wood et al., 2008a). This may be due to the more sensitive AFL identification algorithm used in this study, but the absence of cyanobacterial sequences in the pyrosequencing results limits the scope of discussion on the relationship between copper concentration and cyanobacterial diversity in Dry Valley soils. Bacterial ARISA data revealed significantly higher levels of diversity in Battleship Promontory and Miers Valley (Supplementary Table 3) and tight grouping between these two sampling sites, in contrast with looser grouping between Beacon Valley and Upper Wright Valley (Figure 2a and Supplementary Table 4) samples.

Pyrosequencing analysis of partial 16S rRNA gene PCR amplicons (V3–V4 regions), generated from pooled DNA from each valley using a different set of PCR primers, corroborated findings from the bacterial ARISA. Beacon Valley and Upper Wright Valley were found to harbor considerably lower levels of microbial diversity (Supplementary Figure 1) and share a number of OTUs0.03 different from those shared by Battleship Promontory and Miers Valley (Figure 4). Meanwhile, only 2 of the 214 OTUs0.03 identified in this study were found in all four valleys, questioning the idea that Dry Valley microbial ecology is dominated by ubiquitously abundant cosmopolitan species (Cameron et al., 1970; Wynn-Williams and Edwards, 1990; Vishniac, 1993). The discovery of genetically localized microbial communities in Dry Valley soils calls into question the importance of aeolian transport in determining Dry Valley microbiota distribution (Nkem et al., 2006; Wood et al., 2008a) and suggests that Dry Valley soil bacterial populations may be subject to very limited inter-valley redistribution or that selection by local physicochemical conditions may be much stronger than previously expected (Moorhead et al., 1999; Barrett et al., 2006b).

Given the difference in physicochemical conditions at the four study sites, the similarity between bacterial community phyla structures is surprising (Figure 3). It has been shown in a wide range of soil habitats that phylum-level community structure is highly site-specific even in homogeneous landscapes (Herrera et al., 2007; Labbé et al., 2007; Lamarche et al., 2007; Fulthorpe et al., 2008), and the conservation of phyla structures seen here may be reflective of the strong selective pressure imposed by the Dry Valley environment. On the other hand, diversity at a finer phylogenetic resolution is very high across the four sites. For example, only 9 out of 81 OTUs0.03 affiliated with Actinobacteria and none of the 28 OTUs0.03 affiliated with Bacteroidetes were found in more than two valleys. These observations point to localized, genetically disparate bacterial populations in the Dry Valleys, and suggest potential endemism in Dry Valley soil microbiota. These ideas challenge the notion that aeolian transport has a major role in inter-valley biota distribution (Adams et al., 2007). Although it is tempting to compare the level of spatial heterogeneity observed here with those of more temperate ecosystems, differences in sampling strategy and analysis methodology make such comparisons unreliable.

The distribution pattern described here (Figure 4) is unlikely to be a product of PCR/sequencing artifacts, for two reasons. Firstly, the number of OTUs0.03 identified for each study site was unrelated to the number of sequences obtained for that site (Supplementary Figure 1) (Pearson's correlation analysis, R=−0.305, P-value=0.695), which indicates that the flowgram clustering approach employed by AmpliconNoise (Quince et al., 2011) and the Perseus chimera checker (Quince et al., 2011) have sufficiently removed ‘noisy reads’ that artificially increase the size of the rare biosphere as the number of reads increases (Kunin et al., 2009; Huse et al., 2010; Lee et al., 2011). Secondly, the two ‘cosmopolitan’ OTUs0.03 were not universally dominant, and other OTUs0.03 at much higher total abundances were completely absent in one or more of the valleys (Table 3). This observation is not necessarily due to the modest sampling depth, as two of the study sites (Beacon Valley and Upper Wright Valley) were in fact comprehensively sampled according to terminal slopes of their rarefaction curves. Furthermore, although more OTUs0.03 will likely be found in all four valleys by a more extensive sequencing effort, it will not alter the observation that no OTUs0.03 are dominant in all four study sites, and that dominant microbial species appear highly localized within the Dry Valleys.

Surprisingly, cyanobacterial sequences were completely absent from the pyrosequencing data despite significant cyanobacterial diversity reported in cyanobacterial ARISA. Further examination of the PCR primers used to generate partial 16S rRNA gene amplicons (Probe Match, http://rdp.cme.msu.edu/probematch/search.jsp) revealed that the reverse PCR primer (806R) targets cyanobacteria very poorly (matching <1% of known cyanobacterial sequences). Although PCR primer-related biases are well known, some reports suggested that they are largely limited to ‘rare’ taxa (Huse et al., 2008; Hamp et al., 2009). Here we demonstrate a case where PCR primer bias can influence observed diversity at the phylum level. It should be noted, however, that a previous report of microbial communities in the McKelvey Valley based on both PCR clone libraries and qPCR reported very low or non-existent cyanobacterial signals in soils (Pointing et al., 2009), and it is possible that PCR primer bias and low abundance together account for the lack of cyanobacterial sequences in our results. Although Acidobacteria have been reported as a dominant group in many earlier reports of Dry Valley microbial ecology (Smith et al., 2006; Pointing et al., 2009; Cary et al., 2010), they represent <2% of the sequences in the Miers Valley and Battleship Promontory, and are completely absent in the other valleys. Results from Probe Match suggest that this was not due to primer bias (data not shown), and thus the absence of Acidobacteria in Beacon Valley and Upper Wright Valley may have ecological significance. We did not attempt to investigate archaeal diversity in our samples, and the PCR primers used do not target archaeal sequences. However, archaeal diversity in the Dry Valleys had previously been reported as extremely low and limited to the globally ubiquitous Group II low-temperature Crenarchaeotes (Aislabie et al., 2006; Hogg et al., 2006), and archaea-specific PCR assays failed to detect any signal in the McKelvey Valley (Pointing et al., 2009).

By combining biological and physicochemical data, Biota-Environmental STepwise and LINKTREE analyses both identified altitude and conductivity as primary drivers for the biological differences seen in bacterial and cyanobacterial ARISA data (Supplementary Table 5). Altitude can be considered a proxy for average temperature, which generally drops by 5.5 °C for every 1000 m increase in elevation (Körner, 2007). This temperature gradient can be exacerbated by the low air relative humidity in the Dry Valleys and may directly influence the presence of liquid water in the austral summer. Conductivity values (that is, soil salt content) for samples from Miers Valley and Battleship Promontory are remarkably low by Dry Valley standards (Aislabie et al., 2006), reflecting seasonal surface water availability in these two valleys. Conversely, the high conductivity of Beacon Valley and Upper Wright Valley samples suggests that those surface soils are relatively ancient and experience little leaching from precipitation (McLeod et al., 2009). Soil conductivity has been shown to negatively correlate with micro-invertebrate diversity (Freckman and Virginia, 1997; Barrett et al., 2004) within a single valley, and an earlier report comparing Dry Valley soils with various lithic habitats suggested that conductivity might have a role in determining overall biota distribution (Pointing et al., 2009). Our results corroborate previous findings and suggest that soil salt content is instrumental in structuring biota at all levels. Elemental Pb content and gravimetric water content, both of which are linked to the mineralogy of the soil, were also identified as influential in determining bacterial and cyanobacterial community compositions (Supplementary Table 5 and Supplementary Figure 4).
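To make the altitude-as-temperature-proxy argument concrete, the following minimal sketch applies the 5.5 °C per 1000 m figure cited above to a pair of elevations. The two elevations are hypothetical placeholders, not measurements from the study sites, and the function name is illustrative only. The linear relationship is a general approximation and ignores local effects such as humidity, aspect and katabatic winds.

```python
# Lapse rate quoted in the text (Körner, 2007): 5.5 °C per 1000 m of ascent.
LAPSE_RATE_C_PER_KM = 5.5


def temperature_offset_c(elevation_gain_m: float) -> float:
    """Approximate change in mean temperature (°C) for a given elevation gain (m)."""
    return -LAPSE_RATE_C_PER_KM * elevation_gain_m / 1000.0


# Hypothetical elevations, chosen only for illustration.
low_site_m = 200.0
high_site_m = 1200.0

print(temperature_offset_c(high_site_m - low_site_m))  # prints -5.5
```

Even a difference of a few degrees in mean temperature can determine whether soil temperatures exceed the freezing point during the austral summer, which is why altitude may act on microbial communities chiefly through liquid water availability.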

Although our modest sequencing effort was intended to identify dominant bacterial species, the results have shown that Dry Valley soils harbor far more than a few cosmopolitan species (Horowitz et al., 1972; Johnson et al., 1978; Vishniac, 1993), and that Dry Valley microbial communities appear to be highly localized and composed of potentially endemic populations. Moreover, our integrated approach, utilizing two types of molecular genetic techniques, geochemical measurements and statistical analyses, showed that the distribution of Dry Valley microbiota significantly correlates with physicochemical conditions of the environment. These discoveries not only challenge existing paradigms for Dry Valley microbiota distribution and refocus attention on the drivers of microbial diversity in the Dry Valley ecosystem, but also support the hypothesis that Dry Valley ecology is largely determined by abiotic factors in this ultraoligotrophic environment (Barrett et al., 2004; Cary et al., 2010). Our findings, based on a systematic survey of habitats across a wide physicochemical gradient, provide statistically robust data showing that a complex set of abiotic factors determines the microbial components of the Dry Valleys. These results also pose novel challenges to the biosecurity management of this unique and fragile habitat, as existing policies have been designed under the assumption that Dry Valley soil bacterial populations are homogeneous across the landscape and detached from contemporary environmental conditions (Moorhead et al., 1999; Barrett et al., 2006b). Lastly, the discovery of highly localized, potentially endemic microbial populations in the Dry Valleys lends credence to the theory of potential links between Dry Valley glacial geomorphology and soil microbial ecology, which in turn leads to questions about colonization opportunities and selection mechanisms; the results presented here therefore have implications for Antarctic research at large.