Ecological and genomic responses of soil microbiomes to high-severity wildfire: linking community assembly to functional potential

Increasing wildfire severity, which is common throughout the western United States, can have deleterious effects on plant regeneration and large impacts on carbon (C) and nitrogen (N) cycling rates. Soil microbes are pivotal in facilitating these elemental cycles, so understanding the impact of increasing fire severity on soil microbial communities is critical. Here, we assess the long-term impact of high-severity fires on the soil microbiome. We find that high-severity wildfires result in a multi-decadal (>25 y) recovery of the soil microbiome mediated by concomitant differences in aboveground vegetation, soil chemistry, and microbial assembly processes. Our results depict a distinct taxonomic and functional successional pattern of increasing selection in post-fire soil microbial communities. Changes in microbiome composition corresponded with changes in microbial functional potential, specifically altered C metabolism and enhanced N cycling potential, which related to rates of potential decomposition and inorganic N availability, respectively. Based on metagenome-assembled genomes, we show that bacterial genomes enriched in our earliest site (4 y since fire) harbor distinct traits such as a robust stress response and a high potential to degrade pyrogenic, polyaromatic C that allow them to thrive in post-fire environments. Taken together, these results provide a biological basis for previously reported process rate measurements and explain the temporal dynamics of post-fire biogeochemistry, which ultimately constrains ecosystem recovery.


Site description, experimental design, and soil sampling
This study was conducted on the Eldorado National Forest, which is located in the Central Sierra Nevada of California, an area historically fire-suppressed like much of western North America ( Figure 1A) [1]. We sampled in areas of varying time since stand-replacing wildfire using a fire chronosequence established within the South Fork of the American River Watershed. For a full description of the chronosequence, see Bohlman et al. [2], Dove et al. [3], and Dove [4]. Briefly, the fire sites are as follows: King Fire (4-y post fire), Freds Fire (13-y post fire), and Cleveland Fire (25-y post fire). We incorporated sites throughout the study area that had not burned since at least 1908 [1], which is the maximum period for which we know that no recorded burning occurred in this region. We operationally defined this as our late-successional site (> 115-y post fire). We controlled for pre-fire vegetation, elevation, slope, aspect, burn severity, postfire management, and USDA Soil Taxonomy (suborder) for all plots [3]. All soils are in the suborder Xerepts, with either an umbric or ochric epipedon.
Each site consisted of six to eight plots separated by at least 150 m (the 13-y site had only six plots due to sampling area constraints; all other sites had eight plots). Plotcenters were chosen randomly a priori using a GIS layer of appropriate site polygons (i.e., similar soils, elevation, aspect, burn severity, and management). Each plot was defined by a 5-m radius from plot-center. Within each plot, we sampled one point under each of the available lowest-stratum cover types. Cover types were tree (e.g., Abies concolor, Pinus ponderosa, Quercus spp.), seedling (e.g., Pinus ponderosa), shrub (Arctostaphylos spp., Salix spp.), nitrogen-fixing plant (Ceonothus spp., Chamaebatia foliolosa), herbaceous (e.g., Carex spp., Poaceae), and bare soil. When multiple representatives of the same cover type occurred within a plot (which occurred frequently), we sampled under the representative closest to plot-center.
We sampled mineral soil June 8-15, 2017. The organic horizon was removed with a sterile, gloved hand, and a 2-cm diameter soil corer (sterilized with 10% bleach followed by 70% ethanol) was used to sample the top 5 cm of mineral soil, where we expected the greatest impacts of fire, at each point. We did not sample the organic horizon because many of the 4-and 13-y plots did not have this horizon. To collect enough soil from each point (~100 g), we took and composited multiple (~10) cores within a 20-cm diameter area, within a given cover type stratum. Soil samples were placed in a sterile bag, immediately placed on Blue Ice ® (4 °C) and transferred to dry ice (-80 °C) within 3 h [4].

DNA extraction
We extracted total soil DNA (0.25 g of field moist soil) using the MoBio PowerSoil DNA isolation kit (Carlsbad, CA), following the manufacturer's instructions. We quantified DNA yields using the Quant-it PicoGreen dsDNA assay kit (Invitrogen, Carlsbad, CA).

Amplicon sequencing and analysis
Sample libraries were prepared and sequenced at the Department of Energy Joint Genome Institute (Berkeley, CA, USA). For prokaryotes, 16S rRNA genes were amplified in polymerase chain reactions (PCRs) using primers (515F/806R) that target the V4 region of the 16S rRNA gene [5]. For fungi, ITS2 regions were amplified in PCR reactions using ITS9f/ITS4R primers [6]. The PCR reactions contained 10 μl 5 PRIME HotMasterMix (Quantabio, Beverly, MA), 1 μl Roche BSA (10 mg/ml), 0.5 μl each of the forward and reverse primers (10 μM final concentration), 1.0 μl genomic DNA (10 ng/reaction), and nuclease-free water in total volume of 25 μl. Reactions were held at 94 °C for 3 min to denature the DNA, followed by amplification for 30 cycles at 94 °C for 45 s, 50 °C for 60 s, and 72 °C for 90 s; a final extension of 10 min at 72 °C was added to ensure complete amplification.
Each sample was amplified in triplicate, combined, and purified using the Agencourt AMPure XP PCR purification system (Beckman Coulter, Brea, CA).
Amplicons were pooled (10 ng/sample) and sequenced on one lane of the Illumina Miseq platform (Illumina Inc., San Diego, CA), resulting in 300 bp paired-end reads.
Taxonomy was annotated to the SILVA 132 SSU V4-V5 database [8] for 16S, and to the Unite database for ITS (v7) [9] using USEARCH. Amplicon sequence data are deposited in the JGI Genome Portal under project ID 1188685.

Metagenomic sequencing and analysis
We chose four plots randomly from each fire site for shotgun metagenomic sequencing because cost constraints prevented us from sequencing all samples (16 total metagenomes). These metagenome samples were prepared by compositing DNA Raw reads were trimmed and quality filtered using Trimmomatic (v 0.36, [11], and read taxonomy was classified using Kaiju (v 1.6) [12]. Prodigal (v 2.6) [13] was used to predict coding regions from the reads. The translated proteins from all detected coding regions of each metagenome were annotated by searching against carbohydrate active enzymes using the CAZy database [14,15] via DIAMOND BLASTp (options: -k 1 -e 1E-5-sensitive) [16]. Nitrogen (N) cycling, methane production and oxidation, and sulfate reduction genes were annotated via hmmer (v 3.1b2) [17] to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [18]. Gene abundances (gene counts per KEGG Orthology -KO) were normalized to the number of amino acids detected in each metagenome.
Samples were co-assembled using MEGAHIT [19] with a minimum contig length of 1000 bp. Then, each individual sample was mapped back to the MEGAHIT contigs with bbmap [20], and we extracted unmapped reads. Next, unmapped reads were concatenated and re-assembled using SPAdes in the "meta" setting [21]. The newly assembled contig folds were merged with the MEGAHIT contigs. Genome fragments that were larger than 1 kb were clustered into MAGs using MaxBin (v 2.2.5) [22] and MetaBAT2 (v 2.12.1) [23], and MAGs were dereplicated using DASTool (v 1.1.10) [24].
Potential mis-binning was identified with CheckM [25], and bins were further refined to remove contamination following Xue et al. [26]. Genome bin completeness and contamination are reported (Table S6). Each bin was annotated with Kaiju (v 1.6.2) [12] using default parameters, the NCBI nr database, and GTDB-Tk (v 0.3. 2) [27] to classify each contig into a taxonomic rank, from phylum to species. Protein-encoding genes from MAGs were predicted with Prodigal [13], and the resulting nucleotide sequences were searched against the KEGG database [18] reference sequences using DIAMOND BLASTX [16].
Differences in the prokaryotic (16S rRNA gene) and fungal (ITS region) compositions were determined by PerMANOVA [35] using Bray-Curtis distances [36] on proportionally normalized data. We conducted distance-based redundancy analysis (dbRDA) using edaphic data from Dove et al. [3] to assess the variation of the microbial community explained solely by soil physical and chemical properties. Differences in the richness and abundance of ectomycorrhizal and arbuscular mycorrhizal fungi (annotated using FUNGuild) [37] among cover types and with time since fire were determined using ANOVA and linear regression.
Assembly processes were assessed using null-modeling approaches following Stegen et al. [38] and Ning et al. [39]. Assembly processes are broadly categorized as: 1) variable selection, whereby selective processes lead to disparate microbial communities; 2) homogenous selection, whereby selective processes lead to similar microbial communities; 3) dispersal limitation, whereby limitations to dispersal allow ecological drift to lead to disparate microbial communities; and 4) homogenizing dispersal, whereby high rates of dispersal lead to similar microbial communities.
The Stegen et al. [38] approach uses both phylogenetic and taxonomic turnover to classify the dominant assembly process in pairwise sample comparisons. We used the between-community version of the (abundance-weighted) β-mean-nearest taxon distance (βMNTD) to characterize phylogenetic turnover [40]. Observed βMNTDs were then compared to a null model distribution of βMNTDs generated from 999 null model expectations (i.e., randomized taxa reshuffling among phylogenetic tree tips). The difference between observed βMNTD and the mean of the null distribution was measured in units of standard deviation (of the null distribution), and it is referred to as the β-nearest taxon index (βNTI). Values of βNTI ≥ 2 signify variable selection as the dominant assembly process, and βNTI values ≤ -2 signify homogenous selection as the dominant assembly process [38]. Taxonomic turnover was used to define the dominant assembly processes of |βNTI values| < 2. For taxonomic turnover, we use the Raup Crick index [41] modified to include species abundances [38]. We calculated null models using probability-based randomization. Null model microbial compositions were assembled for each sample by randomly sampling from the total OTU pool in proportion to their occupancy and abundance across all samples, maintaining similar levels of alpha diversity. The community composition for each sample was probabilistically generated 999 times. For each iteration of the null model, a pairwise Bray-Curtis dissimilarity matrix was calculated. For each pair of samples, the proportion of null iterations in which the index was smaller than or equal to the observed Bray-Curtis dissimilarity index was our resulting metric. We standardized this metric (RCBray) to range from -1 to 1 by subtracting 0.5 and multiplying by 2 [41]. Values of |βNTI | < 2 and RCBray ≥ 0.95 signify dispersal limitation, and values of |βNTI | < 2 and RCBray ≤ -0.95 signify homogenizing dispersal [38]. Values of |βNTI | < 2 and |RCBray| < 0.95 signify weak selection and moderate levels of dispersal such that no process dominates assembly and are therefore classified as "undominated" [42].
The Ning et al. [39] approach uses the iCAMP R package and is similar to the Stegen et al. [38] approach, except assembly processes are calculated for phylogenetic bins of closely related taxa. For this, default parameters were used. We used the taxa.binphy.big() function to determine optimal minimal phylogenetic difference for a significant phylogenetic signal (0.2) and phylogenetic bin size limit (24) with the environmental parameters soil organic carbon, N, and pH from Dove et al. [3].
Differences in CAZy abundance (including different functional types) and N cycling genes with time since fire were determined by linear regression and ANOVA on data normalized by Prodigal predicted amino acid coding reads [13]. To assess differences in the composition of CAZy genes with time since fire, we followed the same approach as for amplicons except, instead of proportionally normalizing data, data were normalized by predicted amino acid coding reads. We classified MAGs as early-or latesuccessional based on splines of their center log-ratio abundance with time since fire, using sparse principal components analysis [31]. Fig. S1. The relative abundance of the most abundant bacterial and two most abundant fungal phyla across the four fire sites. At the phylum-level, 16S, bacterial metagenomic (MG Bacteria), and metagenome-assembled genome (MAG) read profiles follow similar patterns, but ITS and fungal metagenomic (MG Fungi) read profiles were uncorrelated.   Time since fire (y) EM OTU richness B Fig. S4. Relative dominance of assembly processes for prokaryotes within each time point (variable selection was not detected, A). Correlations between βNTI and pairwise differences in pH (B) and percent soil organic carbon (SOC, C). The orange line represents the best-fit regression. Dashed horizontal lines at βNTI = -2 (homogeneous selection) and βNTI = 2 (variable selection) represent thresholds for assembly processes.