Insufficient sampling constrains our characterization of plant microbiomes

Bullington, Lorinda S.; Lekberg, Ylva; Larkin, Beau G.

doi:10.1038/s41598-021-83153-9

Article
Open access
Published: 11 February 2021

Scientific Reports volume 11, Article number: 3645 (2021)

Abstract

Plants host diverse microbial communities, but there is little consensus on how we sample these communities, and this has unknown consequences. Using root and leaf tissue from showy milkweed (Asclepias speciosa), we compared two common sampling strategies: (1) homogenizing after subsampling (30 mg), and (2) homogenizing bulk tissue before subsampling (30 mg). We targeted bacteria, arbuscular mycorrhizal (AM) fungi and non-AM fungi in roots, and foliar fungal endophytes (FFE) in leaves. We further extracted DNA from all of the leaf tissue collected to determine the extent of undersampling of FFE, and sampled FFE twice across the season using strategy one to assess temporal dynamics. All microbial groups except AM fungi differed in composition between the two sampling strategies. Community overlap increased when rare taxa were removed, but FFE and bacterial communities still differed between strategies, with largely non-overlapping communities within individual plants. Increasing the extraction mass 10 × increased FFE richness ~ 10 ×, confirming the severe undersampling indicated in the sampling comparisons. Still, seasonal patterns in FFEs were apparent, suggesting that strong drivers are identified despite severe undersampling. Our findings highlight that current sampling practices poorly characterize many microbial groups, and increased sampling intensity is necessary to increase reproducibility and to identify subtler patterns in microbial distributions.

Introduction

The human microbiome has been thoroughly studied as an integral component of human health and performance^1,2,3. Similarly, the plant microbiome is critical in understanding the health and ecology of living plants^4,5, and their rate of decay and nutrient turnover after senescence⁶. The plant microbiome is incredibly complex and diverse; just 0.1 g of foliar plant tissue can host over 100 putative fungal species^7,8,9, whereas the richness of arbuscular mycorrhizal fungi (AMF) in roots is considerably lower¹⁰. Knowledge of the spatial heterogeneity and overall richness of plant-associated microorganisms should inform how we sample, as this may influence our understanding of the processes that shape microbial communities and their relationships with other organisms. Currently, there is no consensus on an optimal sampling strategy needed to characterize plant-associated microbial communities. In fact, there is little technical guidance on how common sampling strategies influence our interpretation and analyses of the microbial communities we observe.

Through the use of next-generation sequencing technology, we now know that biases associated with culture-based microbiome surveys leave a large proportion of microbial diversity completely undetected^11,12. Early departures from culture-based methods, like cloning and direct extraction of environmental DNA, greatly improved our understanding of plant-associated microbial communities, but they revealed new biases suggesting that our understanding of microbial communities was still incomplete¹³. Even 454 pyrosequencing often did not provide enough high-quality sequences to adequately characterize microbial communities, resulting in sequencing effort curves that failed to reach an asymptote^14,15. Deeper, higher quality sequencing technology now allows researchers to see a more comprehensive picture of plant-associated microbial communities than previous methods. Thus, while sequencing effort for individual samples is no longer a bottleneck in characterizing plant-associated microbial communities, sampling effort may very well be.

Many advances have been made to reduce biases when processing microbial community data^16,17,18, but descriptions of sampling effort or strategy are often omitted or vague, with little or no justification of methods used. Site descriptions, including information about the developmental stage or uniformity of plants to be sampled, are often too cursory. In a recent review, Dickie et al. (2018)¹⁹ found that 95% of metabarcoding studies examined reported inappropriate or incomplete field or sampling methods that rendered them non-reproducible. In addition to unclear methods, many studies fail to report the robustness of their sampling efforts (e.g., through species accumulation curves that display the number of species recovered for each additional sample), despite the inherent consequences of undersampling. Indeed, undersampling of microbial communities appears to have become the rule, rather than the exception. One reason for this is that time and budget constraints limit the amount of lab work that can be performed, and often it is inappropriate or impossible to destructively sample whole plants. Most of the time researchers only sample small quantities (mg) of plant tissue representing a tiny fraction of the plant’s total biomass (often < 1%). Microbial communities observed in these samples are then used to make inferences on the larger population of colonizing microbes. In uncontrolled field studies where there are countless environmental drivers, this incomplete sampling may obscure subtle underlying patterns in microbial distributions²⁰, which in turn could compromise our estimates about their diversity and the degree to which sites, treatments and individual hosts truly differ.

Not only do we sample small proportions of total plant biomass, but researchers often differ in how these samples are collected and processed. To provide guidance for sampling plant-associated microbial groups, we compared two of the more common sampling strategies seen in the literature to see if they differ in richness and compositional estimates and if this depends on the microorganism targeted. The first sampling strategy involves collecting a specific surface area or volume of plant tissue (e.g. leaf discs or lengths of root segments) followed by tissue homogenization (e.g.,^14,21,22). In this strategy, the spatial extent of available plant tissue is somewhat maintained, but many taxa may be missed if their distributions are patchy, which in turn may lead to increased variance in richness and composition among samples. From here on we will refer to this strategy as “homogenizing tissue after subsampling” (Fig. 1A). The second common sampling strategy involves either collecting a pre-specified amount of tissue from each plant (e.g., six leaves or root segments per plant) or collecting plant tissue somewhat haphazardly without much standardization among plant samples. Then samples are ground prior to collection of a standardized, homogenized subsample (e.g.,^9,23). We will refer to this method as “homogenizing tissue before subsampling” (Fig. 1B). A criticism of this method is that differences in initial sample size are usually not accounted for (e.g., variation in total leaf area or root length). In this case, the size of the homogenized subsample is standardized, but not the amount of tissue initially homogenized. When the amount of plant tissue initially homogenized is inconsistent among samples, we do not know if the differences in microbial communities result from variation in the amount of plant material collected, or from true differences among the host plants. 
Differences in the detected species richness could thus be biased towards larger plant samples, or larger plants, rather than revealing true differences in richness and composition. Despite this criticism, the homogenizing tissue before subsampling method is relied on because it is thought that the homogenized pool from a relatively large initial sample may yield a more comprehensive subsample of the microbial community present within the plant. It is often assumed that any subsample from the powdered tissue, regardless of sampling approach used, will yield the same or a similar microbial community. To our knowledge, these assumptions have not been tested.

Because the optimal strategy may depend on the microorganism targeted, we tested these two strategies by extracting and amplifying bacterial DNA (16S rRNA gene), general fungal DNA (internal transcribed spacer region 2, hereafter ITS2) and arbuscular mycorrhizal fungal DNA (18S rRNA gene), from roots (all) and/or leaves (ITS2 only) of naturally occurring, mature, showy milkweed (Asclepias speciosa). We chose showy milkweed as a previous study has shown high colonization of both AM and general fungi²⁴. Due to the highly diverse nature of foliar fungal endophytes (FFE) in general¹⁴, we extracted DNA from many additional subsamples per plant to assess the extent of undersampling within this group. Finally, we sampled leaves twice across the season to assess if broad-scale seasonal differences in FFE were still detectable despite potential undersampling, in order to better understand when—and for which type of questions—insufficient sampling is a problem. We predicted that (1) homogenizing before subsampling (strategy 2) would yield richer, more even microbial communities among plants, because more of the plant tissue would be initially homogenized, (2) different sampling strategies would result in different microbial community structures due to the spatial heterogeneity of microbes within plants, and (3) differences between sampling strategies would depend on the organism targeted and tissue sampled (roots or leaves), potentially due to inherent differences in global and likely local richness, as well as microbial modes of dispersal (above vs. belowground).

Methods

Field collections and sampling strategies

All collections occurred at the Teller Wildlife Refuge in western Montana, USA (46.3219 N 114.1292 W). The Teller Wildlife Refuge is a 525-ha property that comprises riparian, floodplain, wetland, and upland habitat. Showy milkweed grows in numerous patches across the property, concentrated near irrigation ditches, on the floodplain, and in old fields where sub-irrigation provides enough water. Plants were selected from a 500 m² area of a field with a patchy distribution. All plants sampled were at least 3 m apart, and at each collection we sampled plants of similar size and developmental stage. During our first collection, we collected 6 individual leaves each from 20 showy milkweed plants on May 18th, 2016 before flowering, just as buds were beginning to form. We sampled from the same population again on September 6th, 2016 when flowers were present and seed pods were almost ripe. During the September collection we also collected as much of the root system as possible within the upper 20 cm of soil by destructively sampling each plant. Roots were washed with tap water to remove all visible soil, and fine roots (< 1 mm in diameter) were retained for further processing. All tissue was kept on ice upon collection, and then stored at − 20 °C. Bulk root collections weighed between 0.3 and 0.8 g dry weight, per plant. Bulk leaf collections weighed between 1.2 and 2.5 g dry weight per plant. To remove microbial tissues from root and leaf exterior and characterize only endophytic microbes, all root and leaf tissue was surface sterilized in 70% ethanol for 1 min., 0.5% NaOCl for 1 min. and then rinsed 2 × in sterile water. Imprints of sterilized tissue were made on growth medium to confirm the sterilization procedure. The absence of growth observed indicated successful surface sterilization. 
This process may not completely remove all non-viable surface DNA; however, the combination of rinsing and immersion of tissue in NaOCl and ethanol is effective at eliminating epiphyte tissue so that only residual DNA should remain²⁵. After sterilization, leaves and roots from the September collection were separated into two groups based on the two sampling strategies we chose to compare.

For Strategy 1, homogenizing after subsampling, we removed one leaf disc from 6 leaves per plant (at random locations) using an 11.5 mm diameter cork borer and placed the six discs per plant into single 1.2 ml Eppendorf tubes. We also used the same strategy for May leaf collections. This resulted in a total surface area of approximately 62 mm² per plant, with a dry weight of approximately 30.0 mg of leaf tissue. For roots, approximately 30.0 mg of dry root segments per plant was placed into individual 1.2 ml Eppendorf tubes. As many DNA extraction kits recommend a maximum volume of approximately 20–30.0 mg dry weight or 50.0 mg fresh weight, we have found this to be a commonly used standard (e.g.,^26,27,28,29,30); however, many studies lack sufficient detail on final extraction volumes. Subsampled root and leaf tissues were then freeze-dried using a Labconco Freezone benchtop freeze-dry system (Labconco, Kansas City, MO, USA). Subsampled tissue was homogenized using a 1600 MiniG tissue homogenizer and cell lyser (Spex SamplePrep, Metuchen, NJ, USA).

For Strategy 2, homogenizing before subsampling, all remaining root and leaf tissue from the September collection was subsequently freeze-dried and homogenized for each plant using the freeze-dry system and MiniG as described above. Approximately 30.0 mg dry weight per plant was then subsampled from the homogenized pool of root or leaf tissue and placed in 1.2 ml Eppendorf tubes for DNA extraction. All remaining foliar tissue from five plants collected in September (> 40 × more tissue per plant, equaling 1241–2467 mg per plant) was then divided into approximately 250 mg replicates for additional DNA extractions to assess the extent of undersampling.

DNA extractions and PCR

DNA extractions of 20 foliar samples collected in May, as well as 40 foliar and 40 root samples representing the two different sampling strategies in roots and leaves in September (100 total extractions), were performed using the MO BIO PowerPlant Pro-htp DNA isolation kit, which has been shown to be comparable to other kits for extracting microbial DNA³¹. For five of the plants collected in September, we extracted DNA from all the remaining foliar tissue of the six leaves collected (1241–2467 mg dry weight per plant, compared to the 15–30 mg commonly used to infer differences in microbial community composition). This increased the amount of tissue analyzed per plant by > 40 ×. The number of additional extractions depended on the size of the leaves and ranged from 5 to 10 extractions per plant. These additional bulk material extractions on larger volumes were performed using the DNeasy Plant Maxi Kit (Qiagen, Hilden, Germany), also per manufacturer recommendations. To monitor potential background or cross contamination among samples, we also included control extractions and PCRs for all microbial groups and extraction kits used.

After extracting DNA, all samples were prepared for Illumina sequencing through a two-step PCR amplification. For root tissue we amplified the 18S rRNA gene, the internal transcribed spacer 2 (ITS2), and the 16S rRNA gene regions for AMF, general fungi, and bacteria respectively. In leaf tissue only the ITS2 general fungal region was targeted. Primer pairs for PCR1 included AMF specific primers WANDA³² and AML2³³. For the ITS2 region we used a mixture of the fungal specific forward primers fITS7 and ITS7o^34,35 and the general eukaryotic reverse primer ITS4³⁶. AMF specific primers were used in addition to general fungal primers because ITS2 primers may not allow for a thorough characterization of AMF communities due to poor amplification of AMF when other fungi co-occur³⁷. For the 16S rRNA gene, we targeted the V4 region using primers 515F and 806R³⁸. More detailed descriptions of PCR reactions are in Bullington et al. (2018)³⁹ and Lekberg et al. (2018)³⁷. Briefly, in PCR1, each primer was flanked by 22 bp Fluidigm universal tags CS1 or CS2 (Fluidigm Inc. San Francisco, CA, USA). Each reaction contained a total volume of 12.5 μl which included 1.0 μl of DNA extract as template, 2.5 pmol of each primer in 1 × GoTaq Green Master Mix [(Green GoTaq Reaction Buffer, 200 μM dATP, 200 μM dGTP, 200 μM dCTP, 200 μM dTTP and 1.5 mM MgCl₂) Promega, USA]. Reactions were performed in duplicate or triplicate, depending on concentration of amplified product (number of reactions was kept consistent within each microbial group), in a Techne TC-4000 thermocycler (Bibby Scientific, Burlington, USA) under the following conditions: initial denaturation at 95 °C for 2 min followed by 35 cycles at 95 °C for 1 min, 54 °C (18S rRNA gene), 57 °C (ITS2) or 50 °C (16S rRNA gene) for 1 min, and 72 °C for 1 min, with a final elongation for 10 min at 72 °C. 
To confirm the presence of our target amplicons, all reactions were analyzed by 1.5% agarose gel electrophoresis using a 100 bp ladder (GeneRuler DNA Ladder, Thermo Scientific, USA) as a size standard. As a precaution, control samples were sequenced even if they did not produce a band during gel electrophoresis.

All amplicons generated during PCR1 were diluted 1:10 for use as template in PCR2. PCR2 primer complexes consisted of the same Fluidigm tags (CS1 or CS2) as PCR1 primers, 8 bp Illumina Nextera barcodes (Illumina Inc., San Diego, CA, USA), and Illumina adapters. To minimize index hopping, we used unique dual indexing pooling combinations⁴⁰, stored libraries individually, pooled only immediately before sequencing, and removed free adaptors from our libraries. PCR2 amplicons were purified using AMPure XP beads (Beckman Coulter Genomics, USA), quantified by Qubit 2.0 fluorometer (Invitrogen, USA) and pooled in equimolar concentration prior to sequencing. Sequencing was done at the Institute for Bioinformatics and Evolutionary Studies (IBEST) genomics resources core at the University of Idaho (http://www.ibest.uidaho.edu; Moscow, ID, USA). Amplicon libraries were sequenced using ¼ of a 2 × 300 paired-end (PE) run on an Illumina MiSeq sequencing platform (Illumina Inc., San Diego, CA, USA).

Bioinformatics

Processing of raw sequence data was performed using “Quantitative Insights Into Microbial Ecology 2” (QIIME2 version 2018.4; https://qiime2.org/)¹⁸. Sequence reads were first demultiplexed using the q2-demux plugin (https://github.com/qiime2/q2-demux). Only forward reads were used for the 18S rRNA gene region, as the overlap between forward and reverse reads is too short to merge the two without significant sequence loss. For the 18S rRNA gene only, forward reads were trimmed to 210 bp, which covers the informative region of our 18S rRNA gene target³³. For ITS2 and the 16S rRNA gene, forward and reverse reads were trimmed where median quality score fell below 30, and if at any point quality score fell below 3 within the trimmed region, those sequences were removed from further analysis. The q2-dada2 plugin uses nucleotide quality scores to produce sequence variants (SVs), or sequence clusters with 100% similarity representing the estimated true biological variation within each sample. Although sequences are clustered at 100% similarity as opposed to the traditional 97% similarity, DADA2 produces fewer spurious sequences, fewer clusters, and results in a more accurate representation of the true biological variation present⁴¹. After DADA2 processing, microbial groups contained an average of 9000–17,000 sequences per sample (Supplementary Information, Table S1). All extraction and PCR controls were clean except for bacteria. Bacterial contaminants were subsequently removed from all samples to reduce potential background contamination. The database MaarjAM⁴² was used to assign taxonomy and remove non-target DNA for AMF, UNITE was used for general fungi (http://unite.ut.ee)⁴³; and Greengenes was used for bacteria (http://greengenes.lbl.gov). All SVs that did not match with at least 70% identity (for bacteria) or 90% identity (for fungi) with at least 70% coverage to sequences within one of the above databases were removed. 
To help remove non-target DNA, we added many sequences representing non-target organisms (including many Asclepias spp.) to both our AMF and ITS databases to better identify contaminants and reduce misclassification when assigning taxonomy to these sequences. Taxonomy for each microbial group was then assigned using QIIME2 q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a naive Bayes machine-learning classifier which has been shown to meet or exceed classification accuracy of existing methods⁴⁴, setting a confidence threshold of 0.94 for fungi and 0.7 for bacteria. For 16S rRNA data, sequences identified as chloroplast or mitochondrial DNA were also removed, which resulted in the removal of > 90% of bacterial sequences (Supplementary Table S1). For non-AM root fungi, all Glomeromycota were removed in order to analyze AMF and non-AM root fungi separately.
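The read-trimming rule described above for ITS2 and 16S rRNA gene reads (truncate where the median quality score falls below 30, then discard reads with any score below 3 in the retained region) can be illustrated with a minimal sketch. This is not how q2-dada2 works internally; the function names and the toy quality scores are hypothetical and serve only to make the rule concrete.

```python
from statistics import median


def truncation_position(quality_by_read, min_median=30):
    """Return how many leading positions to keep.

    The cut is placed at the first position where the median quality
    score across reads drops below `min_median`.
    """
    longest = max(len(q) for q in quality_by_read)
    for pos in range(longest):
        scores = [q[pos] for q in quality_by_read if len(q) > pos]
        if median(scores) < min_median:
            return pos
    return longest


def reads_to_keep(quality_by_read, trunc_len, min_score=3):
    """Return indices of reads that pass the per-base filter.

    A read is kept if it is at least `trunc_len` long and no score in
    its first `trunc_len` positions falls below `min_score`.
    """
    kept = []
    for i, q in enumerate(quality_by_read):
        if len(q) >= trunc_len and all(s >= min_score for s in q[:trunc_len]):
            kept.append(i)
    return kept


# Toy per-base quality scores for three reads (hypothetical values).
toy_reads = [
    [38, 37, 36, 20, 10],
    [38, 38, 35, 31, 12],
    [37, 36, 2, 30, 11],
]
cut = truncation_position(toy_reads)
kept = reads_to_keep(toy_reads, cut)
```

In this toy example the third read is dropped because one base inside the retained region has a score below 3, even though the median at that position is high.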

Statistical analyses

All statistical analyses associated with microbial community richness and composition in roots and shoots of A. speciosa were conducted in R⁴⁵ using the vegan package⁴⁶, except where otherwise noted. All analyses were based on rarefied data (160, 7820, 634 and 1400 sequences for bacteria, AMF, non-AM root fungi and FFE, respectively) using the ‘rrarefy’ function. Samples with few sequences were removed from further analyses to allow for greater sequencing depth, which resulted in 36, 36, 34 and 36 samples for bacteria, AMF, non-AM root fungi and FFE, respectively. Analyses of ITS2 data when extracting DNA from all remaining foliar tissue were based on a rarefaction level of 8475 sequences. These sampling depths were chosen based on saturation of sequencing effort curves of all reads after the removal of non-target DNA (Fig. 2) and in an effort to retain the most samples (n = 18 per sampling strategy for bacteria, AMF and foliar fungi and 17 for non-AM root fungi). Sequencing effort curves were produced using the iNEXT package⁴⁷. To test how sampling strategy influenced microbial alpha diversity metrics, we calculated richness as the number of SVs in each sample as well as Pielou’s ‘J’ evenness, which describes the similarity of species frequencies. To compare diversity metrics (based on SVs) between the two sampling strategies we performed a Wilcoxon signed-rank test on all paired values. To ensure that potential differences in sampling strategies were not due to artifactual SVs, we also compared results when implementing LULU, an algorithm for post-clustering curation that clustered SVs at 98.5% similarity and a minimum relative co-occurrence of 0.9⁴⁸.
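As an illustration of the alpha diversity steps above, the following is a minimal Python sketch of rarefying one sample to a fixed depth and then computing richness and Pielou's J (Shannon diversity divided by the natural log of richness). The analyses in this study were run in R with vegan; the function names, the fixed random seed and the toy counts here are assumptions made for the example only.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` sequences without replacement from {SV: count}."""
    pool = [sv for sv, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    rng = random.Random(seed)
    rarefied = {}
    for sv in rng.sample(pool, depth):
        rarefied[sv] = rarefied.get(sv, 0) + 1
    return rarefied


def richness(counts):
    """Number of SVs with at least one sequence."""
    return sum(1 for n in counts.values() if n > 0)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S); undefined (NaN) when fewer than two SVs."""
    s = richness(counts)
    if s < 2:
        return float("nan")
    total = sum(counts.values())
    shannon = -sum(
        (n / total) * math.log(n / total) for n in counts.values() if n > 0
    )
    return shannon / math.log(s)


# Toy sample (hypothetical counts), rarefied to 40 sequences.
toy_sample = {"sv1": 50, "sv2": 30, "sv3": 20}
toy_rarefied = rarefy(toy_sample, 40)
```

Because rarefaction is random, richness and evenness of the rarefied sample vary with the seed; a perfectly even community gives J = 1.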

We performed non-metric multidimensional scaling (NMDS) to evaluate community structures for each sampling strategy and target region, individually. Each NMDS analysis was performed using the ‘metaMDS’ function and stress for all plots was between 0.04 and 0.16. These analyses were performed on Bray–Curtis distances of Hellinger transformed sequence abundances as well as Raup–Crick distances of presence/absence data. The ‘procrustes’ function in vegan was used to assess similarity of patterns produced in the NMDS analyses for the paired sampling strategies and congruency was visualized in Procrustes plots. The ‘protest’ function was used with 1000 permutations to estimate the significance of the Procrustes statistic. We performed analyses on both presence/absence and abundance data to determine how low-abundant SVs influenced the differences between sampling strategies.
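To make the abundance-based distance concrete, the sketch below applies a Hellinger transformation (square root of relative abundance) to two samples and computes the Bray–Curtis dissimilarity between them. It is a plain-Python illustration under standard definitions of both quantities, not the vegan code used in the study; the function names are hypothetical.

```python
import math


def hellinger(counts):
    """Square root of each SV's relative abundance within one sample."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no sequences")
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x - y| / sum(x + y)."""
    denominator = sum(a + b for a, b in zip(x, y))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


# Two toy samples over the same three SVs (hypothetical counts).
sample_a = hellinger([8, 2, 0])
sample_b = hellinger([0, 2, 8])
distance = bray_curtis(sample_a, sample_b)
```

The dissimilarity is 0 for identical samples and 1 for samples that share no SVs. The square-root step reduces the weight of the most abundant SVs relative to raw counts.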

To determine variation in microbial communities among individual plants when we extracted from multiple subsamples, we performed NMDS analyses as well as a PERMANOVA using the ‘adonis’ function. We also performed a PERMANOVA to detect seasonal differences between May and September leaf-disc collections. Figures 2–6 were generated using ggplot2⁴⁹.

Results

Richness and evenness between sampling strategies

Contrary to our predictions, richness differed only for root bacteria where homogenizing before subsampling resulted in more SVs recovered than homogenizing after subsampling (p = 0.04, Supplementary Table S2 and Fig. S1). Evenness did not differ for any microbial group. In addition, neither sampling strategy produced saturated species accumulation curves for any microbial community sampled, although AMF sampling approached saturation (Supplementary Fig. S2). This indicated inadequate sampling to characterize site richness, due to species turnover among plants. However, sequencing effort curves did saturate for all groups, indicating that sequencing depth was not a limiting factor in estimating the richness in individual samples (Fig. 2).

The most abundant SVs (those occurring in at least half of our samples and recovered by both sampling strategies) represented only a small proportion of the microbial communities recovered. This included 5.5% of all AMF SVs (within the genera Glomus and Claroideoglomus), while just 0.7% of total bacterial SVs (Bacillus, unknown bacteria) and 0.9% of non-AM fungal SVs (Nectriaceae, Plectosphaerella, Tetracladium) fit these criteria. Foliar fungi were the most SV-rich group (Fig. 3), and only 0.8% of total SVs (Mycosphaerella) occurred in at least half of all samples and were recovered by both sampling strategies.

Community overlap between sampling strategies

We examined the community overlap, that is, the total number and identity of individual SVs irrespective of their relative abundance, that were recovered from the milkweed population using both sampling strategies. AMF had the lowest total SV richness and the highest community overlap at 61% (Fig. 3). Even when rarefying AMF at 700 sequences as opposed to 7820 sequences, to more closely match rarefaction levels of other groups, overlap was maintained at 61% between sampling strategies. This indicated that it was not the higher rarefaction numbers that caused the greater overlap observed in this group. Sequencing rarefaction curves also showed sufficient sequence numbers to capture a majority of sample richness, even at the lower rarefaction levels. Bacteria and non-AM fungi in roots had a moderate overlap (34% each) and foliar fungal endophytes had the highest SV richness and the least overlap with just 10% of total SVs recovered by both sampling strategies.
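The overlap percentages above can be read as the share of all distinct SVs that were recovered by both strategies. The sketch below assumes that definition (shared SVs divided by the total number of distinct SVs across the two strategies, ignoring abundance); the function name and toy SV labels are hypothetical.

```python
def sv_overlap_percent(strategy_one_svs, strategy_two_svs):
    """Percent of all distinct SVs that were recovered by both strategies."""
    first, second = set(strategy_one_svs), set(strategy_two_svs)
    all_svs = first | second
    if not all_svs:
        return 0.0
    return 100 * len(first & second) / len(all_svs)


# Toy presence lists (hypothetical SV labels).
after_subsampling = ["sv1", "sv2", "sv3"]
before_subsampling = ["sv2", "sv3", "sv4"]
overlap = sv_overlap_percent(after_subsampling, before_subsampling)
```

The same function can be applied to the SVs pooled across a whole population or to the two samples from a single plant.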

To explore the extent to which low-abundant SVs influenced the differences between each sampling strategy, we gradually removed SVs that were represented by < 0.01% and then < 0.05% of total sequences (Supplementary Table S1). This resulted in the removal of SVs represented by fewer than 16, 2, 40, and 47 sequences (< 0.01%); and 78, 11, 201, and 235 sequences (< 0.05%), for non-AM root fungi, bacteria, FFE and AMF, respectively. At each removal step, the overlap of microbial communities gradually increased (Table 1). After removing SVs that were represented by < 0.05% of all sequences, the AMF community overlapped 92% between the two sampling strategies. Non-AM fungi in roots overlapped 63%. Bacterial and FFE SV composition, however, remained more different than alike with just a 47% and 35% overlap, respectively. Since foliar fungi showed the greatest differences, we then compared SV overlap after clustering at 98.5% similarity and 90% co-occurrence. This coarser clustering did not increase the overlap between sampling strategies (7% as opposed to 10%), suggesting that at least for this group, the fact that we chose to cluster at 100% similarity using DADA2 did not explain the general lack of overlap.
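The stepwise removal of low-abundant SVs can be sketched as a simple threshold on each SV's share of all sequences in a microbial group. The example below assumes the cutoff is a percentage of the summed sequence count across all SVs in that group; the function name and toy totals are hypothetical.

```python
def drop_rare_svs(sv_totals, min_percent):
    """Keep SVs whose sequence total is at least `min_percent` of all sequences."""
    grand_total = sum(sv_totals.values())
    cutoff = grand_total * min_percent / 100
    return {sv: n for sv, n in sv_totals.items() if n >= cutoff}


# Toy sequence totals per SV for one microbial group (hypothetical values).
toy_totals = {"sv1": 9000, "sv2": 990, "sv3": 9, "sv4": 1}
kept_at_005 = drop_rare_svs(toy_totals, 0.05)  # cutoff of 5 sequences
kept_at_05 = drop_rare_svs(toy_totals, 0.5)    # cutoff of 50 sequences
```

Overlap between strategies can then be recalculated using only the SVs that survive each threshold.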

Table 1 Total number and overlap of SVs when gradually filtering low abundant SVs of microbial communities recovered from milkweed leaves and roots using two different subsampling strategies (subsampling before homogenization and subsampling after homogenization of tissue).

Within individual plants, the number of SVs overlapping in both sampling strategies was highly variable, particularly for all non-AM fungal communities (Table 2). For foliar fungal endophytes, the percentage SV overlap within a single plant ranged between 0 and 36%. Similarly, for root bacteria, the percentage SV overlap ranged from 0 to 33% among individual plants. Three A. speciosa individuals yielded completely different communities between sampling strategies (0% overlap in SVs) for foliar fungi. Root bacteria and non-AM root fungi each had one plant with non-overlapping communities between sampling strategies (Supplementary Fig. S3). Three and four additional plants yielded just a single common SV between the two sampling strategies for FFE and non-AM fungi, respectively, whereas 2 plants had only 1 overlapping SV in root bacteria.

Table 2 Mean and standard deviation of the percent of bacterial and fungal sequence variants (SVs) recovered from individual milkweed plants using two different subsampling strategies (subsampling before homogenization and subsampling after homogenization of tissue).

Procrustes analyses reveal structural differences between sampling strategies

When we ran Procrustes analyses on bacterial and fungal groups using presence/absence data, we saw significant differences between sampling strategies for all groups except for AMF (Table 3). When considering the relative abundances of all SVs, as opposed to basing analyses on presence/absence, as we did above, the two sampling strategies still recovered significantly different microbial communities in root bacteria and in non-AM root fungi (Fig. 4, Table 3). For FFE, the two sampling strategies produced slightly more correlated community structures, but the similarity was weak (Table 3). This is likely because some plants had very similar FFE communities, while others had completely different FFE communities (Supplementary Fig. S3). Only AMF communities were consistently similar between sampling strategies when considering SV abundance.

Table 3 Results of Procrustes analyses comparing two sampling strategies used to sample microorganisms colonizing A. speciosa roots and foliar tissue. Procrustes analyses were performed on Bray–Curtis distances of Hellinger transformed SV (sequence variant) relative abundances as well as on Raup–Crick distances of presence/absence data. Significant values are highlighted in bold. M² values represent the sum of squared deviations between sample pairs, where lower values mean a better fit between matrices.

Multiple extractions per plant revealed severe undersampling of FFE

From the six leaves collected from plants in September 2016, we extracted DNA from all remaining leaf tissue (1240–2460 mg per plant). Despite increasing the processed plant tissue by two orders of magnitude, species accumulation curves for individual plants still failed to approach a plateau (Fig. 5). This was true even when clustering SVs at 98.5% using the LULU method. Post-clustering curation using the LULU method removed fewer than 10% of SVs and did not change the overall results. Where extractions from subsamples of 30 mg of leaf tissue (6 discs from 6 different leaves) had recovered 7–26 total foliar fungal SVs per plant, extracting from tissue representing our entire bulk collection of 6 leaves per plant or 10 × as much leaf tissue, resulted in 76–142 total SVs per plant. The average richness recovered by each extraction was 24 ± 8 SVs and the average number of unique SVs added by each consecutive extraction was 12 ± 6 SVs. Extractions from the same plant overlapped as little as 21% of SVs, showing that extractions from the same homogenized plant tissue can vary substantially, even when extracting from much larger volumes. Perhaps not surprisingly, larger leaves contained more FFE SVs than did smaller leaves, indicating that overall richness increases with plant size.
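The accumulation of SVs across consecutive extractions from one plant can be sketched as a running union of SV sets. This simplified version follows a single extraction order, whereas species accumulation curves are typically averaged over many random orderings; the function name and toy data are hypothetical.

```python
def accumulate_svs(extractions):
    """Return cumulative SV richness and newly added SVs per extraction."""
    seen = set()
    cumulative, newly_added = [], []
    for svs in extractions:
        fresh = set(svs) - seen
        seen |= fresh
        newly_added.append(len(fresh))
        cumulative.append(len(seen))
    return cumulative, newly_added


# Toy SV lists from three consecutive extractions of one plant (hypothetical).
toy_extractions = [["sv1", "sv2"], ["sv2", "sv3"], ["sv3"]]
cumulative, newly_added = accumulate_svs(toy_extractions)
```

A curve that keeps rising with each extraction, as reported above, means that each additional extraction is still adding SVs not seen before.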

Seasonal variation visible despite undersampling, but subtler differences were masked

With 30 mg subsampled per plant we were able to see broad seasonal differences between May and September (Fig. 6A). However, this approach did not allow us to differentiate among plants that harbored more similar communities. This became apparent when analyzing multiple replicates per plant (Fig. 6B). When we visualized the communities recovered in each additional extraction, we observed significant variation among plants that was not apparent when we only had community data from single 30 mg extractions. With the additional extractions we observed strong host filtering among individual plants (PERMANOVA, R² = 0.91, P < 0.001). Subsequent pairwise analyses revealed that the fungal community within individual plants varied significantly from all other plants (Pairwise PERMANOVA, P < 0.01), and that three of the plants appeared much more similar to each other than the other two plants, potentially due to factors associated with their spatial distribution within the site.
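
To make the R² statistic concrete, the sketch below computes a PERMANOVA-style R² from a distance matrix by partitioning sums of squared distances, with a simple label-permutation test. It is an illustration under stated assumptions, not the analysis code used here: the one-dimensional "positions" and plant labels are hypothetical, and the function names are invented for this example.

```python
import random

def permanova_r2(dist, groups):
    """R^2 = 1 - SS_within / SS_total, computed from a square distance matrix."""
    n = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in set(groups):
        idx = [i for i in range(n) if groups[i] == g]
        ss = sum(dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:])
        ss_within += ss / len(idx)
    return 1.0 - ss_within / ss_total

def permutation_p(dist, groups, n_perm=999, seed=1):
    """Share of label permutations with R^2 at least as large as observed.

    With a fixed number of groups, ranking by R^2 is equivalent to ranking by pseudo-F.
    """
    rng = random.Random(seed)
    observed = permanova_r2(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if permanova_r2(dist, labels) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical example: two replicate extractions from each of two plants
positions = [0.0, 1.0, 10.0, 11.0]
groups = ["plant_A", "plant_A", "plant_B", "plant_B"]
dist = [[abs(a - b) for b in positions] for a in positions]

r2 = permanova_r2(dist, groups)
print(round(r2, 3))  # 0.99
print(permutation_p(dist, groups, n_perm=99))
```

An R² near 1 means that almost all variation in community composition lies among plants rather than among replicate extractions within a plant.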

Discussion

Different sampling strategies yield different microbial communities

The sampling strategies compared in this study (homogenizing tissue before subsampling and homogenizing tissue after subsampling) are common methods found in the literature for characterizing plant-associated microbial communities^14,23,26,29. Procrustes analyses and community overlap between sampling strategies demonstrated that different strategies can capture disparate microbial communities within plants, with the extent of these differences depending on the community targeted and plant tissue type sampled. In FFE as well as bacterial and non-AM fungal communities in roots, subsamples from the same plant resulted in completely different sets of species recovered, illustrating the severe undersampling that is inherent to each of these strategies. With these sampling strategies, we are undoubtedly sacrificing power and accuracy to characterize the subtler aspects of plant microbiome interactions, despite often seeing community differences across landscapes, treatments or seasons.

Richness was higher when homogenizing before subsampling for bacteria only, despite differences observed in composition for all groups. It is perhaps surprising that homogenizing plant tissues before subsampling did not recover more species than homogenizing after subsampling for fungi as well, because with the former approach, more plant tissue is initially represented. Indeed, a previous study showed that sample pooling or homogenizing before subsampling resulted in a higher richness of soil fungi compared to equally sized individual samples⁵⁰. Song et al. (2015)⁵⁰ also found that multiple individual subsamples, rather than the single homogenized subsample, resulted in higher richness. This may suggest that the scale at which we are physically able to break down the particle size of plant tissues, as opposed to soil, is not always fine enough to sufficiently homogenize the fungi within. Because of this, plant-associated microbial communities may require a greater sampling effort than soil microbes. Additionally, the removal of low-abundance SVs did not result in differences in richness between the two sampling strategies for any microbial group, suggesting that neither strategy is better at capturing rare species. Although this study was performed only on milkweed plants, we believe that these results are applicable to other plant species as well. The richness reported here is similar to other studies of plant-associated microbes (e.g.,^51,52), indicating that differences in subsamples were not due to extreme richness of milkweed-associated microbes.

Microbial diversity should inform sampling effort

The higher congruency that we saw between sampling strategies for AMF compared to other microbial communities may be due to the differences in their local and global estimated richness. While the global number of AMF species has been estimated in the hundreds to low thousands^42,53, global estimates of fungal species in general range in the millions^54,55. A recent global estimate of bacterial richness suggests similar scales⁵⁶. In this study specifically, AMF had the lowest total SV richness and the greatest similarity between sampling strategies, while foliar fungal endophytes had the highest SV richness, and the lowest overlap of SVs between strategies. Since the amount of tissue sampled was equal for all microbial communities, the sampling effort was likely much higher for AMF (relative to the whole AMF community), than it was for bacteria and non-AM fungi. Consequently, with each sample we are likely sampling a much larger proportion of true AMF species richness.

Even though the estimated total community richness was highest for foliar fungi, the average estimated richness per individual plant was highest for AMF. This suggests that similar AMF SVs re-occurred across all plants with low species turnover. On the other hand, fungi in leaves had lower average richness per plant (Fig. 4, Supplementary Fig. S3), but the highest total richness, meaning that there was higher turnover of FFE species among plants sampled. These results may be a direct reflection of the overall community richness of the different microbial groups as well as their ability to spread and co-occur within plants. Based on these patterns, more individual plants and a greater sampling effort within individuals are likely needed to characterize FFE communities compared to AMF communities.
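
The contrast between per-plant and total richness can be illustrated with a short Python sketch. It reports mean per-plant richness, total richness, and their ratio (Whittaker's multiplicative beta diversity), which is used here only as a convenient turnover index and is an assumption of this example, not a statistic reported in the study. The SV sets are hypothetical.

```python
def richness_summary(plants):
    """Mean per-plant richness, total richness, and their ratio (total / mean)."""
    per_plant = [len(set(svs)) for svs in plants.values()]
    alpha = sum(per_plant) / len(per_plant)
    gamma = len(set().union(*plants.values()))
    return alpha, gamma, gamma / alpha

# Hypothetical low-turnover group: most SVs recur in every plant
low_turnover = {
    "p1": {"a", "b", "c", "d"},
    "p2": {"a", "b", "c", "e"},
    "p3": {"a", "b", "d", "e"},
}

# Hypothetical high-turnover group: few SVs per plant, little sharing
high_turnover = {
    "p1": {"f1", "f2"},
    "p2": {"f3", "f4"},
    "p3": {"f5", "f1"},
}

print(richness_summary(low_turnover))   # (4.0, 5, 1.25)
print(richness_summary(high_turnover))  # (2.0, 5, 2.5)
```

Both hypothetical groups have the same total richness, but the second needs more plants to be sampled before that total is approached.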

Rare SVs contribute to variation among subsamples

Our results show that rare, low-abundance SVs largely accounted for the differences seen between sampling strategies. Even AMF communities, which were already similar, increased in overlap by 50% between strategies after low-abundance SVs (represented by < 0.05% of sequences, Table 1) were removed. Microbial community distributions are often characterized by long tails of low-abundance species¹⁵, and as such, the likelihood of resampling rare species in each replicate can be low. In one study, Zhou et al. (2011)⁵⁷ randomly sampled a simulated community with an exponential distribution. They observed only a 53% overlap between two samples when sampling just 1% of that community. We see even more extreme differences in overlap in this study, where initial sampling effort is also low relative to the whole microbial community.
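
The kind of simulation described above can be sketched as follows. This is not a reproduction of the Zhou et al. design: the number of species, the geometric decay rate, the community size, sampling with replacement, and the overlap definition (shared species divided by all species detected in either sample) are all assumptions chosen for illustration.

```python
import random

def simulate_overlap(n_species=1000, decay=0.99, community_size=1_000_000,
                     fraction=0.01, seed=0):
    """Shared-species fraction between two random samples of one community."""
    rng = random.Random(seed)
    # Abundance declines geometrically with rank (a discrete exponential decay)
    weights = [decay ** rank for rank in range(n_species)]
    species = range(n_species)
    n_draws = int(community_size * fraction)
    # Drawing with replacement approximates sampling a small part of a large community
    sample_a = set(rng.choices(species, weights=weights, k=n_draws))
    sample_b = set(rng.choices(species, weights=weights, k=n_draws))
    return len(sample_a & sample_b) / len(sample_a | sample_b)

for fraction in (0.001, 0.01, 0.1):
    print(fraction, round(simulate_overlap(fraction=fraction), 2))
```

Changing `decay` lengthens or shortens the tail of rare species, which is the main control on how often two samples detect the same taxa.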

The importance of rare microbes may vary and is easily overlooked in favor of highly abundant, and perhaps more influential fungi or bacteria. However, due to the compositional nature inherent to amplicon data, those SVs that appear to be in low abundance at the time of sampling may only be relatively so. Also, we do not yet fully understand microbial species turnover or succession. Plant-associated microbial communities can change significantly in just a matter of months⁵⁸, or even weeks⁵⁹. In addition, the exact relationship between sequence number and biomass of a species is variable⁶⁰, and there is little evidence, if any, that sequence number is in direct proportion with a species’ impact in an ecosystem. Some microbes may be more metabolically active than others, despite appearing to be present in smaller quantities⁶¹. The recovery of the rare microbial community is arguably just as vital as the recovery of species that appear more abundant.
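
A small worked example shows why low relative abundance need not mean low absolute abundance. The counts below are hypothetical absolute abundances, not data from this study.

```python
def relative_abundance(counts):
    """Convert absolute counts per SV into proportions of the sample total."""
    total = sum(counts.values())
    return {sv: n / total for sv, n in counts.items()}

# Hypothetical absolute abundances at two time points; sv_rare does not change
may = {"sv_common": 900, "sv_rare": 100}
september = {"sv_common": 9900, "sv_rare": 100}

print(relative_abundance(may)["sv_rare"])        # 0.1
print(relative_abundance(september)["sv_rare"])  # 0.01
```

The rare SV appears to decline tenfold only because another SV increased.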

Bioinformatics pipelines that artificially inflate the number of SVs, especially low-abundance or rare SVs, could potentially inflate the differences we see among community subsamples. Hundreds of bioinformatics approaches have been used to analyze amplicon data, and no consensus exists on which is best. However, a recent study comparing the performance of 360 different software and parameter combinations showed that DADA2 (which is what we used here), with no filter other than the removal of low quality and chimeric sequences, was best for recovering true richness and composition from a mock fungal community of 189 different strains⁶². If anything, DADA2 can erroneously lump closely related species⁴¹, which would make it more conservative than other methods used. However, in an effort not to overestimate the true variation between strategies compared in this study, we assessed the relative importance of rare taxa through the gradual removal of lesser-abundant sequences, and we also used LULU, which is sometimes employed to reduce artifactual diversity⁴⁸. We also removed all SVs that could not be confidently assigned to known microbial taxa. Even with these approaches, substantial variation remained due to the inherent undersampling of the strategies compared.
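
The gradual removal of lesser-abundant sequences can be sketched as a threshold filter followed by a recomputed overlap. The counts are hypothetical, and two details are assumptions of this example: the threshold is applied to each SV's share of all sequences in the dataset, and overlap is defined as shared SVs divided by all SVs detected by either strategy.

```python
def drop_rare(counts, min_fraction):
    """Keep SVs whose share of all sequences in the dataset is at least min_fraction."""
    total = sum(sum(sample.values()) for sample in counts.values())
    sv_totals = {}
    for sample in counts.values():
        for sv, n in sample.items():
            sv_totals[sv] = sv_totals.get(sv, 0) + n
    keep = {sv for sv, n in sv_totals.items() if n / total >= min_fraction}
    return {name: {sv: n for sv, n in sample.items() if sv in keep}
            for name, sample in counts.items()}

def overlap(a, b):
    """Shared SVs as a fraction of all SVs detected in either sample (assumed definition)."""
    present_a = {sv for sv, n in a.items() if n > 0}
    present_b = {sv for sv, n in b.items() if n > 0}
    union = present_a | present_b
    return len(present_a & present_b) / len(union) if union else 1.0

# Hypothetical read counts for one plant sampled with both strategies
counts = {
    "strategy_one": {"sv1": 6000, "sv2": 3990, "sv3": 3, "sv4": 7},
    "strategy_two": {"sv1": 5500, "sv2": 4490, "sv5": 4, "sv6": 6},
}

for threshold in (0.0, 0.0001, 0.0005):
    kept = drop_rare(counts, threshold)
    print(threshold, round(overlap(kept["strategy_one"], kept["strategy_two"]), 2))
```

In this toy dataset the overlap rises from one third to complete agreement once SVs below 0.05% of sequences are dropped, because every unshared SV is rare.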

Severe undersampling obscures subtle community variation

With the possible exception of AMF, none of the sampling effort curves approached an asymptote, meaning that both sampling strategies failed to adequately characterize the microbial communities present within a single plant or plant community (Supplementary Fig. S2). We found that multiple replicates from a single plant can vary by nearly 80% in FFE SVs, even when extracting from larger amounts (250 mg vs. 30 mg) of tissue. Lindahl et al. (2013)⁶³ suggest that if duplicate subsamples differ much in community composition, then these differences threaten to obscure finer-scale treatment effects and ecological correlations, and that sampling effort should be increased. Indeed, a more robust sampling effort through the use of multiple technical replicates revealed remarkably strong (R² = 0.91) and significant host filtering within each individual plant that would have gone unobserved if extracting DNA from just a single replicate per plant. Although we may be able to observe patterns in undersampled data among sites or treatments, it is difficult to train models and make predictions or inferences in regard to the larger microbial population.

As Unterseher et al. (2011)¹⁵ suggest, it is often unnecessary to saturate richness in microbial communities, but this should be carefully considered before developing experiments and testing hypotheses. One must take into account the objectives of the study and the accuracy and precision required to meet those objectives. Although the methods traditionally employed to sample plant-associated microbes may be sufficient to generally observe landscape-scale differences, it is important to recognize that we are not characterizing these communities; rather, we are taking a sliver of a ‘snapshot’ of species composition from a single point in time. A large proportion of true microbial diversity for most systems will likely still remain undetected and the specific results may be limited in their replicability.

Summary and recommendations

Although it used to be common practice, multiple studies now suggest that duplicate and triplicate PCR reactions are unnecessary for fungi and bacteria^64,65. However, based on the results of this study, we recommend the inclusion of a different kind of technical replicate (i.e., multiple extraction replicates from a single plant), in addition to biological replicates (multiple plants in a single population), especially when studying factors that may generate subtler differences in plant-associated microbial communities. We show that extracting DNA from the standard 25–30 mg (dry weight) per plant can result in microbial communities that vary by as much as 100% and extractions of 250 mg from a single plant can vary by as much as 79%. The need for increased replication is particularly important if site, treatment, or seasonal differences may be obscured by other environmental drivers. Striving for a more comprehensive understanding of the depth and structure of plant microbiomes and their response to their surrounding environments will help us to better understand the exact functions of plant–microbe associations and how we might manipulate plant microbiomes in order to reduce disease or increase plant productivity in the future.

Sample size and sampling effort have surpassed sequencing depth and cultivation as the bottleneck when characterizing plant-associated microbial communities. A good sampling design is essential to approximate underlying patterns in microbial community composition in a reproducible manner and both sampling effort and size should be clearly justified. Schloss (2018)²⁰ elaborates on the concern of replicability and reproducibility with the growing use of Illumina-based studies of microbial communities, and describes PCR bias, sequencing errors, and cryptic or poorly described bioinformatics as preventing data from being generalizable to other environments. Undersampling and poor to absent descriptions of sampling effort and strategy also contribute to this problem, and the current frequency of undersampling should be concerning. The differences we see here between sampling strategies and the extreme variation among replicates suggest that many studies of plant-associated microbial communities may not be sufficiently replicable or reproducible.

Due to variation in community structure among AMF, bacteria, and non-AM fungi, standardizing a sampling protocol for all organisms is difficult, and best practices will, to some degree, depend on the specific organism targeted, richness, and site. Since neither sampling approach appeared to outperform the other, in many studies the overall sampling effort may be of greater importance. For example, when investigating landscape-scale differences in abundant or species-poor microorganisms, a smaller sampling effort is often sufficient. However, we suggest that more diverse plant-associated microbial communities, such as foliar fungal endophytes and root-associated bacteria, necessitate a more robust sampling effort than what is currently practiced in the literature. Per-sample richness relative to the estimated total community richness should always be considered when determining the optimum sampling strategy for any system. For example, sampling strategies and volumes sufficient for sampling AMF communities in extreme environments are likely not adequate for sampling fungal endophytes in the tropics where richness is high⁶⁶. The need for increased sampling effort is especially pertinent if noise associated with sites, treatments or sample processing may potentially obscure the differences among them. In studies that fail to see differences in microbial communities among sites or treatments, sampling effort should always be examined as a potential impediment.

In summary, we recommend that: (1) authors provide more transparent, detailed sampling information as well as sequencing and sampling effort curves, (2) sampling effort is not arbitrary, but is adjusted based on the diversity of plant microbiomes and per-sample richness relative to total community richness (both of which may require preliminary sampling), and (3) authors consider increased sampling effort when investigating smaller-scale drivers of microbial communities such as host filtering or subtle gradients, or when attempting to truly characterize microbial communities. Finally, controlling for the amount of plant tissue sampled both before and after homogenization, although not tested here, may be the best strategy for reducing potential bias. Standardizing how we sample plant-associated microbial communities as suggested by Dickie et al. (2018)¹⁹, or at the very least, insisting on more robust and transparent sampling strategies, will allow for more accurate and comprehensive analyses as well as better cross-study comparisons in the future.

Data availability

Raw amplicon sequence data: NCBI Sequence Read Archive (SRA), BioProject accession number PRJNA633878. Sequence variants: GenBank (https://www.ncbi.nlm.nih.gov/genbank) accessions: MN029770-MN030137 (FFE); MN029143-MN029446 (AMF); MN029447-MN029769 (root fungi); MN069335-MN069498 (root bacteria). Community matrices and metadata: Figshare https://doi.org/10.6084/m9.figshare.11741673.

References

Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature 457(7228), 480–484. https://doi.org/10.1038/nature07540 (2009).
Jiang, H. et al. Altered fecal microbiota composition in patients with major depressive disorder. Brain Behav. Immun. 48, 186–194. https://doi.org/10.1016/j.bbi.2015.03.016 (2015).
Gilbert, J. A. et al. Current understanding of the human microbiome. Nat. Med. 24(4), 392–400. https://doi.org/10.1038/nm.4517 (2018).
Berg, G., Grosch, R. & Smalla, K. Plant microbial diversity is suggested as the key to future biocontrol and health trends. FEMS Microbiol. Ecol. https://doi.org/10.1093/femsec/fix050 (2017).
Hirakue, A. & Sugiyama, S. Relationship between foliar endophytes and apple cultivar disease resistance in an organic orchard. Biol. Control 127, 139–144. https://doi.org/10.1016/j.biocontrol.2018.09.007 (2018).
Cline, L. C., Schilling, J. S., Menke, J., Groenhof, E. & Kennedy, P. G. Ecological and functional effects of fungal endophytes on wood decomposition. Funct. Ecol. 32(1), 181–191. https://doi.org/10.1111/1365-2435.12949 (2018).
Bullington, L. S. & Larkin, B. G. Using direct amplification and next-generation sequencing technology to explore foliar endophyte communities in experimentally inoculated western white pines. Fungal Ecol. 17, 170–178. https://doi.org/10.1016/j.funeco.2015.07.005 (2015).
Siddique, A. B. & Unterseher, M. A cost-effective and efficient strategy for Illumina sequencing of fungal communities: A case study of beech endophytes identified elevation as main explanatory factor for diversity and community composition. Fungal Ecol. 20, 175–185 (2016).
Unterseher, M., Siddique, A. B., Brachmann, A. & Peršoh, D. Diversity and composition of the leaf mycobiome of beech (Fagus sylvatica) are affected by local habitat conditions and leaf biochemistry. PLoS ONE 11(4), e0152878. https://doi.org/10.1371/journal.pone.0152878 (2016).
Lekberg, Y. & Waller, L. P. What drives differences in arbuscular mycorrhizal fungal communities among plant species?. Fungal Ecol. 24, 135–138. https://doi.org/10.1016/j.funeco.2016.05.012 (2016).
Pei, C. et al. Diversity of endophytic bacteria of Dendrobium officinale based on culture-dependent and culture-independent methods. Biotechnol. Biotechnol. Equip. 31(1), 112–119. https://doi.org/10.1080/13102818.2016.1254067 (2017).
Dissanayake, A. J. et al. Direct comparison of culture-dependent and culture-independent molecular approaches reveal the diversity of fungal endophytic communities in stems of grapevine (Vitis vinifera). Fungal Divers. 90(1), 85–107. https://doi.org/10.1007/s13225-018-0399-3 (2018).
Arnold, A. E. Understanding the diversity of foliar endophytic fungi: Progress, challenges, and frontiers. Fungal Biol. Rev. 21(2–3), 51–66. https://doi.org/10.1016/j.fbr.2007.05.003 (2007).
Jumpponen, A. & Jones, K. L. Massively parallel 454 sequencing indicates hyperdiverse fungal communities in temperate Quercus macrocarpa phyllosphere. New Phytol. 184(2), 438–448. https://doi.org/10.1111/j.1469-8137.2009.02990.x (2009).
Unterseher, M. et al. Species abundance distributions and richness estimations in fungal metagenomics—lessons learned from community ecology: Community ecology in fungal metagenomics. Mol. Ecol. 20(2), 275–285. https://doi.org/10.1111/j.1365-294X.2010.04948.x (2011).
McMurdie, P. J. & Holmes, S. Waste not, want not: Why rarefying microbiome data is inadmissible. PLoS Comput. Biol. 10(4), e1003531. https://doi.org/10.1371/journal.pcbi.1003531 (2014).
Allali, I. et al. A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome. BMC Microbiol. 17(1), 194. https://doi.org/10.1186/s12866-017-1101-8 (2017).
Bolyen, E. et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. PeerJ Prepr. https://doi.org/10.7287/peerj.preprints.27295v2 (2018).
Dickie, I. A. et al. Towards robust and repeatable sampling methods in eDNA-based studies. Mol. Ecol. Resour. 18(5), 940–952. https://doi.org/10.1111/1755-0998.12907 (2018).
Schloss, P. D. Identifying and overcoming threats to reproducibility, replicability, robustness, and generalizability in microbiome research. mBio 9(3), 13 (2018).
Daleo, P. et al. Nitrogen enrichment suppresses other environmental drivers and homogenizes salt marsh leaf microbiome. Ecology 99(6), 1411–1418. https://doi.org/10.1002/ecy.2240 (2018).
Toju, H., Okayasu, K. & Notaguchi, M. Leaf-associated microbiomes of grafted tomato plants. Sci. Rep. 9(1), 1787. https://doi.org/10.1038/s41598-018-38344-2 (2019).
Zimmerman, N. B. & Vitousek, P. M. Fungal endophyte communities reflect environmental structuring across a Hawaiian landscape. Proc. Natl. Acad. Sci. 109(32), 13022–13027. https://doi.org/10.1073/pnas.1209872109 (2012).
Hahn, P. G. Effects of short- and long-term variation in resource conditions on soil fungal communities and plant responses to soil biota. Front. Plant Sci. 9, 15 (2018).
Saldierna Guzmán, J. P., Nguyen, K. & Hart, S. C. Simple methods to remove microbes from leaf surfaces. J. Basic Microbiol. 60(8), 730–734. https://doi.org/10.1002/jobm.202000035 (2020).
Busby, P. E., Peay, K. G. & Newcombe, G. Common foliar fungi of Populus trichocarpa modify Melampsora rust disease severity. New Phytol. 209(4), 1681–1692. https://doi.org/10.1111/nph.13742 (2016).
Gdanetz, K. & Trail, F. The wheat microbiome under four management strategies, and potential for endophytes in disease protection. Phytobiomes J. 1(3), 158–168. https://doi.org/10.1094/PBIOMES-05-17-0023-R (2017).
Haas, J. C. Microbial community response to growing season and plant nutrient optimisation in a boreal Norway spruce forest. Soil Biol. Biochem. 125, 197–209 (2018).
Barge, E. G., Leopold, D. R., Peay, K. G., Newcombe, G. & Busby, P. E. Differentiating spatial from environmental effects on foliar fungal communities of Populus trichocarpa. J. Biogeogr. 46(9), 2001–2011. https://doi.org/10.1111/jbi.13641 (2019).
Bunn, R. A., Simpson, D. T., Bullington, L. S., Lekberg, Y. & Janos, D. P. Revisiting the ‘direct mineral cycling’ hypothesis: Arbuscular mycorrhizal fungi colonize leaf litter, but why?. ISME J. 13(8), 1891–1898. https://doi.org/10.1038/s41396-019-0403-2 (2019).
Corcoll, N. et al. Comparison of four DNA extraction methods for comprehensive assessment of 16S rRNA bacterial diversity in marine biofilms using high-throughput sequencing. FEMS Microbiol. Lett. 364, fnx139. https://doi.org/10.1093/femsle/fnx139 (2017).
Dumbrell, A. J. et al. Distinct seasonal assemblages of arbuscular mycorrhizal fungi revealed by massively parallel pyrosequencing. New Phytol. 190(3), 794–804. https://doi.org/10.1111/j.1469-8137.2010.03636.x (2011).
Lee, J., Lee, S. & Young, J. P. W. Improved PCR primers for the detection and identification of arbuscular mycorrhizal fungi. FEMS Microbiol. Ecol. 65(2), 339–349. https://doi.org/10.1111/j.1574-6941.2008.00531.x (2008).
Ihrmark, K. et al. New primers to amplify the fungal ITS2 region—Evaluation by 454-sequencing of artificial and natural communities. FEMS Microbiol. Ecol. 82(3), 666–677. https://doi.org/10.1111/j.1574-6941.2012.01437.x (2012).
Kohout, P. et al. Comparison of commonly used primer sets for evaluating arbuscular mycorrhizal fungal communities: Is there a universal solution?. Soil Biol. Biochem. 68, 482–493. https://doi.org/10.1016/j.soilbio.2013.08.027 (2014).
White, T. J., Bruns, T., Lee, S. & Taylor, J. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications (eds Innis, M. A. et al.) 315–322 (Academic Press, London, 1990).
Lekberg, Y. et al. More bang for the buck? Can arbuscular mycorrhizal fungal communities be characterized adequately alongside other fungi using general fungal primers?. New Phytol. 220(4), 971–976. https://doi.org/10.1111/nph.15035 (2018).
Caporaso, J. G. et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. 108(Suppl 1), 4516–4522. https://doi.org/10.1073/pnas.1000080107 (2011).
Bullington, L. S., Lekberg, Y., Sniezko, R. & Larkin, B. The influence of genetics, defensive chemistry and the fungal microbiome on disease outcome in whitebark pine trees: Genetics, terpenes, fungi and disease. Mol. Plant Pathol. 19(8), 1847–1858. https://doi.org/10.1111/mpp.12663 (2018).
Kircher, M., Sawyer, S. & Meyer, M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40(1), e3. https://doi.org/10.1093/nar/gkr771 (2012).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13(7), 581–583. https://doi.org/10.1038/nmeth.3869 (2016).
Öpik, M. et al. Global sampling of plant roots expands the described molecular diversity of arbuscular mycorrhizal fungi. Mycorrhiza 23(5), 411–430. https://doi.org/10.1007/s00572-013-0482-2 (2013).
Kõljalg, U. et al. Towards a unified paradigm for sequence-based identification of fungi. Mol. Ecol. 22(21), 5271–5277. https://doi.org/10.1111/mec.12481 (2013).
Bokulich, N. A. et al. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6(1), 90. https://doi.org/10.1186/s40168-018-0470-z (2018).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2018).
Oksanen, J., Blanchet, F. G., Friendly, M. et al. Vegan: Community Ecology Package (2019). https://cran.r-project.org/package=vegan.
Hsieh, T. C., Ma, K. H. & Chao, A. iNEXT: An R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol. Evol. 7(12), 1451–1456. https://doi.org/10.1111/2041-210X.12613 (2016).
Frøslev, T. G. et al. Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates. Nat. Commun. 8(1), 1188. https://doi.org/10.1038/s41467-017-01312-x (2017).
Wickham, H. Ggplot2: Elegant graphics for data analysis (Springer, New York, 2009). https://doi.org/10.1007/978-0-387-98141-3.
Song, Z. et al. Effort versus reward: Preparing samples for fungal community characterization in high-throughput sequencing surveys of soils. PLoS ONE 10(5), e0127234. https://doi.org/10.1371/journal.pone.0127234 (2015).
Furtado, B. U., Gołębiewski, M., Skorupa, M., Hulisz, P. & Hrynkiewicz, K. Bacterial and fungal endophytic microbiomes of Salicornia europaea. Appl. Environ. Microbiol. 85(13), e00305-19. https://doi.org/10.1128/AEM.00305-19 (2019).
Kuźniar, A. et al. Culture-independent analysis of an endophytic core microbiome in two species of wheat: Triticum aestivum L. (cv. ‘Hondia’) and the first report of microbiota in Triticum spelta L. (cv. ‘Rokosz’). Syst. Appl. Microbiol. 43(1), 126025. https://doi.org/10.1016/j.syapm.2019.126025 (2020).
Davison, J. et al. Global assessment of arbuscular mycorrhizal fungus diversity reveals very low endemism. Science 349(6251), 970–973. https://doi.org/10.1126/science.aab1161 (2015).
Hawksworth, D. L. The magnitude of fungal diversity: The 1.5 million species estimate revisited. Mycol. Res. 105(12), 1422–1432. https://doi.org/10.1017/S0953756201004725 (2001).
Hawksworth, D. L. & Lücking, R. Fungal diversity revisited: 2.2 to 3.8 million species. Microbiol. Spectr. 1, 1. https://doi.org/10.1128/microbiolspec.FUNK-0052-2016 (2017).
Louca, S., Mazel, F., Doebeli, M. & Parfrey, L. W. A census-based estimate of Earth’s bacterial and archaeal diversity. PLoS Biol. 17(2), e3000106. https://doi.org/10.1371/journal.pbio.3000106 (2019).
Zhou, J. et al. Reproducibility and quantitation of amplicon sequencing-based detection. ISME J. 5(8), 1303–1313. https://doi.org/10.1038/ismej.2011.11 (2011).
McTee, M., Bullington, L., Rillig, M. C. & Ramsey, P. W. Do soil bacterial communities respond differently to abrupt or gradual additions of copper?. FEMS Microbiol. Ecol. 95(1), fiy212. https://doi.org/10.1093/femsec/fiy212 (2019).
Gao, C. et al. Strong succession in arbuscular mycorrhizal fungal communities. ISME J. 13(1), 214–226. https://doi.org/10.1038/s41396-018-0264-0 (2019).
Kleiner, M. et al. Assessing species biomass contributions in microbial communities via metaproteomics. Nat. Commun. 8(1), 1558. https://doi.org/10.1038/s41467-017-01544-x (2017).
Jousset, A. et al. Where less may be more: How the rare biosphere pulls ecosystems strings. ISME J. 11(4), 853–862. https://doi.org/10.1038/ismej.2016.174 (2017).
Pauvert, C. et al. Bioinformatics matters: The accuracy of plant and soil fungal community data is highly dependent on the metabarcoding pipeline. Fungal Ecol. 41, 23–33. https://doi.org/10.1016/j.funeco.2019.03.005 (2019).
Lindahl, B. D. et al. Fungal community analysis by high-throughput sequencing of amplified markers—A user’s guide. New Phytol. 199(1), 288–299. https://doi.org/10.1111/nph.12243 (2013).
Egan, C. P. et al. Using mock communities of arbuscular mycorrhizal fungi to evaluate fidelity associated with Illumina sequencing. Fungal Ecol. 33, 52–64. https://doi.org/10.1016/j.funeco.2018.01.004 (2018).
Marotz, C. et al. Triplicate PCR reactions for 16S rRNA gene amplicon sequencing are unnecessary. Biotechniques 67(1), 29–32. https://doi.org/10.2144/btn-2018-0192 (2019).
Arnold, A. E., Maynard, Z., Gilbert, G. S., Coley, P. D. & Kursar, T. A. Are tropical fungal endophytes hyperdiverse?. Ecol. Lett. 3(4), 267–274. https://doi.org/10.1046/j.1461-0248.2000.00159.x (2000).


Acknowledgements

We would like to thank MPG Ranch for funding this research. We would also like to thank Emily Martin and Ben Mason for help performing DNA extractions and PCR. We are eternally grateful for Peter Kennedy, Mike McTee, Morgan McLeod, Marirose Kuhlman, and William Blake, who all provided valuable edits of this manuscript. Sequencing was performed by the IBEST Genomics Resources Core at the University of Idaho and was supported in part by NIH COBRE Grant P30GM103324.

Author information

Authors and Affiliations

MPG Ranch, Missoula, MT, 59801, USA
Lorinda S. Bullington, Ylva Lekberg & Beau G. Larkin
Department of Ecosystem and Conservation Sciences, University of Montana, Missoula, MT, 59812, USA
Lorinda S. Bullington & Ylva Lekberg

Authors

Lorinda S. Bullington
Ylva Lekberg
Beau G. Larkin

Contributions

L.B., Y.L. and B.L. conceived the study. L.B., Y.L. and B.L. collected samples. L.B. processed samples, analyzed data and wrote the manuscript with input and edits provided by Y.L. and B.L.

Corresponding author

Correspondence to Lorinda S. Bullington.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Bullington, L.S., Lekberg, Y. & Larkin, B.G. Insufficient sampling constrains our characterization of plant microbiomes. Sci Rep 11, 3645 (2021). https://doi.org/10.1038/s41598-021-83153-9

Received: 30 October 2020
Accepted: 29 January 2021
Published: 11 February 2021
DOI: https://doi.org/10.1038/s41598-021-83153-9
