Abstract
Plants host diverse microbial communities, but there is little consensus on how we sample these communities, and this has unknown consequences. Using root and leaf tissue from showy milkweed (Asclepias speciosa), we compared two common sampling strategies: (1) homogenizing after subsampling (30 mg), and (2) homogenizing bulk tissue before subsampling (30 mg). We targeted bacteria, arbuscular mycorrhizal (AM) fungi and non-AM fungi in roots, and foliar fungal endophytes (FFE) in leaves. We further extracted DNA from all of the leaf tissue collected to determine the extent of undersampling of FFE, and sampled FFE twice across the season using strategy one to assess temporal dynamics. All microbial groups except AM fungi differed in composition between the two sampling strategies. Community overlap increased when rare taxa were removed, but FFE and bacterial communities still differed between strategies, with largely non-overlapping communities within individual plants. Increasing the extraction mass 10 × increased FFE richness ~ 10 ×, confirming the severe undersampling indicated in the sampling comparisons. Still, seasonal patterns in FFEs were apparent, suggesting that strong drivers are identified despite severe undersampling. Our findings highlight that current sampling practices poorly characterize many microbial groups, and that increased sampling intensity is necessary to increase reproducibility and to identify subtler patterns in microbial distributions.
Introduction
The human microbiome has been thoroughly studied as an integral component of human health and performance1,2,3. Similarly, the plant microbiome is critical in understanding the health and ecology of living plants4,5, and their rate of decay and nutrient turnover after senescence6. The plant microbiome is incredibly complex and diverse; just 0.1 g of foliar plant tissue can host over 100 putative fungal species7,8,9, whereas the richness of arbuscular mycorrhizal fungi (AMF) in roots is considerably lower10. Knowledge of the spatial heterogeneity and overall richness of plant-associated microorganisms should inform how we sample, as this may influence our understanding of the processes that shape microbial communities and their relationships with other organisms. Currently, there is no consensus on an optimal sampling strategy needed to characterize plant-associated microbial communities. In fact, there is little technical guidance on how common sampling strategies influence our interpretation and analyses of the microbial communities we observe.
Through the use of next-generation sequencing technology, we now know that biases associated with culture-based microbiome surveys leave a large proportion of microbial diversity completely undetected11,12. Early departures from culture-based methods, like cloning and direct extraction of environmental DNA, greatly improved our understanding of plant-associated microbial communities, but they revealed new biases suggesting that our understanding of microbial communities was still incomplete13. Even 454 pyrosequencing often did not provide enough high-quality sequences to adequately characterize microbial communities, resulting in sequencing effort curves that failed to reach an asymptote14,15. Deeper, higher-quality sequencing technology now allows researchers to see a more comprehensive picture of plant-associated microbial communities than previous methods. Thus, while sequencing effort for individual samples is no longer a bottleneck in characterizing plant-associated microbial communities, sampling effort may very well be.
Many advances have been made to reduce biases when processing microbial community data16,17,18, but descriptions of sampling effort or strategy are often omitted or vague, with little or no justification of methods used. Site descriptions, including information about the developmental stage or uniformity of plants to be sampled, are often too cursory. In a recent review, Dickie et al. (2018)19 found that 95% of metabarcoding studies examined reported inappropriate or incomplete field or sampling methods that rendered them non-reproducible. In addition to unclear methods, many studies fail to report the robustness of their sampling efforts (e.g., through species accumulation curves that display the number of species recovered for each additional sample), despite the inherent consequences of undersampling. Indeed, undersampling of microbial communities appears to have become the rule, rather than the exception. One reason for this is that time and budget constraints limit the amount of lab work that can be performed, and often it is inappropriate or impossible to destructively sample whole plants. Most of the time researchers only sample small quantities (mg) of plant tissue representing a tiny fraction of the plant’s total biomass (often < 1%). Microbial communities observed in these samples are then used to make inferences on the larger population of colonizing microbes. In uncontrolled field studies where there are countless environmental drivers, this incomplete sampling may obscure subtle underlying patterns in microbial distributions20, which in turn could compromise our estimates about their diversity and the degree to which sites, treatments and individual hosts truly differ.
Not only do we sample small proportions of total plant biomass, but researchers often differ in how these samples are collected and processed. To provide guidance for sampling plant-associated microbial groups, we compared two of the more common sampling strategies seen in the literature to see if they differ in richness and compositional estimates and if this depends on the microorganism targeted. The first sampling strategy involves collecting a specific surface area or volume of plant tissue (e.g., leaf discs or lengths of root segments) followed by tissue homogenization (e.g.,14,21,22). In this strategy, the spatial extent of available plant tissue is somewhat maintained, but many taxa may be missed if their distributions are patchy, which in turn may lead to increased variance in richness and composition among samples. From here on we will refer to this strategy as “homogenizing tissue after subsampling” (Fig. 1A). The second common sampling strategy involves either collecting a pre-specified amount of tissue from each plant (e.g., six leaves or root segments per plant) or collecting plant tissue somewhat haphazardly without much standardization among plant samples. Samples are then ground prior to collection of a standardized, homogenized subsample (e.g.,9,23). We will refer to this method as “homogenizing tissue before subsampling” (Fig. 1B). A criticism of this method is that differences in initial sample size are usually not accounted for (e.g., variation in total leaf area or root length). In this case, the size of the homogenized subsample is standardized, but not the amount of tissue initially homogenized. When the amount of plant tissue initially homogenized is inconsistent among samples, we do not know if the differences in microbial communities result from variation in the amount of plant material collected, or from true differences among the host plants.
Differences in the detected species richness could thus be biased towards larger plant samples, or larger plants, rather than revealing true differences in richness and composition. Despite this criticism, the homogenizing tissue before subsampling method is relied on because it is thought that the homogenized pool from a relatively large initial sample may yield a more comprehensive subsample of the microbial community present within the plant. It is often assumed that any subsample from the powdered tissue, regardless of sampling approach used, will yield the same or a similar microbial community. To our knowledge, these assumptions have not been tested.
Because the optimal strategy may depend on the microorganism targeted, we tested these two strategies by extracting and amplifying bacterial DNA (16S rRNA gene), general fungal DNA (internal transcribed spacer region 2, hereafter ITS2) and arbuscular mycorrhizal fungal DNA (18S rRNA gene), from roots (all) and/or leaves (ITS2 only) of naturally occurring, mature, showy milkweed (Asclepias speciosa). We chose showy milkweed as a previous study has shown high colonization of both AM and general fungi24. Due to the highly diverse nature of foliar fungal endophytes (FFE) in general14, we extracted DNA from many additional subsamples per plant to assess the extent of undersampling within this group. Finally, we sampled leaves twice across the season to assess if broad-scale seasonal differences in FFE were still detectable despite potential undersampling, in order to better understand when—and for which type of questions—insufficient sampling is a problem. We predicted that (1) homogenizing before subsampling (strategy 2) would yield richer, more even microbial communities among plants, because more of the plant tissue would be initially homogenized, (2) different sampling strategies would result in different microbial community structures due to the spatial heterogeneity of microbes within plants, and (3) differences between sampling strategies would depend on the organism targeted and tissue sampled (roots or leaves), potentially due to inherent differences in global and likely local richness, as well as microbial modes of dispersal (above vs. belowground).
Methods
Field collections and sampling strategies
All collections occurred at the Teller Wildlife Refuge in western Montana, USA (46.3219 N 114.1292 W). The Teller Wildlife Refuge is a 525-ha property that comprises riparian, floodplain, wetland, and upland habitat. Showy milkweed grows in numerous patches across the property, concentrated near irrigation ditches, on the floodplain, and in old fields where sub-irrigation provides enough water. Plants were selected from a 500 m2 area of a field with a patchy distribution. All plants sampled were at least 3 m apart, and at each collection we sampled plants of similar size and developmental stage. During our first collection, we collected 6 individual leaves each from 20 showy milkweed plants on May 18th, 2016 before flowering, just as buds were beginning to form. We sampled from the same population again on September 6th, 2016 when flowers were present and seed pods were almost ripe. During the September collection we also collected as much of the root system as possible within the upper 20 cm of soil by destructively sampling each plant. Roots were washed with tap water to remove all visible soil, and fine roots (< 1 mm in diameter) were retained for further processing. All tissue was kept on ice upon collection, and then stored at − 20 °C. Bulk root collections weighed between 0.3 and 0.8 g dry weight, per plant. Bulk leaf collections weighed between 1.2 and 2.5 g dry weight per plant. To remove microbial tissues from root and leaf exterior and characterize only endophytic microbes, all root and leaf tissue was surface sterilized in 70% ethanol for 1 min., 0.5% NaOCl for 1 min. and then rinsed 2 × in sterile water. Imprints of sterilized tissue were made on growth medium to confirm the sterilization procedure. The absence of growth observed indicated successful surface sterilization. 
This process may not completely remove all non-viable surface DNA; however, the combination of rinsing and immersion of tissue in NaOCl and ethanol is effective at eliminating epiphyte tissue, so that only residual DNA should remain25. After sterilization, leaves and roots from the September collection were separated into two groups based on the two sampling strategies we chose to compare.
For Strategy 1, homogenizing after subsampling, we removed one leaf disc from each of 6 leaves per plant (at random locations) using an 11.5 mm diameter cork borer and placed the six discs per plant into single 1.2 ml Eppendorf tubes. We also used the same strategy for May leaf collections. This resulted in a total surface area of approximately 62 mm2 per plant, with a dry weight of approximately 30.0 mg of leaf tissue. For roots, approximately 30.0 mg of dry root segments per plant was placed into individual 1.2 ml Eppendorf tubes. As many DNA extraction kits recommend a maximum volume of approximately 20–30.0 mg dry weight or 50.0 mg fresh weight, we have found this to be a commonly used standard (e.g.,26,27,28,29,30); however, many studies lack sufficient detail on final extraction volumes. Subsampled root and leaf tissues were then freeze-dried using a Labconco Freezone benchtop freeze-dry system (Labconco, Kansas City, MO, USA). Subsampled tissue was homogenized using a 1600 MiniG tissue homogenizer and cell lyser (Spex SamplePrep, Metuchen, NJ, USA).
For Strategy 2, homogenizing before subsampling, all remaining root and leaf tissue from the September collection was subsequently freeze-dried and homogenized for each plant using the freeze-dry system and MiniG as described above. Approximately 30.0 mg dry weight per plant was then subsampled from the homogenized pool of root or leaf tissue and placed in 1.2 ml Eppendorf tubes for DNA extraction. All remaining foliar tissue from five plants collected in September (> 40 × more tissue per plant, equaling 1241–2467 mg per plant) was then divided into approximately 250 mg replicates for additional DNA extractions to assess the extent of undersampling.
DNA extractions and PCR
DNA extractions of 20 foliar samples collected in May, as well as 40 foliar and 40 root samples representing the two different sampling strategies in roots and leaves in September (100 total extractions), were performed using the MO BIO PowerPlant Pro-htp DNA isolation kit, which has been shown to be comparable to other kits for extracting microbial DNA31. For five of the plants collected in September, we extracted DNA from all the remaining foliar tissue of the six leaves collected (1241–2467 mg dry weight per plant, compared to the 15–30 mg commonly used to infer differences in microbial community composition). This increased the amount of tissue analyzed per plant by > 40 ×. The number of additional extractions depended on the size of the leaves and ranged from 5 to 10 extractions per plant. These additional bulk material extractions on larger volumes were performed using the DNeasy Plant Maxi Kit (Qiagen, Hilden, Germany), also per manufacturer recommendations. To monitor potential background or cross contamination among samples, we also included control extractions and PCRs for all microbial groups and extraction kits used.
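As a quick arithmetic check on the figures above, the following sketch divides the reported range of remaining leaf mass (1241–2467 mg) by the approximate 250 mg replicate size and by the 30 mg standard extraction. Rounding to the nearest whole replicate is an assumption made here for illustration, and the function name is hypothetical.

```python
def replicate_summary(bulk_mg, replicate_mg=250.0, standard_mg=30.0):
    """Return (number of ~replicate_mg extractions, fold increase over standard_mg)."""
    # Assumption: tissue is split into the nearest whole number of replicates.
    n_extractions = round(bulk_mg / replicate_mg)
    # Fold increase in tissue analyzed relative to a single standard extraction.
    fold_increase = bulk_mg / standard_mg
    return n_extractions, fold_increase


# Lower and upper bounds of the remaining leaf mass reported per plant.
for bulk in (1241, 2467):
    n, fold = replicate_summary(bulk)
    print(f"{bulk} mg -> {n} extractions, {fold:.0f}x the 30 mg standard")
```

Under these assumptions the bounds reproduce the reported range of 5 to 10 extractions per plant and a more than 40-fold increase in tissue analyzed.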
After extracting DNA, all samples were prepared for Illumina sequencing through a two-step PCR amplification. For root tissue we amplified the 18S rRNA gene, the internal transcribed spacer 2 (ITS2), and the 16S rRNA gene regions for AMF, general fungi, and bacteria, respectively. In leaf tissue only the ITS2 general fungal region was targeted. Primer pairs for PCR1 included AMF-specific primers WANDA32 and AML233. For the ITS2 region we used a mixture of the fungal-specific forward primers fITS7 and ITS7o34,35 and the general eukaryotic reverse primer ITS436. AMF-specific primers were used in addition to general fungal primers because ITS2 primers may not allow for a thorough characterization of AMF communities due to poor amplification of AMF when other fungi co-occur37. For the 16S rRNA gene, we targeted the V4 region using primers 515F and 806R38. More detailed descriptions of PCR reactions are in Bullington et al. (2018)39 and Lekberg et al. (2018)37. Briefly, in PCR1, each primer was flanked by 22 bp Fluidigm universal tags CS1 or CS2 (Fluidigm Inc. San Francisco, CA, USA). Each reaction contained a total volume of 12.5 μl, which included 1.0 μl of DNA extract as template and 2.5 pmol of each primer in 1 × GoTaq Green Master Mix (Green GoTaq Reaction Buffer, 200 μM dATP, 200 μM dGTP, 200 μM dCTP, 200 μM dTTP and 1.5 mM MgCl2; Promega, USA). Reactions were performed in duplicate or triplicate, depending on concentration of amplified product (number of reactions was kept consistent within each microbial group), in a Techne TC-4000 thermocycler (Bibby Scientific, Burlington, USA) under the following conditions: initial denaturation at 95 °C for 2 min followed by 35 cycles at 95 °C for 1 min, 54 °C (18S rRNA gene), 57 °C (ITS2) or 50 °C (16S rRNA gene) for 1 min, and 72 °C for 1 min, with a final elongation for 10 min at 72 °C.
To confirm the presence of our target amplicons, all reactions were analyzed by 1.5% agarose gel electrophoresis using a 100 bp ladder (GeneRuler DNA Ladder, Thermo Scientific, USA) as a size standard. As a precaution, control samples were sequenced even if they did not produce a band during gel electrophoresis.
All amplicons generated during PCR1 were diluted 1:10 for use as template in PCR2. PCR2 primer complexes consisted of the same Fluidigm tags (CS1 or CS2) as PCR1 primers, 8 bp Illumina Nextera barcodes (Illumina Inc., San Diego, CA, USA), and Illumina adapters. To minimize index hopping, we used unique dual indexing pooling combinations40, stored libraries individually, pooled only immediately before sequencing, and removed free adaptors from our libraries. PCR2 amplicons were purified using AMPure XP beads (Beckman Coulter Genomics, USA), quantified by Qubit 2.0 fluorometer (Invitrogen, USA) and pooled in equimolar concentration prior to sequencing. Sequencing was done at the Institute for Bioinformatics and Evolutionary Studies (IBEST) genomics resources core at the University of Idaho (http://www.ibest.uidaho.edu; Moscow, ID, USA). Amplicon libraries were sequenced using ¼ of a 2 × 300 paired-end (PE) run on an Illumina MiSeq sequencing platform (Illumina Inc., San Diego, CA, USA).
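Equimolar pooling from fluorometric concentrations can be illustrated with a short sketch. It is not the authors' protocol: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the library names, concentrations, amplicon lengths and target amount are hypothetical values chosen only for illustration.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, amplicon_bp, bp_mass=660.0):
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM).

    Assumes an average mass of `bp_mass` g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (bp_mass * amplicon_bp)


def equimolar_volumes(libraries, target_fmol=10.0):
    """Volume (ul) of each library that contributes `target_fmol` to the pool.

    `libraries` maps a name to (concentration in ng/ul, amplicon length in bp).
    Because 1 nM equals 1 fmol/ul, volume = target_fmol / molarity.
    """
    return {
        name: target_fmol / ng_per_ul_to_nM(conc, bp)
        for name, (conc, bp) in libraries.items()
    }


# Hypothetical libraries: (Qubit reading in ng/ul, amplicon length in bp).
example = {"ITS2_plant01": (6.6, 500), "16S_plant01": (3.3, 400)}
for name, vol in equimolar_volumes(example).items():
    print(f"{name}: pool {vol:.2f} ul")
```

More dilute or longer libraries need proportionally larger volumes to contribute the same number of molecules to the pool.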
Bioinformatics
Processing of raw sequence data was performed using “Quantitative Insights Into Microbial Ecology 2” (QIIME2 version 2018.4; https://qiime2.org/)18. Sequence reads were first demultiplexed using the q2-demux plugin (https://github.com/qiime2/q2-demux). Only forward reads were used for the 18S rRNA gene region, as the overlap between forward and reverse reads is too short to merge the two without significant sequence loss. For the 18S rRNA gene only, forward reads were trimmed to 210 bp, which covers the informative region of our 18S rRNA gene target33. For ITS2 and the 16S rRNA gene, forward and reverse reads were trimmed where median quality score fell below 30, and if at any point quality score fell below 3 within the trimmed region, those sequences were removed from further analysis. The q2-dada2 plugin uses nucleotide quality scores to produce sequence variants (SVs), or sequence clusters with 100% similarity representing the estimated true biological variation within each sample. Although sequences are clustered at 100% similarity as opposed to the traditional 97% similarity, DADA2 produces fewer spurious sequences, fewer clusters, and results in a more accurate representation of the true biological variation present41. After DADA2 processing, microbial groups contained an average of 9000–17,000 sequences per sample (Supplementary Information, Table S1). All extraction and PCR controls were clean except for bacteria. Bacterial contaminants were subsequently removed from all samples to reduce potential background contamination. The database MaarjAM42 was used to assign taxonomy and remove non-target DNA for AMF, UNITE (http://unite.ut.ee)43 was used for general fungi, and Greengenes (http://greengenes.lbl.gov) was used for bacteria. All SVs that did not match with at least 70% identity (for bacteria) or 90% identity (for fungi) with at least 70% coverage to sequences within one of the above databases were removed.
To help remove non-target DNA, we added many sequences representing non-target organisms (including many Asclepias spp.) to both our AMF and ITS databases to better identify contaminants and reduce misclassification when assigning taxonomy to these sequences. Taxonomy for each microbial group was then assigned using QIIME2 q2-feature-classifier (https://github.com/qiime2/q2-feature-classifier), a naive Bayes machine-learning classifier which has been shown to meet or exceed classification accuracy of existing methods44, setting a confidence threshold of 0.94 for fungi and 0.7 for bacteria. For 16S rRNA data, sequences identified as chloroplast or mitochondrial DNA were also removed, which resulted in the removal of > 90% of bacterial sequences (Supplementary Table S1). For non-AM root fungi, all Glomeromycota were removed in order to analyze AMF and non-AM root fungi separately.
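The quality-based trimming rule described for ITS2 and 16S reads (truncate where the median quality falls below 30, then discard reads with any score below 3 in the retained region) can be sketched conceptually as follows. This is not the q2-dada2 interface, which takes truncation lengths as parameters; the function names and the toy quality scores are hypothetical.

```python
from statistics import median


def truncation_length(quality_rows, min_median=30):
    """Return the first position at which the per-position median quality
    across reads drops below `min_median` (or the read length if it never does)."""
    n_positions = min(len(row) for row in quality_rows)
    for pos in range(n_positions):
        if median(row[pos] for row in quality_rows) < min_median:
            return pos
    return n_positions


def filter_reads(quality_rows, trunc_len, min_score=3):
    """Truncate reads to `trunc_len` and drop any read whose retained
    region contains a score below `min_score`."""
    if trunc_len == 0:
        return []
    return [row[:trunc_len] for row in quality_rows
            if min(row[:trunc_len]) >= min_score]


# Toy Phred scores for four reads of five bases each (hypothetical values).
reads = [
    [38, 37, 36, 28, 20],
    [38, 36, 35, 29, 12],
    [37, 2, 34, 27, 10],   # contains a score below 3, so it is discarded
    [38, 37, 36, 33, 25],
]
cut = truncation_length(reads)
kept = filter_reads(reads, cut)
print(cut, len(kept))
```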
Statistical analyses
All statistical analyses associated with microbial community richness and composition in roots and shoots of A. speciosa were conducted in R45 using the vegan package46, except where otherwise noted. All analyses were based on rarefied data (160, 7820, 634 and 1400 sequences for bacteria, AMF, non-AM root fungi and FFE, respectively) using the ‘rrarefy’ function. Samples with few sequences were removed from further analyses to allow for greater sequencing depth, which resulted in 36, 36, 34 and 36 samples for bacteria, AMF, non-AM root fungi and FFE, respectively. Analyses of ITS2 data when extracting DNA from all remaining foliar tissue were based on a rarefaction level of 8475 sequences. These sampling depths were chosen based on saturation of sequencing effort curves of all reads after the removal of non-target DNA (Fig. 2) and in an effort to retain the most samples (n = 18 per sampling strategy for bacteria, AMF and foliar fungi and 17 for non-AM root fungi). Sequencing effort curves were produced using the iNEXT package47. To test how sampling strategy influenced microbial alpha diversity metrics, we calculated richness as the number of SVs in each sample as well as Pielou’s ‘J’ evenness, which describes the similarity of species frequencies. To compare diversity metrics (based on SVs) between the two sampling strategies we performed a Wilcoxon signed-rank test on all paired values. To ensure that potential differences in sampling strategies were not due to artifactual SVs, we also compared results when implementing LULU48, an algorithm for post-clustering curation, which clustered SVs at 98.5% similarity with a minimum relative co-occurrence of 0.9.
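The rarefaction and alpha diversity steps can be illustrated with a minimal standard-library sketch. It is not the vegan implementation used in the study: it assumes rarefaction means random subsampling without replacement to a fixed depth, and Pielou's J is taken as Shannon diversity divided by the natural log of richness. The sample counts and function names are hypothetical.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` sequences without replacement from one sample.

    `counts` maps SV identifiers to sequence counts.
    """
    pool = [sv for sv, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    rng = random.Random(seed)
    rarefied = {}
    for sv in rng.sample(pool, depth):
        rarefied[sv] = rarefied.get(sv, 0) + 1
    return rarefied


def richness(counts):
    """Number of SVs with at least one sequence."""
    return sum(1 for n in counts.values() if n > 0)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S), where H' is Shannon diversity and S is richness."""
    total = sum(counts.values())
    props = [n / total for n in counts.values() if n > 0]
    if len(props) < 2:
        return float("nan")  # evenness is undefined for fewer than two SVs
    shannon = -sum(p * math.log(p) for p in props)
    return shannon / math.log(len(props))


# Hypothetical sample rarefied to a depth of 40 sequences.
sample = {"SV1": 50, "SV2": 30, "SV3": 20}
sub = rarefy(sample, depth=40)
print(richness(sub), round(pielou_evenness(sub), 3))
```

J ranges from near 0 (one SV dominates) to 1 (all SVs equally abundant), which is why it complements richness when comparing strategies.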
We performed non-metric multidimensional scaling (NMDS) to evaluate community structures for each sampling strategy and target region, individually. Each NMDS analysis was performed using the ‘metaMDS’ function and stress for all plots was between 0.04 and 0.16. These analyses were performed on Bray–Curtis distances of Hellinger transformed sequence abundances as well as Raup–Crick distances of presence/absence data. The ‘procrustes’ function in vegan was used to assess similarity of patterns produced in the NMDS analyses for the paired sampling strategies and congruency was visualized in Procrustes plots. The ‘protest’ function was used with 1000 permutations to estimate the significance of the Procrustes statistic. We performed analyses on both presence/absence and abundance data to determine how low-abundant SVs influenced the differences between sampling strategies.
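The abundance-based distance used above combines two standard steps: a Hellinger transformation (square root of relative abundance) followed by Bray–Curtis dissimilarity. The sketch below shows both for a pair of samples. It is an illustration rather than the vegan code path, and the count vectors are hypothetical.

```python
import math


def hellinger(row):
    """Hellinger transform: square root of each SV's relative abundance."""
    total = sum(row)
    if total == 0:
        return [0.0] * len(row)
    return [math.sqrt(x / total) for x in row]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


# Hypothetical SV counts for the same plant under the two strategies.
strategy_1 = [120, 30, 0, 10]
strategy_2 = [90, 0, 45, 25]
d = bray_curtis(hellinger(strategy_1), hellinger(strategy_2))
print(round(d, 3))
```

The square-root step down-weights dominant SVs, so the resulting distance is less driven by a few very abundant taxa than Bray–Curtis on raw counts would be.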
To determine variation in microbial communities among individual plants when we extracted from multiple subsamples, we performed NMDS analyses as well as a PERMANOVA using the ‘adonis’ function. We also performed a PERMANOVA to detect seasonal differences between May and September leaf-disc collections. Figures 2–6 were generated using ggplot249.
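For readers unfamiliar with PERMANOVA, the following sketch shows how the pseudo-F statistic, R2 and a permutation P value can be computed from a distance matrix for a single grouping factor. It follows the standard sums-of-squares partitioning of distances and is not the vegan ‘adonis’ implementation. The toy distances, group labels and function names are hypothetical.

```python
import random


def _sum_of_squares(dist, idx):
    """Sum of squared pairwise distances among `idx`, divided by group size."""
    n = len(idx)
    total = sum(dist[idx[i]][idx[j]] ** 2
                for i in range(n) for j in range(i + 1, n))
    return total / n


def permanova(dist, groups, permutations=999, seed=0):
    """One-way PERMANOVA returning (pseudo-F, R2, permutation P value)."""
    n = len(groups)
    levels = sorted(set(groups))
    a = len(levels)
    ss_total = _sum_of_squares(dist, list(range(n)))

    def stats(labels):
        ss_within = sum(
            _sum_of_squares(dist, [i for i, g in enumerate(labels) if g == lev])
            for lev in levels
        )
        ss_among = ss_total - ss_within
        if ss_within == 0:
            return float("inf"), ss_among / ss_total
        f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return f, ss_among / ss_total

    f_obs, r2 = stats(groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)          # permute group labels among samples
        if stats(shuffled)[0] >= f_obs:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return f_obs, r2, p_value


# Toy example: four extractions placed on a line, two per plant.
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(x - y) for y in positions] for x in positions]
f, r2, p = permanova(dist, ["plantA", "plantA", "plantB", "plantB"])
print(round(f, 1), round(r2, 3), p)
```

R2 is the share of total variation in pairwise distances attributable to the grouping factor, so a high R2 among plants means extractions from the same plant resemble each other far more than extractions from different plants.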
Results
Richness and evenness between sampling strategies
Contrary to our predictions, richness differed only for root bacteria, where homogenizing before subsampling resulted in more SVs recovered than homogenizing after subsampling (p = 0.04, Supplementary Table S2 and Fig. S1). Evenness did not differ for any microbial group. In addition, neither sampling strategy produced saturated species accumulation curves for any microbial community sampled, although AMF sampling approached saturation (Supplementary Fig. S2). This indicated inadequate sampling to characterize site richness, due to species turnover among plants. However, sequencing effort curves did saturate for all groups, indicating that sequencing depth was not a limiting factor in estimating the richness in individual samples (Fig. 2).
The most abundant SVs (those occurring in at least half of our samples and recovered by both sampling strategies) represented only a small proportion of the microbial communities recovered. This included 5.5% of all AMF SVs (within the genera Glomus and Claroideoglomus), while just 0.7% of total bacterial SVs (Bacillus, unknown bacteria) and 0.9% of non-AM fungal SVs (Nectriaceae, Plectosphaerella, Tetracladium) fit these criteria. Foliar fungi were the most SV-rich group (Fig. 3), and only 0.8% of total SVs (Mycosphaerella) occurred in at least half of all samples and were recovered by both sampling strategies.
Community overlap between sampling strategies
We examined the community overlap, that is, the total number and identity of individual SVs irrespective of their relative abundance, that were recovered from the milkweed population using both sampling strategies. AMF had the lowest total SV richness and the highest community overlap at 61% (Fig. 3). Even when rarefying AMF at 700 sequences as opposed to 7820 sequences, to more closely match rarefaction levels of other groups, overlap was maintained at 61% between sampling strategies. This indicated that it was not the higher rarefaction numbers that caused the greater overlap observed in this group. Sequencing rarefaction curves also showed sufficient sequence numbers to capture a majority of sample richness, even at the lower rarefaction levels. Bacteria and non-AM fungi in roots had a moderate overlap (34% each) and foliar fungal endophytes had the highest SV richness and the least overlap, with just 10% of total SVs recovered by both sampling strategies.
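The overlap metric can be made concrete with a small sketch. It assumes that percentage overlap is the number of SVs recovered by both strategies divided by the total number of distinct SVs recovered by either strategy, which is how "of total SVs recovered by both sampling strategies" is interpreted here. The SV identifiers are hypothetical.

```python
def sv_overlap(svs_a, svs_b):
    """Percentage of all distinct SVs that were recovered by both strategies."""
    a, b = set(svs_a), set(svs_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


# Hypothetical SV sets for the two sampling strategies.
after_subsampling = {"SV1", "SV2", "SV3"}
before_subsampling = {"SV2", "SV3", "SV4", "SV5"}
print(sv_overlap(after_subsampling, before_subsampling))
```

Because the measure ignores abundance, a single read of an SV in one strategy counts as much as a dominant SV, which is why rare SVs weigh heavily on it.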
To explore the extent to which low-abundant SVs influenced the differences between each sampling strategy, we gradually removed SVs that were represented by < 0.01% and then < 0.05% of total sequences (Supplementary Table S1). This resulted in the removal of SVs represented by fewer than 16, 2, 40, and 47 sequences (< 0.01%); and 78, 11, 201, and 235 sequences (< 0.05%), for non-AM root fungi, bacteria, FFE and AMF, respectively. At each removal step, the overlap of microbial communities gradually increased (Table 1). After removing SVs that were represented by < 0.05% of all sequences, the AMF community overlapped 92% between the two sampling strategies. Non-AM fungi in roots overlapped 63%. Bacterial and FFE SV composition, however, remained more different than alike, with just a 47% and 35% overlap, respectively. Since foliar fungi showed the greatest differences, we then compared SV overlap after clustering at 98.5% similarity and 90% co-occurrence. This coarser clustering did not increase the overlap between sampling strategies (7% as opposed to 10%), suggesting that, at least for this group, the fact that we chose to cluster at 100% similarity using DADA2 did not explain the general lack of overlap.
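The stepwise removal of low-abundance SVs can be sketched as a simple threshold on each SV's share of all sequences in a microbial group. The sketch assumes the percentage is calculated against the grand total of sequences for that group; the counts and the function name are hypothetical.

```python
def drop_rare_svs(sv_totals, min_percent):
    """Keep SVs whose total sequence count is at least `min_percent`
    percent of all sequences in the data set."""
    grand_total = sum(sv_totals.values())
    cutoff = grand_total * min_percent / 100.0
    return {sv: n for sv, n in sv_totals.items() if n >= cutoff}


# Hypothetical sequence totals per SV across all samples of one group.
totals = {"SV1": 9000, "SV2": 990, "SV3": 8, "SV4": 2}
for threshold in (0.01, 0.05):
    kept = drop_rare_svs(totals, threshold)
    print(f"< {threshold}% removed: {sorted(kept)}")
```

Recomputing overlap on the filtered SV sets at each threshold then shows how much of the disagreement between strategies is carried by rare SVs.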
Within individual plants, the number of SVs overlapping in both sampling strategies was highly variable, particularly for all non-AM fungal communities (Table 2). For foliar fungal endophytes, the percentage SV overlap within a single plant ranged between 0 and 36%. Similarly, for root bacteria, the percentage SV overlap ranged from 0 to 33% among individual plants. Three A. speciosa individuals yielded completely different communities between sampling strategies (0% overlap in SVs) for foliar fungi. Root bacteria and non-AM root fungi each had one plant with non-overlapping communities between sampling strategies (Supplementary Fig. S3). Three and four additional plants yielded just a single common SV between the two sampling strategies for FFE and non-AM fungi, respectively, whereas 2 plants had only 1 overlapping SV in root bacteria.
Procrustes analyses reveal structural differences between sampling strategies
When we ran Procrustes analyses on bacterial and fungal groups using presence/absence data, we saw significant differences between sampling strategies for all groups except for AMF (Table 3). When considering the relative abundances of all SVs, as opposed to basing analyses on presence/absence, as we did above, the two sampling strategies still recovered significantly different microbial communities in root bacteria and in non-AM root fungi (Fig. 4, Table 3). For FFE, the two sampling strategies produced slightly more correlated community structures, but the similarity was weak (Table 3). This is likely because some plants had very similar FFE communities, while others had completely different FFE communities (Supplementary Fig. S3). Only AMF communities were consistently similar between sampling strategies when considering SV abundance.
Multiple extractions per plant revealed severe undersampling of FFE
From the six leaves collected from plants in September 2016, we extracted DNA from all remaining leaf tissue (1240–2460 mg per plant). Despite increasing the processed plant tissue by two orders of magnitude, species accumulation curves for individual plants still failed to approach a plateau (Fig. 5). This was true even when clustering SVs at 98.5% using the LULU method. Post-clustering curation using the LULU method removed fewer than 10% of SVs and did not change the overall results. Where extractions from subsamples of 30 mg of leaf tissue (6 discs from 6 different leaves) had recovered 7–26 total foliar fungal SVs per plant, extracting from tissue representing our entire bulk collection of 6 leaves per plant or 10 × as much leaf tissue, resulted in 76–142 total SVs per plant. The average richness recovered by each extraction was 24 ± 8 SVs and the average number of unique SVs added by each consecutive extraction was 12 ± 6 SVs. Extractions from the same plant overlapped as little as 21% of SVs, showing that extractions from the same homogenized plant tissue can vary substantially, even when extracting from much larger volumes. Perhaps not surprisingly, larger leaves contained more FFE SVs than did smaller leaves, indicating that overall richness increases with plant size.
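The accumulation logic behind these figures (total SVs per plant and unique SVs added by each consecutive extraction) can be sketched as follows. The sketch accumulates extractions in the order given, whereas accumulation curves are often averaged over random orderings, so it is a simplification. The SV sets are hypothetical.

```python
def accumulate_svs(extractions):
    """Return (cumulative SV richness, new SVs added) for each extraction in order."""
    seen = set()
    cumulative, newly_added = [], []
    for svs in extractions:
        new = set(svs) - seen          # SVs not found in any earlier extraction
        newly_added.append(len(new))
        seen |= new
        cumulative.append(len(seen))
    return cumulative, newly_added


# Hypothetical SV sets from three consecutive extractions of one plant.
plant = [{"a", "b", "c"}, {"b", "c", "d"}, {"d", "e"}]
cumulative, added = accumulate_svs(plant)
print(cumulative, added)
```

A curve that keeps rising, with each extraction still adding new SVs, is the signature of undersampling described in this section.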
Seasonal variation visible despite undersampling, but subtler differences were masked
With 30 mg subsampled per plant we were able to see broad seasonal differences between May and September (Fig. 6A). However, this approach did not allow us to differentiate among plants that harbored more similar communities. This became apparent when analyzing multiple replicates per plant (Fig. 6B). When we visualized the communities recovered in each additional extraction, we observed significant variation among plants that was not apparent when we only had community data from single 30 mg extractions. With the additional extractions we observed strong host filtering among individual plants (PERMANOVA, R2 = 0.91, P < 0.001). Subsequent pairwise analyses revealed that the fungal community within individual plants varied significantly from all other plants (Pairwise PERMANOVA, P < 0.01), and that three of the plants appeared much more similar to each other than the other two plants, potentially due to factors associated with their spatial distribution within the site.
Discussion
Different sampling strategies yield different microbial communities
The sampling strategies compared in this study (homogenizing tissue before subsampling and homogenizing tissue after subsampling) are common methods found in the literature for characterizing plant-associated microbial communities14,23,26,29. Procrustes analyses and community overlap between sampling strategies demonstrated that different strategies can capture disparate microbial communities within plants, with the extent of these differences depending on the community targeted and plant tissue type sampled. In FFE as well as bacterial and non-AM fungal communities in roots, subsamples from the same plant resulted in completely different sets of species recovered, illustrating the severe undersampling that is inherent to each of these strategies. With these sampling strategies, we are undoubtedly sacrificing power and accuracy to characterize the subtler aspects of plant microbiome interactions, despite often seeing community differences across landscapes, treatments or seasons.
Richness was higher when homogenizing before subsampling for bacteria only, despite differences observed in composition for all groups. It is perhaps surprising that homogenizing plant tissues before subsampling did not also recover more fungal species than homogenizing after subsampling, because the former approach initially represents more plant tissue. Indeed, a previous study showed that sample pooling or homogenizing before subsampling resulted in a higher richness of soil fungi compared to equally sized individual samples50. Song et al. (2015)50 also found that multiple individual subsamples, rather than a single homogenized subsample, resulted in higher richness. This may suggest that the scale at which we are physically able to break down plant tissues, as opposed to soil, is not always fine enough to sufficiently homogenize the fungi within. Because of this, plant-associated microbial communities may require a greater sampling effort than soil microbes. Additionally, the removal of low-abundance SVs did not result in differences in richness between the two sampling strategies for any microbial group, suggesting that neither strategy is better at capturing rare species. Although this study was performed only on milkweed plants, we believe that these results are applicable to other plant species as well. The richness reported here is similar to that in other studies of plant-associated microbes (e.g.51,52), indicating that differences in subsamples were not due to extreme richness of milkweed-associated microbes.
Microbial diversity should inform sampling effort
The higher congruency that we saw between sampling strategies for AMF compared to other microbial communities may be due to differences in their local and global estimated richness. While the global number of AMF species has been estimated in the hundreds to low thousands42,53, global estimates of fungal species in general range into the millions54,55. A recent global estimate of bacterial richness suggests similar scales56. In this study specifically, AMF had the lowest total SV richness and the greatest similarity between sampling strategies, while foliar fungal endophytes had the highest SV richness and the lowest overlap of SVs between strategies. Since the amount of tissue sampled was equal for all microbial communities, the sampling effort relative to the whole community was likely much higher for AMF than it was for bacteria and non-AM fungi. Consequently, with each sample we are likely sampling a much larger proportion of true AMF species richness.
Even though the estimated total community richness was highest for foliar fungi, the average estimated richness per individual plant was highest for AMF. This suggests that similar AMF SVs re-occurred across all plants with low species turnover. On the other hand, fungi in leaves had lower average richness per plant (Fig. 4, Supplementary Fig. S3), but the highest total richness, meaning that there was higher turnover of FFE species among plants sampled. These results may be a direct reflection of the overall community richness of the different microbial groups as well as their ability to spread and co-occur within plants. Based on these patterns, more individual plants and a greater sampling effort within individuals are likely needed to characterize FFE communities compared to AMF communities.
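The contrast between per-plant richness and total richness can be illustrated with a small sketch that partitions richness into a mean per-plant component, a total component, and their ratio (Whittaker's multiplicative turnover). This is an illustration only and not the estimator used in the study; the function name and both toy data sets are hypothetical.

```python
from statistics import mean


def richness_partition(plants):
    """plants: dict mapping plant id -> set of SV identifiers found in it.

    Returns (mean per-plant richness, total richness, total / mean).
    """
    if not plants:
        raise ValueError("need at least one plant")
    alpha = mean(len(svs) for svs in plants.values())   # per-plant richness
    gamma = len(set().union(*plants.values()))          # richness across plants
    if alpha == 0:
        raise ValueError("no SVs detected in any plant")
    return alpha, gamma, gamma / alpha                   # ratio = turnover


# Hypothetical toy data: the AMF-like set re-uses the same SVs across plants,
# the FFE-like set has fewer SVs per plant but little sharing among plants.
amf_like = {"p1": {"a", "b", "c", "d"},
            "p2": {"a", "b", "c", "e"},
            "p3": {"a", "b", "d", "e"}}
ffe_like = {"p1": {"f1", "f2", "f3"},
            "p2": {"f1", "f4", "f5"},
            "p3": {"f6", "f7", "f8"}}

for name, data in (("AMF-like", amf_like), ("FFE-like", ffe_like)):
    alpha, gamma, beta = richness_partition(data)
    print(f"{name}: per-plant={alpha:.2f}, total={gamma}, turnover={beta:.2f}")
```

A high ratio means that each additional plant contributes many SVs not seen before, which is the situation in which both more plants and more effort per plant are needed.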
Rare SVs contribute to variation among subsamples
Our results show that low-abundance, rare SVs contributed substantially to the differences seen between sampling strategies. Even AMF communities, which were already similar, increased in overlap by 50% between strategies after low-abundance SVs (represented by < 0.05% of sequences, Table 1) were removed. Microbial community distributions are often characterized by long tails of low-abundance species15, and as such, the likelihood of resampling rare species in each replicate can be low. In one study, Zhou et al. (2011)57 randomly sampled a simulated community with an exponential distribution. They observed only a 53% overlap between two samples when sampling just 1% of that community. We see even more extreme differences in overlap in this study, where initial sampling effort is also low relative to the whole microbial community.
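The resampling argument above can be sketched with a short simulation: two independent samples are drawn from the same long-tailed community, and the fraction of taxa they share is computed. This is not a reproduction of the Zhou et al. simulation; the geometric abundance decay, the community size, the read counts, and the overlap definition (shared taxa divided by all taxa detected in either sample) are assumptions chosen for illustration.

```python
import random


def simulate_overlap(n_taxa=500, decay=0.98, n_reads=200, seed=1):
    """Overlap between two random samples from one long-tailed community.

    Abundances fall off geometrically with rank (an assumed shape).
    Overlap is taken as shared taxa / taxa seen in either sample.
    """
    if n_reads < 1:
        raise ValueError("n_reads must be positive")
    rng = random.Random(seed)
    weights = [decay ** rank for rank in range(n_taxa)]
    taxa = range(n_taxa)
    sample_a = set(rng.choices(taxa, weights=weights, k=n_reads))
    sample_b = set(rng.choices(taxa, weights=weights, k=n_reads))
    return len(sample_a & sample_b) / len(sample_a | sample_b)


# Compare a shallow and a deep sampling effort on the same community.
for reads in (50, 500, 5000):
    print(reads, round(simulate_overlap(n_reads=reads), 2))
```

Because both samples come from an identical community, any lack of overlap in this sketch is due to sampling effort alone, which is the point the paragraph above makes about rare taxa.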
The importance of rare microbes may vary and is easily overlooked in favor of highly abundant and perhaps more influential fungi or bacteria. However, due to the compositional nature inherent to amplicon data, those SVs that appear to be in low abundance at the time of sampling may only be relatively so. Also, we do not yet fully understand microbial species turnover or succession. Plant-associated microbial communities can change significantly in just a matter of months58, or even weeks59. In addition, the exact relationship between sequence number and biomass of a species is variable60, and there is little evidence, if any, that sequence number is directly proportional to a species’ impact in an ecosystem. Some microbes may be more metabolically active than others, despite appearing to be present in smaller quantities61. The recovery of the rare microbial community is arguably just as vital as the recovery of species that appear more abundant.
Bioinformatics pipelines that artificially inflate the number of SVs, especially low-abundance or rare SVs, could inflate the differences we see among community subsamples. Hundreds of bioinformatics approaches have been used to analyze amplicon data, and no consensus exists on which is best. However, a recent study comparing the performance of 360 different software and parameter combinations showed that DADA2 (which we used here), with no filter other than the removal of low-quality and chimeric sequences, was best for recovering true richness and composition from a mock fungal community of 189 different strains62. If anything, DADA2 can erroneously lump closely related species41, which would make it more conservative than other methods. Nevertheless, in an effort not to overestimate the true variation between the strategies compared in this study, we assessed the relative importance of rare taxa through the gradual removal of less abundant sequences, and we also used LULU, which is sometimes employed to reduce artifactual diversity48. We also removed all SVs that could not be confidently assigned to known microbial taxa. Even with these approaches, substantial variation remained due to the inherent undersampling of the strategies compared.
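The gradual removal of rare sequences described above can be sketched as follows: SVs below a relative-abundance cutoff are dropped from each subsample, and the overlap between the two subsamples is recomputed at each cutoff. The function names, the toy read counts, the per-sample application of the cutoff, and the overlap definition (shared SVs divided by all SVs in either subsample) are assumptions made for illustration and may differ from the study's procedure.

```python
def drop_rare(counts, min_fraction):
    """Keep SVs whose share of this sample's reads is at least min_fraction."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sv: n for sv, n in counts.items() if n / total >= min_fraction}


def shared_fraction(counts_a, counts_b):
    """Shared SVs divided by all SVs present in either sample."""
    a, b = set(counts_a), set(counts_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_by_threshold(counts_a, counts_b, thresholds):
    """Overlap between two samples after filtering at each threshold."""
    return {t: shared_fraction(drop_rare(counts_a, t), drop_rare(counts_b, t))
            for t in thresholds}


# Hypothetical read counts for two subsamples of the same plant.
strategy_one = {"sv1": 600, "sv2": 300, "sv3": 95, "sv4": 4, "sv5": 1}
strategy_two = {"sv1": 550, "sv2": 350, "sv3": 90, "sv6": 7, "sv7": 3}

result = overlap_by_threshold(strategy_one, strategy_two, [0.0, 0.005, 0.05])
for threshold, overlap in result.items():
    print(f"cutoff={threshold:.3f}  overlap={overlap:.2f}")
```

In this toy case the abundant SVs are shared and only the rare ones differ, so overlap rises as the cutoff increases, mirroring the pattern reported for the real data.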
Severe undersampling obscures subtle community variation
With the possible exception of AMF, none of the sampling effort curves approached an asymptote, meaning that both sampling strategies failed to adequately characterize the microbial communities present within a single plant or plant community (Supplementary Fig. S2). We found that multiple replicates from a single plant can vary by nearly 80% in FFE SVs, even when extracting from larger amounts of tissue (250 mg vs. 30 mg). Lindahl et al. (2013)63 suggest that if duplicate subsamples differ substantially in community composition, these differences threaten to obscure finer-scale treatment effects and ecological correlations, and sampling effort should be increased. Indeed, a more robust sampling effort through the use of multiple technical replicates revealed remarkably strong (R² = 0.91) and significant host filtering among individual plants that would have gone unobserved if we had extracted DNA from just a single replicate per plant. Although we may be able to observe patterns among sites or treatments in undersampled data, it is difficult to train models and make predictions or inferences about the larger microbial population.
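What it means for a sampling effort curve to approach an asymptote can be shown with a small sketch that accumulates SV richness over replicates, averages the curve over random replicate orders, and reports how much of the final richness was contributed by the last replicate. This is a simplified stand-in for dedicated rarefaction and extrapolation tools; the function names and both toy data sets are hypothetical.

```python
import random


def accumulation_curve(replicates):
    """Cumulative SV richness as replicates are added in the given order."""
    seen, curve = set(), []
    for svs in replicates:
        seen |= set(svs)
        curve.append(len(seen))
    return curve


def mean_accumulation(replicates, n_perm=200, seed=0):
    """Accumulation curve averaged over random orderings of the replicates."""
    rng = random.Random(seed)
    order = list(replicates)
    totals = [0] * len(order)
    for _ in range(n_perm):
        rng.shuffle(order)
        for i, richness in enumerate(accumulation_curve(order)):
            totals[i] += richness
    return [t / n_perm for t in totals]


def last_step_gain(curve):
    """Fraction of final richness that was added by the last replicate."""
    if len(curve) < 2 or curve[-1] == 0:
        raise ValueError("need at least two replicates with SVs")
    return (curve[-1] - curve[-2]) / curve[-1]


# Hypothetical replicates: one set re-detects the same SVs, the other
# finds entirely new SVs in every extraction.
saturating = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}]
undersampled = [{"x1", "x2"}, {"x3", "x4"}, {"x5", "x6"}, {"x7", "x8"}]

for name, reps in (("saturating", saturating), ("undersampled", undersampled)):
    curve = mean_accumulation(reps)
    print(name, curve, f"last-step gain={last_step_gain(curve):.2f}")
```

A last-step gain near zero indicates that further replicates add little, whereas a large gain indicates that richness is still climbing and the community remains undersampled.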
As Unterseher et al. (2011)15 suggest, it is often unnecessary to saturate richness in microbial communities, but this should be carefully considered before developing experiments and testing hypotheses. One must take into account the objectives of the study and the accuracy and precision required to meet those objectives. Although the methods traditionally employed to sample plant-associated microbes may be sufficient to observe general landscape-scale differences, it is important to recognize that we are not characterizing these communities; rather, we are taking a sliver of a ‘snapshot’ of species composition from a single point in time. A large proportion of true microbial diversity in most systems will likely remain undetected, and the specific results may be limited in their replicability.
Summary and recommendations
Although it used to be common practice, multiple studies now suggest that duplicate and triplicate PCR reactions are unnecessary for fungi and bacteria64,65. However, based on the results of this study, we recommend the inclusion of a different kind of technical replicate (i.e., multiple extraction replicates from a single plant), in addition to biological replicates (multiple plants in a single population), especially when studying factors that may generate subtler differences in plant-associated microbial communities. We show that extracting DNA from the standard 25–30 mg (dry weight) per plant can result in microbial communities that vary by as much as 100%, and that extractions of 250 mg from a single plant can vary by as much as 79%. The need for increased replication is particularly important if site, treatment, or seasonal differences may be obscured by other environmental drivers. Striving for a more comprehensive understanding of the depth and structure of plant microbiomes and their responses to their surrounding environments will help us to better understand the exact functions of plant–microbe associations and how we might manipulate plant microbiomes in order to reduce disease or increase plant productivity in the future.
Sample size and sampling effort have surpassed sequencing depth and cultivation as the bottleneck when characterizing plant-associated microbial communities. A good sampling design is essential to approximate underlying patterns in microbial community composition in a reproducible manner and both sampling effort and size should be clearly justified. Schloss (2018)20 elaborates on the concern of replicability and reproducibility with the growing use of Illumina-based studies of microbial communities, and describes PCR bias, sequencing errors, and cryptic or poorly described bioinformatics as preventing data from being generalizable to other environments. Undersampling and poor to absent descriptions of sampling effort and strategy also contribute to this problem, and the current frequency of undersampling should be concerning. The differences we see here between sampling strategies and the extreme variation among replicates suggest that many studies of plant-associated microbial communities may not be sufficiently replicable or reproducible.
Due to variation in community structure among AMF, bacteria, and non-AM fungi, standardizing a sampling protocol for all organisms is difficult, and best practices will, to some degree, depend on the specific organism targeted, richness, and site. Since neither sampling approach appeared to outperform the other, the overall sampling effort may be of greater importance in many studies. For example, when investigating landscape-scale differences in abundant or species-poor microorganisms, a smaller sampling effort is often sufficient. However, we suggest that more diverse plant-associated microbial communities, such as foliar fungal endophytes and root-associated bacteria, necessitate a more robust sampling effort than what is currently practiced in the literature. Per-sample richness relative to the estimated total community richness should always be considered when determining the optimum sampling strategy for any system. For example, sampling strategies and volumes sufficient for sampling AMF communities in extreme environments are likely not adequate for sampling fungal endophytes in the tropics, where richness is high66. The need for increased sampling effort is especially pertinent if noise associated with sites, treatments or sample processing may obscure the differences among them. In studies that fail to see differences in microbial communities among sites or treatments, sampling effort should always be examined as a potential impediment.
In summary, we recommend that: (1) authors provide more transparent, detailed sampling information as well as sequencing and sampling effort curves, (2) sampling effort is not arbitrary, but is adjusted based on the diversity of plant microbiomes and per-sample richness relative to total community richness (both of which may require preliminary sampling), and (3) authors consider increased sampling effort when investigating smaller-scale drivers of microbial communities such as host filtering or subtle gradients, or when attempting to truly characterize microbial communities. Finally, controlling for the amount of plant tissue sampled both before and after homogenization, although not tested here, may be the best strategy for reducing potential bias. Standardizing how we sample plant-associated microbial communities as suggested by Dickie et al. (2018)19, or at the very least, insisting on more robust and transparent sampling strategies, will allow for more accurate and comprehensive analyses as well as better cross-study comparisons in the future.
Data availability
Raw amplicon sequence data: NCBI Sequence Read Archive (SRA), BioProject accession number PRJNA633878. Sequence variants: GenBank (https://www.ncbi.nlm.nih.gov/genbank) accessions: MN029770-MN030137 (FFE); MN029143-MN029446 (AMF); MN029447-MN029769 (root fungi); MN069335-MN069498 (root bacteria). Community matrices and metadata: Figshare https://doi.org/10.6084/m9.figshare.11741673.
References
Turnbaugh, P. J. et al. A core gut microbiome in obese and lean twins. Nature 457(7228), 480–484. https://doi.org/10.1038/nature07540 (2009).
Jiang, H. et al. Altered fecal microbiota composition in patients with major depressive disorder. Brain Behav. Immun. 48, 186–194. https://doi.org/10.1016/j.bbi.2015.03.016 (2015).
Gilbert, J. A. et al. Current understanding of the human microbiome. Nat. Med. 24(4), 392–400. https://doi.org/10.1038/nm.4517 (2018).
Berg, G., Grosch, R. & Smalla, K. Plant microbial diversity is suggested as the key to future biocontrol and health trends. FEMS Microbiol. Ecol. https://doi.org/10.1093/femsec/fix050 (2017).
Hirakue, A. & Sugiyama, S. Relationship between foliar endophytes and apple cultivar disease resistance in an organic orchard. Biol. Control 127, 139–144. https://doi.org/10.1016/j.biocontrol.2018.09.007 (2018).
Cline, L. C., Schilling, J. S., Menke, J., Groenhof, E. & Kennedy, P. G. Ecological and functional effects of fungal endophytes on wood decomposition. Funct. Ecol. 32(1), 181–191. https://doi.org/10.1111/1365-2435.12949 (2018).
Bullington, L. S. & Larkin, B. G. Using direct amplification and next-generation sequencing technology to explore foliar endophyte communities in experimentally inoculated western white pines. Fungal Ecol. 17, 170–178. https://doi.org/10.1016/j.funeco.2015.07.005 (2015).
Siddique, A. B. & Unterseher, M. A cost-effective and efficient strategy for Illumina sequencing of fungal communities: A case study of beech endophytes identified elevation as main explanatory factor for diversity and community composition. Fungal Ecol. 20, 175–185 (2016).
Unterseher, M., Siddique, A. B., Brachmann, A. & Peršoh, D. Diversity and composition of the leaf mycobiome of beech (Fagus sylvatica) are affected by local habitat conditions and leaf biochemistry. PLoS ONE 11(4), e0152878. https://doi.org/10.1371/journal.pone.0152878 (2016).
Lekberg, Y. & Waller, L. P. What drives differences in arbuscular mycorrhizal fungal communities among plant species?. Fungal Ecol. 24, 135–138. https://doi.org/10.1016/j.funeco.2016.05.012 (2016).
Pei, C. et al. Diversity of endophytic bacteria of Dendrobium officinale based on culture-dependent and culture-independent methods. Biotechnol. Biotechnol. Equip. 31(1), 112–119. https://doi.org/10.1080/13102818.2016.1254067 (2017).
Dissanayake, A. J. et al. Direct comparison of culture-dependent and culture-independent molecular approaches reveal the diversity of fungal endophytic communities in stems of grapevine (Vitis vinifera). Fungal Divers. 90(1), 85–107. https://doi.org/10.1007/s13225-018-0399-3 (2018).
Arnold, A. E. Understanding the diversity of foliar endophytic fungi: Progress, challenges, and frontiers. Fungal Biol. Rev. 21(2–3), 51–66. https://doi.org/10.1016/j.fbr.2007.05.003 (2007).
Jumpponen, A. & Jones, K. L. Massively parallel 454 sequencing indicates hyperdiverse fungal communities in temperate Quercus macrocarpa phyllosphere. New Phytol. 184(2), 438–448. https://doi.org/10.1111/j.1469-8137.2009.02990.x (2009).
Unterseher, M. et al. Species abundance distributions and richness estimations in fungal metagenomics—lessons learned from community ecology: Community ecology in fungal metagenomics. Mol. Ecol. 20(2), 275–285. https://doi.org/10.1111/j.1365-294X.2010.04948.x (2011).
McMurdie, P. J. & Holmes, S. Waste not, want not: Why rarefying microbiome data is inadmissible. PLoS Comput. Biol. 10(4), e1003531. https://doi.org/10.1371/journal.pcbi.1003531 (2014).
Allali, I. et al. A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome. BMC Microbiol. 17(1), 194. https://doi.org/10.1186/s12866-017-1101-8 (2017).
Bolyen, E. et al. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. PeerJ Prepr. https://doi.org/10.7287/peerj.preprints.27295v2 (2018).
Dickie, I. A. et al. Towards robust and repeatable sampling methods in eDNA-based studies. Mol. Ecol. Resour. 18(5), 940–952. https://doi.org/10.1111/1755-0998.12907 (2018).
Schloss, P. D. Identifying and overcoming threats to reproducibility, replicability, robustness, and generalizability in microbiome research. mBio 9(3), 13 (2018).
Daleo, P. et al. Nitrogen enrichment suppresses other environmental drivers and homogenizes salt marsh leaf microbiome. Ecology 99(6), 1411–1418. https://doi.org/10.1002/ecy.2240 (2018).
Toju, H., Okayasu, K. & Notaguchi, M. Leaf-associated microbiomes of grafted tomato plants. Sci. Rep. 9(1), 1787. https://doi.org/10.1038/s41598-018-38344-2 (2019).
Zimmerman, N. B. & Vitousek, P. M. Fungal endophyte communities reflect environmental structuring across a Hawaiian landscape. Proc. Natl. Acad. Sci. 109(32), 13022–13027. https://doi.org/10.1073/pnas.1209872109 (2012).
Hahn, P. G. Effects of short- and long-term variation in resource conditions on soil fungal communities and plant responses to soil biota. Front. Plant Sci. 9, 15 (2018).
Saldierna Guzmán, J. P., Nguyen, K. & Hart, S. C. Simple methods to remove microbes from leaf surfaces. J. Basic Microbiol. 60(8), 730–734. https://doi.org/10.1002/jobm.202000035 (2020).
Busby, P. E., Peay, K. G. & Newcombe, G. Common foliar fungi of Populus trichocarpa modify Melampsora rust disease severity. New Phytol. 209(4), 1681–1692. https://doi.org/10.1111/nph.13742 (2016).
Gdanetz, K. & Trail, F. The wheat microbiome under four management strategies, and potential for endophytes in disease protection. Phytobiomes J. 1(3), 158–168. https://doi.org/10.1094/PBIOMES-05-17-0023-R (2017).
Haas, J. C. Microbial community response to growing season and plant nutrient optimisation in a boreal Norway spruce forest. Soil Biol. Biochem. 125, 197–209 (2018).
Barge, E. G., Leopold, D. R., Peay, K. G., Newcombe, G. & Busby, P. E. Differentiating spatial from environmental effects on foliar fungal communities of Populus trichocarpa. J. Biogeogr. 46(9), 2001–2011. https://doi.org/10.1111/jbi.13641 (2019).
Bunn, R. A., Simpson, D. T., Bullington, L. S., Lekberg, Y. & Janos, D. P. Revisiting the ‘direct mineral cycling’ hypothesis: Arbuscular mycorrhizal fungi colonize leaf litter, but why?. ISME J. 13(8), 1891–1898. https://doi.org/10.1038/s41396-019-0403-2 (2019).
Corcoll, N. et al. Comparison of four DNA extraction methods for comprehensive assessment of 16S rRNA bacterial diversity in marine biofilms using high-throughput sequencing. FEMS Microbiol. Lett. 364, fnx139. https://doi.org/10.1093/femsle/fnx139 (2017).
Dumbrell, A. J. et al. Distinct seasonal assemblages of arbuscular mycorrhizal fungi revealed by massively parallel pyrosequencing. New Phytol. 190(3), 794–804. https://doi.org/10.1111/j.1469-8137.2010.03636.x (2011).
Lee, J., Lee, S. & Young, J. P. W. Improved PCR primers for the detection and identification of arbuscular mycorrhizal fungi. FEMS Microbiol. Ecol. 65(2), 339–349. https://doi.org/10.1111/j.1574-6941.2008.00531.x (2008).
Ihrmark, K. et al. New primers to amplify the fungal ITS2 region—Evaluation by 454-sequencing of artificial and natural communities. FEMS Microbiol. Ecol. 82(3), 666–677. https://doi.org/10.1111/j.1574-6941.2012.01437.x (2012).
Kohout, P. et al. Comparison of commonly used primer sets for evaluating arbuscular mycorrhizal fungal communities: Is there a universal solution?. Soil Biol. Biochem. 68, 482–493. https://doi.org/10.1016/j.soilbio.2013.08.027 (2014).
White, T. J., Bruns, T., Lee, S. & Taylor, J. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A Guide to Methods and Applications (eds Innis, M. A. et al.) 315–322 (Academic Press, London, 1990).
Lekberg, Y. et al. More bang for the buck? Can arbuscular mycorrhizal fungal communities be characterized adequately alongside other fungi using general fungal primers?. New Phytol. 220(4), 971–976. https://doi.org/10.1111/nph.15035 (2018).
Caporaso, J. G. et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. 108(Suppl 1), 4516–4522. https://doi.org/10.1073/pnas.1000080107 (2011).
Bullington, L. S., Lekberg, Y., Sniezko, R. & Larkin, B. The influence of genetics, defensive chemistry and the fungal microbiome on disease outcome in whitebark pine trees: Genetics, terpenes, fungi and disease. Mol. Plant Pathol. 19(8), 1847–1858. https://doi.org/10.1111/mpp.12663 (2018).
Kircher, M., Sawyer, S. & Meyer, M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40(1), e3. https://doi.org/10.1093/nar/gkr771 (2012).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13(7), 581–583. https://doi.org/10.1038/nmeth.3869 (2016).
Öpik, M. et al. Global sampling of plant roots expands the described molecular diversity of arbuscular mycorrhizal fungi. Mycorrhiza 23(5), 411–430. https://doi.org/10.1007/s00572-013-0482-2 (2013).
Kõljalg, U. et al. Towards a unified paradigm for sequence-based identification of fungi. Mol. Ecol. https://doi.org/10.1111/mec.12481 (2013).
Bokulich, N. A. et al. Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome 6(1), 90. https://doi.org/10.1186/s40168-018-0470-z (2018).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2018).
Oksanen, J., Blanchet, F. G., Friendly, M. et al. Vegan: Community Ecology Package (2019). https://cran.r-project.org/package=vegan.
Hsieh, T. C., Ma, K. H. & Chao, A. iNEXT: An R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol. Evol. 7(12), 1451–1456. https://doi.org/10.1111/2041-210X.12613 (2016).
Frøslev, T. G. et al. Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates. Nat. Commun. 8(1), 1188. https://doi.org/10.1038/s41467-017-01312-x (2017).
Wickham, H. Ggplot2: Elegant graphics for data analysis (Springer, New York, 2009). https://doi.org/10.1007/978-0-387-98141-3.
Song, Z. et al. Effort versus reward: Preparing samples for fungal community characterization in high-throughput sequencing surveys of soils. PLoS ONE 10(5), e0127234. https://doi.org/10.1371/journal.pone.0127234 (2015).
Furtado, B. U., Gołębiewski, M., Skorupa, M., Hulisz, P. & Hrynkiewicz, K. Bacterial and fungal endophytic microbiomes of Salicornia europaea. Appl. Environ. Microbiol. 85(13), e00305-19. https://doi.org/10.1128/AEM.00305-19 (2019).
Kuźniar, A. et al. Culture-independent analysis of an endophytic core microbiome in two species of wheat: Triticum aestivum L. (cv. ‘Hondia’) and the first report of microbiota in Triticum spelta L. (cv. ‘Rokosz’). Syst. Appl. Microbiol. 43(1), 126025. https://doi.org/10.1016/j.syapm.2019.126025 (2020).
Davison, J. et al. Global assessment of arbuscular mycorrhizal fungus diversity reveals very low endemism. Science 349(6251), 970–973. https://doi.org/10.1126/science.aab1161 (2015).
Hawksworth, D. L. The magnitude of fungal diversity: The 1.5 million species estimate revisited. Mycol. Res. 105(12), 1422–1432. https://doi.org/10.1017/S0953756201004725 (2001).
Hawksworth, D. L. & Lücking, R. Fungal diversity revisited: 2.2 to 3.8 million species. Microbiol. Spectr. 1, 1. https://doi.org/10.1128/microbiolspec.FUNK-0052-2016 (2017).
Louca, S., Mazel, F., Doebeli, M. & Parfrey, L. W. A census-based estimate of Earth’s bacterial and archaeal diversity. PLoS Biol. 17(2), e3000106. https://doi.org/10.1371/journal.pbio.3000106 (2019).
Zhou, J. et al. Reproducibility and quantitation of amplicon sequencing-based detection. ISME J. 5(8), 1303–1313. https://doi.org/10.1038/ismej.2011.11 (2011).
McTee, M., Bullington, L., Rillig, M. C. & Ramsey, P. W. Do soil bacterial communities respond differently to abrupt or gradual additions of copper?. FEMS Microbiol. Ecol. 95(1), fiy212. https://doi.org/10.1093/femsec/fiy212 (2019).
Gao, C. et al. Strong succession in arbuscular mycorrhizal fungal communities. ISME J. 13(1), 214–226. https://doi.org/10.1038/s41396-018-0264-0 (2019).
Kleiner, M. et al. Assessing species biomass contributions in microbial communities via metaproteomics. Nat. Commun. 8(1), 1558. https://doi.org/10.1038/s41467-017-01544-x (2017).
Jousset, A. et al. Where less may be more: How the rare biosphere pulls ecosystems strings. ISME J. 11(4), 853–862. https://doi.org/10.1038/ismej.2016.174 (2017).
Pauvert, C. et al. Bioinformatics matters: The accuracy of plant and soil fungal community data is highly dependent on the metabarcoding pipeline. Fungal Ecol. 41, 23–33. https://doi.org/10.1016/j.funeco.2019.03.005 (2019).
Lindahl, B. D. et al. Fungal community analysis by high-throughput sequencing of amplified markers—A user’s guide. New Phytol. 199(1), 288–299. https://doi.org/10.1111/nph.12243 (2013).
Egan, C. P. et al. Using mock communities of arbuscular mycorrhizal fungi to evaluate fidelity associated with Illumina sequencing. Fungal Ecol. 33, 52–64. https://doi.org/10.1016/j.funeco.2018.01.004 (2018).
Marotz, C. et al. Triplicate PCR reactions for 16S rRNA gene amplicon sequencing are unnecessary. Biotechniques 67(1), 29–32. https://doi.org/10.2144/btn-2018-0192 (2019).
Arnold, A. E., Maynard, Z., Gilbert, G. S., Coley, P. D. & Kursar, T. A. Are tropical fungal endophytes hyperdiverse?. Ecol. Lett. 3(4), 267–274. https://doi.org/10.1046/j.1461-0248.2000.00159.x (2000).
Acknowledgements
We would like to thank MPG Ranch for funding this research. We would also like to thank Emily Martin and Ben Mason for help performing DNA extractions and PCR. We are eternally grateful for Peter Kennedy, Mike McTee, Morgan McLeod, Marirose Kuhlman, and William Blake, who all provided valuable edits of this manuscript. Sequencing was performed by the IBEST Genomics Resources Core at the University of Idaho and was supported in part by NIH COBRE Grant P30GM103324.
Author information
Contributions
L.B., Y.L. and B.L. conceived the study. L.B., Y.L. and B.L. collected samples. L.B. processed samples, analyzed data and wrote the manuscript with input and edits provided by Y.L. and B.L.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bullington, L.S., Lekberg, Y. & Larkin, B.G. Insufficient sampling constrains our characterization of plant microbiomes. Sci Rep 11, 3645 (2021). https://doi.org/10.1038/s41598-021-83153-9
DOI: https://doi.org/10.1038/s41598-021-83153-9