Abstract
Plants are colonized by distinct pathogenic and commensal microbiomes across different regions of the globe, but the factors driving their geographic variation are largely unknown. Here, using 16S ribosomal DNA and shotgun sequencing, we characterized the associations of the Arabidopsis thaliana leaf microbiome with host genetics and climate variables from 267 populations in the species’ native range across Europe. Comparing the distribution of the 575 major bacterial amplicon variants (phylotypes), we discovered that microbiome composition in A. thaliana segregates along a latitudinal gradient. The latitudinal clines in microbiome composition are predicted by metrics of drought, but also by the spatial genetics of the host. To validate the relative effects of drought and host genotype we conducted a common garden field study, finding 10% of the core bacteria to be affected directly by drought and 20% to be affected by host genetic associations with drought. These data provide a valuable resource for the plant microbiome field, with the identified associations suggesting that drought can directly and indirectly shape genetic variation in A. thaliana via the leaf microbiome.
Main
The widely different environments in which the cosmopolitan species Arabidopsis thaliana is found today1 have left strong signatures of selection throughout its genome2. While geographic differences in abiotic factors are well appreciated, similar differences in the resident microbiota are also likely to influence local plant fitness3. A recent survey of A. thaliana root microbiomes4 found regional differentiation, often reflecting the composition of the soil microbiota. Host location was similarly significantly correlated with both root- and leaf-associated microbial composition of another crucifer, Boechera stricta5.
We already know that host genetics can influence microbiome composition5,6,7,8, and geographic differences in host genetics may in turn structure the resident microbiome, but the two might also be independently affected by physical distance, including abiotic factors that vary geographically4,5. For example, pH is a significant predictor of bacteria in the A. thaliana rhizosphere4, consistent with pH as a major driver of soil bacterial communities9. Similarly, precipitation can be a significant predictor of plant microbiome composition10.
Because previous studies have typically been limited in the number of populations4 or the geographic range surveyed3, it has been difficult to disentangle the effects of host genetics, geography and abiotic factors on the plant-associated microbiome. In this Resource, we use a continental-scale assessment of bacteria that colonize A. thaliana leaves to identify environmental and host genetic factors that are strongly associated with distinct microbiome types. We then determine the environmental variables that best predict microbiome composition. Finally, we follow up with a controlled field experiment to test the relative contributions of host genetics and of water availability to these predictable patterns and a direct demonstration that a common bacterial taxon can provide drought protection. Our results indicate that differential plant survival in low-water environments might in part be due to different bacteria colonizing drought-adapted and drought-susceptible plants.
Results
From February to May 2018, we visited 267 European A. thaliana populations around the end of their vegetative growth and close to the onset of flowering11 (Fig. 1a,b). At each site we collected whole rosettes from two individuals, along with a neighbouring crucifer (family Brassicaceae, primarily Capsella bursa-pastoris), if present, and two soil samples. We evaluated A. thaliana life history traits (Fig. 1c and Extended Data Fig. 1) and extracted information on climate variables for the collection sites12. We assessed the microbial composition of the leaf and soil samples by sequencing the V3–V4 region of the 16S ribosomal RNA locus and identifying amplicon sequence variants (ASVs) using DADA2 (ref. 13). Each ASV was considered a distinct bacterial lineage or phylotype. Host genetics and absolute microbe abundance were assessed by shotgun sequencing plant tissue, which generates reads of host and microbial genomes14.
Phyllosphere composition is distinct from the soil and is host species specific
There is considerable debate as to the origin of the microbes that colonize plants, although soil often has a measurable influence4,15,16. A study across 17 European A. thaliana populations4 found differentiation between root and non-root-associated microbes, but no significant differences between A. thaliana and neighbouring grasses4. Intra-species comparisons in a common garden experiment had suggested that host genetics can explain about 10% of the variance among A. thaliana leaf bacteria17. At the basis of these comparisons is the question of how much the host influences microbiome assembly, either because of active recruitment of specific microbes, or because of the differential ability of microbes to colonize their hosts.
To explicitly test for enrichment of specific taxa in the phyllosphere, we compared soil and plant leaves across all 267 sites via multi-dimensional scaling (MDS; Hellinger transformation). As expected, there was broad-scale separation between the phyllosphere and the soil (Fig. 2a,b). Modelling18 the effect of compartment on the microbial core phylotypes in the phyllosphere revealed differential abundance of 91% (524/575) of phylotypes between the A. thaliana phyllosphere and soil (False Discovery Rate (FDR) <0.01). Focusing on differences among host species18, we found 36% (205/575) of phylotypes to distinguish A. thaliana from neighbouring crucifers (Extended Data Fig. 2). This indicates that inter-host species differences in genetics or phenology have a strong influence on microbiome composition. On a phylotype-by-phylotype basis, abundance in A. thaliana was poorly predicted by a phylotype’s abundance in soil or in the surrounding companion plants (Extended Data Fig. 2).
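The Hellinger transformation used before ordination can be illustrated with a minimal sketch. It assumes raw per-sample ASV counts; the sample names and counts below are invented for illustration and are not data from this study. Each sample is converted to square-root relative abundances, after which ordinary Euclidean distance between samples equals the Hellinger distance up to a constant factor.

```python
import math

def hellinger_transform(counts):
    """Convert one sample's raw ASV counts to square-root relative abundances."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [math.sqrt(c / total) for c in counts]

def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy count table (one list of ASV counts per sample); values are invented.
samples = {
    "leaf_1": [90, 10, 0],
    "leaf_2": [80, 20, 0],
    "soil_1": [5, 15, 80],
}
transformed = {name: hellinger_transform(c) for name, c in samples.items()}

d_leaf_leaf = euclidean(transformed["leaf_1"], transformed["leaf_2"])
d_leaf_soil = euclidean(transformed["leaf_1"], transformed["soil_1"])
print(round(d_leaf_leaf, 3), round(d_leaf_soil, 3))
```

In this toy table the two leaf samples lie much closer to each other than either does to the soil sample, which is the kind of separation an MDS plot of such distances would display.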
Phyllosphere microbial composition varies with latitude
We tested the geographic differentiation of microbiomes using dimensionality reduction for the entire community and assessment of the spatial distribution for each bacterial phylotype. The former reveals global trends in composition, while the latter provides information on individual microbes contributing to such trends. Loadings on both the first and second principal coordinate axes (Fig. 2c) correlated with latitude (Pearson’s r = 0.75, P = 2.2 × 10^−16, and r = −0.24, P = 1.35 × 10^−7, respectively), suggesting geographic structure in the phyllosphere microbiome. Because silhouette scoring19 indicated that A. thaliana phyllosphere microbiomes were best characterized as two distinct types, we used k-means clustering of the Hellinger-transformed counts table to classify our samples (Fig. 2c and Extended Data Fig. 3). We found that the two microbiome types were strongly differentiated by geography, with one dominating in Northern and the other in Southern Europe (Fig. 2d,e). Among individual phylotypes, the relative abundance of one third (33%) was significantly associated with latitude (linear regression, FDR <0.01), but only a small minority, 2%, was correlated with longitude, confirming that Northern and Southern European A. thaliana reproducibly harbour different microbiota. One percent of the plant-associated phylotypes were also significantly correlated in the soil with latitude, suggesting that the latitudinal contrast is formed via colonization.
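How silhouette scoring can favour one number of clusters over another is sketched below. This is a plain-Python illustration of the mean silhouette width, not the implementation used in the study; the points and the two candidate labelings are invented. In practice the labelings would come from k-means runs with different values of k on the Hellinger-transformed table.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette width of a labeling; assumes at least two clusters."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Invented 2D coordinates forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_types = [0, 0, 0, 1, 1, 1]    # hypothetical k = 2 labeling
three_types = [0, 0, 2, 1, 1, 1]  # hypothetical k = 3 labeling

s2 = mean_silhouette(points, two_types)
s3 = mean_silhouette(points, three_types)
print(round(s2, 3), round(s3, 3))
```

The labeling with the higher mean silhouette width is preferred; here that is the two-cluster solution, because splitting one tight group lowers the scores of its members.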
The phyllosphere changes with plant development and the seasons20. To test whether the observed latitudinal phyllosphere contrast could be explained by seasonal and developmental differences, we compared our samples with a multi-year dataset from a single location in Germany21. Projecting seasonal phylotype composition into the MDS biplots of our pan-European samples did not reveal any preferential association of collection season with microbiome type (Fig. 2f). Comparing changes in the abundance of single phylotypes between seasons and between the two major microbiome types (Fig. 2g) similarly did not indicate that the latitudinal contrast reflects environmental variation caused by local seasonal differences (Wald test of multinomial frequency estimates, P > 0.01).
The association between latitude and phylotype abundance was phylotype specific, differing within and between bacterial families (Fig. 3a and Extended Data Fig. 3). Pseudomonas and Sphingomonas are abundant across A. thaliana populations21,22,23 and both genera can affect A. thaliana health21,24,25. Linear regression of each core phylotype onto latitude revealed that four of the five most abundant sphingomonads have latitudinal clines (Fig. 3a,b, FDR <0.01), while the most abundant pseudomonad phylotypes did not show long-distance variation (Fig. 3b–e). Rhizobiaceae were also latitudinally differentiated. A consequence of phylotype-specific association with latitude was that the two major microbiome types were significantly differentiated at the phylotype level, but not at higher taxonomic levels (Fig. 2e and Extended Data Fig. 3). Thus, even though A. thaliana is colonized by different individual phylotypes in Northern and Southern Europe, the bacterial classes remain broadly the same (Fig. 2e).
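Testing each phylotype separately against latitude produces one P value per phylotype, which then needs multiple-testing correction. The text reports an FDR threshold but does not name the procedure, so the sketch below assumes the widely used Benjamini–Hochberg adjustment; the P values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented per-phylotype p-values from regressions on latitude.
pvals = [0.0001, 0.004, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [q < 0.01 for q in qvals]
print(qvals, significant)
```

With these toy values, the first two phylotypes pass an FDR threshold of 0.01 and the other two do not.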
Common phylotypes differ in their geographic distributions
A single Pseudomonas phylotype, ATUE5 (previously OTU5), is a common opportunistic pathogen in local populations in south-west Germany, where it is an important driver of total microbial load21. Because ATUE5 was also the most abundant pseudomonad in our study, we wanted to learn how its distribution was geographically structured (Fig. 3c). ATUE5 was the seventh most common phyllosphere phylotype overall, with a relative abundance of up to 64% (mean of 1.8%). ATUE5 was found in 56% of samples, but without significant latitudinal differentiation (Pearson’s r = 0.01, P = 0.92).
Despite ATUE5 being a common phyllosphere member, its distribution was disjoint, and ordinary Kriging interpolation across the sampled range confirmed a very patchy presence (Fig. 3c). In contrast, the most frequent Sphingomonas phylotype (and most frequent phylotype overall) showed a significant latitudinal cline (Fig. 3b). High ATUE5 abundance was largely limited to single populations or populations very close to each other, with a spatial autocorrelation restricted to distances of under 50 km (Extended Data Fig. 6). In summary, the Pseudomonas pathogen ATUE5 is widely yet very unevenly distributed.
Drought metrics predict microbiome composition
Common garden experiments have indicated that environmental factors strongly shape bacterial microbiome composition17. Our continental-scale data enabled us to test which abiotic factors are most correlated with geographic structure of the phyllosphere microbiome.
We tested for associations between climate variables and microbiome composition, including developmental and health traits as potential confounders26. Altogether, we considered 39 covariates that could influence microbiome composition (Extended Data Fig. 7 and Extended Data Table 1). We first removed covariates that were highly correlated with others and then performed random forest classification using the two microbiome types as response variables (Fig. 4 and Extended Data Fig. 8). The covariate with greatest explanatory power was the Palmer Drought Severity Index (PDSI) mean from the six pre-collection months, a metric of recent dryness27. PDSI was similarly the best predictor for the loading of a sample on MDS1. In general, environmental covariates were better predictors than were plant traits. In contrast, environmental covariates (including PDSI) had poor predictive power for plant-associated phylotypes in the soil microbiome, explaining less than 1% of the variance in the loading on the first principal coordinate axis.
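Removing highly correlated covariates before random forest modelling can be sketched as a greedy filter. The correlation threshold (0.9), the order in which covariates are considered and the toy values are all assumptions made for illustration; the text does not state how the filtering was done.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(covariates, threshold=0.9):
    """Keep a covariate only if |r| with every kept covariate is below threshold."""
    kept = []
    for name, values in covariates.items():
        if all(abs(pearson(values, covariates[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Invented site-level values; "ppt" is an exact linear function of "pdsi".
covariates = {
    "pdsi": [-3, -1, 0, 2, 4],
    "ppt": [10, 30, 40, 60, 80],
    "ws": [3, 1, 4, 1, 5],
}
kept = drop_correlated(covariates)
print(kept)
```

A greedy filter of this kind depends on the order of the covariates, so in a real analysis the ordering (for example by prior interest) is itself a modelling choice.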
Because PDSI is correlated with latitude, we tested whether information about both variables improves prediction outcomes. Inclusion of PDSI significantly improved predictive capacity (P = 4.2 × 10^−7 for logistic regression with microbiome type and P = 2.7 × 10^−7 for linear regression on MDS1), indicating that the association between microbiome type and PDSI extends beyond latitudinal correlation. PDSI was also predictive for microbiome composition within geographic regions and their corresponding sampling tours (P = 2.3 × 10^−7 for logistic regression with cluster identity and P = 0.047 for linear regression on MDS1).
From mixed-effects modelling, we estimated the marginal R2 for PDSI to be 50%. Together with previous work supporting the importance of water availability in determining host-associated microbiomes9, we conclude that water availability affects which microbes can access the host plant and/or proliferate on the host. Drought might do so directly by affecting plant physiology, indirectly by shaping host genetics or by a combination of the two. Additionally, drought affects the abundances of microbes in the abiotic environment, and hence which microbes are present for colonization.
Host genetics is associated with microbiome composition
Arabidopsis thaliana exhibits strong population structure across Europe, with a pattern of isolation by distance28 and greater latitudinal than longitudinal differentiation1. Climate-driven selective pressures, particularly water availability and drought29, along with different groups of insect predators30 have contributed to the geographic structure of A. thaliana genetic diversity.
To determine whether this extends to the phyllosphere microbiome, we extracted heritability estimates for phyllosphere phylotypes from eight common garden experiments in which 200 A. thaliana accessions had been grown in four Swedish locations across 2 years8. Two thirds (368/575; 64%) of our core phylotypes had been observed in this study8. We were able to obtain heritability estimates for 251 of these phylotypes, almost all of which (247; 98.4%) had significant positive heritability in at least one of the eight experiments. Genetic differences are therefore very likely to contribute to the observed geographic differentiation of the A. thaliana phyllosphere microbiome across Europe. However, heritability does not necessarily imply direct host control of each phylotype, as it can also be exerted indirectly via microbial hub taxa8.
To determine how microbiome composition in our study might be influenced by host genetics, which was representative of previous surveys1 (Extended Data Fig. 4), we fitted a mixed-effects model that included relatedness as a random effect and the loading on the first axis of the decomposition of the microbiome composition as the phenotypic response variable. Plant genotype alone explains 68% of the variance in the loading along MDS1 and 52% of the variance in the MDS2 loading (pseudo h2 0.68, standard error of the mean (s.e.m.) 0.10 for MDS1 and pseudo h2 0.52, s.e.m. 0.12 for MDS2). MDS1 explains 8% and MDS2 5% of the variance in microbiome composition, consistent with host genetics probably playing only a subordinate role in structuring the microbiome8,17,31. In a mixed-effects model, PDSI was associated with MDS1, whereas several genetic principal components were associated with MDS2 (Extended Data Tables 2–4).
Because immune genes are prime targets for interactions with microbes32,33, we tested whether specific immune gene alleles are associated with the two microbiome types. Among a generous, though not exhaustive, list of 1,103 genes with connection to pathogen response and defense34, the top single-nucleotide polymorphism (SNP) was in ACD6 (empirical P = 0.0001) (Fig. 3f and Extended Data Fig. 5). ACD6 alleles can differentially impact pathogen resistance through constitutive effects on immunity35. The full ACD6 haplotypes associated with each microbiome type have not yet been reconstructed, as the short reads used for genotypic comparisons did not allow for resolution of full-length alleles. Nonetheless, our results demonstrate a striking association between microbiome type and polymorphisms in a central regulator of immune activation. Whether resident microbiota select for ACD6 allele type, or instead ACD6 allele type influences microbiome type, remains to be determined.
Are genetic alleles responsible for microbiome variation across geography? For defense genes such as R genes, this is probably not the case as variation tends to be maintained within local populations of A. thaliana36,37. We do not know whether this extends to genes that control the non-pathogenic microbiota. A previous study found ~150 SNPs to be significantly associated with heritable microbiome composition in A. thaliana31. When we tested the geographic differentiation of these SNPs across Europe (Extended Data Fig. 5), we found that they had significantly higher global Fst values than the genome-wide background, consistent with different A. thaliana populations selecting for different microbiota.
Host adaptation to drought influences microbial abundance
To disentangle the impact of drought from that of plant genetics, we conducted a common garden field experiment in California. Using a setup similar to our previous work in Europe29, we grew A. thaliana accessions (Extended Data Table 5) under a high- and low-watering regimen. Focusing on accessions that had previously been identified as drought adapted or susceptible based on genetic loci associated with adaptation to drought29, we assessed differences in phyllosphere composition after drought stress. Of the 575 core phylotypes in the European field collections, 154 were present in California and 20 were sufficiently common to enable us to determine the relative influences of genetics and drought treatment on their relative abundances (Extended Data Tables 2–4). Of these 20 phylotypes, 3 were significantly influenced by host genetic classification of drought-adapted versus susceptible accessions, and 3/20 showed a significant interaction between drought treatment and host genotype (Extended Data Table 6). Two out of 20 showed a significant response to the abiotic drought treatment alone. The phylotypes that were significantly associated with plant genotype in the California field experiment accounted for an appreciable fraction of the total microbiome in the European wild collections—an average of 13.2% of the total microbial community in a plant and as high as 71.9% total relative abundance in a plant (Extended Data Fig. 9). The most abundant phylotype across the European collection (Extended Data Fig. 9) was significantly associated with plant genotypic classification. In total, these results indicate that genetic adaptation to drought has an impact on some of the most abundant bacteria that colonize a plant.
Common phylotypes alter drought effects on A. thaliana
Finally, we tested whether water availability can influence the abundance of a common phylotype, the opportunistic pathogen ATUE5. In growth chambers, we exposed 5-week-old plants of the Col-0 reference accession to a week-long drought, followed by syringe inoculation with the ATUE5 p25.c2 strain21. Three days after infection, we compared bacterial growth and green tissue in drought-stressed and well-watered plants. Drought significantly reduced the ability of ATUE5 to proliferate in planta (Extended Data Fig. 10; two-sided Wilcoxon rank-sum test, P = 0.003), a result consistent with Pseudomonas pathogens relying on water availability to spread and multiply38. Drought also significantly reduced the green, photosynthetically active leaf area (Extended Data Fig. 10), with ATUE5 infection blunting this negative effect of drought.
These results indicate that infection by an opportunistic pathogen may be conditionally beneficial, conferring drought tolerance under specific conditions. ATUE5 was previously shown to influence A. thaliana growth in a genotype-specific manner39, indicating that the interaction between drought and ATUE5 infection is likely to differ between plant populations. This is reminiscent of viral infection reducing drought-based mortality40 and in agreement with plant growth promoting effects of microbes under drought41, as discussed in a recent review42 of the diverse mechanisms of microbe-mediated drought tolerance. Moreover, there is precedent for cryptic A. thaliana pathogens providing environment-specific fitness benefits43.
Discussion
Our results reveal several robust trends. Firstly, colonization of A. thaliana leaves imposes a strong bottleneck on the microbes that arrive from the surrounding soil and other plants, with most microbes differing in abundance between the soil and A. thaliana leaves and more than a quarter differing between A. thaliana and companion plants from the same family. Host genetics clearly matters for determining which microbes manage to establish in and on the plant. Our results indicate that these trends, observed before over small regions4,7,8, are reproducible and ubiquitous on a continental scale. Secondly, geography and associated abiotic factors significantly influence the microbes on A. thaliana: a plant in Spain will very probably be colonized by a different suite of microbes than a plant in Sweden. Our field experiment begins to disentangle the direct contribution of geography-dependent climate differences on the microbiome from those that are mediated by adaptive differences in host genetics. We note, however, that both genetic population structure and environmental variables exhibit autocorrelation, hence the variance explained by plant genotype is invariably confounded by correlated environmental factors, with the exact extent being difficult to discern. We identify genetic variation in an immunity gene, ACD6, to be associated with microbiome type and with PDSI. Specific alleles of ACD6 confer drought tolerance44, adding further complexity to our understanding of the relationship between drought, microbes and plant genetics. Lastly, our analyses suggest that microbial colonization of plants is strongly dictated by water availability and the attendant microbiota. This again raises the question of how different microbial communities influence plant phenotype. Drought not only plays a major selective role in A. thaliana populations29, but it is also known to affect the ability of plants to withstand pathogen attack. 
An important question will be whether different background microbiomes in plants that are more likely to experience drought in the wild will help or hamper defense against pathogens45.
Methods
Sample collection
Arabidopsis thaliana and other crucifers were sampled during local springtime in 2018. Most crucifer companion samples were Capsella bursa-pastoris, and the rest were Cardamine hirsuta. A full list of sampling locations and dates is provided in Extended Data Table 1. Rosettes were separated from the roots using scissors and forceps sterilized with alcohol wipes, then washed with water and ground with a sharp disposable spatula (Roth) in RNAlater (Sigma, now Thermo Fisher). For each A. thaliana plant for which soil was accessible, one to three tablespoons of soil were collected from the location where the plant had been removed and placed in a clean airtight bag. Samples were then maintained in electrical coolers (Severin Kühlbox KB2922) until the end of the sampling trip (trips lasted 1–12 days). In the lab, samples were stored at 4 °C. Within 0–3 days, RNAlater was removed from plant samples. Samples were centrifuged for 1 min at 1,000g, the supernatant was removed and samples were washed with 1 ml autoclaved water. For storage at −80 °C, plant tissue was transferred with ethanol-sterilized forceps to screw cap freezer tubes containing 1.0 mm Garnet Sharp Particles (BioSpec Products, Cat. No. 11079110GAR). A ~200 mg aliquot from each soil sample was transferred to a screw cap freezer tube using an ethanol-sterilized spatula, with great effort to exclude plant and insect pieces. Before aliquoting, soil bags were kept at −80 °C and defrosted at 4 °C overnight, unless aliquoting was done immediately upon arrival in the lab at the end of the sampling trip.
Nagoya Protocol Compliance
Respective national authorities of all sampled countries that are party to the Nagoya Protocol were contacted. Where needed, advised measures were taken and resulted in sampling and export permits: KC3M-160/11.04.2018 (Bulgaria), ABSCH-IRCC-FR-253846-1 (France) and ABSCH-IRCC-ES-259169-1 (Spain).
Plant phenotyping
Scores presented in Fig. 1 and Extended Data Fig. 1 are:
- Developmental state: vegetative (1), just bolting (2), flowering (3), mature (4) and drying (5)
- Herbivory index: no (1), weak (2), strong (3) and very strong (4) herbivory
- Rosette diameter: a 1 cm rosette diameter classification corresponds to any rosette diameter ≤1 cm.
DNA extraction
DNA was extracted from plant samples according to the protocol from ref. 21. Soil DNA was extracted using the Qiagen MagAttract PowerSoil DNA EP Kit (384) (cat. 27100-4-EP). On dry ice, soil samples were transferred from tubes to PowerBead DNA plates using sterile individual funnels. Plates were stored for up to 2 weeks at −80 °C until processing. The Qiagen protocol was adapted to a 96-well pipette (Integra Viaflo96). PowerBead solution and SL Solution were pre-warmed at 55–60 °C to avoid precipitation. RNase A was added to the PowerBead solution just before use. From step 17 of the protocol, instead of starting the epMotion protocol, the following steps were performed: to each well of the 2 ml deep-well plate containing a maximum of 850 µl of supernatant, 750 µl of Bead Solution was added and mixed with an Eppendorf MixMate at 650 rpm for 10–20 min. Plates were placed on a magnet for 5 min, the supernatant solution was discarded and the beads were washed three times with 500 µl wash solution. DNA was eluted from the beads with 100 µl elution buffer. The eluate was transferred to PCR plates and stored at −20 °C until library preparation.
Drought treatment with infection
Plants of the A. thaliana Col-0 reference accession were grown for 35 days at 23 °C under short day conditions (8 h light:16 h dark) with normal watering (approximately 1 l water per tray once soil moisture dropped below a reading of 3; XLUX Soil Moisture Meter). At 35 days, plants were randomized into new trays and watering treatments started. Soil moisture was measured every day. Control plants were watered normally once the soil moisture readings were between 2 and 3. Drought-stressed trays were dried down to an average soil moisture reading of 1, kept ≤1 for a full day, then maintained between a reading of 1 and 2 with minimal watering. The plants were exposed to these contrasting water conditions for seven days before infection. On day 7, control trays were watered normally (until soil moisture averaged a reading of 5–6 per tray) and drought trays were watered at 0.4× normal water per tray (reaching an average soil moisture reading of 2–3). After watering, two leaves per plant were syringe-infiltrated with either MgSO4 (control) or ATUE5 p25.c2 at an OD600 of 0.0002. Each treatment had approximately 96 plants, divided over four trays. Plants were photographed every other day, starting at 35 days after planting. Plant growth and health were estimated by measuring green pixel area per plant using plantCV46 (Supplementary Data Table 1). At 3 days after infection, hole punches were taken from two leaves per plant, ground, and resuspended and diluted in 10 mM MgSO4. Colonies were counted after 2–3 days of growth on selective lysogeny broth agar plates with 100 µg ml^−1 nitrofurantoin to select for Pseudomonas (Supplementary Data Table 2). No statistical methods were used to pre-determine sample sizes but sample sizes are similar to or greater than those reported in previous publications47.
Field experiment
Accessions
A total of 110 A. thaliana accessions were planted in a common garden experiment with water manipulation at a field site at the Carnegie Institution for Science (37.42857020996903° N, 122.17944689424299° W) in Stanford in the spring of 2023 (Extended Data Table 5). We selected two groups of accessions based on their predicted contrasts in ability to survive drought in two consecutive field experiments at two locations. Based on survival data under low watering in Spain29, polygenic scores were trained on 515 accessions following state-of-the-art methods48 using PLINK v2.00a2.3 (ref. 49). Computing polygenic scores with different sets of SNPs (varying the P value threshold of their association with survival from 10^−3 to 10^−9), we verified a broad overlap of accessions in the top 30 and bottom 30 of the rank distribution. We used a threshold of 0.001 to select the 30 top and 30 bottom accessions. In a second round of experiments in California, a pilot study for the current work, polygenic scores were trained on total fitness (survival and fruit production) under drought conditions in 245 accessions. Polygenic score analyses used the software GEMMA and the Bayesian Sparse Linear Mixed Model50. This approach used genome-wide SNP information and the estimated parameters of each SNP (probability of causal effect and effect size) to make polygenic score predictions. We again selected the 30 accessions with the highest and the 30 with the lowest polygenic scores. Finally, from the two polygenic score prediction rounds we identified 57 accessions with a high score in drought survival and 59 with a low score to conduct field experiments and microbiome analyses (3 and 1 accessions, respectively, did not have enough seeds for our experiment size). As there was some overlap in selected accessions from the first to the second year, only a total of 110 unique accessions were sown.
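The selection logic described above (take the highest- and lowest-scoring accessions in each round, then pool the rounds) can be sketched as follows. The accession names, scores and the group size of 2 are invented; the study selected 30 per tail, and the function name is hypothetical.

```python
def extremes(scores, n):
    """Return (high, low): the n highest- and n lowest-scoring accessions."""
    ranked = sorted(scores, key=scores.get)  # ascending by polygenic score
    return set(ranked[-n:]), set(ranked[:n])

# Invented polygenic scores for two prediction rounds.
round1 = {"A": 0.9, "B": 0.1, "C": 0.5, "D": -0.4, "E": 0.7}
round2 = {"A": 1.2, "C": -0.3, "F": 0.8, "G": -0.9, "H": 0.0}

high1, low1 = extremes(round1, n=2)
high2, low2 = extremes(round2, n=2)

# Pool both rounds; accessions picked twice are counted once.
high = high1 | high2
low = low1 | low2
unique = high | low
print(sorted(high), sorted(low), len(unique))
```

Because accession "A" is selected in both rounds, the pooled set contains fewer unique accessions than the sum of the per-round selections, mirroring why overlap between years reduced the number of unique accessions sown.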
Experiment
We planted seeds from selected accessions in 464 individual, randomized pots on 16 November 2022 in a common garden field site at the Carnegie Institution for Science. Five to ten seeds were planted in each pot within a 60-pot tray with Nutrient Ag Solutions PROMIX PGX Biofungicide Plug & Germination mix. The trays were gently watered for 2 weeks until germinants were established. We thinned each pot to have a single plant, before imposing a high and low precipitation treatment. For the well-watered treatment, the plants received an additional 144 min of rainfall every 2 days from December 2022 to May 2023 (about 600 additional mm for the entire growing season) on top of the natural rainfall at this location. The drought treatment consisted of only natural rainfall, which in California typically leads to water stress and visible mortality of A. thaliana plants.
Microbiome study
On 5 April 2023, we collected two true leaves from every plant that had not begun to senesce or decay (386 plants in total). All tools were sterilized between plant sampling. Tubes with tissue were immediately submerged in liquid nitrogen and transferred to a −80 °C freezer.
16S rDNA ASV identification
Oligonucleotide primers targeting the consensus V3–V4 ribosomal DNA (rDNA) region from 341 bp (5′-CCTACGGGAGGCAGCAG-3′) to 806 bp (5′-GGACTACNVGGGTWTCTAAT-3′) were used to amplify 16S rDNA sequences with the protocol described in ref. 21. Briefly, amplification was achieved with a two-step PCR protocol in which 100 µM peptide nucleic acid was used in the initial PCR to block amplification of chloroplasts. Amplicons were sequenced on the MiSeq (Illumina) platform using the MiSeq Reagent Kit v3 (600 cycle). Samples with lower coverage were preferentially sequenced to greater depth in subsequent runs, for a total of four MiSeq runs. Output from all runs was pooled for downstream analysis. Primer sequences were removed before analysis with a combination of usearch (version 11, ref. 51) and custom bash scripting. The 16S rDNA sequences were quality trimmed using DADA2 (ref. 13; version 1.10.1). The forward read was truncated at position 260 and the reverse read at position 210 due to decreased quality of the second read. Reads were truncated at the first position where the quality score was less than or equal to 2 (truncQ=2). Chimeras were removed with the removeBimeraDenovo function (method=‘consensus’) and ASVs called de novo using DADA2. The resulting reads were then aligned using AlignSeqs from the DECIPHER package52 (version 2.8.1). A phylogenetic tree of the de novo called ASVs was constructed using fasttreeMP53 (version 2.1.11). Taxonomic assignment of reads was performed with comparisons of 16S rDNA sequences to the Silva database54 (nr v132 training set).
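The two truncation rules described above can be illustrated with a simplified sketch. This is not DADA2 code: the function name is hypothetical, the order of operations (quality truncation first, then the fixed length) is an assumption, and the toy read uses a truncation length of 5 rather than 260 or 210.

```python
def truncate_read(seq, quals, trunc_q=2, trunc_len=260):
    """Trim one read; return (seq, quals) or None if it ends up too short."""
    # Cut at the first base whose quality score is <= trunc_q.
    for pos, q in enumerate(quals):
        if q <= trunc_q:
            seq, quals = seq[:pos], quals[:pos]
            break
    # Discard reads shorter than the fixed truncation length.
    if len(seq) < trunc_len:
        return None
    return seq[:trunc_len], quals[:trunc_len]

# Toy reads with invented quality scores.
good = truncate_read("ACGTACGT", [30, 30, 30, 30, 30, 30, 2, 30], trunc_len=5)
bad = truncate_read("ACGTACGT", [30, 30, 2, 30, 30, 30, 30, 30], trunc_len=5)
print(good, bad)
```

The first read keeps its first five bases, whereas the second is discarded because the low-quality base occurs before the required length is reached.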
Only samples with at least 1,000 reads after filtering for mitochondria and chloroplast reads were included. We began with 939 samples (including soil samples and neighbouring non-A. thaliana plants), in which we found 195,545 ASVs. A total of 918 samples had a sufficient number of reads (>1,000 reads) and after removing ASVs that were not found in any single sample with more than 50 reads, we were left with 10,566 ASVs. We identified a core set of 575 ASVs by filtering for those ASVs that were present in at least 5% of A. thaliana samples. The ASVs classified as belonging to the taxonomic class Cyanobacteria were removed from the dataset to eliminate possible misassignment of plant chloroplast DNA that can vary between plant genotypes and skew subsequent analyses.
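The filtering cascade described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study (which was run in R): the table layout (a dict mapping sample name to a dict of ASV counts) and the function names are hypothetical, organelle reads are assumed to have been removed already, and the core set is assumed to be computed on A. thaliana samples only.

```python
def filter_samples(table, min_reads=1000):
    """Keep samples whose total read count is at least `min_reads`."""
    return {s: counts for s, counts in table.items()
            if sum(counts.values()) >= min_reads}


def filter_asvs(table, min_count=50):
    """Keep ASVs that have more than `min_count` reads in at least one sample."""
    keep = {asv for counts in table.values()
            for asv, n in counts.items() if n > min_count}
    return {s: {asv: n for asv, n in counts.items() if asv in keep}
            for s, counts in table.items()}


def core_asvs(table, min_prevalence=0.05):
    """Return ASVs detected (count > 0) in at least `min_prevalence` of samples."""
    n_samples = len(table)
    n_present = {}
    for counts in table.values():
        for asv, n in counts.items():
            if n > 0:
                n_present[asv] = n_present.get(asv, 0) + 1
    return {asv for asv, k in n_present.items()
            if k / n_samples >= min_prevalence}
```

Applied in sequence (samples, then ASVs, then prevalence), these three steps mirror the order in which the thresholds are reported in the text.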
For the Californian field experiment, we sequenced the 16S rDNA amplicons as above and processed ASVs with the same pipeline used for the European wild samples. In the Californian ASV table, we identified ASVs present in 10% or more of the samples, and merged these ASV identifiers with those of the European collections to call the intersection of observed ASVs.
Climate variables
The majority of climate variables were obtained from Terraclimate12 using the data for 2018 (http://www.climatologylab.org/terraclimate.html), a dataset with approximately 4 km spatial resolution. For random forest modelling and climate associations, we calculated the average value of each climate metric over the 6 months preceding the date of collection. The following variables were included in the random forest modelling from the Terraclimate dataset: tmax, maximum temperature; tmin, minimum temperature; vp, vapour pressure; ppt, precipitation accumulation; srad, downward surface shortwave radiation; ws, wind speed; pet, reference evapotranspiration (ASCE Penman–Monteith); q, runoff; aet, actual evapotranspiration; def, climate water deficit; soil, soil moisture; swe, snow water equivalent; PDSI, Palmer Drought Severity Index; and vpd, vapour pressure deficit.
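As a rough illustration of the 6-month averaging step, the sketch below averages a monthly climate series over the six calendar months before the collection month. The window definition (whole calendar months, excluding the month of collection), the dict keyed by (year, month) and the function names are assumptions made for this example; the text does not specify how the window was aligned.

```python
from datetime import date


def preceding_months(collection_date, n_months=6):
    """Return (year, month) tuples for the n calendar months before the collection month."""
    year, month = collection_date.year, collection_date.month
    months = []
    for _ in range(n_months):
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
        months.append((year, month))
    return months


def mean_over_window(monthly_values, collection_date, n_months=6):
    """Average a monthly climate metric over the window before collection.

    `monthly_values` maps (year, month) to the metric value for that month.
    """
    window = preceding_months(collection_date, n_months)
    values = [monthly_values[m] for m in window]
    return sum(values) / len(values)
```

For a plant collected on 15 March 2018, the window under these assumptions would run from September 2017 to February 2018.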
We further analysed associations with Koeppen–Geiger climatic zones55,56, which were inferred in R using the package kgc and the regional classifications from ref. 57. Initial assessments of the density of microbes throughout Europe were calculated via ordinary Kriging using the R package automap58 (version 1.0-14). Four models were tested during variogram fitting, namely ‘Sph’, ‘Exp’, ‘Gau’ and ‘Ste’. Interpolation was performed either on the untransformed abundance data or on log10-transformed values, with 0.0001 added so that zero counts could be included. Global information on the major vegetation types was obtained using the Globcover 2009 map (released December 2010) from the European Space Agency (http://due.esrin.esa.int/page_globcover.php). Measures of soil properties were obtained using the International Soil Reference and Information Centre (ISRIC, global gridded soil information) Soil Grids (https://soilgrids.org/#!/?layer=geonode:taxnwrb_250m).
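The log10 transformation with a small offset, used before interpolation, can be written as follows. This is a minimal sketch; the function name is hypothetical and the input is assumed to be a list of non-negative abundances.

```python
import math


def log10_with_offset(abundances, offset=1e-4):
    """Log10-transform abundances after adding a small offset so zeros are defined."""
    return [math.log10(a + offset) for a in abundances]
```

With the offset of 0.0001, a zero abundance maps to approximately −4 rather than being undefined.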
At the time of collection we took several measurements of the soil and air temperature and humidity (Soil temp, Air temp, Soil hum and Air hum), the surrounding plant community and the location type: distance between the focal and the closest neighbouring A. thaliana plant (Ath.Ath), distance between the focal and the closest other plant (Ath.other), immediate plant density (Ground cover), visible H. arabidopsidis infection on focal plant (Hpa plant) or at site (Hpa site), visible Albugo spp. infection on focal plant (Albugo tour), fraction of herbal plants in the surrounding (Strata herbs), and estimated sun exposure (Sun), slope (Slope) and ground humidity (Humidity ground). Measurements are listed and detailed in Extended Data Table 1.
Feature selection and random forest modelling
Features of interest were first identified by feature selection in the R package caret59 (version 6.0-86) using repeated cross-validation (three repeats). Prediction variables were preprocessed by centring, scaling and nearest-neighbour imputation for samples that lacked data for a variable. A training set was generated with 75% of the data. Random forest regression was performed to minimize the root mean squared error with repeated cross-validation. Variable importance was assessed via generalized cross-validation in the package caret59.
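Two of the preprocessing steps named above, centring and scaling of a predictor and the 75% training split, are sketched below. This is not the caret implementation: the split here is a simple seeded random partition (caret's own partitioning may stratify on the outcome), nearest-neighbour imputation is omitted, and the function names are hypothetical.

```python
import random
import statistics


def centre_and_scale(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def train_test_split(n_samples, train_fraction=0.75, seed=0):
    """Randomly assign sample indices to a training and a held-out set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return sorted(indices[:cut]), sorted(indices[cut:])
```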
ASV differential abundance analysis
Differential abundance of ASVs in the soil versus A. thaliana, and A. thaliana versus other Brassicaceae was assessed using the edgeR18 package in R (version 3.28.1). We estimated a common negative binomial dispersion parameter, and abundance-dispersion trends by Cox–Reid approximated profile likelihoods60. We then fit a quasi-likelihood negative binomial generalized log-linear model to the count data. We tested for differential abundance by a likelihood ratio test.
Phylotype classification and regression
Phylotypic clusters were identified by k-means clustering of Hellinger-transformed ASV count matrices. The optimal number of clusters was determined through both partitioning around medoids61 using the pamk function in the R package fpc62 (version 2.2.9) and through silhouette analysis19 in the cluster (version 2.1.2) package in R63.
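For readers unfamiliar with the Hellinger transformation, it replaces each count with the square root of its relative abundance within the sample. A minimal sketch for a single sample follows; the function name and the handling of empty samples are assumptions.

```python
import math


def hellinger(counts):
    """Hellinger-transform one sample: square root of each ASV's relative abundance."""
    total = sum(counts)
    if total == 0:
        # An empty sample has no defined proportions; return zeros by convention.
        return [0.0 for _ in counts]
    return [math.sqrt(c / total) for c in counts]
```

Because the squared values of a transformed sample sum to 1, Euclidean distances (as used by k-means) on transformed data are less dominated by highly abundant ASVs.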
To determine the relative effect sizes of drought, latitude and plant identity on MDS loadings, phenotypes were modelled using restricted maximum likelihood with the lmekin function (coxme package) in R with kinship as a random effect64. The kinship matrix was constructed using several methods including the R package gaston64 as well as the centred kinship matrix in gemma (version 0.98.3)65. The different methods yielded unstable estimates of kinship, probably due to the low coverage of the plant genomes. To account for the low coverage, we employed a method designed for kinship estimation in low coverage data, SEEKIN66 using the homogeneous parameter. Mixed-effects modelling with a kinship matrix was computed both with lmekin67 and with GEMMA. The data distribution was assumed to be normal but this was not formally tested. The proportion of phenotypic variance explained by the environmental covariates was estimated with the function ‘r.squaredLR’ from the package MuMIn (version 1.43.1) and the pseudo-heritability was estimated using the kinship matrix and lmekin as well as in GEMMA (-gk = 1, maf = 0.1). In the paper we report the lower estimate for pseudo-heritability as estimated in GEMMA with the centred kinship matrix also estimated in GEMMA.
To test for the relative effects of genotype, latitude and PDSI in a single model, we estimated the first five principal components of the plant genotype relatedness matrix68 and included the eigenvectors as covariates in our models for microbiome type and the loading on MDS1 and MDS2 (Fig. 2). The data distribution was assumed to be normal but this was not formally tested. Regressions used the lm and glm functions (logit link) in the R stats package. The relative importance of PDSI and Tour ID was tested with the mixed models glmer(cluster identity ~ PDSI + (1|Tour_ID), family = ‘binomial’) or lmer(MDS1 ~ PDSI + (1|Tour_ID)).
Plant polymorphism calling and filtering
Raw reads were mapped to the TAIR10 reference genome of A. thaliana with bwa-mem (bwa 0.7.15)69. SNP calling was performed using GATK (version 3.5) HaplotypeCaller using recommended best practices70 with some modifications. Filtering for individuals with greater than 25% missing data (across all the SNPs) and bi-allelic SNPs with greater than 25% missing data (across all the individuals) resulted in a final set of 527 individuals with 409,850 bi-allelic SNPs for further analysis.
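The two missing-data filters can be illustrated with a small sketch. The order of the filters (individuals first, then SNPs), the genotype encoding (None for a missing call) and the function name are assumptions made for the example; the study applied these filters with its own tooling on the GATK call set.

```python
def filter_missing(genotypes, max_missing=0.25):
    """Drop individuals, then SNPs, whose missing-call fraction exceeds `max_missing`.

    `genotypes` maps individual name to a list of calls (None = missing),
    with one entry per SNP in a fixed order.
    Returns the filtered genotypes and the indices of the retained SNPs.
    """
    # Step 1: remove individuals with too many missing calls across SNPs.
    kept = {ind: calls for ind, calls in genotypes.items()
            if sum(c is None for c in calls) / len(calls) <= max_missing}
    if not kept:
        return {}, []
    n_snps = len(next(iter(kept.values())))
    # Step 2: remove SNPs with too many missing calls across the remaining individuals.
    snp_idx = [j for j in range(n_snps)
               if sum(calls[j] is None for calls in kept.values()) / len(kept)
               <= max_missing]
    filtered = {ind: [calls[j] for j in snp_idx] for ind, calls in kept.items()}
    return filtered, snp_idx
```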
Population structure analysis of A. thaliana
Wright’s fixation index (Fst) was calculated using the method of Weir and Cockerham71. The 1001 Genomes1 dataset (without individuals from North America) was merged with the dataset from this study to perform principal component analysis. Genotypes from this study were projected into the principal component space of the 1001 Genomes genotypes using the SmartPCA tool of EIGENSOFT (version 6)72.
Heritability comparisons
For comparison of ASV distributions and heritability estimates, we identified related OTUs from four microbiome common garden experiments in Sweden8. OTU sequences from ref. 8 were downloaded from https://forgemia.inra.fr/bbrachi/microbiota_paper, as were heritability estimates for the OTUs. Correspondence between Swedish OTUs (called from sequenced V5–V7 region of 16S rDNA) and the ASVs in our study (identified from sequenced V3–V4 regions of the 16S rDNA locus) was established using the Qiime273 fragment insertion method using sepp-refs-gg-13-8 as the reference database. Correspondence between the OTUs and ASVs was established with divergence of less than 1% on the Green Genes tree.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The V3–V4 16S rDNA sequence data and metagenomic sequencing data of plants were deposited in the European Nucleotide Archive (ENA) under the Primary Accession ENA: PRJEB44379. Metadata and processed read data sets including phyloseq objects are available at Zenodo via https://doi.org/10.5281/zenodo.11187761 (ref. 74).
Code availability
Scripts for data processing, analyses and figure generation can be accessed at GitHub via https://github.com/tkarasov/pathodopsis.
References
1001 Genomes Consortium. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell 166, 481–491 (2016).
Hancock, A. M. et al. Adaptation to climate across the Arabidopsis thaliana genome. Science 334, 83–86 (2011).
Bartoli, C. et al. In situ relationships between microbiota and potential pathobiota in Arabidopsis thaliana. ISME J. 12, 2024–2038 (2018).
Thiergart, T. et al. Root microbiota assembly and adaptive differentiation among European Arabidopsis populations. Nat. Ecol. Evol. 4, 122–131 (2020).
Wagner, M. R. et al. Host genotype and age shape the leaf and root microbiomes of a wild perennial plant. Nat. Commun. 7, 12151 (2016).
Bodenhausen, N., Horton, M. W. & Bergelson, J. Bacterial communities associated with the leaves and the roots of Arabidopsis thaliana. PLoS One 8, e56329 (2013).
Mittelstrass, J., Sperone, F. G. & Horton, M. W. Using transects to disentangle the environmental drivers of plant-microbiome assembly. Plant Cell Environ. 44, 3515–3525 (2021).
Brachi, B. et al. Plant genetic effects on microbial hubs impact host fitness in repeated field trials. Proc. Natl Acad. Sci. USA 119, e2201285119 (2022).
Delgado-Baquerizo, M. et al. A global atlas of the dominant bacteria found in soil. Science 359, 320–325 (2018).
Roux, F., Frachon, L. & Bartoli, C. The genetic architecture of adaptation to leaf and root bacterial microbiota in Arabidopsis thaliana. Mol. Biol. Evol. 40, msad093 (2023).
Wagner, M. R. et al. Natural soil microbes alter flowering phenology and the intensity of selection on flowering time in a wild Arabidopsis relative. Ecol. Lett. 17, 717–726 (2014).
Abatzoglou, J. T., Dobrowski, S. Z., Parks, S. A. & Hegewisch, K. C. TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958–2015. Sci. Data 5, 170191 (2018).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583 (2016).
Karasov, T. L., Neumann, M. & Duque-Jaramillo, A. The relationship between microbial biomass and disease in the Arabidopsis thaliana phyllosphere. Preprint at bioRxiv https://doi.org/10.1101/828814 (2019).
Lundberg, D. S. et al. Defining the core Arabidopsis thaliana root microbiome. Nature 488, 86–90 (2012).
Bonito, G. et al. Plant host and soil origin influence fungal and bacterial assemblages in the roots of woody plants. Mol. Ecol. 23, 3356–3370 (2014).
Horton, M. W. et al. Genome-wide association study of Arabidopsis thaliana leaf microbial community. Nat. Commun. 5, 5320 (2014).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Rousseeuw, P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Beilsmith, K., Perisin, M. & Bergelson, J. Natural bacterial assemblages in Arabidopsis thaliana tissues become more distinguishable and diverse during host development. mBio 12, e02723–20 (2021).
Karasov, T. L. et al. Arabidopsis thaliana and Pseudomonas pathogens exhibit stable associations over evolutionary timescales. Cell Host Microbe 24, 168–179.e4 (2018).
Regalado, J. et al. Combining whole-genome shotgun sequencing and rRNA gene amplicon analyses to improve detection of microbe–microbe interaction networks in plant leaves. ISME J. 14, 2116–2130 (2020).
Lundberg, D. S. et al. Contrasting patterns of microbial dominance in the Arabidopsis thaliana phyllosphere. Proc. Natl Acad. Sci. USA 119, e2211881119 (2021).
Innerebner, G., Knief, C. & Vorholt, J. A. Protection of Arabidopsis thaliana against leaf-pathogenic Pseudomonas syringae by Sphingomonas strains in a controlled model system. Appl. Environ. Microbiol. 77, 3202–3210 (2011).
Shalev, O. et al. Commensal Pseudomonas strains facilitate protective response against pathogens in the host plant. Nat. Ecol. Evol. 6, 383–396 (2022).
McMullan, M. et al. Evidence for suppression of immunity as a driver for genomic introgressions and host range expansion in races of Albugo candida, a generalist parasite. eLife 4, e04550 (2015).
Palmer, W. C. Meteorological Drought (US Department of Commerce Weather Bureau, 1965).
Platt, A. et al. The scale of population structure in Arabidopsis thaliana. PLoS Genet. 6, e1000843 (2010).
Exposito-Alonso, M. et al. Natural selection on the Arabidopsis thaliana genome in present and future climates. Nature 573, 126–129 (2019).
Züst, T. et al. Natural enemies drive geographic variation in plant defenses. Science 338, 116–119 (2012).
Bergelson, J., Mittelstrass, J. & Horton, M. W. Characterizing both bacteria and fungi improves understanding of the Arabidopsis root microbiome. Sci. Rep. 9, 24 (2019).
Teixeira, P. J. P., Colaianni, N. R., Fitzpatrick, C. R. & Dangl, J. L. Beyond pathogens: microbiota interactions with the plant immune system. Curr. Opin. Microbiol. 49, 7–17 (2019).
Ma, K.-W. et al. Coordination of microbe–host homeostasis by crosstalk with plant innate immunity. Nat. Plants 7, 814–825 (2021).
Glander, S. et al. Assortment of flowering time and immunity alleles in natural Arabidopsis thaliana populations suggests immunity and vegetative lifespan strategies coevolve. Genome Biol. Evol. 10, 2278–2291 (2018).
Todesco, M. et al. Natural allelic variation underlying a major fitness trade-off in Arabidopsis thaliana. Nature 465, 632–636 (2010).
Bakker, E. G., Toomajian, C., Kreitman, M. & Bergelson, J. A genome-wide survey of R gene polymorphisms in Arabidopsis. Plant Cell 18, 1803–1818 (2006).
Karasov, T. L. et al. The long-term maintenance of a resistance polymorphism through diffuse interactions. Nature 512, 436–440 (2014).
Aung, K., Jiang, Y. & He, S. Y. The role of water in plant–microbe interactions. Plant J. 93, 771–780 (2018).
Duque-Jaramillo, A. et al. The genetic and physiological basis of Arabidopsis thaliana tolerance to Pseudomonas viridiflava. New Phytol. 240, 1961–1975 (2023).
González, R. et al. Plant virus evolution under strong drought conditions results in a transition from parasitism to mutualism. Proc. Natl Acad. Sci. USA 118, e2020990118 (2021).
Ma, Y., Dias, M. C. & Freitas, H. Drought and salinity stress responses and microbe-induced tolerance in plants. Front. Plant Sci. 11, 591911 (2020).
Shaffique, S. et al. Research progress in the field of microbial mitigation of drought stress in plants. Front. Plant Sci. 13, 870626 (2022).
Hiruma, K. et al. Root endophyte Colletotrichum tofieldiae confers plant fitness benefits that are phosphate status dependent. Cell 165, 464–474 (2016).
Okuma, E., Nozawa, R., Murata, Y. & Miura, K. Accumulation of endogenous salicylic acid confers drought tolerance to Arabidopsis. Plant Signal. Behav. 9, e28085 (2014).
Colaianni, N. R. et al. A complex immune response to flagellin epitope variation in commensal communities. Cell Host Microbe 29, 635–649.e9 (2021).
Berry, J. C., Fahlgren, N., Pokorny, A. A., Bart, R. S. & Veley, K. M. An automated, high-throughput method for standardizing image color profiles to improve image-based plant phenotyping. PeerJ 6, e5727 (2018).
Goel, A. K. et al. The Pseudomonas syringae type III effector HopAM1 enhances virulence on water-stressed plants. Mol. Plant. Microbe Interact. 21, 361–370 (2008).
Choi, S. W., Mak, T. S.-H. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
Zhou, X., Carbonetto, P. & Stephens, M. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet. 9, e1003264 (2013).
Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
Wright, E. S. Using DECIPHER v2.0 to analyze big biological sequence data in R. R J. 8, 352–359 (2016).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013).
Köppen, W. Das geographische System der Klimate. Handbuch der Klimatologie I, 1–44 (1936).
Köppen, W. Versuch einer Klassifikation der Klimate, vorzugsweise nach ihren Beziehungen zur Pflanzenwelt. Geogr. Z. 6, 593–611 (1900).
Rubel, F. & Kottek, M. Observed and projected climate shifts 1901–2100 depicted by world maps of the Köppen-Geiger climate classification. Meteorol. Z. 19, 135–141 (2010).
Hiemstra, P. automap: automatic interpolation package. R package version 1.0-14. https://cran.r-project.org/web/packages/automap/automap.pdf (2013).
Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 28, 1–26 (2008).
Cox, D. R. & Reid, N. Parameter orthogonality and approximate conditional inference. J. R. Stat. Soc. Ser. B 49, 1–39 (1987).
Kaufman, L. & Rousseeuw, P. J. in Finding Groups in Data: An Introduction to Cluster Analysis 344, 68–125 (Wiley, 1990).
Hennig, C. fpc: flexible procedures for clustering. R package version 2.2-12. CRAN https://CRAN.R-project.org/package=fpc (2024).
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M. & Hornik, K. cluster: cluster analysis basics and extensions. R package version 2.1.5. (CRAN, 2023).
Perdry, H. & Dandine-Roulland, C. gaston: genetic data handling (QC, GRM, LD, PCA) and linear mixed models version 1. CRAN https://cran.r-project.org/web/packages/gaston/gaston.pdf (2023).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Dou, J. et al. Estimation of kinship coefficient in structured and admixed populations using sparse sequencing data. PLoS Genet. 13, e1007021 (2017).
Therneau, T. M. & Therneau, M. T. M. coxme: mixed effects cox models. CRAN https://cran.r-project.org/web/packages/coxme/index.html (2015).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Van der Auwera, G. A. et al. From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).
Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Bolyen, E. et al. Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nat. Biotechnol. 37, 852–857 (2019).
Continental-scale associations of Arabidopsis thaliana phyllosphere members with host genotype and drought. Zenodo https://doi.org/10.5281/zenodo.11187761 (2024).
Schliep, K. P. phangorn: phylogenetic analysis in R. Bioinformatics 27, 592–593 (2011).
Acknowledgements
We thank J. Keck, T. Hagmaier, A. Rütten, T. Vaupel, K. Poersch, N. Vasilenko, H. Vo-Gia, J. Elis, C. Tahtsidou, T. Schlegel and F. Vogt for aliquoting soil. We thank F. Roux, H. Burbano, A. Duque and M. Collenberg for comments on the paper. We thank M. Horton for providing global SNP Fst values. We thank H. Burbano, M. Horton and B. Brachi for discussion. This work was funded by an HFSPO Long-term Fellowship LT000348/2016-L, EMBO LRTF 1483-2015 and NIH grant R35 GM150722-01 (T.L.K.), ERC-SyG PATHOCOM (J.B. and D.W.) and the Max Planck Society (D.W.).
Funding
Open access funding provided by Max Planck Society.
Author information
Contributions
T.L.K., R.S., G.S., J.B., M.E.-A. and D.W. devised the study. T.L.K., R.S., G.S., M.N., L.L., E.S. and the PATHODOPSIS collection team collected and prepared the samples. T.L.K., G.S. and M.N. processed the samples. T.L.K., R.S. and G.S. analysed the data. G.M. provided climate data. A.H. performed infection experiments. T.L.K., D.W. and R.S. wrote the paper.
Ethics declarations
Competing interests
D.W. holds equity in Computomics, which advises plant breeders. D.W. consults for KWS SE, a plant breeder and seed producer. The other authors declare no competing interests.
Peer review
Peer review information
Nature Microbiology thanks Maggie Wagner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Distribution of sampled A. thaliana plants with various developmental and health states.
Arbitrary scales (see Methods) except for rosette size (cm).
Extended Data Fig. 2 Differential abundance of phylotypes in soil, A. thaliana phyllospheres, and phyllospheres of other Brassicaceae.
a, 91% of phylotypes were differentially abundant between A. thaliana and soil. b, 36% of phylotypes were differentially abundant between A. thaliana and other Brassicaceae. Bold points indicate significance with an FDR ≤ 0.01. c, Within-site correlation of phylotype abundance. Correlation coefficients (scale on top left) were calculated for the co-occurrence of a phylotype within a site between the two A. thaliana plants collected at the site, A. thaliana x A. thaliana (third ring from the outside), other Brassicaceae x A. thaliana (second ring from the outside), and soil x A. thaliana (outermost ring). The central tree in the Circos plot represents the maximum likelihood tree of phylotypes, plotted without inferred branch lengths75.
Extended Data Fig. 3 Contrasts in phylotype abundances between Southern and Northern microbiome clusters.
a, Silhouette scores for membership assignment to either of the two main microbiome types, cluster 1 (Southern) and cluster 2 (Northern). For each cluster, number of individuals and average distance between a sample and members of the other cluster is indicated. b, Differential abundance of phylotypes between Southern and Northern microbiome clusters. y-axis shows the log2(Fold Change) for the relative abundance difference of a phylotype between clusters. Bold points indicate significance with an FDR ≤ 0.01.
Extended Data Fig. 4 Projection of A. thaliana genotypes from this study into genotypic PC space from the 1001 Genomes Project.
Individuals from this study (‘Pathodopsis’) align well with the broader 1001 Genomes (https://1001genomes.org) collection of primarily Eurasian accessions.
Extended Data Fig. 5 Fst around ACD6 and globally.
a, Weir and Cockerham’s fixation index Fst was estimated for SNPs in a list of known immune-associated genes. The genome-wide most extreme Fst values are concentrated in a region on chromosome 4 that includes the immune regulator ACD6 (At4g14400). Reference genome positions (in nt) on chromosome 4 are given at the bottom. b, Bergelson and colleagues32 identified A. thaliana SNPs associated with (and likely to influence) microbiome composition. We compared the geographic differentiation of these SNPs (Fst) to the genome-wide distribution. Microbiome-associated SNPs exhibit significantly higher Fst values than the remainder of the genomic SNPs (Wilcoxon rank-sum test, p < 2.2 × 10^−16). The central horizontal line in each box indicates the median, the bounds indicate the upper and lower quartiles and the whiskers indicate 1.5 × the interquartile range.
Extended Data Fig. 6 Distance-Semivariance plot for ATUE5.
Relationship between the geographic distance between two sampled A. thaliana plants, and the correlation of the abundance of ATUE5 between these two plants.
Extended Data Fig. 7 Correlogram of relationship between environmental and developmental covariates used in random forest modeling.
Covariates are detailed in Methods.
Extended Data Fig. 8 Biplots of the correlation of environmental and physiological variables on the MDS axes in Fig. 2.
a, Environmental variables derived from Terraclimate. b, Environmental variables measured at time of collection. Correlations were assessed with the envfit() function in vegan, and vector length corresponds to strength of correlation. Long-term climate variables (a) are better predictors of microbiome composition than are more temporary weather and physiological variables measured at the time of collection (b).
Extended Data Fig. 9 Relative abundance of phylotypes is significantly associated with plant genotype classification.
Four out of 20 phylotypes that were shared between the Eurasian collections and California field experiment were significantly associated with plant genotype. a, Histograms for the relative abundance of each of the four significant phylotypes across all plants collected in Eurasia. b, Histogram of the total relative abundance per plant of the sum of all four phylotypes (mean = 13.2% RA).
Extended Data Fig. 10 Impact of a common phylotype on plant growth as a function of drought status.
Arabidopsis thaliana plants were exposed to combinations of drought and infection with ATUE5 strain p25.c2. a, The change in plant leaf area from day 0 to day 10, calculated from daily images by extracting green pixels. Infection with ATUE5 reduced the negative effect of drought on plant growth (ANOVA, p = 0.0063). b, Measured colony forming units (cfu) on day 3 post infection. 5/41 (12%) drought-treated plants had established infection on day 3, whereas 17/23 (42%) of control plants had established infection at this same timepoint (Wilcoxon rank-sum p = 0.0027).
Supplementary information
Supplementary Data
Supplementary Data Tables 1 and 2 in .xlsx format with tabs.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Karasov, T.L., Neumann, M., Leventhal, L. et al. Continental-scale associations of Arabidopsis thaliana phyllosphere members with host genotype and drought. Nat Microbiol 9, 2748–2758 (2024). https://doi.org/10.1038/s41564-024-01773-z
DOI: https://doi.org/10.1038/s41564-024-01773-z