Middle-Eastern plant communities tolerate 9 years of drought in a multi-site climate manipulation experiment

For evaluating climate change impacts on biodiversity, extensive experiments are urgently needed to complement popular non-mechanistic models which map future ecosystem properties onto their current climatic niche. Here, we experimentally test the main prediction of these models by means of a novel multi-site approach. We implement rainfall manipulations—irrigation and drought—to dryland plant communities situated along a steep climatic gradient in a global biodiversity hotspot containing many wild progenitors of crops. Despite the large extent of our study, spanning nine plant generations and many species, very few differences between treatments were observed in the vegetation response variables: biomass, species composition, species richness and density. The lack of a clear drought effect challenges studies classifying dryland ecosystems as most vulnerable to global change. We attribute this resistance to the tremendous temporal and spatial heterogeneity under which the plants have evolved, concluding that this should be accounted for when predicting future biodiversity change.


Supplementary Table 1. Summary of statistical results (F-values) for soil moisture (vol. %) and temperature (°C) measurements (Fig. 3). Linear mixed models for each parameter and site (Semi-Arid or Mediterranean) included treatment, (micro)habitat (under shrubs vs. open), year and their interactions as fixed effects; sensor was included as a random effect.
Unfortunately, theft, vandalism and temporary malfunctioning of the equipment led to occasional lack of data. Therefore, we analyzed the data for each site separately by averaging per day, month, and growing season. Only days with recordings for more than 21 hours, months with more than 15 days and years with four or more months of data were included in the analysis (94% of all months for temperature and 93% for soil moisture). The final data
Although not directly comparable, this analysis has less explanatory power than the main analysis (Supplementary Table 2
Table residue (comparison with other rainfall manipulation experiments): 9 treatments (3x3 combination of ambient, -25%, -50% rainfall; weekly, biweekly, or monthly shots); reduced by only 1 treatment (-50% drought), no effect of frequency; reduced by only 1 treatment (-50% drought), no effect of frequency; reduced by only 1 treatment (-50% rain), no effect of frequency; annuals
1 True statistical replicates per treatment.
2 Number of replicates multiplied by manipulated years.
3 Number of sampling units observed per treatment during the entire study period (replicates x manipulated years x number of subsamples). This number represents the total sampling effort.
4 Includes reports on species number (richness) or diversity indices (e.g. Shannon, Simpson).
5 Only composition analyses at multi-species level were considered (multivariate analyses), not reports on functional types.
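The inclusion thresholds for the sensor recordings (days with more than 21 hours, months with more than 15 days, years with four or more months) can be illustrated with a minimal sketch. It is not the authors' code. It assumes hourly readings from a single sensor, so that "more than 21 hours" is read as more than 21 distinct hourly records, and it assumes that a growing season is labelled by the calendar year in which it starts, with a hypothetical start month of September. The names `aggregate`, `season_of` and `SEASON_START_MONTH` are introduced here for illustration only.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumption (not from the source): a growing season is labelled by the
# calendar year in which it starts, and it starts in September.
SEASON_START_MONTH = 9


def season_of(year, month):
    """Map a calendar (year, month) to a growing-season label."""
    return year if month >= SEASON_START_MONTH else year - 1


def aggregate(readings):
    """Average hourly readings per day, month and growing season.

    readings: iterable of (datetime, value) pairs from one sensor.
    Returns (daily, monthly, seasonal) dicts that keep only units
    passing the completeness thresholds stated in the text.
    """
    hours = defaultdict(dict)  # date -> {hour: value}
    for ts, value in readings:
        hours[ts.date()][ts.hour] = value

    # Keep days with recordings for more than 21 hours.
    daily = {d: mean(h.values()) for d, h in hours.items() if len(h) > 21}

    by_month = defaultdict(list)
    for d, v in daily.items():
        by_month[(d.year, d.month)].append(v)
    # Keep months with more than 15 valid days.
    monthly = {m: mean(v) for m, v in by_month.items() if len(v) > 15}

    by_season = defaultdict(list)
    for (y, m), v in monthly.items():
        by_season[season_of(y, m)].append(v)
    # Keep seasons ("years") with four or more valid months.
    seasonal = {s: mean(v) for s, v in by_season.items() if len(v) >= 4}
    return daily, monthly, seasonal
```

Applying the filters bottom-up (days, then months, then seasons) means that a month is judged only on days that already passed the hourly threshold, which is one plausible reading of the rule described above.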

Detailed site description
Our experiments were conducted at four research sites along the steep climatic gradient that runs from the Negev desert to the upper Galilee in Israel (Fig. 1; Fig. 2).
The study region is characterized by particularly high plant species richness, most of which is accounted for by annual plants.
The 80% of all species. Between 50% (ungrazed areas) and 90% (grazed areas) of the aboveground net primary productivity (ANPP) can be accounted for by herbaceous plants, though shrubs and small trees may make up a large fraction of the standing biomass.

Detailed description of rationale and procedure of climate manipulations
The intensity and direction of our treatments were inspired by regional climate scenarios that were developed through statistical and dynamic downscaling of global models 23,24. Downscaling is particularly important for the study region because rainfall varies considerably across very small distances (Fig. 1a, Fig. 2). At the onset of the study, we depended on low-resolution global circulation models (GCMs 25). Although these models suggested that a decrease in mean annual precipitation was more likely, their projections were highly uncertain, ranging from a 30% decrease to a 30% increase. To cover this range of possibilities, we applied both drought and irrigation. The most recent ensembles of downscaled climate scenarios, which were produced alongside our study, predict increasing aridification throughout most of the study region, including a decrease in annual precipitation with regional variation and an increase in temperatures 23,24.
The average predicted decrease in rainfall is roughly 20%, with large geographic variation (approx. 10-30%). Therefore, the drought treatment is the most relevant for predicting climate change response in our study systems. Since temperatures were also increased by that treatment (Fig. 3, Supplementary Table 1), it mimicked the predicted change even more realistically, making our findings particularly robust.
Details about the rationale of the drought treatment can be found in the methods section of the main text. The method of using permanent strips (Supplementary Fig. 1) instead of a closed roof to exclude rainfall was chosen mainly because it has been shown that the strips do not produce unwanted side effects, such as shading, reduction of temperature or reduction of wind speed, all of which would counteract the intended treatment effect 26. To confirm this assumption in our study as well, we measured soil moisture and temperatures as described in the methods in the main text. The findings from our on-site environmental recordings (Fig. 3, Supplementary Table 1) confirm the effectiveness of both the drought and the irrigation treatments, i.e. they had the desired effect on water availability and temperature.
The general rationale of the irrigation treatment (Supplementary Fig. 1) was similar to that of most previous irrigation manipulation experiments (e.g. 17): the amount added was based on the long-term average rainfall, so that irrigation would lead to a detectable increase in water availability compared to the long-term average, irrespective of precipitation variation among years. We therefore supplemented an amount consistent with the 'positive' scenarios suggested by our climatologist colleagues and within the range of the low-resolution GCMs available at the onset of the study; i.e. we added 30% of the long-term average rainfall to the irrigated plots.
This resulted in an additional 164 mm of rainfall at the Mediterranean site and 90 mm at the semi-arid site. We carefully checked the rainfall distribution at the sites in past years to identify a protocol that fulfilled the following criteria: 1) irrigation should be spread across the entire season so as to run largely parallel to the drought treatment; 2) irrigation should take place immediately after significant rain events and under cloudy skies, so as not to alter rain frequency and distribution and to minimize evaporation loss; 3) irrigation should not extend the growing and monitoring season, i.e. the last irrigation should be applied no later than mid-April. To meet these criteria, we supplemented 10 mm of rain after each major rainstorm that exceeded 5 mm; the final supplementary irrigation usually occurred in mid-March, except in two seasons (2007/08 and 2008/09), when irrigation ended in mid-April.
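The irrigation rule described above (a 10 mm supplement after each rainstorm exceeding 5 mm, none after a seasonal cutoff date) can be sketched as follows. This is an illustration only, not the protocol as actually run in the field. In particular, capping the cumulative supplement at the seasonal target (164 mm or 90 mm) is an assumption of this sketch, and the function name `schedule_irrigation`, its parameters and the example dates and rainfall amounts are hypothetical.

```python
from datetime import date


def schedule_irrigation(rain_events, seasonal_target_mm, cutoff,
                        trigger_mm=5.0, dose_mm=10.0):
    """Return a list of (date, mm) irrigation supplements for one season.

    rain_events: chronological list of (date, rainfall_mm), one per rainstorm.
    seasonal_target_mm: total supplement for the season (e.g. 164 or 90).
    cutoff: last date on which irrigation may be applied (e.g. mid-April).
    """
    supplements = []
    applied = 0.0
    for day, rain_mm in rain_events:
        # Stop at the end of the irrigation window, or once the target is met
        # (the cap at the target is an assumption of this sketch).
        if day > cutoff or applied >= seasonal_target_mm:
            break
        # Only rainstorms that exceed the trigger amount are supplemented.
        if rain_mm > trigger_mm:
            dose = min(dose_mm, seasonal_target_mm - applied)
            supplements.append((day, dose))
            applied += dose
    return supplements


# Hypothetical rainstorm record for one season at the semi-arid site.
events = [
    (date(2008, 12, 1), 12.0),
    (date(2008, 12, 20), 3.0),
    (date(2009, 1, 5), 5.0),
    (date(2009, 2, 1), 8.0),
    (date(2009, 4, 20), 30.0),
]
plan = schedule_irrigation(events, seasonal_target_mm=90.0,
                           cutoff=date(2009, 4, 15))
```

In this hypothetical record only the storms of 1 December and 1 February are supplemented: the 3 mm and 5 mm events do not exceed the trigger, and the April storm falls after the cutoff. With 10 mm per event, reaching 164 mm or 90 mm in a season would take roughly 16 or 9 qualifying rainstorms, respectively.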

Calibrating biomass measurements
The

Assessing experimental power
We acknowledge that a larger sampling effort is always desirable in any experiment, particularly in ecology. However, assessing the strength of our experiment, both in terms of statistical power and in comparison with other experiments of a similar type, goes some way towards justifying our main suggestion that highly variable, water-limited systems are more resistant to climate change than expected.
It should be noted, though, that post-hoc power analyses are not widely advocated 27,28. Part of the problem is determining thresholds for the effect sizes we could reasonably expect to see. However, comparing these proposed effect sizes with the observed response variables, both across sites (Fig. 4) and among years at a site (Fig. 5), suggests that our experimental design was probably strong enough to detect appropriate responses if they existed. Although the rainfall difference between sites is bigger than that imposed by our manipulations, the proportional changes in response variables are also much larger (Fig. 4). Among-year differences within a site also suggest that at least phenotypic, if not selection, responses should be observable over many years (e.g. average fold differences between driest and wettest years in open plots at the semi-arid (SA) site: richness = 2.1, density = 11.6, biomass = 4.3). We therefore deduce that other effects, possibly specific plant adaptations to variability, are likely to be slowing down any community responses to the climate manipulations.

Selection of model for testing treatment effects
Our core prediction was that communities will change their structure due to climate change, and we have included the factor 'time' as years in the models. However, one may argue that precipitation in the study years should have been included in the models. For example, there is a large amount of environmental variation between years which could have been caused by precipitation differences. We felt it best to capture this by including "year" as a categorical variable (Supplementary Table 2), thereby not making any assumptions about precisely which of several possible environmental variables (or which combination of them) caused the between-year variation in community parameters.
Including precipitation in the statistical models did not change our finding of 'no response to the treatments': despite containing far fewer parameters, models that included precipitation even had a larger AIC (i.e. less support) than those described purely by categorical years (Supplementary Tables 3 and 4).
Nevertheless, we considered including rainfall within the statistical model. It is not possible to include rainfall when year is a categorical variable, but we did try it with year as a continuous variable (focusing on a "treatment x year" interaction to reveal any manipulation effects; Supplementary Table 3). Analogous to the autoregressive (lag 2) function used in the main analysis, we found that a model including rainfall in year T (current growing season), T-1 (the year before the growing season) and T-2 (two years before the growing season) had the best fit. We also tested our response parameters by removing "year" completely from the model and including rainfall at T, T-1 and T-2 instead (Supplementary Table 3). Supplementary Tables 3 and 4 show that, although there are some subtle differences in detail, the overall message is the same irrespective of the analysis, i.e. very few significant changes in the community due to the experimental treatments. More specifically, when rainfall was included in a statistical model as a predictor, treatment responses often explained some part of the remaining variance, as shown by the generally higher F-values (Supplementary Tables 3 and 4 compared to Supplementary Table 2), but still only three responses were significant, and there was never any difference between wet and dry.
Therefore, to keep the manuscript concise, we base our conclusions on the original and most powerful methods presented in the main text and in Supplementary Table 2.
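To make the model comparison above more concrete, the following sketch shows how rainfall predictors at T, T-1 and T-2 could be assembled for each growing season, and how AIC trades goodness of fit against the number of parameters. It is not the analysis used in this study: the mixed models and their likelihoods are not reproduced, the AIC is written in its Gaussian least-squares form (up to an additive constant), and the function names and all numbers are hypothetical.

```python
import math


def lagged_rainfall(rain_by_season, seasons, lags=(0, 1, 2)):
    """Build rainfall predictors [T, T-1, T-2] for each season.

    rain_by_season: dict mapping season label (int) to rainfall in mm.
    Seasons for which any lagged value is missing are skipped.
    """
    rows = {}
    for t in seasons:
        needed = [t - lag for lag in lags]
        if all(s in rain_by_season for s in needed):
            rows[t] = [rain_by_season[s] for s in needed]
    return rows


def aic_least_squares(rss, n_obs, n_params):
    """AIC of a Gaussian least-squares fit, up to an additive constant.

    AIC = n * ln(RSS / n) + 2k; lower values indicate more support.
    """
    return n_obs * math.log(rss / n_obs) + 2 * n_params


# Hypothetical seasonal rainfall totals (mm).
rain = {2000: 250.0, 2001: 310.0, 2002: 180.0, 2003: 290.0}
predictors = lagged_rainfall(rain, [2001, 2002, 2003])

# Hypothetical fits: a small rainfall model versus a model with one
# parameter per categorical year that leaves less residual variation.
aic_rainfall = aic_least_squares(rss=40.0, n_obs=10, n_params=4)
aic_year = aic_least_squares(rss=10.0, n_obs=10, n_params=10)
```

With these made-up numbers the rainfall model has fewer parameters but the larger AIC, which is the pattern reported above: the penalty saved on parameters does not compensate for the poorer fit.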