The effect of periodic disturbances and carrying capacity on the significance of selection and drift in complex bacterial communities

Understanding how periodical disturbances affect the community assembly processes is vital for predicting temporal dynamics in microbial communities. However, the effect of dilutions as disturbances are poorly understood. We used a marine bacterial community to investigate the effect of disturbance (+/−) and carrying capacity (high/low) over 50 days in a dispersal-limited 2 × 2 factorial study in triplicates, with a crossover in the disturbance regime between microcosms halfway in the experiment. We modelled the rate of change in community composition between replicates and used this rate to quantify selection and ecological drift. The disturbed communities increased in Bray–Curtis similarity with 0.011 ± 0.0045 (Period 1) and 0.0092 ± 0.0080 day−1 (Period 2), indicating that selection dominated community assembly. The undisturbed communities decreased in similarity at a rate of −0.015 ± 0.0038 day−1 in Period 1 and were stable in Period 2 at 0.00050 ± 0.0040 day−1, suggesting drift structured community assembly. Interestingly, carrying capacity had minor effects on community dynamics. This study is the first to show that stochastic effects are suppressed by periodical disturbances resulting in exponential growth periods due to density-independent biomass loss and resource input. The increased contribution of selection as a response to disturbances implies that ecosystem prediction is achievable.


Replicate similarity model selection
For each sample, relevant metadata was the disturbance regime at sampling, carrying capacity and sampling day. We calculated the similarity between biological replicates at each sampling day plotted them over time (Supplementary figure 1 a). We also investigated the variation over time in each cultivation regime as the standard deviation. The data indicated a temporal trend regarding both the replicate similarity and the standard deviation as a response to the disturbance regime, especially in Period 1. The replicate similarity appeared to decrease over Based on these observations, we decided to use a Bayesian hierarchical model approach because this accounts for the limited data points per day, the dependency between sampling days within each microcosm and the heteroscedastic variance (1,2). Bayesian modelling requires that one chooses an appropriate distribution for the data.
We first fitted a normal-, lognormal-, gamma-, beta-and logistic distribution to the data through maximum likelihood estimation for both Bray-Curtis and Sørensen similarity indexes. The function fitdist() from the package fitdistrplus version 1.0-14 was used for the distribution fitting (3). The observed data were compared with the distributions in a density-, quantile-, cumulative density-and probability plot. Based on these plots, the lognormal, gamma and beta distribution appeared to fit the data well. Because the variation was heteroscedastic, we chose to continue with the normal-and lognormal distributions as these allows constraining the variation to the metadata.
In a first trial, we fitted 24 model-formulas to the temporal Bray-Curtis replicate similarity in Period 1 (Supplementary table 1). The aim was to identify whether lognormal or normal best fitted the data, indicate the model formula, and test whether sampling day should be centralised.
Mean-centralising continuous variables can help to reduce covariance between random effect slopes and intercepts (4). For each model, we calculated the predictive density with the leaveone-out-cross validation method through loo(model, reloo = TRUE) from the package loo (version 2.2.0) (5). Then, all the 48 models predictive densities were compared through loo_compare() to identify the model that best fitted the data. The comparison of the predictive densities indicated that the normal distribution was the appropriate distribution, that variance should be given by sampling-day (time) and disturbance regime, and centralising sampling day  As an assurance to continue model selection with the appropriate formulas, we estimated model 1, 5, 9-13, 17 and 21-24 with the dataset for Bray-Curtis replicate similarity in Period 2 and for Sørensen in both periods. Within each dataset, the models' predictive densities were compared (Supplementary figure 2). Models 21 and 22 were removed due to a high elpd_diff (elpd_diff > 135). Model 23 had the highest predictive density for all the datasets. However, the models with day did not converge fully, and therefore, model 11 with centralised time was chosen.
Supplementary figure 2: The difference in the models expected log pointwise predictive density (elpd_diff) within each dataset. A total of 12 models were estimated for each of the datasets to determine if conclusions from the model selection in Period 1 of Bray-Curtis replicate similarity were appropriate for the other datasets. The predictive density of each model was estimated, and all models within the same dataset were compared to each other yielding the difference in expected log pointwise predictive density(elpd_diff). The datasets used were the replicate similarity based on Bray-Curtis in period 1 (BC P1) and 2 (BC P2) and Sørensen in period 1 (Sør P1) and 2 (Sør P2).
We then tested twenty formula variations of model 11 and compared the models' predictive densities (Supplementary table 2    The same ordination space as in a) but subsetted by sampling week to highlight the succession based on the disturbance regime. UD (circles) were undisturbed the first 28 days and disturbed the last 22 days, while DU (triangles) were disturbed in the first period and undisturbed in the second. H (filled) and L (empty) indicates high and low carrying capacity, respectively. Colours represent the disturbance regime at sampling, and the shaded area the spread of samples with similar disturbance regime.