High dispersal levels and lake warming are emergent drivers of cyanobacterial community assembly in peri-Alpine lakes

Disentangling the relative importance of deterministic and stochastic processes in shaping natural communities is central to ecology. Studies about community assembly over broad temporal and spatial scales in aquatic microorganisms are scarce. Here, we used 16S rDNA sequence data from lake sediments to test for community assembly patterns in cyanobacterial phylogenies across ten European peri-Alpine lakes and over a century of eutrophication and climate warming. We studied phylogenetic similarity in cyanobacterial assemblages over spatial and temporal distance, and over environmental gradients, comparing detected patterns with theoretical expectations from deterministic and stochastic processes. We found limited evidence for deviation of lake communities from a random assembly model and no significant effects of geographic distance on phylogenetic similarity, suggesting no dispersal limitation and high levels of stochastic assembly. We detected a weak influence of phosphorus, but no significant effect of nitrogen levels on deviation of community phylogenies from random. We found however a significant decay of phylogenetic similarity for non-random communities over a gradient of air temperature and water column stability. We show how phylogenetic data from sedimentary archives can improve our understanding of microbial community assembly processes, and support previous evidence that climate warming has been the strongest environmental driver of cyanobacterial community assembly over the past century.

Understanding the mechanisms that determine changes in the structure and composition of natural communities over large spatial and temporal scales is critical, given the impacts that human activities have on biodiversity and ecosystem functions 1 . The relative importance of stochastic and deterministic processes driving community assembly might vary over space and time: environmental conditions, dispersal, demographic stochasticity, ecological interactions and evolutionary processes can all influence the structure of natural communities across scales [2][3][4][5][6][7] . It is an on-going challenge to understand how anthropogenic environmental changes influence ecological and evolutionary mechanisms determining community assembly, particularly in aquatic microbes whose dispersal appears to have no boundaries 8,9 . Assembly studies focusing on ecological mechanisms in lake cyanobacterial communities have been scarce due to a lack of data at the appropriated spatial and temporal scale, despite the importance that these organisms have reached over the past decades for freshwater ecosystem functioning and services 10 . Over the last century, the frequency and severity of cyanobacterial blooms have increased in lakes and reservoirs worldwide despite remediation measures applied at the regional and international scale 11,12 . Cyanobacterial blooms are often dominated by toxic species, and there is a global concern that environmental changes are promoting the geographic expansion of some potentially harmful taxa 13,14 , due to a combined effect of increasing temperature and nutrient loads 11,15,16 . Toxic species such as Dolichospermum lemmermannii and Planktothrix rubescens have indeed widened their geographic distribution, supporting the idea that some harmful cyanobacteria are spreading across temperate lakes 16 . The role of geographic dispersal (where distance limits the establishment of new taxa) relative (2019) 9:7366 | https://doi.org/10.1038/s41598-019-43814-2 www.nature.com/scientificreports www.nature.com/scientificreports/ to turnover of taxa driven by environmental gradients has not been explicitly explored in the assembly of these globally important microorganisms.
In this study, we analysed cyanobacterial community composition data spanning over a hundred years and across ten lakes. We used 16S rDNA sequences from sediment cores of European peri-Alpine lakes ( Supplementary Fig. S1) that underwent directional environmental change characterised by climate warming and eutrophication 16 . Our previous work has explored the patterns of long-term change in alpha and beta diversity in lake cyanobacterial communities, showing a homogenization of assemblage composition at the regional scale 16 . The aim of this study was to test for emergent deterministic (environment-driven) and stochastic (dispersal-driven) patterns in the phylogenetic structure of cyanobacterial assemblages across these different lakes of the same region, using the same dataset 17,18 .
We used a null-model that accounted for temporal changes in the size of the species pool to simulate random assembly. We then tested for deviation from random patterns as phylogenetic clustering and overdispersion: the tendency for taxa to co-occur with larger or smaller expectancy, respectively, than predicted by the null-model ( Fig. 1a) [19][20][21][22][23] . In most cases, dispersal-driven assembly would generate random taxa co-occurrence patterns, while environmental drivers would lead to deviation from random assembly 20,22,24,25 . There can be interactions among assembly mechanisms that generate exceptions to these predictions 26,27 . We however expect that comparison of phylogenetic structures to null-model simulations, combined with the patterns of community phylogenetic similarity across lakes and spatial or ecological distance, will allow us to test for deterministic and stochastic signatures in cyanobacterial community assembly.
Specifically, when dispersal of cyanobacterial taxa among lakes is not limited (Fig. 1b), we expect that phylogenetic community similarity will decrease over an environmental gradient, while no change is expected when the system is driven only by dispersal (Fig. 1c) 18 . If there are barriers to dispersal of cyanobacteria, we predict differences in similarities among lake communities that are only dependent on the geographic distance (Fig. 1b), and no effects driven by ecological gradients (Fig. 1c) 18 . The environmental-driven decrease in phylogenetic community similarity will not be influenced by dispersal limitation (Fig. 1b), and will vary deterministically as a consequence of the gradient itself (Fig. 1c) 18 . This is because we expect that, under environment-driven assembly, the turnover of taxa along the ecological gradient will determine community structure in each lake. Here, we investigated whether cyanobacterial community phylogenetic structures within and across lakes over time matched these expectations from assembly processes, and what patterns dominate.

Results
Community phylogenetic structure. We calculated a standardised effect sizes (SES) of the mean-nearest-taxon-distance (MNTD) within each local community based on the comparison of the observed MNTD values with the values of a randomly assembled community (Methods). We then calculated the Nearest Taxon Index (NTI), which is the inverse of SES MNTD 28 . Based on NTI, 58% of cyanobacterial communities showed a phylogenetic structure that significantly differed from the null (random) expectation (Fig. 2). All of these non-randomly assembled communities were significantly phylogenetically clustered, with positive NTI values outside the 95% confidence interval of the null model simulation. Although the remaining thirty-two communities analysed did not show significant signal of non-randomness, most (especially since the 1980s) of the NTI values were positive, suggesting a tendency towards phylogenetic clustering. Schematic description of the theoretical expectations for phylogenetic community assembly within and between communities, based on stochastic (dispersal-driven) and deterministic (environment-driven) processes. (a) Taxa associations are analysed using their phylogenetic structure (mean-pairwise-distance (MPD), mean-nearest-taxon-distance (MNTD) or phylogenetic distance (UniFrac)) for each date of each timeseries and by comparing it to expected patterns from null-model simulations of random assembly (grey box): clustering and overdispersion (above and below the random expectation, respectively) signal communities that are composed of species phylogenetically closer or further apart than expected by chance, respectively, as a sign of deterministic processes. (b) Predicted patterns in phylogenetic community similarity depending on limitation (black line) and no limitation (grey line) in taxa dispersal among sites. (c) Predicted change in phylogenetic similarity for completely stochastic (grey) and deterministic (black) models of community assembly along an environmental gradient (e.g. lake physics and chemistry).
www.nature.com/scientificreports www.nature.com/scientificreports/ Distance-decay relationships. We estimated beta-diversity across all pairs of communities reconstructed from the sedimentary archives of the ten lakes at each time-period and investigated the role of geographical and temporal distance (Fig. 3). Our analysis based on the MNTD metric did not reveal an increase in phylogenetic beta-diversity with geographic distance (Fig. 3a), suggesting no dispersal limitation of cyanobacteria at the regional (peri-Alpine) scale. On the contrary, we observed in four of the lakes (Lugano, Hallwilersee, Maggiore, and Zurich) a decay of phylogenetic similarity along the temporal gradient representing the history of each lake ( Fig. 3b and Supplementary Fig. S2). When using the alternative beta-diversity measures beta-MPD, the results show significant decay in lakes Hallwilersee, Pusiano, Maggiore, and Zurich ( Supplementary Fig. S3) www.nature.com/scientificreports www.nature.com/scientificreports/ and a significant decrease in UniFrac similarity through time in all lakes, with the exception of lakes Geneva and Annecy (the latter due to insufficient data points) ( Supplementary Fig. S4).
Community similarity over environmental gradients. All non-random samples identified in Fig. 2 using the SES MNTD metric were used to investigate the role of the main chemical (total phosphorus [TP] and nitrate [NO 3 − ]) and physical (air temperature and water column stability) drivers in explaining cyanobacterial community deviation from a random assembly. We found no evidence for a role of the NO 3 − (p = 0.989, DF = 349 DF) in explaining non-random community structure and only a weak effect of the role of TP (p = 0.0164, adjusted R 2 = 0.0084, DF = 559; Fig. 4). The relationship between the main chemicals and beta diversity was also investigated using two other common phylogenetic diversity metrics, i.e. UniFrac and MPD. UniFrac similarity declined slightly over the TP gradient ( Supplementary Fig. S5). No significant relationship between MPD and TP or NO 3 − was observed (Supplementary Fig. S6). The effect of ammonia (NH 4 + ) was also considered, although concentrations of this nutrient have not been found to be historically high in these lakes. As for NO 3 − , there was no evidence for effects of NH 4 + on pairwise cyanobacterial phylogenetic diversity based on UniFrac, beta-MPD, and beta-MNTD ( Supplementary Fig. S7).
On the other hand, ordinary least squares regression showed a significant increase in community beta-MNTD along with both air temperature (p = 6.18e-06, adjusted R 2 = 0.0204, DF = 944) and water column stability (Schmidt Stability Index -SSI) gradients (p = 3.815e-08, adjusted R 2 = 0.0741, DF = 208) (Fig. 4). The regression based on UniFrac revealed a significant decay in phylogenetic similarity with the temperature gradient (p = 1.497 e-15 , adjusted R 2 = 0.0642, DF = 944), and when using beta-MPD values, we only observed a significant relationship with the water column stability gradient (p = 3.05 e-08 , adjusted R 2 = 0.1333, DF = 208; Supplementary  Fig. S6). This suggests that communities in lakes characterized by similar physical characteristics related to lake water temperature are more similar in cyanobacterial community composition compared to lakes that display greater differences in temperature and stratification.

Discussion
Over half of the cyanobacterial communities obtained in this study from sedimentary archives showed significant deviation from a random phylogenetic structure, suggesting a mixed signal of deterministic (environment-driven) and stochastic (dispersal-driven) community assembly in lake cyanobacteria. Our previous work has shown that DNA-based reconstructions of cyanobacterial communities are robust 16,29 , therefore the observed patterns are unlikely to be driven by biases in sedimentary DNA-based community reconstructions 2 . The decay in phylogenetic similarity over time coupled with the lack of a geographic distance-decay relationship across lake communities (Fig. 3) suggest temporally dynamic communities (potentially driven by environmental change) with no limitation to dispersal at the regional (peri-Alpine) scale. In a recent study on genetic divergence among populations of a marine diatom, a significant relationship could not be found between genetic and geographic distances at regional and global scales 9 . Most reports about microbial dispersal so far did not show clear evidence for geographic distance-decay patterns at the local (0-100 km) and regional (101-5,000 km) scales 30 . The scale of distances in our study was not suited to capture dissimilarity changes among cyanobacterial communities along very large geographical distances (e.g. continental), where an effect of geographical isolation might emerge 31 . Nevertheless, our research suggests that cyanobacterial communities present weak dispersal limitation among lakes of the same region, even around and across barriers such as the European Alpine mountain range.
Previous work has shown that communities of cyanobacteria have become more homogeneous in terms of composition across peri-Alpine lakes over the last decades, in favour of a few clades of bloom-forming and potentially toxic taxa 14,16 . We speculate that this could result in an increase of phylogenetic clustering over time, if the traits under selection by environmental changes are phylogenetically conserved 20 . The most sensitive metric of phylogenetic diversity in our study was the MNTD, which measures changes among the closest relative taxa at the tip of the phylogeny 28,32 . Our data show that the phylogenetic structure of about half of the assemblages displayed significant clustering at this level, which suggests the hypothesis that the environmental driving forces of cyanobacterial assembly in the lakes are acting on traits that are conserved at the tip of phylogenies.
Coloniality and buoyancy regulation are multiphyletic traits (i.e. present in multiple lineages) that are however conserved at among close relatives (within Family, Genus, Species), and have been associated to the spreading taxa, which belong to different cyanobacterial phylogenetic lineages within the orders Chroococcales, Nostocales and Oscillatoriales 10,11,16 . While coloniality is a defence trait under grazing pressure, buoyancy regulation becomes clearly advantageous under lake warming and a stable water column, since it allows cyanobacteria to adjust to vertical light conditions and access nutrients in deep waters. It appears reasonable to hypothesize that buoyancy regulation is advantageous in warming lakes, as it has been suggested in past reviews about drivers of cyanobacterial dominance 11,15 . The prevalent signal of clustering in the community phylogenies, supported by previous evidence, suggests environmental selection for traits such as those mentioned above.
The recorded levels of NO 3 − and NH 4 + across lakes did not significantly explain deviation from random assembly in the investigated cyanobacterial communities (Fig. 4 and Supplementary Fig. S5). In the case of TP, only a weak effect was found on beta-MNTD and UniFrac similarity, whereas no significant effect was observed on beta-MPD ( Fig. 4 and Supplementary Figs S5 and S6). It is important to note that most of the lakes investigated here classify as meso-eutrophic to eutrophic 33 (see average TP, NO 3 − and NH 4 + concentrations reported in Supplementary Data S1 and S2 in 16 ). Significant patterns of phylogenetic similarity might emerge across communities characterized by a broader nutrient gradient, but this remains to be tested by surveys or experimentally. Rather, the difference in lake physical conditions, such as temperature and strength of the water column stratification, seemed in our study to explain the most significant proportion of observed variance in the phylogenetic relatedness among cyanobacterial communities ( Fig. 4; Supplementary Figs S5 and S6). Warming might also explain the observed decay of phylogenetic similarity over time ( Fig. 3b; Supplementary Figs S2-S4). The increasing trend in air temperatures, which has accelerated since the 1980s across the peri-Alpine region, has caused modifications in the thermal regime of lakes, e.g., via changes in the duration and strength of the water column stratification 16,34 , which in turn affects recirculation and availability of nutrients for phytoplankton growth 10,23,35 . This effect has been amply documented and has favoured, as mentioned above, buoyant cyanobacterial forms that are able to control their vertical position in the water column to reach optimal nutrient and light conditions 14,16,23,34,35 . Our findings therefore support previous evidence and suggest that climate warming is the strongest environmental driver of the assembly of lake cyanobacterial communities 36 , and might select for specific traits such as, for example, buoyancy regulation.
In conclusion, this is the first study to our knowledge that explicitly tests for deterministic and stochastic assembly patterns in cyanobacterial communities across regional scales and over the past century, period during which humans have been recognized as a major driver of environmental change. Our study shows that both stochastic (dispersal-driven) and deterministic (environmental-driven) processes are important in assembling cyanobacterial communities across lakes of the European peri-Alpine region. Cultural eutrophication and climate change are the most notable environmental factors favouring cyanobacterial growth, but the deterministic processes governing community assembly appeared in our study to be more significantly driven by lake warming. Our results confirm previous evidence 16,36 and expand our understanding of cyanobacterial community assembly processes. Knowledge about the relative importance of (potentially controllable) environmental drivers and (likely uncontrollable) dispersal of organisms in shaping the structure of cyanobacterial assemblages is important for the management of aquatic ecosystems whose services are threatened by an increasing prevalence of potentially toxic taxa.

Materials and Methods
Data collection. We used the high-resolution 16S rDNA sequence dataset from 16 , spanning across ten European peri-Alpine lakes and between the early 1900s to 2016, to estimate phylogenetic diversity of cyanobacterial communities. Briefly, sediment cores were collected in ten lakes between 2013 and 2016 using a gravity corer, www.nature.com/scientificreports www.nature.com/scientificreports/ and layers were dated by varve counting and, in most cases, with radionuclide (Pb 210 , Cs 237 ) measurements 16,29 . Based on the sediment age models, sediment sub-samples were collected at various depths in cores from each lake to capture the cyanobacterial community composition over the last ~100 years. DNA was extracted from bulk sediments in a clean laboratory facility following strict ancient DNA work protocols, and the DNA extracts were used for PCR and high-throughput sequencing of the cyanobacterial 16S rRNA gene (Supplementary Table S1) on a MiSeq Illumina platform as previously described 16,29 .
The clean, primmer-trimmed sequences were clustered in operational taxonomic units (OTUs) with a 97% threshold of sequence similarity in QIIME 37 using the UPARSE workflow 38 . PyNast 37 and the Greengenes microbial sequence database 39 were used for sequence alignment, and FastTree 40 was used to estimate a phylogeny based on maximum-likelihood containing all OTUs found in the lakes. OTUs were taxonomically assigned with a confidence threshold of 85% and the ones assigned to phyla other than photosynthetic cyanobacteria were removed from the dataset. The 'phyloseq' package in Bioconductor 41 was used to import and filter the sequence data and all analyses were performed with the software R version 3.3.2 42 . Each sample was rarefied to 2,744 sequences (cyanobacteria only) prior to phylogenetic analyses.
The physical (air temperature in °C) and chemical (nitrate [NO 3 − ] and ammonia [NH 4 + ] in mg/L, total phosphorus [TP] in µg/L) data consist of several decades of monitoring of the ten lakes 16 . In all lakes, with the exception of Lake Pusiano, the nutrient data was collected at discrete depths over the water column and we have integrated values over the twenty upper meters. For Lake Pusiano, only the integrated values (whole water column) were available. Annual means were derived from monthly or bi-monthly data (Supplementary Data S1 and S2). For each sediment layer, the mean annual nutrient concentration of three consecutive years was used in order to reduce the bias related to sediment dating uncertainty (see 29 for further details). The annual maximal Schmidt Stability Index (SSI; the maximal strength of water column stratification) was derived from water temperature and hypsometry data 16,43 . Euclidean distances for each environmental variable were calculated among lakes to derive environmental gradients used in the linear models in the R package 'vegan' version 2.4.4 44 . phylogenetic analyses. To derive the phylogenetic structure of each community, we quantified the mean-nearest-taxon-distance (MNTD) 32,45 using mntd and ses.mntd in the package 'picante' version 1.6.2 for R 46 and used null-model simulations of random assembly that account for temporal changes in the size of the species pool 22 . The MNTD metric accounts for changes among closest relatives, which makes it suitable to investigate changes over relatively recent evolutionary times 45 . We calculated a standardised effect size of MNTD (SES MNTD ) within each local community subtree based on the comparison of the observed MNTD values with the values in the random distribution using 999 randomisations of the species at the tip of the phylogenetic tree, while species richness was maintained [46][47][48] : SES MNTD = mean(MNTD Observed − MNTD Randomized )/SD(MNTD Randomized ) 49 . The SES MNTD values were multiplied by −1 to be equivalent to the neearest taxon index (NTI) 28 .
To quantify beta-diversity across lakes and turnover in phylogenetic composition through time, we derived the beta-mean-nearest-taxon-distance (beta-MNTD). Additionally, we investigated changes in beta-mean-pairwise-distance (beta-MPD) and UniFrac similarity based on the OTU table and the fasta files from amplicon sequencing. To derive the beta-MNTD and beta-MPD pairwise distances, we used the comdist and the comdistnt functions, respectively, in the package 'picante' . The UniFrac phylogenetic distances 50 between all pairs of samples were derived using the dist function in the Bioconductor package 'phyloseq' 41 . For the geographic distance-decay analysis, we used the GeoDistanceInMetresMatrix function in R to derive a matrix of geographical distances between lakes (see Supplementary Methods and Supplementary Table S2). The geographic distance-decay relationship was measured on binned communities each representing a period of one decade (from the 1930s to the 2010s; decades 1900s, 1910s and 1920s were excluded due to insufficient number of samples). The binning was done to remove the factor time from the analysis, as it would introduce a bias when comparing multiple samples from single lakes over time. The temporal distance-decay pattern of phylogenetic similarity was studied by plotting beta-MNTD, beta-MPD, and UniFrac distance against natural log-transformed time distances (years) for each lake in the dataset. Significance of the distance-decay relationship at each decade was tested using Mantel tests in the R package 'ade4' with a significance threshold of p ≤ 0.05. To test whether the environment (physical and chemical parameters) was a driver of community assembly, we used the samples that were identified as phylogenetically non-random (i.e., those which NTI values were outside the 95% confidence interval of the null model simulation) in linear ordinary least square (OLS) regressions where physical and chemical lake data were the explanatory variables.