Unscrambling phylogenetic effects and ecological determinants of chromosome number in major angiosperm clades

As variations in the chromosome number are recognized to be of evolutionary interest but are also widely debated in the literature, we aimed to quantitatively test for possible relationships among the chromosome number, plant traits, and environmental factors. In particular, the chromosome number and drivers of its variation were examined in 801 Italian endemic vascular plants, for a total of 1364 accessions. We estimated phylogenetic inertia and adaptation in chromosome number - based on an Ornstein-Uhlenbeck process - and related chromosome numbers with other plant traits and environmental variables. Phylogenetic effects in chromosome number varied among the examined clades but were generally high. Chromosome numbers were poorly related to large scale climatic conditions, while a stronger relationship with categorical variables was found. Specifically, open, disturbed, drought-prone habitats selected for low chromosome numbers, while perennial herbs, living in shaded, stable environments were associated with high chromosome numbers. Altogether, our findings support an evolutionary role of chromosome number variation, and we argue that environmental stability favours higher recombination rates in comparison to unstable environments. In addition, by comparing the results of models testing for the evolvability of 2n and of x, we provide insight into the presumptive ecological significance of polyploidy.

chromosome interactions during meiosis consistent with MIT 14 . However, in this case no obvious increases in genome size should be expected, unless caused by massive appearance of repetitive non-coding DNA which is very frequent in plants 3 .
Although a generalised model of chromosome number variation is lacking and despite the failure of models to fully predict available experimental evidence, the development of such a diverse range of models is suggestive, per se, of the high evolutionary interest of chromosome number variation.
Despite the interest in chromosome number evolution over the last decades [21][22][23][24] , in the absence of studies linking changes of chromosome number to natural selection, it is impossible to identify the adaptive function of such variation. At present, an adaptive role has been demonstrated for other genomic phenotypic traits, such as genome size [25][26][27] showing that species with large genomes may be at a selective disadvantage in extreme or unstable environmental conditions. However, for chromosome number variation, evolutionary hypotheses have been raised by a limited set of preliminary works 28 . It is generally admitted that chromosome number and genome size are not positively correlated in angiosperms and gymnosperms, while such a correlation is highly significant in ferns and lycophytes 29 . Further, Grant 30 argued that in angiosperms, species with a basic chromosome number higher than 14 should be considered paleo-polyploid, thus linking high chromosome numbers with multiple ancient rounds of whole genome duplications. In addition, a series of recent studies suggested that paleo-polyploidy is widespread across angiosperms 31 and seed plants 32 in general, irrespective of their chromosome number. Unbalanced (i.e., odd) chromosome numbers often cause significant phenotypic changes and severely impact plant growth and sexual fitness 33 , with the exception of apomictic plants and those with holocentric chromosomes, for which no deleterious effect (i.e., counter-selection) is expected 34,35 . It has been postulated that descendant dysploid chromosome number changes, coupled with the transition to an annual life form, are the main trend in angiosperms 34 ; however, this soon resulted in an overly broad generalization, like many other karyological assumptions concerning, for instance, the direction of variation in karyotype asymmetry 36 . Stebbins 1 also hypothesized that a chromosome number reduction by dysploidy should be expected in plants occupying pioneer habitats to avoid excessive segregation and recombination of genes. Nevertheless, protection of favourable allelic co-occurrences against recombination is also achieved by inversions -with no effects on chromosome number -in some animal taxa 37 . However, high chromosome numbers carry an increased risk of mis-segregation during nuclear division 20 . Stebbins 1 and Darlington 38 , together with Grant 2 , agreed that low recombination rates should be favoured in individuals living in unstable environments to quickly develop populations; in contrast, environmental stability is expected to select for increased recombination rates because loss of alleles is overbalanced by those rare allelic combinations with high fitness.
In particular, given that genetic recombination acts as a trade-off between the opposing needs of immediate fitness and evolutionary adaptability 2 , chromosome number should clearly play a central role in this balance 1,39 . To date, the possible relationship among chromosome number, plant traits, and environmental parameters has scarcely been investigated quantitatively. Accordingly, the main aim of our study was to use a dataset of 801 Italian endemic vascular plants (1364 accessions) to test these relationships. Italian endemics represent an ideal case study because complete information is available for basic chromosome number (x), and they share a common geographical evolutionary history 40 . Whilst we expect a significant phylogenetic signal in chromosome numbers, we specifically aim to quantitatively test the hypothesis that environmental stability and longer life cycles select for higher chromosome numbers, while unstable habitats select for lower chromosome numbers. To this end, we used a phylogenetic comparative approach that estimates phylogenetic inertia and adaptation in chromosome number based on an Ornstein-Uhlenbeck process 41 . The method considers a single trait adapting to optima influenced by continuous, randomly changing predictor variables or in response to fixed categorical niches. One of the main parameters returned by the model is the phylogenetic half-life (t 1/2 ), indicating the time it takes for half the ancestral influence on a trait to evolve towards the predicted optimal phenotype 42 . Using this method, we were able to estimate relationships between chromosome numbers, plant traits and environmental factors in an evolutionary framework. Finally, by comparing the results of the different models testing for evolvability of 2n and of x, we were able to assess the relative contribution of polyploidy to 2n and provide insight into its presumptive ecological significance.

Results
Phylogenetic effect in chromosome number. The phylogenetic effects in chromosome number varied among the examined clades, but were generally moderate to large (Table 1), thus rejecting the hypothesis of species independence. Diploid chromosome numbers (2n) exhibited significant phylogenetic effects (t 1/2 > 0), and the supported values (values within two log-likelihood units lower than the maximum log-likelihood) ranged from a moderate to strong phylogenetic effect, not exceeding 1.0 total length (t 1/2 < 1), with the exceptions of Malvids and Caryophyllales. Indeed, the best estimate for Malvids was t 1/2 = 0.01, with a support region from 0 to 0.05, suggesting that the trait evolved rapidly, nearly instantly on the timescale of this phylogeny. In contrast, the half-life value for Caryophyllales was many times the total tree length (infinity), indicating that 2n evolved as if by Brownian motion in this clade.
Compared to diploid numbers, the basic chromosome number (x) showed stronger phylogenetic effects with a half-life that included t 1/2 > 1 and a supporting region that included t 1/2 = ∞ in all clades. Furthermore, the wider support regions suggested that estimates of t 1/2 for x were more uncertain than those for 2n.
The stronger phylogenetic effects exhibited by x were associated with coefficients of variation (CV) lower than 0.5, while the CV calculated for 2n was larger and was followed by weaker phylogenetic effects (Table 2).
Adaptation and inertia in chromosome number. We found clear evidence for adaptation of chromosome number to environmental and morphological predictors, but in several cases, and especially for basic  (Tables 1 and 3). The effects of the continuous (climatic) predictors were generally weak. Indeed, only eight of the 60 models that used climatic predictors had lower AICc and lower half-life (indicating that not all the phylogenetic effect was due to phylogenetic inertia) compared to the model without the predictor (Table 1). Nevertheless, even in these cases (all these models refer to 2n), the half-life reduction was small and the AICc decrease never exceeded 2 units which indicated that there was a weak tendency for chromosome numbers to evolve towards the optimum. Hence, these results should be regarded with caution as they represent only a tentative indication of climatic effects on chromosome numbers.
In the eight models mentioned above, the climatic variables that explained variation in chromosome number, at least marginally, were mean temperature, temperature continentality and precipitation seasonality. Overall, the association of mean temperature with chromosome number was negative with the slope of the phylogenetic regression nearly flat in Caryophyllales and Lamiids, while it was steeper in other clades, despite explaining less than 4% of the variance. However, although the relationship was not significant, it was positive in Fabids. A negative relationship with chromosome number was also found for precipitation seasonality, while temperature continentality was generally positively associated with chromosome number.  Table 1. Phylogenetic regression results of the evolution of log-transformed chromosome numbers (2n and x) on climatic variables in each angiosperm clade (as indicated). All phylogenetic trees are scaled to 1.0 total length. Predictor variables are given in the committed column, except when a Brownian motion or a single equilibrium O-U was fit, which has only an intercept. In the latter case, the phylogenetic half-life (t ½ , with 2-unit support interval in parentheses) is a measure of the overall effect of the phylogeny on the response variable (phylogenetic signal). In models with predictors, t ½ indicates the time it takes the trait to evolve half the way from an ancestral state to the optimal state (rate of adaptation). Stationary variance (v y ), intercept (±SE), slope (±SE) from phylogenetic regression and the amount of variance explained by the model (R 2 ), Akaike Information Criterion (AIC) and Akaike Information Criterion weights (AICw) in comparison with the nospecific-adaptation model (single-equilibrium O-U model: 2n ~ 1 or x ~ 1) are shown. *Simple AICw is the AIC weight for each model relative to the no-predictor O-U model. **Global AICw is the AIC weight for each model relative to all models tested. For each set of models, the highest simple AICw is in bold when associated with a reduction of the phylogenetic half-life. For categorical predictors, the relationship with both 2n and x was overall robust with 32 out of the 84 models outperforming the model without the predictor (Table 3). Chromosome number was mostly affected by the habitat categories (light, moisture and nutrients), but in some clades, also by morphological categories (growth form and flower size). The gain in AICc values for the outperforming models attained more than 2 units with the percentage of the variance explained largely exceeding 5%. These models also exhibited a significant reduction in half-life, which indicated that chromosome number evolved in response to categorical variables while some, but not all, of the phylogenetic effects in chromosome number is due to phylogenetic inertia. Specifically, open habitat selected for low chromosome numbers while shaded, stable environments (e.g., forest) selected for higher chromosome numbers in Monocots, Fabids, and Campanulids (for both 2n and x) and in Malvids and Lamiids (for 2n only) (Fig. 1). Habitat moisture and nutrient availability were positively associated with chromosome number, although some optimal values (θ) had little biological meaning. In particular, this lack of biological meaning applies to eutrophic and wet categories owing to the low number of branches attributed to these categories. Hence, we regard these results with caution; nevertheless, the overall patterns of the models were robust and understandable. Chromosome number was also positively related with plant morphological traits, namely growth form (life forms with longer life cycles had higher chromosome numbers) and flower size but in a few cases, was also negatively associated with flowers clustered into a flower-like inflorescence.

Discussion
Adaptation and inertia in chromosome number. We found evidence for adaptation of chromosome number to environmental and morphological predictors, especially for 2n, while, not surprisingly, x exhibited a lower degree of variation and less significant adaptive evidence, providing insight into the possible ecological significance of polyploidy. Therefore, although the majority of variation remains unexplained, a Brownian motion process alone is certainly inadequate to explain the evolution of chromosome number 42,43 .
Our study indicates that phylogenetic inertia is a significant component in all models (t 1/2 > 0). Nevertheless, categorical predictors and, to a lesser extent, some climatic variables (mean temperature and precipitation seasonality) supported the hypothesis 1,2,38 that environmental instability and seasonality select for low chromosome numbers (i.e., decreased recombination rates). Specifically, open, disturbed, drought-prone habitats selected for low chromosome numbers, while shaded, stable environments with good availability of water and nutrients selected for high chromosome numbers and consequently for increased recombination rates 2 .
In addition, in agreement with Bell 44 and Stebbins 1 , who argued that rapid reproductive cycles correlate with low chromosome numbers, we found that low chromosome numbers are associated with small flowers densely clustered in inflorescences. In contrast, we found that higher chromosome numbers are linked to perennial herbs, especially geophytes, in agreement with previous studies 1,17,45 .
From our results, it seems that chromosome number is poorly related to large scale climatic conditions. Only three climate variables had a relationship with the chromosome number, so that species with higher chromosome numbers tend to occur in sites with lower annual temperature, lower precipitation seasonality and higher continentality, consistent with previous work 46 . Nevertheless, the effects of climatic predictors are best explained by models of 2n evolution, indicating the relative contribution of polyploidy and its ecological significance to diploid chromosome number. In addition, among categorical predictors, models fit on 2n exhibited stronger association with environmental and morphological predictors than the models fit on x, again suggesting a putative ecological significance of polyploidy. The increase of polyploid taxa with latitude and in colder sites with continental climates has been already shown by previous studies 34,[47][48][49] , albeit questioned by others [50][51][52] . The recent availability of large chromosome number databases 21,53 has spurred further research on this subject, spanning a substantially wider taxonomic space and confirming this trend across the whole Arctic flora 54 and at other geographical scales 22 .
Phylogenetic effect in chromosome number and clade-specific implications. In this study, we found phylogenetic effects in chromosome number ranging from moderate to large. This finding is congruent with data published previously 23 , which highlighted that closely related species share similar patterns of chromosome number variation. Despite this, chromosome number per se has little systematic significance, because it is well known that many taxa share the same chromosome number across different groups 1 . Further, the use of other analytical approaches to measuring phylogenetic signal (e.g., Blomberg's K andPagel's lambda) returned highly congruent results with patterns of phylogenetic effects higher for basic chromosome number (x) than those for 2n (Table S3). The degree of chromosome number variation detected here allows for the investigation of possible mechanisms regulating chromosome number together with genome size. Indeed, the six considered clades, despite their origin from ancestors with very small genomes 55,56 , show remarkable differences in genome size, which is larger on average in Monocots (1C = 11.9 pg) and Campanulids (1C = 4.03 pg) than in Fabids (1C = 2.35 pg), Lamiids (1C = 2.32), Caryophyllales (1C = 1.7 pg), and Malvids (1 C = 1.45 pg) (data from 57 ). Hence, in the first two clades, large nuclear volumes 19 allow an increase in chromosome numbers and/or chromosome sizes, in agreement with the MIT 14 . In contrast, in the remaining four clades, an increase in chromosome number should be paralleled by the occurrence of smaller chromosomes in the absence of significantly larger genomes and nuclear volumes, again consistent with the MIT 14 . Accordingly, a reduction of chromosome number by descending dysploidy should be seriously constrained by limited nuclear volumes in Fabids, Lamiids, Caryophyllales, and Malvids. Indeed, in our dataset, these clades show smaller CV values concerning x, linked to a lower frequency of dysploidy than that observed in Monocots and Campanulids.
Monocots exhibit a high variation in chromosome number and include genera (e.g., in Poales) in which holocentric chromosomes occur, which is a peculiar condition allowing for rapid diversification and an extended  Table 3. Primary optima (±SE) of log-transformed chromosome numbers (2n and x) for each categorical predictor variable in each angiosperm clade (as indicated). The niches of each predictor variable were mapped onto the internal branches of the phylogeny using a parsimony reconstruction. All phylogenies are scaled to a total length of 1. Predictor variables are given in the column, except when a Brownian motion or a single equilibrium O-U was fit, which has only an intercept. Stationary variance (v y ) and the amount of variance explained by the model (R 2 ), Akaike Information Criterion (AIC) and Akaike Information Criterion weights (AICw) in comparison with the no-specific-adaptation model (single-equilibrium O-U model: 2n ~ 1 or x ~ 1) are also shown. *Simple AICw is the AIC weight for each model relative to the no-predictor O-U model. **Global AICw is the AIC weight for each model relative to all models tested. For each set of models, the highest simple AICw is in bold when associated with a reduction of the phylogenetic half-life.
Scientific REPORts | (2018) 8:14258 | DOI:10.1038/s41598-018-32515-x range of chromosome numbers 58 . However, the high frequency of geophytes in this clade also explains the obtained results, especially for 2n. Indeed, in geophytes, large cells are assumed to be an advantage during the rapid development of the plant body and cell expansion 59 ; as a consequence, they show higher tolerance to genome duplication with a high frequency of polyploids 60 . Under the assumptions of 19 , the occurrence of polyploidy in Monocots and Campanulids should be accompanied by significantly larger nuclear volumes, given their larger average genome sizes 57 . If polyploidisation events are evolutionarily followed by descending dysploidy events, the consequent reduction of chromosome number and simultaneous increase of chromosome size in these two clades should not be subjected to selective constraint under MIT 14 . Finally, in Monocots the co-occurrence of massive polyploidisation and large genome size variation points towards evolutionary phenomena linked to the large genome constraint hypothesis 27 .
The performance of the models on Malvids might be biased by low sampling 61 , while in Fabids, the results are partly obscured by the co-occurrence of dysploidy and polyploidy phenomena in our dataset, especially in Mediterranean taxa (e.g., Genista). In Caryophyllales, results are biased by the predominance of taxa belonging to the genus Limonium, but model fits on 2n and x are largely comparable. In this latter clade, habitat patchiness/ stochasticity and high frequency of hybridization, rather than polyploidy, might have influenced speciation and chromosome number changes.
In the present study, several associations among chromosome number, plant traits and environmental factors were found. However, as mentioned by other authors 28,62 , there is no reason to expect chromosome number per se to affect plant fitness. Instead, chromosome number variation can be driven by cellular processes that affect meiosis and mitosis 14,20 . Hence, we also interpret the evolution towards an optimal state as a karyotypic equilibrium determined by mutation rates 63 , by nuclear division dynamics 14,20 , and possibly by epigenomic surveillance systems 18 , rather than an adaptive optimum.
Albeit limited in sample size, at first glance, our results showed that the evolutionary process is homogeneous across the phylogeny 41 among different clades of Italian endemics. Yet, having demonstrated that phylogenetic inertia is a significant component in chromosome number evolution, future studies can be extended to larger samples and also using different approaches 64 , where not only the primary optima but also the α-parameter (strength of selection) and σ-parameter (strength of drift) differ among niches, allowing clade-dependent rates of adaptation to be tested 65 .
In conclusion, our study presents non-stochastic demonstrations for chromosome number variation, and we argue that environmental stability favors higher recombination rates in comparison to unstable environments. In addition, whilst phylogeny is a strong predictor of trait values, especially for x, we highlight that a simple phylogenetic explanation is inadequate to account for its variation in 2n and x.

Methods
Chromosome data. Chromosome counts for 1364 accessions of 801 vascular plants endemic to Italy have been extracted from Chrobase.it (http://bot.biologia.unipi.it/chrobase/) 66 . Chrobase.it is an online dataset of chromosome counts for the Italian vascular flora 21 , hosting cytogenetic data for endemic and non-endemic species. For this study we only selected counts of endemic plants because they are the most sensitive components of a flora, often being restricted to ecologically selective habitats 67 , for which we are confident that the environmental variables calculated in the present study can be a good proxy for the total ecological requirements of the species. Most counts in Chrobase.it are associated with an exact geographic locality. For those chromosome counts lacking precise information (<10%), we identified an approximate locality based on the restricted distribution range of the species 68,69 .
Mean chromosome numbers were estimated for each species, while within-species variation to be incorporated into the phylogenetic analysis was not estimated separately for each species, because within-species samples were limited to a few counts 70 . Hence, we estimated the pooled variance across the species and used it, weighted by sample size, to estimate the observation variance of the individual species, as recommended by 71 . Preliminary analyses revealed that log transformation improved model fit by over 500 log-likelihood units relative to untransformed data, thus mean chromosome numbers were log transformed prior to analysis 62 .
Chromosome number evolution was analysed on a dataset separated into 6 major angiosperm clades 72 (Monocots, Fabids, Malvids, Caryophyllales, Lamiids, and Campanulids) to guarantee an adequate number of taxa per clade (ranging from a minimum of 67 to a maximum of 200). This analysis allowed us to evaluate whether chromosome number evolution proceeded differently or homogeneously along different evolutionary histories. The complete data set assembled for the present study is reported in supplementary Table S2.
Phylogeny. We compiled a phylogeny using the dated, ultrametric supertree for 4685 European vascular plants (DaPhnE 1.0 supertree) 73 based on 518 recent molecular phylogenies. We first completed this supertree for the 281 species in our data set that were absent from the supertree (or for which related species were missing in the supertree) following the methods of Durka and Michalski 73 , using more than 90 phylogenetic and systematic studies. We then reduced this tree to the 801 species in our dataset. The complete list of sources used in this paper is reported in supplementary Table S1. Mean diploid (2n) and basic (x, see 74 ) chromosome numbers for each taxon were visualized on the phylogenetic tree (Figs 2 and S1) with the plotsimmap function in the package phytools 75 of R 76 . Basic chromosome numbers (x) were obtained by a taxon by taxon screening of relevant karyological literature previously published by 40 . Climatic, ecological and morphological data. Climatic data associated with the sampling sites were downloaded from the Worldclim database at 2.5 min scale (http://www.worldclim.org 77 ). The climate data considered were: mean annual temperature (°C), temperature seasonality (SD × 100), temperature continentality (°C), mean annual precipitation (mm) and precipitation seasonality (coefficient of variation). Means and standard deviations were estimated for each species within a buffer of 10 min over georeferenced sites using QGIS v. 2.18 (Quantum GIS, http://www.qgis.org).
Thus, we used the phylogenetic comparative method implemented in the R program SLOUCH, established to study adaptive evolution of a trait along a phylogenetic tree 41 . The method assumes that the response trait evolves as if by an Ornstein-Uhlenbeck model of adaptive evolution towards a primary optimum θ, defined as the optimal state that species will approach in a given niche 42 .
Whilst we illustrated the overall evolutionary relationships among clades in Fig. 2, the phylogenetic comparative analyses were run only on the subtrees associated with each clade. Phylogenetic trees are scaled to 1.0 (from the root to the tip of the ultrametric tree) for an easier interpretation of parameter estimates 41 . The two main parameters returned by the model are the phylogenetic half-life (t 1/2 ) and the stationary variance (v y ). Phylogenetic half-life indicates the time it takes for half the ancestral influence on a trait to evolve towards the predicted optimal phenotype 42 . A half-life greater than zero means that adaptation is not instantaneous, while t 1/2 = 0 means that there is no evolutionary lag. The stationary variance is the stochastic component of the model and can be considered as evolutionary changes in the response trait induced by genetic drift.
Phylogenetic half-life in a model that only includes the intercept is an estimate of the phylogenetic effect in the response trait. In such a model, a half-life = zero means that the response variable is not phylogenetically clustered, while a half-life >0 suggests that there is an influence of phylogeny on the data; a half-life with high values can be attributed to an underlying continuous Brownian motion process.
The intercept-only model is contrasted with a model that also includes a predictor variable. This type of model is regarded as an adaptation model because it tests whether the response traits evolve towards optima influenced by a predictor. By comparing a model with and without the inclusion of predictor variables, it is possible to determine how much of the phylogenetic signal can be attributed to phylogenetic inertia (i.e., resistance to adaptation). No reduction in t 1/2 suggests that the phylogenetic signal can be entirely attributed to phylogenetic inertia; in contrast, when a trait evolves in response to a variable, a reduction in the half-life for the response trait (and/or of its support interval) should be observed.
The adaptation models, which include continuous predictors (i.e., the climatic variables in our study), are fitted using maximum likelihood for estimation of t 1/2 and v y and generalized least squares for estimation of the regression parameters in an iterative procedure 41 . In addition, the SLOUCH model assumes that the predictors have a longer phylogenetic half-life than the model residuals, and this is well supported by the variables involved in our study. The model returns parameters of an optimal regression and of a phylogenetic regression. The former is the relationship between the response and predictor variable that is predicted to evolve free of ancestral influence (absence of inertia). Therefore, the slope of this regression must be steeper than that of the phylogenetic regression.
To evaluate the effect of the categorical predictors on the evolution of chromosome number, the ANOVA and ANCOVA extensions implemented in SLOUCH were used. Categorical predictors were mapped onto the phylogeny using parsimony reconstruction.
We used Akaike's Information Criterion (AIC) and AIC weights (AICw) to compare intercept-only models to the adaptation models. Use of AICw standardizes the weight of evidence in favour of each alternative model between 0 and 1 80 . We also reported AIC weights for each model relative to the set of models evaluated.
Finally, model interpretations were based on comparisons of t 1/2 and v y of the adaptation models with the intercept-only model and with a pure Brownian motion model, along with the amount of variation in chromosome number that the models explain. All statistical analyses have been carried out in R v3.2.3 76 .

Data Availability
All data analysed during this study are included in this published article (and its Supplementary Information files).