Global patterns and drivers of phylogenetic structure in island floras

Islands are ideal for investigating processes that shape species assemblages because they are isolated and have discrete boundaries. Quantifying phylogenetic assemblage structure allows inferences about these processes, in particular dispersal, environmental filtering and in-situ speciation. Here, we link phylogenetic assemblage structure to island characteristics across 393 islands worldwide and 37,041 vascular plant species (representing angiosperms overall, palms and ferns). Physical and bioclimatic factors, especially those impeding colonization and promoting speciation, explained more variation in phylogenetic structure of angiosperms overall (49%) and palms (52%) than of ferns (18%). The relationships showed different or contrasting trends among these major plant groups, consistent with their dispersal- and speciation-related traits and climatic adaptations. Phylogenetic diversity was negatively related to isolation for palms, but unexpectedly it was positively related to isolation for angiosperms overall. This indicates strong dispersal filtering for the predominantly large-seeded, animal-dispersed palm family whereas colonization from biogeographically distinct source pools on remote islands likely drives the phylogenetic structure of angiosperm floras. We show that signatures of dispersal limitation, environmental filtering and in-situ speciation differ markedly among taxonomic groups on islands, which sheds light on the origin of insular plant diversity.

Variation explained by Generalized Additive Models of the standardized effect size of phylogenetic diversity and mean pairwise phylogenetic distance of angiosperms overall, palms and ferns on islands based on a global island species pool and three different regional species pool delineations as response variables and environmental predictor variables.
Supplementary References S1. Literature used to compile the global dataset of angiosperm, palm and fern species composition on 393 islands worldwide.

Supplementary References S2. References cited in this supplement.
physiological constraints like their lack of active stomatal control, their need for water for sperm movement and their comparatively simple xylem anatomy 21,22. Accordingly, fern floras on islands may be unbalanced compared to mainland floras, with an overrepresentation of families characterized by adaptations to islands' ecological niches 18.
We assembled plant species lists for marine islands from floras, checklists and online databases. From the more than 1000 species lists in our database, we included only those that claimed completeness (see Supplementary References S1). The final selection covered 393 islands: 375 for all angiosperms (flowering plants), 386 for palms and 328 for ferns. All species names (including subspecies name and author information if available) were matched to the working list of all known plant species, the Plant List, version 1.0 (www.theplantlist.org/1/). Genus names not found in the Plant List were manually checked for mistakes and validity according to  . Species names were matched to the Plant List using fuzzy matching and, if they were found to be synonyms, replaced by the names accepted by the Plant List. If a name could not be matched or its taxonomic status in the Plant List was unresolved, we used the Taxonomic Name Resolution Service provided by iPlant (tnrs.iplantcollaborative.org, accessed May 2, 2013) for taxonomic match-up.
Species names that were matched but not resolved by either service or that were not matched at all were used in their matched or original form, respectively (97.7% matched / 85.8% resolved for angiosperms, 97.2% matched / 65.3% resolved for ferns, 99.8% matched / resolved for palms; in total 95% using the Plant List, 5% using iPlant). All names entered further analyses at the species level.
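The name-matching logic described above can be illustrated with a minimal sketch. It is not the authors' workflow: the tiny reference table, the function name `resolve_name` and the similarity cutoff of 0.9 are assumptions made for illustration, and the fallback to a second name-resolution service is omitted. Fuzzy matching here uses `difflib.get_close_matches` from the Python standard library.

```python
from difflib import get_close_matches

# Hypothetical miniature reference list: name -> (status, accepted name).
REFERENCE = {
    "Cocos nucifera": ("accepted", "Cocos nucifera"),
    "Metrosideros polymorpha": ("accepted", "Metrosideros polymorpha"),
    "Nania polymorpha": ("synonym", "Metrosideros polymorpha"),
    "Asplenium nidus": ("unresolved", None),
}


def resolve_name(raw, reference=REFERENCE, cutoff=0.9):
    """Return (name_to_use, outcome) for one checklist name.

    The cutoff is an assumed similarity threshold, not a value from the text.
    """
    hits = get_close_matches(raw, list(reference), n=1, cutoff=cutoff)
    if not hits:
        # Not matched at all: the name is used in its original form.
        return raw, "unmatched"
    matched = hits[0]
    status, accepted = reference[matched]
    if status == "synonym":
        # Synonyms are replaced by the accepted name.
        return accepted, "resolved"
    if status == "accepted":
        return matched, "resolved"
    # Matched but unresolved: the name is used in its matched form.
    return matched, "matched_unresolved"
```

For example, `resolve_name("Cocos nucifra")` corrects the misspelling to the accepted name, whereas a name with no close counterpart in the reference list is passed through unchanged.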
Family assignment followed the Plant List, which corresponds to the Angiosperm Phylogeny Group (APG) classification III 24. To match the taxonomic concepts of the fern phylogeny, and to acknowledge recent advances in fern taxonomy, all fern names were additionally subjected to a comprehensive and careful taxonomic check (by M.K. and S.L.) so that genus and family assignments were up to date.
In order to link the species from the island checklists to the phylogenetic trees, phylogenies were pruned to family level (angiosperms and ferns) or genus level (palms and ferns) and species added as polytomies according to their family or genus membership.
For angiosperms, the original phylogeny 25 considered DNA sequence data for 560 species from 335 families and 45 orders, and was simultaneously estimated and dated using Bayesian methods based on 35 fossils and an additional age constraint for the root of the tree 25. For comparison with ferns, we pruned the phylogeny to family level (Supplementary Fig. S4). Five pairs of families that would otherwise not be monophyletic were merged (Supplementary Fig. S4). Sixty families representing 935 species were missing from the phylogeny and were manually added to the tree according to ref. 26 (... Trimeniaceae, Triuridaceae, Xyridaceae). Families thought to be the sister clade of a family in the tree were added at 2/3 of the stem age of the family in the tree. Families thought to be sister to larger clades were added half way between nodes. For the calculation of phylogenetic community metrics, the 32,446 angiosperm species from the island checklists were added to the family-level phylogeny as polytomies at 1/3 of the family stem node ages. The final phylogeny pruned to include only species present in the considered island floras comprised 315 families and merged groups (Supplementary Fig. S4).
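The placement rules above reduce to simple arithmetic on node ages, sketched below. The sketch assumes that "at 2/3 of the stem age" means a node age equal to two thirds of the stem age, measured from the present; the function names and the example age of 90 Ma are hypothetical.

```python
from fractions import Fraction


def sister_insertion_age(stem_age):
    """Node age for a missing sister family: 2/3 of the known family's stem age."""
    return float(Fraction(stem_age) * Fraction(2, 3))


def midpoint_insertion_age(older_node_age, younger_node_age):
    """Node age for a family sister to a larger clade: half way between two nodes."""
    return (older_node_age + younger_node_age) / 2.0


def polytomy_age(stem_age, fraction):
    """Age of the polytomy at which species are attached below a stem node."""
    fraction = Fraction(fraction)
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return float(Fraction(stem_age) * fraction)


def within_clade_distance(stem_age, fraction):
    """Pairwise distance between two species attached to the same polytomy."""
    return 2.0 * polytomy_age(stem_age, fraction)


# Hypothetical family with a stem age of 90 Ma.
family_polytomy = polytomy_age(90, Fraction(1, 3))          # 30.0 Ma
family_pair_distance = within_clade_distance(90, Fraction(1, 3))  # 60.0 Myr
```

Under this reading, any two species of the same family are separated by twice the polytomy age, so the chosen fraction sets the within-family distance but leaves the deeper, between-family branch lengths untouched.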
The palm phylogeny was based on a complete genus-level supertree of palms, dated using a Bayesian relaxed molecular clock approach with uncorrelated rates and calibrated using four palm fossil taxa and a stem node age constrained to 110 to 120 Ma 28. For comparison with ferns, we pruned the phylogeny to genus level (Supplementary Fig. S5). For the calculation of phylogenetic community metrics, the 1143 palm species from the island checklists were added to the phylogeny as polytomies at 2/3 of the genus stem node ages 10. All palm species included in the species checklists were represented by genera in the phylogeny. The final phylogeny pruned to only include species present in the considered island floras comprised 118 genera (Supplementary Fig. S5).
For ferns, we used a dated phylogeny based on a global fern phylogeny 29. This dataset was updated by querying GenBank release 184 (June 15, 2011), complemented with additional data not included in the queried release (KJ628500-KJ628963; KJ716370-KJ716414), and filtered to retain only those taxa that were represented in the dataset by at least two genes (one of which had to be rbcL) and more than 1000 base pairs of sequence data. Furthermore, the most similar taxa (defined by the pairwise distance of aligned sequences) were removed until no pair of taxa had a pairwise distance of less than 0.5%. This resulted in a taxonomically broad sample of 1118 taxa representing most extant fern genera. Molecular dating was based on an uncorrelated exponential relaxed clock analysis in BEAST 1.7.3 30, using 42 fossil-calibrated nodes and a partially constrained starting tree produced in RAxML 7.3.0 31,32. For comparison with angiosperms, we pruned the phylogeny to family level following the classification from ref. 33, and for comparison with palms, we pruned the phylogeny to genus level (Supplementary Fig. S6). A group of nine genera that would otherwise not be monophyletic was merged (Polypodiaceae A in Supplementary Fig. S6: Lemmaphyllum, Lepidomicrosorium, Lepisorus, Leptochilus, Microsorum, Neocheiropteris, Neolepisorus, Paragramma and Tricholepidium). The genus Odontosoria was split into an Old World clade and a New World clade to avoid polyphyly. Twenty-five missing genera representing 146 species were added to the tree manually according to information from the literature (Supplementary Fig. S6). Genera thought to be located inside genera in the tree were merged with the already present genera. Genera thought to be the sister clade of a genus in the tree were added at 2/3 of the stem age of the genus in the tree (Aenigmopteris, Austrogramme, Cerosora, Cheiroglossa, Oenotrichia, Paraselliguea, Scoliosorus, Syngramma, Taenitis and Vaginularia).
Genera thought to be sister to larger clades were added half way between nodes (Ananthacorus and Trachypteris). The genus Adenoderris (one species with one occurrence on Jamaica) was excluded due to its unknown phylogenetic position. For the calculation of phylogenetic community metrics, the 3689 fern species from the island checklists were added to the family-level phylogenies as tips at 1/3 of the family stem node ages and to the genus-level phylogenies as polytomies at 2/3 of the genus stem node ages. We chose 1/3 in the family phylogenies to account for the higher discrepancy between stem node ages of families and species when compared to genera and species in the genus-level phylogenies (2/3 stem node age). However, comprehensive sensitivity analyses of the palm phylogeny show that the specific age thresholds for polytomies do not qualitatively affect patterns and determinants of phylogenetic community structure 10 because the metrics are predominantly influenced by long branch lengths in the older parts of the phylogeny. The final phylogenies pruned to include only species present in the considered island floras comprised 42 families and 168 genera (Supplementary Fig. S6).
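The two data-reduction steps applied to the fern sequence dataset (the gene and length filter, followed by the removal of near-identical taxa) can be sketched as follows. The text does not state which member of a too-similar pair was removed, so dropping the taxon with less sequence data is an assumption of this sketch, as are the function names and the toy data.

```python
def passes_filter(genes, total_bp):
    """Keep taxa with at least two genes, one of them rbcL, and more than 1000 bp."""
    return len(genes) >= 2 and "rbcL" in genes and total_bp > 1000


def thin_taxa(taxa, dist, threshold=0.005):
    """Remove taxa until no retained pair is closer than the threshold (0.5%).

    taxa: dict mapping taxon name -> total base pairs.
    dist: dict mapping frozenset({a, b}) -> pairwise distance of aligned sequences.
    Assumption: of the closest pair, the taxon with less sequence data is dropped
    (ties broken alphabetically).
    """
    kept = set(taxa)
    while True:
        close = [(d, tuple(sorted(pair))) for pair, d in dist.items()
                 if pair <= kept and d < threshold]
        if not close:
            return kept
        _, (a, b) = min(close)  # the most similar remaining pair
        kept.discard(min((a, b), key=lambda t: (taxa[t], t)))


# Hypothetical toy data: four taxa, two of the pairs closer than 0.5%.
toy_taxa = {"A": 1500, "B": 1200, "C": 2000, "D": 1100}
toy_dist = {
    frozenset({"A", "B"}): 0.002, frozenset({"A", "C"}): 0.004,
    frozenset({"B", "C"}): 0.030, frozenset({"A", "D"}): 0.050,
    frozenset({"B", "D"}): 0.060, frozenset({"C", "D"}): 0.070,
}
retained = thin_taxa(toy_taxa, toy_dist)
```

Because the procedure is greedy, the retained set can depend on the tie-breaking rule; in the toy data, B is dropped first and A second, leaving C and D.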
Angiosperms, palms and ferns differ in age, number of species and major clades, and number of islands inhabited. However, the fern phylogeny encompasses a similar time span to the angiosperm phylogeny and an intermediate number of species compared to angiosperms and palms (Supplementary Figs S4, S5 and S6). In contrast to common belief, extant fern diversity is not older than angiosperm diversity; the largest fern lineages diversified in response to diversification in angiosperms 34. Differences between angiosperms, palms and ferns in the importance of dispersal, environmental filtering and diversification for phylogenetic assembly can therefore be compared directly and linked to differences in dispersal- and speciation-related traits.
The comparison between family- and genus-level analyses for ferns enabled us to scrutinize the sensitivity of our analyses to the resolution of the phylogenies. Both levels, however, provide sufficient detail to address our hypotheses (Fig. 1) and to disentangle patterns and determinants of phylogenetic structure of island floras, as most variation in branch lengths is in the basal parts of phylogenies. Thus, higher resolution in relationships among species is not expected to considerably influence general patterns and dependencies (compare sensitivity analyses in ref. 10). In addition, dispersal-related traits and environmental adaptations are phylogenetically conserved in many large and old clades 2,3,6. Thus, filtering mechanisms should affect phylogenetic structure regardless of whether family- or genus-level phylogenies are used. Furthermore, island radiations, which are usually young 35, are clearly distinguishable from relict lineages, which often go back far beyond genus and even family level (e.g. Amborellaceae 9), even in genus- or family-level phylogenies with species appended as polytomies. In fact, there are plenty of examples of island radiations producing up to hundreds of closely related species 7,36, leading to clustered island assemblages.
Our hypotheses on dispersal and environmental filtering as drivers of phylogenetic patterns assume phylogenetic signals in traits, which vary with phylogenetic scale 37. The more of the tree of life is encompassed, the more conservative the traits should be 4. However, if traits of clades of different biogeographic regions have converged, conservatism may diminish 4,37, hampering comparisons among phylogenies. We could not test for a phylogenetic signal of traits in the phylogenies, but previous studies suggest that dispersal-related traits like seed size and dispersal mode, and adaptations to climates, are phylogenetically conserved in many large and old clades 2,3,6,38 (Supplementary Text S1). Furthermore, our results may help to understand patterns arising from different levels of phylogenetic signal in traits. The environmental models explained varying proportions of variance for angiosperms, palms and ferns (Table 1), suggesting differences in both predominant traits and levels of trait conservatism.
Tree editing was performed with R statistical software version 3.0.1 (R Development Core Team, available at cran.r-project.org) using the package ape 39. Phylogenetic community metrics were calculated using the package picante 40.
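As an illustration of what a standardized effect size of mean pairwise phylogenetic distance (MPDes) measures, the following sketch compares the observed MPD of an assemblage with MPD values of random draws of equal richness from a species pool. It is a stand-in written in Python, not the picante code used for the analyses; the null model (random draws from the pool), the number of runs, the seed and the toy distance matrix are assumptions. Phylogenetic diversity (PDes) is standardized analogously but is not shown here.

```python
import itertools
import random
import statistics


def mpd(species, dist):
    """Mean pairwise phylogenetic distance among the species of one assemblage."""
    pairs = list(itertools.combinations(sorted(species), 2))
    return sum(dist[p] for p in pairs) / len(pairs)


def ses_mpd(assemblage, pool, dist, runs=999, seed=1):
    """Standardized effect size: (observed - mean of null) / sd of null."""
    rng = random.Random(seed)
    observed = mpd(assemblage, dist)
    null = [mpd(rng.sample(pool, len(assemblage)), dist) for _ in range(runs)]
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Hypothetical pool: two clades of three species each; species within a clade
# are 20 Myr apart, species of different clades 200 Myr apart.
pool = ["a1", "a2", "a3", "b1", "b2", "b3"]
dist = {(x, y): (20.0 if x[0] == y[0] else 200.0)
        for x, y in itertools.combinations(pool, 2)}

clustered = ["a1", "a2", "a3"]          # all species from one clade
clustered_ses = ses_mpd(clustered, pool, dist)
```

An assemblage drawn entirely from one clade has the smallest possible MPD in this pool, so its standardized effect size is negative, which is the signature of phylogenetic clustering.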

Supplementary Methods S2. Statistical models and spatial autocorrelation.
To account for spatial autocorrelation in the residuals of the best non-spatial model of each plant group, we applied spatial eigenvector filtering 41. We applied principal coordinate analysis to a neighbourhood matrix (PCNM) to decompose geographic distances between island centroids into orthogonal spatial eigenvectors. Spatial distances were truncated by replacing distances larger than 1000 km with 4000 km to emphasize spatial autocorrelation at relatively small scales 41. All eigenvectors with positive eigenvalues were considered, as they represent positive spatial autocorrelation at different spatial scales. Following ref. 42, we consecutively added spatial filters as linear effects to the best models until residual spatial autocorrelation was no longer significant. In each round, the spatial filter that best reduced residual Moran's I values was retained in the model for the next round. Moran's I values were calculated for varying neighbourhood structures considering the k = 1 to 25 nearest neighbours; the highest significant Moran's I value was always considered. Afterwards, the model selection procedure to find the best model and the model averaging based on Akaike's Information Criterion corrected for small sample sizes were repeated, with the identified set of spatial eigenvectors included in each model.
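Two building blocks of this procedure, the truncation of the distance matrix and the calculation of Moran's I for a k-nearest-neighbour structure, can be sketched in plain Python. This is an illustration only, not the vegan and spdep code used for the analyses: the function names and toy data are hypothetical, binary neighbour weights are assumed, and neither the principal coordinate analysis nor the significance test of Moran's I is included.

```python
def truncate_distances(dist, threshold=1000.0):
    """Replace distances above the threshold by 4 x threshold (1000 km -> 4000 km)."""
    return [[d if d <= threshold else 4.0 * threshold for d in row] for row in dist]


def knn_weights(dist, k):
    """Binary weights: w[i][j] = 1 if j is among the k nearest neighbours of i."""
    n = len(dist)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: (dist[i][j], j))
        for j in order[:k]:
            w[i][j] = 1.0
    return w


def morans_i(values, w):
    """Moran's I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(v * v for v in z)


# Hypothetical example: four islands on a line at 0, 1, 10 and 11 km.
positions = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
weights = knn_weights(toy_dist, k=1)
i_clumped = morans_i([1.0, 1.0, 5.0, 5.0], weights)      # neighbours alike
i_alternating = morans_i([1.0, 5.0, 1.0, 5.0], weights)  # neighbours unlike
```

In an application along the lines described above, `morans_i` would be evaluated on model residuals for each k from 1 to 25, and candidate spatial eigenvectors would be compared by how much they lower the resulting values.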
Analyses were performed with R statistical software version 3.0.1 (R Development Core Team, available at cran.r-project.org) using packages mgcv 43 for Generalized Additive Models, MuMIn 44 for model selection and averaging, vegan 45 for PCNM and spdep 46 for spatial autocorrelation assessment.
Table S1. Pearson correlations of phylogenetic community metrics within angiosperms, palms and ferns on islands worldwide. Metrics were calculated for angiosperms based on the dated family-level phylogeny from ref. 25, and for palms based on a dated genus-level phylogeny. For comparison, metrics for ferns were calculated using phylogenies at both family and genus levels. MPDes = standardized effect size of mean pairwise phylogenetic distance, PDes = standardized effect size of phylogenetic diversity (PD). n = 363 islands for all angiosperms, n = 71 islands for palms and n = 234 islands for ferns. Coefficients and p-values were corrected for spatial autocorrelation following ref. 47. Significance: * p < 0.05, ** p < 0.01, *** p < 0.001.
Criterion corrected for small sample sizes of the relationships of the standardized effect size of phylogenetic diversity (PDes) of angiosperms, palms and ferns with environmental factors on islands.

In addition to the parameters shown here, the models included spatial eigenvectors to account for spatial autocorrelation. For angiosperms, PDes was calculated based on a dated family-level phylogeny, and for palms based on a dated genus-level phylogeny. For comparison, PDes of ferns was calculated using phylogenies at both family and genus levels (n = 363 islands for all angiosperms, n = 71 islands for palms and n = 234 islands for ferns). R² is a partial R² for the predictor variables, removing the effect of the spatial eigenvectors; edf = effective degrees of freedom; column t/F-value contains t-values in case of intercepts and linear effects (edf = 1) and F-values in case of smooth terms
with a global island species pool and three different regional species pool delineations (standardized effect size of phylogenetic diversity (PDes) and mean pairwise phylogenetic distance (MPDes)).
Regional species pools include all species of all islands that 1) belong to a particular floristic realm after Takhtajan
with environmental factors on islands. In addition to island age, the full models included ten environmental predictors and spatial eigenvectors to account for spatial autocorrelation. The effect of island age was not significant in any averaged model (p > 0.05). Regression lines are therefore not plotted. PDes and MPDes were calculated based on a dated family-level phylogeny for angiosperms (orange) and based on a dated genus-level phylogeny for palms (red). Metrics for ferns (blue) were calculated based on a dated family-level phylogeny for comparison with angiosperms (column 1), and based on a dated genus-level phylogeny for comparison with palms (column 2). Only islands with at least two species of the focal group and with information on island age were included in models (n = 187 islands for all angiosperms, n = 31 islands for palms only and n = 138 islands for ferns).
Figure S7. Moran's I correlograms of spatial autocorrelation for the standardized effect size of phylogenetic diversity (PDes; Response variable) of angiosperms, palms and ferns on islands; residuals from the best Generalized Additive Models of PDes in dependence on environmental predictors (Residuals (non-spatial)); and residuals from best spatial models (Residuals (spatial)) including a set of e spatial eigenvectors to reduce spatial autocorrelation (see Supplementary Methods S2 for details). In (a), PDes was calculated based on dated family-level phylogenies of angiosperms and ferns. In (b), PDes was calculated based on dated genus-level phylogenies of palms and ferns.