Epiphytic leafy liverworts diversified in angiosperm-dominated forests

Recent studies have provided evidence for pulses in the diversification of angiosperms, ferns, gymnosperms, and mosses as well as various groups of animals during the Cretaceous revolution of terrestrial ecosystems. However, evidence for such pulses has not been reported so far for liverworts. Here we provide new insight into liverwort evolution by integrating a comprehensive molecular dataset with a set of 20 fossil age constraints. We found evidence for a relative constant diversification rate of generalistic liverworts (Jungermanniales) since the Palaeozoic, whereas epiphytic liverworts (Porellales) show a sudden increase of lineage accumulation in the Cretaceous. This difference is likely caused by the pronounced response of Porellales to the ecological opportunities provided by humid, megathermal forests, which were increasingly available as a result of the rise of the angiosperms.


Analyses with R
The following analyses were performed with R 1 (http://www.r-project.org/).

General remarks
A dated phylogenetic framework was reconstructed by estimating simultaneously divergence time estimates and topology using a bayesian divergence time estimate approach as implemented in Beast 1.6.2. The analysed dataset comprised representatives of the leafy and simple thalloid liverworts (class Jungermanniopsida) plus a few representatives of the complex thalloid liverworts (class Marchantiopsida) utilized as outgroup taxa. A taxonomical overview of liverworts including thorough morphological discussion can be found in 2 . The sampling was designed to obtain a proportional sampling of the two orders Jungermanniales and Porellales via random sampling within clades that are known to be phylogenetically robust. Except for 1.1.7 and 1.1.8, four independent analyses were run with the sampling corresponding to the hierachical classification: the class Jungermanniopsida (simple thalloid and leafy liverworts), the subclass Jungermanniidae (leafy liverworts), as well as the orders Porellales (mainly epiphytic leafy liverworts) and Jungermanniales (mainly generalistic leafy liverworts). This was done to investigate whether the results for Jungermanniopsida (comprising all investigated taxa) were challenged if the subclades were analysed separately (subsampling effects). For analyses 1.1.2, 1.1.4, 1.1.6, 1.1.7, 1.1.8, and 1.1.9 the maximum credibility tree obtained from the Beast analyses was used. For analyses 1.1.3 and 1.1.5, 100 trees were selected randomly from the trees saved in the convergence phase of the Beast analysis used to estimate the maximum credibility tree. These analyses allowed us to explore the influence of phylogenetic uncertainty to our investigation. The assignments of the fossils are discussed in section 1.2.1.

Penalized Likelihood (Chronoplots with APE)
This method was applied to explore the substitution rate changes and the influence of the fossil assignments on our investigation. In particular, we investigated if the studied changes in the diversification rates show correlations with changes in the substitution rates. To visualize rate changes across the maximum credibility tree the penalized likelihood 3 (PL) approach as implemented in the chronoplot function of APE 4 (http://cran.rproject.org/web/packages/ape/index.html) was performed for Jungermanniopsida, Jungermanniidae, Jungermanniales and Porellales. To find the smallest smoothing parameter lambda that allows the maximal variation of the substitution rates in the subsequent PL, the cross-validation option was employed (values for the smoothing parameter lambda ranging from 10e-4 to 10e4). Nine values of lambda were sampled and plotted against the sum of the cross-validation values, then the analysis was performed with the smallest value for lambda 5 . The results for the cross-validation of the four clades are displayed in Supplementary 2.2. Figs 3-6 in Supplementary 2.2 show the rates plotted on the four clades of the maximum credibility tree from the Beast analysis.

Lineages through time plots with TreeSIM
To visualize the lineage diversification of the four clades (see 1.1.1) in time with confidence intervals, lineages through time plots (LTTP) were estimated with the R package TreeSIM 6 (http://cran.rproject.org/web/packages/TreeSim/index.html) for 100 random trees from the convergence phase of the Beast analysis. These plots provide a visualisation of the accumulation of extant diversity trough time (shown at the x-axis in million years towards the present (Ma). The y-axis plotted in a logarithmic scale shows the proportion of lineages present at a given time interval. By doing so, we were able to integrate phyologenetic uncertainty (composed by topological uncertainty and variation of branch length estimates).

Gamma Statistics with APE and MCCR test with GEIGER
The gamma statistics of Pybus & Harvey 7 is a constant rates (CR) test that evaluates the null hypothesis that speciation (λ) and extinction (µ) rates are constant through time using the internode intervals in a tree. Given a pure birth process (yule process) and a complete phylogeny the gamma values (γ) show a normal distribution with a mean zero and a standard deviation 1. The estimated γ shows to what extent a given phylogeny differs from this distribution. A negative γ (γ < 0) means that the cladogenic events are distributed more towards the root of the tree than can be expected under a pure birth model, indicating declining diversification rates. Positive values (γ > 0) indicate a concentration of cladogenic events towards the tip nodes and therefore an acceleration of diversification. The null hypothesis of a constant rate is rejected when γ < -1.645 or γ > 1.645. γ was estimated with the "gammaStat" function of APE 4 for the four clades of the maximum credibility tree (see 1.1.1). To account for the incompleteness of the phylogeny that may have biased the γ towards negative values, the null distribution of γ is estimated with the MCCR Test 7 . If the observed γ falls into the tail of this null distribution even with incomplete taxon sampling, there are more cladogenetic events early or later in the history of the tree than expected under a constant rates model. The birth-death simulation with 500 replicates and random pruning was carried out with GEIGER 8 (http://cran.r-project.org/web/packages/geiger/index.html) for the four clades recognised in the credibility tree (see above).

Estimation of diversification rate shifts with TreePAR
The R package TreePAR 9 (http://cran.r-project.org/web/packages/TreePar/index.html) includes functions to estimate and visualize diversification rates shifts for a set of trees. The software was designed to indentify evidence for changes in speciation (λ) and extinction (µ) rates as well as mass extinction events using not a single tree but evaluating the hypothesis accross several trees generated for the same dataset. The function "bd.shifts.optim", that estimates the maximum likelihood speciation and extinction rates as well as the rate shift times, was run with 100 random trees for the four clades (see 1.1.1). These analyses provide additional evidence for the detected rate changes under consideration of phylogenetic uncertainty measured using 100 randomly sampled trees (see above).

Modelfitting with LASER
To test which evolutionary model fits best to the branch times of the maximum credibility tree (see 1.1.1), we used the functions "fitdAICrc" and "yule-n-rate" in implemented LASER 10 (http://cran.rproject.org/web/packages/laser/index.html). The fitdAICrc function fits a set of rate-variable and rate-constant models to the branching times and compares the AIC (Akaike Information Criterion) scores to find the best fitting models. The analysis was run for the following models: pure birth (yule), birth-death, DDL, and DDX. The "pureBirth" function models the speciation rate (λ) and the "bd" function additionally the extinction rate (µ); both are constant rate models. The diversity density dependend models DDX (taxa dependent) and DDL (rate dependent) were used beside the other models. The "yule-n-rate" function fits models with up to 5 rate changes under a pure birth model to a set of branching times and estimates the time of the rate change, again an AIC score is given. The four clades (see 1.

MEDUSA
With the "medusa" function of the R package GEIGER 8 it is possible to estimate breakpoints in diversification by fitting diversification models (yule, birth-death, mixed) to a phylogeny. This approach allows to correct the impact of the incomplete sampling by considering the estimates of species diversity of terminal nodes. In this case, we used estimates for the species richness of each family of liverworts. To choose the best fitting model the AIC (Akaike Information Criterion) and the AICc (corrected Akaike Information Criterion) were employed. It was necessary to create a matrix of taxonomic richness containing the species numbers of genera or families as well as a reduced tree containing only one representative of the genus or family. With Mesquite 11 (http://mesquiteproject.org) the maximum credibility tree of Jungermanniopsida (see 1.1.1) was reduced to one species per family and a matrix with the species numbers per family was build. The following arguments were employed: AICc and AIC as criterions, "mixed" as model (considers yule and birth-death).

BiSSE with DIVERSITREE
To estimate if the characters "epiphytism" and "generalism" have a significant influence on the evolution of species, a BiSSE (binary state speciation and extinction) 12,13 analysis was performed with the R package DIVERSITREE 12 (http://cran.r-project.org/web/packages/diversitree/index.html) for the maximum credibility tree and the clade Jungermanniopsida. The speciation rates (λ), extinction rates (µ) and character state transition rates (q) for epiphytes (E) and generalists (G) were estimated with the "make.bisse" function with maximum likelihood as well as bayesian inference under an exponential prior distribution 13,14 . We explored in total seven models with one model having six (no constant rates), three models having five (one constant rate), and three models having four parameters (two constant rates). For each model, the likelihood values were used besides AIC to rank the models hierarchically. Significance was tested using a chi-square test to estimate p-values. Particular care was given to observe the test statistics to identify conflicts. Character dependence of rate changes were also inferred using a sister comparison carried out by a Wilcoxon test. This analysis provided additional support to the hypothesis estimated independently from the BiSSE approach.

Estimation of speciation and extinction rates with DIVERSITREE
To explore the robustness of the test statistics, we employed the MCMC-based versions of BiSSE embedded in DIVERSITREE 14 : We calculated the distribution T null of each parameter set considered in the models above under a birth-death process. The constant rate function ("make.bd" function of DIVERSITREE) was employed when applicable. The obtained test statistics was used to explore the significance of the estimated model parameters in combination with the test statistics used to explore the best fitting models.

General remarks
Fossils are assigned as most recent common ancestors (mrca) based on comparison with the extant liverwort diversity and phylogenetic evidence. Information on the morphology of the fossils was taken from the literature and/or based on the study of original material including types. In the framework of this study, we investigated some

Amber fossils:
A. Burmese amber Based on biostratigraphic work 40,41,42 Burmese amber is of mid Cretaceous origin (~ Cenomanian); therefore 99.6 Ma was employed as minimum age.
Frullania cretacea: The cosmopolitan genus Frullania is represented in numerous Cenozoic amber inclusions, and also present in mid Cretaceous Burmese amber (for review see 43 ). It is characterized by reddish gametophytes with incubously-inserted, complicate bilobed leaves with the ventral lobe often forming a watersac. Several Burmese amber inclusions have been assigned to the extinct F. cretacea; a single gametophyte fragment from the same deposit has been described as F. baerlocheri. Since these fossils match the morphology of the Frullania crowngroup, they were treated as their mrca.

B. Baltic and Bitterfeld amber
The age of the Eocene Baltic amber is estimated as 45 . It is a very prominent source of fossil liverworts 23 . Numerous morphological details of these inclusions are preserved allowing for reliable assignments of the related species to extant genera. The Bitterfeld amber is of Oligocene origin and ca. 23 Ma old 46,47,48 , and includes a liverwort diversity that is comparable to that of Baltic amber.

B.1. Porellales
Ptilidium spec.: A single fossil from Baltic amber was assigned to the extant species Ptilidium pulcherimum 23 .
The deeply lobed leaves with long ciliate appendages are indicative of Ptilidium, but the extensive sequence similarities of the extant sister species P. pulcherrium and P. ciliare 22,49 as well as slight differences in leaf shape oppose this species assignment. Accordingly, the fossil was treated as Ptilidium spec. and assigned to the clade containing the genera Ptilidium, Neotrichocolea and Trichocoleopsis. grandiloba and P. obtusata 23,24 . Due to the similaties to extant species in leaf and underleaf morphology, it was assigned to the Porella crown group.
Radula oblongifolia and Radula sphaerocarpoides are represented in Baltic and Bitterfeld amber; both species clearly belong to Radula based on the presense of Radula-type branching, incubous, complicate bilobed leaves and the absence of amphigastria. R. sphaerocarpoides is known only in sterile condition whereas some specimens of R. oblongifolia include fertile material with sporophytes 23 . The morphological overlap of both species with extant representatives of Radula allows for a crown group assignment. We abstain from an assignment to subclades of Radula because of the morphologically diffuse nature of these clades 50 .
Nipponolejeunea europaea: Gametophytes of this genus occur frequently in Baltic and Bitterfeld amber. A Baltic specimen was described as the extinct Nipponolejeunea europaea 28

B.2 Jungermanniales
Bazzania polyodus: This species was recognized in Baltic and Bitterfeld amber and identified as a member of the extant genus Bazzania 34 based on the incubous leaf insertion, bifid leaves, and irregularly quadrilaciniate underleaves. It is treated as mrca of Bazzania since its incomplete preservation hampers a comprehensive comparision to extant species.
Plagiochila groehnii: Two sterile shoot fragments from Baltic amber with alternating, toothed, wide spreading leaves are indicative of Plagiochila and resemble the extant P. sciophila 35 . Since taxonomically relevant fertile structures and branches are unknown, the species was not assigned to a certain crown group lineage but treated as its mrca.
Calypogeia stenzeliana: This fossil species is known only in sterile condition and is characterized by incubous leaves, bilobed amphigastria and sometimes tapering shoot apices. The vegetative structures resemble those of the Holarctic Calypogeia suecica 23 ; lack of fertile structures and sporophytes only allow a treatment as the mrca of the Calypogeia crown group.
Cylindrocolea dimorpha: Several inclusions of ventrally branched, minute plants with succubously inserted, bilobed leaves in Baltic and Bitterfeld amber have been assigned to this morphologically variable, extinct species. The taxon was first assigned to Cephaloziella 34 but subsequently transferred to Cylindrocolea based on the lack of gemmae 23 . Since Cylindrocolea is probably nested in Cephaloziella, the fossil taxon was treated as mrca of a well supported clade containing both genera.
Lophozia kutscheri: This fossil species is characterized by 2-3-lobed, transversely inserted leaves with gemmiferous apices. It was assigned to a widely circumscribed genus Lophozia including Barbilophozia, and compared to the extant Barbilophozia hatcheri 23 . In recent years the genus Barbilophozia was separated from Lophozia based on molecular phylogenetic and morphological evidence 53 ; hence, L. kutscheri was assigned as mrca of this genus.  38 and subsequently recognized in Baltic amber 39 . The morphology of the fossil species indicates an affiliation to the extant subgenus Scapania 51,54 ; hence S. hoffeinsiana was treated as the mrca of this subgenus.

C. Dominican amber
Dominican amber is of early to middle Miocene origin (15-20 Ma) 55 . It contains numerous inclusions of leafy liverworts, especially of the order Porellales 32 : Marchesinia pusilla: Two species of Marchesinia were recognized by Gradstein 30 . One sterile inclusion was assigned to the extant M. brachiata, but the assignment needs verification according to 56 Table 1. This is the chronogram as shown in Fig. 1.

Penalized Likelihood (Chronoplots with APE)
The mean substitution rates were estimated as relatively similar for all clades (Supplementary 2.2. Table 1), while the values for the minimum rates differ considerably, especially between Porellales and Jungermanniales, with the latter having much higher substitution rates. This is also apparent in the chronograms (Supplementary 2.2. Figs 2-5), with the colours corresponding to the substitution rates. There seem to be more rate changes within Porellales and some branches show very slow rates from 0.0-0.4 substitutions. Jungermanniales show mostly faster rates, with only few branches having a rate of 0.8-0.6. In summary, the results document the influence of the assigned fossil constrain and the absence of significant variation of substituation rates among the main clades of liverworts studied despite the lack of evidence for a constant molecular clock. years shown again using the hierachical classification with class, subclass and orders (see 1.1.1). No major effect was detected for the subsampling and no evidence was found for a significant difference of the substituton rates estimated for the two orders.

Lineages through time plots with TreeSIM
The lineages through time plots (LTTPs) were estimated with 100 randomly chosen trees from the convergence phase of the Beast 1.6.2 analysis (see 1.1.1). The resulting confidence intervals are not very large and show the same trends as the LTTPs estimated for the four clades of the maximum credibility tree (Fig. 1B). To visualize times of putative rate changes the shift times estimated with LASER are included as arrows.

Gamma Statistics with APE and MCCR test with GEIGER
The significantly negative gamma values (γ; Supplementary 2.4. Table1) estimated for the four subsets (see 1.1.1) suggest that the cladogenic events are disproportionately distributed towards the root of the tree. This rejects the hypothesis that the diversification rates remained constant over time, indicates decreasing diversification rates and may also be an indicator of a rapid initial diversification 8 . All estimated γ values, except for Porellales, are well outside the null distribution (Supplementary 2.4. Fig. 1). The p-values indicate an overestimation of the γ due to incomplete taxon sampling. With a value of -2.1 the γ of Porellales falls just under the tail of the null distribution (Supplementary 2.4. Fig. 1 Fig. 1: Visualiyation of the null distributions of the gamma statistics with p-values derived from the MCCR test. The red arrows indicate the γ value as estimated with the CR test. All CR values were placed in the tails of the estimated distributions and were found significant.

Estimation of diversification rate shifts with TreePAR
These analyses were employed to explore rate-shifts accross 100 randonly sampled trees obtained from the convergence phase of the Beast analysis for four samplings: Jungermanniopsida, Jungermannidae, Porellales, and Jungerananiales (see 1.1.1).
Supplementary 2.5. Fig. 1: Vizualization of the output of the TreePAR analyses with the time towards the presents plotted on the x-axis in million years towards the present, and the diversification rates plotted onto the y-axis. The blue lines show the distribution of rates for each tree considered. All analyses recovered a reduction of the diversification rate after 40 million years towards the present, whereas most trees of the Porellales showed also an increase of the rates around 80 to 60 million years towards the present.

Modelfitting with LASER
These tables present the results optained with the "fitdAICrc" and the "yule-n-rate" functions. The results for analyses with rate constant models are shown in Supplementary 2.6. Table 1 and the results for the rate variable models in Supplementary 2.6. Table 2. The rate constant model "pure birth" estimates only the speciation rate (λ) while the rate constant birth-death model additionaly estimates the extinction rate (µ). In this analysis the pure birth model was chosen with the Akaike Information Criterion (AIC) as the best fitting constant rate model (best). The DDX and DDL models fit density dependent speciation models to branch times and assume one rate shift. The "yule-n-rate" function can estimate up to 5 rate shifts as well as the shift time based on a pure birth (yule) model. Supplementary 2.6. Table 2 shows the results for the constant rate models The yule-5-rate model was chosen as best fitting model (best in Supplementary 2.6. Table 2) based on the AIC score. It is seen here as the best fitting of all models tested because it received a better AIC score than the pure birth model.

BiSSE
Sumary of the BiSSE analyses with and without the addition of DIVERSTREE analyses carried out to test evidence suggesting dependence of diversification rate changes from ecological preferences. We added also a BiSSE independent test(Supplementary 2.8.1. Table 1).

Wilcoxon statistics to explore rate dependence of ecological preferences in sister clades
Clades with more than 50% epiphytes were considered as epiphytic, whereas clades with less than 50% epiphytes were classified as generalists. Test for group 1 and 2 compared the diversification rates of families classified as epiphytes versus families classified as generalists. Group 3 and 4 compared the difference in the diversification rates ofsister clades sharing the same ecological prefereence or show distinct ecological differences.  Table 1: Results of Wilcoxon statistics. P < 0.5 was considered to be significant. Thus, rate changes are associated with sister clades showing the transformation of ecological preferences, while epiphytic families did not show significant differences in their speciation rate compared to generalist families.

BiSSE with DIVERSITREE
For the BiSSE analysis with DIVERSITREE a character matrix with the substrate preference (E = epiphytic, G= generalistic) was assigned to the maximum credibility tree of Jungermanniopsida. Speciation rates (λ), extinction rates (µ) and character state transitions between the states (q) were estimated with maximum likelihood and Bayesian inference. The analyses were performed with several parameter settings for which the likelihood and Akaike Information Criterion (AIC) values of the maximum likelihood estimation were best: 6 parameter (λ E ≠ λ G , µ E ≠µ G , q E>G ≠q G>E are estimated independently), 5 parameter (λ E ,=λ G , µ E ≠µ G , q E>G ≠q G>E ; λ E ≠λ G , µ E =µ G ,q E>G ≠q G>E , λ E ≠λ G , µ E ≠µ G , q E>G =q G>E ) and 4 parameter λ E ,=λ G , µ E ,=µ G , q E>G , q G>E ; λ E , ≠λ G , µ E ,=µ G , q E>G =q G>E ; λ E ,=λ G , µ E , ≠µ G , q E>G =q G>E ). The p-values indicate that the differences between the variables of all models are significant.  Fig. 1: Posterior probability distribution from the analysis with 6 variable parameters (λ: speciation rates, µ: extinction rates, q: transition rates).