Omnivory in birds is a macroevolutionary sink

Diet is commonly assumed to affect the evolution of species, but few studies have directly tested its effect at macroevolutionary scales. Here we use Bayesian models of trait-dependent diversification and a comprehensive dietary database of all birds worldwide to assess speciation and extinction dynamics of avian dietary guilds (carnivores, frugivores, granivores, herbivores, insectivores, nectarivores, omnivores and piscivores). Our results suggest that omnivory is associated with higher extinction rates and lower speciation rates than other guilds, and that overall net diversification is negative. Trait-dependent models, dietary similarity and network analyses show that transitions into omnivory occur at higher rates than into any other guild. We suggest that omnivory acts as macroevolutionary sink, where its ephemeral nature is retrieved through transitions from other guilds rather than from omnivore speciation. We propose that these dynamics result from competition within and among dietary guilds, influenced by the deep-time availability and predictability of food resources.

: Distribution of species without molecular data (3323 species, in black) and with molecular data (6670 species, red). The species without molecular data were inserted into the molecular phylogeny through a Yule process.
Supplementary Figure 14: First 3 principal components of PCA using the scores of each food item for omnivore species as variables plotted against one another. It is possible to see that herbivores and piscivores are similar to omnivores in all combinations, potentially reflecting the high transition rates from these two guilds into omnivores. Also possible to see that omnivory overlaps with more than 2 guilds in all axes. Simulated scenarios where rates of diversification were not associated with trait states were unable to produce simulations that were similar to empirical data. Even with the 8 viable simulations for fitting, we could not observe any similarity between those fittings and the empirical results (posterior distributions do not show omnivores [or any other guild] as having negative net diversification rates, nor they show one guild with clearly distinct rates). The lack of viable simulations and differences in rate estimates and trait state distributions indicate that without an associated speciation and extinction dynamics the transition rates alone together with the empirical tree topologies cannot recover the observed empirical patterns.    The distance values represent the number of permutations needed to transform one tree into another tree. The results show that the relative difference between the two backbones is similar to the difference between each backbone and either one of the two new phylogenetic hypotheses. This suggests that phylogenetic heterogeneity (or uncertainty) as incorporated in our analysis (the use of trees from both backbones) encapsulates an amount of phylogenetic dissimilarity that is comparable to the one captured when comparing the backbone trees with other phylogenetic hypothesis. Taken together, those two previous results suggest that when simulations ignored the rates of speciation and extinction associated with trait states, the vast majority of these simulations resulted in scenarios, which differed from the empirical data either in relation to the proportion between guilds and/or from the perspective of rate estimates, to the point that those were not used or had to be discarded to the fitting exercise ( Supplementary Figure16).
The fitting of MuSSE was only possible on 8 simulations and in none of these remaining simulations we observe negative diversification rates for omnivores (Supplementary Figure 17). In fact, no trait state was estimated to be associated with negative net diversification rates on those 8 simulations, and the rate estimates were not even vaguely similar to the empirical estimates. With respect to our empirical case, these simulation results suggest that the transition dynamics (without changes in speciation and extinction due to trait states) and the specific tree topology alone are not capable of reproducing the empirical diversification dynamics (Supplementary Figure 17). Thus, we suggest that our results are unlikely to be caused by model misbehavior, but rather by an association between trait states (diets) and diversification rates.

A2) Posterior predictive simulations to test for model adequacy
The results from the simulation that used the empirical rate estimates associated with different dietary guilds produced a distribution of proportions between guilds that lies around the proportions observed in the empirical data (Supplementary Figure 18). This indicates that the rates estimated by our analysis can generate plausible scenarios (Supplementary Figure 18). We interpret this result as evidence that the models used to characterize the diversification dynamics associated with the different dietary guilds are adequate.

A3) Model fitting in sub-clades
The results from the sub-clades analyses showed a similar pattern for the extinction regimes associated with omnivory. Here, as in the main analysis of the whole tree, extinction rates associated with omnivory tended to be either indistinguishable from other guilds or higher than in other guilds (Supplementary Figure 19). The exceptions were the extinction rates of herbivores in both Passeriformes ( Supplementary Figure 19.a) and Psittaciformes (Supplementary Figure 19.c). We suggest that these very high extinction rate estimates for herbivores result from the small number of herbivore species present in each of those orders (12 species in Passeriformes and 10 species in Psittaciformes) and therefore are in fact unreliable estimates.
Supplementary Figure 19 also shows that estimates of speciation rates of omnivores are either not significantly different or higher than the speciation rates of other guilds for all four orders (Supplementary Figure 19.a,c,e,g). This is different from what we discovered in the main analysis, but we note that the transition dynamics between guilds are also different from the dynamics estimated for the whole tree with the exception of Piciformes, which show transitions towards omnivory. We suspect that using sub-clades will lead the models to preferentially assign the origin of new omnivore species to speciation rather than to transitions because fewer transitions remain at the sub-clade level.
Although these sub-clade results indicate some differences to the whole tree analysis, the direct interpretation of the sub-clade analysis in xxSSE models other than BiSSE may not be as straightforward as previously advocated for BiSSE (68). For a binary-state character all sub-clades of a given tree will have the same number of states (two) as the whole tree, but when working with multi-state characters this might not be necessarily true (for example the order Passeriformes comprise more than half of all bird species, but no carnivore, piscivore or scavenger species are present). Second, as previously mentioned, dividing a tree into sub-clades might obscure the dynamics of transitions, because a state that might have had multiple origins in the whole tree might be perceived as having few transitions in a sub-clade that has, for example, just one or very few transitions. Finally, when looking at sub-clades some states may be very under-represented. According to (1), the low relative number of species in a given state leads BiSSE models to incorrectly estimate the parameters. This might be also the case in other xxSSE models, especially when the absolute number of species is low (for example the very low number of herbivore species in both Passeriformes and Psittaciformes, i.e. 12 and 10, respectively). Given those constraints, we do not expect that the sub-clade analysis always replicates the results from the whole tree analysis. Therefore, the results from this analysis should be taken with caution when using a MuSSE model.

A.4) General conclusion on the use of MuSSE in our data.
To conclude, the first two sets of tests provided good evidence for the proper behavior of the model to estimate true state-dependent speciation, extinction and transition rates. The results from the sub-clade analysis were rather inconclusive, although some trends (e.g. extinction regimes) were similar to the whole tree analysis. We suggest that the overall signal from the sensitivity analyses provides evidence that the model is providing reliable results, and that the qualitative results and conclusions drawn from the whole tree analysis are robust.

B) Sensitivity analysis for dietary classification
The results shown in supplementary figures 20-24 suggest that a more inclusive categorization of omnivory would result in the same patterns as observed in the main analysis. In this new classification scheme, omnivores also showed either lower speciation and/or higher extinction rates when compared to all other dietary guilds. This ultimately results in a lower (and again even negative) net diversification rate for this guild. Additionally, the observed transition rate patterns are similar to what we recover in our main analysis and support the scenario that poses omnivory as a macroevolutionary sink. We therefore suggest that our main conclusions are independent to the classification scheme used.

C) Comparison of backbone trees
It is possible to see that the two backbones used on our study show a level of phylogenetic dissimilarity that is comparable to the one captured when comparing those backbones to other recent phylogenetic hypothesis (supplementary figure 25). This suggests that phylogenetic uncertainty incorporated in our analysis (the use of trees from both backbones) is comparable to the ones captured from the two other phylogenetic hypotheses.

A) Assessing xxSSE model limitations
We and assumed symmetrical transition rates. We suggest that this way of implementation is inappropriate for testing Type I error rates in xxSSE models.
In the case of the xxSSE models, to properly test for a Type I error, the null model should be constructed with the premise that the tree itself was generated by a constant-rate birth-death process. If this is ignored, a potential false association between a simulated neutral character and rates of diversification deviate from the null scenario of constant-rate birth-death process. However, developing a null model that incorporates the rate heterogeneity as commonly seen in empirical trees to verify a potentially spurious association is not an easy task. Hence, rather than trying to develop a test to detect Type I errors, we think it is more appropriate to evaluate the reliability of empirical results by firstly decoupling the transition dynamics from the speciation/extinction dynamics and secondly testing whether similar associations between trait states and speciation and extinction rates can be recovered. The latter ones are ultimately the processes that determine rate heterogeneity seen in empirical trees.
We therefore propose a small modification to Rabosky & Golbderg's protocol to assess the reliability of xxSSE models when dealing with specific empirical trees. Given the intricate roles of diversification and transition rates on generating the empirical distribution of trait states on the tips, which will eventually be used to test the reliability of the xxSSE models, we suggest to control for the effect of asymmetric transition rates. This can be done by simply using the empirical transition rates instead of symmetrical transition rates when simulating trait evolution. This gives a more reliable assessment of the risk of detecting a spurious association between diversification rates and given trait states. Note that this, as in the case of Rabosky and Goldberg (2015), is not a test of Type I error in a purely statistical sense, but rather a test if the empirical pattern can be generated by a lack of association between trait states and speciation and extinction rates simulated via trait evolution on empirical trees.
In the case of decoupled dynamics described above, if an association between speciation/extinction rate and trait state similar to the empirical ones is  (4). This distance represents differences in topology between two trees, and is calculated by counting the number of permutation operations that need to be performed to transform one tree into the other (3).
Since the difference between trees is expected to grow with tree size, these distances cannot be compared between different tree-size classes (i.e. here the trees with 48 and 198 terminals, respectively). Moreover, these Robinson-Foulds distances cannot be easily interpreted without a reference scenario. We therefore simulated 1000 random topologies for each tree size (48 and 198 terminals) using a constant-rate birth-death model, and later calculated the distances between each of the three empirical trees in each size class (Ericson, Hackett and Jarvis for the 48 terminals class, and Ericson, Hackett and Prum for the 198 terminals class), using all 1000 simulated trees from the respective size.
This was done to provide a maximum expected value of distances for each size class, so that the absolute distance values could be better compared (Supplementary Figure 25).