
# Unifying host-associated diversification processes using butterfly–plant networks

## Abstract

Explaining the exceptional diversity of herbivorous insects is an old problem in evolutionary ecology. Here we focus on two prominent hypothesised drivers of their diversification: radiations after a major host switch, and variability in host use due to continuous probing of new hosts. Unfortunately, current methods cannot distinguish between these hypotheses, causing controversy in the literature. We present an approach combining network and phylogenetic analyses, which directly quantifies support for these opposing hypotheses. After demonstrating that each hypothesis produces divergent network structures, we investigate the contribution of each to diversification in two butterfly families: Pieridae and Nymphalidae. Overall, we find that variability in host use is essential for butterfly diversification, while radiations following colonisation of a new host are rare but can produce high diversity. Beyond providing an important reconciliation of alternative hypotheses for butterfly diversification, our approach has potential to test many other hypotheses in evolutionary biology.

## Introduction

The diversification of herbivorous insects is one of the most successful animal radiations in the history of life1, hence understanding its drivers is central to understanding a major mode of evolution. Ever since Ehrlich and Raven2 argued for interactions between herbivorous insects and their host plants as being central to the diversification of both, and in the process formalising the concept of coevolution, evolutionary ecologists have searched for evidence of how such interactions could drive diversification3,4. Ehrlich and Raven assumed that a trait that allows an individual organism to explore a novel niche also promotes diversification, as the new niche would equate to a new adaptive zone, relatively free from competition. However, the mechanism connecting the increase in individual fitness to an increase in cladogenesis was not specified5. This gap in how micro- and macroevolution are connected has resulted in a range of proposed mechanisms linking insect–plant interactions to diversification6,7,8,9,10,11. Unfortunately, to date no clear consensus has emerged regarding the relative importance of these mechanisms. Here, we seek to advance this debate by reconciling the two most prominent and opposing explanations for the evolution of insect–plant interactions. We do so by proposing an approach to disentangle evolutionary hypotheses based on the patterns of interaction they are expected to produce.

The colonisation of a new host plant is often recognised as an opportunity for insect diversification. The various hypotheses of how and in which cases colonisation leads to diversification can be placed along two main axes: (i) the relative prevalence of complete host shifts vs. expansion of the number of hosts, which depends on variability in the insect host range, and (ii) the relative importance of key innovations vs. existing abilities (standing genetic variation and phenotypic plasticity) for colonisation of new hosts. Here we compare two alternative extremes among the above-mentioned explanations, herein referred to as the adaptive radiation scenario and the variability scenario. Each scenario aims to explain how changes in host use affect net diversification rates (without, however, teasing apart speciation and extinction rates). The adaptive radiation scenario hypothesises that herbivorous insects quickly radiate into many species following a shift from an old to a novel plant taxon, by overcoming the defences of the new host. As such, this is consistent with the idea of a key innovation by Ehrlich and Raven, though it does not require subsequent coevolution7. Rather, it is the complete change in host use, which increases the chances for ecological and geographic divergence, that is considered the main driver of insect diversification8. In contrast, the variability scenario predicts that diversification is maximised in insect taxa with large variability in host use (also known as the plasticity scenario7,12). Such variability results from the mixing and matching of hosts acquired by generalist ancestors and retained in the fundamental host repertoire (analogous to the fundamental niche). Although most descendant species specialise on a subset of the ancestor’s host repertoire, they retain the ability to use a wider range of potential hosts, including taxonomically distant plant taxa. The existence of such potential hosts (remnants of past host range expansions) makes host ranges unstable over evolutionary time, as insects can mix and match between hosts relatively easily. The resulting oscillations in host range increase the chance of population fragmentation and thereby speciation, via both adaptive and neutral processes7.

Distinguishing between the radiation and variability scenarios is extremely challenging, as the complexity of host use makes it intractable for most phylogenetic reconstruction methods3. Most phylogenetic methods can reconstruct either the association between a given insect group and one host plant taxon at a time (and then combine the inferences from taxon-specific models; e.g.13), or the evolution of host range per se without specifying host taxa. Although there is an increased realisation that host range is labile across time and space13,14,15,16, its importance for diversification of herbivorous insects is still under debate17,18,19. Novel statistical approaches to study state-dependent diversification have been developed recently20,21, but have so far produced divergent results and, consequently, different explanations for the effect of host range on diversification10,11,17. Part of this problem arises from the classification of host range, which is a complex trait, into two opposing states (specialist vs. generalist) or multiple states. Strictly speaking, host range is not an independently evolving trait, but rather an emergent property of the underlying dynamics of gaining and losing specific host plant taxa.

To investigate the role of hosts in diversification processes one would thus need to incorporate both the number of hosts used by each taxon (i.e., host range) and the identity of the host plants (i.e., host repertoire). A challenge to solve is how to circumvent computational limitations that constrain the application of such a method when modelling the evolution of host use. An alternative solution for this problem is to contrast the different patterns of interaction between insects and their host plants predicted by different diversification processes. Network analysis is a promising approach for this purpose22, as it provides not only a visual representation of complex ecological systems, but also a formal way to quantify patterns of interaction in the studied system23. The mechanisms underlying these patterns can then be assessed using independent sources of information, such as phylogenetic relationships24,25,26,27.

The butterfly families, Nymphalidae and Pieridae, were two of the examples of coevolution used by Ehrlich and Raven2, and today are the primary examples of the variability and radiation scenarios, respectively. Nymphalidae comprises much of the diversity of butterflies and also shows dramatic variability in host use. The variability scenario was first proposed based on host use patterns in this family28, but the diversification of at least one tribe, Satyrini, seems to be a radiation on a novel host clade29,30. Diversification of Pieridae, on the other hand, has been viewed by many as adhering to the adaptive radiation scenario2, wherein radiation of the Pierinae followed the colonisation of the chemically well-defended Brassicales host plants. Later studies found support for such a butterfly–plant arms race9,31.

Here, we estimate the relative importance of the radiation and the variability scenarios by translating their predictions into network properties (see Results) and investigating these processes in the butterfly families Nymphalidae and Pieridae. As the diversification of nymphalid and pierid butterflies is often seen as exemplifying the variability scenario and the adaptive radiation scenario, respectively, we expected to find contrasting patterns of interaction between these butterflies and their host plants. Instead, although network structure varies between the two groups, the patterns of interaction in both families have much in common, leading us to propose a unified explanation for the evolution of butterfly–plant interactions. The proposed approach thus appears to be a promising tool to assess whether the same dynamics apply to host–parasite systems in general, and to evaluate other hypotheses about evolutionary dynamics and diversification.

## Results

### Diversification scenarios and network structure

Here we represent butterfly–plant interactions as a network, with each taxon (butterfly or plant) being a node and connections between nodes arising from their interaction (i.e., host–plant usage). In this network, butterflies using plants in the same family can then be clustered by this shared connection. Thus, if most of the diversity of butterflies was generated by adaptive radiations on new host plants, the resulting network should be highly modular. Modularity emerges when a network contains recognisable subsets of taxa that interact more with each other than with other taxa in the network. Each module would then be composed of closely related plant taxa, which represent a distinct adaptive zone, and closely related butterflies, which descend from the ancestor that made the host shift. On the other hand, the variability scenario would produce a nested butterfly–plant network. Nestedness emerges if (i) there is a specialist-generalist gradient in both trophic levels and (ii) the interacting assemblage of a taxon is a subset of the interacting assemblages of taxa with more interactions. In the variability scenario, temporal changes in host range produce a specialist-generalist gradient at any point in time, with specialised species utilising a subset of the host plants of their closely related generalists, which creates network nestedness.
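As a minimal illustration of these two predictions (not the authors' simulation code, which is written in R), the Python sketch below builds two hypothetical 4 × 4 presence/absence matrices, one block-structured and one forming a subset chain, and checks condition (ii) of nestedness for the butterfly rows only. The matrices and function names are assumptions made for illustration.

```python
def host_sets(matrix):
    """Set of plant (column) indices used by each butterfly (row)."""
    return [{j for j, cell in enumerate(row) if cell} for row in matrix]


def rows_form_subset_chain(matrix):
    """True if, with rows ordered from most to fewest hosts, each row's
    host set is contained in the host set of the row before it."""
    ordered = sorted(host_sets(matrix), key=len, reverse=True)
    # Subset relations are transitive, so comparing neighbours is enough.
    return all(small <= large for large, small in zip(ordered, ordered[1:]))


# Hypothetical radiation-like pattern: two isolated butterfly-plant blocks.
modular_like = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical variability-like pattern: specialists use subsets of
# the hosts used by generalists.
nested_like = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

print(rows_form_subset_chain(modular_like))  # False
print(rows_form_subset_chain(nested_like))   # True
```

This is only a qualitative check; it does not replace the quantitative nestedness and modularity metrics described in the Methods.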

To validate these predictions, we used a fixed tree and simulated butterfly diversification as taking place owing to either the radiation or the variability scenario, or various combinations of the two (Fig. 1, see Methods and R code in the Supplementary Software for details). The tree was composed of 100 terminal taxa, separated into 10 clades grouped in pairs, with each pair having subclades with a low and high number of taxa (n = 5 and 15 taxa, respectively; Fig. 1a). This way, the difference in diversity between subclades in each pair could be generated by either one of the diversification scenarios. For comparison with a neutral scenario, we also simulated a network by randomly choosing 20% of the butterfly–plant interactions. For each simulation, we then analysed the resulting butterfly–plant network to see how well we could detect the relative contributions of the two scenarios that were simulated (Fig. 1b, Supplementary Table 1). We also recorded the number of hosts used by each butterfly taxon to compare with empirical networks (Supplementary Fig. 2).

In line with our expectations, when diversification in all five clade pairs was generated by the radiation scenario (R5V0 in Fig. 1), the network was highly modular, with each module being composed of closely related butterflies and one plant taxon. As we decreased the number of diversification events by adaptive radiation, replacing them with diversification by variability in host use (R4V1–R1V4), network modularity decreased, but was still higher than expected under the null model, which provides the theoretical benchmark (see Methods). Even when diversification in only one of the five pairs followed the radiation scenario (R1V4), the network was still modular and not nested. This pattern shifted when diversification in the whole phylogeny was generated by the variability scenario (R0V5), which produced a nested and not modular network. We interpret these results as indicating that forming modules is much easier than creating nestedness, as the latter does not readily emerge from such simulations. These results suggest that with real data from much larger clades, detecting modules produced by the radiation scenario will be easier than detecting nestedness produced by the variability scenario. Finally, when interactions were randomly chosen, the levels of modularity and nestedness were not significant (Random in Fig. 1b and Supplementary Table 1) and the number of hosts used per butterfly followed a binomial distribution (Supplementary Fig. 2a).

### Butterfly–plant network structure

To quantify the nestedness and modularity of butterfly–plant interactions, we constructed presence/absence matrices of interactions using existing literature (Methods). The Nymphalidae-plant network included 566 interactions between 295 Nymphalidae genera and 43 host–plant orders, and the Pieridae-plant network included 126 interactions between 67 Pieridae genera and 34 host–plant families. For consistency between butterfly families with respect to the classification level of plants, we also analysed a network between Nymphalidae genera and plant families. Nymphalidae network structure is very similar at both order and family level (see Supplementary Methods, Supplementary Figs. 3 and 4). For the other analyses we focused on the network at order level because that is the taxonomic level at which ancestral-state reconstructions of host use have been done for Nymphalidae.

For each network, we analysed nestedness32 and modularity33. The Nymphalidae-plant network is both more nested (NODF = 13.09, permutation test, p < 0.001, z score = 8.27) and modular (M = 0.58, permutation test, p < 0.01, z score = 3.96) than networks generated by the null model (Fig. 1c). Butterflies and plants were grouped in 10 modules by an optimisation algorithm that maximises modularity (Figs. 2 and 3a). The smallest module, M7, has only four taxa (two butterfly genera and two plant orders) and is the only module that has no interactions with other modules of the network. The remaining nine modules are formed by at least 20 taxa, which are connected by one of the nine main host–plant orders (module and network hubs in Fig. 3c). In addition to nestedness at network level, within-module interactions are also significantly nested in two modules (M1: NODF = 49.89, permutation test, p = 0.03; M6: NODF = 60.21, permutation test, p < 0.001).
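To make the modularity index M more concrete, the sketch below scores a given assignment of butterflies and plants to modules. It assumes that the bipartite modification referred to in the Methods is Barber's formulation, M = (1/E) Σ (A_ij − k_i d_j / E) over butterfly–plant pairs placed in the same module; the source does not spell out the formula, so this is an assumption. The code only evaluates a partition and does not search for the optimal one.

```python
def bipartite_modularity(matrix, row_modules, col_modules):
    """Modularity of a bipartite presence/absence matrix for a given
    partition (Barber-style formulation, assumed here)."""
    n_edges = sum(map(sum, matrix))
    row_degree = [sum(row) for row in matrix]
    col_degree = [sum(col) for col in zip(*matrix)]
    total = 0.0
    for i, row in enumerate(matrix):
        for j, cell in enumerate(row):
            if row_modules[i] == col_modules[j]:
                # Observed link minus the link expected from the degrees.
                total += cell - row_degree[i] * col_degree[j] / n_edges
    return total / n_edges


# Hypothetical network with two isolated blocks of butterflies and plants.
blocks = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

print(bipartite_modularity(blocks, [0, 0, 1, 1], [0, 0, 1, 1]))  # 0.5
print(bipartite_modularity(blocks, [0, 0, 0, 0], [0, 0, 0, 0]))  # 0.0
```

Placing every taxon in a single module always gives zero, which is why the index rewards partitions that concentrate interactions within modules.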

Contrary to our expectations, the Pieridae-plant network is also both significantly nested (NODF = 14.23, permutation test, p < 0.001, z score = 3.9) and modular (M = 0.66, permutation test, p = 0.03, z score = 1.96; Fig. 1c). This network is structured in 10 modules, three of them composed of only one butterfly–plant interaction (Figs. 3b and 4). Butterflies and plants in the four modules with more than 10 taxa are connected by the main plant family in the module, or module hub (Fig. 3d). Within-module interactions are also nested in two of these modules (M7: NODF = 63.13, permutation test, p = 0.02; M8: NODF = 72.14, p < 0.001). Although both networks have the same number of modules, the Pieridae-plant network has fewer interactions between modules (16.6% of interactions) than the Nymphalidae-plant network (28.8% of interactions).
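The share of between-module interactions reported above can be computed directly from a module assignment. The following sketch uses hypothetical taxon names and module labels; it is not derived from the empirical data.

```python
def among_module_share(interactions, butterfly_module, plant_module):
    """Fraction of interactions that link a butterfly and a plant
    assigned to different modules."""
    between = sum(
        1 for butterfly, plant in interactions
        if butterfly_module[butterfly] != plant_module[plant]
    )
    return between / len(interactions)


# Hypothetical toy data: three butterfly genera, two plant taxa.
interactions = [("B1", "P1"), ("B2", "P1"), ("B3", "P2"), ("B3", "P1")]
butterfly_module = {"B1": "M1", "B2": "M1", "B3": "M2"}
plant_module = {"P1": "M1", "P2": "M2"}

print(among_module_share(interactions, butterfly_module, plant_module))  # 0.25
```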

As the empirical networks show signs from both diversification scenarios, we simulated an additional scenario to test whether nestedness and modularity could have emerged simply from phylogenetic signal in the repertoire of hosts used by butterflies. In this scenario (herein referred to as uniform evolution, Supplementary Fig. 1b), the fundamental host repertoire evolved uniformly along all branches of the butterfly tree, resulting in fundamental host repertoires of the same size at the tips of the tree. Closely related clades shared more hosts, whereas basal clades had more unique hosts. Importantly, low and high-diversity subclades within each of the five pairs of clades had the same fundamental host repertoire. Then, realised repertoires were randomly sampled from the fundamental host repertoire. The resulting network was not significantly nested (NODF = 10.55, permutation test, p = 0.26, z score = 0.59), but modularity was slightly higher than expected by the null model (M = 0.54, permutation test, p = 0.02, z score = 1.96; Fig. 1c; Supplementary Table 1). Thus, phylogenetic conservatism in host repertoire alone can create low levels of modularity, but for nestedness to emerge, phylogenetic conservatism has to be coupled with host range expansion events (as in the variability scenario).

When the simulated and empirical networks are compared, the high levels of nestedness in the empirical networks suggest that the variability scenario played an important role in the diversification of both butterfly families. The modularity levels, however, could have emerged simply from phylogenetic conservatism in host repertoire, especially in Pieridae, where modularity is low.

### Structural roles of ancestral and recent hosts

According to the variability scenario7, the pool of host plants used by a clade derives mainly from previous events of polyphagy. Recent reconstructions of past host use for nymphalid butterflies suggest Rosales as the most likely ancestral host order followed by Malpighiales, which suggests that both orders were used by a generalist ancestor early in the evolution of the family13. These plant orders are probably the ones with the longest evolutionary association with nymphalid butterflies, and therefore may have an important role in shaping structural patterns of the studied network. Besides the ancestral hosts, two recent host orders—Poales and Solanales—support species-rich butterfly taxa, which are likely the result of radiation events29,34. Hence, we expect these hosts to also have an important effect on network structure.

In support of the proposed link between diversification scenarios and network structure, we found that ancestral hosts produce nestedness and recent hosts produce modularity in the Nymphalidae-plant network. Nestedness is significantly lower when Rosales and Malpighiales are removed from the network, as compared with the effect of all other host plants (NODF = 11.39, permutation test, p = 0.002, z score = −1.67), whereas modularity decreases most when Poales and Solanales are removed (M = 0.52, permutation test, p = 0.001, z score = −5.14).

Host use evolution in pierid butterflies is marked by a shift in host preference from Fabales (the probable ancestral host for all butterflies) to Brassicales. Because this shift was followed by increases in diversification rate9, Brassicales plants are the most common hosts for pierids, especially from Capparaceae and Brassicaceae families. In the Pieridae-plant network Capparaceae and Brassicaceae have the strongest effect on network structure. Removal of these hosts significantly decreases nestedness (NODF = 11.38, permutation test, p = 0.0018, z score = −6.25) and increases modularity (M = 0.72, permutation test, p = 0.0018, z score = 5.31). Therefore, these hosts act as ancestral hosts that promote variability in host use, despite having supported butterfly radiations in the beginning of this ecological association.

### Phylogenetic composition of modules

Most modules with more than five butterfly taxa in both networks are composed of phylogenetically closely related butterflies (Tables 1 and 2, Figs. 2 and 4). The two exceptions are modules with butterflies specialised on ancestral hosts: M2 of the nymphalid network, which includes Rosales, and M3 of the pierid network, which includes Capparaceae. These exceptions support the expectation from the variability scenario that butterflies retain the ability to use ancestral hosts. As for the phylogenetic diversity of plant orders, with the exception of the two modules that only have one host plant, all modules are composed of a phylogenetically widespread combination of host plants (Tables 1 and 2). These results indicate that host use is phylogenetically conserved (related butterflies use the same repertoire of plants), but this repertoire usually includes unrelated plant clades.

Combining our results, it is clear that the modular structure is formed by grouping closely related butterflies that use a main host taxon (module hub). But several modules also include a number of other distantly related hosts that are used by a subset of the butterflies in the module, producing nestedness within modules. Hosts with a long evolutionary history of association with the butterflies tie the various modules together, resulting in overall network nestedness.

## Discussion

Here, we describe and implement an approach to show that different host-associated diversification dynamics produce distinct butterfly–plant network structures. We then apply this approach to two of the original exemplar butterfly families that Ehrlich and Raven used to introduce coevolution2. Despite the general acceptance that Nymphalidae and Pieridae underwent different diversification and host use processes during their evolution9,28, we show that the network structures of the two families are very similar. We suggest that the evolution of butterfly–plant networks is mainly driven by the formation of new ecological interactions (initially at the population level, but carried over to the species level after speciation events), combined with phylogenetic conservatism. Conservatism in host use is one of the most prevalent characteristics of herbivorous insects13,35, yet it does not prevent colonisation of new hosts when opportunity arises36,37. Instead, even highly specialised insects are expected to have a wider fundamental than realised host repertoire, analogous to the fundamental and realised niche. Phenotypic plasticity, resource tracking, and recurrence homoplasy allow insects to continuously explore their fundamental host repertoire by probing new hosts38. Under some circumstances, this exploration produces patterns that can be detected in the network structure.

We suggest that the ubiquitous properties of network structure identified in this study reflect three phases in the evolution of butterfly–plant interactions. First, one of the host colonisations may lead to a complete shift in host use, especially if old and novel hosts are significantly different, as in the case of the shift from rosids to Poales (grasses) by Satyrinae, the largest Nymphalidae subfamily. The colonisation of Poales happened ~ 60 Mya by the common ancestor of Satyrinae + Morphini + Brassolini29, and the spread and diversification of Satyrini throughout the world happened ~ 40 Mya30. This diversification on grasses produced a clear butterfly–plant module with little plant phylogenetic diversity (M9 of nymphalid network). In general, we expect such events to be rare because colonisation of host groups that were not used before should be difficult and may not have much success in terms of diversification until specific traits (such as detoxification genes) evolve. And even in the case of a successful colonisation, not all novel plant groups would provide enough opportunities for diversification.

Second, herbivores continue to explore their fundamental host repertoire even after host shifts. For example, a complete shift in preference happened following the colonisation of Brassicales plants by Pieridae butterflies ~ 70 Mya9 (which would represent the first phase). But, with time, pierids also colonised other host plants, breaking up the strict modular structure and creating nestedness within the module. In fact, it seems that the colonisation of the order Brassicales (first Capparaceae and subsequently, Brassicaceae) facilitated the colonisation of other plant families. These include distantly related Brassicales families such as Tropaeolaceae, but also plants from entirely unrelated orders, such as the family Loranthaceae (showy mistletoes) from the order Santalales. In other words, colonisation of Brassicales led to an increase in variability in host use in pierid butterflies, producing within-module nestedness, and characterising a second phase of host use evolution.

Third, the recurrent addition of new host plants increases among-module interactions. The pierid network is surprisingly similar to the nymphalid network. The main difference is that Nymphalidae is a larger clade with more variability (but also overlap) in host use, which is reflected in network size and inter-module interactions. These differences might be partially explained by the time of association between the butterflies and the main ancestral host group. The association between nymphalid butterflies and their ancestral host, Rosales, is ~ 100 million years old13,39, the oldest one in this study, whereas the pierid association with Brassicales is ~ 70 million years old9. Although the ability to use Rosales seems to be retained in most clades of Nymphalidae (high phylogenetic diversity in module M2), various clades use other host plants more often. These are not complete shifts in host plant, but changes in host use frequencies, and can be seen in the modular structure of the network, with highly connected modules (Fig. 3a). As a consequence, the third phase is characterised by both nestedness and modularity. Network nestedness (and therefore, coherence) is maintained by ancestral hosts, whereas modularity increases with specialisation to new hosts (module hubs; Fig. 3c).

Here, our goal was to compare the main alternative hypotheses for host-associated diversification based on network properties that emerge from the evolutionary dynamics. We focused on the colonisation process (or creation of new interactions in the network), which is the necessary first step for network assembly. Teasing apart the effects of speciation and extinction is a difficult task that should be the focus of a future study. Although the two alternative extremes (radiation and variability scenarios) are indeed associated with opposite network properties (modularity and nestedness), the unification of these complementary parts results in a better description of host use evolution and diversification than either part alone. For instance, lack of variability in host use can be the reason why some colonisation events are not followed by rapid diversification (e.g., M7 of nymphalid network). Conversely, key innovations that allow colonisation of novel host taxa, and thereby provide new niches, can be thought of as evolutionary novelties that suddenly increase realised and fundamental host repertoires, and therefore the potential for variability in host use (e.g., M8 of pierid network).

In conclusion, we argue that the variability and radiation scenarios can be reconciled into a unified view of butterfly–plant evolution in which the continuous probing of new hosts allows both ongoing diversification through variability in host use and episodic radiations on new hosts. Somewhat ironically, this was foreshadowed already by Ehrlich and Raven in their seminal paper. In one passage that has been given much less attention than their arms-race coevolution ideas, they noted that “the degree of plasticity of chemoreceptive response and the potential for physiological adjustment to various plant secondary substances in butterfly populations must in large measure determine their potential for evolutionary radiation”. With the recent recognition that host–parasite systems have much in common with herbivorous insect–plant systems38, our approach could be applied to other host–parasite systems to test the generality of our conclusions. Moreover, we believe this study demonstrates the potential of using network analysis in a phylogenetic context to investigate hypotheses about macroevolutionary dynamics.

## Methods

### Diversification scenarios and network structure

Then six combinations were simulated (Supplementary Fig. 1), spanning from diversity in all five pairs explained by the radiation scenario (R5V0) to diversity in all pairs explained by the variability scenario (R0V5). For the intermediate networks, we started simulating the radiation scenario and shifted to the variability scenario at different points of the phylogeny (Supplementary Fig. 1d–g).

For comparison, we also simulated two networks where the evolution of host repertoire does not affect diversification. In the random network (Supplementary Fig. 1a), the fundamental host repertoire of all terminal taxa contained 40 hosts (same as in the variability scenario) and 20% of possible interactions (all combinations of butterflies and plants) were randomly chosen. Thus, realised host repertoires were randomly sampled from a fixed fundamental host repertoire. In the uniform evolution scenario, hosts were added to the fundamental host repertoire uniformly through time (Supplementary Fig. 1b), so that more closely related clades shared more hosts, whereas basal clades had more unique hosts. As in the random and the variability scenarios, 20% of possible interactions were randomly chosen. We then measured nestedness and modularity of each simulated network as described below.
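The random network can be sketched in a few lines. The version below is written in Python rather than the authors' R and assumes that exactly 20% of the 100 × 40 possible butterfly–plant cells are drawn without replacement; the source does not say whether the draw was exact or made independently per cell, so this detail and the function name are assumptions.

```python
import random


def random_network(n_butterflies=100, n_hosts=40, fill=0.20, seed=1):
    """Presence/absence matrix with a fixed fraction of cells set to 1,
    chosen uniformly at random (assumed sampling scheme)."""
    rng = random.Random(seed)
    cells = [(b, h) for b in range(n_butterflies) for h in range(n_hosts)]
    chosen = rng.sample(cells, round(fill * len(cells)))
    matrix = [[0] * n_hosts for _ in range(n_butterflies)]
    for b, h in chosen:
        matrix[b][h] = 1
    return matrix


network = random_network()
hosts_per_butterfly = [sum(row) for row in network]
# On average each butterfly uses 20% of 40 hosts, i.e. 8 hosts.
print(sum(hosts_per_butterfly) / len(hosts_per_butterfly))  # 8.0
```

Under this scheme the number of hosts per butterfly is only approximately binomial; drawing each cell independently with probability 0.2 would make it exactly binomial.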

### Butterfly–plant network structure

To build the Nymphalidae-plant network we used the host use data set sampled by ref. 13, which is based on records of host plant orders for butterfly genera reported in the literature and on ref. 40. We followed ref. 39 for the phylogenetic relationships between Nymphalidae genera, and for phylogenetic relationships between plant orders we followed refs. 41,42. The interactions between Pieridae genera and plant families were also gathered from the literature43,44,45,46,47,48,49. We followed ref. 9 for the phylogenetic relationships between Pieridae genera, and refs 9,42 for phylogenetic relationships between plant families.

We used the program ANINHADO50 to compute the NODF index, a nestedness metric based on overlap and decreasing fill51. To detect modularity, we used Newman and Girvan’s metric52 modified for bipartite networks53 as implemented in the software MODULAR54. We used a simulated annealing algorithm to maximise the index of modularity (M) and identify the modules. As the algorithm is based on an optimisation process, the outcome of different runs may vary. That is particularly important for networks with many interactions between modules. Therefore, we ran the analysis 10 times and compared the resulting modules and index of modularity. As network configuration did not vary significantly across runs, we simply chose the one with highest modularity, M.
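For readers without access to ANINHADO, the NODF index can be approximated with the short Python sketch below. It follows the usual definition of the metric (paired overlap, counted only between rows or columns with different numbers of interactions, averaged over all row pairs and column pairs). It is an illustrative re-implementation written for this text, not the ANINHADO code.

```python
from itertools import combinations


def _paired_overlap_sum(vectors):
    """Sum of percentage overlaps over all pairs of rows (or columns)."""
    total = 0.0
    for a, b in combinations(vectors, 2):
        deg_a, deg_b = sum(a), sum(b)
        if deg_a == deg_b or min(deg_a, deg_b) == 0:
            continue  # no decreasing fill: the pair contributes zero
        shared = sum(1 for x, y in zip(a, b) if x and y)
        # Overlap is expressed relative to the poorer of the two vectors,
        # which makes the result independent of the matrix ordering.
        total += 100.0 * shared / min(deg_a, deg_b)
    return total


def nodf(matrix):
    """NODF of a presence/absence matrix (rows: insects, columns: plants)."""
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    n_pairs = len(rows) * (len(rows) - 1) / 2 + len(cols) * (len(cols) - 1) / 2
    return (_paired_overlap_sum(rows) + _paired_overlap_sum(cols)) / n_pairs


perfectly_nested = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
two_isolated_pairs = [[1, 0], [0, 1]]

print(nodf(perfectly_nested))    # 100.0
print(nodf(two_isolated_pairs))  # 0.0
```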

In order to produce null distributions of NODF and M-values, we computed these indices for 1000 matrices generated by a null model in which the probability of each interaction is proportional to the number of interactions of the insect and the plant, therefore taking into account heterogeneity in host range and in butterfly richness per host taxon (null model 2 of ref. 32). Thus, if the observed patterns are significantly different from what is generated by the null model, such patterns do not emerge simply from a specialisation gradient but from another underlying process. Based on the null expectation, we then standardised NODF and M values using

$$Z = \frac{X_{\mathrm{obs}} - X_{\mathrm{exp}}}{\mathrm{StDev}_{\mathrm{exp}}},$$

where $X_{\mathrm{obs}}$ is the metric of interest, $X_{\mathrm{exp}}$ is the mean value and $\mathrm{StDev}_{\mathrm{exp}}$ is the standard deviation from the null distribution. This is a standardisation that quantifies the position of the observed metric within the null distribution in terms of units of standard deviation55.
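A sketch of this standardisation in Python follows. It assumes that null model 2 assigns to each cell the average of the fill of its row and the fill of its column, which is how this null model is commonly described; the exact formula is not stated in the text above. The helper names, and the use of the sample standard deviation, are further assumptions.

```python
import random
from statistics import mean, stdev


def null_model_2(matrix, rng):
    """One null matrix in which cell (i, j) is filled with probability
    equal to the mean of row i's fill and column j's fill (assumed form)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_degree = [sum(row) for row in matrix]
    col_degree = [sum(col) for col in zip(*matrix)]
    return [
        [int(rng.random() < 0.5 * (row_degree[i] / n_cols + col_degree[j] / n_rows))
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


def z_score(observed, null_values):
    """(observed - mean of null) / standard deviation of null."""
    return (observed - mean(null_values)) / stdev(null_values)


def standardise(matrix, metric, n_null=1000, seed=1):
    """Z-score of metric(matrix) against n_null null matrices."""
    rng = random.Random(seed)
    null_values = [metric(null_model_2(matrix, rng)) for _ in range(n_null)]
    return z_score(metric(matrix), null_values)


print(z_score(13.0, [10.0, 11.0, 12.0]))  # 2.0
```

Here `metric` is any function that maps a matrix to a number, for example a NODF or modularity routine. Null matrices may contain empty rows or columns, which the chosen metric has to tolerate.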

Based on topological properties, each taxon was assigned a role in the network following ref. 33. The role of a node is defined by how it interacts within its own module (standardised within-module degree) and with nodes in other modules (among-module connectivity).
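The two role coordinates can be sketched as follows, assuming the usual definitions: the within-module degree of a node standardised against the other nodes of its module, z = (k_is − mean) / SD, and the among-module connectivity c = 1 − Σ_t (k_it / k_i)², where k_it is the number of links from the node to module t. The sketch handles the row nodes (butterflies) only, uses the population standard deviation, and sets z to zero when a module has no variation; these choices are assumptions, not details given in the source.

```python
from collections import defaultdict
from statistics import mean, pstdev


def node_roles(matrix, row_modules, col_modules):
    """Return (z, c) for each row node of a bipartite matrix."""
    links_to = []  # per row: module label -> number of links to that module
    for row in matrix:
        counts = defaultdict(int)
        for j, cell in enumerate(row):
            if cell:
                counts[col_modules[j]] += 1
        links_to.append(counts)
    within = [links_to[i][row_modules[i]] for i in range(len(matrix))]

    z, c = [], []
    for i, counts in enumerate(links_to):
        peers = [within[k] for k in range(len(matrix))
                 if row_modules[k] == row_modules[i]]
        sd = pstdev(peers)
        z.append((within[i] - mean(peers)) / sd if sd > 0 else 0.0)
        degree = sum(counts.values())
        c.append(1 - sum((n / degree) ** 2 for n in counts.values())
                 if degree else 0.0)
    return z, c


# Hypothetical toy network: three butterflies, three plants, two modules.
toy = [[1, 1, 0], [1, 0, 0], [0, 1, 1]]
z, c = node_roles(toy, row_modules=[0, 0, 1], col_modules=[0, 0, 1])
print(z)  # [1.0, -1.0, 0.0]
print(c)  # [0.0, 0.0, 0.5]
```

Roles for plants can be obtained by transposing the matrix and swapping the two module lists.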

### Structural roles of ancestral and recent hosts

We assessed the role of host plants by recalculating nestedness and modularity after removing, in turn, each possible combination of two plant taxa from the network. This resulted in 904 combinations for the nymphalid network and 561 combinations for the pierid network. The importance of any given combination of hosts was assessed by calculating the Z-score of the NODF and M values of the network without that combination of hosts relative to the values for all other networks.
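The pair-removal procedure can be sketched as follows. This is a hedged illustration: the function name `pair_removal_scores` is hypothetical, `metric` stands for any matrix-level index (NODF or M in the study), each reduced network is standardised against the values of all the other reduced networks, and butterflies left without hosts are kept as empty rows. None of these choices is confirmed by the text.

```python
from itertools import combinations
import statistics

def pair_removal_scores(matrix, plants, metric):
    """Z-score of a network metric after removing each pair of plant taxa.

    matrix: binary matrix with butterflies as rows and plants as columns.
    plants: column labels, in the same order as the columns.
    metric: function mapping a binary matrix to a number.
    """
    values = {}
    for i, j in combinations(range(len(plants)), 2):
        reduced = [[v for col, v in enumerate(row) if col not in (i, j)]
                   for row in matrix]
        values[(plants[i], plants[j])] = metric(reduced)
    scores = {}
    for pair, value in values.items():
        others = [v for p, v in values.items() if p != pair]
        sd = statistics.stdev(others)
        scores[pair] = (value - statistics.mean(others)) / sd if sd > 0 else 0.0
    return scores

def connectance(m):
    """Stand-in metric for the example: proportion of filled cells."""
    return sum(map(sum, m)) / (len(m) * len(m[0]))

toy = [[1, 1, 1, 0],
       [1, 1, 0, 0],
       [1, 0, 0, 1]]
scores = pair_removal_scores(toy, ["P1", "P2", "P3", "P4"], connectance)
```

Pairs with strongly negative or positive scores are those whose removal changes the chosen metric most relative to the other pairs.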

### Phylogenetic composition of modules

We calculated Faith’s phylogenetic diversity56 of butterflies and plants in each module and contrasted it with a null distribution, using the package picante57 (version 1.6-2) in R58. Faith’s phylogenetic diversity is the sum of the phylogenetic branch lengths leading to the terminal taxa in the sample. We used a null model that shuffles taxon labels across tips of the phylogeny to generate expected values of phylogenetic diversity for each module, maintaining the number of butterflies and plants in each module.
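The calculation can be illustrated with a small Python sketch. It is not the picante implementation: the tree is stored as two hypothetical dictionaries (`parent` and `length`), branch lengths are summed from each sampled tip down to the root, and the null model draws random tip sets of the same size, which is equivalent to shuffling taxon labels across tips while keeping the number of taxa per module fixed.

```python
import random

def faith_pd(parent, length, tips):
    """Sum of branch lengths on the paths from the sampled tips to the root.

    parent: dict mapping each non-root node to its parent node.
    length: dict mapping each non-root node to the length of the branch above it.
    """
    visited = set()
    total = 0.0
    for tip in tips:
        node = tip
        # Walk towards the root, counting each branch only once.
        while node in parent and node not in visited:
            visited.add(node)
            total += length[node]
            node = parent[node]
    return total

def null_pd(parent, length, all_tips, n_sample, n_reps, rng):
    """Null distribution of PD for random tip sets of size n_sample."""
    return [faith_pd(parent, length, rng.sample(all_tips, n_sample))
            for _ in range(n_reps)]

# Toy tree: root R with children X (branch 1) and C (branch 2);
# X has children A and B (branch 1 each).
parent = {"X": "R", "C": "R", "A": "X", "B": "X"}
length = {"X": 1.0, "C": 2.0, "A": 1.0, "B": 1.0}
observed = faith_pd(parent, length, ["A", "B"])
null = null_pd(parent, length, ["A", "B", "C"], 2, 100, random.Random(0))
```

In the toy tree the sister taxa A and B give a lower phylogenetic diversity than A and C, so a module made of close relatives falls in the lower tail of the null distribution.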

### Code availability

Custom code used to simulate theoretical diversification models is available as Supplementary Software.

## Data availability

The authors declare that the data supporting the findings of this study are available within the paper (and its Supplementary Information). A reporting summary for this Article is available as a Supplementary Information file.


## References

1. Mitter, C., Farrell, B. & Wiegmann, B. The phylogenetic study of adaptive zones: has phytophagy promoted insect diversification? Am. Nat. 132, 107–128 (1988).

2. Ehrlich, P. R. & Raven, P. H. Butterflies and plants: a study in coevolution. Evolution 18, 586 (1964).

3. Janz, N. Ehrlich and Raven revisited: mechanisms underlying codiversification of plants and enemies. Annu. Rev. Ecol. Evol. Syst. 42, 71–89 (2011).

4. Forister, M. L., Dyer, L. A., Singer, M. S., Stireman, J. O. I. & Lill, J. T. Revisiting the evolution of ecological specialization, with emphasis on insect-plant interactions. Ecology 93, 981–991 (2012).

5. Futuyma, D. J. Some current approaches to the evolution of plant–herbivore interactions. Plant Species Biol. 15, 1–9 (2000).

6. Thompson, J. N. The Geographic Mosaic of Coevolution. (University of Chicago Press, 2005).

7. Janz, N. & Nylin, S. The oscillation hypothesis of host-plant range and speciation. In Specialization, Speciation, and Radiation: the Evolutionary Biology of Herbivorous Insects (ed. Tilmon, K. J.) 203–215 (University of California Press, 2008).

8. Fordyce, J. A. Host shifts and evolutionary radiations of butterflies. Proc. Biol. Sci. 277, 3735–3743 (2010).

9. Edger, P. P. et al. The butterfly plant arms-race escalated by gene and genome duplications. Proc. Natl. Acad. Sci. USA 112, 8362–8366 (2015).

10. Hardy, N. B. & Otto, S. P. Specialization and generalization in the diversification of phytophagous insects: tests of the musical chairs and oscillation hypotheses. Proc. Biol. Sci. 281, 20132960 (2014).

11. Hardy, N. B., Peterson, D. A. & Normark, B. B. Nonadaptive radiation: pervasive diet specialization by drift in scale insects? Evolution 70, 2421–2428 (2016).

12. Nylin, S. & Wahlberg, N. Does plasticity drive speciation? Host-plant shifts and diversification in nymphaline butterflies (Lepidoptera: Nymphalidae) during the tertiary. Biol. J. Linn. Soc. 94, 115–130 (2008).

13. Nylin, S., Slove, J. & Janz, N. Host plant utilization, host range oscillations and diversification in nymphalid butterflies: a phylogenetic investigation. Evolution 68, 105–124 (2014).

14. Janz, N., Nyblom, K. & Nylin, S. Evolutionary dynamics of host-plant specialization: a case study of the tribe Nymphalini. Evolution 55, 783–796 (2001).

15. Nosil, P. Transition rates between specialization and generalization in phytophagous insects. Evolution 56, 1701–1706 (2002).

16. Calatayud, J. et al. Geography and major host evolutionary transitions shape the resource use of plant parasites. Proc. Natl. Acad. Sci. USA 113, 201608381–9845 (2016).

17. Hamm, C. A. & Fordyce, J. A. Patterns of host plant utilization and diversification in the brush-footed butterflies. Evolution 69, 589–601 (2015).

18. Janz, N., Braga, M. P., Wahlberg, N. & Nylin, S. On oscillations and flutterings—a reply to Hamm and Fordyce. Evolution 70, 1150–1155 (2016).

19. Hamm, C. A. & Fordyce, J. A. Greater host breadth still not associated with increased diversification rate in the Nymphalidae—a response to Janz et al. Evolution 70, 1156–1160 (2016).

20. Maddison, W. P., Midford, P. E. & Otto, S. P. Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 56, 701–710 (2007).

21. FitzJohn, R. G. Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3, 1084–1092 (2012).

22. Bascompte, J. & Jordano, P. Plant-animal mutualistic networks: the architecture of biodiversity. Annu. Rev. Ecol. Evol. Syst. 38, 567–593 (2007).

23. Poulin, R. Network analysis shining light on parasite ecology and diversity. Trends Parasitol. 26, 492–498 (2010).

24. Ives, A. R. & Godfray, H. C. J. Phylogenetic analysis of trophic associations. Am. Nat. 168, E1–E14 (2006).

25. Rezende, E. L., Jordano, P. & Bascompte, J. Effects of phenotypic complementarity and phylogeny on the nested structure of mutualistic networks. Oikos 116, 1919–1929 (2007).

26. Donatti, C. I. et al. Analysis of a hyper-diverse seed dispersal network: modularity and underlying mechanisms. Ecol. Lett. 14, 773–781 (2011).

27. Eklöf, A. et al. The dimensionality of ecological networks. Ecol. Lett. 16, 577–583 (2013).

28. Janz, N., Nylin, S. & Wahlberg, N. Diversity begets diversity: host expansions and the diversification of plant-feeding insects. BMC Evol. Biol. 6, 4–10 (2006).

29. Peña, C. & Wahlberg, N. Prehistorical climate change increased diversification of a group of butterflies. Biol. Lett. 4, 274–278 (2008).

30. Peña, C., Nylin, S. & Wahlberg, N. The radiation of Satyrini butterflies (Nymphalidae: Satyrinae): a challenge for phylogenetic methods. Zool. J. Linn. Soc. 161, 64–87 (2011).

31. Wheat, C. W. et al. The genetic basis of a plant-insect coevolutionary key innovation. Proc. Natl. Acad. Sci. USA 104, 20427–20431 (2007).

32. Bascompte, J., Jordano, P., Melián, C. J. & Olesen, J. M. The nested assembly of plant-animal mutualistic networks. Proc. Natl. Acad. Sci. USA 100, 9383–9387 (2003).

33. Olesen, J. M., Bascompte, J., Dupont, Y. L. & Jordano, P. The modularity of pollination networks. Proc. Natl. Acad. Sci. USA 104, 19891–19896 (2007).

34. Elias, M. et al. Out of the Andes: patterns of diversification in clearwing butterflies. Mol. Ecol. 18, 1716–1729 (2009).

35. Nyman, T., Vikberg, V. & Smith, D. R. How common is ecological speciation in plant-feeding insects? A ’Higher’ Nematinae perspective. BMC Evol. Biol. 10, 266 (2010).

36. Singer, M. C., Thomas, C. D. & Parmesan, C. Rapid human-induced evolution of insect host associations. Nature 366, 681–683 (1993).

37. Fraser, S. M. & Lawton, J. H. Host-range expansion by British moths onto introduced conifers. Ecol. Entomol. 19, 127–137 (1994).

38. Nylin, S. et al. Embracing colonizations: a new paradigm for species association dynamics. Trends Ecol. Evol. 33, 4–14 (2018).

39. Wahlberg, N. et al. Nymphalid butterflies diversify following near demise at the Cretaceous/Tertiary boundary. Proc. Biol. Sci. 276, 4295–4302 (2009).

40. Savela, M. Lepidoptera and some other life forms. Available at ftp://ftp.funet.fi/index/Tree_of_life/insecta/lepidoptera/index.html (2014).

41. Stevens, P. F. Angiosperm phylogeny website. Available at http://www.mobot.org/MOBOT/research/APweb/ (2001 onwards).

42. Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L. L. & Hernández-Hernández, T. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).

43. Beccaloni, G. W., Viloria, A. L., Hall, S. K. & Robinson, G. S. Catalogue of the hostplants of the Neotropical butterflies. Monografias Tercer Milenio 8 (2008).

44. Smith, D. S., Miller, L. D. & Miller, J. Y. The butterflies of West Indies and South Florida. (Oxford University Press, 1994).

45. Tennent, J. The butterflies of Morocco, Algeria and Tunisia. (Gem Publishing Company, 1996).

46. Tolman, T. & Lewington, R. Collins field guide: Butterflies of Britain and Europe. (Harper Collins Publishers Ltd., 1997).

47. Tuzov, V. K. Guide to the Butterflies of Russia and Adjacent Territories: Hesperiidae, Papilionidae, Pieridae, Satyridae. (Reference Work, Vol 1) (Pensoft Pub, 1997).

48. Underwood, D. L. A. Intraspecific variability in host plant quality and ovipositional preferences in Eucheira socialis (Lepidoptera, Pieridae). Ecol. Entomol. 19, 245–256 (1994).

49. Waterfield, E. M. Notes on the life-history of Caloperis eulimine. Trans. R. Entomol. Soc. Lond. 73, xxvi–xxviii (1925).

50. Guimaraes, P. Jr & Guimaraes, P. Improving the analyses of nestedness for large sets of matrices. Environ. Model. Softw. 21, 1512–1513 (2006).

51. Almeida-Neto, M., Guimarães, P., Guimarães, P. R., Loyola, R. D. & Ulrich, W. A consistent metric for nestedness analysis in ecological systems: reconciling concept and measurement. Oikos 117, 1227–1239 (2008).

52. Newman, M. E. J. & Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E. Stat. Nonlin. Soft. Matter Phys. 69, 026113 (2004).

53. Barber, M. J. Modularity and community detection in bipartite networks. Phys. Rev. E. Stat. Nonlin. Soft. Matter Phys. 76, 066102 (2007).

54. Marquitti, F. M. D., Guimarães, P. R., Pires, M. M. & Bittencourt, L. F. MODULAR: software for the autonomous computation of modularity in large network sets. Ecography 37, 221–224 (2014).

55. Ulrich, W., Almeida-Neto, M. & Gotelli, N. J. A consumer’s guide to nestedness analysis. Oikos 118, 3–17 (2009).

56. Faith, D. P. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10 (1992).

57. Kembel, S. W. et al. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26, 1463–1464 (2010).

58. R Core Team. R: a language and environment for statistical computing. (R Foundation for Statistical Computing, Vienna, Austria, 2017).

## Acknowledgements

SN was supported by the Swedish Research Council (2015-04218) and PRG was supported by FAPESP (2016/20739-9) and CNPq. Discussions at the symposium “Changing species associations in a changing world: a Marcus Wallenberg symposium” (MWS 2015.0009 to SN) improved the manuscript.

## Author information

### Affiliations

1. Department of Zoology, Stockholm University, Stockholm, 10691, Sweden

   Mariana P. Braga, Christopher W. Wheat, Sören Nylin & Niklas Janz

2. Departamento de Ecologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, 05508-900, Brazil

   Paulo R. Guimarães Jr

### Contributions

M.P.B., N.J. and S.N. conceived the study; S.N. and C.W.W. provided the data; M.P.B. and P.R.G. designed the analyses; M.P.B. analysed the data; M.P.B. wrote the paper with input from all the other authors.

### Competing interests

The authors declare no competing interests.

### Corresponding author

Correspondence to Mariana P. Braga.