Mosaic patterns of diversification dynamics following the colonization of Melanesian islands

The fate of newly settled dispersers on freshly colonized oceanic islands is a central theme of island biogeography. The emergence of increasingly sophisticated methods of macroevolutionary pattern inference paves the way for a deeper understanding of the mechanisms governing these diversification patterns on lineages following their colonization of oceanic islands. Here we infer a comprehensive molecular phylogeny for Melanesian Exocelina diving beetles. Recent methods in historical biogeography and diversification rate inference were then used to investigate the evolution of these insects in space and time. An Australian origin in the mid-Miocene was followed by independent colonization events towards New Guinea and New Caledonia in the late Miocene. One colonization of New Guinea led to a large radiation of >150 species and 3 independent colonizations of New Caledonia gave rise to about 40 species. The comparably late colonizations of Vanuatu, Hawaii and China left only one or two species in each region. The contrasting diversification trajectories of these insects on Melanesian islands are likely accounted for by island size, age and availability of ecological opportunities during the colonization stage.

1. The Australian biota is comparably ancient and the source of colonization events to surrounding islands. Due to the mid-Miocene aridification in Australia, diversification rates might have decreased as shown in a different group of diving beetles 29 . 2. Colonization of New Caledonia was rare and diversification rates should have decreased as available habitats were mainly being filled by cladogenetic speciation. 3. The New Guinean biota comes from multiple colonizations events and diversification rates should be constant or increasing as the island is continuously increasing in area, altitude and topographic complexity since the mid-Miocene.

Results
Altogether, the Maximum Likelihood (ML) and Bayesian Inference (BI) topologies respectively resulting from the RAxML and MrBayes analyses were congruent. We recover all Copelatinae genera as monophyletic with strong support in both BI and ML (  (Fig. 1). The best biogeographical model recovered under the program BioGeoBEARS (Table 2) is the DEC+ J model (AIC = 93.94, − logL = 43.97) implementing a founder-effect component presented in Fig. 1, although it is not significantly better than the DIVALIKE+ J model (AIC = 93.97, − logL = 43.98). The only difference in the biogeographical pattern recovered is the geographic range at the ancestor of C5 and C6, which is Australia in DIVALIKE+ J and Australia + New Guinea in DEC+ J. Overall, all models implementing a founder-effect parameter were significantly better than the same models without this component (Table 2). Our results suggest an Australian origin of Exocelina followed by two independent colonizations of New Caledonia during the late Miocene respectively in C1 and C4 roughly between 5 and 10 Ma. The colonization of New Guinea out of Australia are suggested to have occurred at the same time. The colonizations of China, Hawaii and Vanuatu were suggested to be recent more or less long-distance dispersal events out of New Caledonia in C4.
The best model of diversification rate dynamics recovered in the TreePar analysis is a model with a unique diversification shift (− logL = 199.856, p-value = 0.008 compared to the constant rate model) in the mid-Pliocene around 4.1 Ma (Fig. 2, Supplementary Table S1). This model is significantly better than a constant-rate model or other diversification models with varying rates. Using a diversity of 300 species instead of the 198 known to us, we obtain similar results with a significantly better model including a unique shift around 4.1 Ma. Under this model Exocelina would have had an initially slow diversification rate between 15.35 Ma and 4.1 Ma before entering a stage of high diversification that is still ongoing. The model recovers a slight turnover in the early evolution of the group and no turnover after the diversification rate shift.
The diversification rate analyses conducted in the program BAMM (Bayesian Analysis of Macroevolutionary Mixtures) also recovered one significantly supported shift localized in C4 at the root of the New Guinean radiation (Fig. 2). Our clade-specific analyses are presented in Figs 3 and 4. Although the reduced Australian and New Caledonian clades are found with a similar speciation rate (respectively λ = 0.324 and λ = 0.320), the New Guinean radiation has a much faster speciation rate (λ = 0.785) whereas the global Exocelina radiation has an intermediate value (λ = 0.497) (Fig. 3). As highlighted in Figs 3 and 4, the New Guinean radiation is responsible for a large part of the speciation rate under the BAMM model with a calculated speciation rate for Exocelina of λ = 0.328 without the New Guinean radiation.
The best model recovered in the diversification rate analyses conducted in MuSSE (Multiple State Speciation Extinction) regarding the diversification pattern of the islands suggests different speciation rates but equal extinction and transition states (AIC = 648.46, − logL = 318.23, Supplementary Table S2). As a result, the best model supports the hypothesis of different speciation rates depending on the island the beetles are or have colonized. The second best model recovered is similar to the best model but it also relaxes the extinction parameter so that the different islands have both different speciation and extinction rates (AIC = 653.16, − logL = 317.58). One model (λ 1, μ 1 and q1 free) consistently crashed in the program R and as result was not included in the global comparison of models although considering the likelihood and AIC scores of the closest models it seems unlikely that this model might have been the best.

Discussion
We confirm an Australian origin for the genus Exocelina 28 and refine the temporal sequence during which these beetles colonized surrounding regions and diversified. The ancestor of the Exocelina radiation originated in Australia in the mid-Miocene. Our results are in agreement with previous studies using different calibration priors to obtain absolute divergence ages. Balke et al. 28 found an origin of Exocelina at 23.34 Ma (95% HPD: 20.8-25.9 Ma) using a calibration on the node connecting the stygobiont species E. abdita and its sister lineage accounting for the timing of underground aquifer invasion in Australia. More recently Toussaint et al. 26     setting 14 . The best model suggests colonization of New Caledonia and New Guinea in the late Miocene within roughly 10 Myr. Additionally, recent taxonomic work revealed one lentic species (E. baliem) in New Guinea which we were unable to sequence but places together with the Australian E. ferruginea in C2 based on morphology 30 . Likewise, the New Caledonian fauna comprises the lentic E. inexpectata known from the holotype only and that is closely related to two above species based on an apomorphy from the male genital (paramere shape) therefore suggesting an additional colonization event of New Caledonia 30  For New Caledonia, a scenario of long-distance dispersal caused by climatic disruptions might be more suitable with respect to the distance separating the island from Australia. Indeed, the only land bridge hypothesized between Australia and New Caledonia has been suggested to have existed until the Paleocene/Eocene 32 . This largely predates the colonization events out-of-Australia in our case. Long-distance dispersal events have rarely been documented in diving beetles but the distribution of some endemic clades and species in remote oceanic archipelagoes (e.g. Hawaii, Tristan da Cunha) seem to indicate that these beetles are good dispersers. Understanding if these events are the result of active (direct flight) or passive dispersal (e.g. as adults by rafting on tree logs or wind dispersal/as larvae in soil in which they pupate/as eggs in aquatic plant stems and leaves) would certainly bring new insights into our understanding of biogeographical and diversification processes in dytiscids.
All diversification rate analyses indicate a single shift in the evolution of the genus (Fig. 2). The BAMM analyses indicate that this shift took place when Exocelina colonized New Guinea in the late Miocene. This event comes comparably late in the evolution of Exocelina, as New Caledonia had already been colonized twice despite its more remote location.
Our results reject some of our initial hypotheses as well as predictions from the theory of island biogeography and reveal some interesting patterns for the region: The two distinct New Caledonian clades do not exhibit particularly fast diversification rates compared to the New Guinean radiation. Interestingly, the diving beetle fauna of New Caledonia is rather diverse, with Exocelina accounting for more than 50% of its species, but there are no ecologically similar species 33 . This might indicate either that these beetles were able to diversify into vacant niches at the time of their arrival on the island, or that they were extremely efficient competitors that eliminated pre-existing indigenous taxa occupying the same niche. Copelatus are phylogenetically and ecologically close but only one stagnant water species occurs in New Caledonia 33 . Other genera are either not associated with running-water or ecologically well differentiated even in terms of body size and shape. Therefore, they were unlikely to represent competitors for niche filling. This is supported by the fact that Exocelina diving beetles are absent from Fiji where Copelatus diving beetles are quite diverse occupying the same type of habitat as Exocelina 34 . In addition to vacant niches, the association of these beetles with running-water habitats might have fuelled their diversification through local allopatric speciation 35,36 . Micro-endemism is likely a dominant process accounting for the diversification of Exocelina 26 . The ancestor of the second and larger running-water clade C4 might have brought new ecological preferences, helping them to compete and diversify on the island.
One colonization of New Guinea gave rise to a radiation of certainly more than 150 riparian species. This diversification has been very fast (Figs 1-4) and serves as a prime example of in situ speciation as a source of (island) diversity 8 . This result is in contradiction with theoretical predictions as most of the diversity results from a unique colonization event followed by cladogenetisis rather than from multiple colonization events. As suggested by Toussaint et al. 26 , there is an intimate relationship between the massive orogeny of New Guinea and the diversification processes responsible for this diversification. The association of micro-endemism with high elevational gradients along the central Cordillera might have supported ecological (perhaps montane) speciation 37,38 along elevational ecological gradients and/or allopatry in a complex vertical and horizontal setting. New Guinea Exocelina occur from almost sea level up to at least 2800 m 26 . Especially towards lower altitudes, they compete with ecologically very similar but slightly larger New Guinea Copelatus species (Balke pers. com.), which might explain the lack of larger New Guinea Exocelina species.
Our results reveal heterogeneous post-colonization diversification patterns on different Melanesian islands. Colonizers from the Australian source faced different levels of competition, niche preemption and habitat availability with these different parameters being shaped by the age and distance from the source of the island being colonized. Here we show that on the very recent and close landmass of New Guinea, a unique colonization event resulted in one of the most extensive beetle radiations of the region. The recent nature of that colonization and fast ongoing diversification are reflected by rather homogeneous morphology and lineage accumulation below saturation. Roughly 3,000 kilometers southeast, New Caledonia exhibits a rather different evolutionary history. Colonizers have reached the island on multiple occasions but unlike New Guinea, the island at the time of colonization hosted a variety of habitats and a diverse biota derived either from old Gondawanan stocks and/or from relatively old colonizers that settled in New Caledonia after its debated submergence [20][21][22][23][24] . If that was the case, the new colonizers were able to outcompete such fauna, and diversify extensively. Competition with a second Exocelina colonization might have fostered the extinction of the species of the older clade and pushed survivors to more extreme high altitude habitats. Our results broadly embrace the predictions of the theory of island biogeography but also bring new insights to our understanding of Melanesian biogeography and macroevolutionary patterns and processes. The study of well-established systems such as Exocelina should be moved to the next level where the use of comprehensive local sampling across environmental gradients and population genomics will help to reveal the actual ecological and evolutionary mechanisms of speciation on these islands 8 .

Molecular phylogenetics.
We used Bayesian Inference (BI) and Maximum Likelihood (ML) to reconstruct phylogenetic relationships using a concatenated dataset. The partitions and corresponding optimal models of substitution were searched under PartitionFinder 1.1.1 41 using the 'greedy' algorithm. Both 'mrbayes' and 'raxml' sets of models were used in combination with the Akaike Information Criterion corrected (AICc) to compare the fit of the different models of substitution. All protein-coding gene fragments were divided by codon positions and non-coding gene fragments were kept alone. Since it is not currently possible to specify different models of substitution for different partitions in RAxML, we used the partitions selected in PartitionFinder with a GTR+ Γ + I model. The BI analyses were performed using MrBayes3.2.2 42 . Instead of selecting the substitution models a priori based on the results of PartitionFinder, we used the different partitions recovered but used reversible-jump Metropolis-coupled Markov chain Monte Carlo to explore the entire space of substitution models 43 . Two simultaneous and independent runs consisting of eight MCMC (one cold and seven incrementally heated) running 100 million generations were used, with a tree sampling every 5000 generations to calculate posterior probabilities (PP). We assessed convergence of the runs by investigating the average standard deviation of split frequencies and Effective Sample Size (ESS) of all parameters in Tracer1.5 (http://BEAST.bio.ed.ac. uk/Tracer). A value of ESS > 200 was acknowledged as a good indicator of convergence. All posterior trees that predated the time needed to reach a log-likelihood plateau were discarded as burn-in, and the remaining samples were summarized to generate a 50% majority rule consensus tree. The ML analyses were conducted with the best partitioning scheme selected in PartitionFinder 1.1.1 41 using RaxML 44 . We performed 1000 Bootstrap replicates (BS) to investigate the level of support at each node'.

Divergence time estimation.
Divergence times were inferred with BEAST 1.8.0 45 . The partitions and models of nucleotide substitution were selected under PartitionFinder 1.1.1 41 using the 'greedy' algorithm, the 'beast' set of models and the AICc. We tested the applicability of a molecular clock for both datasets using Paup* 46 , and since it was significantly rejected (P < 0.001), we used a Bayesian relaxed clock allowing rate variation among lineages as implemented in BEAST. In order to calibrate the tree, we used an amber fossil of the genus Copelatus. Copelatus aphrodite † is an extinct species described based on a very-well preserved female specimen embedded in Baltic amber 47 . Because of its exceptional condition, the authors were able to place this species with confidence in the genus Copelatus even though it was not assigned to an extant species-group 47 . Because we sampled a substantial number of Copelatus species-group to capture a part of the variability in this diverse genus, we placed the fossil at the crown of Copelatus. Evidences from stratigraphic 48 and K-Ar radiometric studies 49  million years (Myr) at the root, an age equivalent to the oldest Dytiscid fossil known 51 . As a result, a minimum age of 44.0 Myr was placed at the crown of Copelatus with a soft exponential distribution prior (95% of the distribution between 44.0 and 154.8) as advocated for fossil information 52 . The runs performed under a Birth-Death process consisted of 100 million generations sampled every 2000 generations. The convergence of the runs was investigated using ESS, a conservative burn-in of 25% applied after checking the log-likelihood curves and the different runs merged using LogCombiner 1.8.0 45 . The maximum credibility tree, median ages and their 95% highest posterior density (HPD) were generated afterwards under TreeAnnotator 1.8.0 45 . Ancestral range estimation. We used BioGeoBEARS 53 as implemented in R to infer the biogeographical history of Exocelina diving beetles across their entire range of distribution. This program allows estimation of ancestral ranges under different models such as DEC 11,12 , a likelihood interpretation of DIVA 54 (DIVALIKE) or a likelihood interpretation of BayArea 13 (BAYAREALIKE). Additionally, it implements a parameter describing founder-event speciation (+ J, post-colonization cladogenesis) likely important in oceanic settings 14 Table S4).

Diversification analyses.
We used the program R with the BEAST MCC tree (Supplementary Information S5) from which we pruned all outgroups to investigate the diversification pattern of Exocelina diving beetles in a temporal framework whilst accounting for missing taxon sampling. We used different R packages to test several hypotheses regarding the evolution of Exocelina diving beetles in Oceania.
First, we used the package TreePar 55 to estimate the potential shifts in speciation and extinction rates in the whole phylogeny through the function 'bd.shifts.optim' (Supplementary Information S7). This function uses the empirical branching times from the MCC tree as an input and fits several birth-death models including 0 (constant-rate model) to several diversification rate shifts during the lineage evolution. We tested different models ranging from 0 to 5 rate shifts. All analyses were carried out with the following non-default settings: taxon sampling was set to 164/210, start = 0, end = 15 and grid = 0.1 Myr for a fine-scale estimation of rate shifts. Since New Guinea in particular might hold a large layer of unknown diversity, we also acknowledged a possible bias in clade species-richness using a total putative diversity of 300 species. We calculated AICc scores and computed Likelihood Ratio Tests (LRT) to select the best-fit between the different models allowing incrementally more shifts during the evolution of the clade.
Second, we used Bayesian Analysis of Macroevolutionary Mixtures (BAMM 56 ) and its R implementation BAMMtools 57 to identify clades with higher or lower speciation rates in the phylogeny ( Supplementary Information S8). BAMM relaxes the assumption of time-homogeneous diversification that underlies alternative approaches such as the MEDUSA model 58 . It incorporates incomplete taxon sampling directly into likelihood calculations; here we assumed that our tree contains 83.0% of Exocelina diving beetle species. We performed multiple BAMM runs on the MCC Exocelina phylogeny, with 5 million generations of Markov Chain Monte Carlo (MCMC) sampling per run and sampling evolutionary parameters every 1000 generations. We reconstructed marginal distributions of net diversification rates for each branch in the BEAST MCC tree without outgroups using the posterior distribution of evolutionary parameters sampled using the reversible jump MCMC algorithm in BAMM. Finally, we reconstructed rate-through-time curves for the whole genus with or without New Guinea, but also individually for Australia, New Caledonia and New Guinea from the joint posterior density of parameters simulated with BAMM. Since it is not feasible to reconstruct plots for paraphyletic clades we considered only the large clades of Australian and New Caledonian species (including Vanuatu) therefore excluding 10 species from further calculations.
Third, we tested if diversification on the three main islands of Australia, New Caledonia and New Guinea was different throughout the evolution of the genus. To do so, we used the Multiple State Speciation Extinction (MuSSE 59 ) model implemented in the Diversitree package 59 (Supplementary Information S9). Three parameters for each state are included in this analysis; a speciation rate (λ ), an extinction rate (μ ) and a transition rate between different states (q). In order to test the hypothesis of different speciation rates between Oceanian islands, we used four states: Australia, New Caledonia, New Guinea and other (China, Hawaii, Vanuatu). The MuSSE model allows testing a wide array of parameter combinations (e.g. all speciation rates equal, all mutation rates different, some transition rates equal and some others different). We tested 36 different models and then compared their likelihood using AICc. We estimated posterior density distribution with Bayesian MCMC analyses (10,000 steps) performed with the best-fitting models and the resulting speciation, extinction and dispersal rates.
Finally, we used the package ape 60 to infer lineage-through-time (LTT) plots for the New Guinean radiation as well as for the two main clades considered in the BAMM analyses using 100 pruned BEAST posterior trees.