Replying to Daniel Berner. Nature Communications https://doi.org/10.1038/s41467-021-23092-1 (2021)

A Matters Arising article1 raised concerns about the interpretation of the findings reported in our recent publication on admixture-facilitated ecological speciation in Lake Constance stickleback2. After careful consideration of the criticism, including additional analyses testing the proposed alternative hypotheses, we remain confident in our inference of secondary contact between a West European and an East European stickleback lineage in the catchment of Lake Constance, and in our conclusion that this admixture facilitated the ecological divergence between lake and stream ecotypes within Lake Constance2.

In particular, Berner1 (i) questioned whether West and East European stickleback populations should be considered as divergent lineages, (ii) suggested that Lake Constance stickleback originated from the upper Danube instead of East Europe, (iii) questioned the suitability of our demographic modelling approach to reject an ‘ecological vicariance’ scenario, (iv) proposed that divergent selection within Lake Constance biased our inference of a secondary contact and admixture scenario, and (v) criticized our conclusion on admixture-facilitation of ecological speciation as premature. We address each of these concerns in this sequence.

Divergent West and East European lineages

The deepest divergence among European threespine stickleback is between the Trans-Atlantic and South European clades3,4,5,6, with West and East European and Lake Constance stickleback part of the former, as we had clearly stated2. Within the Trans-Atlantic clade, hierarchical subclades exist that are structured by geography with divergence times estimated by others between 37 and 6.5 ky6,7, consistent with isolation in distinct glacial refugia3 or between different river catchments colonized during postglacial range expansion6,7. We referred to these as “divergent lineages”, consistent with our demographic modelling based estimates of ~8000 years between West (Rhine/upper Rhone) and East (Vistula) European populations2 (assuming 2 years generation time based on the average lifetime reproductive age rather than the age of first reproduction8) and high genomic differentiation (Fig. 5a in2). The limited bootstrap support for a reciprocally monophyletic West European stickleback clade in a new phylogenetic analysis of Berner1 (grey rectangle in Fig. 1 in1) is irrelevant to the argument and a consequence of the inclusion of hybrid populations (e.g., Lake Constance, upper Danube populations, see below) reducing internal branch bootstrap support2,9. The new analyses of Berner1 thus do not contradict our interpretation that stickleback populations from West and East European river catchments are old and divergent lineages2.

Fig. 1: Signatures of secondary contact and admixture are not artefacts of parallel evolution.

As uniquely predicted by the hypothesis of admixture between Rhine / upper Rhone and East European lineages in Lake Constance2, Constance stream stickleback show much stronger excess allele sharing with Rhine and upper Rhone populations than with other stream-adapted stickleback from West and South Europe. In contrast, a parallel evolution bias would predict excess allele sharing between all stream-adapted stickleback populations with Lake Constance stream stickleback, compared to the lake ecotype. a Map of European stickleback populations used in tests for excess allele sharing with colours indicating population identity, with (b) inset showing the Lake Constance region where colours indicate lake or stream habitat. c Predicted and observed patterns of excess allele sharing between West European stream-adapted stickleback (P3) and Lake Constance stream stickleback (P2), compared to Lake Constance lake stickleback (P1). Coloured dots show the estimate of Patterson’s D for each comparison and whiskers ±3 standard errors derived from a standard block-jackknife procedure. Gasterosteus wheatlandi (n = 1) was used as outgroup (O in the tree) in all tests for excess allele sharing, and all topologies are supported by P1 / P2 showing the highest shared derived allele count (number of “BBAA” patterns). Brackets next to population abbreviations give the number of individuals per population used in each test (e.g., n = 6 individuals for population FRS4). Watershed maps are derived from “Water Base: Global River Basins” by The World Bank used under CC BY 4.0, river and lake maps from “European catchments and Rivers network system (Ecrins)” by the European Environment Agency (EEA). Source data are provided as a Source Data file.

Origin and timing of colonization

According to Berner1, phylogenetic clustering of Lake Constance stickleback with upper Danube stickleback suggests a natural colonization of Lake Constance through its postglacial connection to the Danube 15-10 ky ago10, as shown for other fish species with Danubian or mixed Danubian and Rhine ancestry and native to Lake Constance11,12,13. However, the history of hybrid lineages cannot be resolved on the sole basis of a phylogeny from concatenated markers. Evidently, the two individuals from the Lake Constance GRA population in Berner’s analysis cluster in two different clades (Fig. 1 in1, Supplementary Fig. 1d), which is consistent with our previously estimated 50:50 West vs. East European ancestry2. We previously assumed that upper Danube stickleback are hybrids from multiple introductions including West European lineages14 and thus excluded them from our initial analysis2. Now, we have tested this assumption: upper Danube stickleback indeed show an admixture signature between West and East European populations just like Lake Constance stickleback (Supplementary Fig. 2), including phylogenetic analyses clustering them with either West or East European lineages depending on the context of what other lineages are included in a tree (Supplementary Fig. 1e). An admixed West / East European origin of both Lake Constance and upper Danube stickleback thus explains their similarity and placement in a phylogeny dominated by hybrid populations (see1, Supplementary Fig. 1d).

When and how stickleback got into Lake Constance and into the upper Danube remains difficult to resolve with genomic data alone, considering the complexity of demographic models2,15 and uncertainties in underlying mutation rates. In particular, the inference of split times based on the site frequency spectrum (SFS) relies on a known mutation rate or a known effective size16. We used a gene alignment based17 mutation rate of 1.7e-8, which is more reliable than the arbitrary estimate of 6.8e-8 obtained from an SFS-based demographic analysis without fixing any population size or divergence time parameters18. Fortunately, a rich ichthyological record of the Danube and Lake Constance documents the historical absence and only recent colonization of threespine stickleback in both regions14,19,20,21,22,23,24,25,26,27,28,29 as well as known anthropogenic introductions14,23,29, inconsistent with a natural postglacial colonization of Lake Constance from the Danube. Geographically highly resolved historical and contemporary data on lateral plate morph distributions in Europe14,23,24,30,31,32,33,34 further support arrival in Lake Constance from multiple sources, in line with genomic signatures of secondary contact and admixture between several lineages2 (Supplementary Fig. 2). Considering all evidence, a rather recent formation of a hybrid lineage in secondary contact in the Lake Constance catchment, facilitated by multiple introductions, appears to be the most parsimonious scenario, in line with our earlier interpretation2.
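The dependence of split-time estimates on the assumed mutation rate can be illustrated with a minimal sketch. It assumes a diffusion-style parameterisation in which theta = 4 * N_ref * mu * L and split times are expressed in units of 2 * N_ref generations; this parameterisation, the function name and the values of tau, theta and sequence length are hypothetical choices for illustration, not the settings or software of the original analyses. Only the two mutation rates and the 2-year generation time are taken from the text.

```python
def scaled_time_to_years(tau, theta, mu, seq_len, gen_time=2.0):
    """Convert a scaled split time to years.

    Assumes theta = 4 * N_ref * mu * seq_len and tau measured in
    units of 2 * N_ref generations (an illustrative convention).
    """
    n_ref = theta / (4.0 * mu * seq_len)   # reference effective size
    generations = 2.0 * n_ref * tau        # split time in generations
    return generations * gen_time          # split time in years


# Hypothetical scaled estimates shared by both conversions.
tau, theta, seq_len = 0.05, 1000.0, 1_000_000

years_mu_low = scaled_time_to_years(tau, theta, 1.7e-8, seq_len)
years_mu_high = scaled_time_to_years(tau, theta, 6.8e-8, seq_len)

# The same scaled estimate maps to a fourfold older split under the
# fourfold lower mutation rate.
ratio = years_mu_low / years_mu_high
print(f"ratio of inferred split times: {ratio:.2f}")
```

Under these assumptions the absolute split time scales inversely with the mutation rate, so the choice between 1.7e-8 and 6.8e-8 alone shifts any inferred date by a factor of four.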

Demographic modelling of “ecological vicariance”

Berner1 objected that neutral demographic models cannot adequately represent an “ecological vicariance” scenario in which divergent selection is a central component18. If selection-driven divergence indeed precluded inference of population history, earlier interpretations of the origin and timing of Lake Constance stickleback from a phylogeny and demographic model1,18 would be flawed too. But divergent selection affects genomes only locally at and around targets of selection while much of the genome evolves under neutrality, even though the extent of either might be debated35,36,37. Both empirical18,38 and experimental data39,40 in stickleback support the view that only a minority of the genome is affected by divergent selection, with the majority preserving information on the neutral history that can be harnessed in demographic modelling or phylogenetic analyses. The “ecological vicariance” hypothesis is thus certainly testable with neutral demographic models, especially because it makes a clear demographic prediction: colonization by a single lineage followed by primary divergence between ecotypes within this lineage. In our test of these neutral predictions, we were able to clearly reject an “ecological vicariance” scenario in favour of better-fitting alternative secondary contact and admixture scenarios2, which had not been considered previously18.

Parallel evolution vs. inference of secondary contact

Berner1 raised an interesting possibility that parallel evolution could have biased our inference of secondary contact and admixture both in our demographic analyses and our other population genomic analyses2. Specifically, parallel evolution due to selection on the same alleles in similar habitats would lead to support for demographic models with admixture between independently evolved stream-adapted stickleback and signatures of excess allele sharing between such stream-adapted stickleback relative to lake-adapted stickleback closely related to one of them (Fig. 1c). In our demographic modelling, we controlled for local effects of selection by removing low recombination regions from the analysis2, with the rationale that effects of background and divergent selection on linked neutral variation are strongest in regions of low recombination while highly recombining regions behave mostly neutrally36. Berner1 questioned the efficacy of such a control due to the lack of a correlation between divergent selection and recombination rate in whole-genome data40. However, we used sparser RAD-sequencing data in our demographic modelling that does show an enrichment of divergent selection signatures in low recombination regions18,38 as predicted for low marker densities41,42. Our approach to avoid effects of selection is thus justified and should preclude effects of a possible parallel evolution bias on our demographic inference.

To confirm that our analyses of excess allele sharing are not affected by a parallel evolution bias, we now included additional stream-adapted West and South European populations that we do not expect to have contributed to the gene pool in Lake Constance. Such populations should show the same excess allele sharing with stream-adapted stickleback of Lake Constance under the parallel evolution hypothesis but not under the admixture hypothesis (Fig. 1c). We found that excess allele sharing signatures were much stronger for, or entirely confined to, Rhine and upper Rhone stickleback than for other stream populations (Fig. 1c), consistent with admixture but not with parallel evolution. Furthermore, all Lake Constance stickleback show signatures of admixture between Rhine / upper Rhone and East European lineages independent of habitat / ecotype (Supplementary Fig. 2 and Supplementary Fig. 2 in2), as predicted by admixture but not by a parallel ecotype evolution bias. Additionally supported by evidence for admixture from the mtDNA phylogeography2,3, we can confidently reject the hypothesis that parallel evolution caused false admixture signatures in Lake Constance stickleback2.

Evidence for admixture-facilitation of ecological speciation

Our conclusion on admixture-facilitation of ecological speciation was deemed premature by Berner1, barring a more rigorous demonstration of the absence of adaptive alleles in the source populations or tracing the origin of haplotypes back to source populations. The former would be a challenging task due to uncertainty about the exact introduction or colonization routes, more recent admixture in populations along that colonization route and trade-offs between sequencing more individuals and full genomes. We welcome future research, such as haplotype-based reconstruction. Nonetheless, we believe that our evidence of divergent sorting between habitats of admixture-derived alleles in genomic regions under selection2 does already lend significant support to admixture-facilitation of ecological speciation.

Methods

We added to our SbfI (+PstI) RAD-sequencing dataset2 previously published data from three upper Danube populations (DAN1; SZO6; MUR6), one Lake Constance stream population (GRA1) and one West European stream population (KIN6; see data availability statement below for accessions) and repeated read alignment, variant and genotype calling and filtering with the same parameters used in our previous analysis2. We also repeated the addition of outgroup alleles to the resulting SNP dataset, using the Black-spotted stickleback Gasterosteus wheatlandi genome43 as fixed outgroup in computations of Patterson’s D-statistic44 with Dsuite v0.345. We used D-statistics to test (i) whether stream-adapted stickleback populations from West and South Europe, regardless of admixture history, show excess allele sharing with stream-adapted populations from Lake Constance, relative to the lake ecotype (Fig. 1c), and (ii) whether both Lake Constance and upper Danube stickleback show signatures of admixture between West and East European stickleback lineages (Supplementary Fig. 2). We used D-statistics of four-taxon topologies with the highest number of shared derived alleles (‘BBAA pattern’45), a two-tailed standard block-jackknife procedure implemented in Dsuite with default parameters, and considered p-values corrected for false discovery rate46 in R v4.0.247 below 0.01 as indicating significant excess allele sharing. We also repeated previous phylogenetic analyses2 with the three upper Danube populations and the additional Lake Constance stream population (GRA) included (Supplementary Fig. 1d), as well as for subsets excluding all Lake Constance and upper Danube populations (Supplementary Fig. 1c), including only one Lake Constance or upper Danube population or including one Lake Constance and upper Danube population each (Supplementary Fig. 1e), with filtering and parameters as used previously2.
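The logic of the excess allele sharing test can be sketched as follows. This is a simplified illustration, not the Dsuite implementation: it assumes per-block ABBA and BABA counts are already available, uses an equal-weight delete-one block jackknife, converts the resulting Z-score to a two-tailed p-value under a normal approximation, and applies the Benjamini-Hochberg procedure as one common false discovery rate correction (the text above specifies only that a false discovery rate correction was applied). All function names and example counts are hypothetical.

```python
import math


def patterson_d(abba, baba):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


def block_jackknife_d(abba_blocks, baba_blocks):
    """Return (D, SE, Z) from an equal-weight delete-one block jackknife."""
    n = len(abba_blocks)
    tot_abba, tot_baba = sum(abba_blocks), sum(baba_blocks)
    d_all = patterson_d(tot_abba, tot_baba)
    # Recompute D with each genomic block left out in turn.
    loo = [patterson_d(tot_abba - a, tot_baba - b)
           for a, b in zip(abba_blocks, baba_blocks)]
    mean_loo = sum(loo) / n
    se = math.sqrt((n - 1) / n * sum((x - mean_loo) ** 2 for x in loo))
    z = d_all / se if se > 0 else float("inf")
    return d_all, se, z


def two_tailed_p(z):
    """Two-tailed p-value for a Z-score under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-block counts for one (P1, P2, P3, outgroup) test.
d, se, z = block_jackknife_d([12, 9, 11, 10, 8], [5, 6, 4, 5, 5])
print(f"D = {d:.3f}, SE = {se:.3f}, Z = {z:.2f}, p = {two_tailed_p(z):.3g}")
```

With P1 the lake ecotype, P2 the Lake Constance stream ecotype and P3 a candidate donor population, a significantly positive D under this sign convention indicates excess allele sharing between P2 and P3 relative to P1.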

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.