Introduction

While the biogeography of terrestrial habitats has been studied for centuries1, our understanding of insular freshwater habitats remains poorly understood. Before genetic methods, freshwater animals were considered as exemplars of cosmopolitanism2. Later, the slogan “Everything is everywhere, but the environment selects” emerged3. For decades, cosmopolitanism reigned as the universal paradigm of freshwater biogeography4. Then, in the 1970’s, as cladocerans became developed as model organisms, non-cosmopolitanism gained favor5,6. But, with the recent discovery that freshwater anthropogenic introductions have been widespread7,8,9, the diagnosis of disjunct populations became problematic. Moreover, regional hotspots seemed to contain cryptic diversity for some taxa such as cladocerans10. The breeding system of most cladocerans, cyclic parthenogenesis, appeared to provide dispersal advantages. Ultimately, discerning recent introductions from relictual insular populations remains difficult without additional evidence from genetic markers. Fortunately, genetic analysis is now possible for testing biogeographic hypotheses for all sizes of freshwater animals.

The “ejected relict” scheme initially proposed by Wallace1 (see review by Eskov11), and later adapted for the freshwater Cladocera (Crustacea) by Korovchinsky10 has been used to explain latitudinal diversity centers. According to this hypothesis, a step-by-step mass extinction from the middle Caenozoic to recent times resulted from climate aridization. Refugia were located in the subtropics and the adjacent temperate zones in both the northern and southern hemispheres. Today these aridity refugia represent centers of cladoceran endemism: e.g. the Mediterranean zone, the Iberian Peninsula, and the southernmost extremity of Africa10.

Importantly, Pleistocene glacial oscillations have also had a strong regional effect on extinction and the current biogeography of Holarctic plants and animals12. Evidence for a biogeographic effect of glaciation in freshwater taxa has been reported in fishes13,14,15, freshwater insects16, molluscs17, and aquatic plants18. Similarly, cladoceran biogeography may have been markedly affected by Pleistocene glaciation in North America19,20 and in Europe21,22. However, large scale geographic studies are rare23,24,25 and some large regions such as the Eastern Palearctic are poorly studied26,27,28 compared to other regions of the Holarctic. Even less is known of the interactions of ecological and evolutionary processes that contributed to the biogeography of Eastern Palearctic cladocerans29,30,31,32.

Recently, evidence has emerged that the temperate Eastern Palearctic also represents a zone of cladoceran endemism33. Several endemics (or possible endemics) have been found here such as: Macrothrix pennigera Shen, Sung & Chen, 1961 in NE China34, Ilyocryptus uenoi Kotov & Tanaka, 2004 and Diaphanosoma kizakiensis Korovchinsky & Tanaka, 2013 in Japan35,36, Daphnia sinevi Kotov, Ishida & Taylor, 2006 and Chydorus irinae Smirnov & Sheveleva, 2010 in the Far East of Russia37,38 and Pleuroxus jejuensis Jeong, Kotov & Lee, 2013 in Korea39. Genetic evidence has been important in the establishment of the special zoogeographic status of the Eastern Palearctic24,25,27,28. Although some phyloclades are shared with the Western Palaearctic30, numerous endemic Eastern phyloclades have been detected. Some of these lineages are described as separate taxa40,41, but many others are waiting for such a description27,42. Kotov33 concluded that the Eastern Palearctic contains the highest degree of cladoceran endemism in the world.

Here we use genetic markers to test geographic boundaries of lineages in the widespread cladoceran, Daphnia curvirostris s.l. (Cladocera: Anomopoda: Daphniidae). Recently, three endemic species in the complex were described from the Palearctic: Daphnia tanakai Ishida, Kotov & Taylor, 2006 in Japan40, Daphnia sinevi Kotov, Ishida & Taylor, 2006 from Primorski Krai, Russia37, and Daphnia hrbaceki Juračka, Kořínek & Petrusek, 2010 from Central Europe43. We know little of the geographic boundaries of these species because there has been very limited geographic sampling.

Although the entire species complex is thought to be widespread in the temperate Palearctic (with rare occurrences in high altitude African lakes44), there are only four known sites from the Nearctic – all disjunct, see below. D. curvirostris was first reported in North America from three ponds in northwestern Canada45. It has not been reported from other locations in Canada since 1986. Duffy et al.46 then provided paleogenetic evidence that Daphnia curvirostris had invaded Onondaga Lake, New York and thrived from the 1950’s until the 1980’s when ephippia disappeared in the dated sediment record. The samples from NY showed 99% similarity in the mitochondrial 12SrRNA gene to sequences from two European specimens. A third disjunct population of Daphnia curvirostris was found in a thermally disturbed pond complex (Pilgrim Hotsprings) in northwest Alaska40. While clearly assignable to Daphnia curvirostris, the relationship of this Alaskan population to Palearctic populations remains unknown. Lastly, Nandini et al.47 recorded Daphnia curvirostris from Central Mexico and proposed an anthropogenic origin. However, there is presently no genetic comparison of the Mexican populations with potential sources in the Palearctic.

Here, we aim to better establish the geographic boundaries of the newly discovered Palearctic lineages and compare the disjunct Nearctic Mexican and Alaskan specimens to Palearctic specimens.

Results

Sequence variation

We collected genetic information for 194 individuals of the D. curvirostris complex from 51 populations across the Palearctic with a special focus on the Eastern Palearctic (Fig. 1; Supplementary Table S1).

Figure 1
figure 1

All sites from which diversity of the Daphnia curvirostris group has been analyzed (a), and a region of the group maximum diversity — Far East (b). The base map is the Marble Virtual Globe 1.5.1 “plain map” (i.e., no attributable data layers) available at https://marble.kde.org/.

The alignment of ND2 (mitochondrial NADH dehydrogenase 2) was unambiguous and each sequence had a single open reading frame. Only single codon deletions were observed within the complex. This deletion was present in a 10–codon long variable region of an Alaskan sequence that also had differing single codon deletions in the outgroup species and in D. hrbaceki. The best fit model for the ingroup ND2 alignment was TN + F + I + G4 which had a Bayesian Information Criterion score of 23451.8840. All but one of our analyses found the same root for the D. curvirostris complex (Figs 2 and 3; Supplementary Figs S1S3). For the ND2 alignments, three outgroup sequences for the nucleotide data failed the base composition homogeneity test provided by IQtree while no outgroup sequences for the amino acid data failed the same test.

Figure 2
figure 2

Maximum likelihood tree based on sequences of the mitochondrial ND2 gene representing the diversity among phylogroups of the Daphnia curvirostris group. The support values of individual nodes are based on maximum likelihood. Colors and shapes for main phylogroups correspond to those in Fig. 1.

Figure 3
figure 3

Maximum likehood tree based on sequences of nuclear HSP90 gene representing the diversity among clades of Daphnia curvirostris group. The support values of individual nodes are based on maximum likelihood. Colors and shapes correspond to mitochondrial phylogroups in Figs 1 and 2.

ND2 phylogenetic tree and geographic distributions

The entire D. curvirostris complex was monophyletic and divided into five major clades which may represent species or species groups (Fig. 2). Major clade 1 (Daphnia curvirostris s.str.) had three geographic subclades (subclades are lettered from A-C, red in Figs 1 and 2). Subclade A had a broad distribution in the Northern Palearctic from France to Eastern Siberia (Yakutia; Fig. 1). The Mexican specimens were nested within subclade A, supporting the Palearctic origin hypothesis. A more basal Subclade B within D. curvirostris was detected in Eastern Siberia (Yakutia), the continental portion of the Far East of Russia and Sakhalin Island (but absent from the Japanese samples). Another basal lineage was subclade C which was found only in a single locality (Pilgrim Hotsprings) in northwestern Alaska. Major clade 2 (D. hrbaceki) contained the type locality population (a recently created pool in Central Europe) of D. hrbaceki (subclade D, black circle in Fig. 2). The remaining major clades were restricted to the eastern Palearctic. Major clade 3 (also a described species, D. tanakai) was represented by a single subclade E (grey circle in Fig. 2), and was restricted to mountain lakes in Japan. Major clade 4 (D. sinevi, blue in Fig. 2) was ranged from the continental Far East of Russia to Sakhalin Island. Clade 4 contained three more regionally distributed subclades: subclade F was endemic to Sakhalin Island; subclade G was found in a single population in Primorie (the continental Far East of Russia); and subclade H was relatively widely distributed in the continental Far East of Russia. Major clade 5 (termed Daphnia sp. nov. here to denote a putative new species, green in Fig. 2) was present in the continental Far East of Russia, Japan and Korea, and was represented by four subclades, each with very restricted ranges: subclade I was found in a single locality on the continental Far East of Russia, the subclade J was found in a single lake in Japan, subclade K was present in a single locality in continental Korea, and subclade L was found in four localities on Jeju Island (South Korea). Major Clade 5 differed from the closest clade with an existing species name (D. sinevi) by an average of 19.2% for the ND2 sequence using a Jukes-Cantor corrected distance (Table 1).

Table 1 Average number of nucleotide substitutions per site with Jukes Cantor correction (Dxy(JC)) between taxa of the Daphnia curvirostris complex.

The protein based ND2 trees (Supplementary Figs S2 and S3) were very similar to the nucleotide tree with strong support for the five major clades and most of the minor subclades. One difference between the protein and nucleotide based trees was the switched positioning of the Alaskan D. curvirostris with the subclade B sequences – but both B and C remained basal to the geographically widespread A subclade of D. curvirostris. Notably, outgroup rooting with an amino acid alignment (Supplementary Figs S2 and S3) supported the same root for the curvirostris complex as found by the midpoint rooted nucleotide tree (Fig. 2).

HSP90 tree

Model-fitting determined three models for the partitions by codon position and intron/exon. For the first and second positions, TN + F + I was the best fit model, while K2P + G4 was the best fit model for third codon positions. For introns the best fit model was HKY + F + G4. The well supported branches of the HSP90 and ND2 trees were in agreement (Fig. 3). Five major clades with strong support were found (note that we lacked HSP90 sequences for D. hrbaceki, the major clade 2 of mitochondrial tree). Again, major clade 1 corresponded to the Daphnia curvirostris group. It contained two subclades, corresponding to the mitochondrial clades A + C (with moderate support) and clade B (with a strong support). Major clade 2 corresponded to D. tanakai, represented by sequences from two lakes in Japan. It contained no structure and corresponded to mitochondrial subclade E. Major clade 3 corresponded to D. sinevi. It contained two clades, corresponding to mitochondrial subclades G, and F + H. Major clade 4 corresponded to a single mitochondrial subclade J, a portion of the undescribed species complex Daphnia sp. nov. Major clade 5 corresponded to the rest of this proposed new species complex, mitochondrial subclades I + K + L. As with ND2, D. curvirostris-like taxa form a well-supported monophyletic clade with the same root.

ND2 median joining (MJ) haplotype network for Daphnia curvirostris s.str

The MJ network of Daphnia curvirostris contained 40 different ND2 haplotypes (Fig. 4). The three main divisions (A-C) found in the tree analyses for D. curvirostris were apparent in the network.

Figure 4
figure 4

Median-joining ND2 haplotype network. Median vectors are indicated by small black circles. Number of mutations for each branch is represented.

Subclade A was represented by a sub-network of closely-related haplotypes (the average number of nucleotide differences k = 3.8). Still only private haplotypes (no regional sharing) for the eastern and western Palearctic were observed. There was at least one star-like topology, suggesting recent population expansion. The central haplotype of this star contained European-Western Siberian and Mexican individuals (Tajima’s D = −1.55; P = 0.048; Fu’s Fs = −14.57; P < 0.001). Subclade B was represented by a group of haplotypes that lacked a star-like pattern (Tajima’s D = −0.65072, P > 0.10; Fu’s Fs = −1.08), with modest haplotype divergence (average number of nucleotide differences, k = 19.6). Subclade C from Alaska was represented by a single haplotype connected to a haplotype within the clade A, but differed from subclade A by an average of 19.1 nucleotides.

The mismatch distribution analyses (Supplementary Fig. S4) for subclade A deviated significantly from the graph expected by constant population size (raggedness index = 0.0126; P = 0.017).

In order to assess the potential for mitochondrial pseudogenes in our data we translated the codon sequences and tested for pervasive purifying selection (an expected pattern for mitochondrial genes). The prevailing pattern of selection for ND2 in the D. curvirostris complex was purifying selection (non-synonymous/synonymous rate ratio = 0.1299). However, MEME did identify eight sites with significant episodic positive selection (Supplementary Fig. S5). A similar pattern was found with FUBAR (Supplementary Table S2) where four sites had significant episodic positive selection and 206 sites had significant purifying selection.

Discussion

The results indicate that the present genetic diversity of the geographically widespread Daphnia curvirostris complex is greatest in the eastern Palearctic. However, we did find evidence for a widely distributed clade that appears to have expanded from a single glacial refugium to a large portion of the Palearctic (with recent, presumably anthropogenic introduction to Mexico).

Often, the present ranges of Holarctic freshwater animals are proposed to result from extinction and bottlenecks during the last glacial maximum followed by expansion from multiple glacial refugia12. The present-day contact zones of these lineages is often detected at a considerable distance from the refugia. In the Palearctic, for example, a contact zone has been detected in central Siberia for several cladocerans30,31,32. Here, the genetic results suggest that the Daphnia curvirostris complex contains a single geographically widespread (Palearctic and beyond) subclade A and several divergent Eastern Palearctic clades and species (including putative novel species). The geographic limit of D. curvirostris s.str. (Clade 1) appears to be near the eastern extremity of the Palearctic in the Lena River basin (Yakutia) – no shared haplotypes were detected across this boundary.

Also unusual for cladocerans (but see48) is the observation that only one clade (that appears to emanate from a single refugium) for the D. curvirostris complex has a broad geographic distribution. The star-like network and significantly negative Tajima’s D and Fu’s Fs for this clade are consistent with a recent population expansion. The raggedness r statistic indicates that the mismatch distribution has a significantly different shape from that expected from constant size (but is consistent with population growth). The population structure does not appear to be limited to mitochondrial DNA as we detected very similar patterns (albeit with less divergence) in the nuclear encoded introns of HSP90 (Fig. 3). Nor does our assessment of variation in the population variation of mtDNA in the eastern Palearctic populations appear to be contaminated with pseudogenes. We detected no stop codons and the codons are under strong purifying selection as would be expected from a functional mitochondrial protein-coding gene. Although we lack a reliable method of calibrating the molecular clock for Daphnia curvirostris, we note that the genetic divergences and star-like patterns for ND2 (subclade A) are very close to those found for the multiple proposed postglacial expansions in Daphnia galeata, Daphnia dentifera, and Bosmina longispina using the same gene24,25,49. Because the distribution is in a region that was profoundly affected (or covered) by glaciation during the LGM, and normally only the most recent glacial expansions can be detected with contemporary data, the data are consistent with an expansion time that is post-LGM50,51.

We observed one difference in rooting involving the position of D. tanakai, which moved to a more basal in the nucleotide-based tree (Supplementary Fig. S1). However, the evidence suggests that a basal position of D. tanakai may result from a systematic bias (such as long branch attraction to the outgroups). First, three of the outgroups failed a base composition heterogeneity test which would complicate ML analyses with a single substitution model. Removal of the outgroups (using midpoint rooting) placed the root in the same position as the nuclear gene tree. Notably, the more slowly evolving amino acid tree for ND2 (with outgroup rooting) agreed with the nuclear gene tree in root position (even after adding more closely related outgroup species). Lastly, the phylogenetic grouping of D. tanakai with the other far eastern Palearctic lineages is a biogeographically simpler explanation than a basal position of D. tanakai.

Unlike in other Daphnia52, the widespread subclade of D. curvirostris does not appear to originate in the eastern Palearctic. Japan has been regarded as a diversity source for several widespread cladocerans lineages24,25,52. However, some mitochondrial lineages appear to have colonized Japan from more northerly regions31,32. Why some lineages of Daphnia have recently expanded over vast distances while others remain restricted to a single or a few sites remains a mystery.

The finding that the Mexican population shares the identical central mitochondrial haplotype for the widespread Palearctic subclade of D. curvirostris indicates recent connectivity with Palearctic populations. The finding is consistent with the recent introduction of the disjunct Mexican D. curvirostris from Europe. As there are no natural dispersal vectors for Daphnia between the Palearctic and Mexico, the simplest explanation is anthropogenic introduction53,54. Duffy et al.46 made a similar proposal for the now extirpated population of D. curvirostris in Western New York. Unlike the other North American D. curvirostris, the Alaskan D. curvirostris had a unique codon deletion and was differentiated from the widely distributed clade of D. curvirostris. So, there is some evidence that this Pilgrim Hotsprings population may be unique and endemic. Other studies have failed to detect D. curvirostris in Alaska beyond the thermally-disturbed region associated with Pilgrim Hot Springs despite extensive sampling in western Alaska. More sampling will be needed to determine if other thermally and conductivity disturbed regions in the western Nearctic contain rare genetically differentiated populations in the D. curvirostris group. It is plausible that hot springs in Beringia have acted as microrefugia for temperate freshwater species during the LGM.

The greater diversity of the Daphnia curvirostris species group in the temperate Eastern Palearctic compared to the Western Palearctic is a common geographic pattern in plants and animals55. Even before the modern era of genetics, endemic cladocerans35,36 and other invertebrates56,57 were known from the eastern Palearctic. Reduced extinction during the Pleistocene glacial cycles is normally proposed as a contributing factor. The relatively moderate climate of the Eastern Palearctic compared to regions directly affected by Pleistocene ice sheets may have been the most important factor affecting extinction rates58. Even so, the Eastern and Western Palearctic were far from homogenous regions during the LGM. During the LGM, nunataks and even patches of forest existed in the Western Palearctic while regions of harshness and smaller glaciers (montane regions) existed in the Eastern Palearctic.

Daphnia curvirostris has been proposed as a temperate species complex with a distribution limited by temperature requirements8. In the eastern Palearctic, D. curvirostris-like specimens have been detected in North China (the exact location is unknown, see34,59. However, in the true Asian tropics60,61 and subtropics (i.e. in Hainan Island62) D. curvirostris remains undetected. It is unlikely that any of the lineages that we detected are common in the tropics. Most of the lineages belong to a relatively narrow region close to the Japan and Yellow Sea (including Sakhalin Island, the Korean Peninsula, Japan and NE China (about 30–50°N, 120–145°E). We conclude that this temperate geographic region is a hotspot of the biodiversity and a zone of endemism for curvirostris-like taxa (and perhaps other cladocerans). The explanation for this geographic diversity remains to be addressed – but both extinction and speciation appear to have played a role10. Speciation may explain the pronounced endemism and regionalism within the “hot spot”.

Speciation seems to have played a role because the D. curvirostris complex has very strong sub-regional differentiation: there is no subclade sharing among Japan, the Russian Far East, continental Korea or Jeju Island. A single subclade only is shared between the continent and Sakhalin Island, while another Sakhalinian clade is an endemic of the island. One interesting location for endemics is the mountains of the largest Japanese island, Honshu. Subclade J (Daphnia sp. nov.) was undetected in other water bodies of Japan (or any other region), although about 100 lakes and ponds were sampled and populations of different species of Daphnia were found in many water bodies of Japan24,25,42,51. So, this subclade (and likely a new biological species, see the HSP90 tree) may be present in very few water bodies on the planet. One possible geographic speciation scenario for the origin of species in the eastern Palearctic is the disruption of a continuous land mass, connecting Korea Peninsula, Japan, Sakhalin and continental Far East of Russia63,64. The same biogeographic pattern is known for several other freshwater and terrestrial animals57,65.

Our study indicates that there is more diversity in the Daphnia curvirostris complex than previously appreciated. We find an amalgam of several lineages of differing divergences in this complex, with one lineage having a broad Palearctic distribution and 10 others being endemic to parts of the Eastern Palearctic. Thus, the effect of the LGM and Holocene deglaciation has been very different within a freshwater species complex. This study represents the first step towards understanding the global biogeography and diversity of the Daphnia curvirostris complex. The new species detected here will be formally described in a taxonomic paper. Although local extirpation of populations of the geographically widespread clade (as happened in New York) will represent a negligible loss in diversity, losses of single populations in the Eastern Palearctic (or from thermally disturbed pools) could represent the loss of a divergent lineage or even a species.

Empirical studies have often shown that relicts are particularly susceptible to extinction66,67. Moreover, there are ongoing threats to small freshwater habitats such as eutrophication, drainage, and anthropogenic introductions9,68,69. As such, we consider the study of cladoceran biology and conservation in the eastern Palearctic and subarctic hotsprings to be priorities.

Methods

Ethics statement

No species in this study are listed as protected or endangered. Field collection in Russia was carried out by our team or by colleagues as part of a governmental project “Ecology and biodiversity of aquatic ecosystems and invasions of alien species” (No. 0109-2014-0008), with governmental permission to collect samples from public property. Sampling in the natural reserves of Russia was conducted with special permissions of their Directors. Verbal permission to collect in private farm ponds was obtained from local owners. Mongolian samples were collected by the Joint Russian-Mongolian Complex Biological Expedition with permission of the Ministry of Nature, Environment and Tourism of Mongolia. Field collections in South Korea were carried out in frames of fulfillment of the program from the National Institute of Biological Resources of Korea. Some samples from other countries were provided by colleagues who collected them with permissions linked to their activity as hydrobiologists in governmental institutions of their countries, where the collection of these invertebrates is not subjected to restriction by national or international laws and does not require special permission.

Field collection and DNA extraction

Specimens were collected by plankton nets with diameter of 20–40 cm and mesh size of 30–50 µm, or rectangular dip nets of same mesh size with width of 0.2–0.3 m, handle length of 0.5–2 m. After collection specimens were in 96% alcohol. Before the start of the genetic studies, each specimen was preliminarily identified by morphological characters37,40,44. The majority of samples are Palearctic but we also included specimens from disjunct Nearctic populations (Mexico and Alaska). DNA was extracted from single adult females using the DNA Quickextract (Epicentre) protocol as modified by Ishida et al.40.

Sequencing and alignment

PCR was carried out using the Promega GoTaq master reaction mix (PCR cycling conditions were 95 °C for 2 m, 95 °C for 30 s, 48 °C for 30 s, and 72 °C for 1 m for 39 cycles, followed by 72 °C for 5 m). We used Daphnia-specific primers40 for the mitochondrial ND2 gene (approximately 960 bp) and the nuclear HSP90 gene (approximately 700 bp). After gel verification of amplicon size, PCR products were purified and Sanger sequenced (TACGEN, California). Sequence chromatograms were assembled and examined in Geneious R770. Each amplicon was sequenced in both directions and the assembled reads were examined manually for internal peaks and translation. Primer regions were trimmed and consensus sequences were generated based on highest quality. The nuclear gene amplicon contained two introns. Alignment was carried out using Muscle as implemented in Seaview 4.671. Translation alignment (to preserve the codon structure) was carried out by aligning amino acids and then reverting to nucleotides in Seaview. Intron boundaries were determined according to Ishida et al.40.

Phylogenetics

Substitution model fitting was carried out in IQtree 1.672 with the number of partitions optimized by codon position and intron/exon. Tree estimation used the maximum likelihood algorithm of IQtree and the best fit partition model73. Branch support was estimated with nonparametric bootstrapping and approximate likelihood ratio tests (aLRT) with 1000 replicates. Sequences of four species (Daphnia laevis, Daphnia dubia, Daphnia longiremis, and Daphnia cristata) were used as outgroups. The effect of adding more closely related (to the ingroup D. curvirostris complex) outgroups (D. galeata and D. umbra) was also explored. Phylograms were visualized in Figtree 1.4374. Protein based phylogenies were estimated using PhyML 3.0 as implemented by Seaview 4.6 using the arthropod-mitochondrial (mtart) + gamma substitution model.

To explore geographic structure in populations of Daphnia curvirostris sensu stricto, we estimated median joining networks in PopArt v1.7 beta75. Tests of selection for the ND2 gene were carried out using Fast Unconstrained Bayesian Approximation (FUBAR76) and Mixed Effects Model of Evolution (MEME77) as implemented in Datamonkey 2.0. Mismatch analyses, coalescent simulations and statistics associated with population growth and divergence were carried out using DnaSP 678.