Contrasting endemism in pond-dwelling cyclic parthenogens: the Daphnia curvirostris species group (Crustacea: Cladocera)

Pond-dwelling cyclic parthenogens are often proposed to be highly vagile. However, the Holarctic biogeography of parthenogens has been hampered by very limited sampling in the eastern Palearctic. Here we examine the geographic boundaries, diversity, and connectivity across the Palearctic for the Daphnia curvirostris complex (Cladocera: Daphniidae). Nuclear (HSP90) and mitochondrial (ND2) sequence data supported the existence of five main clades (most of which corresponded to presumptive species) with one eastern Palearctic clade being novel to this study (the average mitochondrial genetic divergence from known species was 19.2%). D. curvirostris s.s. was geographically widespread in the Palearctic, with a population genetic signature consistent with postglacial expansion. The Eastern Palearctic had local nine endemic species and/or subclades (other Holarctic regions lacked more than one endemic subclade). Even though several endemic species appeared to have survived Pleistocene glaciation in the eastern Palearctic, much of the Palearctic has been recolonized by D. curvirostris s.str. from a Western Palearctic refugium. A disjunct population in Mexico also shared its haplotypes with D. curvirostris s.str., consistent with a recent introduction. The only apparently endemic North American lineage was detected in a thermally disturbed pond system in northwestern Alaska. Our results for pond-dwelling cyclic parthenogens further support the hypothesis that the Eastern Palearctic is a diversity hotspot for freshwater invertebrates.

The alignment of ND2 (mitochondrial NADH dehydrogenase 2) was unambiguous and each sequence had a single open reading frame. Only single codon deletions were observed within the complex. This deletion was present in a 10-codon long variable region of an Alaskan sequence that also had differing single codon deletions in the outgroup species and in D. hrbaceki. The best fit model for the ingroup ND2 alignment was TN + F + I + G4 which had a Bayesian Information Criterion score of 23451.8840. All but one of our analyses found the same root for the D. curvirostris complex (Figs 2 and 3; Supplementary Figs S1-S3). For the ND2 alignments, three outgroup sequences for the nucleotide data failed the base composition homogeneity test provided by IQtree while no outgroup sequences for the amino acid data failed the same test.
ND2 phylogenetic tree and geographic distributions. The entire D. curvirostris complex was monophyletic and divided into five major clades which may represent species or species groups (Fig. 2). Major clade 1 (Daphnia curvirostris s.str.) had three geographic subclades (subclades are lettered from A-C, red in Figs 1 and 2). Subclade A had a broad distribution in the Northern Palearctic from France to Eastern Siberia (Yakutia; Fig. 1). The Mexican specimens were nested within subclade A, supporting the Palearctic origin hypothesis. A more basal Subclade B within D. curvirostris was detected in Eastern Siberia (Yakutia), the continental portion of the Far East of Russia and Sakhalin Island (but absent from the Japanese samples). Another basal lineage was subclade C which was found only in a single locality (Pilgrim Hotsprings) in northwestern Alaska. Major clade 2 (D. hrbaceki) contained the type locality population (a recently created pool in Central Europe) of D. hrbaceki (subclade D, black circle in Fig. 2). The remaining major clades were restricted to the eastern Palearctic. Major clade 3 (also a described species, D. tanakai) was represented by a single subclade E (grey circle in Fig. 2), and was restricted to mountain lakes in Japan. Major clade 4 (D. sinevi, blue in Fig. 2) was ranged from the continental Far East of Russia to Sakhalin Island. Clade 4 contained three more regionally distributed subclades: subclade F was endemic to Sakhalin Island; subclade G was found in a single population in Primorie (the continental Far East of Russia); and subclade H was relatively widely distributed in the continental Far East of Russia. Major clade 5 (termed Daphnia sp. nov. here to denote a putative new species, green in Fig. 2) was present in the continental Far East of Russia, Japan and Korea, and was represented by four subclades, each with very restricted ranges: subclade I was found in a single locality on the continental Far East of Russia, the subclade J was found in a single lake in Japan, subclade K was present in a single locality in continental Korea, and subclade L was found in four localities on Jeju Island (South Korea). Major Clade 5 differed from the closest clade with an existing species name (D. sinevi) by an average of 19.2% for the ND2 sequence using a Jukes-Cantor corrected distance (Table 1).

Figure 1.
All sites from which diversity of the Daphnia curvirostris group has been analyzed (a), and a region of the group maximum diversity -Far East (b). The base map is the Marble Virtual Globe 1.5.1 "plain map" (i.e., no attributable data layers) available at https://marble.kde.org/. www.nature.com/scientificreports www.nature.com/scientificreports/ The protein based ND2 trees (Supplementary Figs S2 and S3) were very similar to the nucleotide tree with strong support for the five major clades and most of the minor subclades. One difference between the protein and nucleotide based trees was the switched positioning of the Alaskan D. curvirostris with the subclade B sequences -but both B and C remained basal to the geographically widespread A subclade of D. curvirostris. Notably, outgroup rooting with an amino acid alignment (Supplementary Figs S2 and S3) supported the same root for the curvirostris complex as found by the midpoint rooted nucleotide tree (Fig. 2). HSP90 tree. Model-fitting determined three models for the partitions by codon position and intron/exon.
For the first and second positions, TN + F + I was the best fit model, while K2P + G4 was the best fit model for third codon positions. For introns the best fit model was HKY + F + G4. The well supported branches of the HSP90 and ND2 trees were in agreement (Fig. 3). Five major clades with strong support were found (note that we lacked HSP90 sequences for D. hrbaceki, the major clade 2 of mitochondrial tree). Again, major clade 1 corresponded to the Daphnia curvirostris group. It contained two subclades, corresponding to the mitochondrial clades A + C (with moderate support) and clade B (with a strong support). Major clade 2 corresponded to D. tanakai, represented by sequences from two lakes in Japan. It contained no structure and corresponded to mitochondrial  www.nature.com/scientificreports www.nature.com/scientificreports/ subclade E. Major clade 3 corresponded to D. sinevi. It contained two clades, corresponding to mitochondrial subclades G, and F + H. Major clade 4 corresponded to a single mitochondrial subclade J, a portion of the undescribed species complex Daphnia sp. nov. Major clade 5 corresponded to the rest of this proposed new species complex, mitochondrial subclades I + K + L. As with ND2, D. curvirostris-like taxa form a well-supported monophyletic clade with the same root.

ND2 median joining (MJ) haplotype network for Daphnia curvirostris s.str. The MJ network of
Daphnia curvirostris contained 40 different ND2 haplotypes (Fig. 4). The three main divisions (A-C) found in the tree analyses for D. curvirostris were apparent in the network.
Subclade A was represented by a sub-network of closely-related haplotypes (the average number of nucleotide differences k = 3.8). Still only private haplotypes (no regional sharing) for the eastern and western Palearctic were observed. There was at least one star-like topology, suggesting recent population expansion. The central haplotype of this star contained European-Western Siberian and Mexican individuals (Tajima's D = −1.55; P = 0.048; Fu's Fs = −14.57; P < 0.001). Subclade B was represented by a group of haplotypes that lacked a star-like pattern (Tajima's D = −0.65072, P > 0.10; Fu's Fs = −1.08), with modest haplotype divergence (average number of nucleotide differences, k = 19.6). Subclade C from Alaska was represented by a single haplotype connected to a haplotype within the clade A, but differed from subclade A by an average of 19.1 nucleotides.
The mismatch distribution analyses ( Supplementary Fig. S4) for subclade A deviated significantly from the graph expected by constant population size (raggedness index = 0.0126; P = 0.017).
In order to assess the potential for mitochondrial pseudogenes in our data we translated the codon sequences and tested for pervasive purifying selection (an expected pattern for mitochondrial genes). The prevailing pattern of selection for ND2 in the D. curvirostris complex was purifying selection (non-synonymous/synonymous rate ratio = 0.1299). However, MEME did identify eight sites with significant episodic positive selection ( Supplementary Fig. S5). A similar pattern was found with FUBAR (Supplementary Table S2) where four sites had significant episodic positive selection and 206 sites had significant purifying selection.

Discussion
The results indicate that the present genetic diversity of the geographically widespread Daphnia curvirostris complex is greatest in the eastern Palearctic. However, we did find evidence for a widely distributed clade that appears to have expanded from a single glacial refugium to a large portion of the Palearctic (with recent, presumably anthropogenic introduction to Mexico).
Often, the present ranges of Holarctic freshwater animals are proposed to result from extinction and bottlenecks during the last glacial maximum followed by expansion from multiple glacial refugia 12 . The present-day contact zones of these lineages is often detected at a considerable distance from the refugia. In the Palearctic, for example, a contact zone has been detected in central Siberia for several cladocerans [30][31][32] . Here, the genetic results suggest that the Daphnia curvirostris complex contains a single geographically widespread (Palearctic and beyond) subclade A and several divergent Eastern Palearctic clades and species (including putative novel species). www.nature.com/scientificreports www.nature.com/scientificreports/ The geographic limit of D. curvirostris s.str. (Clade 1) appears to be near the eastern extremity of the Palearctic in the Lena River basin (Yakutia) -no shared haplotypes were detected across this boundary.
Also unusual for cladocerans (but see 48 ) is the observation that only one clade (that appears to emanate from a single refugium) for the D. curvirostris complex has a broad geographic distribution. The star-like network and significantly negative Tajima's D and Fu's Fs for this clade are consistent with a recent population expansion. The raggedness r statistic indicates that the mismatch distribution has a significantly different shape from that expected from constant size (but is consistent with population growth). The population structure does not appear to be limited to mitochondrial DNA as we detected very similar patterns (albeit with less divergence) in the nuclear encoded introns of HSP90 (Fig. 3). Nor does our assessment of variation in the population variation of mtDNA in the eastern Palearctic populations appear to be contaminated with pseudogenes. We detected no stop codons and the codons are under strong purifying selection as would be expected from a functional mitochondrial protein-coding gene. Although we lack a reliable method of calibrating the molecular clock for Daphnia curvirostris, we note that the genetic divergences and star-like patterns for ND2 (subclade A) are very close to those found for the multiple proposed postglacial expansions in Daphnia galeata, Daphnia dentifera, and Bosmina longispina using the same gene 24,25,49 . Because the distribution is in a region that was profoundly affected (or covered) by glaciation during the LGM, and normally only the most recent glacial expansions can be detected with contemporary data, the data are consistent with an expansion time that is post-LGM 50,51 .
We observed one difference in rooting involving the position of D. tanakai, which moved to a more basal in the nucleotide-based tree (Supplementary Fig. S1). However, the evidence suggests that a basal position of D. tanakai may result from a systematic bias (such as long branch attraction to the outgroups). First, three of the outgroups failed a base composition heterogeneity test which would complicate ML analyses with a single substitution model. Removal of the outgroups (using midpoint rooting) placed the root in the same position as the nuclear gene tree. Notably, the more slowly evolving amino acid tree for ND2 (with outgroup rooting) agreed with the nuclear gene tree in root position (even after adding more closely related outgroup species). Lastly, the phylogenetic grouping of D. tanakai with the other far eastern Palearctic lineages is a biogeographically simpler explanation than a basal position of D. tanakai.
Unlike in other Daphnia 52 , the widespread subclade of D. curvirostris does not appear to originate in the eastern Palearctic. Japan has been regarded as a diversity source for several widespread cladocerans lineages 24,25,52 . However, some mitochondrial lineages appear to have colonized Japan from more northerly regions 31,32 . Why some lineages of Daphnia have recently expanded over vast distances while others remain restricted to a single or a few sites remains a mystery.
The finding that the Mexican population shares the identical central mitochondrial haplotype for the widespread Palearctic subclade of D. curvirostris indicates recent connectivity with Palearctic populations. The finding is consistent with the recent introduction of the disjunct Mexican D. curvirostris from Europe. As there are no natural dispersal vectors for Daphnia between the Palearctic and Mexico, the simplest explanation is anthropogenic introduction 53,54 . Duffy et al. 46 made a similar proposal for the now extirpated population of D. curvirostris in Western New York. Unlike the other North American D. curvirostris, the Alaskan D. curvirostris had a unique codon deletion and was differentiated from the widely distributed clade of D. curvirostris. So, there is some evidence that this Pilgrim Hotsprings population may be unique and endemic. Other studies have failed to detect D. curvirostris in Alaska beyond the thermally-disturbed region associated with Pilgrim Hot Springs despite extensive sampling in western Alaska. More sampling will be needed to determine if other thermally and conductivity disturbed regions in the western Nearctic contain rare genetically differentiated populations in the D. curvirostris group. It is plausible that hot springs in Beringia have acted as microrefugia for temperate freshwater species during the LGM.
The greater diversity of the Daphnia curvirostris species group in the temperate Eastern Palearctic compared to the Western Palearctic is a common geographic pattern in plants and animals 55 . Even before the modern era of genetics, endemic cladocerans 35,36 and other invertebrates 56,57 were known from the eastern Palearctic. Reduced extinction during the Pleistocene glacial cycles is normally proposed as a contributing factor. The relatively moderate climate of the Eastern Palearctic compared to regions directly affected by Pleistocene ice sheets may have been the most important factor affecting extinction rates 58 . Even so, the Eastern and Western Palearctic were far from homogenous regions during the LGM. During the LGM, nunataks and even patches of forest existed in the Western Palearctic while regions of harshness and smaller glaciers (montane regions) existed in the Eastern Palearctic.
Daphnia curvirostris has been proposed as a temperate species complex with a distribution limited by temperature requirements 8 . In the eastern Palearctic, D. curvirostris-like specimens have been detected in North China (the exact location is unknown, see 34,59 . However, in the true Asian tropics 60,61 and subtropics (i.e. in Hainan Island 62 ) D. curvirostris remains undetected. It is unlikely that any of the lineages that we detected are common in the tropics. Most of the lineages belong to a relatively narrow region close to the Japan and Yellow Sea (including Sakhalin Island, the Korean Peninsula, Japan and NE China (about 30-50°N, 120-145°E). We conclude that this temperate geographic region is a hotspot of the biodiversity and a zone of endemism for curvirostris-like taxa (and perhaps other cladocerans). The explanation for this geographic diversity remains to be addressed -but both extinction and speciation appear to have played a role 10 . Speciation may explain the pronounced endemism and regionalism within the "hot spot". Speciation seems to have played a role because the D. curvirostris complex has very strong sub-regional differentiation: there is no subclade sharing among Japan, the Russian Far East, continental Korea or Jeju Island. A single subclade only is shared between the continent and Sakhalin Island, while another Sakhalinian clade is an endemic of the island. One interesting location for endemics is the mountains of the largest Japanese island, Honshu. Subclade J (Daphnia sp. nov.) was undetected in other water bodies of Japan (or any other region), although about 100 lakes and ponds were sampled and populations of different species of Daphnia were found in www.nature.com/scientificreports www.nature.com/scientificreports/ many water bodies of Japan 24,25,42,51 . So, this subclade (and likely a new biological species, see the HSP90 tree) may be present in very few water bodies on the planet. One possible geographic speciation scenario for the origin of species in the eastern Palearctic is the disruption of a continuous land mass, connecting Korea Peninsula, Japan, Sakhalin and continental Far East of Russia 63,64 . The same biogeographic pattern is known for several other freshwater and terrestrial animals 57,65 .
Our study indicates that there is more diversity in the Daphnia curvirostris complex than previously appreciated. We find an amalgam of several lineages of differing divergences in this complex, with one lineage having a broad Palearctic distribution and 10 others being endemic to parts of the Eastern Palearctic. Thus, the effect of the LGM and Holocene deglaciation has been very different within a freshwater species complex. This study represents the first step towards understanding the global biogeography and diversity of the Daphnia curvirostris complex. The new species detected here will be formally described in a taxonomic paper. Although local extirpation of populations of the geographically widespread clade (as happened in New York) will represent a negligible loss in diversity, losses of single populations in the Eastern Palearctic (or from thermally disturbed pools) could represent the loss of a divergent lineage or even a species.
Empirical studies have often shown that relicts are particularly susceptible to extinction 66,67 . Moreover, there are ongoing threats to small freshwater habitats such as eutrophication, drainage, and anthropogenic introductions 9,68,69 . As such, we consider the study of cladoceran biology and conservation in the eastern Palearctic and subarctic hotsprings to be priorities. Field collection and DNA extraction. Specimens were collected by plankton nets with diameter of 20-40 cm and mesh size of 30-50 µm, or rectangular dip nets of same mesh size with width of 0.2-0.3 m, handle length of 0.5-2 m. After collection specimens were in 96% alcohol. Before the start of the genetic studies, each specimen was preliminarily identified by morphological characters 37,40,44 . The majority of samples are Palearctic but we also included specimens from disjunct Nearctic populations (Mexico and Alaska). DNA was extracted from single adult females using the DNA Quickextract (Epicentre) protocol as modified by Ishida et al. 40 . sequencing and alignment. PCR was carried out using the Promega GoTaq master reaction mix (PCR cycling conditions were 95 °C for 2 m, 95 °C for 30 s, 48 °C for 30 s, and 72 °C for 1 m for 39 cycles, followed by 72 °C for 5 m). We used Daphnia-specific primers 40 for the mitochondrial ND2 gene (approximately 960 bp) and the nuclear HSP90 gene (approximately 700 bp). After gel verification of amplicon size, PCR products were purified and Sanger sequenced (TACGEN, California). Sequence chromatograms were assembled and examined in Geneious R7 70 . Each amplicon was sequenced in both directions and the assembled reads were examined manually for internal peaks and translation. Primer regions were trimmed and consensus sequences were generated based on highest quality. The nuclear gene amplicon contained two introns. Alignment was carried out using Muscle as implemented in Seaview 4.6 71 . Translation alignment (to preserve the codon structure) was carried out by aligning amino acids and then reverting to nucleotides in Seaview. Intron boundaries were determined according to Ishida et al. 40 . phylogenetics. Substitution model fitting was carried out in IQtree 1.6 72 with the number of partitions optimized by codon position and intron/exon. Tree estimation used the maximum likelihood algorithm of IQtree and the best fit partition model 73 . Branch support was estimated with nonparametric bootstrapping and approximate likelihood ratio tests (aLRT) with 1000 replicates. Sequences of four species (Daphnia laevis, Daphnia dubia, Daphnia longiremis, and Daphnia cristata) were used as outgroups. The effect of adding more closely related (to the ingroup D. curvirostris complex) outgroups (D. galeata and D. umbra) was also explored. Phylograms were visualized in Figtree 1.43 74 . Protein based phylogenies were estimated using PhyML 3.0 as implemented by Seaview 4.6 using the arthropod-mitochondrial (mtart) + gamma substitution model.

Methods
To explore geographic structure in populations of Daphnia curvirostris sensu stricto, we estimated median joining networks in PopArt v1.7 beta 75 . Tests of selection for the ND2 gene were carried out using Fast Unconstrained Bayesian Approximation (FUBAR 76 ) and Mixed Effects Model of Evolution (MEME 77 ) as implemented in Datamonkey 2.0. Mismatch analyses, coalescent simulations and statistics associated with population growth and divergence were carried out using DnaSP 6 78 .

Data Availability
Voucher specimens are deposited into AAK private collection. DNA sequences were submitted to Genbank resulting in the following accession numbers (MH613987 -MH614169 and MH733953 -MH734039).