Phylogeography of the sand dollar genus Encope: implications regarding the Central American Isthmus and rates of molecular evolution

Vicariant events have been widely used to calibrate rates of molecular evolution, the completion of the Central American Isthmus more extensively than any other. Recent studies have claimed that rather than the generally accepted date of ~3 million years ago (Ma), the Isthmus was effectively complete by the middle Miocene, 13 Ma. We present a fossil calibrated phylogeny of the new world sand dollar genus Encope, based on one nuclear and four mitochondrial genes, calibrated with fossils at multiple nodes. Present day distributions of Encope are likely the result of multiple range contractions and extinction events. Most species are now endemic to a single region, but one widely distributed species in each ocean is composed of morphotypes previously described as separate species. The most recent separation between eastern Pacific and Caribbean extant clades occurred at 4.90 Ma, indicating that the Isthmus of Panama allowed genetic exchange until the Pliocene. The rate of evolution of mitochondrial genes in Encope has been ten times slower than in the closely related genera Mellita and Lanthonia. This large difference in rates suggests that splits between eastern Pacific and Caribbean biota, dated on the assumption of a “universal” mitochondrial DNA clock are not valid.


Results
Phylogeny. Bayesian inference (BI) and Maximum likelihood (ML) analyses of the concatenated data of five genes of Encope, Mellita, Lanthonia, and Mellitella, rooted on Leodia, produced congruent phylogenies, with only slight differences in weakly supported subclades within species (Fig. 1). Cross validation of the node dates through re-estimation by removing one fossil offset at a time indicated the estimates to be robust. All median dates thus estimated fell well within the Highest Posterior Density (HPD) ranges of the run based on all eight fossil constraints (Table S1). Nominal genera resulted as different phylogenetic entities, separated from each other 26-16 Ma (Fig. 2). Mellitella stokesii was similarly distinct, justifying Mooi's 35 removal of this species from Encope. Clade credibility values >75% of Maximum Likelihood (first number next to node) and >85% of Bayesian (second number) reconstruction are shown. Numbers after locality names indicate individuals with indistinguishable haplotypes, scale bar reflects number of changes per site. Names next to terminal branches indicate the morphology of the specimens; names to the right of the pictures are our interpretation as to species affiliation according to the molecular phylogeny. Colors represent geographic range (red = Atlantic and Caribbean, blue = eastern Pacific, green = Gulf of California, black = unknown). Aboral club-spine structures (lateral view) are shown next to subclades in E. micropora.
Scientific REPORTS | 7: 11520 | DOI: 10.1038/s41598-017-11875-w Phelan's 33 subdivision of the genus Encope was not supported by the molecular phylogeny, because E. grandis was not related to E. aberrans and E. michelini as he supposed, but shared a clade with E. galapagensis, E. borealis and E. emarginata.
The clade composed of the Atlantic species Encope aberrans and E. michelini was the first to have split from all other extant species of Encope (Fig. 1) in the Middle Miocene, 15 Ma (95% HPD: 19.76-11.05 Ma) (Fig. 2). Each of these two sympatric morphospecies was monophyletic, having split from each other approximately 6 Ma (95% HPD: 8.49-3.49 Ma). Thus, the molecular phylogeny did not justify A. Agassiz's 40 and Mortensen's 24 suggestions that they should be considered as conspecific. The sister clade of E. aberrans and E. michelini was split 9 Ma (95% HPD: 12.24-6.75 Ma) into a lineage consisting of the eastern Pacific species E. californica and E. micropora and a lineage that included three Pacific species, E. grandis, E. galapagensis, and E. borealis, plus an Atlantic one, E. emarginata. In the clade that occupies both oceans, E. grandis, endemic to the Gulf of California, was the first one to split 8 Ma (95% HPD: 10.82-5.93 Ma), whereas E. galapagensis, endemic to the Galapagos, split from the transisthmian pair nearly 7 Ma (95% HPD: 9.09-4.99 Ma). The transisthmian pair consisted of E. borealis, now endemic to the Gulf of California, and the western Atlantic E. emarginata. The BEAST median estimate of this split was 4.90 Ma with a 95% HPD of 6.54-3.57 Ma, indicating that there was exchange of water between the eastern Pacific and the Atlantic until this time. Nested within the Atlantic clade of E. emarginata are the morphospecies E. subclausa, E. oblonga and E. valensciennesii. Genetic distances between concatenated sequences of E. emarginata, E. subclausa, and E. oblonga ranged between 0.28 and 0.35%; mean intra-morph distances ranged between 0.03 and 0.47% (Table S2). Thus, inter-morph variation was no larger than intra-morph variation, suggesting very strongly that these three morphs belong to the same species. The single specimen of E. valensciennesii we were able to obtain had similarly small distances in COI, ATPase8 and 28S, but values of 3.48-4.04% in ATPase6 and of 1.38% in 16S, which suggest a certain degree of differentiation (Table S1). There Scientific REPORTS | 7: 11520 | DOI:10.1038/s41598-017-11875-w were well-supported subclades within E. emarginata sensu lato, but they did not correspond to morphological variation, nor did they conform to geographical structure.
The second lineage in the eastern Pacific contained two clades. The genetic distances in all genes between Encope californica and the large clade we have designated as E. micropora was <1.89% (Table 1), but given that the haplotypes form a monophyletic entity that corresponded with distinctive morphology, we consider it as a separate species. This was not the case in the other Pacific clade. The E. micropora clade contained specimens with a broad range of morphological variation, revealing high levels of plasticity in test and spine structures with no apparent phylogenetic structure. Such morphotypes have been described as separate species or subspecies, but the genetic data contain no compelling information for their separation.  Rates of molecular evolution. A striking feature of evolution of Encope is the extremely low divergence rate in all four mitochondrial genes. A comparison of rates (genetic distance divided by time) of Encope, Mellita and Lanthonia is presented in Table 1. There were few substitutions in 28S in all genera, even between the deepest clades. In 16S and COI, however, Encope mean rate of evolution is an order of magnitude slower than that of Mellita and Lanthonia. Differences remain in the same direction even when rates in Encope are biased to be as the fastest possible (as the ratio of the highest Confidence Limit of distance divided by the youngest 95% HPD point of age) and rates in Mellita and Lanthonia are biased as the slowest possible (as the ratio of the lowest Confidence Limit of distance divided by the oldest HPD point of age). There is no apparent correlation of divergence rate and time since separation in any of the genera (Spearman Rank Correlation Coefficient: Encope: r = 0.429, p = 0.297, Lanthonia-Mellita: r = 0.129, p = 0.693).

Discussion
The phylogeny of Encope in characterized by four notable features: (a) A phylogeographic signature composed of six species endemic to a restricted area and two species, one in the eastern Pacific and one in the western Atlantic, with a remarkably wide distribution. (b) Extreme morphological plasticity in the widespread species that does not show geographic structure. (c) An unusually slow rate of molecular evolution. (d) A time of separation between sister clades in the two oceans that pre-dates the final closure of the Central American Isthmus, but postdates the recently proposed ancient blockage of seaways. These features are interconnected to produce a story of extinction and range contraction that may have resulted in the relationships of species as they are seen today. That the eastern Pacific Encope borealis, now endemic to the Gulf of California, was separated from the western Atlantic E. emarginata 5 Ma, when Central America was a peninsula and the only water connections were close to South America [41][42][43] indicates either that E. borealis has undergone a range contraction since the Pliocene, or that the true sister species of E. emarginata on the other side of the Isthmus of Panama has become extinct. The multitude of extinct "species" of Encope 24,34,[44][45][46][47][48] suggests that extinction in this genus has been high. Given the morphological plasticity of E. micropora and E. emarginata, some of these fossil species were probably not biological species. However, that these forms no longer exist is consistent with the notion that a pattern of extinction characterizes the evolution of the genus. For example, the subspecies Encope macrophora macrophora from the Late Miocene and E. macrophora tamiamiensis from the Plio-Pleistocene in south-eastern United States are morphologically very close to E. grandis in the Gulf of California (they have a thick test and test margin, with five notches rather than enclosed lunules in the ambulacra) 49 . Encope macrophora macrophora and E. macrophora tamiamiensis did not survive the changing environment of the western Atlantic that followed the emergence of the Central American Isthmus 5 . That range contractions also happened through time is suggested by the contrast between the wide distributions of the younger clades (those of E. micropora and E. emarginata) and the much more endemic distributions of older species (those of E. aberrans, E. michelini, E. grandis, E. borealis and E. californica). If extensive range contractions and extinctions have taken place, present day geographic distributions of Encope species do not provide reliable information on which to base speculation about the causes of speciation.
We have proposed that a number of morphospecies actually belong to Encope micropora and that others belong to E. emarginata, because in the molecular phylogeny they did not form monophyletic clades and their genetic distances were minuscule. However, we have also found that all genes used in our reconstruction have evolved very slowly, the mitochondrial ones much slower than those in Mellita or Lanthonia. One might reasonably ask whether haplotypes of the slowly evolving molecules of true species have not yet sorted out and thus that the apparent intraspecific morphological variation is an artifact of the slow molecular evolution. We cannot completely exclude this possibility, particularly for the clade with E. micropora morphology from Bahia Magdalena, in which several specimens form their own clade, even though this clade does not contain all specimens from this location. However, with the exception of specimens of E. micropora insularis from Isla Socorro, there was no structure to support these variants either in morphology (as indicated by the specific and subspecific names) or in geographical location. Subclades did not include all members of a morphospecies; the same subclade included morphologically divergent specimens, such as those of E. micropora micropora, mixed with E. wetmorei and E. laevis. There did, however, appear to be a correspondence between morphology and habitat. Members of E. micropora sensu lato with flattened aboral club-spines forming a contiguous spine canopy and interambulcral lunules positioned between the posterior petals were only found in the surf zone of high energy beaches; members with bulbous-tipped aboral club spines, resulting in an open spine canopy, with interambulacral lunules positioned behind the posterior petals were found in deeper water, in sheltered bays, or on lower energy beaches. Morphological intermediates between these forms, such as those corresponding to E. perspectiva and E. micropora irregularis were also present at Isla San Jose (Panama) Bahia Octavia (Colombia), and Golfo de Nicoya (Costa Rica) suggesting a gradation in morphological characters that may correspond to intermediate habitats. Thus, the variation that has led systematists to designate different species and subspecies in the eastern Pacific, may well correspond to ecophenotypes.
The extremely slow rate of molecular evolution of Encope relative to Mellita and Lanthonia, genera in the same family, was surprising. There were no differences between the genera in substitution rate in 28S, but this nuclear gene has evolved so slowly in both cases, that comparisons are based on the incorporation of a few mutations; comparing their rate is, thus, not meaningful. 16S and COI, on the other hand, diverged a minimum of 10 times more slowly in Encope. A comparison to divergence across the Central American Isthmus in other echinoid genera 2 indicates that differentiation of COI in Encope, with a transisthmian distance of 1.11%, is only half as large as it is in Diadema 50 , the previously smallest known transisthmian distance, whereas the same gene in Mellita with a transisthmian distance of 25.2% is twice as large as it is in Lytechinus 51 , the previously largest known distance. Because Diadema and Lytechinus lack a fossil record, we cannot determine whether the differences in divergence between their species on the two sides of the Isthmus are due to different times of separation or different rates of evolution. In the sand dollars, on the other hand, time of separation is based on fossil calibrations. Apparent differences in rates can be caused by the vagaries of coalescence 52, 53 , by miscalibration of phylogenies, or by the application of different models of DNA evolution 54 . However the differences between rates in Encope and rates in Mellita and Lanthonia throughout their respective phylogenetic trees are too great to be entirely due to these factors. In the time interval considered here, stratigraphic errors in deciding the age of the fossils could not be wrong by an order of magnitude. Different models of DNA evolution affect the estimated genetic distances between the most divergent taxa within each genus. The maximum likelihood distances presented here for Mellita differ somewhat from those calculated from gene-specific models in Coppard et al. 17 , but only by a factor of 0.35 in the largest deviation. Nor could the differences in rates be due to saturation of the sequences or some other factor that causes them to depend on time 54 , because the time scales in all genera are similar, and because the trend, if any, in the data is that rates of substitution are higher between the most divergent clades. So what could be the cause of such large differences between closely related genera?
A number of hypotheses have attempted to explain deviations from the molecular clock [55][56][57][58][59][60][61] . Unfortunately, we do not know the biology of sand dollars in sufficient detail to distinguish how well each of the possibilities apply to Encope, except by excluding some of the alternatives. It is unlikely that differential selection can account for the differences between Encope and Mellita or Lanthonia in the evolution of the same genes. Although mtDNA is under strong selection, most of the variation between species in the three genera is in codon positions that do not alter amino acid composition. Given that the differences are consistent between genes, a demographic factor, affecting all genetic markers, is a more likely explanation. According to the nearly neutral theory Ohta's 62 or the theory of coalescence 63 , (though not according to Kimura's 64 neutral theory), differences in substitution rate depend on effective population size. In our experience, however, species of Encope are much more difficult to encounter than species of Mellita, so differences caused by effective population size would be in the opposite direction to the one observed. Similarly, there is no reason to assume that mutation rate is lower in Encope by an order of magnitude. We tentatively suggest that differences in body size 55,57,65 , with its correlated factors, generation time 66 , metabolic rate 55 and longevity 58 , may be the most likely explanation for the ten-fold slow-down of mtDNA evolution in Encope. Whereas species of Mellita and Lanthonia have thin tests that rarely exceed 10 cm in diameter, Encope lays down a heavy, thick, internally supported test that in some of its species can easily reach 15 cm. The size at which species of Encope reach sexual maturity is not known, but Mortensen 24 remarked that a 3.8 cm specimen of E. emarginata had not yet developed genital pores. In Mellita a 3.8 cm specimen is fully mature. It is, therefore, easy to imagine that generation time in Encope is longer than in Mellita. Whatever the cause of the differences in mtDNA evolutionary rate in the three mellitids may be, the data presented here strongly illustrate the dangers of applying an assumption of a universal mtDNA clock in obtaining dates of completion for the Isthmus of Panama 14 , or any other event.
That the geminate species of Encope were separated 4.9 Ma (95% HPD: 6.54-3.75 Ma) would have been expected and unremarkable before the publication of the articles claiming that only narrow channels connected the eastern Pacific and the Caribbean as of 13 Ma 11-13 , and various claims they generated, including proposals that even the causes of the Great Faunal Interchange 67 of land organisms crossing between continents need to be re-evaluated 13,14,[68][69][70][71] . The 4.9 Ma date does not coincide with any of the vicariant events suggested by Bacon et al. 14 . O'Dea et al. 5 have summarized the geological, paleontological, paleoceanographic and genetic data that show that this claim is not likely to be correct. The molecular data from Encope add another piece of evidence that genetic (and thus water) connections between the oceans were strong until the Pliocene.

Materials and Methods
All described extant morphospecies of Encope, were either collected or borrowed from museums (Fig. 3). Morphological characters of the tests, spines and pedicellariae of each specimen were matched with the original species or sub-species descriptions. Additionally, specimens of Leodia sexiesperforata, Mellita quinquiesperforata, and Lanthonia longifissa were sequenced as outgroups. Mellitella stokesii was also sequenced to resolve its phylogenetic position. Muscle from the Aristotle's lantern of each specimen was preserved in 95% Ethanol for DNA extraction.

DNA extraction, sequencing and alignment. Genomic DNA was extracted from 150 Encope, from four
Leodia sexiesperforata, and from five each Mellitella stokesii, Mellita quinquiesperforata and Lanthonia longifissa using a DNeasy tissue kit (Qiagen) ® . No intact DNA could be extracted from the type specimen of E. perspectiva jonesi, which may have been fixed in formalin. A 5000 bp fragment of mtDNA was initially sequenced in three species of Encope (two Atlantic and one eastern Pacific) and M. stokesii (eastern Pacific) using 16Sech150FL 5′AGAGACTGGTATGAATGGCAAGAC and NAD3echR 5′ GATTTTAGYGGRTCAAAKCCACAYTCGTA primers. 16S rDNA, cytochrome c oxidase subunit I (COI), and the Lysine-tRNA, ATPase-6 and ATPase-8 regions were selected based on the presence of conserved priming regions in all four species. Two overlapping fragments of COI were amplified. The 5′ region of the gene using the primers 5′TTTCAACWAATCAYAARGAC and 5′GGTTCTCGCTTYCCDGA, and the 3′ region using 5′ GCTTTCCTCCTGCTCCTATC and 5′GCATCTGGGTAGTCTGAGTATCGC. The Lysine-tRNA, ATPase-6 and ATPase-8 regions were amplified using primers 5′-GTGAAGGCAGATAACTACTGT and 5′-TCTACAAGGTGGTATGGGTG; a fragment of the 16S rDNA was amplified using primers 5′-CGCCTGTTTACCAAAACA and 5′TCGTAGATAGAAACTGACCTG. The same amplification conditions were employed for all mtDNA regions.  72 . To avoid heterozygous polymorphisms, PCR products of 28 S were cloned using Promega pGEM-T Easy kits ® . One randomly chosen clone was cycle-amplified from each specimen using Promega M13 and M13R primers before being sequenced in one direction using the M13 primer. After purification in Sephadex columns, amplification with the same primers, and labeling with Applied Biosystems (ABI) Prism BigDye Terminators, nucleotides were sequenced in both directions in an ABI 3130 XL automatic sequencer. Alignments were performed in MacClade 73 .
After trimming sequence ends of ambiguous bases and overlapping segments, there were 1296 base pairs (bp) of COI, 168 bp of ATPase-8, 687 bp of ATPase-6, 568 bp of 16S and 1137 bp of 28S rDNA, a total of 3856 bp of DNA sequence for each specimen.
Phylogenetic analyses. Unique haplotypes of each DNA region were subjected to analysis in jModeltest v.
2.1.1 74 to determine the best model of molecular evolution based on the AIC criterion 75 . The algorithm suggested as the best model for 16S HKY + G 76 (α = 0.0240), for COI TIM3 + I + G 74 (I = 0.7030, α = 1.7370), for ATPase-6 TrN + I + G 74 (I = 0.6500, α = 1.919), for ATPase-8 HKY + G 76 (α = 0.2100), and for 28 S "001120 + I + F", a model with unequal base frequencies, five rates of base change, no site heterogeneity, and a proportion of 0.884 of invariable sites 74 . These DNA regions were then concatenated, outgroups were added, and they were subjected to phylogenetic analyses with the appropriate model for each gene partition.
Bayesian phylogenetic analysis was carried out with MrBayes v. 3.2 77 in partitioned analyses applying the best model to each DNA region but allowing the program to estimate values, a heating parameter T of 0.15 and Dirichlet priors for rates and nucleotide frequencies, in 3 × 10 8 steps, sampling every 3000 th tree from two runs. Convergence was assessed as having been reached when the average standard deviation of split frequencies became <0.01, and the potential scale reduction factor 78 1.00 for all parameters. The runs were also visually checked for lack of trends in Tracer v1.6 79 . The first 25% of trees were discarded from each run as burn-in, and a 50% majority rule tree was constructed. Maximum likelihood (ML) analysis was conducted in RAxML v.8.2.x 80 using the GTR + G model and rapid bootstrapping for 10,000 iterations. Clades with less than 85% Bayesian support and less than 75% ML support, were collapsed.
Maximum likelihood composite genetic distances 81 based on the Tamura & Nei 82 method were calculated between morphotypes and clades in MEGA v. 7.020 83 for each gene separately and for the concatenated sequences, with gamma corrections as estimated by jModelTest. Maximum likelihood composite distances for all genes were preferred to the models of DNA evolution used in phylogeny construction, so as to minimize the effect of correction factors in the comparisons between Encope, Mellita and Lanthonia.
We estimated the time of splitting of clades in BEAST v.1.8.2 84 , using a relaxed log-normal clock with calibrations from fossil Encope (see below). An uncollapsed ML tree (which was compatible with the Bayesian tree) constructed in RAxML was used in BEAST as a starting tree. Operators causing topology searches ("SubtreeSlide", "Narrow Exchange", "Wide exchange", "WilsonBalding") were turned off to force BEAST to place time estimates on the nodes of the ML tree, but BEAST was allowed to estimate branch lengths. The appropriate model for each gene partition was entered as a prior, but the program was allowed to estimate parameters and substitution rates specific to each gene. Each fossil date was used as the offset of an exponential prior with a mean adjusted so that the upper 95% percentile extended to twice the age of the fossil. Trees were linked across genes, clocks were unlinked. Distributions of priors for mean rates (ucld.mean) of all genes were uniform, ranging from 0.001 to 1 100 . Distributions of priors for standard deviations of rates (ucld.stdev) were exponential with a mean of 0.333. Nine separate runs, each of 10 8 steps and starting from different random seeds, were carried out, saving every 1000 th tree. The logs produced from all runs were combined in LogCombiner v.1.8.3, after removing the first 10% of the steps of each run, and viewed in Tracer v.1.6 79 to verify that there were no trends and that all effective sample sizes (ESS) were >218. Trees were also combined in LogCombiner v.1.8.3 after removing the first 10% from each run. The resulting file was thinned out by taking every 10 th tree and entered in Tree Annotator v.1.8.3 to estimate age of each node.
We used the following fossil dates to calibrate the tree: The oldest species assigned by Durham 22 , to Encope, E. michoacanensis from the Ferrotepec Formation (Mexico eastern Pacific), dates from the late Early to Middle Miocene. Durham 22 referred to affinities between E. michoacanensis and Panamic Miocene species. The oldest Encope from Panama (which is undescribed) occurs in the lowest Lower Gatun Formation at Sand Dollar Hill, but does not closely resemble any extant species. This formation was dated to 11.0 Ma by Smith & Jackson 85 . We therefore used 11 Ma as the minimum age for Encope. Well-dated fossils of extant Encope occur in Pliocene and Pleistocene deposits. These were used to constrain minimum ages of nodes leading to species of this genus. Encope aberrans has been recorded from the Late Pliocene  89 reported E. californica, and E. micropora from the Pleistocene of the Gulf of California. He described possible fossil representatives of E. grandis as ancestral species/subspecies, with E. shepherdi restricted to the Late Pliocene and E. grandis inezana restricted to the Pleistocene. Morphological differentiation between E. grandis, and E. grandis inezana are very slight, and relates to small differences in the size of the marginal notches, the size of the interambulacral lunule, and the concavity of the abactinal system. In our study such characters were observed to vary greatly in extant E. grandis and, thus, do not reliably differentiate these forms. We therefore used the Gelasian Stage of the Early Pleistocene for the minimum age of E. grandis. Encope perspectiva is recorded from the Pleistocene near Santa Cruz, Oaxaca, Mexico 22 . Encope borealis has been recorded (CASG 68681) from Pleistocene deposits between Puerto Peñasco and Punta Borrascosa, Mexico. For fossils, the age of which was recorded generally as the Pleistocene, we used the date of the start of the Middle Pleistocene, 0.78 Ma, as a minimum age. In summary, the offsets used in the BEAST analysis were as follows: Encope: 11  We cross-validated the estimated dates of each node of the tree by carrying out eight separate runs of 10 9 steps, each with one fossil date removed. In every other respect, these runs were identical with the ones including all fossil constraints. Equipment and settings. Images of sand dollars in Figs 1 and 2 were taken by S.E.C. using a Nikon D60 digital SLR camera and were edited in Affinity Photo V. 1.5.2. Brightness and contrast were unchanged. Scanning electron micrographs of spines in Fig. 1 were taken using a FEI Quanta 400 environmental scanning electron microscope.