Implications for management and conservation of the population genetic structure of the wedge clam Donax trunculus across two biogeographic boundaries

In a resource management perspective, the understanding of the relative influence of the physical factors on species connectivity remains a major challenge and is also of great ecological and conservation biology interest. Despite the overfishing threat on the wedge clam Donax trunculus in Europe, relatively little information is known about its population genetic structure and connectivity and their consequences on conservation policies. We employed 16 microsatellite loci to characterise the genetic diversity and population structure of D. trunculus. A total of 514 samples from seven different localities along the Atlantic-Mediterranean transition, from the Atlantic (Gulf of Cádiz) to the north-western Mediterranean were genotyped. The analysis of the population genetic structure displayed a clear distinction along the Atlantic-Mediterranean transition with different clusters in the Atlantic Ocean, the Alboran Sea and the northwestern Mediterranean. Consequently, we recommend that these three areas should be considered as different management units. We showed that all populations seem to be at high long-term risk of extinction with the exception of the protected Doñana National Park population which still seems to have evolutionary potential. Therefore, our results emphasized the necessity of protection of this economic resource and the validity of molecular tools to evaluate the population dynamics.

Many soft and hard faunal barriers to dispersal have been identified in marine environments throughout the world, including, for instance, the Isthmus of Panama 1-3 and the Terminal Tethyan Event 4,5 as hard barriers, as well as the East Pacific Barrier 6,7 and the Benguela upwelling 8 as soft barriers. Some potential physical factors such as ocean currents, habitats discontinuities or temperature gradients are recognised as driving the biogeographic barriers structuring these regions. Thus, in a resource management perspective, the understanding of the relative influence of the physical factors on species connectivity remains a major challenge and is also of great ecological and conservation biology interest.
Genetic studies have demonstrated that gene flow between conspecifics either side of barrier is often limited for marine [9][10][11] and terrestrial fauna [12][13][14] . Among the major marine biogeographic boundaries, the Strait of Gibraltar that separates the Mediterranean Sea from the Atlantic Ocean and the Almeria Oran Front, which is a semi-permanent dynamic oceanographic front connecting the main jet of incoming Atlantic water and the Results Genetic diversity. A total of 514 individuals from seven different localities were genotyped at 16 microsatellite loci. The genetic diversity found in each locus across samples is summarised in Table 1. Highly significant deviations from Hardy-Weinberg equilibrium were found in all sampling sites and for each locus. Exact tests for genotypic linkage disequilibrium confirmed the absence of linkage among most loci (91.6%; P < 0.05 after Bonferroni correction). Genotyping was consistent across the 514 samples as revealed by mis-scoring rates from two independent scorers below 2%. The number of alleles (N A ) per locus ranged from 12 to 33 χ σ = .
= . ( 079, 012), respectively. The software Microchecker 32 depicted the occurrence of null alleles in all loci and the presence of large alleles dropout for some loci. Thus, to better estimate the frequency of null alleles, we relied on the software Cervus 33 (Table 1). The null allele frequencies results from Cervus revealed an estimate for each locus ranging from 6.6% to 55.3% χ σ = .
= . ( 079, 003), respectively. Despite the strong departure from HWE observed, we estimated the effective population size, using NeEstimator to explore current demographic patterns in localities exposed to different fishing pressure. The Ne estimates ranged from 210 for Isla Canela to 1191.7 for Doñana (95% confidence interval of the parametric test ranging from 165.1 to infinite; Table 2).
Genetic structure and gene flow. The analysis of molecular variance (AMOVA) revealed that the majority (98.6%) of the observed variation was explained by differences within the sampling sites, however a small but significant portion of the variation (1.1%) occurred among sampling sites between groups ( Table 3). Results of the F ST analysis revealed a significant genetic differentiation (P < 0.05) between the Gulf of Cádiz (Southern Atlantic Spanish coast), the Alboran Sea (Southwestern Mediterranean coast) and the Northwestern Mediterranean coast. This corroborates the presence of two well characterised biogeographical barriers, the Strait of Gibraltar 34 and the AOF (P > 0.05; Table 4). STRUCTURE analyses using the RECESSIVEALLELES and LOCPRIOR functions further substantiated the existence of three genetically differentiated clusters: one in the Gulf of Cádiz; a second one for the samples collected east of the Strait of Gibraltar, in the Alboran Sea, but west of AOF in the Mediterranean Sea; and a third one   for the samples collected northwest of the AOF in the Northwestern Mediterranean coast. A clear discrimination was obtained between these three clusters (Fig. 1). The assigned q-value for each pooled sampling site were of 0.806 ± 0.080 (range from 0.462 to 0.920) for the Gulf of Cadiz, 0.542 ± 0.084 (range from 0.357 to 0.736) for the Alboran Sea, and 0.845 ± 0.100 (range from 0.524 to 0.959) for the Northwestern Mediterranean Coast. Some individuals were not included into the mean individual admixture proportions of each pooled sampling site because they clustered into a different group: one individual from Isla Canela (q-value = 0.589) clustered into the Gulf of Cadiz and, two individuals from Caleta de Vélez (q-value = 0.438 and q-value = 0.355) and two from Cabo de Gata (q-value = 0.444 and q-value = 0.458) into the Northwestern Mediterranean Coast. Analyses using BayesAss suggest a consistent gene flow among the groups of populations ( Table 5). The average migration rate in all pairwise comparisons ranged from 0.003 to 0.034. Each value represents the proportion of individuals that is derived from a corresponding source population for each generation. Altogether, the average migration rate was very low between samples from the Alboran Sea (i.e. source) and the two other groups of populations (i.e. sink), 0.034 and 0.017, respectively for the Gulf of Cadiz and the Northwestern Mediterranean Coast and were higher than any other pairwise comparison.

Discussion
Several recent studies have clearly shown that the degree of connectivity among marine taxa separated by biogeographic barriers varies from extensive isolation to complete panmixia and thus the actual influence of these marine transitions remains controversial 35 . Furthermore, they have also shown that the differences observed are largely dependent on early life history traits, such as pelagic larval duration of the species concerned 11,36 . In fact, the number of studies that have utilised different molecular markers to unravel the contrasting results of the influence of biogeographical barriers to dispersal in marine environments is so vast that a new synthesis is required (e.g. 37 ). The failure to account for differences in heterozygosity among markers and species may explain to some extent for the irreconcilable patterns observed across ecological and evolutionary timescales 38 . For example, a recent study on European sea bass Dicentrarchus labrax confirmed that the Atlantic and Mediterranean basins harbour two distinct lineages 39 . Similarly, Luttikhuizen et al. 10 showed a very clear phylogeographic differentiation among the common shrimp (Crangon crangon) populations from the Atlantic Ocean and the Mediterranean Sea. Very high levels of genetic differentiation between Atlantic and Mediterranean populations (pairwise θ B values range from 0.395 to 0.510) were also found in the Lusitanian sea star (Asterina gibbosa) 18 . On the other hand, the analysis of the genetic structure among samples from Atlantic Ocean and Mediterranean Sea of the European   spiny lobster (Palinurus elephas) showed small but significant differences (F ST for pairwise comparisons within basins = 0.003 ± 0.004 and F ST among basins = 0.011 ± 0.005) indicating the presence of two partially-overlapping groups (overlapping of some localities from the Atlantic and Mediterranean basins) 40 . In contrast, a recent study on the genetic structure the holoplanktonic jellyfish, Pelagia noctiluca, across the Atlantic Ocean and Mediterranean Sea indicated a high degree of connectivity with little evidence of population subdivision between basins 41 . A very small but significant differentiation (F ST = 0.005) among Atlantic and Mediterranean stocks was detected when using microsatellite allele frequencies but no evidence of differentiation was observed using mtDNA haplotype diversity. The magnitude of genetic differentiation observed between the different pooled sampling sites in the wedge clam can be considered as large values for marine taxa (from 2 to 3%). In this study, the level of genetic differentiation observed between the main oceanographic discontinuities along the Gulf of Cádiz to the Western Mediterranean Sea (the Gibraltar Strait and the AOF) confirms that these biogeographic barriers to dispersal effectively restrict gene flow between D. trunculus populations as supported by our Bayesass analysis and also as reported in many other taxa 11 . Before the entry of the Atlantic waters throughout the Gibraltar Strait a branch of these waters recirculates near the Strait, in front of the Cape of Trafalgar, towards the northwest along the coast of Cádiz. This area is also influenced by the intense tidal-current regime of the Strait of Gibraltar and the strong topographic interaction between the swift along-shore tidal flow and a submerged ridge running perpendicular to the shoreline 15,42 . These processes originate persistently a patch of cold water that can affect the connectivity between populations at both sides of the Gibraltar Strait 11,43 . On the other hand, the AOF is a semi-permanent dynamic oceanographic front connecting the main jet of incoming Atlantic water and the Mediterranean Sea and it is situated at approximately 400 km east of the Strait of Gibraltar 16 . This front is formed by the convergence between two different water masses (of Atlantic and Mediterranean origin) present in the upper 300 m, and it is known to act as a barrier to gene flow in numerous species 11,15 . This phenomenon is clearly demonstrated through our results (Table 4 and Fig. 1), which displayed a significant genetic differentiation between the populations from the Gulf of Cádiz, the Alboran Sea and the Northwestern Mediterranean Coast. As such, these three areas should be considered as different management units. However, the levels of genetic differentiation obtained in this study might represent a conservative estimate as levels of heterozygosity may be artificially decreased due to the presence of null alleles. However, despite the heterozygous deficits of the loci used in this study, they were all sufficient for clearly differentiating the populations between the Atlantic Ocean and the Southeast and Northwest Mediterranean Coast of Spain.

Gulf of Cádiz
The high incidence of null alleles found in this study is consistent with many reports in natural populations of bivalve species analysed with DNA markers such as microsatellites 26,44,45 . In a previous study on D. trunculus, Nanton et al. 26 showed the presence of null alleles (range from 0.109 to 0.277) in 10 loci out of 19. Another study aiming to examine the genetic variability and relationships between two shellfish (Ruditapes philippinarum) populations in Korea also reported the occurrence of null alleles in 7 loci out of 10 46 . The presence of null alleles has also been reported in a study that assessed the genetic diversity of four wild and six hatchery stocks of the hard clam (Mercenaria mercenaria) in Florida. Null alleles were present in all loci (range from 0.015 to 0.296), but for 4 loci out of 7, the frequency of null alleles was particularly high (> 0.146) 47 . Although the causes of these heterozygote deficits remain uncertain, several hypotheses have been suggested to explain them including the Wahlund effect, selection, inbreeding and the presence of null alleles 48 . Inbreeding seems unlikely given the large population size of bivalves. Moreover, we cannot demonstrate from these results, selection against heterozygotes. Indeed, although microsatellite markers are supposed to be neutral, it may be possible that one or more loci are linked to genes under selection 49 or are under selective constrains due to DNA structure 50 . Another reason that may explain the presence of heterozygote deficits would be scoring errors. However, we can confidently discard scoring errors as a cause of heterozygous deficits because the robustness of the genotyping was thoroughly assessed by double checking independently all genotypes by two experienced scorers and secondly by using the software Microchecker (unpublished data).
Alternatively and certainly a far more interesting result would be that the deviations from HWE are due to the known severe exploitation that the species has been subjected over the last decades. For example, Hoarau et al. 51  were not. They performed extensive testing to rule out that these deviations for HWE were not due to genotyping artefacts and Wahlund effects. Subsequently, the authors were able to attribute the significant heterozygote deficiencies found from 1970 onward to inbreeding and overexploitation. Unfortunately, we do not have access to archived samples to verify if the populations where in HWE in the past. However, we attempted to rule out genotyping errors (see above) and PCR artefacts using the following approaches. First, we redesigned primers whenever the flanking sequences where adequate (D.tru2, D.tru8, D.tru14, D.tru28, and D.tru32). Secondly, we re-amplified 10 loci including the loci for which we redesigned primers and 5 more which produced codominant amplicons for a subsample of individuals (approx. 200) under relaxed annealing temperatures (4 °C lower) to verify that additional alleles had not been missed. None of these strategies resulted in a reduction of the HWE deviations and thus, it was concluded that PCR artefacts were minimal and insufficient to explain the large heterozygous deficits observed. While it is impossible with the current dataset and with the funding that as available for the project to demonstrate that overexploitation may play and important part in the lack of conformance with HWE, this explanation remains plausible and interesting. In this study, the microsatellite loci characterized for D. trunculus showed variable levels of genetic variation than those previously found 26 . The allele number of these polymorphic markers ranged from 12 to 33, which is higher than previous results on D. trunculus 26 as well as the most part of the studies on other bivalve species such as Sinonovacula constricta 52 and Mactra chinensis 53 . A heterozygosity deficit has been observed for each locus and all sampling sites ( Table 1; Table 2). Although these results are higher than those reported by Nanton et al. 26 , and are consistent with the levels of variability found in both for D. trunculus and other clam species 54 .
The effective population size (Ne) is an important parameter of numerically small populations, which is positively correlated with their ability to persist in a changing environment and to evolve. The 50/500 rule often cited by conservation practitioners 55 has been recently revised to ≥ 100/1000 by Frankham et al. 56 . They estimated that a minimum of 100 for Ne was necessary to avoid immediate risk of extinction due to inbreeding depression. Populations with Ne less than 500-1000 were at high long-term risk of extinction due to the loss of ecologically relevant genetic diversity by drift and populations with Ne above 1000 were considered as being able to retain their evolutionary potential. D. trunculus populations have been intensively exploited over the last decades, and particularly in Europe 24 , and thus, all populations seem to be at high long-term risk of extinction except in Doñana, a managed area located in a National Park where effective population sizes appear to be larger ( Table 2) 57 . However, although we used the P crit value recommended for microsatellite markers (P crit = 0.02; 58 ), these results should be taken with caution given the presence of null alleles, and as a consequence of the heterozygote deficits. Interestingly however, Isla Canela suffers from severe overexploitation from both, registered and licensed fishermen and from Illegal, Unregulated and Unreported (IUU) extraction and has the lowest Ne estimate. This result supports to some extent the robustness of the analyses 57 .
Both populations, Isla Canela and Doñana, are geographically very close to each other (85 km) and share a similar habitat structure: long sandy beaches close to a river mouth (Guadiana and Guadalquivir Rivers, respectively) (Fig. 2). However, they also present clear differences in respect to the management of the Donax trunculus fishery. The Doñana beach belongs to the Doñana National Park, and since 1996 has a regulation of the D. trunculus fishery: only 160 shellfish collectors are allowed (have a license). These shellfish collectors have to work using only an artisanal rake with a long net attached to the end of the drag (fishing from boats is forbidden) on the intertidal zone. During five days a week, they can collect a maximum of 25 kg of D. trunculus individuals per day and these must be larger than 25 mm. Furthermore, they have to respect the breeding season (two months a year) without fishing, and they cannot collect when toxic algae are detected in the wedge clams. In the case of Isla Canela, in addition to the shellfish collectors with license that work in the D. trunculus fishery, there are numerous illegal collectors who operate in the area throughout the year and sell their catches in the streets of Huelva and Seville and other smaller villages in both Provinces of Andalusia and, in general, visitors to the beach also harvest clams mostly during the summer months (CR & JAC personal observations). Fishing from boats is also allowed, so the area of collection is wider, covering both the subtidal and intertidal zones. Moreover, there is neither a strict and respected reproductive period without fishery, nor a strict control over the D. trunculus minimum sizes. All these differences in the exploitation of D. trunculus fishery are reflected in the size structure of both populations. For example, in July of 2011, 2012 and 2014, although significant differences in D. trunculus density between the two beaches were only observed in 2014, individuals from Doñana were always significantly larger than those from Isla Canela (Supplementary Methods and Results; Figs 1 and 2), possibly reflecting different age structures resulting from over-harvesting in Isla Canela. This observation is also in line with the observed differences in effective population size between these locations.
In conclusion, for the first time, the genetic diversity and population structure of Donax trunculus is described along the Atlantic-Mediterranean transition. Using 16 microsatellite markers, we identified the presence of a large proportion of null alleles in this species. However, the presence of null alleles in the loci appears to have no effect on the population genetic structure uncovered. Our results showed that all populations seem to be at high long-term risk of extinction with the exception of the Doñana National Park population which still seems to have evolutionary potential. Interestingly, this population is rigorously managed by the authorities of the National Park with restrictions in the number of fishing licences, the amount of allowed harvest and the use of hand operated dredges. This work represents a contribution for the understanding of the genetic diversity and population structure of this species, and de facto in terms of biodiversity conservation and resource management. Carles de la Ràpita and Roses). All the samples used in this study belong to species targeted by the commercial fishing industry and are neither protected, nor require approval of any animal ethics committee since they are invertebrates. Permit of sampling in the Doñana National Park was granted for the study. DNA extraction. DNA was extracted from different wedge clam's tissues using a slightly modified version of Aljanabi and Martinez 59 salting out method. Namely, after addition of the saline solution, the mixture was centrifuged for 30 min at 10,000 g. Also, DNA precipitation was achieved by incubation at − 20 °C with 600 μ L isopropanol for 30 min. then, after washing out the pellet with 70% ethanol, we centrifuged at 10,000 g for 10 min. Genetic diversity analyses. Deviation from Hardy-Weinberg equilibrium and linkage disequilibrium were tested using Genepop (version 4.0 60 for each marker. We used the software MicroChecker 32 to test the presence of genotyping errors due to stuttering, allele drop-out or null alleles. Results showed the presence of null alleles for almost each marker, as well as stuttering and large allele drop-out for some of them (results not shown), but no genotyping errors from allele dropouts or stuttering affected allele scoring. Additionally, a deficit of heterozygosity for almost each marker was observed. Consequently, all samples were re-amplified for each of the 10 loci under relaxed annealing temperature (4 °C lower) to verify that additional alleles had not been missed. After scoring all individuals, the same analyses were re-conducted for each marker. Furthermore, primers were redesigned for the loci where sufficient and adequate flanking regions were available. Primer sequences for this new primers are; D.tru2 F 5′ FAMACCACCAATTCTCCTACGG3′ , D.tru2 R 5′ ATGTGGGCGATGATTTCCT3′ ; D.tru8 F 5′ PET TAAAATTGCCATGCGTGCAG3′ , D.tru8 R 5′ GAAATATATTGCAGGCTGGTAGG3′ ; D.tru14 F 5′ VIC TTTTTGTTCTTCTGAATAGTGCAA3′ , D.tru14 R 5′ TTTACAGCGTGTCGCCATCT3′ ; D.tru28 F 5′ FAM CTGAGAAGTAATAAAACGTGAATTGTTG3′ , D.tru28 R 5′ CAACAGAGCCACTGATGACAA3′ ; D.tru32 F 5′ NED TGGGTCCTGGAGGGTAAAAT3′ , D.tru32 R 5′ TTAGTCAGAACCTCAACTCTCAAA3′ .

Methods
The total number of alleles (N A ), the observed (H O ) and expected (H E ) heterozygosity, the polymorphism information content 61 and the null-allele frequency (NAF) were also determined for each locus and each sampling site using the program CERVUS 3.0 33 . Effective population size for all the populations were also estimated using the linkage disequilibrium method 62 as implemented in NeEstimator (version 2.0 58 ).
Analyses of population genetic structure. In order to assess the significance of the level of genetic differentiation, an analysis of molecular variance (AMOVA) was implemented in Arlequin (version 3.5.2.1) 63 . Specifically, we evaluated the amount of variance attributable to groups of sampling sites (Gulf of Cádiz: Isla Canela and Doñana; Alboran Sea: Caleta de Vélez and Cabo de Gata; Northwestern Mediterranean coast: Gandía, San Carles de la Ràpita and Roses), sampling sites within the groups and within sampling sites. The extent of genetic differentiation among the group of sampling locations was assessed using pairwise F ST estimates 64 obtained in FSTAT version 2.9.3 65 .
The individual admixture proportions (q-values) for each individual in each sampling site, considering each one like a population, was performed using STRUCTURE v.2.2 66 . Because of the presence of null alleles, analyses were done incorporating RECESSIVEALLELES option. In this case, the program assumes that the recessive allele is never observed in homozygous state but it might be present (e.g. when there might be null alleles). Moreover, the genetic differentiation of population being weak, the LOCPRIOR model has been used. The program assumes an initially probability that each individual is a resident of its sampling locality calculating posterior probabilities that individuals belong to their sampled locality. The number of clusters for each analysis was therefore assessed using the ad hoc statistic ∆ K 67 . An admixture model with correlated allele frequency was used for each analysis with 250,000 steps of the Markov-Chain preceded by a burnin-period of 100,000 steps.
BAYESASS version 3.0 68 was used to estimate direction and rates of migration that has occurred more recently (i.e., within the last several generations) between the different groups of populations (Gulf of Cadiz, Alboran Sea and Northwestern Mediterrannean coast). Analysis were performed pooling locations as recommended by Meirmans 69 (i.e. few populations with many individuals). This method allows for deviations from HWE but assumes linkage disequilibrium and constant migration rates for two generations prior to sampling. Migrants are defined by hybrid genotypes and the program determines a migration rate m, which is the proportion of first generation migrants. Parameters were as follows: number of iterations was 3,000,000, of which 10 6 are burn in, and the sampling frequency was 2,000.