Introduction

The rockpool shrimp Palaemon elegans Rathke, 1837 is a decapod crustacean with a native geographical distribution ranging from the Atlantic Ocean (from Scotland and Norway to Mauritania, including the Azores, Madeira and Canary Islands) to the entire Mediterranean Sea and the Black Sea1. Nowadays, this shrimp also inhabits the Caspian Sea and the Aral Sea because of unintentional introductions2. Similarly, this species was introduced into the Baltic Sea, where it is replacing the native congeneric P. adspersus2,3. The occurrence of P. elegans on the northeast coast of the United States was recently reported4, so its distribution range now extends beyond European waters.

This shrimp is common in tidal rockpools and in Zostera, Posidonia and Cymodocea seagrass meadows. It can also be found in hypersaline lagoons and in slightly brackish water close to river mouths5. Palaemon elegans is characterized by its capacity to adapt to highly variable environmental conditions. Therefore, due to its broad ecological niche and its recent and ongoing geographic expansion, it is considered an important species within the European coastline fauna6.

Population genetic analyses of P. elegans are scarce and exclusively based on mitochondrial DNA (mtDNA) markers. Reuschel et al.6 carried out a phylogeographic analysis with two mtDNA markers, revealing the existence of three haplogroups, one in the Atlantic localities (type I) and two in the Mediterranean localities (types II and III). Genetic differentiation between the Atlantic populations (type I) and the Mediterranean populations (type II) was observed, as well as the putative occurrence of a cryptic species within P. elegans (type III). This population genetic pattern was again supported with mtDNA markers in later phylogeographic studies in the Mediterranean Sea7,8. These findings highlighted the need for further genetic studies with polymorphic nuclear markers, such as microsatellites, to clarify the population biology of P. elegans.

Given their codominant nature, biparental mode of inheritance and high levels of polymorphism9, microsatellite markers have been used in a wide range of applications in population genetic, ecological, conservation and evolutionary studies10. Indeed, microsatellite loci are extremely valuable tools in population genetics because they can reveal the existence of genetically isolated populations even in fine-scale studies11. Polymorphic microsatellite loci have been developed for three Palaemon species12,13,14. In the case of P. serratus14, thirteen loci showed positive cross-species amplification in P. elegans. Nevertheless, cross-amplification from congeneric species is not generally reliable because inherent problems such as allele size homoplasy, polymorphism biases, the presence of null alleles, broken repeat motifs or amplification of non-orthologous loci could arise15,16,17,18. Thus, de novo development of species-specific microsatellite markers is strongly recommended.

Classically, microsatellite development required the construction of an enriched library by cloning and Sanger sequencing, a laborious, time-consuming and expensive strategy19,20. This drawback has been overcome with the advent of next-generation sequencing (NGS) technologies, which produce large amounts of sequence data and provide a faster and more cost-effective approach for microsatellite loci discovery21. The first microsatellite markers developed using NGS were based on Roche 45422; however, Illumina has since demonstrated its capability for microsatellite isolation23,24,25 and is currently the platform used for this purpose.

The aim of this study was the isolation and characterization of novel polymorphic microsatellite loci for P. elegans using Illumina high-throughput sequencing. Because no microsatellite loci have previously been developed for this shrimp species, these markers will provide suitable new tools to assess the genetic diversity, geographic structure and connectivity of populations at smaller geographic scales, among other possible future applications.

Results and Discussion

The development of microsatellite markers in non-model species has become a rapid and cost-effective process thanks to advances in high-throughput sequencing technologies26. Thousands of microsatellite loci can be identified in the large amount of sequence data generated. Illumina is currently the platform used for microsatellite isolation, and it produces more reads at lower cost than Roche 45427. Paired-end sequencing is frequently the preferred strategy for microsatellite isolation because longer reads are generated. In this study, the P. elegans microsatellite-enriched library was sequenced using Illumina MiSeq PE, an approach already used for microsatellite discovery in other organisms, e.g. Ewers-Saucedo et al.28, Landínez-García & Márquez29 and Gaeta et al.30. Here, sequencing of the microsatellite-enriched library produced 3,902,540 paired-end reads. These reads were processed and overlapped into 1,766,031 sequences, with a mean length of 171 bp (range: 50–538 bp). Tandem repeats were identified and 500 sequences containing microsatellite motifs were used for primer design. For the preliminary screening, ninety-six of these 500 primer pairs were tested in five individuals from different Atlantic and Mediterranean localities that had already shown genetic divergence with mitochondrial markers in Reuschel et al.6. Of these, primer pairs that produced no amplification or PCR products of unexpected size were discarded. Likewise, monomorphic loci were excluded, as were loci that showed stuttering patterns that could not be interpreted unambiguously. A final suite of 21 microsatellite loci (perfect tri- and tetranucleotide repeats) yielded consistent amplification and reliable polymorphism. Sequences containing these 21 microsatellite markers were deposited in GenBank under accession numbers MH078079-MH078099 (Supplementary Material S1).

The microsatellite loci were characterized by genotyping 30 individuals collected in the Santoña locality (Table 1). Successful amplification and scoring were achieved for the 21 microsatellite loci in all individuals. No evidence of linkage disequilibrium between pairs of loci was detected after Bonferroni correction. Hence, all microsatellites were considered independent markers. Regarding genetic variability, the number of alleles per locus ranged from 2 to 12, with a mean of 4.6 alleles per locus. The observed heterozygosity (Ho) for each locus ranged from 0.033 to 0.833, and the expected heterozygosity (He) ranged from 0.033 to 0.869 (Table 1). Ho and He averaged 0.390 and 0.464, respectively. These levels of polymorphism are in line with those found in the congeneric common littoral shrimp P. serratus in another Atlantic locality14. Significant departures from Hardy-Weinberg equilibrium (HWE) were found in four loci (Pe11, Pe14, Pe18 and Pe19) after sequential Bonferroni correction (Table 1). A heterozygote deficit was detected for these four loci. Similarly, heterozygote deficiency has been reported in other shrimp species12,14,31,32,33. Inbreeding is one of the most frequent causes of HWE deviations. The four loci out of HWE showed positive FIS values (Table 1), and a global FIS value of 0.174 also suggested a heterozygote deficit in the Santoña locality. However, null alleles could also be invoked to explain heterozygote deficits. Although microsatellite null alleles are widespread, marine invertebrates have shown particularly high frequencies compared with other groups34,35. Due to the instability of the flanking sequences of microsatellites, some alleles may fail to amplify and consequently drop out, resulting in an apparent homozygote excess. The four loci that deviated from HWE showed evidence of null alleles, although mostly at low frequencies (Table 1). Thus, the presence of null alleles is the most likely explanation for those departures from HWE. Overall, most of the 21 loci showed low null allele frequencies (Table 1). Only one locus, Pe11, exhibited a high null allele frequency (>0.2 according to Chapuis & Estoup36). Pe11 is also one of the four loci that showed significant deviation from HWE. Therefore, given that Pe11 could be potentially problematic, its inclusion in future analyses should be carefully considered.
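The sequential Bonferroni adjustment applied to the HWE tests can be illustrated with a short sketch. It assumes that "sequential Bonferroni" refers to Holm's step-down procedure, which is the usual meaning in population genetics; the function name, the default alpha of 0.05 and the example p-values are hypothetical and are not taken from Table 1.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure: return a list of booleans (True = significant).

    Assumption: alpha = 0.05 is a conventional default, not a value stated in the text.
    """
    m = len(p_values)
    # Visit the tests from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            significant[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return significant


# Hypothetical p-values for four loci (illustration only).
example = sequential_bonferroni([0.001, 0.04, 0.03, 0.005])
```

With 21 loci, the smallest p-value would be compared with 0.05/21, the next with 0.05/20, and so on, which makes the procedure less conservative than a single Bonferroni threshold applied to every test.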

Table 1 Characterization of the 21 microsatellite loci for Palaemon elegans. 5′ tails attached to reverse primers are in brackets.

Microsatellite markers have proven to be crucial in understanding the population genetics and ecology of several shrimp species, such as Penaeus setiferus37, P. vannamei38, P. monodon39 and Aristeus antennatus40. Assessment of genetic diversity and population structure provides valuable data for conservation and fishery management measures. The microsatellite loci characterized here provide suitable new tools for the scientific community to study Palaemon elegans population dynamics. Specifically, future studies using these microsatellite loci will shed light on P. elegans phylogeography at smaller geographic scales. In fact, there are ongoing projects that aim to analyze Atlantic and Mediterranean localities along the European coastlines using these markers to address the genetic differentiation between these basins. This microsatellite set will also be highly useful for inferring the introduction routes and the genetic profiles of the introduced populations of this species in European and North American waters41,42,43.

Material and Methods

Specimens of P. elegans used in this study were collected from Ártabro Gulf (Galicia, Spain, 43°18′48.1″N, 8°33′51.3″W) in 2012 and stored in ethanol at the laboratory. Morphological identification was carried out according to González-Ortegón & Cuesta5. Genomic DNA was extracted from 25 mg of abdominal muscle tissue using the NZY Tissue gDNA Isolation kit (NZYTech) following the manufacturer’s instructions. The quality and concentration of the extracted DNA were determined with a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific).

A microsatellite-enriched genomic library from this sample was constructed at AllGenetics & Biology SL (A Coruña, Spain). The library was prepared using the Nextera XT DNA Library Preparation kit (Illumina) following the manufacturer’s instructions. The microsatellite motifs AC, AG, ACG and ATCT were used for library enrichment. The enriched library was sequenced on the Illumina MiSeq PE300 platform (Macrogen Inc., Seoul, Korea). Reads were processed using Geneious 10.0.544 and in-house developed scripts. Sequences containing microsatellite loci were selected for primer design. Primer pairs were designed using Primer345 for PCR amplification of 500 microsatellite loci, with primers hybridizing to the flanking regions of the tandem repeats.
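The selection of sequences containing perfect tandem repeats can be sketched as follows. This is only an illustration and not the authors' pipeline, which relied on Geneious and in-house scripts that are not described; the function name, the motif lengths searched and the minimum of five repeats are all assumptions.

```python
import re


def find_microsatellites(seq, motif_lengths=(2, 3, 4), min_repeats=5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Assumptions: motif lengths of 2-4 bp and a threshold of 5 repeats are
    illustrative choices, not parameters reported in the text.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-mer followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AAA" or "ACAC".
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence with one tetranucleotide and one dinucleotide repeat.
demo = "GGTC" + "ATCT" * 6 + "GGA" + "AC" * 8 + "TTG"
demo_hits = find_microsatellites(demo)
```

In practice, candidate loci would also need enough flanking sequence on both sides of the repeat for primer design, a filter that this sketch does not apply.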

A preliminary panel of 96 primer pairs was screened in individuals from five different Atlantic and Mediterranean localities (one individual per locality): Ré Island (France, 46°11′36.2″N, 1°30′10.2″W), Santoña (Spain, 43°28′04.2″N, 3°28′30.6″W), Benijo (Spain, 28°34′21.7″N, 16°11′47.0″W), Llança (Spain, 42°21′06.3″N, 3°11′15.1″E) and Collioure (France, 42°31′43.8″N, 3°05′07.6″E). All genomic DNA extractions followed the aforementioned protocol. PCR reactions were performed following Schuelke46, so three primers were used for each microsatellite locus: a specific forward primer, a specific 5′-tailed reverse primer and a fluorescently labeled oligonucleotide identical to the 5′-tail of the reverse primer. As 5′-tails we used the universal sequences M13 (5′-GGAAACAGCTATGACCAT-3′) and CAG (5′-CAGTCGGGCGTCATCA-3′). M13 oligonucleotides were labeled with the HEX dye, whereas CAG oligonucleotides were labeled with the 6-FAM dye. Nested PCR reactions were conducted in a final reaction volume of 12.5 µL, containing 1 µL of DNA (10 ng/µL), 6.25 µL of the Type-it Microsatellite PCR Kit (Qiagen), 4 µL of PCR-grade water, and 1.25 µL of the primer mix (2 µM for both the forward primer and the HEX-M13 or 6-FAM-CAG oligonucleotides, and 0.2 µM for the M13- or CAG-tailed reverse primers). The optimal PCR conditions consisted of an initial denaturation step at 95 °C for 5 min, followed by 30 cycles of 95 °C for 30 s, 56 °C for 90 s, 72 °C for 30 s; 8 cycles of 95 °C for 30 s, 52 °C for 90 s, 72 °C for 30 s; and a final extension step at 68 °C for 30 min. Amplification reactions were performed on a T100 Thermal Cycler (Bio-Rad). Fluorescently labeled PCR products were run on a 3130xl Genetic Analysis System (Applied Biosystems) for fragment analysis at the Scientific Research Support Services (University of A Coruña), using the GeneScan 500 (−250) ROX internal size standard (Applied Biosystems). Geneious 10.0.544 was used for analyzing fluorescent profiles and calling allele peaks.
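As a quick arithmetic check, the per-reaction volumes listed above add up to the stated 12.5 µL, and they can be scaled for a batch of reactions. The sketch below is a hypothetical helper, not part of the published protocol; the 10% pipetting overage is an assumption.

```python
# Per-reaction volumes in microlitres, as given in the text.
PER_REACTION_UL = {
    "template DNA (10 ng/uL)": 1.0,
    "Type-it Microsatellite PCR Kit": 6.25,
    "PCR-grade water": 4.0,
    "primer mix": 1.25,
}


def scale_reaction(n_reactions, overage=0.10):
    """Scale per-reaction volumes (uL) to n_reactions plus a pipetting overage.

    Assumption: the default overage of 10% is a common bench habit, not a
    value from the text.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


total_per_reaction = sum(PER_REACTION_UL.values())  # 12.5 uL
batch = scale_reaction(30)  # e.g. one locus genotyped in 30 individuals
```

Template DNA is normally added to each tube separately rather than pooled, so only the other three components would go into a shared master mix.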

All microsatellite loci that yielded consistent amplification and reliable polymorphism were further assessed by genotyping 30 individuals from the Santoña locality (Cantabria, Spain). Singleplex PCRs were carried out as described above. To characterize each locus, the number of alleles, observed heterozygosity (Ho), expected heterozygosity (He) and deviations from Hardy-Weinberg equilibrium (HWE) were estimated using GenAlEx 6.547. GENEPOP48 was used to estimate the inbreeding coefficient (FIS) and to test for linkage disequilibrium between pairs of loci. The significance level was adjusted by applying the sequential Bonferroni correction for multiple testing. Null allele frequency was estimated using FreeNA36 and the EM algorithm49.
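The per-locus summary statistics named above can be illustrated with a minimal sketch. It assumes diploid genotypes with no missing data, computes He as 1 minus the sum of squared allele frequencies, and approximates FIS as 1 - Ho/He. GENEPOP estimates FIS following Weir & Cockerham, so its values will differ somewhat from this simple ratio. The function name and the example genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Summarize one locus from a list of (allele_a, allele_b) diploid genotypes.

    Assumptions: no missing data; He is the uncorrected 1 - sum(p^2);
    FIS is approximated as 1 - Ho/He rather than the Weir & Cockerham estimator.
    """
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = {allele: c / (2 * n) for allele, c in counts.items()}
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    he = 1.0 - sum(p * p for p in freqs.values())     # expected heterozygosity
    fis = 1.0 - ho / he if he > 0 else float("nan")   # heterozygote deficit if > 0
    return {"n_alleles": len(counts), "Ho": ho, "He": he, "FIS": fis}


# Hypothetical genotypes (allele sizes in bp) showing a heterozygote deficit.
demo_genotypes = [(150, 150)] * 4 + [(150, 154)] * 2 + [(154, 154)] * 4
demo_stats = locus_stats(demo_genotypes)
```

A positive FIS, as in this example, corresponds to the heterozygote deficit pattern that can arise from either inbreeding or null alleles.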

Accession codes

Sequences containing the microsatellite loci were deposited in GenBank under accession numbers MH078079-MH078099.