Development of microsatellite loci and optimization of a multiplex assay for Latibulus argiolus (Hymenoptera: Ichneumonidae), the specialized parasitoid of paper wasps

Microsatellite loci are commonly used markers in population genetic studies. In this study, we present 40 novel and polymorphic microsatellite loci elaborated for the ichneumonid parasitoid Latibulus argiolus (Rossi, 1790). Reaction condition optimisation procedures allowed 14 of these loci to be co-amplified in two PCRs and loaded in two multiplex panels onto a genetic analyser. The assay was tested on 197 individuals of L. argiolus originating from ten natural populations obtained from the host nests of paper wasps. The validated loci were polymorphic with high allele numbers ranging from eight to 27 (average 17.6 alleles per locus). Both observed and expected heterozygosity values were high, ranging between 0.75 and 0.92 for HO (mean 0.83) and from 0.70 to 0.90 for HE (mean 0.85). The optimized assay showed low genotyping error rate and negligible null allele frequency. The designed multiplex panels could be successfully applied in relatedness analyses and genetic variability studies of L. argiolus populations, which would be particularly interesting considering the coevolutionary context of this species with its social host.


Scientific RepoRtS
| (2020) 10:16068 | https://doi.org/10.1038/s41598-020-72923-6 www.nature.com/scientificreports/ The marker of choice for many areas of population research, such as mating systems, kinship structure, demography or conservation genetics, are microsatellite loci [6][7][8] . The usefulness of these markers results from their codominance, high polymorphism, and abundance throughout the genome 9 . The high popularity of microsatellite loci, resulting in their frequent application in population genetics studies, is caused also by the beneficial proportion between the amount of information gained and the sum of financial costs and expended work efforts. Thus far, no such loci have been described for the L. argiolus species; therefore, the purpose of this study was to investigate these loci. Several methods for novel microsatellite marker development exist. One of them introduces third generation sequencing on a PacBio RSII platform. The described de novo loci must be verified first for reliable amplification and high polymorphisms. Microsatellites fulfilling the criteria of error free amplification, high genotyping quality, and polymorphisms may be applied for population genetic analysis. In studies applying microsatellites, at least over a dozen loci are required to obtain high statistical power of the calculations results 10,11 . Therefore, concurrent amplification of markers in multiplex polymerase chain reactions (PCR) is desirable because this procedure will significantly shorten the time of laboratory work and reduce the cost of analysis. Such assays need to be first optimized and validated to justify the use of the loci in future research projects [12][13][14] .
This study aimed to develop a set of novel microsatellite loci and then to optimize and verify the newly designed multiplexes that could be applied in population and evolutionary studies of the specialized ichneumonid parasitoid L. argiolus species.

Material and methods
Sampling and DnA extraction. The  Pacific biosciences RS platform sequencing. The PacBio library was constructed using 9 µg of genomic DNA consisting of 16 (12.5 µl) equally mixed female DNA samples. DNA was fragmented using miniTube (cat. no. 520064) in the Covaris System to a size of 1.5 to 3 kb according to protocol with minor modifications in which the intensity was set to 0.2 while the sonication time was reduced to 40 s. The shared sample was purified applying 0.6 × volume ratio of AMPure PB Beads (cat. no. 100-265-900, Pacific Biosciences). Furthermore, the 2 kb library was prepared using 750 ng of DNA according to the Pacific Biosciences protocol available online 15 and using the SMRTbell Template Prep Kit 1.0 (cat. no. 100-259-100, Pacific Biosciences). The procedure included DNA damage and end repair followed by blunt ligation of hairpin adaptors at both ends of the DNA fragments. Failed ligation products were removed with ExoIII and ExoVII enzymes. Purifications steps separating enzymatic reactions were performed by applying 0.6 × volume ratio of AMPure PB Beads. Size distribution of DNA fragments during the library preparation procedure and for the final library was checked by electrophoresis on 0.5% agarose gels. The resulting library was doubly purified and prepared for sequencing using DNA/ Polymerase Binding Kit P6 v2 (cat. no. 100-372-700, Pacific Biosciences) and applying MagBeads Kit v2 (cat. no. 100-676-500, Pacific Biosciences) for loading onto the sequencer according to the protocol generated with Binding Calculator v.2.3.1.1 (available online: https ://githu b.com/Pacifi cBio scien ces/Bindi ngCal culat or). Single molecule real-time testing (SMRT) was carried out on a PacBio RS II sequencer running two SMRT Cells 8Pac v3 (cat. no. 100-171-800, Pacific Biosciences) and a 360-min data collection mode.
Data gathered from sequencing were subject to the RS_ReadsofInsert.1 protocol analysis with default settings to generate reads of insert (ROI). The ROI reads were subsequently subjected to microsatellite analysis and primer design using msatcommander v.1.08 16 with a threshold of at least eight or ten repetitions for tri-and tetra-nucleotide repeats, excluding mononucleotide repeats. Primer design parameters including the msatcommander option "combine loci" consisted of several parameters: (1) size of primers ranging from 18 to 22 bp, (2) annealing temperature (T m ) ranging from 58 to 62 °C, (3) GC content ranging from 30 to 70%, and (4) amplicon product size range of 80 to 400 bp. Finally, only microsatellite sequences containing tri-or tetra-nucleotide motif with at least 10 repeats were used for marker selection and amplification test performance.

Microsatellite marker selection and testing.
Loci with the highest number of tandem repeats were chosen from the identified microsatellite sequences set. The DNA sequences of the newly developed loci were aligned against the NCBI database of already described nucleotide sequences using BLAST algorithms (BLASTN 2.10.1 + and BLASTX 2.10.1 + programmes) [17][18][19] . In the next step, loci were filtered for their optimal amplification performance by applying following criteria calculated in silico: (1) maximal PCR efficiency and no dimer formation, (2) difference in annealing temperature between primer pair below 2 °C, (3) minimal penalty score, and (4) sequence length of microsatellite ranging from 100 to 350 bp. For multiplex design purposes, the selected markers were further ordered according to the predicted PCR product size into three categories long (> 300 bp), medium (150-300 bp), and short (< 150 bp). The finally selected markers were amplified on DNA of 16 diploid females. Polymorphisms and the allele range of the chosen markers were tested using the universal primer labelling method 20 Positive controls were not used as the DNA concentration and quality of all samples was satisfactory enough to permit successful PCR reactions. All loci were amplified separately, and the reaction efficiencies were visualized on 2% agarose gels. For loci in which no amplification products were observed, the amplifications were repeated by lowering the annealing temperature to 57 °C. PCR products were separated using an automated sequencer ABI 3500XL Genetic Analyzer applying GeneScan 600 LIZ as an internal lane size standard (cat. no. 4366589, Applied Biosystems). PCR products labelled with different fluorescent dyes were mixed for joint analysis. The resulting fragment sizes were read using GeneMapper v.4.1 (Applied Biosystems) software. Out of the tested set, only loci fulfilling following criteria were selected for multiplex panel optimization: (1) clear peak morphology, (2) no artefacts co-amplification, and (3) high number of alleles.

Multiplex panel optimization.
Out of 41 de novo described loci, 14 microsatellite markers were selected and organized into two multiplex PCR sets (seven loci each) by maximizing the number of loci labelled with the same dye (avoiding though allele overlap between loci) as shown in Fig. 1. The gaps between the neighbouring loci were set to the size of 66 to 137 bp according to the genotype data set of 16 females obtained during single microsatellite marker testing. The forward primer of each marker was fluorescence-labelled (  Table S2). Only six tetra-nucleotide loci fulfilling the microsatellite selection criteria (out of these set) were detected (Supplementary Table S1). From this set, 41 tri-nucleotide loci, named LA1-LA41 were selected for PCR amplification and testing (Supplementary Table S3) and then deposited in GenBank under accession numbers MN531308-MN531348.
Out of 41 amplified markers tested in 16 diploid L. argiolus females, 40 (97.6%) were successfully amplified and were polymorphic (Supplementary Table S4). For four markers (LA1, 14, 16, and 38), the annealing temperature had to be lowered to 57 °C in order to obtain PCR products. Most of the markers (78%) amplified well and yielded clear and readable products. Twenty-four (60%) of the markers gave single peaks, while in the others 16 (40%), additional loci peaks impeding potential uncertainness in allele scoring were observed. These peaks consisted of stutter bands or after-peaks. In seven cases, two peaks differing by one bp in length were observed. The latter effect is caused in cases in which polymerase does not add adenine to the 3′ end of the newly synthesised strand during replication (resulting in so called − A and + A bands occurrence) 27 . BLASTX analysis showed no results for neither of sequence of the 41 tested loci. BLASTN results indicated that eight DNA sequences were similar to previously described DNA or cDNA sequences deposited in GenBank (Supplementary Table S5). However, in all results, the obtained query cover values were too low for convenient sequence annotation to the already described in GenBank genes or transcripts. This indicates that the selected loci are probably located in the non-coding regions of the L. argiolus genome. The exception could be locus La36 for which the query cover values exceeded 50% together with low E value scores and relatively high percentage identity of the aligned sequences for several insect species. In most of these cases BLAST search identified transcription factors sequences as the most probable results. The observed allele number in the tested loci ranged from 4 to 18 with 25 loci exhibiting at least 10 alleles. The best amplified, readable, and most polymorphic loci (Supplementary Table S4) were considered for multiplex development. LA11 was excluded from this set since it exhibited difficulties in raw read rounding, which could have been caused by its hypervariability (N A = 18).

Development and characterization of the microsatellite multiplex assay.
For multiplex optimization purposes, 14 loci were selected and organized into two multiplex panels (Table 1, Fig. 1). The combined microsatellite primers pairs in these assays successfully amplified the chosen loci to yield peaks that were clear and easy to score. For most of the loci, the initial primer concentration had to be adjusted to achieve balanced amplification of the markers (Table 1). We did not detect any evidence of genotypic errors using the MICRO-CHECKER 2.2.1 software, a finding that was also proven by the low frequencies of null alleles explored by CER-VUS 3.0.3 ( Table 1). The amplification success of the multiplexes was high, reaching 197 (100%) of the microsatellite profiles. Robustness of the optimized assay was further verified by obtaining 100% correctly genotyped samples for 20 duplicated samples (~ 10% of the studied samples set as recommend by Pompanon et al. 27  www.nature.com/scientificreports/ significant linkage disequilibrium among any pair of the loci under study after Bonferroni correction (adjusted value of P < 0.004) was found, indicating that the studied loci were most probably not linked. Deviations from Hardy-Weinberg equilibrium were significant (P < 0.001) for only one (La24) of the 14 loci tested after Bonferroni correction ( Table 1). The average number of alleles (N A ) per locus was 17.6, ranging from 8 in locus La23 to 27 in loci La19 and 21. The set of diploid female samples was characterized by the highly observed heterozygosity (mean H O = 0.83) ranging from 0.75 in locus La24 to 0.92 in locus La8 in addition to the highly expected heterozygosity (mean H E = 0.85) ranging from 0.70 in locus La23 to 0.90 in loci La8 and 29. The mean polymorphic information content (PIC) ranged from 0.65 in locus La23 to 0.89 in loci La8 and 29 (Table 1).

Discussion
In this study, we successfully describe new loci and optimization of a multiplex assay for L. argiolus, the highly specialized parasitoid of social paper wasps. PacBio RSII platform sequencing allowed for reliable discovery of microsatellite markers. This method has been proven to perform very well in de novo microsatellite description and was recently often applied [28][29][30] . Application of this method is especially useful in non-model organisms in which no information about genomic data exists. In such cases, it also outperforms other methods by being more efficient and accurate 30,31 . This method is also relatively cost and time efficient compared to the classical methods of microsatellite development 9,28 . Out of 11,355 DNA sequences, approximately 1.7% contained tri-or tetra-nucleotide motifs exceeding ten repeats. Tri-and tetra-nucleotide loci were selected as they were proven to cause less difficulties in scoring and rounding 32 . This finding is important, especially since insects are considered problematic for microsatellite isolation and genotyping and are often biased with particularly high frequencies of null alleles 9,33,34 . Tri-and tetra-nucleotide loci have been suggested to be less frequent than di-nucleotide in the genome; however, they are less prone to amplifications errors, especially stuttering 29 . For this reason, primers and in silico amplifications were very carefully chosen. Sequences with high repeat numbers were selected as it was proven that such loci have higher polymorphism due to higher mutation rate caused by polymerase slippage 35 .
The amplification success of the designed markers was very high (97.6%), and most of the loci amplified well, which points to the high quality of the RSII sequencing and well-defined parameters for primers design [28][29][30][31] . The relatively long reads (on average, exceeding 2 kb) facilitated the design of primers that enabled high PCR performance 28 .
The approach of applying universal primer labelling 20 has been reported to be very efficient and cost effective in other studies 21,32,36 . In our current study, it was also proven to be a very useful and suitable method for primer labelling. The selected markers amplified well, and in most cases, produced (78%) clear bands and in 60% produced single peaks that permitted confident reading. Almost all of the tested markers were polymorphic, which may have been the consequence of selecting sequences with more than ten repeats. The rate of polymorphic loci in conventional methods of de novo microsatellite isolation may be as low as 30% 30,35 . This finding indicates that appropriate microsatellite sequences were selected for amplification tests in single specimens. This finding may also point to a high intrapopulation polymorphism of the studied species. However, a trade-off between high variability and possible artefacts may exist. The extremely polymorphic markers may be biased with high null allele frequency as a consequence of high a mutation rate in these loci 9,27,33 . This bias could have occurred in the LA11 locus. For multiplex optimization, direct fluorescent labelling was applied to reduce primer-dimer formation and enhance the PCR efficiency 32 , which was 100% in the studied sample set.
Our data show that the 14 microsatellite loci selected for assay development are reliable markers for genetic analyses of L. argiolus individuals with a good quality of detection, high polymorphism, and low frequency of null alleles. It is true that one of the loci was not in HWE, but it should be emphasized that the HWE test was performed for pooled diploid individuals that came from different populations resulting most probably in Wahlund effect. We believe that the presented tool will be very useful in L. argiolus studies, especially in the context of its life history strategies after considering co-evolution of this haplodiploid parasitoid with its social host. Because oviposition decisions of female parasitoids primarily influence their fitness, we hypothesize that female of L. argiolus lays more eggs in larger nests of the paper wasp and exhibits adaptive sex ratio manipulation in response to the number of her offspring produced in a single nest of the host. The previously mentioned hypotheses could be now addressed by applying the developed marker set that would allow determination of kinship and could also be a very good tool for the rapid identification of sex of preimaginal stages of L. argiolus. We believe that application of microsatellite markers will yield many answers concerning the biology and genetic structure of this specialized parasitoid. Additionally, the deposited sequence data may be subject to further microsatellite searches and testing in other closely related species of this interesting group of parasitoids.