Detection of sex in adults and larvae of Leptinotarsa decemlineata on principle of copy number variation

The identification of sex in larvae of insects is usually challenging or even impossible, while in adults the sexual dimorphism is usually evident. Here, we used copy number analysis to develop a method of sex detection in Colorado potato beetle (Leptinotarsa decemlineata), which has an X0 sex determination system. The X linked gene LdVssc and autosomal gene LdUBE3B were identified as appropriate target and reference loci, respectively. The copy numbers (CNV) of LdVssc in males and females were estimated using standard droplet digital PCR (ddPCR) and real-time PCR (qPCR). With both methods, CNVs were bimodally distributed (BAddPCR = 0.709 and BAqPCR = 0.683) with 100% ability to distinguish females from males. The use of qPCR-based sex detection in a broad collection of 448 random CPB adults showed a perfect association (Phi = 1.0, p < 0.05) with the true sexes of adults, with mean CNV in females of 2.032 (SD = 0.227) and 0.989 in males (SD = 0.147). In the collection of 50 random 4th instar larvae, 27 females and 23 males were identified, consistent with the expected 1:1 sex ratio (p = 0.689). The method is suitable for sexing in all stages of ontogenesis. The optimal cost-effective application of the method in large populations requires the DNA extraction using CTAB, the qPCR assay in one biological replicate and three technical replicates of each marker, and the use of one randomly chosen male per run to calibrate calculation of CNV.

www.nature.com/scientificreports/ In Coleoptera, adults are morphologically sexually dimorphic 16,17 , but dimorphism is mostly absent earlier in ontogeny 18,19 , including the CPB. The sex of CPB adults can be reliably identified based on differences in the last abdominal sternite 16 . However, there is no morphological marker of sex for the four larval instars. Here, we show that sexing of CPB larvae is possible based on copy number variation (CNV) of X-linked genes (present in one copy in males, and two in females), relative to an autosomal reference gene 20 (with two copies in both sexes). We tested the CPB LdVssc gene as a target locus, as it has been shown to be X-linked 21 . This gene is a candidate for pyrethroid resistance 22 that encodes a voltage-sensitive sodium channel protein, and is highly conserved across Insecta and within populations of different insect species 23 . As a potential reference gene, an autosomal singlecopy gene LdUBE3B, encoding ubiquitin-protein ligase E3B, was chosen. This gene was localized on linkage group 09 of the beetle species Tribolium castaneum Herbst, 1797 24 .
CNV can be reliably detected using droplet digital PCR (ddPCR). However, the hardware is prohibitively expensive for widespread use 25 . The main objective of our work was therefore to develop method of sex detection in CPB larvae based on CNV using standard quantitative real-time PCR (qPCR), to compare its reliability with CNV detection using ddPCR, and to identify possible factors influencing the results of such analysis.

Material and methods
Sample collections and sex detection of adults. Adults and larvae of CPB used in the research were collected in 16 localities of the Czech Republic in July 2019. The individuals collected were separately placed into 1.5 polypropylene tubes, frozen in liquid nitrogen, and stored in a deep freezer at − 80 °C (SANYO).
The method for molecular sex identification based on CNV was developed and optimised on a collection of adult CPB (collection A, with 10 females (F1-F10) and 10 males (M1-M10)) collected in potato fields of the Experimental station of Czech University of Life Sciences in Prague-Suchdol. An additional set of randomly collected adults from 15 other localities in the Czech Republic (collection B), with 234 females and 214 males, was used to test the reliability of the method. The true sexes of adults were identified using morphological differences in the last visible abdominal sternite, following Khidhir and Mustafa 16 . A third sample of 50 random 4th instar larvae collected in Prague (collection C) was used to test the accuracy of the new sexing method in larvae.

DNA extraction and quantification.
To check flexibility of the molecular sexing method, two different methods of DNA extraction from various tissue materials were used.
Genomic DNA from adults in collection A was isolated in two biological replicates (I and II), from caput and thorax with legs, respectively. This was performed using a NucleoSpin® Tissue Kit (Macherey-Nagel, Germany) following an animal tissue support protocol. Prior to the DNA extraction, tissues of both replicates were homogenized using Precellys Lysing Kit for hard tissue (Bertin Instruments, France) at maximum speed in Minilys (Bertin Instruments, France) for 180 s.
Single extractions of genomic DNA of samples from collections B and C used an optimised CTAB method following Chen et al. 26 . The CTAB buffer consisted of 2% (w/v) CTAB diluted in 100 mM Tris-HCl, 20 mM EDTA, and 1.4 M NaCl; 0.2% (v/v), β-mercaptoethanol was added immediately before use 27 . 200 µL of CTAB buffer and 1 µL β-mercaptoethanol were added to each tissue sample in 1.5 mL polypropylene tubes, represented by 4 legs of adult or by caput with thorax of larva. The samples were homogenized using a sterile glass rod and then 2 µL of RNase A 100 mg mL −1 solution (Qiagen, Germany) were added. After incubation at 37 °C for 1 h, 2 µL of Proteinase K 20 mg mL −1 solution (Macherey-Nagel, Germany) was added, followed by further incubation at 50 °C for 1 h. The homogenate was then extracted with 200 µL of Roti-Phenol/chloroform/isoamyl alcohol (Carl Roth GmbH, Germany), vortexed vigorously and centrifuged (14,000×g, 10 min). The supernatant was transferred into a new 1.5 mL tube to which 200 µL chloroform/isoamyl alcohol (24:1) was added, and the mixture was then vortexed and centrifuged (14,000×g, 10 min). The supernatant was transferred into a new 0.5 mL tube. To precipitate DNA, an equal volume of chilled isopropanol was added, and the tube was very gently mixed. After the formation of a precipitate, the samples were incubated at − 20 °C for 1.5 h and centrifuged (14,000×g, 10 min). The pellet was washed twice with 200 µL of absolute ethanol, and centrifuged (14,000×g, 10 min). The pellet was dried in a thermoblock at 45 °C for 30 min. The dry pellet was then dissolved in 50 µL of 1 × TE buffer overnight.
DNA samples were quantified with a NanoPhotometer (Implen, Germany) and diluted with PCR grade water (Sigma, Germany) to a concentration of 5 ng µL −1 .
Design of primers. The gonosomal LdVssc and autosomal LdUBE3B genes were used as the target and reference genes, respectively, because the complete sequences of both genes were identified in whole genome sequencing data of L. decemlineata and enabled to design primers of equivalent parameters needed for amplification at the same conditions. To prevent problems with specificity across population, the specific primer pairs were located in the exon regions of each gene using Primer3 Input 0.4.0 Program 28 and Primer-BLAST (National Center for Biotechnology Information). Source data used to design the primers are specified in Table 1 with the final primer sequences for both genes.

Real-time PCR
The optimization and validation of the qPCR assay using the CPB collection A adults was designed in two biological replicates and three technical replicates. One adult male M1 was used as a calibrator. The verification of the method on DNA samples of collections B and C extracted by CTAB used one biological replicate and three technical replicates, when one random male (a putative male larva) was used as the calibrator for each PCR run.
The copy number value based on the REST© algorithm according to Pfaffl et al. 30 was calculated using a male as a calibrator (Eq. 2).
where E: Efficiency of primers, ∆Ct: Difference in cycle threshold (Ct) between the mean of technical replicates of the calibrator and the mean of technical replicates of the tested sample.
Sequencing of LdVssc and LdUBE3B amplicons. DNA fragments of one male and one female from collection A, were amplified by qPCR and sequenced in three replicates. Amplicons were extracted from 1.5% agarose gel using a MiniElute PCR Purification Kit (Qiagen, Germany), amplified using BigDye Terminator v 3.1 Kit (Life Technologies, USA) and analysed with a ABI 3730xl DNA Analyser (Life Technologies, USA). The sequences obtained were compared with published data using the program BioEdit version 7.0.5.3 31 and deposited in the NCBI international nucleotide database.
Statistical evaluation of experiments. The distributions of CNV values obtained for all datasets were tested for bimodality 32 using the statistical program R v4.1.0 33 . In the optimization process, the equality of the CNV values of biological replicates detected by ddPCR and qPCR was evaluated by paired sample t-tests, after confirming the normality of differences by the Shapiro-Wilk test. The results of molecular sex identification of all adults were compared with the true sex using association coefficient (�) . The sex ratio detected in the collection of larvae by qPCR was compared with the expected 1:1 ratio using a χ 2 -test. These statistical evaluations were done using the Dell Statistica Software (Dell, USA).
Ethical approval. All of the experimental procedures were conducted in accordance with Czech legislation (Section 29 of Act No. 246/1992 Coll. on the protection of animals against cruelty, as amended by Act No. 77/2004 Coll.). We hereby declare that animal handling conducted in our study complies with the relevant European and international guidelines on animal welfare, namely Directive 2010/63/EU on the protection of animals used for scientific purposes and the guidelines and recommendations of the Federation of Laboratory

Results
Specificity of PCR markers of LdVssc and UBE3B genes. The electrophoretograms (Fig. 1) XM_023167301.1). Similarly, the LdUBE3B amplicon was fully homologous with the NW_019289582.1 sequence used to design the primers. The amplicon of the LdUBE3B gene contained partial sequences of exons 6 and 7 with the entire intron 6 (58 bp), and its exon region sequence was also identical with the published LdUBE3B mRNA sequence (XM_023174728.1). The specificity of our primers was also confirmed by melting analysis (Fig. 1E,F). The graphs show the melting curves typical for both genes in both male and female samples. The melting temperatures (T m ) 81.5 °C and 80.0 °C were determined for both LdVssc and LdUBE3B amplicons respectively. Table 2 shows the parameters characterizing ddPCR in collection A, including descriptive statistics, and the CNV values are in Table 3. The analysis outputs (Fig. 1A,B) from QuantaSoft Version 1.7.4.0917 (Bio-Rad, USA) demonstrated that we had achieved an optimal setup of the assay, since most counted droplets were classified as positive or negative, and few droplets were detected in the "rain region". Comparable parameters were found for our optimized qPCR protocols, slope = − 3.324 (− 3.341), y-int = 38.476 (36.528), R 2 = 0.985 (0.998) and E = 99.9% (99.2%), for both the target LdVssc and reference LdUBE3B genes. Since the difference between the slope values (0.017) was between − 0.1 to + 0.1, the CNV values could be determined by processing of raw Ct data, as suggested by Larionov et al. 29 . The qPCR-based Ct and CNV values detected specifically for collection A are presented in Tables 2 and 3, respectively. The optimization of method resulted in characteristic sets of qPCR curves for both studied genes (Fig. 1C,D). In the case of females (Fig. 1C), the curves have almost identical positions, resulting in ΔCt value approximately equal to 0.0, indicating the same copy numbers of target and reference gene. The shifted curves in males (Fig. 1D) correspond with a ΔCt value of approximately 1, indicating the expected halved quantity of target template in the sample, compared with the reference gene.

Statistical evaluation of the experiments.
In collection A, the CNV values had a bimodal distribution, as characterized by the bimodality amplitude (BA) for both the ddPCR (BA = 0.709) and the qPCR (BA = 0.683) assays, as documented by histograms ( Fig. 2A,B). The bimodality corresponded with the true sex of adults, with average CNV values of females and males of approximately 2 and 1, respectively ( Table 4). The CNV values associated perfectly with individuals' true sexes ( = 1.0, p < 0.05). Paired sample t-tests detected no significant differences in the mean CNV values of biological replicates using either ddPCR (t = 0.430, p = 0.672) or qPCR (t = − 1.809, p = 0.086). Normality of differences between CNV values of the biological replicates (required for the paired t-test) was confirmed by the Shapiro-Wilk test (W = 0.972, p = 0.802; W = 0.953, p = 0.414) for the ddPCR and qPCR methods, respectively. These results confirmed that both our sex prediction approaches are not affected by the biological material used to extract the genomic DNA.
In collections B and C, the results agreed with those detected for collection A. The CNV values detected by qPCR were comparable with values obtained in collection A and again showed bimodal distributions in males (Table 4 and Fig. 2C,D). True sex associated perfectly with the CNV values in collection B ( = 1.0, p < 0.05). In collection C, 27 females and 23 males were predicted by qPCR, in agreement with the expected 1:1 sex ratio (χ 2 = 0.16, p = 0.689).

Discussion
The study confirmed that CNV based sexing is possible using the X-linked LdVssc gene as a target and LdUBE3B gene as a reference. The target gene was detected in single copy in all 224 males (X0) and all 244 females (XX) were assessed as having two copies, after normalisation with the reference gene. The LdUBE3B gene required validation as a reference gene. Its genomic location in CPB was unknown (it is in an assembled unplaced genomic scaffold of the species, Ldec_2.0 scaffold202 (NW_019289582.1)), but its presence on autosomal linkage group 9 of Tribolium castaneum (NCBI gene ID: 659339) suggested that it could potentially be a suitable reference gene. Our results showed that the efficiencies of our primer pairs for both this gene and the target gene were comparable. Our results also documented that the LdUBE3B gene does not show sign of copy number variation (whereas the approximate mean copy number values for the LdVssc target gene in all experimental collections behaved as expected, with a bimodal distribution (Fig. 2) with peaks at CNV = 1 and CNV = 2, in males and females respectively, as just described). The LdUBE3B gene was therefore confirmed as an autosomal single-copy gene in CPB.
One of the expected options of the optimization was to design a valuable, flexible, repeatable, and costeffective method. The DNA extraction kit used in the optimization stage is not cost-effective for large numbers of population samples. Our analyses of collections B and C confirmed the flexibility of the method, because the    25 , qPCR can be recommended as an affordable method for extensive sample sets, although three technical replicates for each marker and sample are recommended, in agreement with Svec et al. 34 . The greater variability of CNVs in collection B using qPCR ( Fig. 2C and Table 4) may be related to the number of independent DNA samples, with individual quality differences. The set of 448 samples of adults required 28 independent runs, and an effect of different qPCR runs was also detected 34,35 . Although the CN values in each run were correctly calculated using a male individual as an internal calibrator, comparisons between runs, and calculation of correct CNVs using a single universal calibrator, was almost impossible. Technically, the use of a single individual from a population as a universal calibrator for each independent run is also not realistic for large sets or very small biological samples such as 1st instar larvae, which can yield insufficient DNA amounts for tens of replicates. For these reasons, a randomly chosen male from run is recommended used as a calibrator, and the CNVs should be calculated and interpreted individually for each run. Especially for prediction of sex in larvae, the individuals with lowest ΔCt were more probably males than females. This is likely an optimal solution for majority of insect species with regular occurrence of males, because an absence of males is unlikely in colonies of larvae in such case.
Despite these difficulties, qPCR sex determination associated perfectly with the true sexes of adults, giving very good prospect for predicting the sexes of each instar CPB larvae. Our tests in a set of 50 larvae confirmed that the method gave comparable results and works on the same principle as in adults. There is no way to identify morphologically the sex in larval stages for many insect species, an in those where this is possible, it is only for the final larval instar or pupae, to a limited extent 18,19,[36][37][38] . The method presented here therefore opens up new possibilities in the study of sex-related topics in all ontogenetic stages across insects.
The target gene LdVssc, known as a candidate gene of knock-down resistance (kdr) to pyrethroids 39,40 , was also confirmed by our study as a single copy gene in relatively large samples of CPB individuals collected across all regions of the Czech Republic that are important for production of potato. Copy number values higher than 2 were never detected in our experiments, suggesting that kdr susceptibility/resistance to pyrethroids in CPB is caused only by previously presented mutations within the gene 41 and not by increased CNV of the gene, as in corresponding gene in mosquitoes 25,42 .

Data availability
All obtained sequences were stored in the NCBI international nucleotide database under referred accession numbers. Unpresented datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.