Construction of a high-density genetic map and QTL mapping for pearl quality-related traits in Hyriopsis cumingii

A high-density genetic map is essential for quantitative trait locus (QTL) fine mapping. In this study, 4,508 effective single nucleotide polymorphism markers (detected using specific-locus amplified fragment sequencing) and 475 microsatellites were mapped to 19 linkage groups (LGs) using a family with 157 individuals. The map spanned 2,713 cM, with an average of 259 markers and 79 loci per LG and an average inter-marker distance of 1.81 cM. To identify QTLs for pearl quality traits, 26 putatively significant QTLs were detected for 10 traits, including, three for shell width, seven for body weight, two for shell weight, two for margin mantle weight, five for inner mantle weight, and seven for shell nacre colour. Among them, five QTLs associated with shell nacre colour were mapped to LG17 and explained 19.7% to 22.8% of the trait variation; this suggests that some important genes or loci determine shell nacre colour in LG17. The linkage map and mapped QTLs for shell nacre colour would be useful for improving the quality of Hyriopsis cumingii via marker-assisted selection.

SLAF marker detection and genotype definition. Among the 239,704 high-quality SLAFs, 132,542 were polymorphic with a polymorphism percentage of 55.3% (Table 1). Of the 132,542 polymorphic SLAFs, 108,004 were classified into eight segregation patterns. The segregation patterns were as follows: ab × cd, ef × eg, hk × hk, lm × ll, nn × np, aa × bb, ab × cc, and cc × ab (Fig. 1). Since the F 1 population was used in this study, only SLAF markers with segregation patterns of ab × cd, ef × eg, hk × hk, lm × ll, nn × np, ab × cc, or cc × ab were used for the genetic map construction. To further improve SLAF accuracy, only SLAFs with more than 80% integrity and more than 20-fold average sequence depths in the parents were used for the map construction. Finally, 4,508 of the 108,004 markers were selected for the linkage map construction (Table 2). Of these 4,508 markers, 60.96% were heterozygous in the male parent; 61.78%, in the female parent; and 21.33% to 47.07%, in the F 1 individuals. Average sequencing depths of the 4,508 markers were 86.71-fold and 92.66-fold for the male and female parents, respectively, and 7.99-fold for each progeny.

Construction of the high-density linkage map.
To construct a high-density, high-quality genetic linkage map for H. cumingii, the newly developed 4,508 SNPs and 506 SSR markers from the framework map 32 were used. After linkage analysis, 4,920 markers were mapped to 19 linkage groups (Table 3, Fig. 2). The final map was 2,713 cM in length, with an average of 79 loci per LG and an average inter-marker distance of 1.81 cM. The number of markers mapped to each linkage group varied from 114 markers in LG15 to 436 markers in LG2, with an average of 258 markers per LG. The largest LG was LG2, which contained 436 markers with a genetic length of 196.50 cM, and the smallest LG was LG10, which contained only 180 markers with a length of 91.09 cM. The mean chromosome region length was 142.80 cM. The 'Gap ≤ 5 cM' value (indicates the percentages of gaps in which the distance between adjacent markers is smaller than 5 cM) of the 19 linkage groups ranged from 80.28% to 100% (average, 97.79%). The maximum gap was 18.91 cM, which was observed in LG15. In this study, sex-specific maps were also constructed (Table S4). The male map contained 3,233 markers and spanned 2,561 cM, while the female map contained 3,123 markers and spanned 2,810 cM. In addition, the male map contained 170 markers and 48 loci per LG, while the female map contained 164 markers and 56 loci per LG. The average inter-marker distances for the male and female maps were 2.78 and 2.66 cM, respectively. Although the lengths of the male and female maps were similar, we observed significant differences in some of LGs. For instance, the ratios of lengths of LG10, LG13, and LG18 in the male and female maps were 1.39, 1.45, and 1.61, respectively (Table S4).
Comparison of the constructed map with the SSR-based framework map. Since we constructed an SSR-based framework map for H. cumingii in a previous study, we performed a brief comparison of the two maps. Although both maps comprised 19 linkage groups, significant differences were observed in the number of markers, map length, and resolution between the two maps. The SSR-based framework map contained only 506 SSR markers and spanned 1,922.3 cM, which is obviously less than the 4,920 markers and 2,713 cM map length observed in this study. Our current linkage map has higher resolution at 1.81 cM, which is significantly superior to the resolution of the previous genetic map with an average inter-marker distance of 3.99 cM. Besides, we noted many large gaps in the SSR-based map and a lesser number of large gaps in the current map.
QTL mapping for pearl quality-related traits. The basic statistics for the pearl quality traits are listed in Table S2. On the basis of the high-density genetic map, a total of 26 QTLs were detected on LG1, LG4, LG8,  LG14, LG15, and LG17 ( The validation of four QTLs associated with shell nacre colour on LG17. Statistical analysis showed that all three markers were significantly associated with shell nacre colour (P < 0.05). For Marker257873, the H. cumingii with CC genotype showed a significantly higher value in AL while significantly lower values in Aa, Ab and AdE, compared with the individuals with CT or TT genotypes. For Marker189034, the individuals with AA genotype showed significantly higher values in Aa, Ab and AdE, but a significantly lower value in AL when compared with the individuals with GA or GG genotypes. Similarly, as for Marker273509, the individuals with GG genotype had a significantly lower value in AL but higher values in Aa, Ab and AdE (Table 5).

Discussion
High-density linkage maps are especially important for QTL mapping, positional cloning, comparative genomics, MAS, and identification of functional genes [35][36][37][38] . Currently, although SNPs have become the first-selected genetic markers for high-density linkage map construction, their widespread application remains limited by the high costs of high-throughput sequencing and genotyping 39 . Fortunately, NGS-based marker identification and genotyping technologies provide a powerful method for the identification of large numbers of SNPs. The SLAF-seq approach balances genotyping accuracy and sequencing cost, thus providing an economical and efficient method for linkage mapping of non-model species with complex genomes [24][25][26][27]40 .
In this study, SLAF-seq was used to develop SLAF markers in an F 1 full-sib family of H. cumingii. By SLAF library construction and high-throughput sequencing, 239,704 high-quality SLAFs were detected, and 129,967 SLAFs were polymorphic with a high polymorphism rate of 54.22%. Although the SLAF-seq strategy provides a powerful tool for large-scale SNP identification, some inevitable errors and missing values occur during SLAF marker development and genotyping 41 . Therefore, strict criteria were used to avoid false-positive markers, and 4,508 markers were finally used for the linkage mapping. Since the sequencing depth reflects the accuracy of de novo SNP identification and genotyping quality scores can be used to detect suspicious markers, they are  both important criteria for ensuring genotyping accuracy. The minimum sequencing depth for each individual is 6-fold when the SLAF-seq approach is used 24 . In our study, average sequencing depths were 86.71-fold and 92.66-fold for the male and female parents and nearly 8-fold for each progeny, which provided a high level of genotyping accuracy. In addition, genotyping quality scores of these 4,508 markers reached 30, which effectively eliminated the suspicious markers. We excluded markers with significant segregation distortion (the genotype integrity of these markers was less than 80% in the mapping population), thus ensuring the quality of the SLAF markers and genotyping accuracy. Therefore, our SNP genotyping data can be reliably used for high-density linkage map construction.
We constructed a high-density genetic map for H. cumingii, containing 4,920 markers with a resolution of 1.81 cM. The high-density genetic map is a great improvement over our previous map for H. cumingii 32 , providing an important tool for fine mapping of QTLs for important traits. Our current linkage map is better than previous genetic maps for bivalves such as P. maxima (1,189 SNP markers) 42 , Pinctada fucata martensii (3,117 markers) 43 , and C. farreri (3,806 markers) 7 with respect to marker number, but it is slightly inferior with respect to average marker interval because of the larger genome size of H. cumingii. We also observed some large gaps in the current map; this suggested that there may be a higher level of recombination or that markers in these regions have not been developed. Hence, other enzymes may be used for SLAF library preparation, and the size of the mapping population needs to be improved in the future.
When compared with the SSR-based framework map 32 , the map constructed in this study spanned a cumulative distance of 2,713 cM, showing a significant increase in map length. Of course, the genetic length of a linkage map could be affected by many factors, such as the number and type of markers, size of the mapping population, and accuracy of the genotyping 37 . Errors in genotyping usually cause elongation of a linkage map. In our study, strict criteria were used during SLAF marker development and genotyping, which ensured the quality of the SLAF markers and greatly reduced genotyping errors; this enhanced the accuracy of the genotyping and, thus, effectively avoided artificial inflation of map length. Therefore, elongation of the genetic map in this study was probably caused by the variation in the number and type of the markers and changes in the size of the mapping population.
Accuracy of marker order and the genetic map is one of the most interesting aspects for people who work with genetic mapping. In this study, the order of common SSR markers in the present map and the previous framework map was compared 32 . For most LGs, a roughly similar order was observed for the SSR markers; however, different orders of common SSR markers were observed in some LGs. The variation may be partly due to the different numbers of markers and individuals used for the two genetic maps. A previous study showed that a larger-sized population is considered sufficient for the construction of reasonably accurate genetic maps 44 . Furthermore, this effect will be even more obvious when the number of different families increased, since it could increase the power to map loci by increasing the number of informative meiosis events. In addition, different methods for map construction may influence the accuracy of marker order and the genetic map. Some minor differences in marker order have been observed between linkage maps constructed using JoinMap4.1 and HighMap 41 . A previous study has reported that HighMap has many advantages over JoinMap4.1 45 . HighMap not only permits the use of more NGS data but could also guarantee higher accuracy of marker order and map distance for data with a large proportion of missing or erroneous markers.
Many factors affect pearl quality, such as size, shape, colour, and lustre. Since production of a non-nucleated pearl requires the participation of both donor and host mussels, previous studies have highlighted the link between pearl traits and growth-related traits in H. cumingii. A previous study has shown that shell weight, body weight, and shell width are significantly associated with pearl diameter and pearl yield 46 . Margin mantle of the donor mussel is used to make the saibo, and size and thickness of the margin mantle determines the size and thickness of the saibo. The inner mantle, in which a non-nucleated pearl is implanted in the receptor mussel, directly affects pearl growth. Therefore, these five growth traits are considered related to pearl quality 47

Table 4. Summary of QTLs for pearl quality-related traits in F 1 populations of H. cumingii.
colour has always been an important criterion for evaluating pearl value, and shell colour is believed to have a connection with pearl colour because of their similarities in formation. A previous study revealed that pearl colour is related mainly to the shell nacre colour of the donor mussel in H. cumingii 48 . The results showed that extremely significant correlation existed between inner shell color of the donor mussel and pearls color (P < 0.01). Moreover, the greater the dE (total colour change) value of the donor mussels is, the higher percent of purple pearl the mussel can produce. Therefore, pearl colour may be improved by improving shell nacre colour in H. cumingii. A high-density linkage map is an effective tool for the fine mapping of QTLs for these important traits in H. cumingii. In this study, shell nacre colour and growth-related traits were fine mapped using the improved  linkage map, and 26 putatively significant QTLs were detected. Compared to our previous study, six more QTLs for growth-related traits were detected, and the positions of the QTLs were refined to a greater degree. However, the QTLs detected on LG1, LG4, LG8, and LG15 were significantly different from the results of our previous study 32 , in which growth-related QTLs were located on LG1, LG2, LG4, LG6, LG8, and LG18. Such inconsistent results are possibly due to the differences in the resolution of the two genetic maps. To date, one or two QTLs for some traditional growth-related traits, such as body weight, shell width, and shell weight, have been reported in many bivalves 7,28,30,33,49 . In this study, seven QTLs for body weight, three QTLs for shell width, and two QTLs for shell weight were identified, with an average PVE of approximately 10%; thus, a higher QTL number and lower PVE were observed when compared with the abovementioned studies. In addition, five QTLs for inner mantle weight and two QTLs for margin mantle weight, which were considered to be related to pearl formation, were identified in our study. Since there is little doubt that shell formation and pearl formation are inextricably linked 50 and close ties exist between the growth traits and pearl traits, all the detected QTLs may have a potential impact on shell or pearl formation and should be investigated further. In the present study, seven QTLs associated with shell nacre colour were detected, showing a high percentage of PVE of 19.7% to 22.8%. Of these seven QTLs, five significant QTLs related to shell nacre colour tended to colocalize, clustering in a special region (17.3 cM to 36.1 cM) on LG17. This result showed a high similarity to the previously identified five QTLs located on the same chromosome in the region of 0-42.6 cM. In particular, the significant LOD region (17.3-36.1 cM) of these five QTLs contained 29 markers and showed smaller marker gaps of no more than 3.8 cM. These findings were more refined than those of our previous study in which only seven markers were found and the biggest gap was 16.2 cM (0-42.6 cM). Furthermore, three SNPs located on four QTLs were selected and validated to be significantly associated with shell nacre colour in other families, which suggested that the shell nacre colour related QTLs identified in the study had high accuracy and reliability. Although there are few studies on QTLs for shell nacre colour, a recent study on P. maxima reported nine putative QTLs for pearl colour and one QTL for pearl surface complexion 34 . Moreover, the most prominent QTLs were located within a 2-cM interval on LG10, which is similar to our findings on QTLs for shell nacre colour. Given the correlation between shell nacre colour and pearl colour, we believe that there are some important genes or loci that determine shell nacre colour or pearl colour on LG17.
Further studies need to be performed to identify genes that determine shell nacre colour in H. cumingii.

Methods
Mapping and DNA isolation. An F 1 full-sib family of H. cumingii produced by an intra-specific cross between two mussels was used as the mapping family. The female and male parents are from the Poyang Lake population and Dongting Lake population, respectively. The aquaculture environment and management process were as described by Bai et al. 32 . DNA was extracted from 200 F 1 individuals and their parents, according to a method described previously 51 . Finally, parentage analysis was carried out, and paternity of 157 F 1 progenies was confirmed using eight microsatellite markers, as described previously 52 .
SLAF library preparation and sequencing. The SLAF library was prepared as described previously 24 , with minor modifications. A pre-designed experiment was conducted to evaluate the enzymes and restriction fragment sizes. On the basis of the results of the pilot experiment, the optimum scheme was confirmed and used to construct the SLAF library. Genomic DNA from the two parents and F 1 progenies was incubated at 37 °C with MseI (New England Biolabs, NEB), T4 DNA ligase (NEB), ATP (NEB), and MseI adapter. Restriction-ligation reactions were heat-inactivated at 65 °C, and digestion was performed using HaeIII and PvuII-HF TM at 37 °C. PCR was performed with a diluted restriction-ligation mixture, dNTP, Taq DNA polymerase (NEB), and MseI primer containing barcode 1. Next, we used the E.Z.N.A. Cycle Pure Kit (Omega) to purify and pool the PCR products. The pooled PCR products were subsequently incubated at 37 °C with MseI, T4 DNA ligase, ATP, and Solexa adapter. Then, the sample was purified using a Quick Spin column and run on a 2% agarose gel. Subsequently, fragments ranging from 314 to 394 bp were excised and purified using a Gel Extraction Kit, and these DNA fragments were amplified using PCR with the Phusion Master Mix (NEB) and Solexa Amplification primer mix to add barcode 2. Then, DNA fragments of 314-394 bp were gel-purified and diluted for pair-end sequencing on an Illumina High-seq 2500 sequencing platform at Biomarker Technologies Corporation in Beijing.
SLAF-seq data analysis and genotyping. SLAF-seq data analysis and genotyping were performed according to the method developed by Sun et al. 24 . Briefly, all SLAF pair-end reads with clear index information were clustered on the basis of sequence similarity detected using BLAT 53 . Then, sequences with more than 90% similarity were assigned to the same locus. Minor allele frequency (MAF) evaluation was subsequently performed to define alleles in each SLAF. H. cumingii is a diploid species, and one locus contains no more than four SLAF tags; therefore, loci containing more than four tags were discarded as repetitive SLAFs. Correspondingly, groups with two, three, and four tags were identified as polymorphic and considered potential SLAFs. To further improve SLAF accuracy, SLAFs with more than 80% integrity and more than 20-fold average sequence depths in parents were used for the map construction.

Construction of the genetic linkage map.
To reduce the influence of distorted markers, the chi-square test (χ 2 ) was performed and the SLAF markers with significantly distorted segregation (P < 0.05) were excluded from the linkage mapping. Then, all high-quality SLAF markers and 506 SSR markers of the previous study were used for the linkage map construction 32 . The markers were divided into LGs by using the single-linkage clustering algorithm at logarithm of odds (LOD) threshold ≥ 4.0 and a maximum recombination fraction of 0.4. To avoid genotyping errors and deletion caused by NGS, High Map Strategy was used 45 . MSTmap algorithms 54 and SMOOTH algorithms 55 were used to order the SLAF markers and correct genotyping errors, respectively. The LGs were constructed as follows: Primary marker orders were first obtained by their location on the chromosomes, according to the relationship between ordered markers, and genotyping errors or deletions were corrected using the SMOOTH algorithm; then, MSTmap was used to order the map and SMOOTH was again used to correct the newly ordered genotypes. The processes were performed iteratively to ensure the accuracy of marker order and map distances, and high-quality maps were constructed after four or more cycles.
Phenotypic measurement and QTL analysis. The following five growth-related traits were measured in the 157 progenies: shell width, body weight, shell weight, margin mantle weight, and inner mantle weight and five shell colour-related traits, purple mantle scar length, CIE 1976 L* (lightness), a* (redness), b* (yellowness) colour space, and total colour change (dE) 32 . The average L* (AL), average a* (Aa), average b* (Ab), and average dE (AdE) represent average values measured on the three points of inner shell. Statistical analysis of the phenotypic data was conducted with SPSS 18.0. MapQTL 5.0 software was used for the QTL mapping with the interval mapping method. The genome was scanned at 1-cM intervals, the maximum LOD score along the interval was considered as the position of the QTL, and the region in the LOD score greater than the threshold was considered as the confidence interval. The LOD score threshold was initially set at 3.0 for QTL declaration, and QTLs that exceeded this LOD threshold were considered as suggestive QTLs. If any relevant QTL was identified, the LOD score threshold was determined using the 1,000-permutation test with a confidence of 0.99. QTLs with LOD scores greater than the threshold at a confidence of 0.99 were declared significant.
The validation of four QTLs associated with shell nacre colour on LG17. To examine whether the SNPs located on QTLs were associated with shell nacre colour, three SNPs were selected from shell nacre colour related QTLs (AL-1, Aa-1, Ab-3, AdE-1) and used for primer design (Table S5). Then, the primers and markers were assessed in three families of H. cumingii (one hundred individuals for each family) from Chongming aquaculture farm of Shanghai Ocean University in Shanghai, China. Genomic DNA was extracted and CIE 1976 L* (lightness), a* (redness), b* (yellowness) colour space, total colour change (dE) were measured as described above. Primer designing and genotyping assays were carried out using KASP technology (LGC Genomics, Hoddesdon, UK) as described previously 56 . Associations between SNPs and shell nacre colour related traits were analyzed using Chi-square test in JMP 8.0 software.