Development of an SSR-based genetic map in sesame and identification of quantitative trait loci associated with charcoal rot resistance

Sesame is prized for its oil. Genetic improvement of sesame can be enhanced through marker-assisted breeding. However, few simple sequence repeat (SSR) markers and SSR-based genetic maps have been available for sesame. In this study, 7,357 SSR markers were developed from the sesame genome and transcriptomes, and a genetic map was constructed from 424 novel polymorphic markers using a population of 548 recombinant inbred lines (RILs). The genetic map had 13 linkage groups, equalling the number of sesame chromosomes. The linkage groups ranged in size from 113.6 to 179.9 centimorgans (cM), with a mean of 143.8 cM and a total length of 1869.8 cM. Fourteen quantitative trait loci (QTLs) for sesame charcoal rot disease resistance were detected, with contribution rates of 3–14.16% in four field environments; ~60% of the QTLs had 95% confidence intervals within 5 cM. The QTL with the highest phenotype contribution rate (qCRR12.2) and those detected in different environments (qCRR8.2 and qCRR8.3) were used to predict candidate disease response genes. The new SSR-based genetic map and 14 novel QTLs for charcoal rot disease resistance will facilitate the mapping of agronomic traits and marker-assisted selection breeding in sesame.

Controlling the disease safely and efficiently is a pivotal problem currently facing plant pathologists, geneticists, and breeders. Although breeding cultivars that carry resistance genes is expected to be the best long-term solution, genetic improvement has progressed slowly because little is known about the gene-for-gene relationship between sesame and the fungus M. phaseolina 20.
Genetic mapping provides the foundation for genetic study, especially for discovering and manipulating the loci or genes underlying simple and complex traits in crop plants 21,22. Genetic linkage map construction, QTL mapping, and evolutionary analyses performed in standard molecular biology laboratories have made SSR markers the primary choice for marker-assisted selection (MAS), largely because of their co-dominance, reproducibility, and relative abundance in complete genomes [23][24][25]. However, molecular genetic research in sesame lagged for decades, and SSR markers were developed only recently [26][27][28][29]. The first map for sesame was constructed by Wei et al. based on amplified fragment length polymorphisms 30 and was last updated in 2013.
Because few SSRs have been validated, genetic map construction and gene mapping have been hampered in sesame. In recent years, several SNP-based maps were constructed by restriction-site-associated DNA sequencing (RAD-seq) and specific length amplified fragment sequencing (SLAF-seq) using next-generation sequencing platforms [31][32][33]. However, such genetic maps and SNP tags are not easy for most sesame researchers in other molecular laboratories to use, because the required sequence information and special instruments are often unavailable.
The present study was designed to develop a greater number of SSR markers based on sesame genome and transcriptome sequences 34,35, and to construct a genetic map with these co-dominant markers. Furthermore, the loci associated with sesame charcoal rot resistance were screened. The SSR-based genetic map will provide an essential and effective tool for mapping the genes and QTLs underlying various traits in sesame, facilitating future gene discovery and the development of elite sesame cultivars through more productive breeding.

Results and Discussion
Polymorphic genomic-SSR development. A total of 110,495 genomic-SSR loci were detected in the sesame genome using the microsatellite identification tool (MISA) software. Of these, 39.1% were mono-nucleotides, 34.3% were di-nucleotides, 17.7% were tri-nucleotides, and 27.2% were compound SSRs. Relying on a previously published sesame transcriptome, 7,702 cDNA-SSR loci were investigated 28. Here, 5,587 genomic-SSRs and 1,770 cDNA-SSRs were selected for designing primer pairs to synthesise markers. PCR analysis showed that 498 of the markers were polymorphic between the parents, ZZM2748 and Zhongzhi No. 13, accounting for 6.8% of the total synthesised markers. As these markers were developed based on the sesame genome, this rate may reflect the general level of polymorphism at sesame SSR loci. However, SSRs with different repeat units differed in their polymorphism rates. The highest rate was found for dinucleotide repeat units (11.9%), followed by compound (7.8%) and mononucleotide (7.4%) repeat units. Tetra- and pentanucleotide repeat unit SSRs showed lower polymorphic rates (<2%).
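The SSR search step can be illustrated with a short script. The following is a minimal sketch, not the MISA implementation: it finds perfect repeats only, using the minimum repeat numbers given in the Methods (ten for mono-, six for di-, and five for tri- to hexa-nucleotide units), and it does not handle compound SSRs. The function name `find_ssrs` and the example sequence are hypothetical.

```python
import re

# Minimum repeat counts per motif length, as stated in the Methods:
# ten for mono-, six for di-, five for tri- to hexa-nucleotide units.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration; compound SSRs are not detected.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), which belong to a shorter class.
            redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence (an assumption, not sesame data).
    example = "GGC" + "AT" * 7 + "CCGT" + "A" * 12 + "G" + "CTT" * 5 + "C"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Counting the hits per motif length over a whole genome would give class proportions comparable in kind to those reported above, although exact numbers depend on how compound and overlapping repeats are treated.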
SSR-based genetic map construction. All 498 polymorphic SSR markers were used to genotype the 548 RILs generated from the cross between ZZM2748 and Zhongzhi No. 13. After filtering out markers that lacked polymorphic alleles and those with significantly distorted segregation ratios (P < 0.01) 36,37, the remaining 462 markers were used to construct a genetic map using JoinMap 4.0. Finally, 424 SSR markers were placed on the genetic map and distributed into 13 linkage groups (LGs). All 424 mapped markers were newly developed and are published here (Table S1). The LGs were numbered from LG1 to LG13 (Table 1, Fig. 1) and corresponded to the 13 assembled pseudomolecule chromosomes of sesame 38.
The lengths of the 13 LGs ranged from 113.6 to 179.9 cM, with a mean value of 143.8 cM, and the total map length was 1869.8 cM. The number of markers in each LG ranged from 17 (LG5) to 53 (LG8). The interval distances between adjacent markers varied from 0.1 to 36.2 cM, with a mean interval distance of 5.1 cM across the LGs; 81.4% of markers showed interval distances of less than 10 cM relative to adjacent markers. The highest marker density was observed in LG7 (an average of 2.2 cM between adjacent markers), followed by LG8; LG5 showed the lowest density (an average of 8.1 cM). The total number of markers in the map was lower than in previously published genetic maps constructed using next-generation sequencing technology, such as RAD-seq (1,230 and 1,522) 31,38 and SLAF-seq (1,233) 32. Because few SSR-based genetic maps were previously available, a large number of markers had to be designed and screened for polymorphism and proper segregation in the genotyped population; constructing this SSR-based genetic map was therefore very time consuming, taking 5 years. However, considering the small, diploid genome of sesame (357 Mb), the current map is adequate for genetic analysis, including gene and QTL mapping.
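Summary statistics of this kind (map length, mean and largest interval, share of short intervals) can be derived from ordered marker positions. The following is a minimal sketch, assuming the positions are available per linkage group in cM; the function name `interval_summary` and the toy positions are hypothetical and are not the map data. Note that it reports the share of intervals below the cut-off, which may differ slightly from a per-marker count.

```python
def interval_summary(linkage_groups, cutoff_cm=10.0):
    """Summarise adjacent-marker intervals.

    linkage_groups: dict mapping an LG name to a list of marker positions (cM).
    Hypothetical helper for illustration only.
    """
    gaps = []
    total_length = 0.0
    for positions in linkage_groups.values():
        ordered = sorted(positions)
        if len(ordered) < 2:
            continue  # a single marker defines no interval
        total_length += ordered[-1] - ordered[0]
        gaps.extend(b - a for a, b in zip(ordered, ordered[1:]))
    if not gaps:
        raise ValueError("no adjacent-marker intervals found")
    return {
        "total_length": total_length,
        "mean_gap": sum(gaps) / len(gaps),
        "max_gap": max(gaps),
        "share_below_cutoff": sum(g < cutoff_cm for g in gaps) / len(gaps),
    }


if __name__ == "__main__":
    # Toy positions (assumed values, not taken from Table 1).
    toy_map = {"LG_a": [0.0, 2.0, 14.0], "LG_b": [0.0, 5.0]}
    print(interval_summary(toy_map))
```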
Mapping the QTL associated with charcoal rot resistance. The DIs from the four environments (Table S2) showed that Yangluo and Jinxian were highly correlated, but showed low correlations with Luohe. These findings point to diversity among M. phaseolina pathogenic races in southern and northern China, as well as among environmental conditions (Fig. 3). Using the composite interval mapping method implemented in Windows QTL Cartographer 2.5 software (North Carolina State University, Raleigh, NC, USA) 39, 14 QTLs were found to be significantly associated with sesame charcoal rot disease resistance, with contribution rates of 3–14.16% (mean, 6.95%; Table 2, Fig. 1). A previous study showed enhanced infection by M. phaseolina when soil water was deficient, and several overlapping or pleiotropic QTLs for drought tolerance and charcoal rot resistance were reported in other crops 40,41. Sesame is generally drought tolerant, as it originated from the tropical regions of Africa or India 42, and several of these loci may function in drought tolerance. However, this possibility requires further investigation, as no genes or QTLs for drought tolerance in sesame have yet been identified.
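The DI values compared above are computed from the per-plant severity grades (0, 1, 3, 5, 7) described in the Methods. The formula itself is not reproduced in this text, so the following is only a minimal sketch under an assumption: it uses the widely used form DI = 100 × Σ(grade × number of plants at that grade) / (highest grade × total plants), which may differ from the formula in the cited reference. The function name `disease_index` is hypothetical.

```python
from collections import Counter

# Severity grades defined in the Methods.
GRADES = (0, 1, 3, 5, 7)


def disease_index(plant_grades, max_grade=7):
    """Disease index on a 0-100 scale for one plot.

    ASSUMED formula (not confirmed by the source text):
        DI = 100 * sum(grade * n_grade) / (max_grade * n_total)
    """
    if not plant_grades:
        raise ValueError("no plants were scored")
    counts = Counter(plant_grades)
    unknown = set(counts) - set(GRADES)
    if unknown:
        raise ValueError("unexpected grade(s): %s" % sorted(unknown))
    weighted = sum(grade * n for grade, n in counts.items())
    return 100.0 * weighted / (max_grade * len(plant_grades))


if __name__ == "__main__":
    # Toy plot with one plant at each grade (assumed data).
    print(disease_index([0, 1, 3, 5, 7]))
```

Under this assumed form, a plot of entirely healthy plants scores 0 and a plot in which every plant died scores 100, which makes values comparable across plots with different plant numbers.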
The 95% confidence intervals of the 14 mapped QTLs ranged from 0.8 to 25 cM; 60% were located within 5 cM (Table S3). The QTLs related to sesame charcoal rot resistance were distributed unevenly among the LGs.
LG3 contained the greatest number of mapped loci, with four QTLs, followed by LG8 and LG13, with three QTLs each (Table 2). In LG12, the locus qCRR12.2 had the highest phenotype contribution rate (14.16%). After mapping the flanking markers to the sesame genome 34, the locus qCRR12.2 fell into a region that included 15 genes. Annotation of these genes predicted a cluster of plant receptor-like serine/threonine kinase (RLK) genes (SIN_1001382, SIN_1001381, SIN_1001380, SIN_1001379, SIN_1001377, SIN_1001376, SIN_1001373, SIN_1001372). It has been hypothesised that expansion of the RLK gene family allowed accelerated evolution among domains implicated in signal reception, playing a central role in signalling during pathogen recognition 43. We also focused on qCRR8.2 and qCRR8.3 because both were detected in different environments. The physical regions of the two loci on the sesame genome both contained several homologous plant disease resistance genes encoding nucleotide-binding sites (NBSs) (Fig. 4) 44. Genes encoding NBSs are the largest class of disease resistance genes in plants 44. Thus, these mapped loci may represent important loci for sesame resistance to charcoal rot disease.

Conclusions
This study provided a novel genetic map for sesame and generated 424 polymorphic SSR markers. The mean interval between adjacent markers was 5.1 cM. Based on the constructed genetic map, 14 QTLs for sesame charcoal rot disease resistance were detected, with contribution rates of 3–14.16%; ~60% of these had 95% confidence intervals within 5 cM. The QTL with the highest phenotype contribution rate was qCRR12.2. Two loci (qCRR8.2 and qCRR8.3) were detected in plants grown in all four trial environments. QTL mapping revealed several candidate genes that were predicted to confer disease resistance. Thus, the genetic map of sesame is adequate for mapping genes or QTLs. Moreover, because the genetic map consists of SSR markers, it can be applied and supplemented more easily in a common molecular laboratory than SNP-based maps. Thus, this study will provide a useful reference for gene mapping and genetic studies in sesame.
Table 2. Mapped QTLs* associated with sesame charcoal rot resistance. *"Additive effect" indicates the estimated value for the genotype transmitted stably to offspring, and the "-" represents a negative contribution to disease. "R2" signifies the contribution rate of the locus to the phenotype; "Y" in the last four columns indicates that the QTL was detected at a specific trial site. YL, Yangluo; JX, Jinxian; LH, Luohe.

Methods
The field trials were conducted during the normal sesame planting season (June-September). In each field trial, the 548 lines were grown in a randomised complete block design with three replicate plots, each comprising three 2-m rows spaced 40 cm apart with a plant spacing of 10-20 cm. The highly susceptible material ZZM2748 was planted in every tenth plot as a check.
When over 50% of plants in each growing area exhibited apparent charcoal rot symptoms at the end stage of flowering, all plants were scored for disease severity grade (I) as follows: 0 = normal plant without disease spots; 1 = less than 1/3 of the plant and less than 1/4 of the capsules exhibited charcoal rot; 3 = 1/3 to 2/3 of the plant and 1/4 to 1/2 of the capsules exhibited charcoal rot; 5 = over 2/3 of the plant and 1/2 to 3/4 of the capsules exhibited charcoal rot; 7 = the entire plant died. The disease index (DI) was calculated from these grades according to a published formula 45.
SSR marker development. The Perl-based MISA tool (http://pgrc.ipk-gatersleben.de/misa) was used to search for microsatellite sites in the sesame genome (http://ocri-genomics.org/Sinbase) and transcriptome sequences 28,34,35,38. SSRs with mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeat units, and compound SSRs comprising two or more repeat motifs interrupted by ≤100 bases, were identified. The minimum repeat numbers were defined as ten for mono-, six for di-, and five for tri-, tetra-, penta- and hexa-nucleotide repeat SSRs. Subsequently, the two Perl scripts p3_in.pl and p3_out.pl were used to handle the data generated by MISA and to format the input data for primer design. Based on the SSR flanking sequences, PRIMER3 46 software was employed to design the primer pairs. The major parameters were set as follows: primer length of 18-23 bases (optimal, 20 bases), GC content of 40-70% (optimal, 50%), annealing temperature of 50-60 °C (optimal, 55 °C), and PCR product size of 100-400 bp (optimal, 200 bp). The markers developed from the sesame genome sequence were named with the prefix "ZMM", "D", or "ID" (genomic SSR), and those from transcriptomes were named with the prefix "ZM" (cDNA-SSR). All primer pairs were synthesised by GenScript Co., Ltd. (Nanjing, China).
The parents, ZZM2748 and Zhongzhi No. 13, and the 548 RILs (F8) were selected and used for total genomic DNA extraction employing the cetyltrimethylammonium bromide (CTAB) method 32,47. DNA concentrations were estimated using an ND-1000 spectrophotometer (NanoDrop, Wilmington, DE, USA) at 260 nm, and the quality was confirmed by 0.8% agarose gel electrophoresis using a lambda DNA standard.
Linkage map construction. The polymorphic SSR markers were used to genotype the 548 lines. Using JoinMap 4 software 48, the segregation ratios of these markers were evaluated with the chi-squared test, and significantly distorted (P < 0.01) markers were removed. With a logarithm of odds (LOD) score threshold of 4.0, the remaining markers were grouped and ordered according to their pair-wise recombination frequencies. The Kosambi mapping function was used to convert the recombination frequencies into map distances in cM. The goodness of fit of the calculated regression map for each tested position was checked with default parameters.
QTL analysis. The frequency distributions of the mean phenotypic data for all RILs in each trial were analysed using R software (https://www.r-project.org/). The QTLs related to charcoal rot resistance in sesame were detected with Windows QTL Cartographer 2.5 software (North Carolina State University, Raleigh, NC, USA) using the composite interval mapping method. A peak with a LOD score above 2.5 was taken to indicate the presence of a QTL, and the statistical significance of the QTL effect was determined based on 1,000 permutations. The detected QTLs were named according to trait and LG location, following the rules of wheat gene nomenclature (http://wheat.pw.usda.gov/ggpages/wgc/98/Intro.htm).
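Two steps of the linkage map construction, the segregation-distortion test and the Kosambi conversion, can be sketched in a few lines. This is a minimal illustration, not the JoinMap implementation. It assumes an expected 1:1 ratio of the two parental alleles in the RIL population (the text states only that a chi-squared test was used), and the function names `segregation_test` and `kosambi_cm` are hypothetical. The Kosambi function is the standard d = (1/4) ln[(1 + 2r) / (1 − 2r)] in Morgans, multiplied by 100 to give cM.

```python
import math


def segregation_test(n_a, n_b):
    """Chi-squared test (1 df) against an ASSUMED 1:1 allele ratio.

    n_a, n_b: numbers of lines homozygous for each parental allele.
    Returns (chi2, p_value).
    """
    total = n_a + n_b
    if total == 0:
        raise ValueError("no genotyped lines")
    expected = total / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # For one degree of freedom the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def kosambi_cm(r):
    """Convert a recombination frequency r (0 <= r < 0.5) to cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # Toy counts for a 548-line population (assumed, not observed data).
    chi2, p = segregation_test(320, 228)
    print("distorted at P < 0.01:", p < 0.01)
    print("distance for r = 0.10:", kosambi_cm(0.10), "cM")
```

Markers failing the test at P < 0.01 would be dropped before grouping, as described above. The Kosambi function yields distances slightly larger than r × 100 because it accounts for crossover interference and undetected double crossovers.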