Detection of candidate gene LsACOS5 and development of InDel marker for male sterility by ddRAD-seq and resequencing analysis in lettuce

A new breeding method for F1 hybrids using male sterility would open an exciting frontier in the breeding of lettuce, a self-pollinating crop. Male sterility is a crucial trait in F1 hybrid breeding, and using it requires mapping the causative gene. The ms-S gene, the male-sterile (MS) gene of ‘CGN17397’, was mapped to linkage group (LG) 8 by ddRAD-seq and narrowed down to an interval between two markers using two F2 populations. This region spans approximately 10.16 Mb, in which 94 genes are annotated according to the lettuce reference genome sequence (version 8 from ‘Salinas’). Whole-genome sequencing of the MS lines ‘CGN17397-MS’ and the male-fertile (MF) lines ‘CGN17397-MF’ revealed that only one gene in this region differed between them: Lsat_1_v5_gn_8_148221.1, a homolog of acyl-CoA synthetase 5 (ACOS5), was deleted in the MS lines. ACOS5 has been reported to be needed for pollen wall formation, and null mutants of ACOS5 are completely male sterile in some plants. Thus, I concluded that Lsat_1_v5_gn_8_148221.1, designated LsACOS5, is a biologically plausible candidate gene for the ms-S locus. Using the structural polymorphism of LsACOS5, an InDel marker was developed to select the MS trait. The results obtained here provide valuable information on genic male sterility in lettuce.

Lettuce (Lactuca sativa L.), a cool-season vegetable crop, is stressed in high-temperature environments 1,2. Increasing temperatures associated with climate change have been shown to negatively affect the growth of lettuce, a major leafy vegetable, and necessitate the development of new cultivars with enhanced stress tolerance. Owing to hybrid vigor, hybrids usually have better stress tolerance than pure lines and have been used extensively in leafy vegetable crops such as cabbage and Chinese cabbage to enhance crop production 3,4. Harnessing hybrids is considered one of the effective approaches for many leafy vegetable crops 5, and the cultivation of F1 hybrids allows a quantum jump in their productivity. A cultivation test has already confirmed that the yield of lettuce F1 hybrids exceeded that of the parents, and the exploitation of hybrid vigor shows promise for improving yield and other quality parameters 6. Precise control over pollen fertility is a key factor in the production of F1 hybrids in self-pollinating crops 7. Although F1 hybrid breeding of self-pollinating crops such as rice, soybean, wheat, and lettuce would challenge many common-sense assumptions in plant breeding, the development of hybrid rice using genic male sterility (GMS) and cytoplasmic male sterility (CMS) is already underway with great success in China 8,9. In addition, numerous studies have also been performed on male sterility in soybean and wheat 7,10-14.
The present study began with the finding of a GMS plant in the inbred lines of 'CGN17397' (Fig. 1) 15. Because lettuce has a compound autogamous floral structure, it is impossible to completely remove pollen from the flower 1. Male sterility, which avoids unwanted maternal self-pollination, is not only an essential trait for hybrid breeding in lettuce but is also useful in fundamental genetic and phenotypic studies using F1 progeny, such as studies of disease resistance. In contrast to CMS, the phenotype of GMS is recognized after flowering. Hence, genetic markers linked to the male-sterile (MS) locus are needed to select MS plants at the pre-planting stage 16. Markers for the ms-S gene have so far been developed by the amplified fragment length polymorphism (AFLP) technique, but all of them were located on the same side of the gene 15. In this study, genetic mapping of the ms-S gene was conducted in two F2 populations, each obtained from a cross between MS and male-fertile (MF) plants. Additionally, by employing the whole-genome sequencing of MS lines 'CGN17397-MS'
Linkage analysis for male sterility trait by ddRAD-seq analysis. For genetic mapping of the male-sterility locus, double-digest restriction site-associated DNA sequencing (ddRAD-seq) analysis was conducted to construct a linkage map using the F2 population from a cross between '2008-83-MS' and 'Uenoyama Maruba'. In the RAD-R scripts 17, the BWA mode, construction method, and correction approach were set to "mem_60", "ABH", and "6US", respectively. The 1241 pairs of RAD tags in the two parents were then employed as codominant markers for genetic mapping of male sterility and used for linkage map construction (Fig. S1). The total length of the linkage map was 1815.6 centimorgans (cM). Marker density ranged from 1.2 cM (LG2) to 2.0 cM (LG1) per marker. The number of markers in the linkage groups ranged from 93 (LG1) to 194 (LG5).
Summary statistics of the linkage map are shown in Table 2. The segregation data for the genotypes of the F2 population and the MS phenotype showed that the ms-S gene was located between 238.429 Mbp and 257.031 Mbp, an interval of 4.6 cM on LG8 (Fig. 2a). For fine mapping, genotyping was conducted using three PCR-based markers designed in this region (Table 3). However, the region could not be narrowed further in this population because all three markers cosegregated completely with male sterility (Fig. 2a). The F2 population derived from a cross between 'CGN17397-MS' and 'Salinas' was then used for further mapping of the target locus with PCR-based markers. The male-sterility gene was located between 246.869 Mbp and 263.743 Mbp, an interval of 6.6 cM on LG8 (Fig. 2b), and LG8_v8_250.793Mbp cosegregated completely with male sterility in the two F2 populations.
www.nature.com/scientificreports/
Identification of candidate genes in ms-S locus by whole-genome sequencing. The ms-S locus was found to include 94 genes annotated according to the lettuce reference genome sequence (version 8 from the crisphead cultivar 'Salinas') (Table S1). Whole-genome sequencing data of the MS and MF lines revealed that a genomic region of about 4 kb containing Lsat_1_v5_gn_8_148221.1 was completely deleted only in the MS lines (Fig. 4). According to the reference genome sequence, Lsat_1_v5_gn_8_148221.1 encodes an acyl-CoA synthetase 5 (ACOS5), which might be orthologous to the Arabidopsis MS gene AtACOS5 18,19. To further elucidate the relationship among Lsat_1_v5_gn_8_148221.1, AtACOS5, AAO25511, and BnACOS5, the amino acid sequences of these four genes were aligned using ClustalW. The results showed significant conservation within the AMP-binding domain and the fatty acid-binding domain of ACOS5 20 (Fig. 5a).
The phylogenetic analysis showed that Lsat_1_v5_gn_8_148221.1 was categorized into the ACOS5 group, which is related to male sterility in some plant species 18,21 (Fig. 5b). Based on these results, the gene was considered a candidate gene for ms-S because of its homology with the known recessive MS gene and was designated LsACOS5. For the other 93 genes, the genomic sequences were completely identical between the two lines (Table S1). Some of these genes, such as SCD1 (indicated as ORF 5) 22, have been reported to be expressed in flowers, but none is known to cause an MS phenotype in null mutants. The marker LG8_v8_250.793Mbp, designed using the genomic region of the candidate gene (Fig. 4, Table S1), was polymorphic between 'CGN17397-MS' and 'CGN17397-MF' and cosegregated completely with the MS trait in the two F2 populations (Fig. 2). These results suggest that LsACOS5 is a biologically plausible candidate gene for ms-S.
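To make the notion of complete cosegregation concrete, the following minimal Python sketch counts the F2 plants whose marker genotype disagrees with the observed phenotype under a recessive model. It is an illustration only, not the analysis used in this study: the genotype codes ("A" for homozygous MS-parent allele, "H" for heterozygous, "B" for homozygous MF-parent allele), the function name, and the example data are all hypothetical.

```python
def count_recombinants(genotypes, phenotypes):
    """Count plants whose marker genotype disagrees with the phenotype.

    Assumes a recessive male-sterility allele: a plant is expected to be
    male sterile ("MS") only when it is homozygous for the MS-parent
    allele ("A"); "H" and "B" plants are expected to be male fertile ("MF").
    """
    if len(genotypes) != len(phenotypes):
        raise ValueError("genotypes and phenotypes must have the same length")
    recombinants = 0
    for genotype, phenotype in zip(genotypes, phenotypes):
        expected = "MS" if genotype == "A" else "MF"
        if expected != phenotype:
            recombinants += 1
    return recombinants


# Hypothetical example data, not taken from the study.
marker = ["A", "H", "B", "H", "A", "B"]
trait = ["MS", "MF", "MF", "MF", "MS", "MF"]

# A count of zero corresponds to complete cosegregation.
print(count_recombinants(marker, trait))
```

A marker with zero mismatches across both populations cannot be separated from the locus by recombination in those plants, which is why such a marker is the natural starting point for candidate gene analysis.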

Discussion
Because an F1 hybrid can grow faster and therefore has a shorter cultivation period in the field, its risk of bacterial disease promoted by rain would be lower. Thus, F1 hybrids are commonly anticipated to display high productivity under stressful conditions. In lettuce, the exploitation of F1 hybrids could be one of the effective approaches to maintain a stable yield, particularly in tropical and subtropical regions. A new crisphead cultivar, 'Fine green', was indeed the first F1 hybrid bred by Kaneko Seeds Co., Ltd. in Japan, but unfortunately the technical details of the breeding method were not publicly announced. In general, MS plants are worth exploring as the key factor in F1 hybrid breeding, and several GMS mutants have also been reported in lettuce so far 23. However, their genetic mechanism is not understood, and this is the first report of the identification of an MS gene in lettuce. Ascertaining the genetic mechanism of MS plants is valuable for selecting a future breeding strategy. In this study, two F2 populations were used to locate the MS gene in the region between the two PCR-based markers LG8_v8_246.869Mbp and LG8_v8_257.031Mbp. Although the genomic region of the ms-S locus was relatively large, whole-genome sequencing of 'CGN17397-MS' and 'CGN17397-MF' revealed that only one of the 94 annotated genes in the ms-S locus, Lsat_1_v5_gn_8_148221.1, differed between the two lines (Table S1, Fig. 4). The gene encodes an acyl-CoA synthetase 5 (ACOS5) and is a potential ortholog of the key MS gene ACOS5 in some plants such as Arabidopsis, tobacco, and Brassica napus 24,25 (Fig. 5a). ACOS5 acts as an acyl-CoA synthetase that regulates the biosynthesis of sporopollenin and thereby affects male fertility, and a null mutant was completely male sterile 18.

Table 3. Primers for the PCR-based markers in ms-S locus.

Primer                  Sequence                Product size (bp)
                                                CGN17397-MS  Salinas  2008-83-MS  Uenoyama Maruba
LG8_v8_241.563Mbp_F     TTCGATCTCCGACGATTTATG   231          268      231         268
LG8_v8_241.563Mbp_R     CTAAGGAAACGGGAGGCAAT
LG8_v8_246.869M_F       GTTTGGTTTGCGGATTCCTA    242          267      242         267
LG8_v8_246.869M_R       GTGCAACCAATTAGCATTCG
LG8_v8_250.793Mbp_F     GATCCCTTCCAAAACTTGAGG   220          573      220         573
LG8_v8_250.793Mbp_MS_R  GGGCGGAGTCCATTATTTGT
LG8_v8_250.793Mbp_MF_R  TGCTCAACGATCTTGTTTGTG
LG8_v8_263.743M_F       TTTGAAAGCATAGGGATCATCT  297          304      304         297
LG8_v8_263.743M_R       GTTCATACCGTCGGATCGTT
'CGN17397-MS' displayed normal vegetative growth and complete male sterility that was insensitive to environmental conditions. There were no other obvious morphological differences between the MS and MF lines. Lettuce generally flowers for only about two hours in the morning, but the MS lines could continue to flower through the afternoon. Thus, the MS mutants of lettuce and Arabidopsis showed phenotypic similarities 18. I concluded that LsACOS5 is a biologically plausible candidate gene for the ms-S locus (Figs. 2, 3, 4, Table S1).
In addition, an insertion/deletion (InDel) marker, LG8_v8_250.793Mbp, tightly linked to LsACOS5 was developed. Using this InDel marker, MS plants can be selected in a conventional breeding program (Figs. 1c, 3). Because of the structure of the lettuce flower, crosses produce not only F1 seeds but also self-pollinated seeds, which has made it challenging to examine the inheritance of valuable traits such as disease resistance in F1 seeds alone 1. Because only F1 hybrid seeds are produced when GMS plants are used for crossbreeding, research on valuable traits that could not be analyzed in the past would be facilitated.
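As a rough illustration of how such an InDel marker could be scored, the sketch below classifies a plant from the PCR band sizes observed for LG8_v8_250.793Mbp. The two expected sizes (220 bp in the MS lines, 573 bp in the MF cultivars) come from Table 3; everything else is an assumption made for the example, including the size tolerance, the expectation that heterozygotes show both bands, the genotype labels, and the function names.

```python
# Product sizes from Table 3 for marker LG8_v8_250.793Mbp.
MS_ALLELE_BP = 220  # 'CGN17397-MS' and '2008-83-MS'
MF_ALLELE_BP = 573  # 'Salinas' and 'Uenoyama Maruba'


def call_genotype(band_sizes_bp, tolerance_bp=15):
    """Classify one plant from the band sizes (bp) seen on a gel.

    Assumptions (not stated in the study): a band within `tolerance_bp`
    of an expected size counts as that allele, and heterozygotes show
    both bands.
    """
    has_ms = any(abs(size - MS_ALLELE_BP) <= tolerance_bp for size in band_sizes_bp)
    has_mf = any(abs(size - MF_ALLELE_BP) <= tolerance_bp for size in band_sizes_bp)
    if has_ms and has_mf:
        return "heterozygous"
    if has_ms:
        return "homozygous deletion"
    if has_mf:
        return "homozygous intact"
    return "no call"


def predicted_male_sterile(genotype):
    """Recessive model: only plants homozygous for the deletion are sterile."""
    return genotype == "homozygous deletion"


# Hypothetical band readings for three seedlings.
for bands in ([222], [218, 570], [575]):
    genotype = call_genotype(bands)
    print(bands, genotype, predicted_male_sterile(genotype))
```

Scoring seedlings in this way before planting is what allows MS plants to be selected without waiting for flowering; the tolerance would need to be set from the resolution of the actual gel system.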
An F1 seed production system is needed to promote the commercial production of F1 hybrids. To propagate F1 hybrid seeds in rice, the maternal and paternal plants are cultivated alternately in a field and crossed by wind and artificial pollination 26. Because lettuce pollen is not dispersed by wind, an F1 seed production system using insect pollination in a greenhouse has already been developed. Because lettuce, a self-pollinating crop, has no specialist pollinators, flies and bees were adopted for this system and could propagate F1 hybrid seeds 27,28. Moreover, F1 hybrids are likely to be suitable for cultivation not only in fields but also in plant factories, where rapid growth is an economically important trait. Breeding F1 hybrids suitable for cultivation in fields and plant factories is an issue for the future.
Genome editing technology now makes it possible to create knockout mutants of a target gene. GMS plants generally have the problem that seeds of MS and MF progeny are mixed. However, a novel hybridization platform known as the third-generation breeding technique has successfully selected for non-transgenic GMS seeds 8. Combining these two techniques could also be applied to F1 hybrid breeding in lettuce; it could convert any elite cultivar into a commercial MS plant and accelerate the development of F1 hybrid cultivars. The application of GMS plants therefore offers considerable potential for lettuce breeding.
ally, 96 individuals of F2 progeny obtained from a cross between 'CGN17397-MS' and 'Salinas' were used for further mapping using PCR-based markers.
Linkage analysis based on ddRAD-seq. Genomic DNA was extracted from leaves using the NucleoSpin Plant II Extract Kit (Macherey-Nagel, Düren, Germany). The RAD-seq library construction was performed following a previously described method 2,29. The ddRAD-seq libraries were sequenced using the HiSeq4000 platform (Illumina, San Diego, CA, USA). Paired-end sequencing reads (100 bp × 2) were analyzed for ddRAD-seq tag extraction, counting, and linkage map construction using RAD-R scripts 17.
Resequencing analysis. Genomic DNA was extracted from young leaves of the two lines ('CGN17397-MS' and 'CGN17397-MF') using NucleoSpin Plant II (Macherey-Nagel, Düren, Germany), used to construct paired-end sequencing libraries (100 bp × 2), and subjected to whole-genome sequencing on the HiSeqX (Illumina) and DNBSEQ-500 (MGI) platforms. The resequencing analyses were conducted according to the previously described method 2. Raw sequence data (fastq) for this resequencing analysis are available in the DDBJ Sequence Read Archive under accession number DRA012737.
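The resequencing pipeline itself follows a previously described method and is not detailed here. Purely as an illustration of how a line-specific deletion can show up in resequencing data, the sketch below scans per-base read depths of two lines over the same reference coordinates and reports runs with no coverage in one line but normal coverage in the other. The function name, thresholds, and toy depth values are hypothetical and are not taken from the study.

```python
def find_line_specific_deletions(ms_depth, mf_depth, min_length=1000, min_mf_depth=5):
    """Return half-open (start, end) index intervals of candidate deletions.

    A position is a candidate when the MS line has zero depth while the MF
    line has at least `min_mf_depth` reads. Only runs of at least
    `min_length` consecutive candidate positions are reported.
    """
    n = min(len(ms_depth), len(mf_depth))
    intervals = []
    start = None
    for pos in range(n):
        candidate = ms_depth[pos] == 0 and mf_depth[pos] >= min_mf_depth
        if candidate and start is None:
            start = pos  # a new run begins here
        elif not candidate and start is not None:
            if pos - start >= min_length:
                intervals.append((start, pos))
            start = None
    # Close a run that extends to the end of the scanned region.
    if start is not None and n - start >= min_length:
        intervals.append((start, n))
    return intervals


# Toy depth profiles: positions 3 to 7 are uncovered only in the MS line.
ms = [20, 20, 20, 0, 0, 0, 0, 0, 20, 20]
mf = [20] * 10
print(find_line_specific_deletions(ms, mf, min_length=4))
```

Requiring coverage in the MF line guards against regions that are simply hard to sequence or map in both lines; a real analysis would also need to consider mapping quality and repeats.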
Phylogenetic analysis. Homologs of the candidate gene's protein sequence were searched for in plant species using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). Multiple sequence alignments of the full-length protein sequences were conducted using ClustalW and displayed using BOXSHADE (https://embnet.vital-it.ch/software/BOX_form.html). The phylogenetic tree was generated with the MEGA X program 33 using the neighbor-joining method with default parameters, except that 1000 bootstrap replications were used.
Ethical statement. The author confirms that the legislation on seed collection was complied with. Permission to collect seeds was obtained from the responsible authority.

Ethical approval.
All the experiments carried out on plants in this study were in compliance with relevant institutional, national, and international guidelines and legislation.