Genetic analysis of the “head top shape” quality trait of Chinese cabbage and its association with rosette leaf variation

The agricultural and consumer quality of Chinese cabbage is determined by its shape. The shape is defined by the folding of the heading leaves, which defines the head top shape (HTS). The overlapping HTS, in which the heading leaves curve inward and overlap at the top, is the shape preferred by consumers. To understand the genetic regulation of HTS, we generated a large segregating F2 population from a cross between pak choi and Chinese cabbage, with phenotypes ranging from nonheading to heading with either outward curving or inward curving overlapping heading leaves. HTS was correlated with plant height, outer/rosette leaf length, and petiole length. A high-density genetic map was constructed. Quantitative trait locus (QTL) analysis resulted in the identification of 22 QTLs for leafy head-related traits, which included five HTS QTLs. Bulked segregant analysis (BSA) was used to confirm HTS QTLs and identify candidate genes based on informative single-nucleotide polymorphisms. Interestingly, the HTS QTLs colocalized with QTLs for plant height, outer/rosette leaf, and petiole length, consistent with the observed phenotypic correlations. Combined QTL analysis and BSA laid a foundation for molecular marker-assisted breeding of Chinese cabbage HTS and directions for further research on the genetic regulation of this trait.


Introduction
Leafy heads are an important trait of leafy vegetable crops, such as cabbage (Brassica oleracea), Chinese cabbage (Brassica rapa), and lettuce (Lactuca sativa). The leafy head is composed of curving leaves, and these leaves form various head shapes depending on the curvature of the distal end, hereafter referred to as the tops of leaves: leaves with outward curving tops, leaves with inward curving tops that do not overlap, and inward curving leaves with their tops overlapping or forming a spiral 1 . Leaf morphology is determined by leaf cell division and cell elongation rates along the main axes of the leaf [2][3][4] . The heading leaves curve inward to form a leafy head at the heading stage. The upward curvature of Chinese cabbage heading leaves is facilitated by increased growth of abaxial cells 5 . As a storage organ, the leafy head is an important source of mineral nutrients, crude fiber, and vitamins. Head traits include head weight (HWe), compactness, size, and shape. The Chinese cabbage overlapping head type, in which the heading leaves curve inward and overlap at the top, is the shape that current consumers prefer 6 . However, the molecular mechanism of leafy head formation and regulation of head shape is not fully understood.
The vegetative growth of Chinese cabbage is divided into four stages: seedling, rosette, folding, and heading stages. The leaves in these four stages are different in shape, size, and physiological function 7 . The rosette leaves are large and round with short petioles. These Chinese cabbage rosette leaves provide the basis for plant growth via photosynthesis 8 . A number of recent studies have shown that rosette leaves supply products for leafy head formation and that their size and shape are related to the leafy head shape or size 6,9,10 . In a study by Mao et al. 6 , the correlation between rosette leaves and head shape was demonstrated by using 150 recombinant inbred lines (RILs): round heads with flat rosette leaves, cylindrical heads with rosette leaves with wavy margins, oblong heads with shrinking rosette leaves, and cone-like heads with incurved rosette leaves. In this paper, they also identified that allelic variation in the TCP4 gene in the miR319a recognition sequence modulated head shape by differentially arresting cell division in different leaves. Sun et al. 9 also showed the correlation between rosette leaf traits, such as leaf length (LL), width, and midvein length, with leafy head traits in a diverse set of 152 Chinese cabbages. Although the phenotypic correlation between rosette leaves and leafy heads has been described, it is unclear how rosette leaves influence leafy head growth and how this is regulated at the molecular level.
Genetic information from quantitative trait loci (QTLs), sequencing-based bulked segregant analysis (Seq-BSA), and RNA-sequencing approaches provides clues for understanding Chinese cabbage leaf and leafy head traits. QTLs have been identified for Chinese cabbage heading traits, such as head diameter, head height, and HWe, which are all important components of head yield and quality 9,[11][12][13] . Seq-BSA revealed genomic regions enriched with candidate genes related to plant hormones associated with Chinese cabbage heading type and leafy head formation 14,15 . In addition, several studies have described gene expression profiles during Chinese cabbage development and constructed rosette and inner leaf gene expression networks [16][17][18][19] . Although these studies provide insight into the general pathways involved in the formation of leafy heads in Chinese cabbage, they do not explain the genetic control of the relation between rosette leaf growth and leafy head formation.
In this study, we phenotyped an F 2 population of B. rapa derived from a cross between heading Chinese cabbage and nonheading pak choi and analyzed the association between rosette/heading leaves and several heading traits, including the leafy head top shape (HTS). In addition, a combined approach of BSA of pools with contrasting head traits and QTL analysis was used to study rosette leaf growth and the HTS trait at the genetic level. Several colocalizing genomic regions for rosette leaf traits and HTS were identified. Further investigation of these regions resulted in candidate genes for the HTS trait in Chinese cabbage.

Phenotype analysis
At the heading stage, the leafy HTS was evaluated for all 1307 F 2 plants. This resulted in 228 plants in class 4 (inward curving leaves that overlap at the top of the leafy head), 389 plants in class 3 (inward curving leaves that do not overlap at the top of the leafy head), 591 plants in class 2 (with outward curving outer heading leaves), and 99 plants in class 1 (that did not form a leafy head). Based on these four classes of the HTS trait, it was hypothesized that the qualitative trait HTS was controlled by two major genes with recessive epistasis (2MG-RecessiveI) based on SEgregation Analysis (SEA) software analysis (Table S1A) . The scores of the parental genotypes were PC-101 = "1", CC-48 = "2", and F 1 = "3", Based on these ten phenotypic classes, it was hypothesized that the qualitative trait HTS is controlled by four major genes with additive and epistatic effects (4MG-AI) based on SEA software analysis (TableS1B). From this population, 104 plants were selected to be phenotyped for additional leaf and leafy head traits. Seventeen plants that did not form a leafy head (N: nonheading) were selected, while from the plants that formed a head, we selected equal numbers of the most obviously different heading types, namely, the OC (the shape of the leaf on the top of the head is outward curving) and O (the shape of the leaf on the top of the head is inward curving with an overlap) HTSs.
The HTS and the other 20 traits could be divided into three clusters based on their correlation coefficients (Fig. 1A). Significant phenotypic correlations were found between HTS and all outer/rosette leaf traits (outer leaf width (OLW), outer LL (OLL), outer leaf area length (OLaL), outer leaf midvein length (OLvL), outer leaf area (OLA), outer leaf petiole length (OLPL), outer leaf petiole width, and outer leaf petiole area), plant height (PH), plant width (PW), plant weight (PWe), and HWe at the 0.05 level. Among them, HTS and OLL had the highest correlation coefficient of 0.65. However, HTS was not correlated with most of the head leaf traits (heading leaf width, heading LL, heading leaf area length, heading leaf area, heading leaf petiole width, and heading leaf petiole area), except for head leaf petiole length (HLPL) at the 0.05 level and head leaf midvein length (HLvL) at the 0.01 level.
In nonheading plants (N), the outer/rosette leaf dimensions (O) leaf width (LW), LL, leaf blade length (LaL), leaf midvein length (LvL), LA, LPL, leaf petiole width (LPW), and leaf petiole area (LPA), PH, and PWe values were lower than those of both groups of heading plants (both OC: outward curving heading plants and O: overlapping heading plants) (Fig. 1B). These differences between nonheading plants and heading plants were significant, except for the OLvL. The heading OC and O plants differed only in two outer/rosette leaf traits (OLL and OLPL) and one heading leaf trait (HLvL) and showed no significant difference in other outer or heading leaf traits (Table S2).

Construction of the linkage map
The F 2 -104 genetic map consisted of ten linkage groups and was constructed using 3194 bin markers (Fig. S1).
A recombination bin map and heat map of the recombination fraction were generated to evaluate and verify the quality of the linkage map (Fig. S2). The total map length was 1522.89 cM, with an average intermarker distance of 0.48 cM. The length of each linkage group ranged from 280.91 cM for A03 to 88.49 cM for A02, and the number of markers ranged from 571 markers on A03 to 171 markers on A01 (Table S3). The syntenic map of adjacent markers demonstrated that the distribution of singlenucleotide polymorphism (SNP) markers on the genetic map corresponds well with the reference physical map (Fig. S3).

QTL regions and QTL colocalization
QTL analysis of leaf and leafy head traits was performed for the F 2 -104 population using interval mapping (IM) and composite IM (CIM) methods (Table 1).
By the IM analysis method, four QTLs (HTSQ1, 2, 3, and 4) for HTS were detected on A04, and three QTLs (HTSQ5, 6, and 7) were detected on A05. HTSQ1A4 and HTSQ7A5 had the highest logarithm of the odds (LOD) scores (4.833 and 4.895) and explained~6.1% and 4% of the variation, respectively. For leaf-related traits, three QTLs for OLL (OLLQ1, 2, and 3) and two QTLs for OLPL (OLPLQ1 and 2) were detected on A05. Three QTLs for HLPL (HLPLQ1, 2, and 3) were detected on A09, and two QTLs for head leaf vein length (HLvLQ1 and 2) were detected on A06. For leafy head traits, a total of four QTLs (PHQ1, 2, 3, and 4) for the PH traits were detected on A04, with LOD values ranging from 3.879 to 4.622, and one QTL for the HWe (HWeQ1A8) trait was detected on A08. By the CIM analysis method, two QTLs for HTS, namely, CIM_HTSQA4 and CIM_HTSQA5, were detected on A04 and A05, which explained~16.9% and 17.6%, respectively. Together, these findings accounted for 34.5% of all the phenotypic variation in the HTS trait. For the OLL, OLPL, and PH traits, single QTLs were detected on A05 (CIM_OLLQA5 and CIM_OLPLQA5) and A04 (CIM_PHQA4). The genes in these QTL regions are shown in Table S4.

Candidate genes selected by BSA
We generated three independent plant DNA pools, each representing different head-type shapes. In the bulks, the plants were all barcoded, and SNP markers with different allele frequencies between bulks with different phenotypes were selected (p < 0.05).
We focused on SNPs that had significantly different allele frequencies among pools in upstream, downstream, intergenic, intron, and exon regions of genes. This included 272 genes in the "N vs. O" pool comparison, 145 genes in the "N vs. OC" comparison, and 398 genes in the "O vs. OC" comparison ( Fig. 3A and Table S5). In the "N vs. O" comparison, most candidate genes were mapped to regions on A05 (60 genes), A06 (80 genes), and A08 (55 genes); in the "N vs. OC" comparison, the genes were distributed on A08 (56 genes); and in the "O vs. OC" comparison, they were distributed on A06 (152 genes) and A08 (85 genes) (Fig. 3B). We investigated the Map-Man functional categories to which these genes belonged (Fig. 3A). In addition to the common top three categories (not assigned, protein, and RNA), the cell and cell wall and signaling categories were highly represented in these three group comparisons. In the cell and cell wall functional category, the majority of the genes were involved in the cell organization subcategory. In the signaling category, the majority of the genes were involved in the receptor kinase subcategory. In total, 566 candidate genes were included in these three pooled comparisons (Table  S5). Only two genes, namely, Bra001163 (embryo defective 2016: EMB2016) in the development category and Bra016948 (glutamate synthase 2: GLU2) in the Nmetabolism category, were identified in all three comparisons (Fig. 3C).
As expected, some of the candidate genes with significant allele frequency differences in the "N vs. O," "N vs. OC," and "O vs. OC" comparisons colocalized with QTL regions on A04, A05, A06, A08, and A09 ( Fig. 2). For the HTS trait, 15 candidate genes detected by BSA were located in the CIM_QTL regions on A04 and A05. According to the annotation information, five genes (Bra017274, Bra029597, Bra029713, Bra039431, and Bra039444) were in the protein category, four genes (Bra039420, Bra039421, Bra039462, and Bra039446) were in the cell and cell wall category, and two genes (Bra020751 and Bra039445) were in the RNA category.  (Table S5). Seventy-three common genes were identified in the "N vs. O" and "N vs. OC" comparisons ( Fig. 3C), and 59% (43) of the genes were mapped to A08. Among these 43 genes, 16 colocalized with the HWeQ1A8 QTL. The cell organization gene Bra014031 (A08:4344956..4347618) and cell wall degradation gene Bra014180 (A08:2743942..2745200) with SNPs causing amino acid changes were among these 16 genes in HWeQ1A8.

Discussion
The leafy head of Chinese cabbage is an important agronomic trait associated with both yield and quality. Compared to heading Chinese cabbage, nonheading plants (such as pak choi) are generally smaller and produce less biomass (lower PWe or total leafy yield). For heading Chinese cabbage, the head shapes include round, oblong, cylindrical, and cone-like head shapes 6 . HTS refers to the shape of the top of the leafy head, which is affected by the shape, overlap, and curvature of the outer Head-type shape is associated with the length of both the leaf blade and petiole/midvein of the outer rosette leaves HTS was judged based on the heading leaf top shape at the heading stage. The F 2 population was segregated into various HTS types, and we tried to fit a genetic model to the segregation data of the leafy head shape based on the HTS phenotypes of the 1307 F 2 plants. However, regardless of whether we scored the HTS into four or ten classes, we could not clearly select the most likely genetic model. This was likely caused by the fact that the scored phenotype is a combination of traits that are all under the control of different genetic loci 10,20 . To reveal genetic regulation, we should identify and score the subtraits that make up the final head shape-type trait. These subtraits include the curvature of the petiole, angle of the midvein to the soil, head size and weight, LL, and curvature of the distal end of the leaf blade.
At the QTL level, we detected QTLs for rosette leaf-, outer heading leaf-, and leafy head-related traits in this study. We did not detect the same QTLs for rosette LL and width or rosette LPL compared to the previous QTL study 9 . In addition to environmental variation, in these studies, the timing of phenotypic data collection differed: we measured the rosette leaf traits at the heading stage in this study, but phenotyped the growing rosette leaves at rosette stages (30,34,37,41,44, and 48 days after sowing (DAS)) in a previous QTL study. In a study by Sun et al. 9 , QTLs for rosette LPL also varied over time (DAS), and generally, LOD scores were lower when measured at later stages. The phenotyping of the leafy heads was performed at the same developmental stage in both studies, the heading stage. In both studies, QTLs for the leafy head trait "weight" (HWe) were detected at the top of A08 (Fig.  S4). Another QTL study using 150 RILs derived from a cross between heading and nonheading Chinese cabbage also identified QTLs for HWe in linkage group A08: 35.2-62.3 cM in 2010 and 27-34.5 cM in 2012 10 . In the smaller populations of the previous study, the HTS trait was not scored 9 . In the present study, the F 2 population was larger, which allowed us to study a subpopulation with balanced representation of the extreme phenotypes in more detail. In addition, a high-density genetic map was constructed for this subpopulation, and QTL analysis was combined with a BSA approach. This made it possible to highlight candidate genes with SNPs in the HWeQ1A8 QTL region.
The colocalized QTLs indicate the causal relationships among the traits 23 . Colocalized QTLs for pod-related traits (pod length, pod width, and hundred-pod weight) have been identified on A05 and contribute to phenotypic variation for these pod-related traits 24 . In rice, colocalization between PH-and flowering time-related QTLs was identified, and the major gene Ghd7, which has pleiotropic effects on PH and heading date, was identified 23,25 . Here, we detected two QTLs for the HTS trait. Colocalization of HTS with outer/rosette LL and rosette LPL (OLL/OLPL) was identified on A05, and that of HTS with PH was identified on A04. These results were consistent with the phenotypic results: there were strong correlations among outer/rosette LL, LPL, and HTS traits, and significant differences existed in the OLPL, OLvL, and HLvL traits between "O" and "OC" HTS plants. This phenomenon was not unexpected and indicated pleiotropic effects of single genes or tight linkage. We identified an HTS QTL on A05. Interestingly, in an independent study on an F 2 population derived from a cross between two Chinese cabbages with different HTSs, BSA of the "O" and "OC" types identified significant SNPs on A05 (21700337..24709181) overlapping with the QTL region for HTS identified in this study ( Fig. S5; unpublished data). The colocalized QTLs provide a good foundation for fine mapping and for researching HTS and related traits at the molecular level.

Molecular pathways and candidate genes for the HTS trait
HTS is a complex trait that is controlled by endogenous and environmental factors. In this study, we constructed three pools, namely, one nonheading plant pool and two pools of plants with different HTSs, for BSA to identify genomic regions containing genetic loci affecting these traits. In this study, the number of selected candidate genes was higher in the "O vs. OC" comparison than in the "N vs. O/OC" comparison, and most of the candidate genes were colocalized with QTLs for head shape and leaf traits. The expression patterns of these candidate genes were checked in three different genotypes, namely, a nonheading pak choi and two heading Chinese cabbages (one with "OC" HTS and the other with "O" HTS), at the seedling, rosette, and heading stages (Table S7). All candidate genes were expressed in leaves; however, only a few candidate genes were differentially expressed between these genotypes.
Leaf growth is strongly associated with leafy head formation in Chinese cabbage and includes leaf primordium initiation, leaf polarity formation, cell division, cell expansion, and cell differentiation phases 26 . During these phases, the arrangement of cells is indicative of changes in leaf shape 2,4 . Our BSA results showed many candidate genes in the cell wall and cell cycle category. In particular, Bra039421 was mapped to the HTS QTL_CIM on A05. The SNP (A/C) located in exon 1 (540 bp) of Bra039421 resulted in an amino acid change from "Gln" to "His." According to classic cell theory, changes in cellular behavior are responsible for mutant morphology 27 . Thus, the genes in these QTLs may have functions that play roles in HTS formation in Chinese cabbage. Interestingly, genes common between the "N vs. OC" and "N vs. O" comparisons were mainly localized on A08 in the HWe QTL. Heading Chinese cabbage plants were consistently heavier than nonheading plants. We identified several candidate genes for this heading/nonheading comparison (N vs. OC/O), which were related to total PWe. Bra014180 in the cell wall degradation functional category encodes USPL1 (AT1G49320 in Arabidopsis) targeted to protein storage vacuoles, which is particularly interesting because it is closely related to seed development, protein storage vacuoles, and lipid vesicle morphology and behaves like a storage protein 28 . In addition, Bra014180 with an SNP (A/G) on exon 2 (444 bp) caused an amino acid change (Ile/Met) and displayed higher transcript abundance in pak choi than in two Chinese cabbage varieties at early heading stages, with fold change >1.5 and adjusted p value ≤ 0.01.
In conclusion, we phenotyped both leaf and leafy head traits of an F 2 population from a cross between a nonheading pak choi and a heading Chinese cabbage and identified correlations between HTS and other rosette leaf traits. These correlations between rosette leaf traits and head-type shape were confirmed by the colocalized QTLs for these traits. This finding is a good foundation for further research on the genetic regulation of the HTS trait in Chinese cabbage and provides tools for molecular marker-assisted selection of leafy head shape in Chinese cabbage. In addition, by combining QTL and BSA results, we identified a few genes with interesting annotations as candidate genes for quantitative trait loci for leafy head formation. These results will help us understand the genetic control of head shape traits in cabbage.

Plant materials
An F 2 population was derived from a cross between a heading Chinese cabbage (CC-48: CGN06867, origin Soviet Union) and a nonheading pak choi (PC-101: CGN13926, origin China) (Fig. 4A). The F 2 population (n = 1307) plus their parental lines were sown in seeding soil in a greenhouse in September 2017, and 2 weeks later, the plants were transplanted to an open field at Hebei Agriculture University (Baoding, Hebei, China) and grown under short-day conditions until December 2017.
Phenotyping and phenotypic data analysis HTS, as the main trait to be studied in this research, was classified on a 1-4 scale (1. nonheading shape; 2. the shape of the leaf on the top of the head is outward curving; 3. the shape of the leaf on the top of the head is inward curving without overlaps at the top, which we identified as the intermediate phenotype between class 2 and 4; 4. the shape of the leaf on the top of the head is inward curving and overlaps at the top) by visual observation at the heading stage for the F 2 plants (n = 1307) (Fig. 4A).
In addition, another 20 leaf and leafy head traits were evaluated at the same time for 104 F 2 plants that represented the different HTS classes (Fig. 4B and Table 2). For leaf traits, the LL, LaL, LvL, LW, leaf area, LPL (here, we measured the sum of leaf petiole and white midvein length), LPW, and LPA were measured for the largest outer leaf (O) and for the largest and most outward curving heading leaf (H). For plant traits, PH and PW were evaluated. For leafy head traits, we determined the total PWe (including outer and heading leaves) and HWe (after discarding the loose outer leaves).
The inheritance model analysis for a plant quantitative trait (HTS) was analyzed by SEA software 29 . The phenotypic data were analyzed by Statistical Package for the Social Sciences (SPSS, IBM) software, including the mean, minimum, maximum, standard deviation, variance, and t test, using p < 0.05. The correlation coefficients between traits were analyzed using an R package ("corrplot" package: http://www.R-project.org) with the Pearson's correlation coefficient.

Genotyping and SNP discovery
Genomic DNA from individual plants was extracted from young leaves of parents and their F 2 progeny plants (n = 104) by the CTAB method. The genotyping-by- Fig. 4 Construction and phenotype of the Chiense cabbage F 2 population. A The strategy for constructing the Chinese cabbage F 2 population and the phenotype variation of the leafy head top shape (HTS) trait. N nonheading type, OC leafy head with outward curving top leaf without overlap, and O leafy head with overlapping top leaf. B Description of the measurements of leaf traits sequencing (GBS) method was used to generate ApeKIassociated DNA fragments for sequencing on the Illumina HiSeq2000 platform at the Gene Denovo Company, China, and SNP genotyping and evaluation were then performed 30 . The "Chiifu" genome V1.5 was used as the reference sequence for the alignment of sequenced reads by the BWA software 31 . A total of 258,443 genetic variations (234,421 SNPs and 24,022 indels), with indels representing 9% of the total, were identified, and 71.8% of these SNPs resided in intergenic regions. Among the intergenic SNPs discovered, 11.3% were within 5 kb upstream, and 10.5% were within 5 kb downstream, of an open-reading frame. In addition, 14.9% of the SNPs were observed in exonic regions and 13.2% in introns. For the SNPs located in the coding region (exon), 56.9% were synonymous (silent) mutations, and 40.4% resulted in a change in amino acids (nonsynonymous substitutions) or stop codons. For the SNPs located in introns, 162 SNPs were observed at intron splicing sites that potentially alter the function of these genes.
The 234,421 SNPs were filtered for those that showed polymorphisms between parent lines (CC-48 and PC-101) and had allele frequencies >60% in the F 2 population. After filtering, a set of 16,570 SNPs were discovered through GBS, of which 8998 were "aa × bb", 1165 were "hk × hk", 2612 were "lm × ll", and 3795 were "nn × np" segregation types 32 . As the F 2 population was derived from a cross between homozygous parents (DH lines), 8898 "aa × bb" parental markers that formed 5973 bins were used as input data for "JoinMap" to construct the genetic linkage map. After creating population nodes, "ML mapping" was used to assign the markers into linkage groups, and "Kosambi's" mapping function was used to construct genetic maps.
Identification of QTLs and BSA to identify candidate genes for leafy head traits Genetic map construction and QTL mapping A total of 8998 polymorphic homozygous SNPs in both parents (aa × bb) were used to generate a genetic map of the F 2 population. SNPs in the F 2 population were only used if the parental source of the alleles could be unambiguously assigned. The F 2 plants (n = 104) were genotyped for a set of 8998 markers covering the ten linkage groups. A subset of 3194 specific SNP markers was used to construct the F 2 genetic linkage map in this study. Genetic mapping was performed using an R package 33,34 . Based on the genetic map, QTL regions for the phenotyped traits were identified by IM analysis with the R/qtl package 35 . The LOD critical values (Table S6) for accepting the presence of potential QTLs were determined by permutation analyses (p < 0.05) 36 .

Sequencing-based bulked segregant analysis
In addition, a BSA approach was used to identify genomic regions linked to the leaf blade top curvature trait. Four leafy head phenotypes were distinguished: no leafy head, with leaves that do not form a leafy head and are fully curved outward (N: 17 plants