Introduction

Heterosis occurs in a variety of species and has been observed and recorded in China since ancient times. For example, Jia Sixie described in “The Manual of Important Arts for the People” that interbreeding between horses and donkeys produced stronger mules, and the famous agricultural work “Tian Gong Kai Wu” also recorded crossbreeding techniques for silkworms. Heterosis has also been extensively studied in other countries. In 1763, the German scholar Koelreuter1 was the first to present concrete evidence that the growth of hybrid tobacco is superior to that of its parents. By comparing the height of hybrid and self-pollinated offspring in maize, Darwin2 found that the average height of hybrid offspring was greater than that of self-pollinated offspring. Beal3 found that the yield of maize hybrid offspring was greater than that of both parents. Shull4,5 observed hybrid vigor in maize hybrid offspring, first proposed the underlying concept, and then formally named the phenomenon “heterosis.” Heterosis was first applied to genetic breeding in maize, and many excellent maize hybrids have been produced since the 1930s. By 2011, the yield of maize in America had increased by at least eightfold, due mostly to the cultivation of hybrids6.

As heterosis has been applied in cereal crop production, crossbreeding in vegetables has also rapidly progressed. Under natural planting conditions, 40–80% of seeds produced are usually hybrids due to fertilization competition between self-pollination and pollen from other plants7. Although the traits of randomly generated hybrid seeds are not organized at first, F1 hybrids exhibit higher yield, better adaptability, and higher stress resistance than pure line seeds under optimum production and fertilization protection management conditions. Therefore, farmers have paid much attention to the cultivation of hybrid seeds8. The first hybrid of eggplant (Solanum melongena) was released in 19249. Subsequently, hybrids of other vegetables, such as watermelon (Citrullus lanatus L.), cucumber (Cucumis sativus L.), radish (Raphanus sativus L.), tomato (Solanum lycopersicum L.), and cabbage (Brassica oleracea L.), were developed over the next 20 years7. The number of hybrid vegetable varieties is rapidly increasing, at a rate of 8–10% each year, while nonhybrid vegetable varieties are gradually being eliminated10.

The application of heterosis to vegetable cultivation was first proposed by Hayes and Jones11 using cucumbers. However, because of the high cost of producing hybrid seeds, hybrid cucumber seeds were not used until the 1930s7. Similarly, self-pollination and the occasional presence of indehiscent anthers in eggplant12 and styles that are shorter than anthers in tomato13 have resulted in a high degree of self-pollination, which in turn has limited hybrid utilization. Pearson (1933) and Jones and Clarke (1943) used the mechanisms of self-incompatibility in cabbage and cytoplasmic male sterility in onion, respectively, to produce pure line and hybrid seeds on a large scale8. To avoid undesirable selfing, various genetic and nongenetic mechanisms, including genic male sterility, cytoplasmic male sterility, self-incompatibility, gynoecious lines, auxotrophy, and the use of sex regulators and chemical hybridizing agents, have been applied to facilitate hybrid seed production in vegetables8,14. The various traits that exhibit remarkable heterosis in F1 hybrids, including yield, earliness, growth vigor, and stress tolerance15,16,17,18, have become a major area of research on vegetables. In an experiment with hybrid eggplant conducted by Balwani et al.19 and Makani et al.20 heterosis in the optimal F1 hybrid resulted in yield increases of 125.78% and 88.88%, respectively. A more productive eggplant hybrid will effectively decrease the time to first harvest18. Transgressive phenotypes have also been observed in other Solanaceae21,22, Cruciferae23,24, and Cucurbitaceae vegetables25,26.

Although heterosis in vegetables has historically been used in research and crossbreeding experiments, its genetic mechanism remains elusive. Different genetic models for heterosis have been described in various reviews27,28,29,30,31. However, it is apparent that the classical genetic hypothesis of heterosis cannot explain all mechanisms of heterosis. Therefore, genetic models of heterosis have been included in this review. In addition to genetic models, we also present a schematic diagram depicting the involvement of epigenetics in heterosis. Simultaneously, we discuss studies on heterosis at the molecular level based on QTL effects and differential gene expression analyses. We also describe the effects of QTL on heterosis in crop plants based on Shang et al.32 to guide future research studies on the genetic mechanisms of heterosis. We summarize recent findings on the interactions of QTL sites with regard to heterosis and discuss the contribution of various QTL effects to heterosis. Differential expression analysis of genes related to heterosis can also provide a different perspective on heterosis31. In addition, we present morphological improvement as another measure to increase yield and an important component of breeding7 and describe how to combine heterosis utilization and morphological improvement.

To date, studies on heterosis in vegetables mainly involve obtaining F1 hybrids through crossbreeding. The utilization of cucumber hybrids proposed by Hayes and Jones11 was likely the first instance of effective vegetable breeding that exploits heterosis. Kumar et al.30 introduced methods of predicting heterosis in eggplant hybrids, such as genetic distance prediction and combining ability tests, and proposed the application of a sterile line system as well as transgenic and gene editing techniques in eggplant breeding. Herath et al.33 summarized the QTL mapping of yield-related traits in chili, introduced the use of heterosis breeding to improve the economic and agronomic traits of chili, and suggested the use of genomic technology and sterile line materials in chili breeding. Mallikarjunarao et al.34 reviewed the progress of various balsam pear (bitter gourd) hybridization tests and indicated that heterosis does occur in the yield of balsam pear hybrids. However, studies on the genetic mechanisms of heterosis in vegetables are limited, which hinders the application of heterosis in vegetable breeding. Therefore, in this review, we describe the progress of research on the genetic mechanisms of heterosis, analyze the use of hybrid production systems and molecular biology technology in vegetable production, and propose a breeding strategy that can predict, obtain, and maintain heterosis. This review will provide a reference for the utilization of heterosis in vegetable breeding.

Study on the genetic mechanisms of heterosis

Genetic regulation of heterosis

Heterosis is a complex biogenetic phenomenon caused by the combination of many factors that is manifested in the performance of hybrid offspring. The classical hypotheses for the genetic mechanisms of heterosis include the dominance and overdominance hypotheses, which are based on allelic interactions, and epistasis, which is based on nonallelic interactions.

Davenport35 first proposed the dominance hypothesis (Fig. 1A), and Bruce36 and Jones37 developed it further. In the dominance hypothesis, favorable genes controlling growth and development are dominant, and unfavorable genes are recessive. In the hybrid generation, the alleles from the two parents are complementary, and the unfavorable recessive genes are suppressed by the favorable dominant genes; therefore, the hybrid generation exhibits heterobeltiosis.

Fig. 1: There are five hypotheses to explain the mechanism of heterosis based on gene effects.

Suppose that the biomass is the sum of the genetic effects (A, B, C) and that the biomass of an organism is represented by the circular area. A Dominance effect: the dominant allele (A) inhibits the recessive allele (a); B overdominance effect: a single heterozygous allele (B/B) promotes the development of heterosis; C epistasis effect: nonallelic (A1/B1) interactions in the parents promote the development of heterosis; D active gene effect: genes from parents (C) promote heterosis when heterozygous and produce genome imprinting when homozygous, which inhibits the occurrence of heterosis; E gene network system: genes from parents (A, B, C) are combined into a coordinated gene network system that enables F1 to develop heterosis; F single-cross hybrids P1 (AB) and P2 (CD) produced from four homozygous inbred tetraploids (with genotypes A, B, C, and D) are crossed to produce F1 (ABCD), a double-cross tetraploid hybrid

The overdominance hypothesis (Fig. 1B) was originally proposed by Shull4 and East38 as the opposite of the dominance hypothesis. This hypothesis denies that there is a dominant-recessive relationship between alleles and suggests that the main cause of heterosis is the interaction of heterogeneous alleles from the parents. Heterozygous alleles interact more strongly than homozygous alleles; thus, the hybrids exhibit heterobeltiosis. Using the isozyme technique, Dranginis39 found that the enzymes in heterozygotes exhibit many unique conformations of hybrid enzymes. For example, the regulatory proteins of heterozygotes often present as polymers that regulate genes, and different heterozygous and homozygous proteins consistently show different activity characteristics. In addition, the anthocyanin content heterobeltiosis that occurs due to the heterozygosity of a single locus (pl) in maize40 and the yield heterosis induced by the heterozygosity of a single locus (sft) in tomato15 also provide experimental evidence for the overdominance hypothesis. However, the interaction of closely linked alleles can also result in an apparent overdominance effect that is known as pseudo-overdominance41.

The dominance and overdominance hypotheses both suggest that heterosis is caused by individual allelic loci. However, several reports have shown that plant traits such as yield and growth vigor are complex quantitative traits42. Wright43 visualized the network structure of population genotypes, i.e., multiple loci control the variations in most traits; in such networks, the replacement of any gene may affect multiple traits. Based on this perspective, Sheridan44 proposed the concept of epistasis. He believed that heterosis may arise from interactions between nonallelic genes. In genetics, the phenomenon in which the genetic effect of nonallelic genes deviates from their additive effect is called epistasis (Fig. 1C). The significant specific combining ability (SCA) effects in the hybridization experiment of Sao and Mehta indicated that epistasis plays a predominant role in the genetic control of eggplant heterosis45. Using a genetic map that covered the whole rice (Oryza sativa) genome, QTL mapping for yield-related traits was conducted in 250 F2:3 lines. The results showed that the correlation between marker heterozygosity and yield-related traits was low and that the interaction between most genes could not be detected on the basis of single-gene loci; the interactions were classified as dominance by dominance, additive by dominance, and additive by additive46. Therefore, Yu et al.46 also believed that epistasis is an important genetic basis for the development of heterosis.

Other ideas in addition to the classical hypotheses have been proposed. Zhong47 proposed the active gene effect hypothesis (Fig. 1D) by comparing the relationship between genomic imprinting and heterosis; this hypothesis suggests that heterosis is caused by additive effects between the active genes. When alleles are homozygous, only one of them is active. When genes are heterozygous, genomic imprinting does not occur, and all genes are active, showing all effects. The interaction between active genes increases the overall effect of gene expression; as a result, the hybrid exhibits heterosis. For example, in maize, the red1 (r1) gene, when inherited from both parents, causes different colors in corn kernels48. Genomic imprinting affects the differential expression of genes by affecting DNA methylation and histone modification49. Bao50 suggested that individuals have a specific set of genetic information that controls their growth. Genetic information is expressed as different coding genes in organisms; these genes form an orderly network of expression, and the activities of the genes are related to each other. An alteration in a single gene may cause changes in the entire network. The network of F1 hybrids is a new gene network system that is formed from the two different gene networks of the parents. If the interactions between alleles bring the whole genetic network system to an optimal state, the F1 hybrid exhibits heterosis; otherwise, it remains typical (Fig. 1E). In addition, the effects caused by genomic imprinting or active gene effects may be components of genomic dosage effects51; the other part of genomic dosage effects is usually caused by polyploidy and gives rise to a phenomenon specific to polyploid plants called progressive heterosis (Fig. 1F)52,53. The genomic dosage effects produced by allopolyploids are usually stronger than those produced by autopolyploids38,51,54,55. The formation of polyploids is accompanied by extensive genetic and epigenetic changes56, which may provide the molecular basis for the development of heterosis.

Epigenetics is involved in the development of heterosis

Although many hypotheses have been proposed to explain the mechanisms of plant heterosis at the genetic level, studies have shown that the genetic mechanisms of heterosis cannot be fully explained by one or even several hypotheses at the genetic level. Through the intensive study of epigenetics, epigenetic factors such as DNA methylation, small RNAs, and histone modifications have been found to be involved in the development of heterosis in plants57,58,59,60,61,62.

Epigenetic modifications play an important role in the formation of plant phenotypes by regulating gene transcription and gene expression63,64,65. Alleles of known phenotypes have been studied more extensively in the context of DNA methylation than in the context of other epigenetic modifications63. RNA-directed DNA methylation (RdDM) is one of the pathways that triggers DNA methylation through 24 nt-siRNA, and it is regulated by two key genes, namely, NRPD1 and NRPE1 (ref. 66) (Fig. 2A, B). A silent epigenetic variant caused by differentially methylated regions (DMRs) in the promoter, sulfurea (sulf/+), can result in homozygous lethal tomato plants that exhibit only chlorotic leaf sectors64,65. This may occur due to the random combination of genetic information from the parents of the F1 hybrids because their genotypes are more prone to heterozygosity at the DNA methylation level; this is in line with the findings of Shen et al.59. The gene effect caused by such heterozygosity may enable F1 hybrids to avoid producing common phenotypes or hybrid weakness, thus achieving heterobeltiosis. Using experiments involving heterograft eggplants, Cerruti et al.62 found that scion vigor is related to DNA methylation and that the reduction in methylation in the CHH context promotes scion vigor. Tomato grafting experiments revealed that RdDM can cause a heritable enhancement-through-grafting phenotype67,68.

Fig. 2: Putative model of heterosis triggered by epigenetics.

A DNA methylation: De novo methylation is catalyzed by DRM2, a homolog of DNMT3. In maintenance methylation, CG is catalyzed by MET1, a homolog of DNMT1; CHG is catalyzed by CMT3; and CHH is still catalyzed by DRM2. B Small RNA: Includes the miRNA produced from pre-miRNA and the siRNA produced from dsRNA. In general, 24 nt-siRNA mediates de novo DNA methylation catalyzed by the AGO4 protein. C Histone modifications: Modifications of histone amino acid residues include acetylation, phosphorylation, methylation, and ubiquitination. Epigenetic modifications are produced by the parents. New epigenetic modifications may occur in F1 hybrids. D Epigenetic modification status of the parents and F1 hybrid: the increase and decrease in or recombination of epigenetic modifications induces the F1 hybrid to exhibit heterosis

Because de novo DNA methylation is mediated by siRNAs (Fig. 2B), siRNAs may also be involved in the regulation of heterosis. The level of siRNAs decreased in different genome regions between parents and hybrids, but this phenomenon was limited to 24 nt-siRNAs; in contrast, the levels of siRNAs of other sizes did not decrease67. Noncoding small RNAs can be used as signaling molecules in plants67. Shivaprasad et al.61 observed that miR395 is differentially expressed, mediates transgressive phenotypes in the hybrid progeny of tomato, and is associated with suppression of the corresponding target genes, which indicates that the combination of parental genetic information can cause differences in miR395 abundance in the progeny. In addition, 21–24 nt small RNAs can move through the plasmodesmata and phloem at the graft site69, and 24 nt sRNAs can guide genomic DNA methylation in recipient cells70; this information provides a theoretical basis for guiding grafting. Furthermore, sRNAs in plants usually play a major role in inducing transcriptional and posttranscriptional gene silencing71,72. The downregulation of sRNA levels in hybrids may therefore lift the silencing of some favorable genes and thus allow hybrids to exhibit heterobeltiosis71,72.

Different modifications, such as acetylation, phosphorylation, methylation, and ubiquitination, occur at the amino terminus of histones (Fig. 2C). These histone modifications can affect the binding of related proteins to chromatin and thereby affect the transcriptional activity of genes. At the same time, the combination of modifications of the amino terminus of histones expands the genetic information for and changes the phenotype of an individual73. Histone modifications are related to the stability of heterosis. Studies have shown that histone deacetylases cause the nonadditive expression of some genes in hybrids58. In addition, histone acetylation and methylation are related to the activation of regulatory (circadian-regulated) genes in F1 hybrids73. The biological clock controls the physiological activities of plants, including the synthesis of physiological and biochemical substances. Therefore, histone modifications can influence plant biomass heterosis.

The recombination of genetic information from parents may lead to new combinations of epigenetic modifications in the F1 generation (Fig. 2D). Epigenetic modifications essentially affect the expression of genes, causing them to be overexpressed or silenced. Therefore, epigenetic modifications may indirectly influence the development of heterosis in F1 by affecting the expression pattern of genes.

Study on heterosis at the molecular level

Progress in heterosis research based on QTL analysis

The genome contains all the genetic information of a species and determines whether an individual gene is expressed as well as its degree of expression. Heterosis is usually indicated if the hybrid generation is superior to the parents in terms of quantitative traits. Thus, it is essential to conduct a genetic analysis of heterosis from the perspective of the whole genome. With the rapid development of genome sequencing technology, it has become possible to identify gene loci related to heterosis by genome-wide association studies74, which lay a foundation for the study of individual phenotypic differences. This review summarizes the QTL effects on heterosis based on 35 studies that mainly addressed 6 crops and vegetables, i.e., rice (Oryza sativa), maize (Zea mays), cotton (Gossypium hirsutum), oilseed rape (Brassica campestris), sorghum (Sorghum vulgare), and tomato (Solanum lycopersicum) (Table S1). Among the six types of QTL effects, dominance and epistasis had similar proportions (19% and 23%, respectively; Fig. 3). Interestingly, the overdominance effect accounted for the largest proportion of all the effects (42%, Fig. 3). This suggests that the many gene loci in the plant genome interact to produce different, complex, hard-to-imitate effects that result in heterosis; among these effects, overdominance effects occurred consistently and contributed significantly to heterosis. In addition, the overdominance effect can be conveniently used for artificial breeding, which has been well demonstrated in tomato15. However, efficiently and accurately locating the gene loci that impart the overdominance effect is necessary to make use of this effect. Heterosis may be the result of many traits. In addition, the results of QTL mapping differ among species and even within different groups of the same species75,76,77. Therefore, it is necessary to select a suitable genetic population based on the genetic background of the plants exhibiting heterosis.

Fig. 3: Statistical analysis of the effect of quantitative trait loci on crop heterosis.

A Species studied and the frequency of each species. B Quantitative trait locus effects in each species and the proportion of each type of effect

Advances in gene action related to heterosis based on differential expression analysis of genes

The genome controls the formation of a biological phenotype by regulating the differential expression of genes78,79. Molecular-based expression analyses, such as allele-specific expression, DNA microarray, expression quantitative trait loci, RNA-seq, quantitative SNP-based Sequenom technology, and allele-specific RT-PCR, have made it possible to detect differential gene expression.

Yield and biomass heterosis in F1 hybrids may occur due to the altered expression patterns of genes that control biological functions such as carbon fixation, glucose metabolism, and circadian rhythm80. Gene Ontology (GO) analysis of pakchoi parental lines and hybrids indicated that most of the genes differentially expressed between parents and hybrids were enriched in the photosynthetic pathway and that the enhancement of the photosynthetic capacity of the hybrids was related mainly to an increase in the number of thylakoids17. In addition, the increase in the number of thylakoids also promoted the enhancement of the carbon fixation capacity in the hybrids17; this is similar to the finding in broccoli that genes differentially expressed between F1 and their parents are significantly enriched in the light signaling pathway24. The same results were also found in other plants79,81. Transcriptome and differential gene expression analyses revealed that the modes of action of heterosis genes were mainly additive (F1 = MPV), overdominance (F1 > HPV), and underdominance (F1 < LPV)82 (Fig. 4). When the expression value of a differentially expressed gene in the hybrid line was close to that of the higher or the lower parent, the gene action patterns were classified as high-parent dominance (F1 ≈ HPV) and low-parent dominance (F1 ≈ LPV), respectively82 (Fig. 4). Li et al.24 reported that most genes exhibited additive expression patterns in hybrid broccoli and that nonadditive action was involved mainly in light and hormone signal pathways related to heterosis; a similar finding was reported in Chinese cabbage (Brassica campestris ssp. pekinensis cv. “spring flavor”)23. These gene expression patterns may have occurred due to selective inhibition or activation by the epigenetic modification of hybrid F1 genes83,84; the genes from inactive inbred lines can be activated by genes or regulatory factors of active inbred lines85,86. Epigenetic modifications and the interactions of heterogeneous factors occur in only a few genes, and the genes that are differentially expressed between F1 hybrids and their parents account for only a small part of the total genome87. Moreover, Springer and Stupar88 have shown that additive gene expression accounts for the majority of gene expression, while nonadditive gene expression is responsible for a small proportion of gene expression. These findings suggest that the nonadditive expression of this small fraction of genes facilitates the development of heterosis.

Fig. 4: By comparing the gene expression of the F1 hybrid and its parents, the gene expression patterns of F1 were divided into additive gene expression patterns and nonadditive gene expression patterns.

Midparent value [MPV = (HPV + LPV)/2]; High-parent value (HPV); low-parent value (LPV)
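The classification summarized in Fig. 4 can be illustrated with a minimal Python sketch. It is not the procedure used in the cited studies: the function name `classify_expression`, the relative tolerance `tol` used to decide when two values are approximately equal, and the fallback label "intermediate" are assumptions made for illustration. In practice, such calls are usually based on statistical tests across replicates rather than on a fixed tolerance.

```python
def classify_expression(p1, p2, f1, tol=0.1):
    """Classify the F1 expression level of one gene relative to its parents.

    p1, p2 : expression values of the two parents
    f1     : expression value of the F1 hybrid
    tol    : hypothetical relative tolerance for "approximately equal"
    """
    hpv = max(p1, p2)          # high-parent value
    lpv = min(p1, p2)          # low-parent value
    mpv = (hpv + lpv) / 2      # midparent value, MPV = (HPV + LPV)/2

    def close(value, reference):
        # Relative comparison; the tiny floor avoids a zero-width window.
        return abs(value - reference) <= tol * max(abs(reference), 1e-12)

    if close(f1, mpv):
        return "additive"                 # F1 = MPV
    if close(f1, hpv):
        return "high-parent dominance"    # F1 ≈ HPV
    if close(f1, lpv):
        return "low-parent dominance"     # F1 ≈ LPV
    if f1 > hpv:
        return "overdominance"            # F1 > HPV
    if f1 < lpv:
        return "underdominance"           # F1 < LPV
    # Between the parents but not near MPV, HPV, or LPV (label is an assumption).
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical expression values (parent 1, parent 2, F1).
    for values in [(10, 20, 15), (10, 20, 20.5), (10, 20, 30), (10, 20, 5)]:
        print(values, "->", classify_expression(*values))
```

Because the checks are ordered, a gene whose parents barely differ will be called additive first; a stricter analysis would first test whether the parents differ at all.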

Traits contributing to yield heterosis in vegetables

Traits related to yield heterosis

Hybrids that exhibit heterosis show significant heterobeltiosis in yield, which is a complex trait that is usually measured by weight. To clearly study the mechanisms of yield increase in hybrids, it is essential to divide yield into other, simpler traits. This review describes the traits that contribute to vegetable yields. Fruits are the source of the yield of most plants; the yield contributing traits related to fruits usually include the fruit number, fruit size, and fruit weight; earliness is usually also taken into account. Cabbage is a typical leafy, head-forming vegetable in Cruciferae, so its main yield contributing traits are head weight and head size (Fig. 5A, C). Similar to that of cabbage, the yield of radish is determined by a single organ, its taproot. For leafy vegetables that do not form heads, the main yield contributing traits are the number and size of the leaves. Unlike cruciferous vegetables, Cucurbitaceae and Solanaceae vegetables produce multiple harvests and multiple fruits per plant (Fig. 5B, D), so the average single fruit weight and fruit yield per plant should be taken into account. In addition, Solanaceae vegetable flowers consist mostly of compound inflorescences89, so the numbers of flowers per cluster and fruits per cluster contribute greatly to production. Cucurbitaceae are single-inflorescence vegetables; only the fruits on the main vine are harvested in production, and the first nodal position of female flowers and the sex ratio (M/F) affect the days to first harvest and the number of fruits per plant, respectively. Regardless of the trait considered, the total yield can be affected only by changes in yield-related traits. Therefore, it is necessary to analyze the mechanisms that regulate yield-related traits.
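As a worked illustration of how heterosis can be quantified for each of these component traits, the sketch below computes mid-parent heterosis and heterobeltiosis (better-parent heterosis) as percentages, using the commonly applied formulas (F1 − MP)/MP × 100 and (F1 − BP)/BP × 100. The function names, the `higher_is_better` flag, and the trait values are hypothetical and are not taken from the studies cited here.

```python
def mid_parent_heterosis(p1, p2, f1):
    """Percent deviation of the F1 from the mean of the two parents."""
    mid_parent = (p1 + p2) / 2
    return (f1 - mid_parent) / mid_parent * 100


def heterobeltiosis(p1, p2, f1, higher_is_better=True):
    """Percent deviation of the F1 from the better parent.

    For traits where smaller values are desirable (e.g., days to first
    harvest), the better parent is the one with the lower value, and a
    negative result indicates favorable heterosis.
    """
    better_parent = max(p1, p2) if higher_is_better else min(p1, p2)
    return (f1 - better_parent) / better_parent * 100


if __name__ == "__main__":
    # Hypothetical trait means: (parent 1, parent 2, F1, higher_is_better)
    traits = {
        "fruit yield per plant (kg)": (2.0, 3.0, 4.0, True),
        "days to first harvest": (70.0, 60.0, 54.0, False),
    }
    for name, (p1, p2, f1, higher) in traits.items():
        mph = mid_parent_heterosis(p1, p2, f1)
        bph = heterobeltiosis(p1, p2, f1, higher_is_better=higher)
        print(f"{name}: MPH = {mph:.1f}%, BPH = {bph:.1f}%")
```

Reporting both values per trait makes it clear whether a hybrid merely exceeds the parental average or actually outperforms its better parent.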

Fig. 5: Contributing traits of yield heterosis in cucumber, cabbage and tomato.

A Traits contributing to yield heterosis in cucumber, cabbage, and tomato: cucumber yield contributing traits include the number of fruits, days to first female flowering, days to first harvest, first nodal position of female flower, sex ratio (M/F), fruit length, fruit diameter, and fruit weight; cabbage yield contributing traits include fruit length, fruit diameter, and fruit weight; tomato yield contributing traits include number of fruits, days to first female flowering, days to first harvest, number of flowers/fruits per cluster, fruit length, fruit diameter, and fruit weight. B Cucumber: cucumber model in production, gynoecious line with a small number of branches. C Cabbage: an aerial and cross-sectional model of cabbage consisting of leaves and heads. D Tomato: a tomato with single inflorescences and indeterminate growth is crossbred with a tomato with compound inflorescences and determinate growth to produce the hybrid F1 with earlier fruiting, more compound inflorescences, and determinate growth

Relationship between yield heterosis and plant architecture

Since the “green revolution”, interest in breeding for specific plant architecture has significantly increased, and the idea of combining heterosis breeding with plant architecture breeding has been proposed90. Donald91 conducted research on semidwarf plant architecture and introduced the concept of the ideotype, which refers to the plant architecture form that results in the minimum competitive intensity within a crop population. Although this definition is no longer used, the concept of an ideal plant architecture has played a major role in promoting plant breeding for high yields. Research on ideotypes first made progress in rice. It is worth mentioning that a key gene regulating ideotype, IPA1, was shown by Huang et al.75, using an indica-japonica hybrid rice group, to influence genes that are important in heterosis. Studies of heterozygosity and ideotype were also combined effectively in tomato. The self-pruning (sp) gene promotes indeterminate growth in tomato, while the sft gene changes indeterminate growth into determinate growth by inhibiting the sp gene92. The sft gene results in the development of heterosis in tomatoes through the heterozygosity of a single gene15 and induces changes in aboveground plant architecture, causing tomato to produce compound inflorescences rather than single inflorescences93. The F1 was also earlier than its parents (Fig. 5D), which increased tomato yield. Other vegetables in addition to tomato may also have ideotypes, and the key genes controlling plant architecture may also be important genes that are involved in the development of heterosis. Therefore, it is particularly important to study the genetic mechanisms of heterosis. By identifying the important genes involved in heterosis, the key genes that control plant ideotypes can be characterized.

Advances in heterosis utilization and biotechnology in vegetables

Breeding for heterosis has been extensively studied in plants, and research on the heterobeltiosis of hybrid offspring in vegetables has focused mainly on yield94 and disease resistance29. Wellington95 and Tschermak96 showed that tomato hybrids exhibit heterosis in early maturity and in yield. Krieger et al.15 cloned the single gene sft that affects the female flower fertility rate in tomato by infiltrating the IL and TC populations. When the sft gene was heterozygous, the tomato yield exhibited heterosis. According to this study, tomato plants that showed yield heterosis also showed resistance to both biotic and abiotic stresses. The heterozygous state of the Tm and Tm22 genes contributes to tobacco mosaic virus resistance97,98 and high-temperature stress tolerance99,100. Naresh et al.101 suggested that heterosis is the result of nonadditive gene effects and that it also plays an important role in improving Cercospora leaf spot resistance in eggplant in the field. Similar to studies on other vegetables, studies on heterosis in Cucurbitaceae vegetables have also focused mainly on yield and disease resistance. Pandey et al.102 used 77 cucumber hybrid generations and their parents to study the yield heterosis and contributing traits of different cucumber hybrid varieties and found that DC–1 × B–159 and VRC–11–2 × Bihar–10 were the best hybrid combinations for yield and earliness. Using 48 F1 hybrids and their parents, the gene effects on damage caused by diseases and insect pests under natural conditions29 were investigated. The results indicated that nonadditive gene effects had a significant regulatory effect on other traits in cucumber (except morbidity caused by Drosophila), demonstrating the importance of heterosis in cucumber breeding for disease resistance.

Different molecular markers, such as simple sequence repeats (SSRs), inter-simple sequence repeats (ISSRs), amplified fragment length polymorphisms (AFLPs), random amplified polymorphic DNAs (RAPDs), and sequence-related amplified polymorphisms (SRAPs), have provided the molecular basis for the construction of genetic maps and the mapping of important trait genes (Table 1). Whole-genome sequencing has been conducted for a variety of vegetables (Table 1), which has provided a basis for whole-genome strategies. Whole-genome approaches can help obtain complete sequences of germplasm resources, increase the coverage of molecular markers, and increase the accuracy of genetic maps103. Molecular markers are often used for the determination of genetic distance and the classification of heterotic groups. To elucidate the breeding processes and to improve the efficiency of breeding techniques in cabbage, heterotic cabbages are usually divided into two groups: the round head type and the flat head type. Xing et al.104 further divided 21 flat cabbage inbred lines into three heterotic groups and divided 42 round cabbage inbred lines into five heterotic groups in order to provide a more definite direction for the preparation of hybrid combinations of cabbage. The method of dividing heterotic groups by molecular markers and genetic distance is widely used in vegetable breeding (Table 1).
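To make the idea of marker-based genetic distance concrete, the following minimal Python sketch computes pairwise Jaccard distances between inbred lines scored for the presence (1) or absence (0) of marker bands. The Jaccard coefficient is one common choice for dominant markers and is used here as an assumption; it is not necessarily the measure used in the studies cited above. The line names and marker profiles are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two binary marker profiles (1 = band present)."""
    if len(a) != len(b):
        raise ValueError("marker profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        # No band in either line: treat the profiles as identical.
        return 0.0
    return 1 - shared / either


def pairwise_distances(profiles):
    """Return {(line_i, line_j): distance} for every pair of lines."""
    return {
        (m, n): jaccard_distance(profiles[m], profiles[n])
        for m, n in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Hypothetical presence/absence scores for six marker bands.
    profiles = {
        "line_A": [1, 1, 0, 1, 0, 0],
        "line_B": [1, 1, 0, 1, 1, 0],
        "line_C": [0, 0, 1, 0, 1, 1],
    }
    for pair, dist in pairwise_distances(profiles).items():
        print(pair, round(dist, 3))
```

A distance matrix of this kind is typically passed to a clustering method to propose heterotic groups; distant pairs are candidates for test crosses, although genetic distance alone does not guarantee heterosis.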

Table 1 Research progress in vegetable breeding

Chen83 proposed that determining how to obtain hybrid seeds is the key to the utilization of heterosis. The purpose of hybrid seed production is to make heterosis in the offspring permanent. Cruciferous vegetables have a sporophytic self-incompatibility system105 that prevents self-pollination while allowing normal seed set through cross-pollination. Hence, this system is convenient for generating hybrid seeds. In cabbage106,107 and Chinese cabbage108, hybrids are usually obtained using self-incompatible and male-sterile lines. To produce hybrid tomato seeds, pollen-abortive and functionally sterile lines are often used109,110,111. Cytoplasmic male sterility occurs in eggplant112,113 and pepper114,115, and gynoecious lines are common in Cucurbitaceae116. A new male-sterile system in tomato was developed by Du et al.117. Plant growth regulators such as ethylene, auxins, and brassinosteroids118,119 can increase the number of female flowers in Cucurbitaceae; this effect and male sterility are both convenient for hybrid seed production.

Strategies for heterosis breeding in vegetables (with tomato as an example)

Obtaining F1 hybrids that exhibit heterosis based on heterosis prediction

It is not advisable to conduct extensive hybridization tests to obtain F1 hybrids that exhibit heterosis, as this approach requires considerable resources and time and produces unreliable results13. Melchinger and Gumber120 proposed that heterotic groups should be used as the basis for crossbreeding. A heterotic group is a population classified according to breeding requirements that has abundant genetic variation and high combining ability. Chen et al.121 carried out a genome-wide association study (GWAS) on yield traits, general combining ability (GCA), and specific combining ability (SCA) in rice. The study provided strong evidence for the use of combining ability to classify heterotic groups and provided a reference for studies on combining ability in vegetables (Fig. 6). Other studies have also shown that combining ability, genetic distance, and molecular markers can provide the basis for evaluating parental inbred lines and predicting F1 hybrid heterosis in vegetables122,123,124,125.

Fig. 6: There are two key factors involved in applying heterosis breeding strategies: obtaining heterotic lines and maintaining heterosis in the elite lines in the offspring.

There are two strategies for obtaining heterotic lines in crop breeding. The first is crossbreeding. Genealogical analysis, molecular markers, combining ability, and genetic distance can usually predict heterosis, so they are often used to classify heterotic groups. Inbred lines from different heterotic groups can then be crossed with each other to obtain elite lines that exhibit heterosis. The second strategy is to use modern molecular biotechnology. Elite lines are obtained through GWAS and linkage analysis, mapping and cloning of heterosis-related genes, gene editing, and gene transformation.

GCA characterizes the average performance of a line across a set of hybrid combinations and is mainly the consequence of additive gene effects and additive × additive interactions; SCA evaluates the performance of specific hybrid combinations relative to the performance expected from their parental lines and is the result of dominance, epistatic deviation, and genotype × environment interactions126. Parents with a high GCA effect are more adaptable and less affected by the environment127. Parents with superior traits do not always pass on their traits to offspring126; hence, the evaluation of combining ability is more reliable than the performance of the lines per se. Many types of combining ability tests can be used to identify superior parental lines for developing heterotic hybrids, including line × tester analysis, topcross tests, single-cross tests, poly-cross tests, and diallel mating128. Singh et al.129 conducted a complete diallel cross test on seven diverse bitter gourd lines and found that combinations with high × high GCA usually produced high SCA effects and could therefore be considered for developing superior variants through the pedigree method. High/low × low GCA combinations can also achieve high but unstable SCA effects that are suitable for heterosis breeding, in line with the results of Kenga et al.130 in sweet sorghum and Franco et al.131 in common bean.
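As a minimal sketch of how GCA and SCA effects can be separated, the code below assumes the simplest diallel layout: one mean value per F1 cross, no parents and no reciprocals (often described as a Griffing Method 4 design), with fixed effects and a single environment. This is an illustrative assumption and differs from the complete diallel used by Singh et al.; the parent names and trait means are hypothetical.

```python
from itertools import combinations


def combining_ability(cross_means):
    """Estimate GCA and SCA effects from a half diallel of F1 means.

    cross_means maps a sorted tuple (parent_i, parent_j) to the mean of that cross.
    Model: x_ij = mu + g_i + g_j + s_ij, with sum(g) = 0.
    """
    parents = sorted({p for pair in cross_means for p in pair})
    n = len(parents)
    if n < 3:
        raise ValueError("at least three parents are needed")
    for pair in combinations(parents, 2):
        if pair not in cross_means:
            raise ValueError(f"missing cross {pair}")

    grand_total = sum(cross_means.values())
    mu = 2 * grand_total / (n * (n - 1))  # mean over all n(n-1)/2 crosses

    # Total of all crosses that involve each parent.
    totals = {p: sum(v for pair, v in cross_means.items() if p in pair)
              for p in parents}

    # GCA effect: g_i = (n * X_i. - 2 * X..) / (n * (n - 2))
    gca = {p: (n * totals[p] - 2 * grand_total) / (n * (n - 2)) for p in parents}

    # SCA effect: what remains after removing the mean and both GCA effects.
    sca = {pair: v - mu - gca[pair[0]] - gca[pair[1]]
           for pair, v in cross_means.items()}
    return mu, gca, sca


# Hypothetical trait means for the six crosses among four inbred lines.
crosses = {
    ("A", "B"): 10.0, ("A", "C"): 12.0, ("A", "D"): 14.0,
    ("B", "C"): 11.0, ("B", "D"): 13.0, ("C", "D"): 18.0,
}

mu, gca, sca = combining_ability(crosses)
print("mean:", mu)
print("GCA:", gca)
print("SCA:", sca)
```

In this toy data set, the cross C × D combines the two parents with the highest GCA estimates and also has a positive SCA estimate, which is the high × high pattern described above. Real analyses would add replication, an error term, and significance tests.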

In addition to combining ability, heterotic groups are often classified by genealogical information132. For parents with known genealogical relationships, heterosis in hybrids can usually be predicted from these relationships. Genetic distance is a quantitative description of the genetic differences that provide the genetic basis for the development of heterosis in offspring133,134. Parental lines separated by a greater genetic distance are more likely to produce hybrids with strong heterosis135,136. Molecular markers can also be used to directly or indirectly classify heterotic groups by assessing their genetic distance125,137,138. RAPD and AFLP markers have been successfully used to detect the genetic distance between tested lines, and the yield of carrots was found to be significantly correlated with genetic distance125. Genetic distance has also been applied to predict hybrid pepper fruit diameter139 and hybrid melon (Cucumis melo L.) fruit shape diameter140. The scientific classification of heterotic groups improves the efficiency of selecting hybrid combinations of superior parents and utilizing heterosis (Fig. 6).
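To illustrate how marker data can be turned into a genetic distance, the sketch below scores dominant markers such as RAPD or AFLP bands as present (1) or absent (0) and computes a Jaccard distance between lines. The choice of the Jaccard coefficient is an assumption made for illustration; the cited studies may have used other coefficients. The line names and band scores are hypothetical.

```python
from itertools import combinations


def jaccard_distance(bands_x, bands_y):
    """Jaccard distance between two lines scored for band presence (1) or absence (0)."""
    if len(bands_x) != len(bands_y):
        raise ValueError("both lines must be scored at the same marker loci")
    shared = sum(1 for x, y in zip(bands_x, bands_y) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(bands_x, bands_y) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # no bands in either line: treat as identical
    return 1 - shared / either


# Hypothetical band scores at eight marker loci for three inbred lines.
band_scores = {
    "line_1": [1, 1, 0, 1, 0, 1, 0, 0],
    "line_2": [1, 1, 0, 1, 1, 1, 0, 0],
    "line_3": [0, 0, 1, 0, 1, 0, 1, 1],
}

distances = {
    (a, b): jaccard_distance(band_scores[a], band_scores[b])
    for a, b in combinations(sorted(band_scores), 2)
}
most_distant = max(distances, key=distances.get)

for pair, d in sorted(distances.items()):
    print(pair, round(d, 3))
print("most distant pair:", most_distant)
```

Under the reasoning above, the most distant pair would be a first candidate for a test cross, although genetic distance alone does not guarantee strong heterosis.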

In addition, omics approaches such as genomics, transcriptomics, and metabolomics have become tools for predicting hybrid yield in rice141. Xu et al.141 analyzed metabolomic and genomic data for 21,945 potential hybrids derived from 210 recombinant inbred lines and found that metabolomic data were more effective than genomic data in predicting hybrid yield. Research on the prediction of heterosis in vegetables with omics data has not been published. However, the genome or epigenome is the most fundamental source of the plant phenotype, and the transcriptome, proteome, and metabolome are its more direct sources. Therefore, omics data could represent a more accurate way to predict vegetable hybrid heterosis, and studies of crop hybrid yields can provide a reference for predicting heterosis in vegetables.

Obtaining elite lines based on molecular biotechnology

GWAS is a method used to identify the gene loci that control certain traits in a population by combining phenotypes with genotypes. GWAS is often used to identify loci for simple traits, such as green flesh color or thermotolerance, in cucumber142,143 but can also be used to analyze complex traits, such as yield and biomass144,145,146,147,148,149,150,151,152,153,154,155,156. In addition, whole-genome sequencing of various vegetables provides a basis for GWAS (Table 1). Because heterosis is a distinctive phenotype that depends on genetic background, the study population can be composed of different populations or of hybrids between ecotypes. A segregating F2 population derived from an F1 with strong heterosis is regarded as the best population for studying heterosis27. Such an F2 population not only has a reasonable proportion of lines with heterozygous and homozygous genotypes but also has allele combinations that are distributed evenly at each locus27.

DeVicente and Tanksley157 randomly paired the lines of an RIL population, obtained by selfing a strongly heterotic F1, to produce a new population. This population not only preserves the genotypes of the RIL population but also reproduces the genetic composition of an F2 population; thus, it is called an immortalized F2 (IF2) population. At present, IF2 populations have been established in rice158,159,160,161, maize150,162,163,164,165,166,167,168,169, cotton170, and other crops. In addition, diverse F1156, IL171,172,173,174,175, BILF1176,177, and SSSL178 populations can also be used to study heterosis. Apart from two studies on tomato, few studies on heterosis in vegetables have used such populations, so there is little to serve as a reference for heterosis-related studies in other vegetables.
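The random pairing of RILs described above can be sketched as follows. The sketch assumes that, within one round of crossing, each RIL is used exactly once, so that a population of n lines yields n/2 crosses per round; this constraint, the function name, and the RIL identifiers are illustrative assumptions rather than details taken from the cited studies.

```python
import random


def immortalized_f2_round(ril_ids, rng):
    """Return one round of random pairings in which each RIL is used in exactly one cross."""
    if len(ril_ids) % 2 != 0:
        raise ValueError("an even number of RILs is needed for complete pairing")
    shuffled = list(ril_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # Pair the first half of the shuffled list with the second half.
    return list(zip(shuffled[:half], shuffled[half:]))


# Hypothetical population of ten RILs; a fixed seed keeps the design reproducible.
rils = [f"RIL{i:03d}" for i in range(1, 11)]
rng = random.Random(42)
crosses = immortalized_f2_round(rils, rng)

for female, male in crosses:
    print(f"{female} x {male}")
```

Because the RILs are homozygous and can be maintained by selfing, the same crosses can be remade whenever needed, which is what makes the resulting F2-like population repeatable across seasons and environments.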

Knocking out adverse genes by genome editing or overexpressing favorable genes can transform ordinary lines into strongly heterotic lines. For example, biomass, plant height, and leaf photosynthetic pigment contents were higher in rice expressing maize GLK genes than in wild-type rice179; such results may prompt researchers to examine whether genes from one vegetable can promote heterosis in another. Dominance and overdominance effects account for a large proportion of the effects that produce heterosis and are easy to mimic (Fig. 3B). Understanding the mechanisms of heterosis helps breeders to improve current varieties and generate novel cultivars27 (Fig. 6).
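The distinction between dominance and overdominance mentioned above can be made concrete with a simple ratio of the dominance deviation to the additive effect, sometimes called a potence ratio. The sketch below applies this single-locus idea to phenotypic means, which is a simplification: real traits are polygenic, and the thresholds used here are textbook conventions rather than values from the cited work. All names and numbers are hypothetical.

```python
def potence_ratio(f1, p1, p2):
    """Ratio of the dominance deviation (F1 minus mid-parent) to the additive effect."""
    additive = abs(p1 - p2) / 2
    if additive == 0:
        raise ValueError("parents do not differ, so the ratio is undefined")
    dominance = f1 - (p1 + p2) / 2
    return dominance / additive


def classify(ratio, tol=1e-9):
    """Label the mode of gene action implied by the magnitude of the ratio."""
    magnitude = abs(ratio)
    if magnitude <= tol:
        return "additive (no dominance)"
    if magnitude < 1 - tol:
        return "partial dominance"
    if magnitude <= 1 + tol:
        return "complete dominance"
    return "overdominance"


# Hypothetical parents with trait values 4.0 and 5.0, and several possible F1 values.
for f1_value in (4.5, 4.75, 5.0, 6.0):
    ratio = potence_ratio(f1_value, 4.0, 5.0)
    print(f"F1 = {f1_value}: ratio = {ratio:.2f}, {classify(ratio)}")
```

An F1 that exceeds the better parent gives a ratio above 1 and is labeled as overdominance here, although at the whole-plant level the same pattern can also arise from dominance at several linked loci.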

Maintaining heterosis

Crossing inbred lines from two heterotic groups can generate hybrid offspring that exhibit heterosis. In hybrid seed production, self-incompatibility and male-sterile line technology can be used to maintain the hybrid vigor of the F1. Some characteristics of the vegetables themselves, such as gynoecy in Cucurbitaceae116 and asexual reproduction in potato (Solanum tuberosum L.)180, are convenient for hybrid seed production or heterosis maintenance. In addition, some plant hormones or chemical reagents can also be used for plant sex regulation14. However, exogenous regulation is often not completely effective14, which may affect the purity of hybrid seeds. Therefore, it is necessary to study hybrid systems of vegetables for hybrid seed production.

Du et al.117 used gene editing technology (Cas9) to knock out the male-specific gene SlSTR1 in tomato to obtain a sterile line and generated a maintainer line by transferring a fertility-restoration gene into the sterile line. Because a seedling-color gene was linked to the fertility-restoration gene, male-fertile maintainer plants were easy to identify among the offspring of crosses between the maintainer and male-sterile lines. This system combines tomato sterile lines with gene editing technology and is a potentially highly practical approach to hybrid seed production in tomato. Moreover, it may serve as an important reference for the use of gene editing technology for hybrid seed production in other vegetables.

Khanday et al.181 and Wang et al.182 found that genome editing can cause mitosis to replace meiosis in rice such that diploid clonal seeds retain the heterozygosity of the original F1 and maintain F1 traits (Fig. 6). Unlike knocking out a fertility gene by gene editing, with this method fertilization and cell division are still necessary. Some vegetables lack sterile-line material. Therefore, this method, in which reproduction involves only mitosis and not meiosis, should be more widely applicable.

In addition, by repeatedly selecting, from the F2 generation onward, lines that were close to the F1 phenotype, Wang et al.85 obtained pure F5/F6 lines that were close to the F1 phenotype; these were called hybrid simulation lines, indicating that the phenotype of the F1 hybrids was fixed in these lines. This method has also been used to maintain F1 heterosis in vegetables such as tomatoes183 and peas (Pisum sativum L.)184. Therefore, the heterosis of F1 vegetable hybrids produced by hybridization or molecular biotechnology may in the future be maintained through diploid clonal seeds and selection for hybrid simulation lines (Fig. 6).
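Why F5/F6 lines can be treated as nearly pure follows from the expected loss of heterozygosity under repeated selfing. The short sketch below assumes strict selfing, no selection, and unlinked loci, so that the fraction of loci still heterozygous halves in each generation after the F1. Selection for an F1-like phenotype, as applied when developing hybrid simulation lines, could make the real values differ from this expectation.

```python
def expected_heterozygosity(generation):
    """Expected fraction of F1-heterozygous loci that remain heterozygous in generation F_n."""
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or higher")
    return 0.5 ** (generation - 1)


for gen in range(1, 7):
    print(f"F{gen}: {expected_heterozygosity(gen):.4f}")
```

Under these assumptions, about half of the loci are heterozygous in the F2, whereas only about 6% remain heterozygous in the F5 and about 3% in the F6.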

Conclusions and future perspectives

Research on vegetable heterosis has focused mainly on its application in breeding. Studies on its genetic mechanism are limited, which hinders its utilization. By contrast, extensive progress has been made in the study of heterosis in cereal crops such as rice and maize. In vegetables, both hybrid production systems (male-sterile lines, self-incompatible lines, and gynoecious lines) and molecular biological techniques (gene editing, transgenesis, and asexual reproduction) have been used. Therefore, the methods and strategies proposed in this paper for studying the genetic mechanisms of heterosis can be applied to vegetable breeding. In the near future, we will identify heterosis-related gene loci in vegetables to understand the molecular genetics and mechanism of heterosis formation and to make new breakthroughs in improving the yield, quality, and safety of vegetables. This review emphasizes the following points: (1) The application of heterosis in vegetable crops allows improvements in yield and quality and enhances plant resistance to biotic and environmental stresses. (2) In the future, more attention should be paid to the genetic mechanisms of vegetable heterosis to identify the important genes involved in its development and to understand the regulation and modes of action of the key genes affecting it. (3) Strategies used in cereal crop heterosis studies can be referenced and adapted, because exogenous genes can perform the same function in different species179. Therefore, transgenic and genome editing technologies can significantly improve the efficiency of heterosis gene identification in vegetables. (4) Although some basic molecular knowledge of vegetable heterosis has been obtained, applying the knowledge acquired from cereal crops to vegetables will further improve vegetable production and quality. It will also be useful to compare sterile line seed production with optimized transgenic systems to achieve more breakthroughs in vegetable production. (5) The study of heterosis can promote the study of ideal plant architecture in vegetable breeding. A breeding strategy that combines heterosis with the ideal plant architecture can achieve substantial gains in vegetable yield and quality. (6) Maintaining heterosis is the key to the extensive use of heterosis and has been reflected mainly in F1 hybrid seed production. With the development of gene editing technology, sterile line gene editing systems, MiMe (Cas9) systems, and even newer biotechnology approaches will have opportunities to be widely applied; this will be of great significance for hybrid seed production. (7) Progressive heterosis caused by the dosage effect in polyploid hybrids is also an important component of the genetic mechanisms of heterosis and has been observed in different plants55,185. Polyploid systems allow experiments to be performed that are impossible in diploid systems; hence, polyploid crossbreeding may lead to different plant performance results than diploid breeding. However, polyploids have highly heterozygous genomes and complex genetic structures, and we may not be able to evaluate their phenotypes and genetic structures using diploid criteria. This topic deserves future investigation.