Genetic dissection of heterosis using epistatic association mapping in a partial NCII mating design

Wen, Jia; Zhao, Xinwang; Wu, Guorong; Xiang, Dan; Liu, Qing; Bu, Su-Hong; Yi, Can; Song, Qijian; Dunwell, Jim M.; Tu, Jinxing; Zhang, Tianzhen; Zhang, Yuan-Ming

doi:10.1038/srep18376

Published: 17 December 2015

Scientific Reports volume 5, Article number: 18376 (2015)

Abstract

Heterosis refers to the phenomenon in which an F₁ hybrid exhibits enhanced growth or agronomic performance. However, previous theoretical studies on heterosis have been based on bi-parental segregating populations instead of F₁ hybrids. To understand the genetic basis of heterosis, here we used a subset of F₁ hybrids, named a partial North Carolina II design, to perform association mapping for four dependent variables: original trait value, general combining ability (GCA), specific combining ability (SCA) and mid-parental heterosis (MPH). Our models jointly fitted all the additive, dominance and epistatic effects. The analyses resulted in several important findings: 1) the main components are additive and additive-by-additive effects for GCA and dominance-related effects for SCA and MPH, and the additive-by-dominant effect for MPH was partly identified as an additive effect; 2) the ranking of factors affecting heterosis was dominance > dominance-by-dominance > over-dominance > complete dominance; and 3) increasing the proportion of F₁ hybrids in the population could significantly increase the power to detect dominance-related effects and slightly reduce the power to detect additive and additive-by-additive effects. Analyses of cotton and rapeseed datasets showed that more additive-by-additive QTL were detected from GCA than from trait phenotype and that fewer QTL were detected from MPH than from the other dependent variables.

Introduction

Heterosis, characterized by Darwin¹, refers to the existence of superior levels of biomass, stature, growth rate and/or fertility in hybrid offspring compared with the parents^2,3. The rediscovery of heterosis in maize a century ago has revolutionized plant and animal breeding and production^3,4,5,6. In China, hybrid rice and maize account for approximately 50% and 90% of the total cultivated acreages, respectively. It was estimated that the yield advantage of hybrid maize had contributed an additional 55 million metric tons to the production each year⁷. Although heterosis refers to the F₁ hybrid, the current knowledge of its genetic foundation is derived from the bi-parental segregating populations but not from F₁ hybrids. Therefore, it is necessary to dissect the genetic basis of heterosis based on F₁ hybrids.

Efforts have been made to dissect the genetic foundation of heterosis over the past hundred years^8,9. In early studies, classical quantitative genetic analysis methods were used to analyze the original trait value. As a result, dominance^10,11,12, over-dominance^4,13 and epistasis^14,15 hypotheses for heterosis were proposed. In general, dominance includes partial, complete and over-dominance, and the epistasis between two loci includes additive-by-additive (aa), additive-by-dominant (ad), dominant-by-additive (da) and dominant-by-dominant (dd) effects. The dominance hypothesis for heterosis means that partial-dominance results in heterosis. However, these methods dealt only with the collective effects of all the polygenes. With the introduction of molecular markers and the wide application of quantitative trait locus (QTL) mapping analyses, the dominance^16,17, over-dominance^18,19,20,21 and epistasis^17,22,23,24 hypotheses were also supported; these analyses were performed for two kinds of dependent variables, i.e., trait phenotype or mid-parental heterosis (MPH)^22,23,25,26. In hybrid breeding for heterosis utilization, a genetic mating scheme is usually used to identify elite parents and hybrid combinations through the analyses of general combining ability (GCA) and specific combining ability (SCA), respectively. Recently, an association mapping approach was used for dependent variables such as GCA and SCA in triple testcross and North Carolina III mating designs^27,28,29,30,31,32,33. The North Carolina II (NCII) mating designs based on different base populations, such as BC₁F₈³⁴, recombinant inbred lines³⁵ and introgression lines^36,37, were reported and a comparison across different base populations was also conducted³⁸. However, a comparison of the genetic components of trait phenotype, GCA, SCA and MPH has not been reported, especially in the presence of epistasis.

In this study, trait phenotype, GCA, SCA and MPH in a subset of the F₁ hybrids, named a partial NCII mating design, were analyzed by an association mapping approach under an additive-dominant-epistatic genetic model. All the main and epistatic effects for each dependent variable were estimated by the fast empirical Bayesian LASSO (EBLASSO) method³⁹. Our purpose was to compare the differences in the genetic components of the above four dependent variables for heterosis. In addition, the effect of the ratio of the number of F₁ hybrids to the total number of parental lines and F₁ hybrids in mapping population on association mapping was also investigated.

Results

Association mapping for micronaire in cotton and for length of main raceme in rapeseed

LD score regression analysis

The estimates for regression intercept were −6.05 ± 3.22 (standard error) in Xinjiang and −4.83 ± 3.30 in Jiangsu for micronaire in cotton and −3.46 ± 1.03 for length of main raceme in rapeseed; and the corresponding t statistics (probabilities) were −2.19 (0.029), −1.77 (0.0789) and −4.31 (2.53E-05), respectively. Thus, population structure should be considered in real data analyses.

Association studies

Q matrix for population structure was incorporated into the genetic model of epistatic association mapping. A total of 11, 7, 5 and 2 reliable QTL were identified for micronaire in cotton based on trait phenotype, GCA, SCA and MPH, respectively (Table 1). A total of 18, 16, 2 and 2 reliable QTL were identified for length of main raceme in rapeseed based on trait phenotype, GCA, SCA and MPH, respectively (Table 2). These QTL were detected in at least two instances, each with a different dependent variable. Clearly, all types of effects were detected from trait phenotype, additive and aa effects were identified from GCA and dominance-related effects were found from SCA and MPH.

Table 1 Position, type and effect of QTL for cotton micronaire in a mating design.

Table 2 Position, type and effect of QTL for rapeseed length of main raceme in a partial NCII mating design.

Genetic components of GCA, SCA and MPH

In model (2), trait phenotype, GCA, SCA and MPH were used as dependent variables. When all the simulated QTL had only one type of genetic effect in each experiment, the additive QTL were detected with trait phenotype and GCA as dependent variables but not with SCA and MPH (Fig. 1a). The additive effect is a component of GCA. A similar result was obtained in the association mapping for length of main raceme in rapeseed, in which a total of 15 common additive QTL were detected when trait phenotype and GCA were used as dependent variables. Furthermore, one additional additive-by-additive × environment interaction for micronaire in cotton was detected from GCA but not from trait phenotype; and the aa QTL were more likely to be detected from GCA than from trait phenotype.

A dominant QTL could be detected with trait phenotype, SCA and MPH (Fig. 1b). The dominant effect is a component of SCA and MPH. Two common dominant QTL between trait phenotype and SCA and one common dominant QTL between SCA and MPH for length of main raceme in rapeseed supported this result, indicating that the power of detecting QTL was slightly higher for the trait phenotype and SCA models than for the MPH model (Fig. 1b). Although the dominant QTL could sometimes be identified in the GCA model, their estimated effect was close to zero (Table S2).

Although the aa QTL were detected with trait phenotype, GCA and, sometimes, SCA as the dependent variable, the detection power was significantly higher with trait phenotype and GCA than with SCA and MPH. For example, the power to detect QTL with a heritability of 0.05 was 100% using GCA, but the power to detect any of the simulated QTL was less than 10% using MPH (Fig. 1c and Table S2). The aa effect is a key component of GCA. One similar aa QTL in the models with trait phenotype and GCA for length of main raceme in rapeseed validated this result (Table 2).

The ad QTL could be detected from trait phenotype, SCA, MPH and, sometimes, GCA models (Fig. 1d). The trait phenotype and SCA models had the highest power and the GCA model had the lowest. Compared with the trait phenotype and SCA models, the power of MPH was relatively low because some of the ad QTL were identified as additive QTL at marker positions 20, 40, 75, 90, 155 and 180 cM (Table S2), indicating that the ad QTL sometimes could not be distinguished from additive QTL. Although the ad QTL could sometimes be detected in the GCA model, their effect estimates were close to zero (Table S2). Similar results were found for the da and dd QTL, except that the dd QTL could be distinguished from the additive or dominant QTL with MPH (Fig. 1e,f and Table S2). For association mapping of micronaire in cotton, two ad QTL and three dd QTL were identified with trait phenotype and SCA; these ad and dd effects were components of SCA. One ad QTL and one dd QTL were also detected by MPH, indicating that ad or dd QTL were less likely to be detected with MPH than with trait phenotype and SCA.

The above results showed that the additive and aa effects were the major contributors to GCA; that effects other than the additive effect were components of SCA; and that the dominance-related effects were components of MPH, although some of the ad or da QTL could not be distinguished from additive QTL.
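A one-locus sketch may help to show why additive effects drop out of MPH while dominance effects remain. It assumes genotypic values of μ + a, μ + d and μ − a for QQ, Qq and qq; this parameterization is an illustrative assumption and is not taken from the model definitions in this paper.

```latex
% Assumed one-locus genotypic values: QQ = \mu + a, Qq = \mu + d, qq = \mu - a
\begin{align*}
\text{Parents } QQ \times qq:\quad
  & F_1 = \mu + d, \qquad
    MP = \tfrac{1}{2}\left[(\mu + a) + (\mu - a)\right] = \mu, \\
  & \mathrm{MPH} = \frac{F_1 - MP}{MP} \times 100\% = \frac{d}{\mu} \times 100\%, \\
\text{Parents } QQ \times QQ:\quad
  & F_1 = MP = \mu + a, \qquad \mathrm{MPH} = 0 .
\end{align*}
```

Under these assumptions the numerator F₁ − MP never contains the additive effect a, which agrees with the simulation result that additive QTL were not detected when MPH was the dependent variable.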

Relative contribution of genetic components to heterosis

To further evaluate the genetic foundation of heterosis, we carried out three additional simulation experiments. In these three experiments, partial, complete and over-dominance were simulated, while the other parameters were the same as those in the first simulation experiment. In the three experiments, the power of dominant QTL detection with SCA and MPH increased as the degree of dominance increased (Table S3). When the above 2,160,000 simulated F₁ hybrids, along with their parents, were used to calculate MPH, the absolute estimates of MPH under the dominance, dd, over-dominance, complete dominance and partial dominance genetic models were 10.29, 8.45, 8.25, 5.72 and 3.25 (%), respectively (Fig. 2), indicating that the magnitude of heterosis derived from the same set of QTL ranked as dominance > dominance-by-dominance > over-dominance > complete dominance (Table S4).

Effect of F₁ hybrid proportion in NCII on association mapping

To investigate the effect of the mating design on association mapping, each maternal line was crossed with 1, 2, 3, 4, 5, 6, 7 and 15 paternal lines, so that the proportion of F₁ hybrids among all parental lines and F₁ hybrids in the mapping population increased from 33% to 88% (Fig. 3 and S1). As this proportion increased, the power of QTL detection decreased slightly for the additive and aa QTL but increased significantly for the dominance-related QTL, and the power was higher for the additive-related QTL than for the dominant and dd QTL (Fig. 3a). The decreases in detection power for the additive and aa QTL were due to the decrease of homozygotes in the mapping population. The absolute deviation decreased slightly for the additive-related effects but significantly for the dominant and dd effects as the proportion of F₁ hybrids in the mapping population increased (Fig. 3b).

Discussion

The current study is unique as compared to previous studies in the genetic dissection of heterosis. We assessed the relative importance of various genetic components of heterosis using a series of Monte Carlo simulation experiments and found that the ranking of factors affecting heterosis based on the same set of QTL was dominance > dominance-by-dominance > over-dominance > complete dominance. We used the F₁ hybrids in the NCII mating design instead of bi-parental segregating populations to dissect the genetic foundation of heterosis and identified different types of QTL contributing to trait phenotype, GCA, SCA and MPH. In this study, we also adopted a new QTL mapping model; for example, all the main and epistatic effects were included in one genetic model, which overcame the effect of background QTL on association mapping. Generally, the EBLASSO algorithm can estimate 100,000 effects in a sample of size 200. However, if the effect is too small or two QTL are closely linked, the power of association mapping is low as well; in this case, the empirical Bayesian elastic net⁴⁰ is recommended.

GCA generally consists of additive and aa effects and SCA consists of dominance-related effects. When considering GCA, our conclusion is consistent with previous reports because the additive and aa effects were correctly estimated in our model. However, the dominance and dd effects were not detected with GCA (Table S2), because the design matrix for these two genetic components was the same among different individuals. The same scenario was observed in Table S3. Although the other two dominance-related components can sometimes be detected in the genetic model of GCA, their estimates were close to zero, indicating that GCA was hardly associated with heterosis. This study observed that the aa effect was the smallest genetic component in the SCA model (Table S2) and a similar result was also reported by Bhullar et al.⁴¹, Singh et al.⁴², Cho and Scott⁴³, Qi et al.³⁶ and Qu et al.³⁴. In the MPH model, ad or da QTL were partly identified as additive QTL, so the power to detect the ad and da QTL was lower than that for the other interactions (Table S2). Although trait phenotype was the best variable in the genetic dissection of quantitative traits or heterosis, other variables were beneficial for estimating some effects; e.g., trait phenotype and GCA are recommended for detecting additive and aa interaction effects and trait phenotype and SCA for detecting dominance-related effects.

The NCII design is the most efficient genetic mating design for the analysis of combining ability⁴⁴ and has been widely adopted in maize, rice and rapeseed breeding. In the genetic dissection of quantitative traits, the base population in NCII is often derived from a bi-parental segregating population, such as BC₁F₈³⁴, recombinant inbred lines³⁵ and introgression lines^36,37. In crop breeding, however, an elite F₁ hybrid (high heterosis) is generally derived from crosses between two kinds of inbred lines in maize breeding and between sterile and restorer lines in rice and rapeseed breeding. This is why we imitated one hybrid breeding experiment in rapeseed in this study. Note that it is generally impractical to conduct all the possible crosses between the base population (a series of sterile lines) and testers; thus, only a limited number of crosses are evaluated in field experiments. To be consistent with real crop breeding programs, a portion of the NCII populations was used for analysis in this study. By comparing the results from different mating strategies, we suggest that F₁ hybrids and their parents be used if main-effect QTL need to be identified, but that only F₁ hybrids are required if epistatic QTL need to be identified.

In bi-parental segregating populations, such as F₂, no significant differences in the estimates of positions, effects and detection powers of QTL were found between the models with trait phenotype and MPH (Table S5), as MPH is a linear function of the F₁ trait phenotype. This result may be applicable to backcross, doubled haploid and recombinant inbred line populations.

The GCA model had higher power than the trait phenotype model in detecting additive and aa QTL (Fig. 1), which was confirmed by real data analysis in cotton: the A4-1 and A5-1 additive QTL detected by GCA were not detected by trait phenotype. SCA and trait phenotype had similar power in detecting dominant and dd QTL. SCA had lower power than trait phenotype and MPH had slightly lower power than SCA in detecting ad and da QTL (Fig. S2 and Table S6). The proposed method provides choices in the dissection of genetic components of heterosis and might be used further to validate the results (Tables 1 and 2). More importantly, mating designs are often adopted in crop breeding, so results obtained from a mating design could guide crop breeding.

Although a large population is recommended in current QTL mapping, sometimes a small population in crop breeding is also used to identify QTL⁴⁵. Cui et al.⁴⁵ found that a small breeding population with phenotypic selection has a high power to detect QTL. The cotton population in this study is a breeding population. In this population, each of the eight parents is a chromosome segment substitution line carrying novel alleles of various micronaire QTL. This is why apparently good results were obtained in the small cotton population in this study.

Conclusion

Main components are the additive and aa effects for GCA and dominance-related effects for SCA and MPH. The aa interaction is a small component of SCA. The ad or da interaction for MPH is partly identified as an additive effect. The real datasets from rapeseed and cotton validated our findings. The ranking of genetic components that contribute to heterosis is dominance > dominance-by-dominance > over-dominance > complete dominance. In addition, if we increase the proportion of F₁ hybrids in a partial NCII design, the power to detect dominance-related effects could be significantly increased and the power to detect additive and aa effects could be slightly reduced.

Methods

NCII mating design in Monte Carlo simulation experiments

A random set of a cultivars as maternal lines was crossed with a random set of b cultivars as paternal lines to produce F₁ hybrid combinations. When only a subset of the F₁ hybrids was analyzed, we called this a partial NCII design. In the simulation study, we imitated one hybrid breeding experiment in rapeseed, in which each maternal line (sterile line) was crossed with two paternal lines (restorer lines); thus, the subset in this study was 2a F₁ hybrids.
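The partial design described above can be sketched in a few lines. The pairing rule below (maternal line i crossed with paternal lines i, i+1, ..., wrapping around) and the name `partial_ncii` are hypothetical choices for illustration; the text only states that each maternal line was crossed with two paternal lines.

```python
def partial_ncii(n_maternal, n_paternal, crosses_per_maternal):
    """List (maternal, paternal) index pairs for a partial NCII design.

    Assumed pairing rule (not from the paper): maternal line i is crossed
    with paternal lines i, i+1, ..., taken modulo the number of paternal lines.
    """
    if crosses_per_maternal > n_paternal:
        raise ValueError("more crosses per maternal line than paternal lines")
    crosses = []
    for i in range(n_maternal):
        for step in range(crosses_per_maternal):
            crosses.append((i, (i + step) % n_paternal))
    return crosses


crosses = partial_ncii(120, 120, 2)
full_design = 120 * 120
print(len(crosses), "of", full_design, "possible F1 hybrids")
```

With a = 120 maternal lines and two crosses each, the sketch yields the 2a = 240 hybrids used in the simulations, a small fraction of the full 120 × 120 design.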

Statistical model

Genetic model

The dependent variable y_i for the ith F₁ hybrid in the NCII population can be described as

$$y_i = \mu + \sum_{k=1}^{m} \left( x_{ik} a_k + z_{ik} d_k \right) + \sum_{k=1}^{m-1} \sum_{s=k+1}^{m} \left[ x_{ik} x_{is} (aa)_{ks} + x_{ik} z_{is} (ad)_{ks} + z_{ik} x_{is} (da)_{ks} + z_{ik} z_{is} (dd)_{ks} \right] + \varepsilon_i \qquad (1)$$

where four variables are considered separately as the dependent variable, namely trait phenotype, GCA, SCA and MPH; μ is the total average; a_k and d_k are the additive and dominant effects of the kth QTL, respectively; (aa)_ks, (ad)_ks, (da)_ks and (dd)_ks are the aa, ad, da and dd interaction effects between the kth and sth QTL, respectively; m is the number of putative QTL, with one putative QTL placed at each marker; x_ik and z_ik are dummy variables whose values are determined by the genotype (QQ, Qq or qq) of the kth QTL in the ith individual; and ε_i is the normally distributed random error.

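To simplify model (1), we rewrote it in the following matrix form

$$\mathbf{y} = \mu + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (2)$$

where y is the vector of the dependent variable over all individuals, β is the vector of the main and epistatic effects of QTL and X is the design matrix of all the QTL effects.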
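As an illustration of how the design matrix in model (2) grows with the number of markers, the sketch below builds one row of X and counts its columns. The genotype coding (x = 1, 0, −1 and z = 0, 1, 0 for QQ, Qq and qq) is an assumption made for this example, and the names `design_row` and `n_effects` are hypothetical.

```python
from itertools import combinations

# Assumed coding of the additive (x) and dominance (z) dummy variables.
# This is a common convention and is NOT confirmed by the surviving text.
CODING = {"QQ": (1, 0), "Qq": (0, 1), "qq": (-1, 0)}


def design_row(genotypes):
    """Return one row of X: main effects first, then pairwise epistatic terms."""
    x = [CODING[g][0] for g in genotypes]
    z = [CODING[g][1] for g in genotypes]
    row = []
    for k in range(len(genotypes)):
        row += [x[k], z[k]]  # columns for a_k and d_k
    for k, s in combinations(range(len(genotypes)), 2):
        # columns for (aa)_ks, (ad)_ks, (da)_ks and (dd)_ks
        row += [x[k] * x[s], x[k] * z[s], z[k] * x[s], z[k] * z[s]]
    return row


def n_effects(m):
    """Number of main plus epistatic effects for m markers: 2m + 4*C(m, 2)."""
    return 2 * m + 4 * (m * (m - 1) // 2)


print(n_effects(15))   # 450 = 30 main + 420 epistatic effects
print(n_effects(205))  # 84050
```

The two counts agree with the numbers of effects reported for the cotton (15 markers) and rapeseed (205 markers) datasets, which suggests that one putative QTL per marker and all marker pairs were fitted.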

Dependent variables in genetic model

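In genetic model (1) or (2), trait phenotype, GCA, SCA and MPH serve in turn as the dependent variable y. GCA (g_i) is the mean performance of the ith parent in all its crosses with other parents and SCA (s_ij) between the ith and jth parents is the performance of their F₁ hybrid measured as the deviation from the total expected GCA of the two parents. They are described as follows:

$$g_i = \bar{F}_{i\cdot} - \bar{F}_{\cdot\cdot} \qquad (3)$$

$$s_{ij} = F_{ij} - \bar{F}_{\cdot\cdot} - g_i - g_j \qquad (4)$$

where F_ij is the phenotypic value of the F₁ hybrid between the ith and jth parents (i = 1, …, a; j = 1, …, b); \bar{F}_{i\cdot} is the mean of the F₁ hybrids derived from the ith parent; and \bar{F}_{\cdot\cdot} is the mean of all the F₁ hybrids. MPH_ij (%) refers to the superior performance (F_ij − MP_ij) of the F₁ hybrid relative to the average (MP_ij) of parental lines i and j and can be calculated as

$$\mathrm{MPH}_{ij} = \frac{F_{ij} - MP_{ij}}{MP_{ij}} \times 100\% \qquad (5)$$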
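The three derived variables can be computed from a table of hybrid values as sketched below. The sketch assumes the usual textbook definitions (GCA as the deviation of a parent's marginal mean from the grand mean, SCA as the remaining interaction, MPH relative to the mid-parent value) and a complete maternal × paternal table; the function names are hypothetical.

```python
def combining_ability(F):
    """GCA and SCA from a complete table F[i][j] (maternal i x paternal j).

    Assumes the standard definitions; a partial design would average only
    over the crosses that were actually made.
    """
    a, b = len(F), len(F[0])
    grand = sum(sum(row) for row in F) / (a * b)
    gca_maternal = [sum(F[i]) / b - grand for i in range(a)]
    gca_paternal = [sum(F[i][j] for i in range(a)) / a - grand for j in range(b)]
    sca = [[F[i][j] - grand - gca_maternal[i] - gca_paternal[j]
            for j in range(b)] for i in range(a)]
    return grand, gca_maternal, gca_paternal, sca


def mph_percent(f1, parent1, parent2):
    """Mid-parental heterosis (%) of one hybrid."""
    mid_parent = (parent1 + parent2) / 2
    return (f1 - mid_parent) / mid_parent * 100


# Toy 2 x 2 example with made-up hybrid values
grand, gca_m, gca_p, sca = combining_ability([[10, 12], [14, 20]])
print(grand, gca_m, gca_p, sca)
print(mph_percent(12, 10, 10))
```

In the toy table the GCA values within each sex sum to zero and every row and column of SCA sums to zero, which is a quick sanity check for any implementation of these definitions.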

Parameter estimation

Several methods could be applied to estimate the parameters in the model (1) or (2), such as penalized maximum likelihood⁴⁶, Bayesian LASSO^47,48, hierarchical generalized linear model^49,50, empirical Bayes⁵¹ and EBLASSO³⁹. Here, all the parameters were estimated using EBLASSO. We provide the main outline here; more details on the EBLASSO can be found in the study by Cai et al.³⁹.

Three-level hierarchical prior distributions were employed in the EBLASSO. In the first level, β_j was set up to have an independent normal distribution with a mean of zero and unknown variance σ_j²: β_j ~ N(0, σ_j²). In the second level, σ_j² followed an independent exponential distribution with a common parameter λ: p(σ_j² | λ) = λ exp(−λσ_j²). In the third level, a conjugate Gamma prior distribution, Gamma (a, b), was used for the parameter λ. In this study, a and b were determined by three-fold cross-validation. In addition, non-informative uniform priors were used for μ and the residual variance. The major steps for the algorithm are as follows:

First, and . Let , so . Let and , so . If , let .

The second step is the inner iteration. In this step, the purpose is to obtain a new . Let where ; the new candidates and can be obtained. Three criteria related to α_j and were used to determine whether is to be retained in model (6):

If is retained in model (6), . Note that μ and are fixed as constants. However, s_j and q_j need to be updated. If or , the inner iteration converges and is obtained.

The third step is the outer iteration and its purpose is to estimate and as shown below:

where , (the covariance of β), A= and (empirical Bayes estimate of β). The outer iteration converges when the two conditions are simultaneously satisfied.
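The link between the hierarchical prior and the LASSO penalty can be made explicit. As a sketch, assume the exponential prior on the variance is written with rate λ, i.e. p(σ_j² | λ) = λ exp(−λσ_j²); this parameterization is an assumption here and may differ from the one used by Cai et al. Integrating out σ_j² then gives a Laplace (double-exponential) prior on β_j for fixed λ:

```latex
\begin{align*}
p(\beta_j \mid \lambda)
  &= \int_0^\infty \frac{1}{\sqrt{2\pi\sigma_j^2}}
     \exp\!\left(-\frac{\beta_j^2}{2\sigma_j^2}\right)
     \lambda \exp\!\left(-\lambda\sigma_j^2\right) d\sigma_j^2 \\
  &= \frac{\sqrt{2\lambda}}{2}
     \exp\!\left(-\sqrt{2\lambda}\,\lvert\beta_j\rvert\right).
\end{align*}
```

The log of this density is proportional to −|β_j|, the L1 penalty of the LASSO, which is why most of the fitted effects shrink to zero and only a sparse subset is retained.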

Hypothesis test

The EBLASSO algorithm was used to select important effects from a full genetic model. When one effect was selected, its P-value in the t-test was provided as well. Here, t_j = β̂_j / √Σ_jj, where Σ_jj is the jth diagonal element of Σ. The probability threshold for declaring a significant main or epistatic effect was 0.05.

Data Analyses

Monte Carlo simulations

The purposes of the Monte Carlo simulation study were to compare four dependent variables in the genetic dissection of heterosis, to identify important components of heterosis and to investigate the effect of mating strategy on association mapping.

To compare four dependent variables in the genetic dissection of heterosis, six experiments were simulated (Table S1). In each experiment, 120 maternal lines, 120 paternal lines and all the 120 × 120 F₁ hybrids were simulated so that the GCA, SCA and MPH could be calculated. In the simulation with GCA as a dependent variable, all 240 parents were included in the mapping population. In the simulation study with trait phenotype, SCA and MPH as dependent variables, we imitated one hybrid breeding experiment in rapeseed with each maternal line (sterile line) crossed with two paternal lines (restorer lines); thus, a total of 240 F₁ hybrids were generated and viewed as a mapping population. We simulated the mapping population and genotype using the method described by Lü et al.⁵². Sixty equally spaced markers, each with two alleles of equal proportions, were simulated on three chromosome segments; the length of each segment was 95 cM. The genotypes of all the F₁ hybrids were then deduced from the simulated parental genotypes. In each experiment, the simulated data had six QTL, two at each of the heritabilities 0.05, 0.10 and 0.15, and each QTL had two alleles of the same frequency. Based on these heritabilities and the residual variance, the total genetic variance was calculated and then partitioned among the QTL. The QTL effect was determined by its genetic variance and allelic frequency. Six QTL with main effects in the first and second experiments were placed on marker positions 25 (chr. 1), 75 (chr. 1), 135 (chr. 2), 175 (chr. 2), 220 (chr. 3) and 270 cM (chr. 3), respectively; and six epistatic QTL in the third to sixth experiments were located on marker pairs at 20 & 60, 90 & 125, 155 & 205, 180 & 235, 40 & 275 and 75 & 220 cM, respectively. One type of QTL effect was assigned to all six QTL in each experiment, so that additive, dominant, aa, ad, da and dd effects were assigned to the first to sixth experiments, respectively (Table S1).
Each simulation consisted of 1,000 replications. For each simulated QTL, we counted the number of samples in which the P-value < 0.05 and its ratio to the total number of replications (1,000) to represent the empirical power of this QTL.
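The empirical power defined above is simply the fraction of replicates in which the simulated QTL reaches significance. A minimal sketch, with a hypothetical function name and made-up P-values:

```python
def empirical_power(p_values, alpha=0.05):
    """Fraction of replicates in which the simulated QTL has P-value < alpha."""
    if not p_values:
        raise ValueError("at least one replicate is required")
    detected = sum(1 for p in p_values if p < alpha)
    return detected / len(p_values)


# Four illustrative replicates; the paper used 1,000 per simulated QTL
print(empirical_power([0.01, 0.20, 0.049, 0.05]))  # 0.5
```

Note that the comparison is strict, so a replicate with a P-value of exactly 0.05 is not counted as a detection.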

To identify important components of heterosis, three additional experiments with partial, complete and over-dominance of QTL were conducted. The other parameters in the three experiments were similar to those used in cases 1–6 listed in Table S1. In these nine experiments, all the F₁ individuals along with their parents were used to calculate the MPH and the relative sizes of MPH were used to measure the contribution of the genetic components to heterosis.

To investigate the effect of mating strategy on QTL mapping, eight simulation experiments were carried out by allowing one maternal line to be crossed with 1, 2, 3, 4, 5, 6, 7 and 15 paternal lines. To ensure a stable sample size, the mapping populations in the eight experiments were 80 (maternal) + 80 (F₁: the ith maternal line (M_i) × the ith paternal line (P_i)) + 80 (paternal); 60 (maternal) + 60 × 2 (F₁: M_i × P_i and one further cross per maternal line, in which the paternal line was changed into P₅₉ in a special case) + 60 (paternal); 48 (maternal) + 48 × 3 (F₁) + 48 (paternal); 40 (maternal) + 40 × 4 (F₁) + 40 (paternal); 34 (maternal) + (34 × 5 + 2) (F₁, including 2 additional F₁ hybrids) + 34 (paternal); 30 (maternal) + 30 × 6 (F₁) + 30 (paternal); 26 (maternal) + (26 × 7 + 6) (F₁, including 6 additional F₁ hybrids, the last being M₂₆ × P₂₀) + 26 (paternal); and 15 (maternal) + 15 × 15 (F₁) + 15 (paternal), respectively (Fig. S1). For the efficiency of simulation, twenty-one equally spaced markers, each with two alleles of equal frequency, were simulated on one chromosome with a total length of 100 cM. In each experiment, six QTL with a heritability of 0.05 were simulated; and each QTL had only one type of effect. An additive (dominant) QTL was located at marker position 20 (85) cM; the aa, ad, da and dd interaction QTL were located between marker pairs 10 & 30, 40 & 55, 45 & 80 and 65 & 95 cM, respectively. The other parameters were the same as those in the first simulation experiment (Table S1).
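The population sizes listed above determine the proportion of F₁ hybrids in each mapping population. The short sketch below recomputes these proportions from the stated counts; the dictionary layout and function name are illustrative only.

```python
# paternal lines per maternal line -> (maternal, F1 hybrids, paternal)
DESIGNS = {
    1: (80, 80, 80),
    2: (60, 60 * 2, 60),
    3: (48, 48 * 3, 48),
    4: (40, 40 * 4, 40),
    5: (34, 34 * 5 + 2, 34),
    6: (30, 30 * 6, 30),
    7: (26, 26 * 7 + 6, 26),
    15: (15, 15 * 15, 15),
}


def f1_proportion(maternal, f1, paternal):
    """Share of F1 hybrids among all individuals in the mapping population."""
    return f1 / (maternal + f1 + paternal)


for n_paternal, (mat, f1, pat) in DESIGNS.items():
    total = mat + f1 + pat
    print(n_paternal, total, round(100 * f1_proportion(mat, f1, pat), 1))
```

Seven of the eight populations contain 240 individuals and the last contains 255, and the F₁ proportion rises from about 33% to about 88%, matching the range quoted in the Results.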

Real datasets analyzed

A cotton dataset provided by Dr. Tianzhen Zhang’s group at Nanjing Agricultural University, China was used for the demonstration. The dataset contained phenotypes of micronaire (a fibre characteristic) from 8 parents and their 28 F₁ hybrids which were grown at two locations: Xinjiang and Jiangsu provinces, China. All the eight parents were chromosome segment substitution lines and bred from the crosses of TM-1 and cultivars with novel alleles of various micronaire QTL. Among these parents, there were fifteen chromosome substituted segments, which were located on 9 chromosomes and identified by 15 SSR markers. In the genetic model, 30 main effects, one environmental effect, 420 epistasis effects, 30 QTL-by-environment effects and 420 epistasis-by-environment effects were considered.

A rapeseed (Brassica napus) dataset provided by Dr. Jinxing Tu’s group at Huazhong Agricultural University, China was also used for further demonstration. The data for length of main raceme were collected from 298 sterile lines, 143 restorer lines (restoring fertility of the F₁ hybrid from male sterile line) and 284 F₁ hybrids at Huazhong Agricultural University in 2010. A total of 205 SSR primer pairs were used to screen for polymorphisms among all the 441 parents and the genotypes of all the F₁ hybrids were deduced from their parents. The total number of effects included in the genetic model was 84,050, i.e. 2 × 205 = 410 main effects plus 4 × (205 × 204/2) = 83,640 epistatic effects.

All the parameters were estimated by EBLASSO³⁹. In real data analyses, the best estimates for parameters a and b in the Gamma (a, b) distribution were determined from three-, five- and ten-fold cross-validations. The software (GAS_NCII) is available. The critical value of the P-value for statistical significance was set to 0.05. Q matrix was calculated using Structure 2.3.4 (http://pritchardlab.stanford.edu/structure.html) and incorporated into the genetic model of association mapping in real data analysis.

LD score regression

Bulik-Sullivan et al.⁵³ proposed linkage disequilibrium (LD) score regression to distinguish between inflation from a true polygenic signal and population stratification bias for a binary trait. In the regression of the χ² statistic between the jth marker and the binary trait on the LD score (ℓ_j = Σ_k r_jk², k = 1, …, m, where r_jk is the correlation coefficient between the jth and kth markers and m is the number of markers), a significant difference between the regression intercept estimate and one indicates a significant effect of population structure on association mapping. If the trait under consideration is continuous, extremely large (35%) and small (35%) values are converted into 1 and 0 (binary), respectively, and only 70% of individuals are used in the LD score regression.
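Two parts of this procedure are easy to sketch: converting a continuous trait into a binary one by keeping the two 35% tails, and testing the regression intercept against one. The function names are hypothetical, the tie handling in `dichotomize` is an arbitrary choice, and the t statistic is assumed to be (estimate − 1) / standard error.

```python
def dichotomize(values, tail_percent=35):
    """Map the smallest tail to 0 and the largest tail to 1; drop the middle.

    Returns a dict {index: label} for the retained individuals only.
    """
    n_tail = len(values) * tail_percent // 100
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = {}
    for i in order[:n_tail]:
        labels[i] = 0
    for i in order[len(order) - n_tail:]:
        labels[i] = 1
    return labels


def intercept_t(estimate, std_error, null_value=1.0):
    """t statistic for H0: regression intercept equals null_value."""
    return (estimate - null_value) / std_error


# Intercept estimates and standard errors reported in the Results
reported = [("cotton, Xinjiang", -6.05, 3.22),
            ("cotton, Jiangsu", -4.83, 3.30),
            ("rapeseed", -3.46, 1.03)]
for label, estimate, se in reported:
    print(label, round(intercept_t(estimate, se), 2))
```

With the rounded estimates given in the Results, this reproduces the reported t statistics for cotton (−2.19 and −1.77) and gives about −4.33 for rapeseed, close to the reported −4.31; the small gap is presumably due to rounding of the published estimates.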

Additional Information

How to cite this article: Wen, J. et al. Genetic dissection of heterosis using epistatic association mapping in a partial NCII mating design. Sci. Rep. 5, 18376; doi: 10.1038/srep18376 (2015).

References

Darwin, C. R. The effects of cross- and self-fertilization in the vegetable kingdom. (London: John Murray, 1876).
Comings, D. E. & MacMurray, J. P. Molecular heterosis: a review. Mol. Genet. Metab. 71, 19–31 (2000).
Article CAS Google Scholar
Chen, Z. J. Genomic and epigenetic insights into the molecular bases of heterosis. Nat. Rev. Genet. 14, 471–482 (2013).
Article CAS Google Scholar
Shull, G. H. The composition of a field of maize. J. Hered. 4, 296–301 (1908).
Article Google Scholar
Virmani, S. S. Prospects of hybrid rice in tropics and sub-tropics in Hybrid rice technology: new developments and future prospects (ed. Virmani, S. S. ) 7–20 (International Rice Research Institute Philippines, 1994).
Budak, H., Cesurer, L., Bölek, Y., Dokuyucu, T. & Akkaya, A. Understanding of heterosis. KSU J. Science and Engineering 5, 68–75 (2002).
Google Scholar
Duvick, D. N. Heterosis: feeding people and protecting natural resources in The genetics and exploitation of heterosis in crops (ed. Coors, J. G. & Pandey, S. ) 19–29 (Madison, 1999).
Sanghera, G. S. et al. The magic of heterosis: new tools and complexities. Nat. Sci. 9, 42–53 (2011).
Google Scholar
Goff, A. S. & Zhang, Q. F. Heterosis in elite hybrid rice: speculation on the genetic and biochemical mechanisms. Curr. Opin. Plant Biol. 16, 221–227 (2013).
Article CAS Google Scholar
Davenport, C. B. Degeneration, albinism and inbreeding. Science 28, 454–455 (1908).
Article CAS ADS Google Scholar
Bruce, A. B. The Mendelian theory of heredity and the augmentation of vigor. Science 32, 627–628 (1910).
Jones, D. F. Dominance of linked factors as a means of accounting for heterosis. Genetics 2, 466–479 (1917).
East, E. M. Heterosis. Genetics 21, 375–397 (1936).
Powers, L. An expansion of Jones’s theory for the explanation of heterosis. Am. Nat. 78, 275–280 (1944).
Williams, W. Heterosis and the genetics of complex characters. Nature 184, 527–530 (1959).
Xiao, J., Li, J., Yuan, L. & Tanksley, S. D. Dominance is the major genetic basis of heterosis in rice as revealed by QTL analysis using molecular markers. Genetics 140, 745–754 (1995).
Radoev, M., Becker, H. C. & Ecke, W. Genetic analysis of heterosis for yield and yield components in rapeseed (Brassica napus L.) by quantitative trait locus mapping. Genetics 179, 1547–1558 (2008).
Stuber, C. W., Lincoln, S. E., Wolff, D. W., Helentjaris, T. & Lander, E. S. Identification of genetic factors contributing to heterosis in a hybrid from two elite maize inbred lines using molecular markers. Genetics 132, 823–839 (1992).
Li, Z. K. et al. Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. I. Biomass and grain yield. Genetics 158, 1737–1753 (2001).
Luo, L. J. et al. Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. II. Grain yield components. Genetics 158, 1755–1771 (2001).
Lu, H., Romero-Severson, J. & Bernardo, R. Genetic basis of heterosis explored by simple sequence repeat markers in a random-mated maize population. Theor. Appl. Genet. 107, 494–502 (2003).
Yu, S. B. et al. Importance of epistasis as the genetic basis of heterosis in an elite rice hybrid. Proc. Natl. Acad. Sci. USA 94, 9226–9231 (1997).
Hua, J. P. et al. Single-locus heterotic effects and dominance-by-dominance interactions can adequately explain the genetic basis of heterosis in an elite rice hybrid. Proc. Natl. Acad. Sci. USA 100, 2574–2579 (2003).
Zhou, G. et al. Genetic composition of yield heterosis in an elite rice hybrid. Proc. Natl. Acad. Sci. USA 109, 15847–15852 (2012).
Hua, J. P. et al. Genetic dissection of an elite rice hybrid revealed that heterozygotes are not always advantageous for performance. Genetics 162, 1885–1895 (2002).
Yuan, Q. Q., Deng, Z. Y., Peng, T. & Tian, J. C. QTL-based analysis of heterosis for number of grains per spike in wheat using DH and immortalized F2 populations. Euphytica 188, 387–395 (2012).
Kusterer, B. et al. Heterosis for biomass-related traits in Arabidopsis investigated by quantitative trait loci analysis of the triple testcross design with recombinant inbred lines. Genetics 177, 1839–1850 (2007).
Melchinger, A. E. et al. Genetic basis of heterosis for growth related traits in Arabidopsis investigated by testcross progenies of near-isogenic lines reveals a significant role of epistasis. Genetics 177, 1827–1837 (2007).
Garcia, A. A. F., Wang, S., Melchinger, A. E. & Zeng, Z. B. Quantitative trait loci mapping and the genetic basis of heterosis in maize and rice. Genetics 180, 1707–1724 (2008).
Li, L. Z. et al. Dominance, overdominance and epistasis condition the heterosis in two heterotic rice hybrids. Genetics 180, 1725–1742 (2008).
Reif, J. C. et al. Unraveling epistasis with triple testcross progenies of near-isogenic lines. Genetics 181, 247–251 (2009).
He, X. H. & Zhang, Y. M. A complete solution for dissecting pure main and epistatic effects of QTL in triple testcross design. PLoS ONE 6, e24575 (2011).
He, X. H., Hu, Z. L. & Zhang, Y. M. Genome-wide mapping of QTL associated with heterosis in the RIL-based NCIII design. Chinese Sci. Bull. 57, 2655–2665 (2012).
Qu, Z. et al. QTL mapping of combining ability and heterosis of agronomic traits in rice backcross recombinant inbred lines and hybrid crosses. PLoS ONE 7, e28463 (2012).
Hu, W. M., Xu, Y., Zhang, E. Y. & Xu, C. W. Study on the genetic basis of general combining ability with QTL mapping strategy. Scientia Agricultura Sinica 46, 3305–3313 (2013) (in Chinese).
Qi, H. H. et al. Identification of combining ability loci for five yield-related traits in maize using a set of testcrosses with introgression lines. Theor. Appl. Genet. 126, 369–377 (2013).
Huang, J. et al. General combining ability of most yield-related traits had a genetic basis different from their corresponding traits per se in a set of maize introgression lines. Genetica 141, 453–461 (2013).
Li, L. Z. et al. QTL mapping for combining ability in different population-based NCII designs by a simulation study. J. Genet. 92, 529–543 (2013).
Cai, X., Huang, A. & Xu, S. Fast empirical Bayesian LASSO for multiple quantitative trait locus mapping. BMC Bioinformatics 12, 211 (2011).
Huang, A. H., Xu, S. & Cai, X. Empirical Bayesian elastic net for multiple quantitative trait locus mapping. Heredity 114, 107–115 (2015).
Bhullar, K. S., Gill, K. S. & Khehra, A. S. Combining ability analysis over F1-F5 generations in diallel crosses of bread wheat. Theor. Appl. Genet. 55, 77–80 (1979).
Singh, O., Gowda, C. L. L., Sethi, S. C., Dasgupta, T. & Smithson, J. B. Genetic analysis of agronomic characters in chickpea. I. Estimates of genetic variances from diallel designs. Theor. Appl. Genet. 83, 956–962 (1992).
Cho, Y. K. & Scott, R. A. Combining ability of seed vigor and seed yield in soybean. Euphytica 112, 145–150 (2000).
Shukla, S. K. & Pandey, M. P. Combining ability and heterosis over environments for yield and yield components in two-line hybrids involving thermosensitive genic male sterile lines in rice (Oryza sativa L.). Plant Breeding 127, 28–32 (2008).
Cui, Y., Zhang, F., Xu, J., Li, Z. & Xu, S. Mapping quantitative trait loci in selected breeding populations: A segregation distortion approach. Heredity, online, 10.1038/hdy.2015.56 (2015).
Zhang, Y. M. & Xu, S. A penalized maximum likelihood method for estimating epistatic effects of QTL. Heredity 95, 96–104 (2005).
Yi, N. & Xu, S. Bayesian Lasso for quantitative trait loci mapping. Genetics 179, 1045–1055 (2008).
Park, T. & Casella, G. The Bayesian Lasso. J. Am. Stat. Assoc. 103, 681–686 (2008).
Yi, N. & Banerjee, S. Hierarchical generalized linear models for multiple quantitative trait locus mapping. Genetics 181, 1101–1113 (2009).
Feng, J. Y. et al. An efficient hierarchical generalized linear mixed model for mapping QTL of ordinal traits in crop cultivars. PLoS ONE 8, e59541 (2013).
Xu, S. An expectation–maximization algorithm for the Lasso estimation of quantitative trait locus effects. Heredity 105, 483–494 (2010).
Lü, H. Y., Liu, X. F., Wei, S. P. & Zhang, Y. M. Epistatic association mapping in homozygous crop cultivars. PLoS ONE 6, e17773 (2011).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

Acknowledgements

We thank Dr. Nengjun Yi, Department of Biostatistics, University of Alabama, Birmingham, for improving the language of this manuscript, and Dr. Huang Anhui, Department of Electrical and Computer Engineering, University of Miami, for providing a new version of the EBLASSO software. This work was supported by the National Natural Science Foundation of China (grant 31571268) and the Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (Program No. 2014RC020).

Author information

Authors and Affiliations

College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
Jia Wen, Xinwang Zhao, Jinxing Tu & Yuan-Ming Zhang
State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, 210095, China
Jia Wen, Guorong Wu, Dan Xiang, Qing Liu, Su-Hong Bu, Can Yi & Tianzhen Zhang
United States Department of Agriculture, Soybean Genomics and Improvement Laboratory, Agricultural Research Service, Maryland, 20705, USA
Qijian Song
School of Agriculture, Policy and Development, University of Reading, Reading, RG6 6AS, United Kingdom
Jim M. Dunwell

Contributions

Y.-M.Z. designed the project. J.W., Q.L., S.-H.B. and Y.-M.Z. performed the experiments and analyzed the data. C.Y. developed software (GAS_NCII). X.Z., G.W., D.X., T.Z. and J.T. conducted rapeseed and cotton experiments. J.W. prepared figures and tables. Y.-M.Z., J.W., Q.S. and J.M.D. wrote the manuscript text. All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Wen, J., Zhao, X., Wu, G. et al. Genetic dissection of heterosis using epistatic association mapping in a partial NCII mating design. Sci Rep 5, 18376 (2016). https://doi.org/10.1038/srep18376

Received: 09 September 2015
Accepted: 17 November 2015
Published: 17 December 2015
DOI: https://doi.org/10.1038/srep18376
