Abstract
The genetic control of 22 quantitative traits, including developmental rates and sizes, was examined in generations of Arabidopsis thaliana derived from the cross between the ecotypes, Columbia (Col) and Landsberg erecta (Ler). The data were obtained from three sets of families raised in the same trial: the 16 basic generations, that is, parents, F1's, F2's, backcrosses, recombinant inbred lines (RILs) and a triple test cross (TTC), the latter produced by crossing the RILs to Col, Ler and their F1. The data were analysed by two approaches. The first (approach A) involved traditional generation mean and variance component analysis and the second (B), based around the RILs and TTC families, involved marker-based QTL analysis.
From (A), genetic differences between Col and Ler were detected for all traits with moderate heritabilities. Height at flowering was the only trait to show heterosis. Dominance was partial to complete for all height traits, and there was no overdominance but there was strong evidence for directional dominance. For most other traits, dominance was ambidirectional and incomplete, with average dominance ratios of around 80%. Epistasis, particularly of the duplicate type that opposes dominance, was a common feature of all traits. The presence of epistasis must imply multiple QTL for all traits.
The QTL analysis located 38 significant effects in four regions of chromosomes I, II, IV and V, but not III. QTL affecting rosette size and leaf number were identified in all four regions, with days to maturity on chromosomes IV and V. The only QTL for height was located at the expected position of the erecta gene (chromosome II; 50 cM), but the additive and dominance effects of this single QTL did not adequately explain the generation means. The possible involvement of other interacting height QTL is discussed.
Introduction
In recent years, the analysis of quantitative traits has largely centred on the use of molecular markers to locate and measure the effects of the individual underlying genes, or quantitative trait loci (QTL) (Kearsey and Farquhar, 1998; Tanksley, 1993). In some situations, it has also been possible to examine epistatic interactions and genotype × environment interaction involving QTL (Monforte et al, 2001). These approaches have thus provided a very powerful tool for the genetical analysis of quantitative traits and have been a major impetus to quantitative genetical research and breeding.
There are, however, limitations to these approaches. Firstly, the accuracy of chromosomal location of QTL is often low, and the analyses tend to detect only those QTL that have large effects. Secondly, the size of the estimated QTL effects will be biased because only those estimates that exceed some given threshold of significance are reported (Utz et al, 2000). Concentration on the proportion of the variation explained by individual QTL tends to encourage acceptance of a low number of genes, and many studies have emphasised how few QTL are necessary to explain the observed genetic variation (Doganlar et al, 2000). However, this can often be misleading. If, for example, a trait is controlled by one QTL with an effect of nine units and 19 QTL each with individual effects of one unit, the former will account for 81% of the total variation but only 32% of the combined gene effects. Thus, the contribution to the variance is relevant to the response to selection per generation but not to the genetic potential and the ultimate limits to selection.
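The arithmetic behind the 81% and 32% figures can be checked with a short sketch. It assumes, for illustration only, that the 20 QTL are unlinked and purely additive, so that each contributes to the variance in proportion to the square of its effect; the variable names are not from the paper.

```python
# Hypothetical illustration of the example in the text: one QTL of effect 9
# and 19 QTL of effect 1 (arbitrary units), assumed unlinked and additive.
effects = [9.0] + [1.0] * 19

# Each QTL's contribution to the genetic variance is proportional to a**2.
variance_share = effects[0] ** 2 / sum(a ** 2 for a in effects)

# Its contribution to the summed gene effects is proportional to a itself.
effect_share = effects[0] / sum(effects)

print(f"share of variance:     {variance_share:.0%}")
print(f"share of gene effects: {effect_share:.0%}")
```

The first ratio is 81/100 and the second is 9/28, which is why a single large QTL can dominate the variance while contributing less than a third of the total genetic potential.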
Conventional biometrical genetical procedures, on the other hand, deal with the combined effects of all the QTL on the means, variances and correlations of traits. These procedures, mainly developed and improved during the second half of the last century, use statistical relations among relatives to infer the nature of the additive, dominance and epistatic effects of the genes. The estimates obtained are automatically weighted by the sizes of the gene effects and their linkage relations. However, they tell us nothing about the location or effects of the individual QTL (Kearsey and Pooni, 1996; Lynch and Walsh, 1998).
The experiments reported in the present paper were designed to be analysed by a combination of conventional biometrical and current molecular methods. The intention was to reconcile and integrate the conclusions from the two approaches in the hope of gaining a more complete picture of the underlying genetical control of several traits. They are based on the well-analysed lines of Arabidopsis thaliana, Columbia and Landsberg, and generations derived from them.
Materials and methods
Plant material
The inbred strains of A. thaliana, Columbia (Col) and Landsberg (Ler), derived from seed obtained from the Nottingham Arabidopsis Stock Centre (NASC, UK), were used as source material. The 16 basic generations (BGs) shown in Table 1, which include both parents and all possible reciprocal F1's, F2's and first backcrosses (Bc1 and Bc2) derived from them, were produced by hand emasculation and pollination. The 101 recombinant inbred lines (RILs) (Lister and Dean, 1993) derived from Ler × Col were each selfed and also crossed as seed parent to Ler, Col and their F1 in a triple test cross design (Kearsey and Jinks, 1968), but sufficient seeds were obtained for all four families from only 85 of the RILs.
A total of 20 seeds from each basic generation family and five seeds from each RIL or TTC family were sown in separate pots containing JI compost and the 2020 pots were immediately, individually randomised in a single block in an unheated polytunnel in long (16 h) days. A total of 22 traits, representing plant size and development, were scored over the following 2 months and these are listed in Table 2.
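As a consistency check on the design, the pot total can be reproduced from the family counts given above. The sketch assumes one family for each of the 16 basic generations and that only the 85 RILs with seed for all four families (selfed RIL plus the three test crosses) were sown; both are readings of the text rather than stated facts.

```python
# Counts taken from the text; the grouping into variables is illustrative.
basic_generation_families = 16
seeds_per_basic_family = 20
rils_with_all_families = 85
families_per_ril = 4        # selfed RIL plus crosses to Col, Ler and the F1
seeds_per_ril_family = 5

total_pots = (basic_generation_families * seeds_per_basic_family
              + rils_with_all_families * families_per_ril * seeds_per_ril_family)
print(total_pots)
```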
Genetical analysis
The means and within-family variances of each of the 16 basic generations were calculated for all traits. Models that included standard genetical and cytoplasmic effects (Table 1) were fitted to the means, the parameters being estimated by weighted least squares using the reciprocal of the variance of each family mean as the weight. The genetical parameters were as follows: additive [a], dominance [d] and epistatic additive by additive [aa], additive by dominance [ad] and dominance by dominance [dd] (see Kearsey and Pooni (1996) for definitions and estimation procedures). For consistency, [a] has been defined as ½(Ler−Col), and so will be negative when Columbia has the higher score. The nongenetical, cytoplasmic or maternal effects tested were: inbred vs F1 mothers, Col vs Ler mothers and F1 vs reciprocal F1 mothers. The simplest model was accepted for which all the parameters were individually significant and the χ², testing the adequacy of the model, was nonsignificant. The genetical and environmental variance parameters, VG and VE, were estimated from the within-family variances of the F2 and nonsegregating generations (Kearsey and Pooni, 1996).
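The model-fitting step can be sketched as follows. This is a minimal illustration, not the authors' program: it fits only m, [a] and [d] to six generations, takes Bc1 and Bc2 to be the backcrosses to Col and Ler respectively (an assumption, since Table 1 is not reproduced here), and uses invented generation means. The weights are the reciprocals of the variances of the family means, and the weighted residual sum of squares serves as the χ² for adequacy of the model.

```python
# Coefficients of m, [a], [d] for six basic generations, with [a] defined as
# 1/2(Ler - Col). The Bc1/Bc2 assignment is an assumption of this sketch.
MODEL = {
    "Col": (1.0, -1.0, 0.0),
    "Ler": (1.0, 1.0, 0.0),
    "F1": (1.0, 0.0, 1.0),
    "F2": (1.0, 0.0, 0.5),
    "Bc1": (1.0, -0.5, 0.5),
    "Bc2": (1.0, 0.5, 0.5),
}


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_generation_means(means, var_of_means):
    """Weighted least squares fit; returns ([m, a, d], chi2, df)."""
    gens = list(means)
    X = [MODEL[g] for g in gens]
    y = [means[g] for g in gens]
    w = [1.0 / var_of_means[g] for g in gens]  # weight = 1 / variance of mean
    p, n = 3, len(gens)
    # Normal equations: (X'WX) beta = X'Wy
    XtWX = [[sum(w[i] * X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
            for j in range(p)]
    XtWy = [sum(w[i] * X[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(XtWX, XtWy)
    expected = [sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    chi2 = sum(w[i] * (y[i] - expected[i]) ** 2 for i in range(n))
    return beta, chi2, n - p


# Invented means generated from m = 100, [a] = -10, [d] = 6 (no epistasis).
means = {"Col": 110.0, "Ler": 90.0, "F1": 106.0,
         "F2": 103.0, "Bc1": 108.0, "Bc2": 98.0}
var_of_means = {"Col": 1.0, "Ler": 1.0, "F1": 1.0,
                "F2": 2.0, "Bc1": 1.5, "Bc2": 1.5}
(m, a, d), chi2, df = fit_generation_means(means, var_of_means)
print(round(m, 3), round(a, 3), round(d, 3), round(chi2, 6), df)
```

With real data the χ² would be compared with its tabulated value for the residual degrees of freedom, and epistatic or maternal columns would be added to the model only if the simpler model failed.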
For each trait, the subset of RILs (65) for which the selfed families and the three TTC families had complete progeny sets were chosen, but a few individual plants were inevitably lost due to pests or disease. We refer to the crosses of the n RILs to Col, Ler and the F1 as ‘L1i’, ‘L2i’ and ‘L3i’ (i=1–n), respectively. The RIL families were analysed by one-way ANOVA, while the TTC families were subjected to standard ANOVAs (Table 3) to test for additive (L1i+L2i), dominance (L2i−L1i) and epistatic variation (L1i+L2i−2L3i) as described by Kearsey and Pooni (1996) and Kearsey and Jinks (1968). These ANOVAs will be illustrated in the ‘Results’ section. Phenotypic (VP), additive (VA), dominance (VD) and environmental (VE) components of variance were estimated from these ANOVAs and used to calculate narrow (VA/VP) and broad ((VA+VD)/VP) sense heritabilities and dominance ratios, √(2VD/VA) (see Kearsey and Pooni, 1996).
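The three TTC comparisons and the summary ratios defined above can be written out as a small sketch. The function names and the numerical values are hypothetical, and the phenotypic variance is taken as VA + VD + VE purely for illustration; the paper estimates the components from the ANOVAs in Table 3.

```python
import math


def ttc_comparisons(l1, l2, l3):
    """Comparisons for one RIL from the means of its crosses to
    Col (l1), Ler (l2) and the F1 (l3)."""
    return {
        "additive": l1 + l2,
        "dominance": l2 - l1,
        "epistasis": l1 + l2 - 2.0 * l3,
    }


def summary_ratios(va, vd, ve):
    """Heritabilities and dominance ratio from variance components.
    Assumes VP = VA + VD + VE (an assumption of this sketch)."""
    vp = va + vd + ve
    return {
        "narrow_h2": va / vp,
        "broad_h2": (va + vd) / vp,
        "dominance_ratio": math.sqrt(2.0 * vd / va),
    }


# Hypothetical family means: with no epistasis, L3 lies midway between
# L1 and L2, so the epistatic comparison is zero.
comparisons = ttc_comparisons(l1=12.0, l2=8.0, l3=10.0)
ratios = summary_ratios(va=8.0, vd=4.0, ve=28.0)
print(comparisons)
print(ratios)
```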
QTL analysis was carried out for every trait for which the earlier ANOVAs (RILs and TTC) had indicated significant variation between families. The method of Kearsey and Hyne (1994) (also known as marker difference regression, MDR; Lynch and Walsh, 1998) was employed, using the ‘QTL Café’ (http://web.bham.ac.uk/g.g.seaton/). Significance testing and confidence intervals were obtained by simulation. The analyses were carried out on three independent sets of data: (i) the means of the RILs; (ii) the means of the TTC (L1i+L2i) values; and (iii) the means of the TTC (L2i−L1i) values. The first two identify QTL on the basis of their additive and the third on their dominance genetic effects. The software provides estimates of QTL effects, locations and their respective confidence intervals. A total of 65 markers were used for the QTL analysis. They were chosen from the published genotype data at NASC to be evenly spaced across the five chromosomes and markers for which the genotypes for most RILs were known.
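The idea behind marker regression can be illustrated with a simplified sketch; it is not the QTL Café implementation. It assumes a single QTL per chromosome, the Haldane mapping function, and a regression through the origin of the effect estimated at each marker on (1 − 2R), where R is the recombination frequency between that marker and a putative QTL position. The extra recombination accumulated in RILs and the simulation-based significance tests are ignored, and the marker positions and effects are invented.

```python
import math


def expected_fraction(distance_cm):
    """1 - 2R for a marker distance_cm from the QTL (Haldane function)."""
    return math.exp(-2.0 * abs(distance_cm) / 100.0)


def marker_regression(marker_pos, marker_effects, step=1.0):
    """Scan putative QTL positions along the chromosome. At each position,
    regress the marker effects on (1 - 2R) through the origin and keep the
    position with the smallest residual sum of squares.
    Returns (position_cM, estimated_effect, residual_ss)."""
    start = min(marker_pos)
    n_steps = int(round((max(marker_pos) - start) / step))
    best = None
    for k in range(n_steps + 1):
        pos = start + k * step
        x = [expected_fraction(m - pos) for m in marker_pos]
        slope = (sum(xi * yi for xi, yi in zip(x, marker_effects))
                 / sum(xi * xi for xi in x))
        rss = sum((yi - slope * xi) ** 2
                  for xi, yi in zip(x, marker_effects))
        if best is None or rss < best[2]:
            best = (pos, slope, rss)
    return best


# Invented, noise-free marker effects generated by a QTL of effect -5
# (Col allele increasing) at 50 cM, with markers every 10 cM.
markers = [10.0 * i for i in range(11)]
true_pos, true_effect = 50.0, -5.0
effects = [true_effect * expected_fraction(m - true_pos) for m in markers]

position, effect, rss = marker_regression(markers, effects)
print(position, round(effect, 3))
```

With real data the residual sum of squares at the best position also provides a test of whether one QTL is sufficient to explain the marker effects on that chromosome.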
Results
A summary of the significant parameters for the 22 traits is presented in Table 2, together with the observed means of the two parental lines and their F1. In all cases, fairly simple models could be fitted to the data resulting in a nonsignificant χ², and all but two traits showed significant genetic effects. The two exceptions, both scored at flowering, were cauline leaves and bud number. Additive and/or dominance effects exist for 19 traits and epistasis for 15 traits, so epistasis is a common feature and must imply two or more QTL for these traits. Where it exists, dominance is for early maturity and greater size. It can be seen that the F1 mean is generally intermediate between the two parents; better-parent heterosis only occurs for “height at flowering” although it is not significant (t=0.83 for 75 df; P>0.3). Maternal effects were a minor component of variation and were detected in only two cases: a difference between the progeny of crosses with reciprocal F1 mothers for “days to maturity” and between crosses with Col vs Ler mothers for “rosette size at 36 days”. These effects were allowed for as parameters in model fitting and the df for the χ² are consequently reduced by 1.
Figure 1 shows the distribution of the RIL means for a sample of traits scored either at a fixed time or at a set physiological age, namely at flowering. It can be seen, by comparing these distributions with the confidence intervals of the parental lines, that there is considerable transgressive segregation in all cases, indicating that increasing alleles are dispersed among the parents. This, again, implies two or more QTL for such traits. Where there are differences between the parental lines, Col is always the faster developer and has the larger size.
Table 3 illustrates the basic ANOVAs of the RILs and TTC families. Given the completely randomised design, the RILs analysis involves a simple one-way ANOVA with the expected mean squares and parameter estimates as shown. The coefficients of the expected mean squares are not whole numbers because of missing data. They have been calculated, following Sokal and Rohlf (1981), as n₀ = {1/(a−1)}{Σnᵢ − (Σnᵢ²/Σnᵢ)}, where nᵢ is the size of the ith family and a is the number of families. The TTC families were analysed as two separate ANOVAs. The first follows the standard NCIII design (Kearsey and Pooni, 1996) to detect additive and dominance effects from the crosses of the RILs to the two parents. The second is based on an analysis of the comparison (L1i+L2i−2L3i), where L1i, L2i and L3i represent the means of the families derived from crossing RILi to Col, Ler and the F1, respectively. This comparison is solely a function of epistasis and should be zero for all ‘i’ in the absence of epistasis. The correction term in the ANOVA tests for additive-by-additive epistasis, while the variation among RILs is due to additive-by-dominance and dominance-by-dominance epistasis (Kearsey and Jinks, 1968; Kearsey and Pooni, 1996). In the example shown in Table 3, only the latter is significant. When epistasis is present, estimates of the additive (VA) and dominance (VD) components from the NCIII ANOVA are inflated, but it is not easy to correct for this. The estimates given for VA and VD, therefore, must be accepted as being biased.
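The coefficient for unequal family sizes is easy to compute directly. The sketch below implements the Sokal and Rohlf expression quoted above; the function name and the family sizes are hypothetical.

```python
def average_family_size(sizes):
    """n0 = {1/(a-1)} * {sum(n_i) - sum(n_i**2)/sum(n_i)} for a families."""
    a = len(sizes)
    total = sum(sizes)
    return (total - sum(n * n for n in sizes) / total) / (a - 1)


# With equal family sizes n0 is simply the family size.
balanced = average_family_size([5, 5, 5, 5])

# With missing plants n0 falls slightly below the arithmetic mean (4.25 here).
unbalanced = average_family_size([5, 4, 5, 3])
print(balanced, round(unbalanced, 3))
```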
The estimates of VA and VD and tests for epistasis from the analyses illustrated in Table 3 are shown for all traits in Table 4. VA and/or VD are significant for all traits while epistasis was detected for 14 traits. Using these estimates plus those from the basic generations, heritabilities and average dominance ratios, √(2VD/VA), were calculated and they are summarised in Table 5. The conclusions concerning epistasis from the BGs and TTC are generally highly consistent, indicating the almost complete absence of [aa] but the general presence of [ad] epistasis. Of the seven traits showing no epistasis in the BGs, four also showed no epistasis in the TTC.
QTL analysis was carried out on 84 RILs and 74 TTC families for which both L1 and L2 data were available. The RILs and L1+L2 data provide locations and additive effects (a) while L2−L1 provides locations and dominance effects (d). Such small numbers of genotypes are far from ideal and reduce the power of the analysis considerably. Of the 330 analyses performed (22 traits × 5 chromosomes × 3 sets {RILs, L1+L2 and L2−L1}), 38 were significant at P<1% on a whole chromosome basis (compared to those expected by chance alone of 330/100=3.3) giving an expected false-positive rate of about 9% (3.3/38). However, consistency of locations and effects across the three independent sets of data strengthens belief in the QTL identified. The estimated QTL locations and effects are shown in Table 6 for cases where P<1%, together with the additive and dominance effects of the same traits estimated from the basic generations, for comparison. The locations are also shown in Figure 2.
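The multiple-testing arithmetic in this paragraph can be laid out explicitly. The sketch only reproduces the counts given in the text and treats the 330 tests as independent, which is a simplification because traits and data sets are correlated.

```python
traits, chromosomes, data_sets = 22, 5, 3
alpha = 0.01            # whole-chromosome significance level
significant = 38        # analyses reported as significant

tests = traits * chromosomes * data_sets
expected_by_chance = tests * alpha
false_positive_share = expected_by_chance / significant

print(tests, round(expected_by_chance, 1), round(false_positive_share, 3))
```

On these figures between three and four of the 38 significant results would be expected by chance alone.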
Those traits representing height at various ages locate a single major QTL at ∼50 cM on chromosome II, the site of the erecta mutation (Table 6). This location, the size and direction of the genetic effects are consistent over time and across RILs and TTC families. Thus, the additive effect of the QTL, a, is negative (ie Col is larger than Ler) and the dominance effect, d, is positive as before, that is Col alleles are dominant. However, there are inconsistencies. The additive effect of the ‘erecta’ QTL is considerably less than the additive effect [a] at 30–40 days but is larger by 45 days and beyond. This small initial effect, which increases greatly at maturity, is entirely consistent with the known effects of erecta. The size of d is considerably smaller than [d] as estimated from the basic generations by weighted least squares (eg 68.9 as opposed to 184.9 for height at maturity). Conversely, d is always much larger than we would expect from the F1 mean (eg 68.9 opposed to 32.3 for height at maturity). These discrepancies between the effects from the basic generation and QTL analyses are consistent with our previous conclusion that more than one QTL is involved, but the other QTLs have failed to achieve significance. Similar effects associated with erecta are found for rosette size at all ages, except at 26 days, and for leaf size at 40 days. However, there is also evidence for another QTL for rosette size on chromosome IV.
A further QTL at ∼60 cM on chromosome IV affects time to maturity, rosette size at 26 days and number of cauline leaves at 26 days. There is some indication of QTL on chromosomes I (cauline leaves at 26 days and at flowering) and V (rosette leaves at 26 days and at flowering, and cauline leaves at flowering), but the locations are inconsistent. No QTL was detected on chromosome III.
Discussion
All traits show significant genetical variation that, apart from two exceptions (buds and cauline leaves at flowering), is consistently detectable from the basic generations (means and variances), and from the RILs and TTC ANOVAs. Heritabilities are typically 20–40% and, in the case of the height traits, increase consistently with age. This trend is not observed for the other traits but is not uncommon for quantitative traits (Jayasekera et al, 1994). There is good agreement between the heritability estimates from the Basic Generation, RILs and the TTC families which points to a consistent expression of alleles under the varying genotypic backgrounds.
Given the repeated measurements over time and the nature of the traits scored, one would expect them to be correlated inter se. Ler develops more slowly than Col (Table 2) and hence, not surprisingly, it is smaller at any given chronological time from sowing. Thus it has fewer leaves, smaller rosettes and is shorter. However, this is also true when these same traits are scored at a fixed physiological time, that is, at flowering.
Dominance effects among the means were detected consistently for the height traits, but among the other traits, only rosette size at 26 days and days to maturity had significant dominance. Where it exists, dominance was for faster development and greater height. ‘Better parent heterosis’ does not occur for any trait except ‘height at flowering’, although the F1 is not significantly taller than the taller parent, Col. Conversely, the TTC ANOVAs consistently detect dominance variation, except for ‘buds at flowering’, and the dominance ratios, √(2VD/VA), indicate partial dominance for most traits through to complete dominance for height. There is no evidence for significant overdominance. The fact that VD occurs when [d] does not implies ambidirectional dominance for such traits, because the former is a function of Σd² and the latter of Σd. Similarly, the very large ratios of [d]/[a] (from 2 to 7) for heights contrast with the dominance ratios of around 1, indicating directional dominance and dispersed increasing alleles in the parents.
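The distinction between the sum of the dominance effects and the sum of their squares can be seen in a toy example. The per-locus values are invented; the point is only that effects of opposite sign cancel in the sum but not in the sum of squares.

```python
# Hypothetical per-locus dominance effects with mixed signs
# (ambidirectional dominance).
dominance_effects = [3.0, -3.0, 2.0, -2.0]

net_d = sum(dominance_effects)                    # what [d] reflects
sum_d_sq = sum(d * d for d in dominance_effects)  # what VD reflects

print(net_d, sum_d_sq)
```

Here the net dominance effect is zero, so no dominance would be detected in the generation means, yet the dominance variance is clearly non-zero.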
Model fitting confirms the above interpretation to a large extent. While additive differences [a] are significant for 18 of the 22 traits under study, dominance and dominance × dominance epistasis are present for only eight traits, six of which are heights. In all cases where both [d] and [dd] are significant, they take opposing signs indicating the presence of duplicate epistasis; that is, heterozygosity at several loci has less heterotic effect than would be suggested by their individual effects. The only trait for which [d] is negative and significant is days to maturity, indicating that genes conferring early flowering are dominant to those responsible for late flowering. Another interesting feature of the results in Table 2 is the consistency with which [ad] is detected in this study. This component is significant for 15 traits and it takes a negative sign in every case. Detection of [ad] with such a high frequency and the consistency of its sign suggests that Ler and Col have an excess of genes in coupling for most traits, particularly for plant height, leaf size, rosette size, rosette leaves and maturity. There is little evidence for additive-by-additive [aa] epistasis either in the BGs or in the TTC.
The fact that there is epistasis, dispersion and ambidirectional dominance indicates that these traits must be controlled by at least two QTL. Following Wright (1934), it is possible to estimate the number of genes, k, from (Σa)²/Σa²; Σa is best estimated as half the difference between the extreme RILs, while Σa² can be estimated from 2VA. These estimates of k are shown in Table 5 and they suggest that there are between three and 15 genes for each trait. It is well known that such estimates are imprecise and tend to be minimal estimates (Mayo and Hopkins, 1985; Kearsey and Pooni, 1996), but they are all consistent in suggesting that, despite their close origins, Col and Ler now differ by several genes controlling most quantitative traits.
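The gene-number estimate described above reduces to a one-line calculation. The sketch follows the definitions in the text (the sum of a as half the range of the RIL means, the sum of a squared as 2VA); the function name and the input values are hypothetical and are not taken from Table 5.

```python
def gene_number(ril_max, ril_min, va):
    """Estimate k as (sum of a)**2 / (sum of a**2)."""
    sum_a = (ril_max - ril_min) / 2.0   # half the difference between extreme RILs
    sum_a_sq = 2.0 * va                 # sum of a**2 estimated from 2*VA
    return sum_a ** 2 / sum_a_sq


# Hypothetical trait: extreme RIL means of 60 and 20 units, VA = 25.
k = gene_number(ril_max=60.0, ril_min=20.0, va=25.0)
print(k)
```

Because unequal gene effects and linkage both reduce the ratio, a value obtained this way is better read as a lower bound than as a count.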
The QTL analyses are consistent with the biometrical approach in many respects (Table 6). Thus, the direction and size of the additive effects for height, rosette size and rosette leaf number agree, with Col alleles giving larger plants: the positive direction of the dominance effects for height and rosette size and the absence of evidence for dominance for rosette leaf number. However, there are also significant inconsistencies. The basic generations and TTC indicated significant gene interaction for most traits implying two or more QTL, although QTL analysis seldom revealed more than one QTL for any trait. This may reflect the low power of QTL detection given the restricted sets of genotypes. However, the estimated sizes of the dominance effects of the QTL for height were much larger than the corresponding additive effects and also larger than expected given the observed F1 values. Both these effects predict overdominance at the QTL and heterosis, neither of which was detected by the more sensitive biometrical analyses. The estimates of additive and dominance effects from the TTC have equal precision, so this apparent exaggeration of dominance cannot be due to estimation bias. The squared additive effects of the QTL from RILs and TTC in Table 6 equal or slightly exceed the estimated additive genetic variances from these generations, suggesting that all the variance has been explained by these QTL. The squared dominance effects, however, are considerably larger than the estimated dominance variation. These effects strongly suggest an upward bias of estimates as shown by Melchinger et al (1998) and Utz et al (2000). We have good evidence from other work (Koumproglou et al, 2002) that there is another QTL, located at the top of chromosome III, at which the Col allele also increases height. Jansen et al (1995) also located a QTL affecting leaf number in this location. It is almost certainly flowering related and decreases in effect with time. 
The combined effects of the two QTL at 30 days to maturity would probably match the additive effect [a] from the basic generations. This other gene may be interacting with erecta in the heterozygous condition causing the discrepancies in the amount of dominance discussed above.
Other analyses could be carried out on these data, such as principal components analysis to identify key components for QTL and generation mean analysis; these are ongoing.
References
Doganlar S, Frary A, Tanksley SD (2000). The genetic basis of seed weight variation: tomato as a model system. Theor Appl Genet 100: 1267–1273.
Jansen RC, Van Ooijen JW, Stam P, Lister C, Dean C (1995). Genotype-by-environment interaction in genetic mapping of multiple quantitative trait loci. Theor Appl Genet 91: 33–37.
Jayasekera NEM, Karunasekera KB, Kearsey MJ (1994). Genetics of production traits in Hevea brasiliensis (rubber). I. Changes in genetical control with age. Heredity 73: 650–656.
Kearsey MJ, Farquhar AJ (1998). QTL analysis in plants: where are we now? Heredity 80: 137–142.
Kearsey MJ, Hyne V (1994). QTL analysis: a simple ‘marker-regression’ approach. Theor Appl Genet 89: 698–702.
Kearsey MJ, Jinks JL (1968). A general method of detecting additive dominance and epistatic variation for metrical traits. I. Theory. Heredity 23: 403–409.
Kearsey MJ, Pooni HS (1996). The Genetical Analysis of Quantitative Traits. Chapman and Hall: London.
Koumproglou R, Wilkes TM, Townson P, Wang XY, Beynon J, Pooni HS et al (2002). STAIRS: a new genetic resource for functional genomic studies of Arabidopsis. Plant J 31: 355–364.
Lister C, Dean C (1993). Recombinant inbred lines for mapping RFLP and phenotypic markers in Arabidopsis thaliana. Plant J 4: 745–750.
Lynch M, Walsh B (1998). Genetics and Analysis of Quantitative Traits. Sinauer Associates: Sunderland, MA.
Mayo O, Hopkins AM (1985). Problems of estimating the minimum number of genes contributing to quantitative variation. Biom J 27: 181–187.
Melchinger AE, Utz HF, Schön CC (1998). QTL mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149: 383–403.
Monforte AJ, Friedman E, Zamir D, Tanksley SD (2001). Comparison of a set of QTL NILs for chromosome 2 of tomato: deductions about natural variation and implications for germplasm utilization. Theor Appl Genet 102: 572–590.
Sokal RR, Rohlf FJ (1981). Biometry. WH Freeman and Co.: New York.
Tanksley SD (1993). Mapping polygenes. Annu Rev Genet 27: 205–233.
Utz HF, Melchinger AE, Schön CC (2000). Bias and sampling error of the estimated proportion of genetic variance explained by QTL determined from experimental data in maize using cross validation and validation with independent samples. Genetics 154: 1839–1849.
Wright S (1934). The results of crosses between inbred strains of guinea pigs differing in number of digits. Genetics 19: 537–551.
Kearsey, M., Pooni, H. & Syed, N. Genetics of quantitative traits in Arabidopsis thaliana. Heredity 91, 456–464 (2003). https://doi.org/10.1038/sj.hdy.6800306