Introduction

Geneticists have abundant choices of parents to use for mapping population development, and may have numerous extant mapping populations from which to choose for mapping quantitative trait loci (QTL) (Young, 1996). For example, the publicly available maize nested association mapping (NAM) population consists of 25 recombinant inbred line (RIL) families derived from crosses between the reference parent B73 and 25 diverse inbred lines (McMullen et al., 2009). Each mapping family is composed of 200 RILs; hence, evaluation of the entire population requires testing 5000 inbred lines, beyond the capability of many researchers to assay, particularly for phenotypes that are difficult to measure. Thus, methods that could effectively predict which families are maximally segregating for genetic variation for a trait would aid geneticists by permitting efficient use of resources. Maximizing genetic variance results in higher power of QTL detection. Theoretically, progeny variance is increased in crosses between genetically more distant parents because the number of segregating loci is maximized (Cox et al., 1984). Studies in wheat (Triticum aestivum), oat (Avena sativa) and soybean (Glycine max) suggest that pedigree divergence between parents, estimated on the basis of the coefficient of parentage (Kempthorne, 1969), could be used to predict genetic variance in F2 or later segregating generations (Bhatt, 1970, 1973; Cowen and Frey, 1987; Manjarrez-Sandoval et al., 1997). However, other studies in the same species (Moser and Lee, 1994; Helms et al., 1997; Kisha et al., 1997; Burkhamer et al., 1998; Bohn et al., 1999) indicated that the relationship between parental pedigree distance and progeny genetic variance was neither consistent nor strong enough to permit reliable prediction of genetic variance. 
One possible explanation for the limited relationship between pedigree divergence and progeny variance observed in these studies is that coefficients of parentage may be inaccurate because the parents of some specific crosses might differ for many genes affecting a trait.

An alternative to estimating parental genetic divergence on the basis of pedigree information is the use of molecular marker-based genetic distance (GD). GD between two individuals or populations was defined by Goodman and Lasker (1974) and Nei (1974) as the proportion of non-matching nucleotide bases at homologous nucleotide sites between the genomes of two individuals or two populations. Although GD estimates based on molecular markers have been effective at grouping related germplasm (Melchinger et al., 1998), the relationship between GD in parents and genotypic variance components (GVCs) in their progenies has been reported as weak or non-significant across many studies (Helms et al., 1997; Manjarrez-Sandoval et al., 1997; Burkhamer et al., 1998; Melchinger et al., 1998; Bohn et al., 1999; Gumber et al., 1999; Brachi et al., 2010).

Another alternative is to use parental phenotypic differences (PDs) to predict progeny GVC, because greater parental PD should be due to allelic differences at more loci for polygenic traits. However, parental PD was weakly related to or unrelated to progeny genetic variances in studies in maize, oat and wheat (Souza and Sorrells, 1991; Melchinger et al., 1998; Utz et al., 2001). Kuczyńska et al. (2007) found that, among five traits, only the differences between parents for spike length were significantly associated with the GVC of their progeny in barley. Melchinger et al. (1998) suggested that one cause of the weak relationship between PD and GVC in their maize study was that although larger values of PD were associated with larger values of GVC, smaller values of PD were not necessarily indicative of smaller values of GVC.

In summary, results from a range of plant species suggest that the relationship between genetic or phenotypic distances of parents and genetic variances of progenies is not strong enough to recommend the use of this relationship as a practical approach to select parental combinations to maximize within-family variation either for breeding or for trait analysis studies. Limitations in these studies include a lack of sufficient range of parental genotypic or phenotypic diversity, evaluation of small progeny sample sizes and use of limited samples of molecular markers. For example, the study by Melchinger et al. (1998) was based on crosses among elite inbred lines within two early-maturing European heterotic groups, and most of the other studies also involved elite germplasm representing limited diversity within those species. Therefore, it remains untested whether a stronger relationship between PD and genetic variance may exist in crosses between more diverse maize germplasm.

The maize NAM population represents an ideal resource to test this question. It represents the largest progeny sample evaluated for QTL mapping of any species to date (Buckler et al., 2009). The 25 diverse founder inbred lines selected to create mapping families in crosses with the reference parent B73 were chosen to maximally sample the genetic diversity available in global public maize inbreds (Liu et al., 2003; Buckler et al., 2009). The maize NAM families were genotyped with a common set of 1106 single-nucleotide polymorphism (SNP) markers (McMullen et al., 2009) and evaluated jointly for numerous quantitative traits across multiple environments. Furthermore, the gene-rich regions of all NAM parental lines have been sequenced with next-generation sequencing methods to produce a maize HapMap composed of 1.6 million SNPs (Gore et al., 2009). The combination of large genetic diversity, dense genetic maps, founder sequence information and robust phenotypic data associated with the maize NAM has permitted high-resolution genetic mapping of QTL and genes affecting important quantitative traits (Buckler et al., 2009; Kump et al., 2011; Tian et al., 2011).

The objective of this study was to use genotypic and phenotypic data of the 25 NAM families to evaluate the potential to select families for QTL mapping with high genetic variance based on phenotypic distance or GDs between the parental lines.

Materials and methods

Population development

B73 was crossed as a female parent to 25 genetically diverse inbred lines (Table 1) to form 25 F1 combinations. A sample of 200 randomly selected RILs from the intermated B73 × Mo17 (IBM) population (Lee et al., 2002) was also added to the evaluation. B73 was chosen as the common reference parent because it is well adapted to the evaluation environments, is the most important public inbred in commercial maize pedigrees in the United States (Mikel and Dudley, 2005) and is the source of the reference maize genome sequence (Schnable et al., 2009).

Table 1 The 26 RIL families and the most probable subpopulation assignment of the diverse (non-reference) parent based on SSR-based population structure analysis (Non-Stiff Stalk, NSS; Stiff Stalk, SS; Tropical/Subtropical, TS; Popcorn, Pop and Sweet corn, Sweet; Liu et al. (2003))

Several F1 plants of each cross were selfed to form F2 generation families. Self-fertilization for 3 additional generations with minimal conscious selection was practiced to form 200 F5 progenies per family. Each F5 descended from a unique F2 plant. The self-fertilized progeny from each F5 plant were harvested in bulk to form an F5:6 RIL. To produce seed for field evaluations, at least 15 plants within each F5:6 line were sib-mated. Approximately one-third of RILs were developed by each of the USDA-ARS (US Department of Agriculture–Agricultural Research Service) maize genetics programs at North Carolina State University (Raleigh, NC, USA), University of Missouri (Columbia, MO, USA) and Cornell University (Ithaca, NY, USA). Each program used a summer pollination season at their location and a winter pollination season in Homestead, Florida, or Ponce, Puerto Rico, each year to create the NAM RILs.

Field evaluation

The NAM population was evaluated across a total of 11 environments, although not all traits were measured in all environments. In 2006, the evaluation of the populations was carried out in four summer locations (Clayton, North Carolina; Columbia, Missouri; Ithaca, New York; and Urbana, Illinois) and two winter locations (Homestead, Florida, and Ponce, Puerto Rico). In 2007, the experiment was grown in the same four summer locations and one winter location in Homestead, Florida. The genetic entries consisted of 5000 NAM RILs representing 25 families, 200 randomly selected RILs from the intermated B73 × Mo17 (IBM) population (Lee et al., 2002) and 281 inbred lines representing the global diversity of public maize inbreds (including all NAM founders) and useful as an association analysis panel (Flint-Garcia et al., 2005). Thus, the experiment contained 5481 unique inbred maize lines.

Within a location, the experimental design was a sets design (Federer, 1955), in which each set contained all lines of a family or population (Supplementary Figure 1). The positions of the 27 sets (25 NAM families, the IBM family and the association panel) were randomized across environments. Each set was randomized across environments as a 10 × 20 incomplete block α-lattice design (Patterson and Williams, 1976). The α-design was augmented by including the two parental lines of the family within each incomplete block (Federer, 1961). Thus, each incomplete block included 20 random RILs plus B73 and the other parental line of the family, and B73 was included as a repeated check in all family sets. The order of the 22 entries within each incomplete block was randomized. The association panel (with 280 entries after excluding Mo17) was arranged as a 14 × 20 α-lattice design, and the incomplete blocks were augmented by adding B73 and Mo17 to random positions within each incomplete block (Supplementary Figure 1). In 8 of 11 environments, 1 complete replication of the experiment was grown. In North Carolina in 2006, a second replication of families derived from crosses between B73 and lines CML247, CML277, Ki3, M162W, Mo17, Tzi8, and the association mapping population was grown adjacent to the first complete replication of the experiment. In Missouri, 2006, families corresponding to CML247, CML322, IL14H, M162W, Mo18W, MS71, NC350, NC358 and P39 were not scored because of poor germination and drought conditions. In Missouri, 2007, the Mo17 family (IBM) was not grown.
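The block arithmetic described above (10 incomplete blocks of 20 RILs, each augmented with the two parents to give 22 entries) can be illustrated with a short sketch. It is only a stand-in: a plain random partition replaces the α-lattice allocation, which this code does not attempt to generate, and the function name, RIL labels and choice of CML247 as the diverse parent are hypothetical.

```python
import random

def augmented_incomplete_blocks(rils, checks, n_blocks, seed=None):
    """Randomly split RILs into equal-sized blocks and add the checks to each.

    Assumption: a plain random partition stands in for the alpha-lattice
    allocation, which this sketch does not attempt to generate.
    """
    if len(rils) % n_blocks != 0:
        raise ValueError("number of RILs must be divisible by n_blocks")
    rng = random.Random(seed)
    shuffled = list(rils)
    rng.shuffle(shuffled)
    block_size = len(shuffled) // n_blocks
    blocks = []
    for b in range(n_blocks):
        entries = shuffled[b * block_size:(b + 1) * block_size] + list(checks)
        rng.shuffle(entries)  # randomize entry order within the block
        blocks.append(entries)
    return blocks

# Hypothetical labels for one family set: 200 RILs, B73 and the diverse parent.
rils = [f"RIL{i:03d}" for i in range(1, 201)]
blocks = augmented_incomplete_blocks(rils, ["B73", "CML247"], n_blocks=10, seed=1)
print(len(blocks), len(blocks[0]))  # 10 blocks of 22 entries
```

The same function with 280 entries, 14 blocks and the checks B73 and Mo17 would mirror the dimensions given for the association panel.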

Experimental units were single row plots of variable size at each location. In Clayton, North Carolina, plots were 1.07 m in length with a 0.61-m alley at the end of each plot. Inter-row spacing was 0.97 m. Plots were thinned to approximately eight plants per row. In Columbia, Missouri, the experiment was planted in plots that were 2.14 m in length with a 0.92-m alley at the end of each plot. Inter-row spacing was 0.90 m. Nine kernels were planted in each plot and they were not thinned. In Aurora, New York, the plots were 2.60 m in length with a 1.22-m alley at the end of each plot. Inter-row spacing was 0.76 m. In all, 12 kernels were planted in each plot and they were not thinned. In Urbana, Illinois, plots were 4.57 m in length with a 1.0-m alley at the end of each plot. Inter-row spacing was 0.76 m. In all, 25 kernels were planted in each plot and they were thinned to 15 plants per row. In Homestead, Florida, plots were 1.07 m in length with a 0.76-m alley at the end of each plot. Inter-row spacing was 1.1 m. Plots were thinned to approximately eight plants per row. In Ponce, Puerto Rico, plots were 1.83 m in length with a 0.91-m alley at the end of each plot.

Traits evaluated in this study

The traits evaluated were days to anthesis (DTA, days after planting until 50% of plants in the row shedding pollen), days to silk (DTS, days after planting until 50% of plants in the row silking), anthesis-silk interval (ASI, difference between DTA and DTS), plant height (distance from soil surface to the base of flag leaf), ear height (distance from soil surface to the highest ear-bearing node), tassel length (length from the bottom branch to the tip of the tassel), tassel primary branches (a count of the number of tassel primary branches), upper leaf angle (angle between the leaf immediately below the flag leaf and the stalk at or near flowering time), leaf length (distance from base to tip of the leaf below the primary ear, at or near flowering time), leaf width (distance of the widest section of the leaf below the primary ear at or near flowering time), ear row number (number of rows of kernels around the diameter of the ear), cob diameter, cob length (length of a cob from base to tip), number of kernels per row (EKPR, number of potential kernels per row from base to tip of the ear), ear mass, cob mass (weight of a cob after shelling seeds), total seed weight (KW, difference of ear mass and cob mass), 20-kernel weight (TWKW, weight of 20 randomly chosen kernels) and total kernel number (KNUM, KW multiplied by 20 and divided by TWKW). DTA, DTS and ASI were measured on a plot basis. Plant height, ear height, tassel length, tassel primary branches, upper leaf angle, leaf length and leaf width were measured on one random representative plant per row. Ear traits (ear row number, cob diameter, cob length, EKPR, ear mass, cob mass, KW, TWKW and KNUM) were measured on two open-pollinated ears harvested from each plot. Not all traits were measured in all locations; the locations where each trait was measured are listed in Supplementary Table 1.
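The derived traits defined above follow from simple arithmetic on the measured ones. The sketch below illustrates those definitions; the function names are hypothetical, and the sign convention for ASI (DTS minus DTA) is an assumption, since the text only says "difference between DTA and DTS".

```python
def derived_ear_traits(ear_mass, cob_mass, twkw):
    """Return (KW, KNUM) from ear mass, cob mass and 20-kernel weight."""
    kw = ear_mass - cob_mass   # total seed weight
    knum = kw * 20 / twkw      # total kernel number
    return kw, knum

def anthesis_silk_interval(dta, dts):
    """ASI in days; the sign convention DTS - DTA is an assumption."""
    return dts - dta

# Illustrative values only, not data from the study.
kw, knum = derived_ear_traits(ear_mass=120.0, cob_mass=20.0, twkw=5.0)
print(kw, knum)  # 100.0 400.0
```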

Genotyping to calculate GDs

Genotyping of simple sequence repeats and SNPs on the parental inbred lines was described by Liu et al. (2003), Flint-Garcia et al. (2005), Wright et al. (2005) and McMullen et al. (2009). Genotype data were extracted from the publicly available Panzea database (http://www.panzea.org). Among the SNPs available, 1144 were used to create the NAM map because B73 had a relatively rare allele at them (McMullen et al., 2009). We excluded these 1144 SNPs from our estimates of parental GD because they have very strong ascertainment bias, which is expected to influence relationship estimates. After removing loci with any missing data, GDs based on SNPs were calculated from the remaining 2303 SNP markers by computing the percentage of matched alleles between inbred line B73 and the other 26 parental lines and dividing by total number of alleles (Goodman and Lasker, 1974). Similarly, a separate GD estimate was computed based on 44 simple sequence repeat loci with complete data.
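A minimal sketch of the distance calculation, under stated assumptions: each inbred line carries one allele call per locus, loci with any missing call are dropped first, and GD is taken as the proportion of non-matching alleles (the Goodman and Lasker definition quoted in the Introduction, so a matched-allele proportion would be one minus this value). The function names, the missing-data code "N" and the toy genotypes are hypothetical.

```python
def drop_incomplete_loci(genotypes, missing="N"):
    """Keep only loci that have an allele call in every line.

    genotypes maps line name -> list of allele calls, one per locus.
    """
    lines = list(genotypes)
    n_loci = len(genotypes[lines[0]])
    keep = [j for j in range(n_loci)
            if all(genotypes[line][j] != missing for line in lines)]
    return {line: [genotypes[line][j] for j in keep] for line in lines}

def genetic_distance(ref, other):
    """Proportion of loci at which two inbred lines carry different alleles."""
    if len(ref) != len(other):
        raise ValueError("lines must be scored at the same loci")
    mismatches = sum(a != b for a, b in zip(ref, other))
    return mismatches / len(ref)

# Toy genotypes, not data from the study.
genos = {"B73": list("ACGTN"), "P1": list("ACTTA"), "P2": list("GCGAA")}
clean = drop_incomplete_loci(genos)
for name in ("P1", "P2"):
    print(name, genetic_distance(clean["B73"], clean[name]))
```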

Statistical analyses

Outliers were detected from initial analysis fitting only environment, set and genotype effects and their interactions using the DFFITS criterion, which measures the influence of each observation on predicted values (Belsley et al., 2004). We used $4\sqrt{p'/n}$ as the DFFITS threshold value, where $p'$ is model df+1 and $n$ the sample size, and we deleted observations exceeding this threshold. This is twice the DFFITS threshold suggested by Rawlings et al. (1998), resulting in a conservative approach to dropping outliers from the analysis. Genotypic analysis of the 5000 NAM lines (McMullen et al., 2009) performed after field experiments were completed revealed that some lines were contaminated (contained non-parental alleles) or had high levels of heterozygosity. We considered contaminated lines or lines with >8% heterozygosity as genetic outliers, and these 301 genetic outliers were excluded from the set of NAM seed stocks deposited at the USDA Maize Genetics Cooperation Stock Center for public release (http://maizecoop.cropsci.uiuc.edu/nam-rils.php). However, we maintained the genetic outliers in the statistical analysis, as they provide information on genotype-by-environment variation and within-environment spatial variation, but we did not want their phenotypic values to influence the estimates of genetic variation within or among the NAM families. Therefore, we coded the 301 genetic outliers as belonging to family P=28. The mean value of family 28 was excluded from computation of the among-family variation and the variation within family 28 was excluded from the computation of average within-family variation.
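The outlier screen can be sketched as follows, assuming the threshold is 4·sqrt(p′/n), that is, twice the 2·sqrt(p′/n) cutoff commonly attributed to Rawlings et al. (1998). The DFFITS values themselves are taken as given (they would come from the fitted model); the function names and example numbers are hypothetical.

```python
import math

def dffits_threshold(model_df, n_obs, multiplier=4.0):
    """Cutoff multiplier * sqrt(p'/n), with p' = model df + 1."""
    p_prime = model_df + 1
    return multiplier * math.sqrt(p_prime / n_obs)

def flag_outliers(dffits_values, model_df):
    """Return indices of observations whose |DFFITS| exceeds the cutoff."""
    cutoff = dffits_threshold(model_df, len(dffits_values))
    return [i for i, d in enumerate(dffits_values) if abs(d) > cutoff]

# Illustrative values: 100 observations, model df = 3, so cutoff = 4 * sqrt(4/100) = 0.8.
values = [0.0] * 98 + [1.5, -0.9]
print(flag_outliers(values, model_df=3))  # [98, 99]
```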

The next step of analysis was to analyze each combination of trait and environment separately to account for as much extraneous variation due to spatial variation as possible (Gilmour et al., 1997). Mixed model analyses implemented with ASReml version 2 (Gilmour et al., 2006) were used to account for the unbalanced design and data structure. Within each environment, the initial model included random effects due to sets, incomplete blocks, families and lines within families. Families, sets and incomplete blocks were not confounded because the parental lines were considered to be from the association population, B73 was repeated across all incomplete blocks and sets, and the other parental line of each population was repeated across incomplete blocks within a set. Therefore, the experimental design enables estimation of genetic effects of lines separately from field design effects. In environments in which the experiment was grown in adjacent but separate fields, we also fit a field effect and nested sets and blocks within fields. For Clayton, 2006, where a partial second replication was grown, we also fit a main effect due to complete replications and nested sets and blocks within complete replications.

The basic model was then enhanced by including random effects due to the rows and columns of the physical layout of the grid of plots in the field and by fitting separate spatially autoregressive correlations in the row and column directions among plot residuals (AR1 × AR1 error structure) (Cullis and Gleeson, 1991; Gilmour et al., 1997; Smith et al., 2001). Model terms were tested with likelihood ratio tests (Littell et al., 1996), and terms not significant at P<0.05 were dropped from the final model for an environment. If the residual autocorrelation was significant in one direction but not the other, we fit an autoregressive structure in the significant direction only (AR1 × identity error structure).
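For readers unfamiliar with separable autoregressive errors, the following sketch shows what an AR1 × AR1 structure implies: the residual correlation between two plots is the product of a row-direction term and a column-direction term, each decaying with distance. This is the standard separable form, not code from the analysis; the function name and parameter values are hypothetical.

```python
def ar1xar1_correlation(plot_a, plot_b, rho_row, rho_col):
    """Residual correlation between two plots given as (row, column) positions.

    rho_row and rho_col are the lag-1 autocorrelations along the row and
    column directions of the plot grid.
    """
    (r1, c1), (r2, c2) = plot_a, plot_b
    return rho_row ** abs(r1 - r2) * rho_col ** abs(c1 - c2)

# Two rows and one column apart: 0.5**2 * 0.4**1
print(ar1xar1_correlation((0, 0), (2, 1), rho_row=0.5, rho_col=0.4))
```

Setting one of the two autocorrelations to zero leaves correlation only along the other direction, which corresponds to an autoregressive structure in a single direction.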

Next, a combined model was fitted including all environments. We included within-environment non-genetic sources of variation only in those environments in which they were significant in the individual location analyses. The full model across environments was:

$$Y_{ijklmnopq} = \mu + \text{env}_i + \text{field(env)}_{ij} + \text{rep(field*env)}_{ijk} + \text{set(rep*field*env)}_{ijkl} + \text{block(set*rep*field*env)}_{ijklm} + \text{row(field*env)}_{ijn} + \text{column(field*env)}_{ijo} + \text{family}_p + \text{RIL(family)}_{pq} + \text{env*family}_{ip} + \text{env*RIL(family)}_{ipq} + \varepsilon_{ijklmnopq}$$

where $Y_{ijklmnopq}$ = individual observation; $\mu$ = overall mean; $\text{env}_i$ = the effect of the ith environment (location-by-year combination), $i \in \{1, \ldots, 11\}$; $\text{field(env)}_{ij}$ = the effect of the jth field within the ith environment (multiple fields within a location were used only at Missouri, 2007, and Illinois, 2007); $\text{rep(field*env)}_{ijk}$ = the effect of the kth replication within the jth field within the ith environment (modeled only for North Carolina in 2006); $\text{set(rep*field*env)}_{ijkl}$ = the effect of the lth set within the kth replication within the jth field within the ith environment; $\text{block(set*rep*field*env)}_{ijklm}$ = the effect of the mth incomplete block within the lth set within the kth replication within the jth field within the ith environment; $\text{row(field*env)}_{ijn}$ = the effect of the nth plot grid row within the jth field within the ith environment; $\text{column(field*env)}_{ijo}$ = the effect of the oth plot grid column within the jth field within the ith environment; $\text{family}_p$ = the effect of the pth family; $\text{RIL(family)}_{pq}$ = the effect of the qth RIL within the pth family (family and RIL within family being the two levels of genotypes); $\text{env*family}_{ip}$ = the effect of the interaction between the pth family and the ith environment; $\text{env*RIL(family)}_{ipq}$ = the effect of the interaction between the qth RIL within the pth family and the ith environment; and $\varepsilon_{ijklmnopq}$ = the experimental error of the plot defined by all the experimental factors above.

Unique error variances ($\hat{\sigma}^2_{\varepsilon_i}$) and spatial autoregressive error correlations were modeled for each environment. Unique genetic (line within family) variance components, $\hat{\sigma}^2_{RIL(family)_p}$, were modeled for each family. We also attempted to fit unique environment-by-RIL(family) variance components for each family for flowering traits, but obtained variance estimates equal to zero for numerous families, which we regarded as unlikely for these traits. This result likely occurred because of the high level of confounding between environment-by-RIL(family) and residual effects for most genotypes. Therefore, we chose to fit a homogeneous variance component for environment-by-RIL(family) for all traits to avoid overfitting the mixed models.

For ear traits, which were measured on one or two open-pollinated ears per plot, depending on productivity of the lines, we averaged individual ear measurements for each plot. Plot mean values were then analyzed with a weighted mixed model, with the number of observations per plot used for weighting. We attempted to include correlated error terms with these weighted analysis models, but convergence consistently failed. Therefore, residual effects within an environment for ear traits were modeled as independent and identically distributed. Otherwise, these traits were analyzed with similar mixed models as the other traits, including row, column and block effects to account for spatial effects, and unique residual variances for each environment.

Likelihood ratio tests were used to test the significance of factors with variance components near zero in the combined analysis (Littell et al., 1996). Non-significant terms were dropped from the combined model. The final model containing only significant terms was used to estimate the parameters reported in this study, which included unique genetic components of variance for each family. Best linear unbiased predictors for RILs were also obtained from these models for use in QTL mapping and genome-wide association studies (Buckler et al., 2009; Kump et al., 2011; Tian et al., 2011). Heritabilities on an individual plot basis ($h^2_p$) (Holland et al., 2003) were estimated for the entire NAM population as:

In this and other heritability equations, the residual error variance is the mean residual error variance across all environments. Heritabilities on a line mean basis ($h^2_l$; Holland et al., 2003) were estimated for the entire NAM population as:

To account for unbalanced data, equation (2) used the harmonic mean of the number of environments in which each family was observed ($n_{envf}$), the harmonic mean of the number of environments in which each RIL was observed ($n_{envl}$) and the harmonic mean of the total number of plots in which each RIL was observed ($n_{plot}$) (Holland et al., 2003; Piepho and Möhring, 2007).

Heritabilities on a line mean basis within only the pth NAM family ($h^2_{lwp}$) were estimated as:

where $env_{lp}$ is the harmonic mean of the number of environments in which each RIL of the pth family was observed and $n_{plot_p}$ is the harmonic mean of the total number of plots in which each RIL of the pth family was observed. Mean within-family heritabilities were estimated by averaging the heritabilities obtained for each family, except the association panel.
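One plausible reading of equations (1) to (3) is sketched below, assuming the usual ratio-of-variance-components form described by Holland et al. (2003): genetic variance divided by phenotypic variance, with interaction and error terms divided by harmonic mean numbers of environments and plots for line mean-basis estimates. The exact composition of each equation, in particular the use of an average within-family variance in the population-level numerators, is an assumption, as are all function and argument names and the example values.

```python
from statistics import harmonic_mean

def plot_basis_h2(v_family, v_ril, v_env_family, v_env_ril, v_error):
    """Assumed form of equation (1): heritability on an individual plot basis."""
    genetic = v_family + v_ril
    return genetic / (genetic + v_env_family + v_env_ril + v_error)

def line_mean_h2(v_family, v_ril, v_env_family, v_env_ril, v_error,
                 n_env_f, n_env_l, n_plot):
    """Assumed form of equation (2): heritability on a line mean basis."""
    genetic = v_family + v_ril
    return genetic / (genetic + v_env_family / n_env_f
                      + v_env_ril / n_env_l + v_error / n_plot)

def within_family_h2(v_ril_p, v_env_ril, v_error, n_env_lp, n_plot_p):
    """Assumed form of equation (3): line mean basis within one family."""
    return v_ril_p / (v_ril_p + v_env_ril / n_env_lp + v_error / n_plot_p)

# Harmonic means down-weight lines observed in many environments;
# here three lines were seen in 10, 10 and 5 environments.
n_env_l = harmonic_mean([10, 10, 5])
print(n_env_l)  # 7.5 rather than the arithmetic mean of about 8.33
print(line_mean_h2(2, 4, 1, 1, 2, n_env_f=10, n_env_l=10, n_plot=10))
```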

An alternate estimator of heritability ($h^2_c$) that pertains to an entire experiment (in this case the entire NAM population, IBM population and the association panel) was given by Cullis et al. (2006):

where $V_{PPE}$ is the average prediction error variance for all possible pairwise comparisons (including repeated check lines), obtained directly from the ASReml prediction output.
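A minimal sketch of the Cullis et al. (2006) estimator, assuming its commonly cited form, one minus the average pairwise prediction error variance divided by twice the genotypic variance. Which variance component plays the role of the genotypic variance in this analysis is an assumption, and the function and argument names and example values are hypothetical.

```python
def cullis_h2(v_ppe, v_genetic):
    """Generalized heritability: 1 - V_PPE / (2 * genotypic variance).

    v_ppe is the average prediction error variance of pairwise differences
    between genotype predictions; v_genetic is the genotypic variance.
    """
    return 1.0 - v_ppe / (2.0 * v_genetic)

# Illustrative values only, not estimates from the study.
print(cullis_h2(v_ppe=0.4, v_genetic=2.0))
```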

The s.e. of the estimators of heritabilities from equations (1), (2) and (3) were estimated using the delta method (Holland et al., 2003) in ASReml. The among-family variance components ($\hat{\sigma}^2_{family}$) in equations (1) and (2) were computed based only on NAM and IBM family means to exclude the effect of the association panel, but this estimator is not directly available from the ASReml output. Therefore, we used the s.e. of the heritability estimate including the association panel as an approximation to the s.e. for equations (1) and (2). Approximate s.e. were thus computed for each of these heritability estimates. However, the s.e. for heritability estimators from equation (4) was not described in Cullis et al. (2006).

To test the hypothesis that family genetic variance increases with increasing phenotypic parental differences and genetic differences for a given trait, the estimates of within-family genetic variance ($\hat{\sigma}^2_{RIL(family)}$) and heritability ($\hat{h}^2_{lwp}$) were regressed separately on the parental PD (estimated as the absolute value of the difference between B73 and other parental line means) or on the parental GD estimate from simple sequence repeat markers (GDssr) and SNP markers (GDsnp) using PROC REG in SAS version 9.1 (SAS Inc., Cary, NC, USA) (SAS Institute, 2004).
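The regressions themselves were run in SAS PROC REG; as an illustration of what is being estimated, the sketch below computes the slope, intercept and r² of an ordinary least-squares regression of within-family variance on parental PD. The function names and numbers are hypothetical, and significance testing is omitted.

```python
def phenotypic_difference(b73_mean, parent_mean):
    """Parental PD: absolute difference between the two parental means."""
    return abs(b73_mean - parent_mean)

def simple_regression(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2

# Illustrative values: PD for four families and their variance estimates.
pd_values = [1.0, 2.0, 3.0, 4.0]
gvc_values = [2.0, 1.0, 4.0, 3.0]
print(simple_regression(pd_values, gvc_values))
```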

Results

Among the 135 spatial autocorrelation coefficients fit for residual effects across all traits and environments, only 2 were negative, suggesting that spatial variability in the trials was primarily due to physical variation due to soil and management, rather than inter-plot competition (Stringer and Cullis, 2002). The experimental design used involved replication of NAM lines across environments but not within environments, as a means to most efficiently estimate their genotypic main effects across environments. The additional use of repeated checks within environments permits modeling non-genetic field effects within environments and estimation of within-environment error variance separately from genotype-by-environment interaction variance. Thus, the design provided efficient estimation and testing of genotype main effects, as well as genotype-by-environment interactions, although it sacrifices somewhat the precision of environment-specific genotypic values compared with designs with more replication within fewer environments.

Within-family genotypic variation varied significantly among families for all measured traits (Supplementary Table 2). The maize association panel had larger genetic variation than all biparental families for 13 of 19 traits (Supplementary Table 2). Estimated heritabilities on a plot basis ($\hat{h}^2_p$) ranged from 23 to 71%, whereas estimates of heritabilities on a RIL mean basis ($\hat{h}^2_l$) ranged from 59 to 94%. Average within-family heritability estimates on a RIL mean basis ($\hat{h}^2_{lwp}$) ranged from 52 to 90%, and were always lower than the heritability for the entire NAM population ($\hat{h}^2_c$) (Figure 1; Supplementary Table 3). The difference between the average within-family heritabilities and corresponding heritability estimates for the entire NAM population ranged from close to 0 (EKPR) to 19 (DTA) percentage points (Figure 1; Supplementary Table 3) among traits. This difference reflects the relative amount of genetic variation among and within families, and was not consistent among the different types of trait measurements.

Figure 1

Heritability estimates and their s.e. for 19 traits based on evaluation of the maize NAM population across up to 11 environments. Black bars represent individual plot basis heritability across all families in NAM ($\hat{h}^2_p$), dark gray bars represent line mean basis heritability across all families in NAM ($\hat{h}^2_l$), white bars represent average within-family line mean basis heritability ($\hat{h}^2_{lwp}$), and light gray bars represent heritability across the entire experiment ($\hat{h}^2_c$) described by Cullis et al. (2006).

Most estimates of line mean-basis heritability based on Cullis et al. (2006) were within 2 percentage points of, and never differed by more than 2.7 percentage points from, the heritabilities estimated with equation (2) (Figure 1; Supplementary Table 3). Heritabilities based on the Cullis et al. (2006) equation ($\hat{h}^2_c$) were in all cases greater than $\hat{h}^2_l$, because they include the genetic variation within the association panel (inflating the numerator) and reflect the greater precision for measurements on the repeated check founder lines (reducing the denominator).

Phenotypic differences (Supplementary Table 4) between founder best linear unbiased predictors (BLUPs) (Supplementary Table 5) were used to predict within-family genetic variation. The regression of within-family GVC ($\hat{\sigma}^2_{RIL(family)}$) on between-parent PD was significant (P<0.05) for 7 of 19 traits, with r2 values for these significant regressions ranging from 18 to 75% (Table 2). Increasing parental phenotypic diversity was positively correlated with GVC for the three flowering traits (ASI, DTA and DTS), for upper leaf angle and for both tassel architecture traits (tassel length and tassel primary branches). Increasing parental phenotypic diversity was negatively correlated with GVC for cob mass (Figure 2). Only cob length showed significance for regressions of GVC on GDssr (r2 ranged among traits from 0 to 28%; Supplementary Figure 2). GDsnp was not significantly related to GVC for any trait. The pattern of significant regressions of within-family heritability ($\hat{h}^2_{lwp}$) on parental PD and GD closely followed the pattern observed for GVC (Supplementary Table 6; Supplementary Figures 3 and 4), as expected because GVC is the numerator of the heritability estimates.

Table 2 r2, $\hat{\beta}$ and P-value of regressions of GVC ($\hat{\sigma}^2_{RIL(family)_p}$) on the absolute value of parental phenotypic difference (PD), genetic distance estimate from SSR markers (GDssr) and genetic distance estimate from SNP markers (GDsnp)
Figure 2

Significant regressions of progeny genetic variance component (GVC) on parental phenotypic difference (PD). The X axis is PD and the Y axis is GVC. ASI, anthesis-silk interval; CM, cob mass; DTA, days to anthesis; DTS, days to silk; TPB, tassel primary branches; TSL, tassel length; ULA, upper leaf angle.

Discussion

Heritability estimates $\hat{h}^2_l$ and $\hat{h}^2_p$ are functions of genotypic and phenotypic variation across the entire NAM population. Importantly for genetic mapping applications, the line mean-basis heritability across the entire NAM population corresponds to the maximum amount of phenotypic variation among NAM lines that can be attributed to genetic effects and thus to the cumulative effects of QTL (Buckler et al., 2009; Kump et al., 2011). Correspondingly, heritability on a line-mean basis within a family indicates the proportion of variation among line means that can be attributed to QTL within that family. The average within-family heritability was always less than the heritability of the whole NAM population, demonstrating the greater potential for QTL identification by incorporating the genetic variation among and within families. Flowering time (except ASI) and whole plant, leaf and tassel architecture traits had line mean-basis heritabilities of 89%, indicating that we have good power to detect and resolve QTL for these traits. In contrast, traits measured on ears consistently had lower line mean-basis heritabilities (from 61 to 79%; Figure 1; Supplementary Table 3). These traits tend to be more highly related to fecundity, and thus are more strongly affected by environmental variation in inbreds (Falconer and Mackay, 1996). Indeed, KW, which represents fecundity directly, had the lowest heritability among all traits (Figure 1). Nevertheless, the line mean-basis heritabilities for all traits measured were sufficiently high to permit reasonable power for QTL detection.

The relationship between the genetic variance component (GVC) of the traits and PDs between the parental lines was examined by linear regression, with 6 of 19 traits exhibiting significant positive regression coefficients. We inspected the scatter plots of GVC vs PD for consistent non-linear trends, but did not observe any (Figure 2, Supplementary Figure 5). Therefore, predicting progeny genetic variance based on the absolute PDs of parents may be moderately effective when a large number of genetically diverse populations and traits are evaluated in multiple environments. In contrast to Melchinger et al. (1998), we did not observe a trend whereby PD and GVC seemed to be related for higher but not lower values of PD. Instead, we observed that larger values of PD were associated with lower GVC values as often as not for those traits that did not exhibit a significant association between PD and GVC (Supplementary Figure 5).

Under relatively simple models of genetic architecture, the relationship between PD and GVC is expected to be strongest when alleles conferring positive effects are concentrated in one set of parents and those conferring negative effects are concentrated in other parents, such that the magnitude of parental PDs is associated with the number of polymorphic loci that affect the trait, and as a consequence, the magnitude of the progeny genetic variation (Figure 3a). In this situation, most pairs of loci affecting the trait tend to be in coupling-phase gametic disequilibrium in F1 parents of the mapping families.

Figure 3

Gametic phase of positive trait alleles among founders affects the relationship between parental phenotypic differences (PDs) and progeny genetic variance components (GVCs). Idealized simple genetic architecture affected by four unlinked QTL with equal effects (+1 or −1 for each homozygous class) is shown. Each segregating locus contributes a variance of +1 in progeny RIL generation. In a, negative alleles are concentrated in reference parent (P1), resulting in a positive linear relationship between PD, the number of segregating loci and GVC. In b, both negative and positive alleles are more equitably distributed among founders, resulting in no relationship between PD and GVC.

For highly polygenic traits, most pairs of QTL will be unlinked because the component loci will be located on different chromosomes. Thus, most gametic disequilibrium that occurs in the F1 generation will be eliminated by independent assortment in the F2 and later inbreeding generations. Therefore, gametic disequilibrium between unlinked QTL may have a significant impact on parental PDs but will have little or no effect on progeny variance. Thus, when unlinked QTL are predominantly in repulsion-phase gametic disequilibrium in the F1 generations, large progeny variances can be generated by crosses between parents with similar phenotypes (Figure 3b). In summary, polygenic traits with largely additive (non-epistatic) genetic control will tend to exhibit a positive relationship between PD and GVC when allelic effects at QTL are consistent within a parent and distinct between parents (unlinked coupling-phase gametic disequilibrium). In contrast, when positive and negative allelic effects are distributed among parents (resulting in more unlinked repulsion-phase gametic disequilibrium in F1 generations), the relationship between PD and GVC will break down. Furthermore, more complex genetic architectures, such as those involving epistasis, are expected to reduce the relationship between PD and GVC.
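The idealized model of Figure 3 can be checked numerically. The sketch below assumes four unlinked QTL with homozygous effects of +1 or −1, fully homozygous RILs and no selection, so that every combination of parental alleles is equally likely among progeny. The founder genotypes are illustrative choices and are not copied from the figure; the function names are hypothetical.

```python
from itertools import product


def parental_difference(p1, p2):
    """Absolute difference between parental genotypic values."""
    return abs(sum(p2) - sum(p1))


def ril_genetic_variance(p1, p2):
    """Variance among RIL genotypic values for unlinked loci.

    Enumerates every combination of parental alleles (one per locus),
    each of which is equally likely under independent assortment.
    """
    values = [sum(genotype) for genotype in product(*zip(p1, p2))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Scenario a: negative alleles concentrated in the reference parent P1.
p1_a = (-1, -1, -1, -1)
founders_a = [(1, -1, -1, -1), (1, 1, -1, -1), (1, 1, 1, -1), (1, 1, 1, 1)]

# Scenario b: positive and negative alleles dispersed among parents.
p1_b = (1, 1, -1, -1)
founders_b = [(1, -1, 1, -1), (-1, -1, 1, 1), (1, 1, 1, 1), (-1, 1, -1, -1)]

for label, p1, founders in (("a", p1_a, founders_a), ("b", p1_b, founders_b)):
    for founder in founders:
        pd = parental_difference(p1, founder)
        gvc = ril_genetic_variance(p1, founder)
        print(f"scenario {label}: founder={founder} PD={pd} GVC={gvc}")
```

In scenario a, GVC rises from 1 to 4 as PD rises from 2 to 8. In scenario b, two crosses with PD of 0 still give GVC of 2 and 4, because repulsion-phase alleles cancel in the parental values but segregate independently in the progeny.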

The predictive ability of parental PD was strongly dependent on the traits evaluated. For example, there is a moderately strong relationship between PD and GVC for flowering time but not for plant height (Table 2). For DTA, the larger difference between CML277 and B73 values is reflected in the larger within-family variation for the B73 × CML277 RIL family compared with the smaller parental differences and progeny variation for the B73 × MS71 family. In contrast, the parental difference for plant height was much larger for B73 × MS71 than for B73 × CML277, and the progeny means were quite different for the two families, but the progeny variation was quite similar (Figure 4).

Figure 4

Distributions of parental and progeny RIL BLUPs for (a) DTA and (b) PH. Arrows indicate BLUPs for founder line MS71, reference line B73 and founder line CML277. Black bars represent the histogram of B73 × MS71 RILs. Gray bars represent the histogram of B73 × CML277 RILs.

All three flowering time traits measured, DTA, DTS and ASI, had significant linear relationships between PD and GVC. Buckler et al. (2009) demonstrated that the genetic architecture of flowering time in the maize NAM population is characterized by a series of additive small-effect allelic variants at a moderately large number of loci. Alleles conferring positive and negative flowering time effects relative to B73 are dispersed among other parental lines, but there is a general trend of later flowering alleles being concentrated in later flowering parents and earlier flowering alleles being concentrated in earlier flowering parents. This is congruent with the finding that flowering time traits are more strongly related to maize adaptation and population structure compared with other traits measured in this study (Flint-Garcia et al., 2005). Thus, the flowering time allele effects tended to be in the coupling phase among the NAM founder F1s, enhancing the relationship between PD and GVC.

However, for most traits, we observed no significant relationship between GVC and PD. We suggest that the lack of a relationship is likely due to larger proportions of repulsion-phase gametic disequilibrium between unlinked QTL pairs among the parental F1s for those traits. It is also possible that non-additive gene action due to epistasis could strongly reduce the relationship between parental PDs and progeny variation. However, limited epistasis has been detected for these traits in QTL analysis with NAM. We also observed one trait (cob mass) for which parental difference was strongly negatively related to within-family genetic variation (Table 2; Figure 2), suggesting that the genetic architecture of this trait is distinct from others, and perhaps is more strongly controlled by epistasis.

Genetic distance (GD) estimated by the percentage of matched markers between the parental lines was not a better predictor of genetic variance than parental PD. The poor association between marker-based estimates of GDs and genetic variance is likely due to the inclusion of markers not linked to QTL affecting the trait in distance estimation. Such markers are not uninformative, but rather are mis-informative, as they disrupt the relationship between genetic differences and phenotype differences, as demonstrated by Charcosset et al. (1991), Bernardo (1992) and Flint-Garcia et al. (2009) for prediction of heterosis from random marker data. The distribution of QTL effects seems quite different from that of sequence variation among the NAM founders (Buckler et al., 2009), and this will tend to make random marker information less predictive of genetic segregation. In addition, epistasis will reduce the relationship between genetic differences and trait variation (Moser and Lee, 1994). By design, the maize NAM population involves crosses between a single reference parent, B73, and unrelated, genetically diverse inbreds to maximize the genetic diversity sampled. Therefore, the NAM does not include crosses between closely related inbred lines that would be typical of applied maize breeding programs; it is possible that inclusion of such crosses would result in a more obvious relationship between GD and GVC.
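For concreteness, a matched-marker distance of the kind discussed above might be computed as in the following sketch. The function name `marker_distance`, the single-character allele coding and the use of "N" for a missing call are assumptions made for illustration, not the procedure used in this study.

```python
def marker_distance(calls_a, calls_b, missing="N"):
    """Genetic distance as 1 minus the proportion of matching marker calls.

    calls_a and calls_b hold one allele call per marker for two inbred
    parents. Markers with a missing call in either parent are skipped.
    """
    compared = 0
    matched = 0
    for a, b in zip(calls_a, calls_b):
        if a == missing or b == missing:
            continue
        compared += 1
        if a == b:
            matched += 1
    if compared == 0:
        raise ValueError("no markers with calls in both parents")
    return 1 - matched / compared


# Hypothetical calls at eight markers for two inbred lines.
print(marker_distance("AACCGGTT", "AACTGNTA"))
```

Every marker carries equal weight in this score whether or not it is linked to a QTL for the trait, which is the property that makes random markers mis-informative for predicting trait variance.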

For researchers unable to evaluate the entire NAM population due to resource limitations, we offer the following suggestions regarding sampling subsets of the NAM for phenotypic evaluations. Sampling strategies should reflect the goal of the research. Sampling fewer families with more progeny per family seems to provide higher power for QTL detection, but sampling more families with fewer progeny seems to provide more reliable estimates of the overall genetic architecture (for example, general and specific combining ability variances), allele number and QTL variance (Wu and Jannink, 2004; Verhoeven et al., 2006). Different analysis approaches have different optimal sampling strategies as well. For example, QTL mapping based on joint linkage analysis with lower marker density strives to explain within-family variation with marker effects nested within families (Buckler et al., 2009). In contrast, high-density marker information provided by the maize HapMap (Gore et al., 2009) provides new opportunities to account for both among- and within-family variation based on identity-in-state models (Kump et al., 2011; Tian et al., 2011). In the latter case, the association between SNPs and variation among families can be modeled, such that sampling of more families becomes more advantageous.

Sampling should include as many NAM families as possible if high-density marker analysis is an option, as we observed that for most traits, parental phenotype differences were poor indicators of within-family variation, and variation among families was a significant component of genetic variation for all traits. However, if marker effects are to be tested as nested within families, a sample of at least 40 progeny per family seems necessary to maintain good power of QTL detection (Wu and Jannink, 2004; Verhoeven et al., 2006; Yu et al., 2008). Larger sample sizes could be drawn from mapping families with greater parental PD, as some traits do exhibit a moderate relationship between PD and GVC. Thus, sampling among as many NAM families as possible with a weighted sampling scheme based on PD seems to be a reasonable compromise approach. Alternatively, for a given total sample size, a sample of RILs with maximum GDs based on available marker data could be drawn from the entire NAM. Although we found that parental genotypic differences based on random markers were not predictive of progeny genotypic variance, it is possible that progeny marker variation would have a better relationship with progeny genotypic variance simply by ensuring adequate sampling of the available progeny genotypic combinations. As more traits are dissected with NAM, we expect to have a robust empirical data set with which to address these questions.
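One way such a PD-weighted scheme could be implemented is sketched below. The text does not specify a weighting rule, so the rule here is an assumption: every family first receives the minimum of 40 RILs, and the remaining budget is handed out one RIL at a time, roughly in proportion to PD, without exceeding the 200 RILs available per family. The function name, family labels and PD values are hypothetical.

```python
def allocate_samples(pd_by_family, total, min_per_family=40, family_size=200):
    """Allocate a total RIL budget across families, weighted by parental PD.

    pd_by_family maps a family label to a non-negative PD value.
    Returns a dict mapping each family to its number of sampled RILs.
    """
    families = list(pd_by_family)
    lower = len(families) * min_per_family
    upper = len(families) * family_size
    if not lower <= total <= upper:
        raise ValueError("total is incompatible with the per-family bounds")

    extra = {fam: 0 for fam in families}
    for _ in range(total - lower):
        # Families that have not yet reached the cap of available RILs.
        open_fams = [f for f in families
                     if min_per_family + extra[f] < family_size]
        # Give the next RIL to the family with the largest PD per extra
        # RIL already granted (a highest-averages rule).
        best = max(open_fams, key=lambda f: pd_by_family[f] / (extra[f] + 1))
        extra[best] += 1
    return {fam: min_per_family + extra[fam] for fam in families}


# Hypothetical PD values for three families and a budget of 300 RILs.
plan = allocate_samples({"fam_A": 3.0, "fam_B": 1.0, "fam_C": 2.0}, total=300)
print(plan)
```

Because most traits showed no significant PD and GVC relationship, the minimum per family matters more than the weighting; the weights only shift the surplus toward families that may be more variable for the subset of traits where PD is informative.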

If the major objective is to identify the most important QTL for a trait (rather than attempt a more complete evaluation of genetic architecture), reasonable power of QTL detection is possible with a 20% sample of NAM RILs if trait heritability is 70% and 20 or fewer QTL affect the trait (Yu et al., 2008). At the other extreme, where a single gene affects a trait, for example, su1 and seed type or Ga1 and cross-incompatibility (McMullen et al., 2009), only the NAM families segregating for the causative locus are informative and they should be sampled in full. Li et al. (2011) demonstrated that the power of detection of rare QTL in NAM (those that are limited to one or a few families) is higher for individual family analysis if the QTL effect is moderate or greater. Of course, one must sample the correct family to be able to detect the QTL in this situation.

Data archiving

Raw data for all traits analyzed are available at http://www.panzea.org/db/gateway?file_id=Hung_etal_2011_Heredity_data.