Introduction

One of the most important issues in maize evolution is how to explain the extraordinary morphological and genetic diversity that exists among the maize landraces (Matsuoka et al, 2002). This is of great interest to breeders and maize geneticists because understanding maize phenotypic evolution is critical to future maize breeding. As new insights into phenotypic diversity are brought to light by association mapping in maize (Thornsberry et al, 2001), elucidating the mechanisms underlying this diversity will allow us to elaborate new strategies for breeding and for screening genebanks for alleles that confer desired traits. To better conserve the current and future adaptive ability of a species, it is necessary to investigate the past and present evolutionary processes that shaped quantitative trait variation.

Crop plant evolution occurred and possibly still occurs in an environment that is managed by traditional farming communities. It results in a very patchy environment where a farmer in a limited area, contiguous to other farmers' fields, manages a given landrace. Little research has been aimed at better understanding the evolutionary process behind plant evolution within the agricultural ecosystems of traditional farming communities. Maize landrace populations, like natural populations, are subject to migration and drift, to both natural selection and farmers' selection, and finally to local extinction and recolonization processes. Therefore, to gain a complete picture of the system, it is important to document in detail the genetic dynamics in these farmers' fields and the impact of their various practices. Traditional maize agroecosystems provide a framework to investigate local selection for a number of quantitative traits in a highly subdivided metapopulation with high migration rates among populations. The management practices of small-scale Mexican farmers are key to the evolution of maize and its diversity. Crucial practices include the planting of numerous maize populations within a small area under the management of a single farmer. A maize landrace or population is defined by the farmer in terms of ear characteristics; ear type will be maintained by the farmers through conservative selection in spite of considerable gene flow (Louette et al, 1997; Louette and Smale, 2000). Today, in Oaxaca, many farmers still cultivate maize in the same way their ancestors have for thousands of years. Seeds for planting are obtained from ears selected by the farmers from the harvest of the previous cycle. On many occasions, seed is also obtained from neighbors or other sources, allowing farmer-controlled seed flow among populations (Smale et al, 1999). 
In addition, the Central Valleys of Oaxaca show very little presence of or impact from modern varieties (Smale et al, 1999; Bellon et al, 2003). Furthermore, previous research suggests that this region possesses a large amount of phenotypic variation (Bellon et al, 2003); therefore, it offers unique conditions for the study of the evolutionary processes that are fundamental to maize phenotypic evolution and diversification.

Measuring the strength of selection in a farmer's field and with a farmer's selection criteria is not an easy task. Our experience shows that it is difficult to obtain from farmers a precise description of their selection practices. While all the harvested ears go through the selection process, the farmer and his wife are both involved at different stages throughout the year in selecting the ears for the next cycle. Therefore, a constant strength of selection over time or across farms is unlikely.

To measure the effects of farmer selection and on-farm management on phenotypic differentiation, we prefer to use methods that provide indirect evidence of selection. By ‘farmer selection’ we mean both direct selection by the farmers and indirect selection, resulting either from farmers' practices or from natural selection. Two such methods are used: the comparison of population differentiation for quantitative traits with that for molecular markers (see McKay and Latta, 2002 for a review) and the comparison of genetic variance and covariance matrices across populations (see Roff, 2000 for a review).

After fierce debate over the comparison of variation in gene frequencies among populations with variation in quantitative traits (Lewontin, 1984; Felsenstein, 1986; Rogers, 1986), such comparison has recently allowed evolutionary and ecological geneticists to gain insight into the process of phenotypic evolution (see McKay and Latta, 2002 for a review). As suggested by Rogers (1986) and Felsenstein (1986), this comparison has over time proven its value as a source of information about phenotypic evolution in natural populations, although Lewontin (1984) considered it to be meaningless. Wright's standardized index, Fst, was shown to be the same for neutral additive quantitative characters as for a neutral marker (Lande, 1992). Therefore, a value of Fst for a quantitative trait (Qst) greater than Fst for neutral loci is evidence of spatially divergent evolution by natural selection (Whitlock, 1999). Comparing Qst with Fst for molecular markers thus provides a tool for assessing the patterns of population structure and differentiation that result from farmers' selection.

As genetic progress is not based on single trait improvement, and yield is a complex trait with many components, farmer selection could affect the covariance among traits of agronomical interest. Multivariate selection response depends on the genetic variance–covariance matrix (G matrix) structure, which ultimately determines the outcome of long-term selection. The sign and strength of the correlation between traits of agronomic interest in maize will affect the outcome of multitrait selection. Comparison of G matrix structure also yields information on past history, selection, and drift processes of populations. Interest in G matrix comparison has been discussed by Roff (2000). In this study, we investigate whether farmers' selection of certain quantitative traits permits a differentiation of populations and whether this results in changes in the G matrix structure. Therefore, in support of the previously mentioned comparison of population differentiation for molecular markers and quantitative traits, G matrix comparison also provides evidence of selection in farmers' fields and the consequences of this selection on the correlations and covariations among traits.

Our goal in this paper is to describe the impact of farmers' selection on phenotypic differentiation and diversification in the Central Valleys of Oaxaca. As 83% of the total maize in the Central Valleys is white maize (Smale et al, 1999), this study focuses on this maize type. We compare the patterns of population differentiation for quantitative traits to those previously described for molecular markers (see companion paper). This will provide a better understanding of the evolutionary dynamics and also of the adaptive ability of traditional maize landraces.

Material and methods

Material used for the genetic analysis

A total of 31 populations were assayed in the field evaluation and by genotyping. The material used in this study has been described elsewhere (see companion paper). All maize landraces used in this study predominantly showed characteristics of the Bolita race as described by Wellhausen et al (1952).

Simple sequence repeat genotyping

A total of 11 microsatellite markers were used for analysis (see companion paper). SSR primers were selected from the maizeDB database of public SSRs, and included the following: phi011, phi227562, phi96100, phi101049, phi029, phi093, phi024, phi452693, phi034, phi014, and umc1061. Markers were selected according to their chromosomal locations, in order to provide for genome-wide coverage, and also by the size of the amplification product, to allow multiplexing on an automated DNA sequencer.

Characterizing populations for quantitative traits

In all, 18 open-pollinated families were sampled for each of the 31 populations. The field layout was a two-replicate design with hierarchical structure (population plots randomly assigned and family plots randomly assigned within populations). The experiment was carried out at the CIMMYT experiment station at El Batán, Texcoco, Mexico. Up to 12 open-pollinated progenies per family were evaluated (giving a maximum of 31 × 18 × 12 = 6696 plants evaluated) for plant development, ear development, and kernel traits. These included days to silking (DS), days to anthesis (DA), anthesis-silking interval (ASI), plant height (PH), ear height (EH), ear length (EL), ear width (EWi), ear weight (EW), row number (RN), kernel count (KC), kernel count per row (KCR), cob width (CWi), overall grain weight (GW), kernel weight (KW), and kernel thickness (KT). All measurements, except for days to silking and days to anthesis, were made at plant maturity.

Computing θ

Overall Fst=θ (Weir and Cockerham, 1984) was calculated for the entire set of 31 populations. Jackknifing over populations and loci was used to provide a confidence interval according to Weir (1996). A matrix of pairwise Fst among populations was also calculated. Fst values were estimated using GDA 1.1 (Lewis and Zaykin, 2002), which performs hierarchical F-statistics. Different levels of population subdivision were tested as suggested by Weir (1996).

Analysis of variation for quantitative traits

Analysis of variance and calculation of variance components were carried out using the following model:

$$P_{ijk} = \mu + \mathrm{Pop}_i + \mathrm{HS(pop)}_{ij} + \varepsilon_{ijk}$$

where $P_{ijk}$ is the phenotypic value of the kth individual of the jth family in the ith population, $\mu$ the overall mean, $\mathrm{Pop}_i$ the ith population effect, $\mathrm{HS(pop)}_{ij}$ the jth family effect within the ith population, and $\varepsilon_{ijk}$ the residual representing variability within families.

Assuming that allelic effects at each locus are additive, we have (Wright, 1965)

From this, one can deduce Fst for quantitative traits (Qst):

Assuming the variance component for open-pollinated (half-sib) families to estimate one-quarter of the additive genetic variance (Falconer and Mackay, 1996), we have
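Under the additive model invoked above, the relations in this derivation can be sketched as follows. The notation is assumed: $\sigma^2_{a}$ is the additive genetic variance of the undivided reference population, $\sigma^2_{st}$ the genetic variance among populations, $\sigma^2_{is}$ the additive genetic variance within populations, and $\sigma^2_{pop}$ and $\sigma^2_{op}$ the variance components for populations and for open-pollinated families within populations, with $\sigma^2_{pop}$ taken as the estimate of $\sigma^2_{st}$.

```latex
\begin{align}
\sigma^2_{st} &= 2F_{st}\,\sigma^2_{a}, \qquad \sigma^2_{is} = (1 - F_{st})\,\sigma^2_{a} \\
Q_{st} &= \frac{\sigma^2_{st}}{\sigma^2_{st} + 2\sigma^2_{is}} \\
\sigma^2_{op} &= \tfrac{1}{4}\,\sigma^2_{is}
\quad\Longrightarrow\quad
Q_{st} = \frac{\sigma^2_{pop}}{\sigma^2_{pop} + 8\,\sigma^2_{op}}
\end{align}
```

Substituting the first line into the second gives $Q_{st} = 2F_{st} / (2F_{st} + 2(1 - F_{st})) = F_{st}$ for a neutral additive trait, which is the equality that the comparison with marker-based Fst relies on.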

Therefore, in the case of assortative mating or maternal effect, we will underestimate Qst values as a result of overestimating $\sigma^2_{is}$. Even with assortative mating or maternal effect, a value of Qst greater than Fst is evidence of spatially divergent evolution by selection.
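The direction of this bias can be illustrated with a minimal Python sketch. It assumes the usual half-sib form Qst = σ²pop / (σ²pop + 8σ²op), which follows from taking the open-pollinated family component as one-quarter of the within-population additive variance. The function name and the numerical values are hypothetical and are not taken from the study.

```python
def qst_from_components(var_pop: float, var_op: float) -> float:
    """Qst from the among-population and half-sib family variance components.

    Assumes additive gene action and that var_op estimates one quarter of the
    within-population additive variance, so that
    Qst = var_pop / (var_pop + 8 * var_op).
    """
    if var_pop < 0 or var_op < 0:
        raise ValueError("variance components must be non-negative")
    denom = var_pop + 8.0 * var_op
    if denom == 0:
        raise ValueError("both variance components are zero")
    return var_pop / denom


# Hypothetical variance components, for illustration only.
unbiased = qst_from_components(var_pop=2.0, var_op=1.0)  # 2 / (2 + 8) = 0.2
# Assortative mating or maternal effects inflate the family component,
# which enlarges the denominator and lowers the Qst estimate.
inflated = qst_from_components(var_pop=2.0, var_op=1.5)  # 2 / (2 + 12)
print(unbiased, inflated)
```

Because an inflated family component can only lower the estimate, a Qst that still exceeds Fst remains conservative evidence of divergent selection.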

Intraclass correlations for maternal open-pollinated families were estimated as follows:

$$t_{op} = \frac{\sigma^2_{op}}{\sigma^2_{z}}$$

where $\sigma^2_{op}$ is the variance among open-pollinated families and $\sigma^2_{z}$ the total phenotypic variance within populations. $4t_{op}$ provides the best estimate of heritability where there is no assortative mating or maternal effect.

Genetic correlations between pairs of traits (X and Y) were estimated over the entire experiment:
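A common estimator of this kind, shown here as an assumption rather than as the exact expression used, divides the among-family covariance component of the two traits by the geometric mean of their among-family variance components. The symbols $\mathrm{cov}_{op}(X,Y)$, $\sigma^2_{op}(X)$ and $\sigma^2_{op}(Y)$ are assumed notation for those components.

```latex
r_{g}(X,Y) = \frac{\mathrm{cov}_{op}(X,Y)}{\sqrt{\sigma^2_{op}(X)\,\sigma^2_{op}(Y)}}
```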

Genetic variance–covariance matrices (G matrices) as well as genetic correlation matrices were calculated for each population.

Because the villages were chosen to maximize heterogeneity for socioeconomic and maize production potential variables (Smale et al, 1999; Bellon et al, 2003), a fixed effect model was used to estimate variance and covariance components. All variance and covariance components were estimated by the method of moments. Mean squares were estimated from type III sums of squares using Proc GLM in the SAS software (SAS Online documentation at http://www.sas.com).
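The method-of-moments step can be sketched in Python for the simplest case. This is an illustration under stated assumptions, not the SAS analysis itself: it assumes a perfectly balanced nested design (the same number of families per population and of plants per family), ignores the replicate structure, and equates observed mean squares to their usual expectations. The function name and the toy data are hypothetical.

```python
from statistics import fmean


def nested_variance_components(data):
    """Method-of-moments variance components for a balanced nested design.

    data[i][j] is the list of plant values for family j of population i.
    Returns (var_pop, var_fam, var_within).
    """
    p = len(data)          # number of populations
    f = len(data[0])       # families per population
    n = len(data[0][0])    # plants per family
    fam_means = [[fmean(fam) for fam in pop] for pop in data]
    pop_means = [fmean(means) for means in fam_means]
    grand = fmean(pop_means)

    # Sums of squares for populations, families within populations, residual.
    ss_pop = n * f * sum((m - grand) ** 2 for m in pop_means)
    ss_fam = n * sum((fam_means[i][j] - pop_means[i]) ** 2
                     for i in range(p) for j in range(f))
    ss_err = sum((y - fam_means[i][j]) ** 2
                 for i in range(p) for j in range(f) for y in data[i][j])

    ms_pop = ss_pop / (p - 1)
    ms_fam = ss_fam / (p * (f - 1))
    ms_err = ss_err / (p * f * (n - 1))

    # Equate mean squares to their expectations and solve for the components.
    var_fam = (ms_fam - ms_err) / n
    var_pop = (ms_pop - ms_fam) / (n * f)
    return var_pop, var_fam, ms_err


# Toy data: 2 populations x 2 families x 2 plants (hypothetical values).
toy = [[[1, 3], [5, 7]], [[11, 13], [15, 17]]]
var_pop, var_fam, var_within = nested_variance_components(toy)
```

With real, unbalanced data the coefficients of the expected mean squares are no longer simple integers, which is one reason to rely on a statistical package for the actual analysis.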

Comparison of population structure for neutral markers and quantitative traits

In the first step, Fst and Qst were evaluated for all populations. Confidence intervals were calculated through jackknifing over populations for quantitative traits and over populations and loci for microsatellite markers.
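As an illustration of jackknifing over populations, the sketch below implements a generic delete-one jackknife using pseudo-values. It is a minimal example under assumptions: the statistic is passed in as a function, the interval uses a normal approximation (estimate ± 1.96 standard errors), and the input values are hypothetical. It does not reproduce the procedure of Weir (1996) in detail.

```python
from math import sqrt
from statistics import fmean


def jackknife(units, statistic):
    """Delete-one jackknife: returns (bias-corrected estimate, standard error).

    units is a list (for example one entry per population) and statistic is
    any function that maps such a list to a number.
    """
    n = len(units)
    full = statistic(units)
    leave_one_out = [statistic(units[:i] + units[i + 1:]) for i in range(n)]
    pseudo = [n * full - (n - 1) * loo for loo in leave_one_out]
    estimate = fmean(pseudo)
    variance = sum((v - estimate) ** 2 for v in pseudo) / (n * (n - 1))
    return estimate, sqrt(variance)


# Hypothetical per-population values; the statistic here is simply the mean.
values = [2.0, 4.0, 6.0, 8.0]
est, se = jackknife(values, fmean)
ci = (est - 1.96 * se, est + 1.96 * se)  # normal-approximation interval
```

For Qst, the statistic would recompute the variance components with one population removed at a time; for Fst, the same resampling can also be applied over loci.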

A Mantel test (Sokal and Rohlf, 1995) was used to test for the independence of the pairwise Fst and Qst matrices.
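The logic of a Mantel test can be sketched in a few lines of Python. This is a minimal permutation version under assumptions: both inputs are symmetric distance matrices given as nested lists, the test statistic is the Pearson correlation between their upper triangles, and the test is one-sided. The function names, the number of permutations, and the example matrices are hypothetical.

```python
import random
from math import sqrt


def _upper(m, order):
    """Upper-triangle entries of m after relabelling rows and columns by order."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(a, b, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    x = _upper(a, identity)
    r_obs = _pearson(x, _upper(b, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute the population labels of matrix b
        if _pearson(x, _upper(b, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical pairwise distance matrices for four populations.
fst = [[0, 1, 2, 4], [1, 0, 3, 5], [2, 3, 0, 6], [4, 5, 6, 0]]
qst = [[2 * v for v in row] for row in fst]  # perfectly proportional to fst
r, p_value = mantel(fst, qst)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary correlation test would be inappropriate.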

Pairwise Fst and Qst matrices were used as genetic distance matrices to draw trees using the UPGMA clustering analysis.

Comparison of genetic G matrices

Matrices were compared using a multivariate analysis of variance (MANOVA) approach as described by Roff (2002). Wilks' lambda (Λ) as well as Pillai's trace (T) were examined to test for significance of village and population within village effects. Roff's method (Roff, 2002) was chosen over Flury's (Flury, 1988; Phillips and Arnold, 1999) because it allows for nested comparisons and can easily be computed with standard statistical packages. Furthermore, maximum-likelihood methods like the Flury method can be very computer-intensive (Roff, 2002), especially for very large data sets. MANOVA was carried out with a fixed effect model using a Proc GLM MANOVA statement in the SAS software (SAS Online documentation at http://www.sas.com).

Results

Among-population genetic structure for molecular markers

All tested SSR loci were polymorphic in all populations. Fst values obtained for molecular markers are 0.011 within villages, 0.003 among villages, and 0.011 not considering the village level of hierarchy. There was no statistically significant evidence for isolation by distance. We observed low among-villages Fst values, significantly lower than within-village values. Population structure using molecular markers has been described in more detail elsewhere (see companion paper).

Population structure for quantitative traits

Despite considerable gene flow among populations and the resulting low level of population differentiation at marker loci, considerable population differentiation occurs for quantitative traits. Fst values for quantitative traits (Qst) are significantly higher than those observed for molecular markers (Table 1). Strong among-villages structure is observed for a number of ear and kernel traits (eg the among-villages Qst value for kernel weight is 0.535). Furthermore, the patterns of population structure differ greatly from one trait to another.

Table 1 Genetic variation among and within populations

For traits showing the lowest within-village or among-villages Qst values, a study of the pairwise Qst matrix or trees built by UPGMA using Qst as a distance shows that a number of populations are considerably differentiated from the others. Examples are given in Figure 1 for anthesis-silking interval and kernel weight. For all traits, we observe high pairwise Qst values in comparison to the pairwise Fst values measured for molecular markers, which show considerably less population differentiation (Figure 2).

Figure 1 UPGMA tree using pairwise Qst as genetic distance measures (a) for anthesis-silking interval (ASI) and (b) for kernel weight (KW). Populations from six different villages include populations from Huitzo (107–139), Mazaltepec (211–235), San-Lorenzo (309–317), Amatengo (405–439), Valdeflores (512–536), and Santa-Ana Zegache (602–640).

Figure 2 UPGMA tree using pairwise Fst=θ as genetic distance measures.

In addition to Qst being significantly higher than the Fst value for molecular markers, a Mantel test shows no significant correlation between pairwise Qst and Fst values for any of the traits considered.

Heritability

Within-population heritability estimates measured over the entire experiment are given in Table 1. We observe that four times the intraclass correlation ($4t_{op}$) for maternal open-pollinated families clearly overestimates heritabilities for flowering traits. This could be the consequence of assortative mating for flowering traits. Assortative mating in maize landrace populations has been treated elsewhere (see companion paper).

Comparison of G matrices

The G matrix was calculated for each population. We examined variation of the G matrix, both at the population and village levels, following the method described by Roff (2002). The nested MANOVA shows a highly significant effect of population within village as well as among villages (Table 2). The test is highly significant (P<0.0001) for both effects for the variance–covariance matrix as well as the correlation matrix, which indicates that nonproportional changes affect the structure of the variance–covariance matrix. Element-by-element comparisons show a significant (P<0.001) effect of both population within village and village on the variances and covariances of all the studied traits. All traits examined in this study are either direct or indirect yield components and show Qst values higher than Fst. Furthermore, we observe considerable changes in the G matrix structure, including cases where high correlations that are present in some villages or populations are low and no longer significant in others. However, in no case did the sign of correlations change between villages or populations.

Table 2 Variation in the G matrix over population and village

Discussion

Comparison of Fst values for molecular markers and quantitative traits

Some authors have raised concerns about the effects of epistasis and dominance on the expected value of Qst (Lynch et al, 1999). In the case of epistasis, Whitlock (1999) has shown that additive-by-additive epistasis would decrease the expectation of Qst. Whether dominance will increase or decrease the expected Qst deserves further study. However, it is suggested by Whitlock (1999) that, averaged over a uniform distribution of allele frequencies, the contribution of dominance to Qst approaches zero. Therefore, a value of Qst greater than Fst for neutral loci is evidence for spatially divergent evolution by selection (Whitlock, 1999).

Considering dominance in our case, except for yield and directly related traits such as ear weight and overall grain weight that show considerable dominance variance, all the other studied traits show low ratios of dominance to additive genetic variance (Hallauer and Miranda, 1988).

Comparative population structure for molecular markers and quantitative traits

The Qst values measured for quantitative traits are significantly higher than Fst values obtained for molecular markers, as expected in the case of diversifying selection. In this case, molecular markers do not provide any information about the different breeding potential of these populations. Most studies comparing population structure for molecular markers to that of quantitative traits have described situations in which Qst>Fst (see McKay and Latta, 2002 for a review). Their findings could be attributed to a bias because most of the authors looked at cases of diversifying selection. For the rare cases in which homogeneous selection was investigated, authors have found in some plant species that Qst<Fst (Petit et al, 2001). This suggests that in no case can Fst be considered as a conservative estimate of Qst.

Selection in spite of gene flow or gene flow in spite of selection

Farmers have been shown to practice conservative selection on their maize populations based on a limited number of traits (Louette and Smale, 2000). Our survey shows the same practice in the Oaxaca Central Valleys, as farmers reportedly select a given ear type over generations. The characterization of high gene flow among these populations (see companion paper) and the strong divergent selection characterized in this study clearly show differentiation between populations for quantitative traits despite considerable gene flow and, reciprocally, considerable gene flow despite strong divergent selection by Oaxacan farmers. The development of differentiation between populations for quantitative traits despite considerable gene flow will occur if the number of migrants is large enough (Nm>1) and when the migrant proportion is low enough (m<s) (see McKay and Latta, 2002 for a review). In other words, selection at a given locus will have little effect at other loci, unless they are closely linked (Barton and Bengtsson, 1986).

Selection and changes in the G matrix structure

Under farmer management, maize landrace populations are subject to significant changes in their G matrices at both the within-village and among-villages levels of hierarchy. Nonproportional changes in the structure of the G matrix are associated with selection, while drift would be responsible for proportional changes (Roff, 2000). Therefore, our result showing nonproportional changes in the G matrix structure supports the result obtained by the comparison of the population differentiation for molecular markers to that for quantitative traits.

Populations with different G matrices are expected to respond very differently to multivariate selection. Therefore, populations and villages with different G matrices will have different breeding potential.
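This point can be illustrated with the standard multivariate breeder's equation, Δz = Gβ, in which β is the vector of selection gradients. The equation is brought in here as background and is not taken from the analysis above. The Python sketch below uses hypothetical two-trait G matrices and a hypothetical selection gradient.

```python
def selection_response(g, beta):
    """Predicted per-generation change in trait means: delta_z = G * beta."""
    return [sum(g_ij * b_j for g_ij, b_j in zip(row, beta)) for row in g]


# Hypothetical G matrices for two traits (say, ear length and kernel weight).
g_correlated = [[1.0, 0.8],
                [0.8, 1.0]]
g_uncorrelated = [[1.0, 0.0],
                  [0.0, 1.0]]
beta = [1.0, 0.0]  # selection acts directly on the first trait only

response_a = selection_response(g_correlated, beta)    # second trait is dragged along
response_b = selection_response(g_uncorrelated, beta)  # second trait does not respond
print(response_a, response_b)
```

The same selection applied to the first trait thus changes the second trait in one population but not in the other, solely because of the difference in genetic covariance.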

Possible impacts of genes of major effect on quantitative trait variation

One of the key questions is whether variation at genes of interest is better estimated by variation at neutral loci or by variation for quantitative traits. McKay and Latta (2002) suggest that in the case of a polygenic trait, variation at QTLs is probably better reflected by variation at neutral markers than by the variation seen for the quantitative trait. A condition for that statement is that the trait value results from variation at many loci. However, genes or QTLs of major effect have been shown in numerous studies in maize and other species to account for a large part of the phenotypic variation. Almost half of the traits have a QTL accounting for at least 20% of the total phenotypic variation (see Lynch and Walsh, 1998 for a review). Agrawal et al (2001) have recently suggested that genes of major effect could dramatically alter the G matrix structure and therefore significantly alter the outcome of multivariate selection. The considerable changes that we observe in the G matrix structure could be explained by farmer selection on genes of major pleiotropic effect. These changes in the G matrix structure and strong differentiation for quantitative traits among populations that are little differentiated for neutral markers could be used as a diagnostic to identify landraces to be screened for new allelic variation.

Implications for the management of genetic resources

Habitat fragmentation, resulting from the abandonment of traditional farming, could lead to gene flow erosion among traditional maize populations (Berthaud et al, 2001). Gene flow erosion could increase the rate of extinction (Stockwell et al, 2003) among populations that are under strong selection pressure. A potential corrective measure that uses current strategies would take advantage of the absence of strong population structure at neutral markers and then favor the establishment of artificial gene flow. However, in our case, the population structure for quantitative traits is considerably higher than that observed for molecular markers. As a consequence, both sets of criteria (neutral markers and quantitative traits) should be taken into account when developing genetic resource management strategies. At the same time, determining a management unit for these strategies should also be based on the observed population structure, which is mainly based on phenotypic variation. Here, the village should serve as the unit for managing genetic resources.

Maize racial classification

Wellhausen et al (1952) classified Mexican maize populations into major races. It is a taxonomic approach based on morphological traits. Many authors have used racial considerations to compare the diversity among maize landraces. All the landraces used in this study predominantly display the characteristics of the Bolita race as described by Wellhausen et al (1952). However, we note the presence in a few landraces of some traits that would be typical of races such as Pepitilla or Zapalote chico (Bellon et al, 2003) although they never encompassed all the landraces of a given village. For this reason we will define Bolita as a complex rather than a race.

Our results strongly suggest that the morphological characteristics of a given landrace are either maintained or created by an active process resulting from farmers' and villages' selection criteria. Phenotypic resemblance or lack of resemblance cannot be considered as only reflecting historical processes of seed flow as assumed in the work of Wellhausen et al (1952). It could be the product of either historical processes or convergent and divergent selection processes.

Farmer management results in phenotypic diversification in maize landraces

We report in this paper on the effects of farmers' management on population structure and differentiation for phenotypic traits as opposed to that seen for molecular markers. Not only do we observe considerably higher population differentiation for quantitative traits than for molecular markers, but we also observe highly significant changes in the G matrix structure that imply changes of covariation and correlation among traits. Both of these are responses to farmers' selection and management (indirect selection) practices. Although, to the best of our knowledge, this is the first time these techniques have been used for a crop species, they are increasingly used with wild species (see McKay and Latta, 2002 for a review) for the study of ‘contemporary evolution’ (Stockwell et al, 2003). The integration of surveys of neutral and selected variation is therefore advisable for the study of contemporary evolution and before building genetic resources management strategies.