Introduction

Fecundity selection is often suggested as the main causal factor underlying the prevalence of female-biased sexual size dimorphism (SSD, defined here as female size/male size) in the animal kingdom (e.g. Darwin, 1874; Williams, 1966; Head, 1995; see Shine, 1988 for a critique). As there is often a positive correlation between body size and fecundity in ectotherms (Roff, 1992; p. 126), larger females tend to be more fecund, and therefore ‘favoured by selection’. Strictly speaking, this is not selection for large females (sensu Lande & Arnold’s (1983) multivariate selection gradient method), because any evolutionary increase in body size, in this simplified scenario, is just a correlated response to direct selection acting on fecundity. Making this distinction between the two different targets of selection (body size or fecundity) is potentially important in terms of understanding the evolution of SSD. If selection acts directly on female size, the typically high correlations between sexes for body size (Lande, 1980; Roff, 1997, p. 247; Preziosi & Roff, 1998) would be expected to cause a strong correlated response in the size of males, and very slow evolution of sexual size dimorphism (Lande, 1980). This effect has been demonstrated in a previous experiment using artificial selection on body size in Drosophila melanogaster (Reeve & Fairbairn, 1996). Selection acting directly on fecundity would produce SSD through a correlated response in body size and might produce a more rapid divergence in size between the sexes, and therefore more rapid evolution of SSD.

The impact of fecundity selection on SSD will depend critically on the pattern of variances and covariances for fecundity and body size in the two sexes (i.e. the genetic architecture of these two traits). Although males do not express the phenotype fecundity (defined as egg production, not male fertility), they still possess genes for the trait. Sex differences may exist in the genetic correlation between fecundity and body size, both within and between sexes. In addition, the genetic correlation between the sexes for fecundity itself might be lower than the high values commonly found for morphological traits. Selection acting directly on fecundity in females would produce a more rapid change in SSD than selection acting directly on body size if, for example, fecundity was strongly correlated with body size within females, but weakly correlated with body size in males (i.e. intersexual correlations were weak).

Unfortunately, there are no estimates of the genetic variance–covariance matrix (G) for body size and fecundity in males, or of the covariances between sexes for these two traits (the B matrix of Lande, 1980). Even if such estimates were available, empirical and theoretical results show that predictions of correlated responses are unreliable (e.g. Bohren et al., 1966; Gromko, 1995; Lascoux, 1997), and there is evidence that reciprocal selection may not always result in symmetrical responses among a set of traits (e.g. Shiotsugu et al., 1997). The hypothesis that SSD can evolve in response to fecundity selection must therefore be tested empirically.

In this paper, we apply artificial selection on fecundity in D. melanogaster to see if it is possible, in a small number of generations, to produce a relatively greater change in body size in females than in males. Despite the long history of the fecundity selection hypothesis, we are unaware of any other experiments that select on fecundity and measure relative changes in body size in the two sexes. As there has been much interest recently in the correlated responses seen amongst various life history traits (e.g. Zwaan et al., 1992; Nunney, 1996; Shiotsugu et al., 1997), we also look at correlated responses in development time, preadult viability, and adult longevity.

Materials and methods

Selection experiment and direct response

The base population was derived from a cross between four stock strains of Drosophila melanogaster. This population was kept in a random mating mass culture for four years at 25°C, with new generations initiated every two to three weeks from approximately 1000 adults.

Selection on fecundity was maintained for 20 generations, at 25°C. Two replicate lines were used for upward (U1, U2), and downward (D1, D2) selection, along with two controls (C1, C2) for six lines in total. Each generation, 90 females (five to seven days old; see below) were individually placed in small plastic vials (20 × 50 mm) containing a single randomly chosen male of the same age from the same replicate. Males were included in these vials because their presence is known to enhance egg laying (Bell et al., 1957). Each vial contained a mixture of agar and grape juice, which was coated with a thin film of yeast extract to enhance egg laying. Our measure of fecundity was the number of eggs laid over a single 18-h period, at an age between five and seven days posteclosion. This is the period of peak egg production for Drosophila at 25°C. The number of eggs produced during the peak phase early in life is known to be highly correlated with lifetime fecundity (Gowen & Johnson, 1946).

For each line, 30 of the 90 sets of parents were selected. In the C lines, parents were selected randomly. In the D and C lines, vials with 10 or fewer eggs were excluded, to minimize selection for flies that were unhealthy, rather than just at the low end of the ‘normal’ distribution of fecundities. The 30 pairs of selected parents were put in separate 30 × 100 mm plastic vials with commercial Drosophila media plus yeast for 24 h, and then removed. Adult flies for the next generation were collected three days after the start of eclosion. Where possible, five flies of each sex were collected from each of the 30 vials, to minimize the effects of natural selection. These 300 flies were mixed together, and redistributed into 10 vials. From generation three onwards, each vial of flies was transferred to a newly yeasted vial two days after the first transfer, which greatly reduced the environmental variation between vials and improved the consistency of the selection response. The following day, nine randomly selected pairs of flies from each of the 10 vials were placed in the grape juice–agar vials, for egg counting the next day.
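The choice of parents described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says that 30 of the 90 pairs were selected, and the sketch assumes that this means the 30 highest counts in the U lines and the 30 lowest counts above the exclusion cutoff in the D lines. The function name `select_parents` and its parameters are hypothetical.

```python
import random


def select_parents(egg_counts, regime, n_selected=30, max_excluded=10, rng=None):
    """Pick the vials whose pairs become parents of the next generation.

    egg_counts: eggs laid per vial (one pair per vial, 90 vials in the text).
    regime: "U" (up), "D" (down) or "C" (control).
    Returns a list of vial indices.
    """
    rng = rng or random.Random()
    candidates = list(range(len(egg_counts)))
    if regime in ("D", "C"):
        # Vials with 10 or fewer eggs are dropped in the D and C lines.
        candidates = [v for v in candidates if egg_counts[v] > max_excluded]
    if regime == "U":
        # Assumption: the 30 highest-scoring vials are kept.
        ranked = sorted(candidates, key=lambda v: egg_counts[v], reverse=True)
        return ranked[:n_selected]
    if regime == "D":
        # Assumption: the 30 lowest-scoring vials above the cutoff are kept.
        ranked = sorted(candidates, key=lambda v: egg_counts[v])
        return ranked[:n_selected]
    if regime == "C":
        # Controls: parents are drawn at random from the remaining vials.
        return rng.sample(candidates, n_selected)
    raise ValueError(f"unknown regime: {regime!r}")


if __name__ == "__main__":
    rng = random.Random(1)
    counts = [rng.randint(0, 80) for _ in range(90)]  # made-up egg counts
    for regime in ("U", "D", "C"):
        chosen = select_parents(counts, regime, rng=rng)
        print(regime, sum(counts[v] for v in chosen) / len(chosen))
```

Ties at the truncation point are broken here by vial order, which is an arbitrary choice of the sketch; the text does not say how ties were handled.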

All manipulation was carried out using ether anaesthesia. Lines were spaced at one-day intervals, in the order D1, C1, U1, D2, C2, U2. All statistical analyses, Figures (except Fig. 1) and Tables refer to counts and measurements for flies at generation 20. Towards the end of the selection experiment, it became apparent that the average number of flies emerging was not equal in all of the replicates. In particular, D1 had many fewer adults per vial. Because many traits in Drosophila are correlated with larval density, all generation 20 flies were raised at a density of 25 eggs per vial.

Figure 1

Direct response to selection for increased or decreased fecundity in Drosophila melanogaster. (a) Response in each replicate. (b) Relative responses in the mean value of the two replicates per selection regime.

Correlated responses — morphological traits

Thorax width was measured as the distance between the posterior sternopleural bristles on the ventral surface of the thorax. Thorax length was the distance from the tip of the scutellum to the front of the thorax, measured from above, and abdomen width was the width of the third abdominal segment in females and the fourth in males. Each of these was usually the widest point in the respective sexes. All measurements were made using digitizing software attached to a 40× microscope.

Correlated responses — life history traits

Development time was measured as the time from the midpoint of egg-laying to the midpoint of the observation interval during which adults eclosed. Starting on the afternoon of the eighth day after egg-laying, at intervals of 6, 12 and 6 h, adults from each of 30 vials per replicate were removed, sexed and counted.

Preadult viability (egg to eclosion survivorship) was assessed during the experiment described above. Pupae were counted on day eight in each vial. Adults were collected at regular intervals (see above) until there were no more eclosing. For this stage (pupa to adult survivorship), which required anaesthetizing and sexing the flies, flies from all vials within a replicate were collected together, so among-vial effects were not available for calculating standard errors.

Estimates of adult longevity were obtained for each replicate by placing eight newly eclosed flies (four of each sex) into each of 13 vials. Flies were transferred to new vials at three-day intervals, at which time surviving flies were counted. As the experiment progressed and flies died, vials were consolidated, in an attempt to maintain 6–10 flies per vial. Males died at a faster rate than females, so this consolidation included ensuring that at least one male was present in each vial.

Statistical analyses

The fecundity distributions were generally negatively skewed, so were rank-transformed before statistical analysis. For the statistical analyses presented in the Tables, body size data were not transformed before analysis, as there was no indication of non-normality or heteroscedasticity. Outliers were removed before analysis, using the method given in Sokal & Rohlf (1995, p. 407). Analysis of variance was performed with sex and selection (control, down and up) as fixed factors, and replicate as a random factor nested within selection. Because one of the replicates (D1) was very different from the others in terms of preadult viability, longevity, fecundity and development time, it is probable that its response resulted from a genetic mechanism quite different from the other replicates, including D2. For this reason, multiple comparisons were carried out using two different data grouping methods: (i) differences among selection types with replicate nested within selection; and (ii) differences among all six replicates. Two-tailed tests were used for all traits except fecundity, where there was an a priori expectation of direction of response in relation to the controls. We did not use one-tailed tests on morphological traits, as there is the possibility that trade-offs in resource allocation between growth and fecundity might lead to negative correlated responses, in contrast to the expected positive response.
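The nested design can be illustrated with a minimal Python sketch for a single sex and a single trait. It is not the analysis software used in the study, and it leaves out the sex factor and the multiple-comparison corrections. Because replicate is a random factor nested within selection, the selection effect is tested against the replicate-within-selection mean square. With three selection types and two replicates each, this gives 2 and 3 degrees of freedom, the form reported in the Results. The function name `nested_anova` and the data layout are assumptions, and for unbalanced data the selection test is only approximate.

```python
def nested_anova(data):
    """F ratios for replicates nested within selection types.

    data maps each selection type to a list of replicates; each replicate
    is a list of observations, e.g. {"C": [[...], [...]], "D": ..., "U": ...}.
    """
    all_values = [y for reps in data.values() for rep in reps for y in rep]
    grand_mean = sum(all_values) / len(all_values)

    ss_sel = ss_rep = ss_err = 0.0
    n_reps = 0
    for reps in data.values():
        type_values = [y for rep in reps for y in rep]
        type_mean = sum(type_values) / len(type_values)
        # Between selection types.
        ss_sel += len(type_values) * (type_mean - grand_mean) ** 2
        for rep in reps:
            rep_mean = sum(rep) / len(rep)
            # Between replicates within a selection type.
            ss_rep += len(rep) * (rep_mean - type_mean) ** 2
            # Within replicates (error).
            ss_err += sum((y - rep_mean) ** 2 for y in rep)
            n_reps += 1

    df_sel = len(data) - 1
    df_rep = n_reps - len(data)
    df_err = len(all_values) - n_reps
    ms_sel = ss_sel / df_sel
    ms_rep = ss_rep / df_rep
    ms_err = ss_err / df_err
    return {
        # Selection is tested over replicate-within-selection, not over error.
        "F_selection": ms_sel / ms_rep,
        "df_selection": (df_sel, df_rep),
        "F_replicate": ms_rep / ms_err,
        "df_replicate": (df_rep, df_err),
    }
```

Turning the F ratios into P-values requires the F distribution, which a statistics package would supply.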

Results

Direct response

Figure 1 shows the direct response to selection on fecundity. A change in experimental protocol after generation three (see Materials and methods) resulted in a noticeable reduction in variation between generations within replicates. The difference in fecundity between the selected and control lines at generation 20 (Tables 1 and 2) corresponds, on average, to an increase of 5.7% in the U lines, and a decrease of 16.6% in the D lines. Analysis of variance showed a significant effect of selection (F2,3=90.5, P=0.002), but not of replicate (F3,534=0.56, P=0.64). Nested posthoc one-tailed P-values, corrected for multiple comparisons, were: C vs. U (0.152), C vs. D (0.024) and D vs. U (<0.0005) (Table 1). Although the response in the U lines was not statistically significant, the fact that the U lines produced, on average, more eggs than the C lines in every generation from the eighth to the twentieth would indicate that a small but real response did take place. Realized heritabilities (total response/total selection differential), calculated from data in generations four to 20, were 0.055, 0.036, 0.018 and 0.017 for lines D1, D2, U1 and U2, respectively.
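The realized heritability calculation (total response divided by total selection differential) can be sketched as follows. The helper names are hypothetical and the numbers in the usage lines are invented for illustration; they are not data from this experiment. The sketch takes the per-generation selection differential to be the mean of the selected parents minus the mean of all scored individuals, which is the usual definition but is not spelled out in the text.

```python
from statistics import mean


def selection_differential(all_scores, selected_scores):
    """Mean of the selected parents minus the mean of all individuals scored."""
    return mean(selected_scores) - mean(all_scores)


def realized_heritability(total_response, differentials):
    """Total response divided by the cumulative selection differential."""
    total_differential = sum(differentials)
    if total_differential == 0:
        raise ValueError("cumulative selection differential is zero")
    return total_response / total_differential


# Invented example: three generations of upward selection.
s_per_generation = [10.0, 10.0, 10.0]   # eggs, selected mean minus overall mean
h2 = realized_heritability(3.0, s_per_generation)
print(h2)
```

In a down-selected line both the response and the differentials are negative, so the ratio stays positive.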

Table 1 Generation 20 means (and SE) for fecundity and three morphological traits in Drosophila melanogaster, with data from the two replicates pooled for each selection type. Superscripts indicate significant differences between selection types, after correcting for the appropriate number of multiple comparisons, using ANOVA with replicate nested within line. Sexes analysed separately
Table 2 Generation 20 means (and SE) for fecundity and three morphological traits in Drosophila melanogaster, for the six replicates. Superscripts indicate homogeneous subsets using Tukey's HSD test. Sexes analysed separately

Correlated responses of morphological traits

Correlated responses of the three morphological traits are also shown in Tables 1 and 2. In general, the U lines increased in size, whereas the response in the D lines was inconsistent. For thorax width, there was a significant effect of selection (F2,3=14.9, P=0.028), sex (F1,1030=4866, P<0.0005), replicate (F3,1030=4.1, P=0.006), and sex by selection interaction (F2,1030=3.75, P=0.024). The interaction is caused by an increase in SSD in the U lines, and a slight decrease in the D lines (Table 1 and Fig. 2). The up-selected lines had wider thoraxes than the controls and down lines, but this difference was only significant in females. Although the lines were not significantly different in males, it can be seen that there is a trend for the U line males to be bigger than the others when the six replicates are considered separately (Table 2).

Figure 2

Sexual size dimorphism (female size/male size) in each replicate of Drosophila melanogaster, and the mean ± 1 SE. (a) Thorax width. (b) Abdomen width. (c) Thorax length.

For abdomen width, there was a significant effect of selection (F2,3=74.9, P=0.003), sex (F1,1034=15 911, P<0.0005), and sex by selection interaction (F2,1034=12.1, P<0.0005), but not replicate (F3,1034=0.87, P=0.46). As with thorax width, the interaction term can be seen as an increase in SSD in the U lines, and a decrease in the D lines (Table 1 and Fig. 2), but here the effect was more pronounced. In females, the up-selected lines had significantly wider abdomens than the controls and down lines, which were not significantly different from each other. In males, none of the lines was significantly different, but as with thorax width, a trend for larger abdomens in the U lines can be seen when the six lines are analysed separately (Table 2).

Thorax length showed a significant effect of sex (F1,1033=11 333, P<0.0005) and replicate (F3,1033=9.39, P<0.0005), but not of selection (F2,3=4.64, P=0.121), or sex by selection interaction (F2,1033=2.18, P=0.114). Although the effect of selection was not significant, the trends were similar to those seen in thorax width. In both sexes, the U lines had longer thoraxes than the C lines, which in turn were longer than the D lines. SSD increased not only in the U lines (Tables 1 and 2, and Fig. 2), but also in the D lines, as opposed to the decrease in SSD seen in the D lines for both thorax and abdomen width.

The difference between the selected and control lines can be summarized in terms of generation 20 control standard deviation units as follows. For the up lines, fecundity (0.25), thorax width (0.81 in females, 0.75 in males), abdomen width (0.87, 0.82) and thorax length (0.50, 0.20). Thus, selection for increased fecundity produced larger correlated responses in females than in males for all three morphological traits. Furthermore, the size increase in females was greater than the direct response of fecundity. The correlated responses to selection for decreased fecundity were not as strong, and were larger in males: fecundity (−0.79), thorax width (−0.08, 0.12), abdomen width (0.22, 0.54) and thorax length (−0.37, 0.42).
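Expressing a response in control standard deviation units is a simple standardization, sketched below. The function name is hypothetical, and the sketch assumes the sample standard deviation (n − 1 denominator) of the control flies; the text does not state which estimator was used or how the two control replicates were combined.

```python
from statistics import mean, stdev


def response_in_control_sd(selected, control):
    """(selected mean - control mean) / control standard deviation."""
    return (mean(selected) - mean(control)) / stdev(control)


# Invented measurements in arbitrary units, not data from the experiment.
control_flies = [1.0, 2.0, 3.0]
up_flies = [2.0, 3.0, 4.0]
print(response_in_control_sd(up_flies, control_flies))
```

Scaling by the control standard deviation puts egg counts and linear measurements on a common scale, which is what allows the direct and correlated responses to be compared.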

SSD can be measured as either a ratio or a difference. In the absence of detailed knowledge about the genetic basis for variation in body size, there is no a priori reason for preferring one measure over the other. For this reason, all of the morphological data were re-analysed after log-transformation. A significant sex by selection interaction in the ANOVA when using raw data indicates that the difference in size between the two sexes has changed. With log-transformed data, a significant interaction indicates that a change in size ratio has occurred. In all but one case, the statistical conclusions were the same as when raw data were used. The only difference was in the analysis of thorax width, when the interaction term was no longer significant (F2,1030=2.58, P=0.077).
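The distinction between the two measures can be made concrete with a small Python sketch using invented trait means. If both sexes grow by the same proportion, the difference between them changes while the ratio does not. On the log scale the ratio becomes a difference, since log(female) − log(male) = log(female/male), which is why an interaction on log-transformed data reflects a change in the size ratio. The function name and the numbers are hypothetical.

```python
import math


def ssd_measures(female_mean, male_mean):
    """Return SSD as a difference, a ratio, and a difference of logs."""
    return {
        "difference": female_mean - male_mean,
        "ratio": female_mean / male_mean,
        "log_difference": math.log(female_mean) - math.log(male_mean),
    }


before = ssd_measures(1.10, 1.00)  # invented trait means, arbitrary units
after = ssd_measures(1.21, 1.10)   # both sexes 10% larger

# The raw difference has grown, but the ratio and the log difference
# are unchanged up to floating-point rounding.
print(before["difference"], after["difference"])
print(before["ratio"], after["ratio"])
print(before["log_difference"], after["log_difference"])
```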

Correlated responses of life history traits

Table 3 shows the corrected significance levels for all pairwise comparisons between development times for all replicates, with sexes analysed separately. There was a tendency for development time to increase in both the up- and down-selected lines (Table 3 and Fig. 3). As the distributions were not normal, and differed markedly between the two D replicates, comparisons were made using a Kolmogorov–Smirnov two-sample test for differences between cumulative frequency distributions.
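The Kolmogorov–Smirnov two-sample statistic is the largest vertical distance between the two empirical cumulative frequency distributions. The sketch below computes only that statistic, in plain Python; it does not reproduce the significance levels or the multiple-comparison correction used for Table 3. The function name is hypothetical. Ties are handled explicitly because development times recorded at fixed observation intervals contain many equal values.

```python
def ks_two_sample_statistic(x, y):
    """Largest absolute difference between two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step past every observation equal to v in both samples, so the
        # CDFs are compared only after all tied values are counted.
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


# Invented development times in hours, not data from the experiment.
line_a = [216, 216, 222, 222, 234, 240]
line_b = [222, 234, 234, 240, 240, 246]
print(ks_two_sample_statistic(line_a, line_b))
```

A statistics library would normally supply the matching P-value, for example `scipy.stats.ks_2samp` in Python.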

Table 3 Significance levels for amongst-replicate multiple-comparisons of development time at generation 20 in Drosophila melanogaster. Female values are above the diagonal, male values below
Figure 3

Distribution of development times (hours from egg to eclosion) in the six replicates of Drosophila melanogaster. The proportion of the total flies within a replicate that eclosed over the observation interval is shown on the y-axis.

Egg–pupa survivorship (Fig. 4) showed no effect of selection (F2,3=3.0, P=0.19), apparently because of the large effect of replicate within selection (F3,174=12.9, P<0.0005). If the data are analysed as six replicates, the D1 and D2 lines are significantly different from all other replicates (including each other) using a Tukey posthoc test. The method of data collection does not allow statistical analysis of differences in pupa–adult survivorship. However, from Fig. 4, it would appear that the reduction in egg–adult viability in the D lines is mostly caused by increased larval, rather than pupal, mortality.

Figure 4

Proportion of Drosophila melanogaster individuals surviving the egg–pupa and pupa–adult stages, and the combined survival proportion over both stages. Error bars indicate +1 SE, estimated only for egg–pupal survival (see text).

Gehan’s generalized Wilcoxon test (SPSS 7.5 Advanced Statistics Manual, p. 269) was used to compare the adult survivorship curves of the six replicates, separately for each sex (Fig. 5). In females, the D1 line was significantly different from all other replicates (corrected P<0.003 for all pairwise comparisons), which were not different from each other. For males, there were no differences among the six replicates.

Figure 5

Adult survivorship curves for male and female Drosophila melanogaster.

Discussion

Selection on fecundity produced a significant decrease in the D lines and a nonsignificant increase in the U lines. Although the response in the U lines was small, it appears that selection did produce genetic changes, because some of the correlated responses were significant. These correlated responses (C=D < U for thorax width and abdomen width) were only significant in females, and were larger than the direct response of fecundity, in terms of standard deviation units. Despite this, it is clear from Tables 1 and 2 that there was a tendency in both sexes for an increase in all three morphological traits in response to selection for increased fecundity. Downward selection produced smaller and directionally less consistent changes in body size measurements. Significant sex by selection interaction terms in the ANOVAs for thorax width and abdomen width were produced mostly by greater increases in female size in the U lines. These results indicate that fecundity selection is potentially capable of producing fairly rapid changes in SSD.

Because fecundity is highly correlated with fitness, it might be expected that past selection will have depleted most of the genetic variance for the trait. However, numerous studies have found significant heritability for fecundity, both in Drosophila (e.g. Tait & Prabhu, 1970), and in other invertebrates (see Roff, 1992; p. 360). There have been relatively few selection experiments on fecundity in invertebrates. Although responses are usually small (e.g. Narain et al., 1962; Richardson & Kojima, 1965; both with Drosophila), Orozco & Bell (1974), using Tribolium castaneum, were able to quadruple fecundity after just 20 generations. In addition, evidence from agriculture clearly shows that it is possible in domestic fowl to increase fecundity substantially (e.g. Gowe & Fairfull, 1990). Therefore, fecundity can be treated in the same way as other quantitative traits, and should respond to selection in either direction. In natural populations, fecundity is probably under balancing selection because of trade-offs with other fitness-related traits. In our study, the only evidence for such a trade-off was in the slight increase in development time found in the selection lines. The crowded conditions in the base population from which the lines were derived probably selected for rapid development time. Relaxation of this selection pressure in our lines allowed the increase in development time necessary to produce increases in body size and fecundity seen in the U lines. The U lines did not suffer from increased preadult or adult mortality, although our direct response may have been too small to produce significant detrimental correlated responses. A count of fecundity at day 28 (results not shown) in the six replicates gave no indication that the U lines had reduced late-life fecundity relative to the controls.

There is evidence that abdomen size is the component of body size that responds most strongly to selection for higher fecundity in insects (Wickman & Karlsson, 1989; Preziosi et al., 1996). This is generally considered to result from space limitations, either for eggs or for storage of resources to produce eggs. In female Drosophila, the abdomen is thought to be space-limiting in both these respects (Robertson, 1957). The fact that a sex by selection interaction term in the ANOVA was found for thorax width shows that the change in dimorphism was not solely caused by a secondary increase in size caused by swelling of the abdomen from extra eggs, as might be argued if the interaction had only been found in the analysis of abdomen width.

The large direct response and unusual correlated responses in life-history traits in line D1 probably resulted from inadvertent selection for unhealthy flies, rather than for flies with low values for ‘fecundity genes’. The patterns seen in development time for this line (increased development time, increased variance, flatter distribution) are similar to those seen in normal lines that have been grown in crowded conditions (Botella et al., 1985). However, D1 flies were unusual in that they were not smaller than the controls, as would be the case with flies grown in crowded environments. It is interesting that the increased adult mortality in line D1 was restricted to females. Sex differences in longevity are known to be very environment-specific (e.g. Zwaan et al., 1992), and either sex may live longer under any particular set of experimental conditions. Chi-squared tests failed to show any significant reduction in the female:male sex ratio of eclosing flies in this line, so the sex-specific detrimental effects seem to have been limited to the adult stage.

Many studies have examined the pattern of genetic and phenotypic relationships between body size and various life history traits in Drosophila. When development time is increased, either genetically or environmentally, body size increases, and this often results in increased fecundity. Depending on the base population used, it may be more difficult to select for decreased development time, but Nunney (1996), for instance, found that this resulted in a reduction of body size, and a sharp decrease in fecundity. SSD in body weight was not changed by this selection. Likewise, many studies have found correlated changes in development time when selecting on body size. Partridge & Fowler (1993) found that in lines selected for increased body size, development time increased. Reeve & Fairbairn (1996) found a positive correlation between body size and development time in flies selected for small or large thorax width.

Our U lines are consistent with these patterns, but the D lines were unusual in that decreased fecundity was associated with an increase in development time and even a slight increase in abdomen width. Although it is possible that line D2 suffered some of the same problems in terms of general health as line D1, it is also possible that selection for decreased fecundity targeted genes that specifically broke down the correlation between fecundity and the other traits. A similar asymmetrical set of correlated responses has been found by Shiotsugu et al. (1997) in D. melanogaster. In lines selected either for longevity or crowding tolerance, urea resistance evolved, whereas selection for urea resistance did not produce a correlated response in either longevity or crowding tolerance. Although there is an underlying causal relationship between development time/body size and fecundity, it may be that there are also genes for fecundity that have sex-specific effects on body size. If selection is imposed on size or development time, the selection intensity on these sex-specific loci will not be as strong as during fecundity selection, and the general pattern found between the traits will be better maintained.

Both empirical results (Reeve & Fairbairn, 1996) and theoretical considerations (Lande, 1980) suggest that SSD may be difficult to alter through single-sex selection on body size. Our results show that significant changes in morphological traits may occur through fecundity selection, even when the direct response in fecundity is small. In general, correlated responses may be greater than the direct response when the heritability of the selected trait is low and the correlation between traits is high. A smaller correlated response in males than in females, as in this experiment, would indicate that the genetic correlation between fecundity and size in males is lower than the same correlation within females. The present experiment supports the idea that fecundity selection may be a mechanism through which rapid changes in SSD are possible, by providing more direct access to genes with sex-specific effects on body size than is possible through single-sex selection on body size itself.
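The statement that a correlated response can exceed the direct response follows from the standard quantitative-genetic expressions for the two responses, sketched here under the usual assumptions of a single generation of selection and additive inheritance. The symbols are introduced for illustration and do not appear in the text above: X is the selected trait (fecundity), Y is the correlated trait (a body size measure), i is the selection intensity, h² is the heritability, r_A is the additive genetic correlation between X and Y, and σ_P is the phenotypic standard deviation.

```latex
\begin{align}
R_X  &= i \, h_X^{2} \, \sigma_{P_X}, \\
CR_Y &= i \, h_X \, h_Y \, r_A \, \sigma_{P_Y}, \\
\frac{CR_Y / \sigma_{P_Y}}{R_X / \sigma_{P_X}} &= r_A \, \frac{h_Y}{h_X}.
\end{align}
```

Measured in phenotypic standard deviation units, the correlated response is therefore larger than the direct response whenever r_A h_Y > h_X, which is more likely when the heritability of the selected trait is low and the genetic correlation is high. Under the same expressions, a smaller correlated response in males than in females implies a smaller product r_A h_Y for male size than for female size.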